Bloch sphere: Difference between revisions

Revision as of 12:53, 15 January 2014

The Hamilton–Jacobi–Bellman (HJB) equation is a partial differential equation that is central to optimal control theory. Its solution is the 'value function', which gives the optimal cost-to-go for a given dynamical system with an associated cost function.

When solved locally, the HJB equation is a necessary condition, but when solved over the whole of state space, it is a necessary and sufficient condition for an optimum. The solution is open-loop, but it also permits the solution of the closed-loop problem. The HJB method can be generalized to stochastic systems as well.

Classical variational problems, for example the brachistochrone problem, can be solved using this method.

The equation is a result of the theory of dynamic programming, which was pioneered in the 1950s by Richard Bellman and coworkers.^[1] The corresponding discrete-time equation is usually referred to as the Bellman equation. In continuous time, the result can be seen as an extension of earlier work in classical physics on the Hamilton–Jacobi equation by William Rowan Hamilton and Carl Gustav Jacob Jacobi.

Optimal control problems

Consider the following problem in deterministic optimal control over the time period $[0,T]$ :

$$V(x(0),0)=\min _{u}\left\{\int _{0}^{T}C[x(t),u(t)]\,dt+D[x(T)]\right\}$$

where C[ ] is the scalar cost rate function and D[ ] is a function that gives the economic value or utility at the final state, x(t) is the system state vector, x(0) is assumed given, and u(t) for 0 ≤ t ≤ T is the control vector that we are trying to find.

The system must also be subject to

$${\dot {x}}(t)=F[x(t),u(t)]$$

where F[ ] gives the vector determining physical evolution of the state vector over time.

The partial differential equation

For this simple system, the Hamilton–Jacobi–Bellman partial differential equation is

$${\dot {V}}(x,t)+\min _{u}\left\{\nabla V(x,t)\cdot F(x,u)+C(x,u)\right\}=0$$

subject to the terminal condition

$$V(x,T)=D(x),$$

where $a\cdot b$ denotes the dot product of the vectors $a$ and $b$, and $\nabla$ is the gradient operator.

The unknown scalar $V(x,t)$ in the above PDE is the Bellman 'value function', which represents the cost incurred from starting in state $x$ at time $t$ and controlling the system optimally from then until time $T$ .
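
As a minimal worked illustration, assume a scalar system with $F(x,u)=u$ and $C(x,u)=(x^{2}+u^{2})/2$; this particular choice is an assumption made here for illustration and does not come from the text above. The inner minimization can then be carried out in closed form:

$$\min _{u}\left\{{\frac {\partial V}{\partial x}}\,u+{\frac {1}{2}}\left(x^{2}+u^{2}\right)\right\}\quad \Longrightarrow \quad u^{*}=-{\frac {\partial V}{\partial x}},$$

so that the HJB equation reduces to

$${\dot {V}}(x,t)+{\frac {1}{2}}x^{2}-{\frac {1}{2}}\left({\frac {\partial V(x,t)}{\partial x}}\right)^{2}=0,\qquad V(x,T)=D(x).$$

The minimizing control depends on the current state and time only through the gradient of $V$, which is why a solution of the PDE yields a feedback law.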

Deriving the equation

Intuitively, the HJB equation can be "derived" as follows. If $V(x(t),t)$ is the optimal cost-to-go function (also called the 'value function'), then by Richard Bellman's principle of optimality, going from time t to t + dt, we have

$$V(x(t),t)=\min _{u}\left\{C(x(t),u(t))\,dt+V(x(t+dt),t+dt)\right\}.$$

Note that the Taylor expansion of the last term is

$$V(x(t+dt),t+dt)=V(x(t),t)+{\dot {V}}(x(t),t)\,dt+\nabla V(x(t),t)\cdot {\dot {x}}(t)\,dt+o(dt),$$

where o(dt) denotes the terms of higher than first order in dt. If we then substitute ${\dot {x}}(t)=F(x(t),u(t))$, cancel V(x(t), t) on both sides, divide by dt, and take the limit as dt approaches zero, we obtain the HJB equation defined above.
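
Spelled out, the steps are as follows (a sketch that assumes $V$ is smooth enough for the expansion to hold). Substituting the expansion into the optimality relation, with ${\dot {x}}(t)=F(x(t),u(t))$, gives

$$V(x(t),t)=\min _{u}\left\{C(x(t),u(t))\,dt+V(x(t),t)+{\dot {V}}(x(t),t)\,dt+\nabla V(x(t),t)\cdot F(x(t),u(t))\,dt+o(dt)\right\}.$$

The terms $V(x(t),t)$ and ${\dot {V}}(x(t),t)\,dt$ do not depend on $u$ and can be taken out of the minimization. Cancelling $V(x(t),t)$ and dividing by $dt$ leaves

$$0={\dot {V}}(x,t)+\min _{u}\left\{C(x,u)+\nabla V(x,t)\cdot F(x,u)\right\}+{\frac {o(dt)}{dt}},$$

and the last term vanishes as $dt\to 0$.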

Solving the equation

The HJB equation is usually solved backwards in time, starting from $t=T$ and ending at $t=0$ .

When solved over the whole of state space, the HJB equation is a necessary and sufficient condition for an optimum.^[2] If we can solve for $V$ then we can find from it a control $u$ that achieves the minimum cost.
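
The backward-in-time procedure can be sketched numerically. The following is a minimal illustration, not a production solver: it applies the principle of optimality on a grid, stepping from $t=T$ down to $t=0$ and recording the minimizing control at each grid point. The function name `solve_hjb_backward`, the grid sizes, and the test problem ($F=u$, $C=(x^{2}+u^{2})/2$, $D=0$, for which $V(x,0)={\tfrac {1}{2}}\tanh(T)\,x^{2}$) are assumptions chosen for illustration.

```python
import numpy as np

def solve_hjb_backward(F, C, D, xs, us, T, n_steps):
    """Approximate V(x, 0) and a feedback policy on the state grid xs.

    Applies V(x, t) = min_u { C(x, u) dt + V(x + F(x, u) dt, t + dt) }
    backward from t = T, with linear interpolation of V between grid points.
    """
    dt = T / n_steps
    X, U = np.meshgrid(xs, us, indexing="ij")   # all (state, control) pairs
    running = C(X, U) * dt                      # cost accrued over one time step
    x_next = (X + F(X, U) * dt).ravel()         # explicit Euler step of the dynamics
    V = D(xs)                                   # terminal condition V(x, T) = D(x)
    policy = np.empty((n_steps, xs.size))
    rows = np.arange(xs.size)
    for k in range(n_steps - 1, -1, -1):        # march from t = T - dt down to t = 0
        # np.interp clamps x_next to the grid ends (a crude boundary treatment)
        Q = running + np.interp(x_next, xs, V).reshape(X.shape)
        best = np.argmin(Q, axis=1)             # index of the minimizing control per state
        policy[k] = us[best]
        V = Q[rows, best]
    return V, policy

# Illustrative test problem (an assumption): F = u, C = (x^2 + u^2) / 2, D = 0
xs = np.linspace(-2.0, 2.0, 201)
us = np.linspace(-2.0, 2.0, 81)
V0, policy = solve_hjb_backward(
    F=lambda x, u: u,
    C=lambda x, u: 0.5 * (x**2 + u**2),
    D=lambda x: np.zeros_like(x),
    xs=xs, us=us, T=1.0, n_steps=100,
)
i = int(np.argmin(np.abs(xs - 1.0)))
print(V0[i], 0.5 * np.tanh(1.0))   # numerical vs. closed-form value at x = 1
```

The array `policy[k]` holds the approximate optimal control at time $k\,dt$ for every grid state, so it is a feedback law rather than a single open-loop trajectory. The scheme is only first-order accurate, and a grid of this kind becomes impractical as the state dimension grows.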

In the general case, the HJB equation does not have a classical (smooth) solution. Several notions of generalized solutions have been developed to cover such situations, including viscosity solutions (Pierre-Louis Lions and Michael Crandall), minimax solutions (Andrei Izmailovich Subbotin), and others.

Extension to stochastic problems

The idea of solving a control problem by applying Bellman's principle of optimality and then working out an optimizing strategy backwards in time can be generalized to stochastic control problems. Consider, as above, the problem

$$\min _{u}\,\mathbb {E} \left\{\int _{0}^{T}C(t,X_{t},u_{t})\,dt+D(X_{T})\right\}$$

now with $(X_{t})_{t\in [0,T]}$ the stochastic process to optimize and $(u_{t})_{t\in [0,T]}$ the steering (control) process. By first applying Bellman's principle of optimality and then expanding $V(X_{t},t)$ with Itô's rule, one finds the stochastic HJB equation

$$\min _{u}\left\{{\mathcal {A}}V(x,t)+C(t,x,u)\right\}=0,$$

where ${\mathcal {A}}$ represents the stochastic differentiation operator, and subject to the terminal condition

$$V(x,T)=D(x).$$

Note that the randomness has disappeared. In this case, a solution $V$ of the latter does not necessarily solve the primal problem; it is only a candidate, and a further verification argument is required. This technique is widely used in financial mathematics to determine optimal investment strategies in the market (see, for example, Merton's portfolio problem).
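
For concreteness, suppose the controlled state follows a scalar Itô diffusion $dX_{t}=\mu (t,X_{t},u_{t})\,dt+\sigma (t,X_{t},u_{t})\,dW_{t}$; the names $\mu$ and $\sigma$ for the drift and diffusion coefficients are assumptions introduced here, not notation from the text above. Itô's rule then gives the operator explicitly as

$${\mathcal {A}}V(x,t)={\frac {\partial V(x,t)}{\partial t}}+\mu (t,x,u)\,{\frac {\partial V(x,t)}{\partial x}}+{\frac {1}{2}}\sigma (t,x,u)^{2}\,{\frac {\partial ^{2}V(x,t)}{\partial x^{2}}}.$$

The $dW_{t}$ term in the expansion has zero expectation, so compared with the deterministic equation the only new contribution is the second-order term.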

Application to LQG Control

As an example, we can look at a system with linear stochastic dynamics and quadratic cost. If the system dynamics is given by

$$dx_{t}=(ax_{t}+bu_{t})\,dt+\sigma \,dw_{t},$$

and the cost accumulates at rate $C(x_{t},u_{t})=r(t)u_{t}^{2}/2+q(t)x_{t}^{2}/2$ , the HJB equation is given by

$$-{\frac {\partial V(x,t)}{\partial t}}={\frac {1}{2}}q(t)x^{2}+{\frac {\partial V(x,t)}{\partial x}}ax-{\frac {b^{2}}{2r(t)}}\left({\frac {\partial V(x,t)}{\partial x}}\right)^{2}+{\frac {\sigma ^{2}}{2}}{\frac {\partial ^{2}V(x,t)}{\partial x^{2}}}.$$

Assuming a quadratic form for the value function, we obtain a Riccati equation for the Hessian of the value function, as is usual for linear-quadratic-Gaussian control.
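
To make the last step concrete, here is a minimal sketch under additional assumptions that are not in the text above: constant weights $q$ and $r$, a terminal cost $D(x)=d\,x^{2}/2$, and the Itô term $(\sigma ^{2}/2)\,\partial ^{2}V/\partial x^{2}$ implied by the dynamics. Substituting the ansatz $V(x,t)=p(t)x^{2}/2+c(t)$ into the HJB equation and matching powers of $x$ gives

$$-{\dot {p}}(t)=q+2ap(t)-{\frac {b^{2}}{r}}p(t)^{2},\quad p(T)=d;\qquad -{\dot {c}}(t)={\frac {\sigma ^{2}}{2}}p(t),\quad c(T)=0,$$

with the minimizing control $u^{*}=-(b/r)\,p(t)\,x$. The hypothetical helper `solve_scalar_lqg` below integrates these equations backward from $t=T$ with a classical Runge–Kutta step.

```python
import math

def solve_scalar_lqg(a, b, q, r, sigma, d, T, n_steps=1000):
    """Integrate the scalar Riccati ODE backward from t = T to t = 0.

    Assumes V(x, t) = p(t) x^2 / 2 + c(t), constant q and r, and terminal
    cost D(x) = d x^2 / 2 (illustrative assumptions).
    Returns lists (ts, ps, cs) ordered from t = 0 to t = T.
    """
    h = T / n_steps

    def rhs(p):
        # Derivatives with respect to time-to-go s = T - t:
        # dp/ds = q + 2 a p - b^2 p^2 / r,   dc/ds = sigma^2 p / 2
        return q + 2.0 * a * p - (b * b / r) * p * p, 0.5 * sigma * sigma * p

    p, c = d, 0.0                     # terminal conditions p(T) = d, c(T) = 0
    ps, cs = [p], [c]
    for _ in range(n_steps):
        # classical RK4 step for (p, c); the right-hand side depends on p only
        k1p, k1c = rhs(p)
        k2p, k2c = rhs(p + 0.5 * h * k1p)
        k3p, k3c = rhs(p + 0.5 * h * k2p)
        k4p, k4c = rhs(p + h * k3p)
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        c += h * (k1c + 2.0 * k2c + 2.0 * k3c + k4c) / 6.0
        ps.append(p)
        cs.append(c)
    ts = [T - i * h for i in range(n_steps + 1)]
    return ts[::-1], ps[::-1], cs[::-1]   # reorder so index 0 is t = 0

def optimal_control(x, p, b, r):
    """Feedback law u* = -(b / r) p x obtained from the minimization in the HJB equation."""
    return -(b / r) * p * x

ts, ps, cs = solve_scalar_lqg(a=0.0, b=1.0, q=1.0, r=1.0, sigma=0.5, d=0.0, T=2.0)
print(ps[0], math.tanh(2.0))   # numerical p(0) vs. closed form for this special case
```

For the special case used in the example ($a=0$, $b=q=r=1$, $d=0$) the Riccati equation has the closed-form solution $p(t)=\tanh(T-t)$, which gives a simple check on the integrator. The noise level $\sigma$ affects only the offset $c(t)$, not the feedback gain.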

References

R. E. Bellman: Dynamic programming and a new formalism in the calculus of variations. Proc. Nat. Acad. Sci. 40 (1954), 231–235.
R. E. Bellman: Dynamic Programming. Princeton, 1957.
R. Bellman & S. Dreyfus: An application of dynamic programming to the determination of optimal satellite trajectories. J. Brit. Interplanet. Soc. 17 (1959), 78–83.

@@ Line 1: / Line 1: @@
+{{more footnotes|date=October 2010}}
+The '''Hamilton–Jacobi–Bellman (HJB) equation''' is a [[partial differential equation]] which is central to [[optimal control]] theory. The solution of the HJB equation is the 'value function', which gives the optimal cost-to-go for a given [[dynamical system]] with an associated cost function.
+When solved locally, the HJB is a necessary condition, but when solved over the whole of state space, the HJB equation is a [[necessary and sufficient condition]] for an optimum. The solution is open loop, but it also permits the solution of the closed loop problem. The HJB method can be generalized to [[stochastic]] systems as well.
+Classical variational problems, for example the [[brachistochrone problem]], can be solved using this method.
+The equation is a result of the theory of [[dynamic programming]] which was pioneered in the 1950s by [[Richard Bellman]] and coworkers.<ref>R. E. Bellman. Dynamic Programming. Princeton, NJ, 1957.</ref> The corresponding discrete-time equation is usually referred to as the [[Bellman equation]]. In continuous time, the result can be seen as an extension of earlier work in [[classical physics]] on the [[Hamilton-Jacobi equation]] by [[William Rowan Hamilton]] and [[Carl Gustav Jacob Jacobi]].
+==Optimal control problems==
+Consider the following problem in deterministic optimal control over the time period <math>[0,T]</math>:
+:<math>V(x(0), 0) = \min_u \left\{ \int_0^T C[x(t),u(t)]\,dt + D[x(T)] \right\}</math>
+where C[ ] is the scalar cost rate function and ''D''[ ] is a function that gives the economic value or utility at the final state, ''x''(''t'') is the system state vector, ''x''(0) is assumed given, and ''u''(''t'') for 0&nbsp;&le;&nbsp;''t''&nbsp;&le;&nbsp;''T'' is the control vector that we are trying to find.
+The system must also be subject to
+:<math> \dot{x}(t)=F[x(t),u(t)] \, </math>
+where ''F''[ ] gives the vector determining physical evolution of the state vector over time.
+==The partial differential equation==
+For this simple system, the Hamilton–Jacobi–Bellman partial differential equation is
+:<math>
+\dot{V}(x,t) + \min_u \left\{  \nabla V(x,t) \cdot F(x, u) + C(x,u) \right\} = 0
+</math>
+subject to the terminal condition
+:<math>
+V(x,T) = D(x),\,
+</math>
+where the <math>a \cdot b</math> means the [[dot product]] of the vectors a and b and <math>\nabla</math> is the [[gradient]] operator.
+The unknown scalar <math>V(x, t)</math> in the above PDE is the Bellman '[[value function]]', which represents the cost incurred from starting in state <math>x</math> at time <math>t</math> and controlling the system optimally from then until time <math>T</math>.
+==Deriving the equation==
+Intuitively, the HJB equation can be "derived" as follows. If <math>V(x(t), t)</math> is the optimal cost-to-go function (also called the 'value function'), then by Richard Bellman's [[principle of optimality]], going from time ''t'' to ''t''&nbsp;+&nbsp;''dt'', we have
+:<math> V(x(t), t) = \min_u \left\{ C(x(t), u(t)) \, dt  + V(x(t+dt), t+dt) \right\}. </math>
+Note that the [[Taylor expansion]] of the last term is
+:<math> V(x(t+dt), t+dt) = V(x(t), t) + \dot{V}(x(t), t) \, dt + \nabla V(x(t), t) \cdot \dot{x}(t) \, dt + o(dt),</math>
+where o(''dt'') denotes the terms in the Taylor expansion of higher order than one. Then if we cancel ''V''(''x''(''t''),&nbsp;''t'') on both sides, divide by ''dt'', and take the limit as ''dt'' approaches zero, we obtain the HJB equation defined above.
+==Solving the equation==
+The HJB equation is usually [[Backward induction|solved backwards in time]], starting from <math>t = T</math> and ending at <math>t = 0</math>.
+When solved over the whole of state space, the HJB equation is a [[necessary and sufficient condition]] for an optimum.<ref>Dimitri P Bertsekas. Dynamic programming and optimal control. Athena Scientific, 2005.</ref> If we can solve for <math>V</math> then we can find from it a control <math>u</math> that achieves the minimum cost.
+In the general case, the HJB equation does not have a classical (smooth) solution. Several notions of generalized solutions have been developed to cover such situations, including [[viscosity solution]] ([[Pierre-Louis Lions]] and [[Michael Crandall]]), [[minimax solution]] ([[Andrei Izmailovich Subbotin]]), and others.
+==Extension to stochastic problems==
+The idea of solving a control problem by applying Bellman's principle of optimality and then working out an optimizing strategy backwards in time can be generalized to stochastic control problems. Consider, as above, the problem
+:<math> \min_u \, \mathbb{E} \left\{ \int_0^T C(t,X_t,u_t)\,dt + D(X_T) \right\}</math>
+now with <math>(X_t)_{t \in [0,T]}\,\!</math> the stochastic process to optimize and <math>(u_t)_{t \in [0,T]}\,\!</math> the steering (control) process. By first applying Bellman's principle of optimality and then expanding <math>V(X_t,t)</math> with [[Itō_calculus#It.C5.8D.27s_lemma|Itô's rule]], one finds the stochastic HJB equation
+:<math>
+\min_u \left\{ \mathcal{A} V(x,t) + C(t,x,u) \right\} = 0,
+</math>
+where <math>\mathcal{A}</math> represents the stochastic differentiation operator, and subject to the terminal condition
+:<math>
+V(x,T) = D(x)\,\!.
+</math>
+Note that the randomness has disappeared. In this case, a solution <math>V\,\!</math> of the latter does not necessarily solve the primal problem; it is only a candidate, and a further verification argument is required. This technique is widely used in financial mathematics to determine optimal investment strategies in the market (see, for example, [[Merton's portfolio problem]]).
+===Application to LQG Control===
+As an example, we can look at a system with linear stochastic dynamics and quadratic cost. If the system dynamics is given by
+:<math>
+dx_t = (a x_t + b u_t) dt + \sigma dw_t,
+</math>
+and the cost accumulates at rate <math>C(x_t,u_t) = r(t) u_t^2/2 + q(t) x_t^2/2</math>, the HJB equation is given by
+:<math>
+-\frac{\partial V(x,t)}{\partial t} = \frac{1}{2}q(t) x^2 + \frac{\partial V(x,t)}{\partial x} a x - \frac{b^2}{2 r(t)} \left(\frac{\partial V(x,t)}{\partial x}\right)^2 + \frac{\sigma^2}{2} \frac{\partial^2 V(x,t)}{\partial x^2}.
+</math>
+Assuming a quadratic form for the value function, we obtain a [[Riccati equation]] for the Hessian of the value function, as is usual for [[Linear-quadratic-Gaussian control]].
+==See also==
+* [[Bellman equation]], discrete-time counterpart of the Hamilton–Jacobi–Bellman equation
+* [[Pontryagin's minimum principle]], necessary but not sufficient condition for optimum, by minimizing a Hamiltonian, but this has the advantage over HJB of only needing to be satisfied over the single trajectory being considered.
+== References ==
+{{Reflist}}
+* R.E Bellman: Dynamic Programming and a new formalism in the calculus of variations.  Proc. Nat. Acad. Sci. 40 1954 231-235.
+* R.E Bellman: Dynamic Programming, Princeton 1957.
+* R. Bellman & S. Dreyfus: An application of dynamic programming to the determination of optimal satellite trajectories. J. Brit.Interplanet. Soc. 17 1959 78-83.
+==Further reading==
+* {{cite book
+ | author = [[Dimitri P. Bertsekas]]
+ | year = 2005
+ | title = Dynamic programming and optimal control
+ | publisher = Athena Scientific
+ | isbn =
+}}
+{{DEFAULTSORT:Hamilton-Jacobi-Bellman equation}}
+[[Category:Partial differential equations]]
+[[Category:Optimal control]]
+[[Category:Dynamic programming]]
+[[Category:Stochastic control]]
