Stochastic optimal control problems are usually solved by writing the dynamic programming equation and then evaluating the associated value function with various tools, depending on the context. In this talk, we discuss the possibility of arriving at the value function by solving a deterministic control problem on individual sample paths. We base our discussion on Rogers (2007)^, which first established how a general stochastic control problem can be solved by carrying out deterministic optimization independently on every sample path. This striking connection between stochastic and deterministic optimal control, obtained by writing a dual for the value function, can lead to new computational techniques.
We consider a discrete-time Markov process that can be controlled; we introduce the associated optimization problem and the standard terminology for the control of such a process. We then represent the objective function in a dual form, much as Lagrange multipliers yield a dual for a constrained optimization problem. We also discuss the implications of this dual representation for arriving at new computational procedures.
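As a toy illustration of the pathwise idea (the dynamics, rewards, horizon, and penalty functions below are our own illustrative choices, not taken from the talk or from Rogers' paper): for a small controlled Markov chain, one can compute the value functions by dynamic programming and then, on each fixed noise path, solve a purely deterministic optimization penalized by martingale increments of the value functions. With that choice of penalty, every sample path reproduces the value exactly.

```python
import itertools

# Toy controlled Markov chain (all sizes and functions are illustrative):
# state x in {-3..3}, actions a in {0,1}, noise eps in {-1,+1} w.p. 1/2,
# dynamics x' = clip(x + a*eps), terminal reward g(x) = x^2, horizon T = 2.
STATES = range(-3, 4)
ACTIONS = (0, 1)
NOISE = (-1, 1)
T = 2

def clip(x): return max(-3, min(3, x))
def f(x, a, eps): return clip(x + a * eps)
def g(x): return x * x

# Dynamic programming for the value functions V_t (Bellman recursion).
V = {T: {x: g(x) for x in STATES}}
for t in range(T - 1, -1, -1):
    V[t] = {x: max(sum(V[t + 1][f(x, a, e)] for e in NOISE) / 2
                   for a in ACTIONS)
            for x in STATES}

def pathwise_value(x0, path, h):
    """Deterministic optimization on one fixed noise path, penalized by
    the martingale increments h_{t+1}(x_{t+1}) - E_eps h_{t+1}(f(x_t, a_t, eps))."""
    best = float("-inf")
    for acts in itertools.product(ACTIONS, repeat=T):
        x, val = x0, 0.0
        for t, (a, eps) in enumerate(zip(acts, path)):
            xn = f(x, a, eps)
            val -= h[t + 1][xn] - sum(h[t + 1][f(x, a, e)] for e in NOISE) / 2
            x = xn
        best = max(best, val + g(x))
    return best

x0 = 0
# With h = V, the deterministic optimum on every noise path equals V_0(x0):
# the dual estimator has zero variance.
duals = [pathwise_value(x0, p, V) for p in itertools.product(NOISE, repeat=T)]
print(V[0][x0], duals)  # V_0(x0) and all four per-path values coincide (here 2.0)
```

With a generic penalty (e.g. h = 0), the pathwise optimum averaged over noise paths gives only an upper bound on the value, since the optimizer sees the whole path in advance; choosing the penalties well is what makes the deterministic per-path computation informative.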
*Nothing more than familiarity with discrete-time Markov processes is assumed.*
^Rogers, L. C. G., "Pathwise stochastic optimal control," SIAM J. Control Optim., 46 (2007).