Risk-Constrained Markov Decision Processes

Borkar V; Jain R

IEEE Transactions on Automatic Control, Vol.59, No.9, 2574-2579, 2014

DOI: 10.1109/TAC.2014.2309262

We propose a new constrained Markov decision process framework with risk-type constraints. The risk metric we use is Conditional Value-at-Risk (CVaR), which is gaining popularity in finance. It is a conditional expectation, but the conditioning is defined in terms of the level of the tail probability. We propose an iterative offline algorithm to find the risk-constrained optimal control policy. A two-time-scale, stochastic-approximation-inspired 'learning' variant is also sketched, and its convergence to the optimal risk-constrained policy is proved.
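To make the description of CVaR as a tail-conditioned expectation concrete, here is a minimal Python sketch that estimates VaR and CVaR from a finite sample. It is an illustration only and is not taken from the paper. The function name `empirical_var_cvar`, the convention that larger values are worse (losses), and the use of the standard Rockafellar-Uryasev representation CVaR = VaR + E[(L - VaR)^+] / (1 - alpha) are all assumptions made here, since the abstract does not state the exact definition used.

```python
import math


def empirical_var_cvar(losses, alpha):
    """Estimate (VaR, CVaR) at level alpha from a finite sample of losses.

    Assumptions (not from the abstract): larger values are worse, and
    CVaR is computed with the Rockafellar-Uryasev representation.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(losses)
    n = len(ordered)
    if n == 0:
        raise ValueError("losses must be non-empty")

    # VaR: smallest sample value whose empirical CDF is at least alpha.
    k = math.ceil(alpha * n)  # 1-based rank of the alpha-quantile
    var = ordered[k - 1]

    # CVaR: VaR plus the mean excess over VaR, rescaled by the tail mass.
    mean_excess = sum(max(x - var, 0.0) for x in ordered) / n
    cvar = var + mean_excess / (1.0 - alpha)
    return var, cvar


if __name__ == "__main__":
    var, cvar = empirical_var_cvar([1.0, 2.0, 3.0, 4.0], alpha=0.5)
    print(var, cvar)
```

With four equally likely losses and alpha = 0.5, the tail consists of the two largest values, so the CVaR estimate coincides with their average. In a risk-constrained setting, a quantity of this kind would be bounded by a threshold rather than minimized directly.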

Keywords: Constrained Markov decision processes; risk measures; stochastic approximations