On terminating Markov decision processes with a risk-averse objective function

Patek SD

Automatica, Vol.37, No.9, 1379-1386, 2001

DOI 10.1016/S0005-1098(01)00084-X

We consider a class of terminating Markov decision processes with an exponential risk-averse objective function and compact constraint sets. We assume the existence of an absorbing cost-free terminal state Omega, positive transition costs, and continuity of the transition probability and cost functions. Without discounting future costs in the argument of the exponential utility function, we establish (i) the existence of a real-valued optimal cost function which can be achieved by a stationary policy and (ii) the convergence of value iteration and policy iteration to the unique solution of Bellman's equation. We illustrate the results with two computational examples.
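
As a rough illustration of the value iteration mentioned above, the following minimal sketch assumes the standard multiplicative form of Bellman's equation for an exponential utility with unit risk parameter, J(i) = min_u sum_j p_ij(u) exp(g(i,u,j)) J(j) with J(Omega) = 1. That form, the function name, the data layout, and the one-state toy problem are all assumptions made for illustration; they are not taken from the paper and do not reproduce its two computational examples.

```python
import math

def risk_sensitive_value_iteration(P, G, terminal, tol=1e-10, max_iter=10_000):
    """Hypothetical helper: iterate J(i) <- min_u sum_j p_ij(u) * exp(g(i,u,j)) * J(j).

    P[i][u] maps next states to transition probabilities.
    G[i][u] maps next states to (positive) transition costs.
    J(terminal) is held at 1 because the terminal state is absorbing and cost-free.
    """
    J = {i: 1.0 for i in P}
    J[terminal] = 1.0
    for _ in range(max_iter):
        J_new = {terminal: 1.0}
        for i in P:
            # Minimize the expected exponential cost-to-go over the controls at state i.
            J_new[i] = min(
                sum(p * math.exp(G[i][u][j]) * J[j] for j, p in P[i][u].items())
                for u in P[i]
            )
        if max(abs(J_new[s] - J[s]) for s in J) < tol:
            return J_new
        J = J_new
    raise RuntimeError("value iteration did not converge within max_iter")

# Toy problem (assumed): one non-terminal state 0 and the terminal state "Omega".
# Control "a": cost 0.5 per step, terminates with probability 0.5, otherwise stays.
# Control "b": terminates immediately at cost 2.0.
P = {0: {"a": {"Omega": 0.5, 0: 0.5}, "b": {"Omega": 1.0}}}
G = {0: {"a": {"Omega": 0.5, 0: 0.5}, "b": {"Omega": 2.0}}}

J = risk_sensitive_value_iteration(P, G, terminal="Omega")

# Closed form for always using "a": J = q / (1 - q) with q = 0.5 * exp(0.5).
q = 0.5 * math.exp(0.5)
closed_form_a = q / (1.0 - q)
print(J[0], closed_form_a)
```

In this toy problem control "a" is optimal, because its closed-form cost q / (1 - q) is below exp(2.0), so the iterates should approach that value. Under the assumed formulation, J(0) is the expected exponential of the total cost, and log J(0) can be read as a certainty-equivalent cost.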

Keywords: risk sensitivity; stochastic shortest paths; dynamic programming