Average cost dynamic programming equations for controlled Markov chains with partial observations

Borkar VS

SIAM Journal on Control and Optimization, Vol.39, No.3, 673-681, 2000

DOI: 10.1137/S0363012998345172

The value function for the average cost control of a class of partially observed Markov chains is derived as the vanishing discount limit, in a suitable sense, of the value functions for the corresponding discounted cost problems. The limiting procedure is justified by bounds derived using a simple coupling argument.
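
For orientation, the vanishing discount argument named in the abstract usually takes the generic form sketched below. This is the standard heuristic, not the paper's own statement. All notation is assumed for illustration: $\pi$ is the belief state (the conditional law of the chain given the observations), $c$ the running cost, $\bar p$ the controlled transition kernel of the belief process, $\pi_0$ a fixed reference belief, $V_\alpha$ the discounted value function, and $(\rho, h)$ the optimal average cost and relative value function. The minimum is assumed to be attained.

```latex
% Discounted cost dynamic programming equation, discount factor 0 < \alpha < 1
% (assumed notation, see the paragraph above)
V_\alpha(\pi) = \min_u \Big[ c(\pi,u)
    + \alpha \int V_\alpha(\pi')\, \bar p(d\pi' \mid \pi, u) \Big]

% Relative value function with respect to the reference belief \pi_0
h_\alpha(\pi) = V_\alpha(\pi) - V_\alpha(\pi_0)

% Subtracting V_\alpha(\pi_0) from both sides of the first equation
(1-\alpha)\, V_\alpha(\pi_0) + h_\alpha(\pi) = \min_u \Big[ c(\pi,u)
    + \alpha \int h_\alpha(\pi')\, \bar p(d\pi' \mid \pi, u) \Big]

% Formal limit \alpha \uparrow 1, with (1-\alpha) V_\alpha(\pi_0) \to \rho
% and h_\alpha \to h: the average cost dynamic programming equation
\rho + h(\pi) = \min_u \Big[ c(\pi,u)
    + \int h(\pi')\, \bar p(d\pi' \mid \pi, u) \Big]
```

Passing to the limit requires control of $h_\alpha$ that is uniform in $\alpha$; per the abstract, the paper obtains the needed bounds from a coupling argument.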

Keywords: average cost control; controlled Markov chains; partial observations; dynamic programming; vanishing discount limit