Constrained dynamic programming with two discount factors: Applications and an algorithm

Feinberg EA; Shwartz A

IEEE Transactions on Automatic Control, Vol.44, No.3, 628-631, 1999

We consider a discrete-time Markov decision process in which the objectives are linear combinations of standard discounted rewards, each with a different discount factor. We describe several applications that motivate the recent interest in these criteria. For the special case in which a standard discounted cost is to be minimized subject to a constraint on another standard discounted cost with a different discount factor, we provide an implementable algorithm for computing an optimal policy.
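
To make the special case concrete, the sketch below evaluates two discounted costs with different discount factors for one fixed stationary policy and checks the constraint. It is only an illustration of the criterion, not the algorithm of the paper: the two-state chain, the cost vectors, the discount factors, the bound, and the helper name `discounted_value` are all hypothetical, and the search for an optimal policy is not attempted here.

```python
def discounted_value(P, c, beta, tol=1e-12, max_iter=100_000):
    """Evaluate v = c + beta * P v by fixed-point iteration.

    P is the transition matrix induced by a fixed stationary policy,
    c the one-step cost per state, beta a discount factor in (0, 1).
    """
    n = len(c)
    v = [0.0] * n
    for _ in range(max_iter):
        new_v = [c[s] + beta * sum(P[s][t] * v[t] for t in range(n))
                 for s in range(n)]
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Hypothetical two-state chain under some fixed stationary policy.
P = [[0.9, 0.1],
     [0.5, 0.5]]
cost_objective = [1.0, 4.0]   # cost to be minimized (assumed values)
cost_constraint = [2.0, 0.0]  # cost that is constrained (assumed values)
beta1, beta2 = 0.9, 0.5       # two different discount factors
bound = 4.0                   # assumed constraint level

v1 = discounted_value(P, cost_objective, beta1)
v2 = discounted_value(P, cost_constraint, beta2)

# Feasibility of this policy when the process starts in state 0.
feasible = v2[0] <= bound
print(v1[0], v2[0], feasible)
```

Each cost is evaluated separately with its own discount factor, which is why the two criteria cannot in general be folded into a single discounted problem with one factor.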

Keywords: MARKOV DECISION-MODELS