IEEE Transactions on Automatic Control, Vol.44, No.3, 628-631, 1999
Constrained dynamic programming with two discount factors: Applications and an algorithm
We consider a discrete-time Markov Decision Process in which the objectives are linear combinations of standard discounted rewards, each with a different discount factor. We describe several applications that motivate the recent interest in these criteria. For the special case where a standard discounted cost is to be minimized, subject to a constraint on another standard discounted cost with a different discount factor, we provide an implementable algorithm for computing an optimal policy.
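
The special case described in the abstract can be written out as a constrained optimization problem. The following is only a sketch of that formulation: the notation is an assumption for illustration and is not taken from the paper. Here $x_t$ and $a_t$ denote the state and action at time $t$, $c$ and $d$ are the two one-step cost functions, $\beta_1 \neq \beta_2$ are the two discount factors, $x_0$ is the initial state, $\pi$ is a policy, and $K$ is the constraint bound.

```latex
% Assumed notation (not from the paper): c, d are one-step costs,
% \beta_1, \beta_2 \in (0,1) are the discount factors, K is the bound.
\begin{aligned}
\min_{\pi} \quad & \mathbb{E}^{\pi}_{x_0}\!\left[ \sum_{t=0}^{\infty} \beta_1^{\,t}\, c(x_t, a_t) \right] \\
\text{subject to} \quad & \mathbb{E}^{\pi}_{x_0}\!\left[ \sum_{t=0}^{\infty} \beta_2^{\,t}\, d(x_t, a_t) \right] \le K .
\end{aligned}
```

When $\beta_1 = \beta_2$, this reduces to the usual constrained discounted problem with a single discount factor; the abstract concerns the case where the two factors differ.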