IEEE Transactions on Automatic Control, Vol.59, No.5, 1244-1257, 2014
Optimal Control of Markov Decision Processes With Linear Temporal Logic Constraints
In this paper, we develop a method to automatically generate a control policy for a dynamical system modeled as a Markov Decision Process (MDP). The control specification is given as a Linear Temporal Logic (LTL) formula over a set of propositions defined on the states of the MDP. Motivated by robotic applications requiring persistent tasks, such as environmental monitoring and data gathering, we synthesize a control policy that minimizes the expected cost between satisfying instances of a particular proposition over all policies that maximize the probability of satisfying the given LTL specification. Our approach is based on the definition of a novel optimization problem that extends the existing average cost per stage problem. We propose a sufficient condition for a policy to be optimal, and develop a dynamic programming algorithm that synthesizes a policy that is optimal for a set of LTL specifications.
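
To make the cost objective concrete, the following minimal sketch evaluates the expected cost between satisfying instances for one fixed policy. It is not the paper's synthesis algorithm. It assumes that fixing a policy reduces the MDP to an ergodic Markov chain, that a cost is incurred at every stage, and that a cycle ends at each stage spent in a satisfying state; under those assumptions the cost per cycle equals the average cost per stage divided by the long-run fraction of stages spent in satisfying states. The transition matrix `P`, the costs `g`, the set `satisfying`, and both function names are hypothetical and chosen only for illustration.

```python
def stationary_distribution(P, iterations=200):
    """Approximate the stationary distribution of an ergodic chain by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        # One step of pi <- pi * P (row vector times transition matrix).
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def average_cost_per_cycle(P, g, satisfying):
    """Expected cost between consecutive stages spent in a satisfying state.

    Assumes an ergodic chain (the MDP under one fixed policy), so that
    cost per cycle = (average cost per stage) / (fraction of stages in `satisfying`).
    """
    pi = stationary_distribution(P)
    cost_per_stage = sum(pi[i] * g[i] for i in range(len(g)))
    cycle_rate = sum(pi[i] for i in satisfying)
    return cost_per_stage / cycle_rate


# Hypothetical 3-state chain induced by some fixed policy (illustrative values).
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
]
g = [1.0, 2.0, 3.0]   # assumed per-stage cost in each state under that policy
satisfying = {2}      # assumed states where the proposition of interest holds

acpc = average_cost_per_cycle(P, g, satisfying)
print(f"average cost per cycle: {acpc:.6f}")
```

Comparing this quantity across candidate policies shows how the objective differs from the average cost per stage: a policy with low per-stage cost can still score poorly if it rarely reaches a satisfying state.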