Computers & Chemical Engineering, Vol. 33, No. 12, pp. 2111-2122, 2009
Controlled exploration of state space in off-line ADP and its application to stochastic shortest path problems
This paper addresses the problem of finding a control policy that drives a generic discrete event stochastic system from an initial state to a set of goal states with a specified probability. The control policy is iteratively constructed via an approximate dynamic programming (ADP) technique over a small subset of the state space that is evolved via Monte Carlo simulations. The effect of certain user-chosen parameters on the performance of the algorithm is investigated. The method is evaluated on several stochastic shortest path (SSP) examples and on a manufacturing job shop problem. We solve SSP problems containing up to one million states to illustrate how the computational and memory savings scale with problem size. In the manufacturing job shop example, the proposed ADP approach outperforms a traditional rolling horizon mathematical programming approach.
Keywords: Approximate dynamic programming; Markov decision process; Discrete time stochastic systems; Simulation; Controlled exploration
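
To make the general idea concrete, the following Python sketch shows simulation-based approximate value iteration restricted to an explored subset of states, in the spirit of the approach summarized in the abstract. The toy SSP model (a one-dimensional walk toward a goal state), the action names "slow" and "fast", and all parameter values are hypothetical illustrations introduced here for exposition; they are not the paper's algorithm, examples, or tuning.

import random

# Hypothetical toy SSP: actions move the state toward goal state N with some
# probability; every step incurs unit cost. All names/parameters are illustrative.
N = 20            # goal state (assumption, not from the paper)
P_FORWARD = 0.8   # success probability of the "slow" action
STEP_COST = 1.0
ACTIONS = ["slow", "fast"]

def simulate(state, action):
    """Sample one successor state for (state, action)."""
    if action == "fast":
        return min(state + 2, N) if random.random() < 0.6 else max(state - 1, 0)
    return min(state + 1, N) if random.random() < P_FORWARD else state

def successors(state, action, n_samples=30):
    """Monte Carlo estimate of the successor distribution of (state, action)."""
    counts = {}
    for _ in range(n_samples):
        s2 = simulate(state, action)
        counts[s2] = counts.get(s2, 0) + 1
    return {s2: c / n_samples for s2, c in counts.items()}

def greedy_action(s, V, default):
    """Action minimizing estimated one-step cost-to-go under current V."""
    return min(
        ACTIONS,
        key=lambda a: sum(p * V.get(s2, default)
                          for s2, p in successors(s, a).items()),
    )

def adp(start=0, n_iters=15, rollouts=5, horizon=50):
    """Approximate value iteration over an explored subset of states.

    The explored set grows via Monte Carlo rollouts under the current greedy
    policy; unexplored states fall back to a pessimistic default value, so
    Bellman backups never touch the full state space.
    """
    explored = {start}
    V = {start: 0.0}
    default = float(N)  # crude upper bound on cost-to-go for unexplored states
    for _ in range(n_iters):
        # 1. Grow the explored set with greedy-policy rollouts.
        for _ in range(rollouts):
            s = start
            for _ in range(horizon):
                if s == N:
                    break
                s = simulate(s, greedy_action(s, V, default))
                explored.add(s)
        # 2. One Bellman backup sweep restricted to the explored set.
        for s in sorted(explored):
            if s == N:
                V[s] = 0.0
                continue
            V[s] = min(
                STEP_COST + sum(p * V.get(s2, default)
                                for s2, p in successors(s, a).items())
                for a in ACTIONS
            )
    return V

if __name__ == "__main__":
    V = adp()
    print(f"Estimated cost-to-go from the start state: {V[0]:.2f}")

The point the sketch tries to capture is that backups touch only states reached by simulated trajectories under the evolving greedy policy, so memory and computation grow with the explored subset rather than with the full state space.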