Automatica, Vol. 79, 108-114, 2017
A stability criterion for two timescale stochastic approximation schemes
We present the first sufficient conditions that guarantee stability of two-timescale stochastic approximation schemes. Our analysis is based on the ordinary differential equation (ODE) method and is an extension of the results in Borkar and Meyn (2000) for single-timescale schemes. As an application of our result, we show the stability of iterates in a two-timescale stochastic approximation scheme arising in reinforcement learning. (C) 2016 Published by Elsevier Ltd.
Keywords: Simulation; Two-timescale stochastic approximation; Stability of iterates; Limiting ODE; Reinforcement learning
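
For readers unfamiliar with the setting, the following is a minimal sketch of what a two-timescale stochastic approximation scheme looks like. It is not taken from the paper: the drift functions `h` and `g`, the step sizes, the noise level, and the function name `two_timescale_sa` are all illustrative assumptions. The fast iterate `x` uses a step size that decays more slowly than the step size of the slow iterate `y`, so their ratio tends to zero. In this toy linear case the iterates stay bounded and settle near the origin; the stability criterion described in the abstract is about guaranteeing such boundedness in general.

```python
import random


def two_timescale_sa(n_iters=20000, seed=0, noise_std=0.1):
    """Toy two-timescale stochastic approximation (illustrative assumption).

    Fast iterate x uses step a_n = n**-0.6; slow iterate y uses b_n = 1/n,
    so b_n / a_n -> 0. The drifts h and g below are hypothetical choices.
    """
    rng = random.Random(seed)
    x, y = -3.0, 5.0  # arbitrary initial conditions
    for n in range(1, n_iters + 1):
        a_n = n ** -0.6  # fast-timescale step size
        b_n = 1.0 / n    # slow-timescale step size
        h = y - x        # fast drift: for frozen y, x is pulled toward y
        g = -x           # slow drift: with x close to y, y is pulled toward 0
        m1 = rng.gauss(0.0, noise_std)  # zero-mean noise on the fast update
        m2 = rng.gauss(0.0, noise_std)  # zero-mean noise on the slow update
        # Both iterates are updated simultaneously from the old (x, y).
        x, y = x + a_n * (h + m1), y + b_n * (g + m2)
    return x, y
```

Calling `two_timescale_sa()` returns the final pair `(x, y)`. The seed is fixed only so that repeated runs are reproducible.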