Chemical Engineering and Materials Research Information Center
Search results: 6
No. Article
1 Renewal Monte Carlo: Renewal Theory-Based Reinforcement Learning
Subramanian J, Mahajan A
IEEE Transactions on Automatic Control, 65(8), 3663, 2020
2 Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
Zhang KQ, Koppel A, Zhu H, Basar T
SIAM Journal on Control and Optimization, 58(6), 3586, 2020
3 Deep reinforcement learning of energy management with continuous control strategy and traffic information for a series-parallel plug-in hybrid electric bus
Wu YK, Tan HC, Peng JK, Zhang HL, He HW
Applied Energy, 247, 454, 2019
4 Sequential Decision Making With Coherent Risk
Tamar A, Chow Y, Ghavamzadeh M, Mannor S
IEEE Transactions on Automatic Control, 62(7), 3323, 2017
5 Optimization of Markov decision processes under the variance criterion
Xia L
Automatica, 73, 269, 2016
6 Natural actor-critic algorithms
Bhatnagar S, Sutton RS, Ghavamzadeh M, Lee M
Automatica, 45(11), 2471, 2009