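Chemical Engineering and Materials Research Information Center (CHERIC)
Society Korean Institute of Chemical Engineers (KIChE)
Conference Fall 2021 (Oct 27 to Oct 29, Kimdaejung Convention Center, Gwangju)
Volume/Issue Vol. 27, No. 2, p. 1482
Session Process Systems
Title Q-MPC: Integrated Algorithm of Model-free Reinforcement Learning and Model Predictive Control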
Abstract Optimizing the substrate feeding strategy of a bioreactor is a challenging task. In this work, we propose an algorithm that integrates model-free reinforcement learning (RL) with model predictive control (MPC) and improves an initial control policy using only a small number of data points. Like MPC, the proposed algorithm adopts the receding horizon principle, and it assigns the action-value function, which is learned from plant data, as the terminal cost. In this way, the controller adapts to the system dynamics without any modification of the model. The action-value function itself is learned off-policy with the conventional deep reinforcement learning with double Q-learning (DDQN) algorithm. The proposed method is a generalization of both DDQN and MPC. In a simulation study, the proposed method is applied to a semi-batch penicillin production bioprocess whose system dynamics are structurally different from the model used in the MPC. For comparison, DDQN, deep deterministic policy gradient (DDPG), and differential dynamic programming (DDP) algorithms are applied to the bioprocess under the same conditions.
Authors 오태훈 (Tae Hoon Oh), 이종민 (Jong Min Lee)
Affiliation Seoul National University
Keywords Process control
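The two ingredients named in the abstract, a receding-horizon search whose terminal cost is the learned action-value function and a double Q-learning target, can be illustrated with a minimal sketch. This is not the authors' implementation: the discrete action set, the brute-force enumeration of action sequences, the reward-maximizing (rather than cost-minimizing) sign convention, the discount factor, and all function names (`q_mpc_action`, `double_q_target`, `toy_model`, `toy_reward`, `toy_q`) are assumptions made for illustration only. Under these assumptions, a horizon of zero reduces to the greedy policy on Q, which is one way to read the claim that the method generalizes both DDQN and MPC.

```python
import itertools


def q_mpc_action(x0, model, reward, q_fn, actions, horizon, gamma=0.99):
    """Return (first action, value) of the best action sequence.

    The nominal model is rolled out for `horizon` steps, and the learned
    action-value function q_fn scores the state reached at the end.
    All names and the enumeration strategy are hypothetical.
    """
    best_value, best_first = float("-inf"), None
    # Each sequence has horizon + 1 actions: `horizon` model steps
    # plus one terminal action that is evaluated by q_fn.
    for seq in itertools.product(actions, repeat=horizon + 1):
        x, value = x0, 0.0
        for k in range(horizon):
            value += gamma**k * reward(x, seq[k])
            x = model(x, seq[k])
        value += gamma**horizon * q_fn(x, seq[horizon])
        if value > best_value:
            best_value, best_first = value, seq[0]
    return best_first, best_value


def double_q_target(r, x_next, q_online, q_target, actions, gamma=0.99, done=False):
    """Double Q-learning target: the online network selects the next
    action, and the target network evaluates it."""
    if done:
        return r
    a_star = max(actions, key=lambda a: q_online(x_next, a))
    return r + gamma * q_target(x_next, a_star)


# Toy scalar problem (purely illustrative, unrelated to the bioprocess).
ACTIONS = (-1.0, 0.0, 1.0)


def toy_model(x, u):
    return x + u  # nominal model used inside the horizon


def toy_reward(x, u):
    return -(x**2) - 0.1 * u**2  # stage reward: stay near zero cheaply


def toy_q(x, u):
    return -((x + u) ** 2)  # stand-in for a learned action-value function


if __name__ == "__main__":
    first_action, _ = q_mpc_action(2.0, toy_model, toy_reward, toy_q, ACTIONS, horizon=2)
    print("first action applied to the plant:", first_action)
```

In a receding-horizon loop, only the returned first action would be applied to the plant; the observed transition would then be stored and used to update the action-value function through `double_q_target`, while the nominal model stays unchanged.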