TY - JOUR
T1 - Linear quadratic tracking control of unknown systems
T2 - A two-phase reinforcement learning method
AU - Zhao, Jianguo
AU - Yang, Chunyu
AU - Gao, Weinan
AU - Modares, Hamidreza
AU - Chen, Xinkai
AU - Dai, Wei
N1 - Funding Information:
Wei Dai received the M.S. and Ph.D. degrees in control theory and control engineering from Northeastern University, Shenyang, China, in 2009 and 2015, respectively. From 2013 to 2015, he was a Teaching Assistant with the State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University. He is currently a Professor and an Outstanding Young Backbone Teacher with the China University of Mining and Technology, Xuzhou, China. He has served as project leader on several funded research projects (supported by the National Natural Science Foundation of China, the Natural Science Foundation of Jiangsu Province, the Postdoctoral Science Foundation of China, among others). His current research interests include modeling, optimization, and control of complex industrial processes, data mining, and machine learning.
Funding Information:
This work was supported in part by the National Natural Science Foundation of China under Grant 61873272, Grant 62073327, and Grant 62273350, and in part by the Natural Science Foundation of Jiangsu Province under Grant BK20200086 and Grant BK20200631. The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Kyriakos G. Vamvoudakis under the direction of Editor Miroslav Krstic.
Publisher Copyright:
© 2022
PY - 2023/2
Y1 - 2023/2
N2 - This paper considers the problem of linear quadratic tracking control (LQTC) with a discounted cost function for unknown systems. Existing design methods often require the discount factor to be small enough to guarantee closed-loop stability. However, solving the discounted algebraic Riccati equation (ARE) may suffer from numerical ill-conditioning if the discount factor is too small. Using singular perturbation theory, we decompose the full-order discounted ARE into a reduced-order ARE and a Sylvester equation, which facilitates the design of the feedback and feedforward control gains. The obtained controller is proved to be a stabilizing and near-optimal solution to the original LQTC problem. Within the framework of reinforcement learning, both on-policy and off-policy two-phase learning algorithms are derived to design the near-optimal tracking control policy without knowing the discount factor. The advantages of the developed results are illustrated by comparative simulation results.
AB - This paper considers the problem of linear quadratic tracking control (LQTC) with a discounted cost function for unknown systems. Existing design methods often require the discount factor to be small enough to guarantee closed-loop stability. However, solving the discounted algebraic Riccati equation (ARE) may suffer from numerical ill-conditioning if the discount factor is too small. Using singular perturbation theory, we decompose the full-order discounted ARE into a reduced-order ARE and a Sylvester equation, which facilitates the design of the feedback and feedforward control gains. The obtained controller is proved to be a stabilizing and near-optimal solution to the original LQTC problem. Within the framework of reinforcement learning, both on-policy and off-policy two-phase learning algorithms are derived to design the near-optimal tracking control policy without knowing the discount factor. The advantages of the developed results are illustrated by comparative simulation results.
KW - Discounted cost function
KW - Linear quadratic tracking control
KW - Reinforcement learning
KW - Singular perturbation theory
UR - http://www.scopus.com/inward/record.url?scp=85142830992&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85142830992&partnerID=8YFLogxK
U2 - 10.1016/j.automatica.2022.110761
DO - 10.1016/j.automatica.2022.110761
M3 - Article
AN - SCOPUS:85142830992
SN - 0005-1098
VL - 148
JO - Automatica
JF - Automatica
M1 - 110761
ER -