Power, Control, and Data Processing Systems

The Formulation of a Linear Quadratic Regulator (LQR) for a System Characterized by Partial Differential Equations (PDE) Utilizing Reinforcement Learning Technique

Document Type: Original Research

Authors
Electrical and Computer Engineering Department, Qom University of Technology
Abstract
Reinforcement learning (RL) has emerged as a valuable tool in control theory and related areas such as robotics and process control, where it supports prediction, identification, and control of complex systems. A key advantage of RL is that it can derive optimal control policies without a complete model of the underlying system dynamics: it develops control strategies through interaction with the environment. For systems governed by partial differential equations (PDEs), controller design commonly relies on approximation methods. Because PDEs are difficult to solve analytically, numerical techniques such as the finite element method are typically used to approximate them. In this research, we first discretize the PDE that describes the system's dynamics using an appropriate discretization technique and then extract the resulting discrete dynamics. These discrete dynamics have the Markov property, meaning that the future state depends only on the current state and the action taken, so the next step is to design a controller based on them. Because reinforcement learning optimizes future actions from data, the optimal feedback it yields is a viable option for controller design. The study applies this approach to linear quadratic regulator (LQR) design for the one-dimensional heat equation, as an example of a system whose dynamics are described by a PDE.
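The pipeline described in the abstract (discretize the PDE, extract discrete dynamics, design an LQR controller) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a central-difference discretization with forward Euler time stepping rather than the finite element method, zero Dirichlet boundary conditions, and a single hypothetical point heat source at the middle node. The grid size, diffusivity, and weights Q and R are illustrative values. The gain is computed by model-based Riccati iteration, which is the solution a model-free RL method would aim to approximate from data.

```python
import numpy as np

def heat_dynamics(n=10, alpha=1.0, length=1.0):
    """Discretize u_t = alpha * u_xx on n interior nodes (assumed zero
    Dirichlet ends) into x[k+1] = A x[k] + B u[k]."""
    dx = length / (n + 1)
    dt = 0.4 * dx**2 / alpha  # below the explicit limit dx^2 / (2 * alpha)
    lap = (np.diag(-2.0 * np.ones(n))
           + np.diag(np.ones(n - 1), 1)
           + np.diag(np.ones(n - 1), -1)) / dx**2
    A = np.eye(n) + dt * alpha * lap
    B = np.zeros((n, 1))
    B[n // 2, 0] = dt  # hypothetical actuator: heat input at the middle node
    return A, B

def lqr_gain(A, B, Q, R, iters=5000, tol=1e-10):
    """Iterate the discrete-time Riccati recursion until P stops changing."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ (A - B @ K)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    # Gain consistent with the final P; the control law is u[k] = -K x[k].
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

A, B = heat_dynamics()
Q = np.eye(A.shape[0])    # illustrative state weight
R = 0.01 * np.eye(1)      # illustrative control weight
K, P = lqr_gain(A, B, Q, R)

# Simulate the closed loop from a sinusoidal initial temperature profile.
x = np.sin(np.pi * np.linspace(0.0, 1.0, A.shape[0] + 2)[1:-1])
for _ in range(200):
    x = (A - B @ K) @ x
print("state norm after 200 steps:", np.linalg.norm(x))
```

Because the state at step k+1 depends only on the state and input at step k, the discretized model has the Markov property the abstract relies on, which is what allows the same problem to be posed as an RL task.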

Volume 3, Issue 1
Winter 2026
Pages 1-12

  • Receive Date 12 July 2025
  • Revise Date 23 November 2025
  • Accept Date 02 December 2025
  • First Publish Date 02 December 2025
  • Publish Date 01 March 2026