A Self-adaptive LSAC-PID Approach based on Lyapunov Reward Shaping for Mobile Robots
Xinyi Yu, Siyu Xu, Yuehai Fan, Linlin Ou

TL;DR
This paper introduces a self-adaptive RL-based PID control method for mobile robots that uses Lyapunov reward shaping to enhance convergence and stability, demonstrating superior performance in simulations and real environments.
Contribution
It proposes a novel RL-based MIMO PID control strategy with Lyapunov reward shaping for real-time parameter tuning in complex mobile robot environments.
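As a rough illustration of the control structure described here, the sketch below shows a discrete MIMO PID law whose gains are supplied afresh at every step by an external policy. It is a minimal sketch under stated assumptions, not the authors' implementation: the class `MimoPid`, the function `placeholder_policy`, the one-gain-triple-per-loop layout, and the fixed gain values are all hypothetical.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class MimoPid:
    """Discrete PID with one (Kp, Ki, Kd) triple per control loop (assumed layout)."""
    n_loops: int
    dt: float
    integral: np.ndarray = field(init=False)
    prev_error: np.ndarray = field(init=False)

    def __post_init__(self):
        self.integral = np.zeros(self.n_loops)
        self.prev_error = np.zeros(self.n_loops)

    def step(self, error, gains):
        """error has shape (n_loops,); gains has shape (n_loops, 3) as [Kp, Ki, Kd]."""
        error = np.asarray(error, dtype=float)
        gains = np.asarray(gains, dtype=float)
        # Accumulate the integral term and take a backward-difference derivative.
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error.copy()
        kp, ki, kd = gains[:, 0], gains[:, 1], gains[:, 2]
        return kp * error + ki * self.integral + kd * derivative


def placeholder_policy(observation):
    """Hypothetical stand-in for a trained RL actor; returns fixed gains."""
    return np.array([[1.0, 0.1, 0.05],
                     [2.0, 0.0, 0.10]])


pid = MimoPid(n_loops=2, dt=0.1)
tracking_error = np.array([1.0, -0.5])
control = pid.step(tracking_error, placeholder_policy(tracking_error))
```

In a learned setup, `placeholder_policy` would be replaced by the actor network, so the gains can change at each control step instead of being tuned once offline.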
Findings
Improved convergence speed and stability of mobile robots.
Effective real-time PID parameter tuning without mathematical models.
Successful validation through simulations and real-world tests.
Abstract
To solve the coupling problem of control loops and the adaptive parameter tuning problem in multi-input multi-output (MIMO) PID control systems, this paper proposes a self-adaptive LSAC-PID algorithm based on deep reinforcement learning (RL) and Lyapunov-based reward shaping. For complex and unknown mobile robot control environments, an RL-based MIMO PID hybrid control strategy is first presented. Using the dynamic information and environmental feedback of the mobile robot, the RL agent outputs the optimal MIMO PID parameters in real time, without knowing the mathematical model and without decoupling the multiple control loops. Then, to improve the convergence speed of RL and the stability of mobile robots, a Lyapunov-based reward shaping soft actor-critic (LSAC) algorithm is proposed based on Lyapunov theory and the potential-based reward shaping method. The convergence and optimality…
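Potential-based reward shaping, in its standard form, adds the term γΦ(s') − Φ(s) to the environment reward. The sketch below shows one way a Lyapunov-style function could serve as the potential. The abstract does not give the authors' Lyapunov function, so the quadratic candidate V(e) = eᵀPe, the choice Φ = −V, the matrix `P`, and the function names are all assumptions made for illustration.

```python
import numpy as np


def lyapunov_candidate(error, P):
    """Assumed quadratic candidate V(e) = e^T P e, with P positive definite."""
    error = np.asarray(error, dtype=float)
    return float(error @ P @ error)


def shaped_reward(reward, error, next_error, P, gamma=0.99):
    """Add the potential-based term gamma * Phi(s') - Phi(s), with Phi = -V (assumed)."""
    phi = -lyapunov_candidate(error, P)
    phi_next = -lyapunov_candidate(next_error, P)
    return reward + gamma * phi_next - phi


P = np.eye(2)  # hypothetical weighting matrix
toward_goal = shaped_reward(0.0, [1.0, 0.0], [0.5, 0.0], P, gamma=1.0)
away_from_goal = shaped_reward(0.0, [0.5, 0.0], [1.0, 0.0], P, gamma=1.0)
```

With this sign convention, a transition that lowers V earns a positive bonus and one that raises V is penalized, which is the intuition behind using a Lyapunov function to steer the agent toward stabilizing behavior.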
Taxonomy
Topics: Extremum Seeking Control Systems · Adaptive Dynamic Programming Control · Iterative Learning Control Systems
