TL;DR
This paper compares several reinforcement learning (RL) algorithms for controlling the cartpole system without prior knowledge of its dynamics, evaluates them against the standard LQR solution, and proposes a novel approach for integrating RL with swing-up controllers.
Contribution
It provides a comparative analysis of RL algorithms for cartpole control and proposes a new method combining RL with swing-up controllers.
Findings
RL algorithms perform comparably to LQR in control tasks
The proposed RL-swing-up integration improves control efficiency
Different RL methods show varying strengths in nonlinear control
Abstract
Designing optimal controllers continues to be challenging as systems become more complex and are inherently nonlinear. The principal advantage of reinforcement learning (RL) is its ability to learn from interaction with the environment and provide an optimal control strategy. In this paper, RL is explored in the context of controlling the benchmark cartpole dynamical system with no prior knowledge of the dynamics. RL algorithms such as temporal-difference, policy gradient actor-critic, and value function approximation are compared in this context with the standard LQR solution. Further, we propose a novel approach to integrate RL and swing-up controllers.
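The abstract names temporal-difference learning among the compared methods but does not say which variant the paper uses. As a minimal sketch, assuming a tabular Q-learning form of the TD update over a discretized cartpole state, a single update step might look like the following. The bin indices, the two-action set, and the `alpha`/`gamma` values are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict

# Hypothetical action set: push the cart left (0) or right (1).
ACTIONS = (0, 1)


def td_update(q, state, action, reward, next_state,
              alpha=0.5, gamma=0.9, done=False):
    """Apply one Q-learning (off-policy TD(0)) update in place.

    Returns the TD error so the caller can monitor learning.
    """
    # Bootstrap from the best next action unless the episode has ended.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


# Q-table with a default value of 0.0 for unseen (state, action) pairs.
q = defaultdict(float)

# States are illustrative bin indices of (cart position, pole angle).
first_error = td_update(q, state=(2, 3), action=1, reward=1.0, next_state=(2, 4))
second_error = td_update(q, state=(2, 2), action=0, reward=1.0, next_state=(2, 3))
```

The update needs only observed transitions `(state, action, reward, next_state)`, which is what lets such a method work without a model of the cartpole dynamics, in contrast to LQR, which requires a linearized model.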
