NFQ2.0: The CartPole Benchmark Revisited
Sascha Lange, Roland Hafner, Martin Riedmiller

TL;DR
This paper revisits the classic NFQ algorithm on the CartPole benchmark, introduces NFQ2.0 to improve stability and reproducibility, and demonstrates its effectiveness on a real-world industrial control system.
Contribution
The paper presents NFQ2.0, a modernized variant of NFQ, with ablation studies and practical insights for applying deep RL in industrial settings.
Findings
NFQ2.0 improves stability over the original NFQ.
Key hyperparameters significantly affect performance.
Reproducibility is enhanced with the proposed modifications.
Abstract
This article revisits the 20-year-old neural fitted Q-iteration (NFQ) algorithm on its classical CartPole benchmark. NFQ was a pioneering approach towards modern Deep Reinforcement Learning (Deep RL) in applying multi-layer neural networks to reinforcement learning for real-world control problems. We explore the algorithm's conceptual simplicity and its transition from online to batch learning, which contributed to its stability. Despite its initial success, NFQ required extensive tuning and was not easily reproducible on real-world control problems. We propose a modernized variant, NFQ2.0, and apply it to the CartPole task, concentrating on a real-world system built from standard industrial components, to investigate and improve the learning process's repeatability and robustness. Through ablation studies, we highlight key design decisions and hyperparameters that enhance performance and…
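The batch-learning idea mentioned in the abstract can be illustrated with a minimal fitted Q-iteration sketch. This is not the authors' NFQ or NFQ2.0 implementation: NFQ fits a multi-layer neural network, whereas this sketch swaps in a least-squares fit on one-hot features so the example stays small and deterministic. The five-state chain task, the cost-minimizing formulation, and all names (`step`, `features`, `fitted_q_iteration`) are assumptions made for illustration only.

```python
import numpy as np

N_STATES, N_ACTIONS, GOAL, GAMMA = 5, 2, 4, 0.9  # hypothetical toy task


def step(s, a):
    """Toy chain dynamics: action 0 moves left, action 1 moves right."""
    s_next = max(s - 1, 0) if a == 0 else s + 1
    done = s_next == GOAL
    cost = 0.0 if done else 1.0  # every step costs 1 until the goal is reached
    return s_next, cost, done


def features(s, a):
    """One-hot encoding of a (state, action) pair."""
    x = np.zeros(N_STATES * N_ACTIONS)
    x[s * N_ACTIONS + a] = 1.0
    return x


def q_values(w, s):
    """Q-estimates for every action in state s under weights w."""
    return np.array([features(s, a) @ w for a in range(N_ACTIONS)])


def fitted_q_iteration(batch, n_iterations=50):
    """Repeatedly regress Q onto bootstrapped targets over a fixed batch."""
    w = np.zeros(N_STATES * N_ACTIONS)
    X = np.stack([features(s, a) for s, a, _, _, _ in batch])
    for _ in range(n_iterations):
        # Targets are computed for the whole batch with the weights frozen.
        y = np.array([
            c if done else c + GAMMA * q_values(w, s2).min()
            for _, _, c, s2, done in batch
        ])
        # Supervised fit on the full batch (stand-in for training a network).
        w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


# Collect the batch once, up front: one transition per (state, action) pair.
batch = []
for s in range(GOAL):
    for a in range(N_ACTIONS):
        s_next, cost, done = step(s, a)
        batch.append((s, a, cost, s_next, done))

w = fitted_q_iteration(batch)
policy = [int(np.argmin(q_values(w, s))) for s in range(GOAL)]
print(policy)
```

The point of the sketch is the loop structure: the transition data is fixed, targets for the whole batch are computed from the current estimate, and the function approximator is then refit on all of them at once rather than updated after every single transition.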
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Robot Manipulation and Learning
