Chebyshev Policies and the Mountain Car Problem: Reinforcement Learning for Low-Dimensional Control Tasks

Stefan Huber; Hannes Unger; Georg Schäfer; Jakob Rehrl

arXiv:2605.22305·cs.LG·May 22, 2026

TL;DR

This paper analytically solves the Mountain Car problem, introduces Chebyshev policies as a parameter-efficient alternative to neural-net policies in RL, and shows that they outperform neural nets across multiple control tasks.

Contribution

The paper provides the first optimal control solution for Mountain Car and introduces Chebyshev policies as a universal (i.e. dense), parameter-efficient, and explainable alternative to neural networks in RL.

Findings

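01 Optimal control solution for Mountain Car derived analytically, closing a 36-year gap

02 Chebyshev policies reduce regret by a factor of 4.18 and need 277 times fewer parameters than neural nets

03 Chebyshev policies consistently outperform neural nets trained with PPO, ARS and REINFORCE on further RL tasks, including a real-world nonlinear motion control testbed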

Abstract

We analytically solve the Mountain Car problem, a canonical benchmark in RL, and derive an optimal control solution, closing a gap after 36 years. This enables us to reveal two surprising insights: The optimal control is quite simple, yet modern RL agents display a large gap to optimality. Motivated by the analysis of the optimal control, we introduce Chebyshev policies as a universal (i.e. dense) class of RL policies from first principles. They can be trained as drop-in replacements of neural nets, reducing the regret by a factor of 4.18, while requiring 277 times fewer parameters, fostering sample efficiency, explainability and realtime capability. Chebyshev policies are evaluated on further RL tasks, including a real-world nonlinear motion control testbed. They consistently improve performance over neural nets with PPO, ARS and REINFORCE. Our results demonstrate how Chebyshev…
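
To make the idea of a polynomial policy with very few parameters concrete, here is a minimal sketch of what a Chebyshev-style policy for a two-dimensional state could look like. It is an illustration under stated assumptions, not the authors' formulation: the tensor-product form, the clipping of the action to [-1, 1], the state bounds (taken from Gymnasium's MountainCarContinuous-v0 environment), and the names `to_unit` and `chebyshev_policy` are all hypothetical choices that the abstract does not specify.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# ASSUMPTION: state bounds of Gymnasium's MountainCarContinuous-v0,
# not necessarily the setup used in the paper.
POS_MIN, POS_MAX = -1.2, 0.6
VEL_MIN, VEL_MAX = -0.07, 0.07


def to_unit(value, low, high):
    """Affinely map [low, high] onto [-1, 1], the domain of Chebyshev polynomials."""
    return 2.0 * (value - low) / (high - low) - 1.0


def chebyshev_policy(coeffs, position, velocity):
    """Hypothetical policy: action = clip(sum_ij c_ij * T_i(x) * T_j(v), -1, 1).

    coeffs has shape (deg_x + 1, deg_v + 1); its entries are the only
    trainable parameters of the policy.
    """
    x = to_unit(position, POS_MIN, POS_MAX)
    v = to_unit(velocity, VEL_MIN, VEL_MAX)
    raw = C.chebval2d(x, v, coeffs)  # evaluates the 2-D Chebyshev series
    return float(np.clip(raw, -1.0, 1.0))


# Degree-(1, 1) example with 4 parameters. Only T_0(x) * T_1(v) = v is active,
# so the action equals the normalized velocity: push in the direction of motion.
coeffs = np.array([[0.0, 1.0],
                   [0.0, 0.0]])

for pos, vel in [(-0.5, 0.07), (-0.5, -0.07), (0.0, 0.0)]:
    print(pos, vel, chebyshev_policy(coeffs, pos, vel))
```

In a sketch like this the coefficient matrix plays the role that network weights play in a neural-net policy, so a policy-search method can in principle optimize it directly, and each coefficient can be read as the weight of one polynomial term.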
