Continuous-Time Fitted Value Iteration for Robust Policies
Michael Lutter, Boris Belousov, Shie Mannor, Dieter Fox, Animesh Garg, Jan Peters

TL;DR
This paper introduces continuous fitted value iteration (cFVI) and robust fitted value iteration (rFVI) for solving the Hamilton-Jacobi-Bellman and Hamilton-Jacobi-Isaacs equations in continuous control. The algorithms yield optimal and robust policies without discretization and are demonstrated on physical systems such as pendulums and cartpoles.
Contribution
The paper develops closed-form solutions for optimal policies and adversaries in continuous control, simplifying the solution of Hamilton-Jacobi equations and enabling real-world robustness.
Findings
Algorithms achieve optimal policies in control tasks
Robust FVI outperforms deep RL in perturbation scenarios
Methods work effectively on physical systems like pendulums
Abstract
Solving the Hamilton-Jacobi-Bellman equation is important in many domains including control, robotics and economics. Especially for continuous control, solving this differential equation and its extension, the Hamilton-Jacobi-Isaacs equation, is important as it yields the optimal policy that achieves the maximum reward on a given task. In the case of the Hamilton-Jacobi-Isaacs equation, which includes an adversary controlling the environment and minimizing the reward, the obtained policy is also robust to perturbations of the dynamics. In this paper we propose continuous fitted value iteration (cFVI) and robust fitted value iteration (rFVI). These algorithms leverage the non-linear control-affine dynamics and separable state and action reward of many continuous control problems to derive the optimal policy and optimal adversary in closed form. This analytic expression simplifies the…
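To make the closed-form policy idea concrete, here is a minimal sketch under stated assumptions, not the authors' implementation. It assumes control-affine dynamics x' = a(x) + B(x) u and a separable reward r(x, u) = q(x) - g(u) with a quadratic action cost g(u) = 0.5 uᵀ R u, which is only one possible choice of g. Under those assumptions, maximizing ∇V(x)ᵀ B(x) u - g(u) over u gives u* = R⁻¹ B(x)ᵀ ∇V(x). The function name `greedy_action` and all numeric values are hypothetical and chosen for illustration.

```python
import numpy as np

def greedy_action(B, grad_V, R):
    """Closed-form maximizer of grad_V^T B u - 0.5 u^T R u over u.

    Assumes a quadratic action cost with positive definite R
    (an illustrative assumption, not taken from the abstract).
    """
    # Solve R u = B^T grad_V instead of forming the inverse of R.
    return np.linalg.solve(R, B.T @ grad_V)

# Hypothetical 2-state, 1-input system evaluated at a single state.
B = np.array([[0.0], [1.0]])     # control matrix B(x)
R = np.array([[2.0]])            # action cost weight
grad_V = np.array([0.5, -4.0])   # value function gradient dV/dx

u_star = greedy_action(B, grad_V, R)

def objective(u):
    # Action-dependent part of the Hamiltonian under the assumptions above.
    return grad_V @ (B @ u) - 0.5 * u @ R @ u
```

Because the action follows directly from the value function gradient, no separate optimization over actions is needed at each state in this setting.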
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics
