Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics
Matthias Weissenbacher, Samarth Sinha, Animesh Garg, Yoshinobu Kawahara

TL;DR
This paper introduces Koopman Q-learning, a novel offline reinforcement learning method that leverages symmetries in system dynamics via Koopman theory to improve generalization and performance.
Contribution
It proposes a new data augmentation framework based on Koopman symmetries, enabling better generalization in offline RL by exploiting system dynamics.
Findings
Consistently outperforms state-of-the-art model-free Q-learning methods.
Effective in various benchmark offline RL tasks including D4RL, Metaworld, and Robosuite.
Provides theoretical insights into symmetries in control systems.
Abstract
Offline reinforcement learning leverages large datasets to train policies without interactions with the environment. The learned policies may then be deployed in real-world settings where interactions are costly or dangerous. Current algorithms over-fit to the training dataset and as a consequence perform poorly when deployed to out-of-distribution generalizations of the environment. We aim to address these limitations by learning a Koopman latent representation which allows us to infer symmetries of the system's underlying dynamic. The latter is then utilized to extend the otherwise static offline dataset during training; this constitutes a novel data augmentation framework which reflects the system's dynamic and is thus to be interpreted as an exploration of the environment's phase space. To obtain the symmetries we employ Koopman theory in which nonlinear dynamics are represented in…
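The augmentation idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes an identity encoder and decoder, a hypothetical 2D Koopman operator `K` (a planar rotation), and a candidate symmetry `S` chosen by hand. It only shows the general principle that a latent-space map commuting with `K` sends valid transitions to valid transitions, so it can be used to generate extra `(state, next_state)` pairs from a fixed dataset.

```python
import numpy as np

def rotation(theta):
    """Return a 2x2 rotation matrix (used here as a toy linear operator)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Assumption: the latent dynamics are linear, z_next = K z, and the
# encoder/decoder are the identity, so states and latents coincide.
K = rotation(0.3)

# A small static "offline dataset" of transitions generated by K.
rng = np.random.default_rng(0)
states = rng.normal(size=(5, 2))
next_states = states @ K.T

# Candidate symmetry: another rotation. Rotations in the plane commute,
# so S K = K S, which is the property the augmentation relies on.
S = rotation(1.1)
commutes = np.allclose(S @ K, K @ S)

# Augment both ends of every transition with the same symmetry map.
aug_states = states @ S.T
aug_next_states = next_states @ S.T

# Because S commutes with K, the augmented pairs still obey the dynamics.
consistent = np.allclose(aug_next_states, aug_states @ K.T)

# Counter-example: a reflection does not commute with this K, so using it
# would produce transitions that violate the dynamics.
R = np.diag([1.0, -1.0])
reflection_commutes = np.allclose(R @ K, K @ R)
reflection_consistent = np.allclose((next_states @ R.T), (states @ R.T) @ K.T)

print(commutes, consistent, reflection_commutes, reflection_consistent)
```

In a learned setting the encoder, decoder, and operator would be trained from data and the symmetry maps would have to be inferred rather than picked by hand; the commutation check above is the part that carries over.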
Taxonomy
Topics: Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis
Methods: Q-Learning
