Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning
Mohammadreza Nakhaei, Aidan Scannell, Joni Pajarinen

TL;DR
This paper introduces a residual learning and context encoding method to adapt offline reinforcement learning models to environments with changing dynamics during online fine-tuning, demonstrating improved adaptability and sample efficiency.
Contribution
The paper proposes a novel residual learning approach combined with a context encoder to handle changing dynamics in offline-to-online RL scenarios.
Findings
Effective adaptation to dynamic changes in MuJoCo environments
Outperforms comparison methods in unseen perturbations
Sample-efficient learning in variable environments
Abstract
Offline reinforcement learning (RL) allows learning sequential behavior from fixed datasets. Since offline datasets do not cover all possible situations, many methods collect additional data during online fine-tuning to improve performance. In general, these methods assume that the transition dynamics remain the same during both the offline and online phases of training. However, in many real-world applications, such as outdoor construction and navigation over rough terrain, it is common for the transition dynamics to vary between the offline and online phases. Moreover, the dynamics may vary during online fine-tuning itself. To address this problem of changing dynamics from offline to online RL, we propose a residual learning approach that infers dynamics changes to correct the outputs of the offline solution. At the online fine-tuning phase, we train a context encoder to learn a…
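The abstract's core idea, a frozen offline solution whose output is corrected by a residual conditioned on inferred dynamics changes, can be illustrated with a minimal sketch. This is not the authors' implementation. Every name here (`offline_policy`, `encode_context`, `residual_policy`, `adapted_action`) is hypothetical, as are the nominal dynamics s' = s + a and the hand-written encoder and residual, which in the paper's setting would be learned networks.

```python
from typing import List, Tuple

Vector = List[float]
Transition = Tuple[Vector, Vector, Vector]  # (state, action, next_state)


def clip(x: float, lo: float = -1.0, hi: float = 1.0) -> float:
    """Keep an action component inside the valid range."""
    return max(lo, min(hi, x))


def offline_policy(state: Vector) -> Vector:
    """Stand-in for a frozen policy trained offline (hypothetical linear map)."""
    return [clip(-0.5 * s) for s in state]


def encode_context(history: List[Transition], dim: int) -> Vector:
    """Hypothetical context encoder.

    Assumes nominal dynamics s' = s + a and summarizes recent transitions
    as the mean deviation from that model. A learned encoder would
    replace this hand-written statistic.
    """
    if not history:
        return [0.0] * dim
    context = [0.0] * dim
    for state, action, next_state in history:
        for i in range(dim):
            context[i] += next_state[i] - (state[i] + action[i])
    return [c / len(history) for c in context]


def residual_policy(state: Vector, context: Vector, gain: float = 1.0) -> Vector:
    """Hypothetical residual: push against the inferred dynamics shift.

    The state argument is unused here; a learned residual would
    typically condition on it as well.
    """
    return [-gain * c for c in context]


def adapted_action(state: Vector, history: List[Transition]) -> Vector:
    """Offline action plus a context-conditioned correction, clipped to range."""
    base = offline_policy(state)
    context = encode_context(history, len(state))
    correction = residual_policy(state, context)
    return [clip(b + r) for b, r in zip(base, correction)]
```

With an empty history the correction is zero, so `adapted_action` falls back to the offline policy. That is the usual motivation for a residual parameterization: behavior starts from the offline solution and departs from it only as evidence of changed dynamics accumulates.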
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Smart Grid Energy Management
