TL;DR
This paper introduces a transformer-based belief estimator within the Forward-Backward representation to enable zero-shot adaptation of behavioral foundation models to unseen dynamics, improving performance in changing environments.
Contribution
It proposes a novel FB model with a transformer belief estimator and dynamics-specific clustering, enhancing zero-shot adaptation to unseen dynamics in behavioral foundation models.
Findings
Achieves up to 2x higher zero-shot returns in changing dynamics settings.
Demonstrates improved generalization to unseen dynamics in both discrete and continuous tasks.
Addresses limitations of existing BFMs in adapting to dynamic environment changes.
Abstract
Behavioral Foundation Models (BFMs) have proved successful in producing policies for arbitrary tasks in a zero-shot manner, requiring no test-time training or task-specific fine-tuning. Among the most promising BFMs are those that estimate the successor measure, learned in an unsupervised way from task-agnostic offline data. However, these methods fail to react to changes in the dynamics, making them inefficient under partial observability or when the transition function changes. This hinders the applicability of BFMs in real-world settings, e.g., in robotics, where the dynamics can change unexpectedly at test time. In this work, we demonstrate that the Forward-Backward (FB) representation, one of the methods from the BFM family, cannot distinguish between distinct dynamics, leading to interference among the latent directions that parametrize different policies. To address this, we…
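For readers unfamiliar with how an FB model produces a policy without test-time training, the following is a minimal sketch of the standard FB zero-shot step only; it does not cover the belief estimator proposed in the paper. It assumes the usual FB formulation, in which a task latent is inferred as z = E[r(s) B(s)] over sampled states and actions are scored by Q(s, a) = F(s, a, z)ᵀ z. The function names `infer_task_latent` and `greedy_action` are hypothetical, and random arrays stand in for the outputs of trained forward and backward networks.

```python
import numpy as np


def infer_task_latent(B_states, rewards):
    """Estimate the task latent z as the sample mean of r(s_i) * B(s_i).

    B_states: (n_samples, d) backward embeddings of sampled states.
    rewards:  (n_samples,) rewards observed at those states.
    """
    return (rewards[:, None] * B_states).mean(axis=0)


def greedy_action(F_sa, z):
    """Score each action with Q(s, a) = F(s, a, z) . z and pick the best.

    F_sa: (n_actions, d) forward embeddings at one state, already
          conditioned on z (a trained network would compute these).
    """
    q_values = F_sa @ z
    return int(np.argmax(q_values)), q_values


# Toy stand-ins for trained networks (assumption: illustrative values only).
rng = np.random.default_rng(0)
d, n_samples, n_actions = 4, 8, 3
B_states = rng.normal(size=(n_samples, d))

# A sparse reward: only the sampled state with index 2 is rewarding.
rewards = np.zeros(n_samples)
rewards[2] = 1.0

z = infer_task_latent(B_states, rewards)
F_sa = rng.normal(size=(n_actions, d))
action, q_values = greedy_action(F_sa, z)
```

Because each latent direction z selects a policy, data collected under different dynamics but embedded in the same latent space can end up competing for the same directions, which is the interference the abstract describes.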