Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting
Edoardo Cetin, Ahmed Touati, Yann Ollivier

TL;DR
This paper enhances behavior foundation models in reinforcement learning by introducing auto-regressive features to increase task encoding expressivity and combining them with offline RL techniques for improved zero-shot policy performance.
Contribution
It proposes auto-regressive features for FB models to represent nonlinear task encodings and demonstrates their effectiveness with offline RL techniques across multiple environments.
Findings
Auto-regressive features make the task representation more expressive and more precise than a linear projection onto fixed features.
FB models match single-task offline agents on D4RL benchmarks.
Offline RL techniques are crucial for stable performance in complex tasks.
Abstract
The forward-backward representation (FB) is a recently proposed framework (Touati et al., 2023; Touati & Ollivier, 2021) to train behavior foundation models (BFMs) that aim at providing zero-shot efficient policies for any new task specified in a given reinforcement learning (RL) environment, without training for each new task. Here we address two core limitations of FB model training. First, FB, like all successor-feature-based methods, relies on a linear encoding of tasks: at test time, each new reward function is linearly projected onto a fixed set of pre-trained features. This limits expressivity as well as precision of the task representation. We break the linearity limitation by introducing auto-regressive features for FB, which let fine-grained task features depend on coarser-grained task information. This can represent arbitrary nonlinear task encodings, thus significantly…
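To make the linearity limitation concrete, here is a minimal NumPy sketch. It is a toy illustration and not the authors' implementation. The random features, the least-squares projection, the coarse/fine split, and the helper names (`linear_task_encoding`, `fine_feature_fn`, `autoregressive_task_encoding`) are all assumptions introduced for this example. It shows that a linear projection reproduces only rewards inside the feature span, and that letting a second feature group depend on a first, coarser code makes the overall reward-to-code map nonlinear.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 sampled states with 4 fixed features each.
# These random features are stand-ins for pre-trained ones.
num_states = 200
features = rng.normal(size=(num_states, 4))
coarse_features = features[:, :2]   # assumed "coarse" feature group
fine_inputs = features[:, 2:]       # raw inputs to the assumed "fine" group


def linear_task_encoding(feats, rewards):
    """Least-squares projection of per-state rewards onto the span of feats."""
    z, *_ = np.linalg.lstsq(feats, rewards, rcond=None)
    return z


def fine_feature_fn(z_coarse):
    """Hypothetical fine features, shifted by the first coarse coordinate."""
    return np.cos(fine_inputs + z_coarse[0])


def autoregressive_task_encoding(rewards):
    """Encode coarse information first, then fine features conditioned on it."""
    z_coarse = linear_task_encoding(coarse_features, rewards)
    z_fine = linear_task_encoding(fine_feature_fn(z_coarse), rewards)
    return z_coarse, z_fine


# A reward inside the feature span is reproduced by the linear encoding.
in_span_reward = features @ np.array([1.0, -2.0, 0.5, 0.0])
z_lin = linear_task_encoding(features, in_span_reward)
residual_in_span = np.linalg.norm(features @ z_lin - in_span_reward)

# A reward with a nonlinear component is only approximated.
nonlinear_reward = features[:, 0] + np.sin(3.0 * features[:, 0]) * features[:, 1]
z_non = linear_task_encoding(features, nonlinear_reward)
residual_nonlinear = np.linalg.norm(features @ z_non - nonlinear_reward)

# Scaling the reward scales the coarse code, but in general not the fine code,
# because the fine features themselves change with the coarse code.
zc1, zf1 = autoregressive_task_encoding(nonlinear_reward)
zc2, zf2 = autoregressive_task_encoding(2.0 * nonlinear_reward)

print(residual_in_span, residual_nonlinear)
print(np.allclose(zc2, 2.0 * zc1), np.allclose(zf2, 2.0 * zf1))
```

The two-stage split is the simplest possible case. The abstract only states that fine-grained features depend on coarser-grained task information, so the number of stages and the form of the dependence used here are placeholders.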
Taxonomy
Topics: Neural Networks and Applications
Methods: Sparse Evolutionary Training
