Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting

Edoardo Cetin; Ahmed Touati; Yann Ollivier

arXiv:2412.04368·cs.LG·December 6, 2024

TL;DR

This paper enhances behavior foundation models in reinforcement learning by introducing auto-regressive features to increase task encoding expressivity and combining them with offline RL techniques for improved zero-shot policy performance.

Contribution

The paper proposes auto-regressive features that let FB models represent nonlinear task encodings, and shows that they are effective when combined with offline RL techniques across multiple environments.

Findings

01 Auto-regressive features significantly improve task representation.

02 FB models match single-task offline agents on D4RL benchmarks.

03 Offline RL techniques are crucial for stable performance in complex tasks.

Abstract

The forward-backward representation (FB) is a recently proposed framework (Touati et al., 2023; Touati & Ollivier, 2021) to train behavior foundation models (BFMs) that aim at providing zero-shot efficient policies for any new task specified in a given reinforcement learning (RL) environment, without training for each new task. Here we address two core limitations of FB model training. First, FB, like all successor-feature-based methods, relies on a linear encoding of tasks: at test time, each new reward function is linearly projected onto a fixed set of pre-trained features. This limits expressivity as well as precision of the task representation. We break the linearity limitation by introducing auto-regressive features for FB, which let fine-grained task features depend on coarser-grained task information. This can represent arbitrary nonlinear task encodings, thus significantly…
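The contrast the abstract draws between a linear task encoding and auto-regressive features can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the task code is an empirical average of reward-weighted features, z = mean over sampled states of r(s) B(s), and it reads "fine-grained task features depend on coarser-grained task information" as later feature blocks taking the earlier codes as input. The function names, the block interface, and the toy features are all hypothetical, and normalization details are omitted.

```python
import numpy as np

def linear_task_encoding(features, rewards):
    """Linear encoding (assumed form): z = mean_s[ r(s) * B(s) ].

    features: (n, d) array of fixed features B(s) at n sampled states.
    rewards:  (n,) array of rewards r(s) at the same states.
    """
    return (features * rewards[:, None]).mean(axis=0)

def autoregressive_task_encoding(blocks, states, rewards):
    """Encode a task block by block (hypothetical interface).

    Each block is a callable (states, z_prefix) -> (n, d_k) features, so
    later (finer) features can depend on the earlier (coarser) codes.
    """
    z = np.zeros(0)
    for block in blocks:
        feats = block(states, z)                    # features conditioned on codes so far
        z_k = linear_task_encoding(feats, rewards)  # still a reward-weighted average
        z = np.concatenate([z, z_k])
    return z

# Toy data: 1-D states on a grid, with reward r(s) = s.
states = np.linspace(-1.0, 1.0, 101).reshape(-1, 1)
rewards = states[:, 0]

# Hypothetical feature blocks, chosen only to make the effect visible.
coarse = lambda s, z: s[:, :1]          # ignores z: a plain linear feature
fine = lambda s, z: s[:, :1] * z[0]     # rescaled by the coarse code

z = autoregressive_task_encoding([coarse, fine], states, rewards)
z_doubled = autoregressive_task_encoding([coarse, fine], states, 2.0 * rewards)
```

In this toy setup, doubling the reward doubles the coarse code but quadruples the fine one, so the overall map from reward to task code is no longer linear, even though each block on its own is still a reward-weighted average.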

Taxonomy

Topics: Neural Networks and Applications

Methods: Sparse Evolutionary Training