Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers
Micah Carroll, Jessy Lin, Orr Paradise, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, Sam Devlin

TL;DR
This paper introduces FlexiBiT, a unified bidirectional transformer framework that can be trained on various sequential decision-making tasks. A single model performs comparably to or better than specialized models, and fine-tuning on a specific task improves performance further.
Contribution
The paper presents FlexiBiT, a novel transformer-based framework that unifies multiple sequential decision tasks into a single model, enabling flexible inference and improved performance.
Findings
FlexiBiT performs on par with or better than specialized models across tasks.
A single model can handle behavior cloning, offline RL, inverse dynamics, and waypoint conditioning.
Fine-tuning FlexiBiT enhances task-specific performance.
Abstract
Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision making, where many well-studied tasks like behavior cloning, offline RL, inverse dynamics, and waypoint conditioning correspond to different sequence maskings over a sequence of states, actions, and returns. We introduce the FlexiBiT framework, which provides a unified way to specify models which can be trained on many different sequential decision making tasks. We show that a single FlexiBiT model is simultaneously capable of carrying out many tasks with performance similar to or better than specialized models. Additionally, we show that performance can be further improved by fine-tuning our general model on specific tasks of interest.
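The correspondence between tasks and maskings can be made concrete with a small sketch. The code below is illustrative only and is not taken from the paper: the function names, the dictionary layout, the convention that True means "visible to the model", and the choice to condition on the first return token are all assumptions made for this example. Each task is expressed as a boolean mask over a length-T sequence of states, actions, and returns; masked-out tokens are the ones a model would be asked to predict.

```python
import numpy as np

KEYS = ("states", "actions", "returns")  # hypothetical token streams


def empty_mask(T):
    # Assumed convention: True = token is visible, False = token is masked out.
    return {key: np.zeros(T, dtype=bool) for key in KEYS}


def behavior_cloning_mask(T, t):
    # See states up to and including step t and actions before step t;
    # the action at step t is left masked so it can be predicted.
    m = empty_mask(T)
    m["states"][: t + 1] = True
    m["actions"][:t] = True
    return m


def offline_rl_mask(T, t):
    # Same as behavior cloning, plus a visible return token to condition on
    # (here the first one, which is an assumption of this sketch).
    m = behavior_cloning_mask(T, t)
    m["returns"][0] = True
    return m


def inverse_dynamics_mask(T, t):
    # See two consecutive states; the action between them stays masked.
    m = empty_mask(T)
    m["states"][t : t + 2] = True
    return m


def waypoint_mask(T, t, k):
    # Same as behavior cloning, plus one visible future state at step k.
    m = behavior_cloning_mask(T, t)
    m["states"][k] = True
    return m


def random_mask(T, p_visible, rng):
    # Training-style masking: each token is visible with probability p_visible.
    return {key: rng.random(T) < p_visible for key in KEYS}


if __name__ == "__main__":
    for name, mask in [
        ("behavior cloning", behavior_cloning_mask(5, 2)),
        ("offline RL", offline_rl_mask(5, 2)),
        ("inverse dynamics", inverse_dynamics_mask(5, 1)),
        ("waypoint", waypoint_mask(5, 1, 4)),
    ]:
        print(name, {k: v.astype(int).tolist() for k, v in mask.items()})
```

Under these assumptions, a single model trained with randomly sampled masks could be queried for any of the tasks at inference time simply by supplying the matching mask.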
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
