Learning to Generalize for Sequential Decision Making
Xusen Yin, Ralph Weischedel, Jonathan May

TL;DR
This paper introduces a teacher-student imitation learning approach that integrates large language models into sequential decision-making tasks, significantly improving generalization and learning speed in novel domains.
Contribution
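The authors propose a teacher-student imitation learning framework, together with a method for converting a reinforcement learning model into a natural language understanding model. Combined, these make it feasible to bring contextualized language models into sequential decision making and to improve generalization.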
Findings
Models learn faster with imitation learning.
Models outperform teachers on held-out problems.
Significant generalization improvements on out-of-domain tasks.
Abstract
We consider problems of making sequences of decisions to accomplish tasks, interacting via the medium of language. These problems are often tackled with reinforcement learning approaches. We find that these models do not generalize well when applied to novel task domains. However, the large amount of computation necessary to adequately train and explore the search space of sequential decision making, under a reinforcement learning paradigm, precludes the inclusion of large contextualized language models, which might otherwise enable the desired generalization ability. We introduce a teacher-student imitation learning methodology and a means of converting a reinforcement learning model into a natural language understanding model. Together, these methodologies enable the introduction of contextualized language models into the sequential decision making problem space. We show that models…
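As a rough illustration of the conversion the abstract describes, the sketch below turns decisions recorded from a trained teacher into labeled (context, action) pairs that a text classifier could learn from. It is a minimal sketch under assumptions, not the authors' implementation: the `TeacherStep` record, the `to_nlu_examples` helper, the use of the teacher's top-scoring action as the positive label, and the sample game text are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TeacherStep:
    """One decision point recorded from a (hypothetical) trained RL teacher."""
    context: str            # textual observation shown to the agent
    actions: List[str]      # candidate actions available at this step
    q_values: List[float]   # teacher's score for each candidate action


def to_nlu_examples(steps: List[TeacherStep]) -> List[Tuple[str, str, int]]:
    """Turn teacher decisions into (context, action, label) classification pairs.

    Assumption: the label is 1 for the action the teacher scores highest
    and 0 for every other candidate at that step.
    """
    examples = []
    for step in steps:
        if len(step.actions) != len(step.q_values):
            raise ValueError("each action needs exactly one teacher score")
        if not step.actions:
            continue  # nothing to label at this step
        best = max(range(len(step.actions)), key=lambda i: step.q_values[i])
        for i, action in enumerate(step.actions):
            examples.append((step.context, action, int(i == best)))
    return examples


# Made-up example step; the text and scores are illustrative only.
steps = [
    TeacherStep(
        context="You are in a kitchen. You see a knife and a carrot.",
        actions=["take knife", "go north", "eat carrot"],
        q_values=[0.9, 0.1, 0.3],
    ),
]
examples = to_nlu_examples(steps)
for context, action, label in examples:
    print(label, action)
```

The resulting pairs could be passed to any text classifier, for instance a fine-tuned contextualized language model acting as the student; that training step is omitted here.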
Taxonomy
Topics: Topic Modeling · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications
