Vintix: Action Model via In-Context Reinforcement Learning

Andrey Polubarov; Nikita Lyubaykin; Alexander Derevyagin; Ilya Zisman; Denis Tarasov; Alexander Nikulin; Vladislav Kurenkov

arXiv:2501.19400·cs.LG·September 30, 2025

TL;DR

Vintix introduces a scalable in-context reinforcement learning framework that enables a fixed, cross-domain model to learn behaviors through trial-and-error, advancing the development of generalist decision-making agents.

Contribution

This work takes first steps toward scaling ICRL beyond toy tasks and single-domain settings, using Algorithm Distillation to train a single fixed action model that works across multiple domains.

Findings

01

Algorithm Distillation is a compelling and competitive alternative to expert distillation for constructing action models.

02

A single fixed model learns behaviors in context, through trial-and-error interaction, across tasks from multiple domains.

03

Results demonstrate ICRL's potential for scalable generalist agents.

Abstract

In-Context Reinforcement Learning (ICRL) represents a promising paradigm for developing generalist agents that learn at inference time through trial-and-error interactions, analogous to how large language models adapt contextually, but with a focus on reward maximization. However, the scalability of ICRL beyond toy tasks and single-domain settings remains an open challenge. In this work, we present the first steps toward scaling ICRL by introducing a fixed, cross-domain model capable of learning behaviors through in-context reinforcement learning. Our results demonstrate that Algorithm Distillation, a framework designed to facilitate ICRL, offers a compelling and competitive alternative to expert distillation to construct versatile action models. These findings highlight the potential of ICRL as a scalable approach for generalist decision-making systems. Code released at…
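The contrast the abstract draws between Algorithm Distillation and expert distillation can be illustrated with a toy data-construction sketch. This is not the Vintix implementation: every name below (`toy_learning_history`, `ad_examples`, `expert_examples`) is hypothetical, and the "learning history" is a synthetic stand-in for the logs of an RL algorithm that improves over episodes. It assumes the usual reading of Algorithm Distillation, in which the model predicts the next action from a context that spans several episodes of such a history, whereas expert distillation sees only the final, best behavior.

```python
import random
from typing import List, Tuple

# Toy transition: (observation, action, reward). All values are synthetic.
Transition = Tuple[int, int, float]
Episode = List[Transition]


def toy_learning_history(num_episodes: int = 5, steps: int = 4, seed: int = 0) -> List[Episode]:
    """Hypothetical stand-in for logs of an RL algorithm that improves over time."""
    rng = random.Random(seed)
    history = []
    for ep in range(num_episodes):
        # Probability of choosing the rewarded action grows from 0 to 1.
        p_good = ep / (num_episodes - 1)
        episode = []
        for t in range(steps):
            action = 1 if rng.random() < p_good else 0
            episode.append((t, action, float(action)))  # reward 1.0 for action 1
        history.append(episode)
    return history


def ad_examples(history: List[Episode], context_len: int):
    """Algorithm Distillation-style examples: the context crosses episode
    boundaries, so it records how behavior improved, not just the end result."""
    flat = [tr for episode in history for tr in episode]
    examples = []
    for i in range(1, len(flat)):
        context = flat[max(0, i - context_len):i]
        obs, target_action, _ = flat[i]
        examples.append((context, obs, target_action))
    return examples


def expert_examples(history: List[Episode], context_len: int):
    """Expert distillation-style examples: only the final (best) episode is used."""
    return ad_examples(history[-1:], context_len)


history = toy_learning_history()
ad = ad_examples(history, context_len=8)
expert = expert_examples(history, context_len=8)
```

In this sketch the last Algorithm Distillation example carries a context of 8 transitions, twice the episode length, so it includes both weaker and stronger behavior. The expert-style examples never see the earlier, lower-reward episodes, which is why they carry no signal about how to improve from them.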

Code & Models

Repositories

dunnolab/vintix (PyTorch, official)

Models

🤗 dunnolab/Vintix (model · 4 downloads · ♡ 3)

Videos

Vintix: Action Model via In-Context Reinforcement Learning (SlidesLive)
