Training a Generally Curious Agent

Fahim Tajwar; Yiding Jiang; Abitha Thankaraj; Sumaita Sadia Rahman; J Zico Kolter; Jeff Schneider; Ruslan Salakhutdinov

arXiv:2502.17543·cs.LG·November 3, 2025

TL;DR

This paper introduces Paprika, a fine-tuning method that enables language models to develop general decision-making and exploration skills transferable to new tasks without further training.

Contribution

The paper presents Paprika, a novel fine-tuning approach that trains models on synthetic interaction data to enhance their ability to explore and adapt in unseen environments.

Findings

01  Models fine-tuned with Paprika transfer decision-making skills to new tasks.

02  Curriculum learning improves sample efficiency in training.

03  Paprika reduces reliance on gradient updates for adaptation.

Abstract

Efficient exploration is essential for intelligent systems interacting with their environment, but existing language models often fall short in scenarios that require strategic information gathering. In this paper, we present Paprika, a fine-tuning approach that enables language models to develop general decision-making capabilities that are not confined to particular environments. By training on synthetic interaction data from different tasks that require diverse strategies, Paprika teaches models to explore and adapt their behavior on a new task based on environment feedback in-context without more gradient updates. Experimental results show that models fine-tuned with Paprika can effectively transfer their learned decision-making capabilities to entirely unseen tasks without additional training. Unlike traditional training, our approach's primary bottleneck lies in sampling useful…
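To make "adapting in-context without gradient updates" concrete, here is a minimal toy sketch. It is not the paper's code or one of its environments: `GuessNumberEnv`, `policy`, and `run_episode` are hypothetical names, and the hand-written `policy` stands in for a language model. The point it illustrates is that the agent's behavior changes only because the interaction history grows; no parameters are updated during the episode.

```python
from dataclasses import dataclass


@dataclass
class GuessNumberEnv:
    """Hypothetical toy task: find a hidden integer in [low, high]."""
    secret: int
    low: int = 1
    high: int = 100

    def step(self, guess: int) -> str:
        # The environment only returns textual feedback, as in a dialogue turn.
        if guess == self.secret:
            return "correct"
        return "higher" if guess < self.secret else "lower"


def policy(history, low, high):
    """Stand-in for a language model (an assumption for illustration).

    It chooses the next guess purely from the feedback seen so far,
    i.e. from the in-context history, with no learned state.
    """
    for guess, feedback in history:
        if feedback == "higher":
            low = max(low, guess + 1)
        elif feedback == "lower":
            high = min(high, guess - 1)
    return (low + high) // 2


def run_episode(env, max_turns=10):
    """Roll out one multi-turn episode and return the (guess, feedback) history."""
    history = []
    for _ in range(max_turns):
        guess = policy(history, env.low, env.high)
        feedback = env.step(guess)
        history.append((guess, feedback))  # feedback is appended to the context
        if feedback == "correct":
            break
    return history


if __name__ == "__main__":
    for guess, feedback in run_episode(GuessNumberEnv(secret=37)):
        print(guess, feedback)
```

Trajectories like the returned `history` are the kind of multi-turn interaction record that could be collected across many tasks; how Paprika actually samples, filters, and trains on such data is described in the paper itself, not in this sketch.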

Code & Models

Repositories

tajwarfahim/paprika (PyTorch, official)

Taxonomy

Topics: Artificial Intelligence in Games · AI in Service Interactions