OpenTinker: Separating Concerns in Agentic Reinforcement Learning
Siqi Zhu, Jiaxuan You

TL;DR
OpenTinker is a modular infrastructure for reinforcement learning of large language model agents, emphasizing separation of concerns to improve flexibility and scalability in agentic learning systems.
Contribution
OpenTinker introduces a decomposed, component-based framework for RL with LLM agents: users configure agents, environments, and interaction protocols, while a managed runtime handles training, inference, and resource scheduling.
Findings
Presents a set of RL use cases for LLM agents built on the framework.
Argues that decomposing the RL pipeline into composable components improves the modularity and scalability of agent training.
Uses these use cases to illustrate the framework's practicality.
Abstract
We introduce OpenTinker, an infrastructure for reinforcement learning (RL) of large language model (LLM) agents built around a separation of concerns across algorithm design, execution, and agent-environment interaction. Rather than relying on monolithic, end-to-end RL pipelines, OpenTinker decomposes agentic learning systems into lightweight, composable components with clearly defined abstraction boundaries. Users specify agents, environments, and interaction protocols, while inference and training are delegated to a managed execution runtime. OpenTinker introduces a centralized scheduler for managing training and inference workloads, including LoRA-based and full-parameter RL, supervised fine-tuning, and inference, over shared resources. We further discuss design principles for extending OpenTinker to multi-agent training. Finally, we present a set of RL use cases that demonstrate the…
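To make the separation described in the abstract concrete, the following is a minimal Python sketch, not OpenTinker's actual API. Every name in it (Environment, CountdownEnv, rollout, Job, Scheduler) is hypothetical and chosen only for illustration. It assumes the user side is limited to defining an environment and a policy callable, while a runtime-side loop drives the interaction and a toy central scheduler admits jobs against a shared GPU pool.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


class Environment(Protocol):
    """Hypothetical user-side contract: all the runtime needs to know."""

    def reset(self) -> str: ...
    def step(self, action: str) -> tuple[str, float, bool]: ...


class CountdownEnv:
    """Toy environment: reward 1.0 if the agent says 'done' on the final turn."""

    def __init__(self, turns: int = 3) -> None:
        self.turns = turns
        self.left = turns

    def reset(self) -> str:
        self.left = self.turns
        return f"{self.left} turns left"

    def step(self, action: str) -> tuple[str, float, bool]:
        self.left -= 1
        done = self.left == 0
        reward = 1.0 if done and action == "done" else 0.0
        return f"{self.left} turns left", reward, done


# A policy maps an observation to an action; in a real system this call
# would be served by the managed inference backend (assumption).
Policy = Callable[[str], str]


def rollout(env: Environment, policy: Policy) -> list[tuple[str, str, float]]:
    """Runtime-side loop: collect (observation, action, reward) steps."""
    trajectory = []
    obs, done = env.reset(), False
    while not done:
        action = policy(obs)
        next_obs, reward, done = env.step(action)
        trajectory.append((obs, action, reward))
        obs = next_obs
    return trajectory


@dataclass
class Job:
    name: str
    kind: str  # illustrative labels: "lora_rl", "full_rl", "sft", "inference"
    gpus: int


@dataclass
class Scheduler:
    """Toy centralized scheduler over one shared GPU pool (greedy admission)."""

    total_gpus: int
    running: list[Job] = field(default_factory=list)
    queued: list[Job] = field(default_factory=list)

    def free_gpus(self) -> int:
        return self.total_gpus - sum(j.gpus for j in self.running)

    def submit(self, job: Job) -> bool:
        """Start the job now if it fits, otherwise queue it."""
        if job.gpus <= self.free_gpus():
            self.running.append(job)
            return True
        self.queued.append(job)
        return False

    def finish(self, name: str) -> None:
        """Release a job's GPUs, then admit queued jobs in order while they fit."""
        self.running = [j for j in self.running if j.name != name]
        while self.queued and self.queued[0].gpus <= self.free_gpus():
            self.running.append(self.queued.pop(0))
```

In this sketch, swapping the environment or the policy requires no change to rollout or Scheduler, which is the kind of abstraction boundary the abstract describes. A real scheduler would also need to handle priorities, preemption, and weight sharing between LoRA jobs, none of which is modeled here.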
Taxonomy
Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Topic Modeling
