TD-MPC-Opt: Distilling Model-Based Multi-Task Reinforcement Learning Agents

Dmytro Kuzmenko; Nadiya Shvai

arXiv:2507.01823·cs.LG·July 3, 2025

TL;DR

This paper distills a 317M-parameter multi-task model-based agent into a 1M-parameter model, reaching a normalized score of 28.45 on MT30 (versus 18.93 for the original 1M-parameter model), and applies FP16 post-training quantization to ease deployment in resource-constrained environments.

Contribution

It presents a distillation technique that compresses a large multi-task model-based reinforcement learning agent into a compact model suited to resource-constrained deployment, while outperforming a same-size model trained without distillation.

Findings

01

The distilled 1M-parameter model achieves a normalized score of 28.45 on MT30.

02

FP16 post-training quantization reduces the distilled model's size by approximately 50%.

03

The distilled model outperforms the original 1M-parameter model on MT30 (28.45 versus 18.93 normalized score).
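
The arithmetic behind these findings can be checked with a short sketch. It computes the relative gain implied by the two reported scores and illustrates why storing weights in FP16 instead of FP32 roughly halves model size. The weight array here is a hypothetical random stand-in for a 1M-parameter model, and the plain dtype cast is an assumption about how such quantization might look, not the authors' pipeline.

```python
import numpy as np

# Scores quoted in the findings above.
distilled_score = 28.45
baseline_score = 18.93

# Relative improvement of the distilled model over the original 1M-parameter model.
relative_gain = (distilled_score - baseline_score) / baseline_score

# Hypothetical stand-in for the student's weights: 1M float32 parameters.
rng = np.random.default_rng(0)
weights_fp32 = rng.standard_normal(1_000_000).astype(np.float32)

# Post-training FP16 quantization, sketched as a plain dtype cast (an assumption).
weights_fp16 = weights_fp32.astype(np.float16)

# Storage ratio and worst-case rounding error introduced by the cast.
size_ratio = weights_fp16.nbytes / weights_fp32.nbytes
max_abs_error = float(
    np.max(np.abs(weights_fp32 - weights_fp16.astype(np.float32)))
)

print(f"relative gain: {relative_gain:.1%}")
print(f"fp32: {weights_fp32.nbytes / 1e6:.1f} MB, fp16: {weights_fp16.nbytes / 1e6:.1f} MB")
print(f"size ratio: {size_ratio:.2f}, max abs rounding error: {max_abs_error:.2e}")
```

Each FP32 value occupies four bytes and each FP16 value two, so the weight storage is exactly halved; a real checkpoint also carries metadata and possibly non-quantized buffers, which is consistent with the reduction being reported as approximate.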

Abstract

We present a novel approach to knowledge transfer in model-based reinforcement learning, addressing the critical challenge of deploying large world models in resource-constrained environments. Our method efficiently distills a high-capacity multi-task agent (317M parameters) into a compact model (1M parameters) on the MT30 benchmark, significantly improving performance across diverse tasks. Our distilled model achieves a state-of-the-art normalized score of 28.45, surpassing the original 1M parameter model score of 18.93. This improvement demonstrates the ability of our distillation technique to capture and consolidate complex multi-task knowledge. We further optimize the distilled model through FP16 post-training quantization, reducing its size by approximately 50%. Our approach addresses practical deployment limitations and offers insights into knowledge representation in large world models,…
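
The abstract does not state which quantity is distilled or how the loss is formed, so the following is only a generic teacher-student sketch of the idea. It assumes the student is trained on a blend of its ordinary task loss and a term that pulls its predictions toward the teacher's. The function name `distillation_loss`, the `distill_coef` parameter, and the use of mean squared error for both terms are all hypothetical choices made for illustration.

```python
import numpy as np

def distillation_loss(student_pred, teacher_pred, target, distill_coef=0.5):
    """Blend a task loss with a teacher-matching loss (both mean squared error).

    All names and the convex weighting are illustrative assumptions,
    not the loss used in the paper.
    """
    student_pred = np.asarray(student_pred, dtype=np.float64)
    teacher_pred = np.asarray(teacher_pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)

    # How far the student is from the ground-truth signal.
    task_loss = np.mean((student_pred - target) ** 2)
    # How far the student is from the (frozen) teacher's prediction.
    distill_loss = np.mean((student_pred - teacher_pred) ** 2)

    return (1.0 - distill_coef) * task_loss + distill_coef * distill_loss

# Toy usage with made-up predictions for two samples.
loss = distillation_loss(
    student_pred=[0.2, 0.4],
    teacher_pred=[0.3, 0.5],
    target=[0.0, 1.0],
)
print(f"blended loss: {loss:.3f}")
```

Setting `distill_coef` to 0 recovers training from scratch, while larger values lean more heavily on the teacher; how that trade-off is tuned in practice would depend on the actual method.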

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Domain Adaptation and Few-Shot Learning