Knowledge Transfer in Model-Based Reinforcement Learning Agents for Efficient Multi-Task Learning

Dmytro Kuzmenko; Nadiya Shvai

arXiv:2501.05329 · cs.LG · July 4, 2025

TL;DR

This paper introduces a distillation method for compressing large multi-task reinforcement learning models into smaller, efficient models suitable for resource-limited environments, achieving state-of-the-art results on the MT30 benchmark.

Contribution

The paper presents a knowledge transfer technique that distills a large multi-task model into a compact one while maintaining high performance, and applies FP16 post-training quantization for further efficiency.

Findings

01. Distilled a 317M-parameter multi-task agent into a 1M-parameter model that outperforms the original 1M-parameter model.

02. Achieved a normalized score of 28.45 on the MT30 benchmark, up from the original 1M-parameter model's score of 18.93.

03. Reduced model size by 50% with FP16 post-training quantization while maintaining performance.
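The 50% figure follows from storage width: FP32 uses 4 bytes per weight and FP16 uses 2. The sketch below illustrates this with the Python standard library only. The weights are random stand-ins, not the paper's model, and the helper names (`pack_weights`, `unpack_weights`) are hypothetical. It shows the size ratio and the rounding error introduced by the cast; it does not reproduce the paper's finding that task performance is maintained.

```python
import random
import struct


def pack_weights(weights, fmt):
    """Serialize floats with a struct format code: 'f' = FP32, 'e' = FP16."""
    return struct.pack(f"<{len(weights)}{fmt}", *weights)


def unpack_weights(blob, fmt):
    """Inverse of pack_weights: decode a byte string back into a list of floats."""
    item_size = struct.calcsize(f"<{fmt}")
    return list(struct.unpack(f"<{len(blob) // item_size}{fmt}", blob))


# Hypothetical stand-in weights in [-1, 1]; a real model would supply its own.
random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(1000)]

fp32_blob = pack_weights(weights, "f")
fp16_blob = pack_weights(weights, "e")

# 2 bytes per weight instead of 4.
size_ratio = len(fp16_blob) / len(fp32_blob)

# Rounding error introduced by storing each weight in half precision.
restored = unpack_weights(fp16_blob, "e")
max_abs_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"size ratio: {size_ratio}")
print(f"max abs rounding error: {max_abs_error:.6f}")
```

Whether rounding error of this size matters for a given agent is an empirical question, which is why the result is reported as a benchmark measurement rather than assumed.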

Abstract

We propose an efficient knowledge transfer approach for model-based reinforcement learning, addressing the challenge of deploying large world models in resource-constrained environments. Our method distills a high-capacity multi-task agent (317M parameters) into a compact 1M parameter model, achieving state-of-the-art performance on the MT30 benchmark with a normalized score of 28.45, a substantial improvement over the original 1M parameter model's score of 18.93. This demonstrates the ability of our distillation technique to consolidate complex multi-task knowledge effectively. Additionally, we apply FP16 post-training quantization, reducing the model size by 50% while maintaining performance. Our work bridges the gap between the power of large models and practical deployment constraints, offering a scalable solution for efficient and accessible multi-task reinforcement learning in…
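The abstract does not state the distillation objective, so the following is only a generic sketch of teacher-student distillation, not the authors' method. It assumes the student minimizes its own task loss plus a weighted mean-squared-error term that pulls its predictions toward the teacher's. The function names, the `distill_coef` weight, and all numbers are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def distillation_loss(student_pred, teacher_pred, task_loss, distill_coef=0.5):
    """Student's task loss plus a weighted penalty for deviating from the teacher.

    distill_coef is an assumed hyperparameter: 0 disables distillation,
    larger values make the student track the teacher more closely.
    """
    return task_loss + distill_coef * mse(student_pred, teacher_pred)


# Hypothetical predictions for one batch (e.g. predicted rewards).
teacher_pred = [1.0, 0.5, 0.0]
student_pred = [0.8, 0.5, 0.2]

loss = distillation_loss(student_pred, teacher_pred, task_loss=0.3, distill_coef=0.5)
print(f"combined loss: {loss:.4f}")
```

In a setup like this the large teacher is frozen and only queried for targets, so the cost of the 317M-parameter model is paid during training and not at deployment.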

Taxonomy

Topics: Reinforcement Learning in Robotics