Knowledge Transfer in Model-Based Reinforcement Learning Agents for Efficient Multi-Task Learning
Dmytro Kuzmenko, Nadiya Shvai

TL;DR
This paper introduces a distillation method for compressing large multi-task reinforcement learning models into smaller, efficient models suitable for resource-limited environments, achieving state-of-the-art results on the MT30 benchmark.
Contribution
It presents a novel knowledge transfer technique that condenses multi-task models into compact forms while maintaining high performance, and applies quantization for further efficiency.
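The summary does not spell out the distillation objective, so the following is only a generic sketch of how a small student can be trained against a larger teacher: the student's usual task loss is combined with a term that pulls its predictions toward the teacher's. The function names, the use of mean squared error, and the `distill_coef` weight are all assumptions made for illustration and are not taken from the paper.

```python
def mse(pred, target):
    """Mean squared error between two equal-length lists of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def distillation_loss(student_pred, teacher_pred, env_target, distill_coef=0.5):
    """Hypothetical combined loss: task loss plus a weighted imitation term.

    student_pred: predictions of the compact model
    teacher_pred: predictions of the large, frozen model on the same inputs
    env_target:   ground-truth targets from the environment
    distill_coef: assumed weight on the teacher-matching term
    """
    task_loss = mse(student_pred, env_target)
    imitation_loss = mse(student_pred, teacher_pred)
    return task_loss + distill_coef * imitation_loss


# Toy values, chosen only to exercise the function.
student = [0.2, 0.8]
teacher = [0.3, 0.7]
target = [0.0, 1.0]
loss = distillation_loss(student, teacher, target, distill_coef=0.5)
print(f"combined loss: {loss:.4f}")
```

Setting `distill_coef` to zero recovers ordinary training of the small model, which makes the weight a convenient knob for comparing distilled and non-distilled students.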
Findings
Distilled a 317M-parameter model into a 1M-parameter model that outperforms the original 1M-parameter model.
Achieved a normalized score of 28.45 on the MT30 benchmark, up from the original 1M-parameter model's 18.93.
Reduced model size by 50% using FP16 post-training quantization without performance loss.
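The 50% figure follows from storage arithmetic: an FP32 value occupies 4 bytes and an FP16 value occupies 2. The sketch below illustrates this with Python's standard `struct` module, assuming the weights were originally stored as FP32; the random weights and helper names are hypothetical stand-ins, not the paper's model or tooling.

```python
import random
import struct


def pack_weights(weights, fmt):
    """Serialize floats with a struct format code: 'f' is FP32, 'e' is FP16."""
    return struct.pack(f"<{len(weights)}{fmt}", *weights)


def unpack_weights(blob, fmt):
    """Deserialize a byte string produced by pack_weights."""
    count = len(blob) // struct.calcsize(f"<{fmt}")
    return list(struct.unpack(f"<{count}{fmt}", blob))


random.seed(0)
# Hypothetical weights in [-1, 1]; a real model would supply its own tensors.
weights = [random.uniform(-1.0, 1.0) for _ in range(1000)]

fp32_blob = pack_weights(weights, "f")
fp16_blob = pack_weights(weights, "e")

size_ratio = len(fp16_blob) / len(fp32_blob)
restored = unpack_weights(fp16_blob, "e")
max_abs_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"FP32 bytes: {len(fp32_blob)}, FP16 bytes: {len(fp16_blob)}")
print(f"size ratio: {size_ratio:.2f}")
print(f"max abs rounding error: {max_abs_error:.2e}")
```

The rounding error printed at the end is the price of the smaller format. Whether it affects task performance has to be measured on the benchmark, which is what the reported result addresses.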
Abstract
We propose an efficient knowledge transfer approach for model-based reinforcement learning, addressing the challenge of deploying large world models in resource-constrained environments. Our method distills a high-capacity multi-task agent (317M parameters) into a compact 1M parameter model, achieving state-of-the-art performance on the MT30 benchmark with a normalized score of 28.45, a substantial improvement over the original 1M parameter model's score of 18.93. This demonstrates the ability of our distillation technique to consolidate complex multi-task knowledge effectively. Additionally, we apply FP16 post-training quantization, reducing the model size by 50% while maintaining performance. Our work bridges the gap between the power of large models and practical deployment constraints, offering a scalable solution for efficient and accessible multi-task reinforcement learning in…
Taxonomy
Topics: Reinforcement Learning in Robotics
