Transfer Heterogeneous Knowledge Among Peer-to-Peer Teammates: A Model Distillation Approach
Zeyue Xue, Shuang Luo, Chao Wu, Pan Zhou, Kaigui Bian, Wei Du

TL;DR
This paper introduces LTCR, a model distillation framework for peer-to-peer knowledge transfer in multi-agent reinforcement learning, effectively reusing experiences and transferring value functions to enhance learning efficiency and team performance.
Contribution
The paper proposes a novel model distillation approach for transferring heterogeneous knowledge among agents using Categorical DQN and an efficient communication protocol.
Findings
Improved team-wide rewards in multiple environments
Enhanced learning stability and acceleration
Effective knowledge reuse among distributed agents
Abstract
Peer-to-peer knowledge transfer in distributed environments has emerged as a promising method in deep reinforcement learning, since it can accelerate learning and improve team-wide performance without relying on pre-trained teachers. However, traditional peer-to-peer methods such as action advising have difficulty expressing knowledge and advice efficiently. We therefore propose a new solution that reuses experiences and transfers value functions among multiple students via model distillation. Transferring the Q-function directly remains challenging, however, because it is unstable and unbounded. To address this issue, which existing work also faces, we adopt Categorical Deep Q-Network. We also describe how to design an efficient communication protocol to exploit heterogeneous knowledge among multiple distributed agents. Our proposed framework, namely…
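The abstract's motivation for Categorical DQN can be illustrated with a small sketch. A categorical value head outputs a probability distribution over a fixed set of return atoms, so a distillation target is a normalized, bounded vector rather than a raw Q-value. The code below is only an illustration under assumptions, not the paper's framework: the function names, the support range of [-10, 10] with 51 atoms, the use of a KL-divergence loss, and the random inputs are all hypothetical choices made for this example.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the atom dimension.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def expected_q(probs, support):
    # Q(s, a) is the mean of the categorical return distribution,
    # so it always lies between the smallest and largest atom.
    return probs @ support

def distillation_kl(teacher_probs, student_logits, eps=1e-8):
    # KL(teacher || student), summed over atoms, averaged over actions.
    # Assumed loss form for illustration; the paper may use another.
    student_probs = softmax(student_logits)
    t = np.clip(teacher_probs, eps, 1.0)
    s = np.clip(student_probs, eps, 1.0)
    return float(np.sum(t * (np.log(t) - np.log(s)), axis=-1).mean())

# Hypothetical setup: 4 actions, 51 atoms on the support [-10, 10].
support = np.linspace(-10.0, 10.0, 51)
rng = np.random.default_rng(0)
teacher_probs = softmax(rng.normal(size=(4, 51)))
student_logits = rng.normal(size=(4, 51))

loss = distillation_kl(teacher_probs, student_logits)
q_teacher = expected_q(teacher_probs, support)
```

Because both distributions are normalized over the same atoms, the loss is well defined and non-negative whatever the scale of the underlying returns, which is one plausible reading of why a distributional head makes value transfer between peers more stable than regressing on unbounded Q-values.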
Taxonomy
Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Simulation Techniques and Applications
