Transfer Heterogeneous Knowledge Among Peer-to-Peer Teammates: A Model Distillation Approach

Zeyue Xue; Shuang Luo; Chao Wu; Pan Zhou; Kaigui Bian; Wei Du

arXiv:2002.02202 · cs.AI · February 7, 2020 · 1 citation

TL;DR

This paper introduces LTCR, a model distillation framework for peer-to-peer knowledge transfer in multi-agent reinforcement learning, effectively reusing experiences and transferring value functions to enhance learning efficiency and team performance.

Contribution

The paper proposes a novel model distillation approach for transferring heterogeneous knowledge among agents using Categorical DQN and an efficient communication protocol.

Findings

01 Improved team-wide rewards in multiple environments
02 Enhanced learning stability and acceleration
03 Effective knowledge reuse among distributed agents

Abstract

Peer-to-peer knowledge transfer in distributed environments has emerged as a promising method in deep reinforcement learning, since it can accelerate learning and improve team-wide performance without relying on pre-trained teachers. However, traditional peer-to-peer methods such as action advising have struggled to express knowledge and advice efficiently. We therefore propose a new solution that reuses experiences and transfers value functions among multiple students via model distillation. Transferring the Q-function directly remains challenging, however, because it is unstable and unbounded. To address this issue, which existing works also face, we adopt Categorical Deep Q-Network. We also describe how to design an efficient communication protocol to exploit heterogeneous knowledge among multiple distributed agents. Our proposed framework, namely…
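To illustrate why a categorical value distribution is easier to distill than a raw Q-value, here is a minimal sketch, not the paper's implementation. It assumes a fixed support of 51 atoms on [-10, 10] and a KL-divergence distillation loss; the abstract specifies neither, and all function names (`make_atoms`, `distillation_loss`, and so on) are hypothetical. Because each agent outputs a probability vector over the same bounded support, the expected value always stays inside the support range, and the loss compares two normalized distributions rather than two unbounded scalars.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability vector (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def make_atoms(v_min, v_max, n_atoms):
    """Evenly spaced support points for the categorical value distribution."""
    step = (v_max - v_min) / (n_atoms - 1)
    return [v_min + i * step for i in range(n_atoms)]

def expected_value(probs, atoms):
    """Q-value implied by a categorical distribution: sum of p_i * z_i."""
    return sum(p * z for p, z in zip(probs, atoms))

def distillation_loss(teacher_probs, student_probs, eps=1e-12):
    """KL(teacher || student); assumed here as the distillation objective."""
    return sum(
        p * (math.log(p + eps) - math.log(q + eps))
        for p, q in zip(teacher_probs, student_probs)
    )

# Hypothetical setup: 51 atoms on [-10, 10] for a single state-action pair.
atoms = make_atoms(-10.0, 10.0, 51)

# A peer acting as teacher concentrates probability mass near a return of 3.
teacher = softmax([-((z - 3.0) ** 2) / 8.0 for z in atoms])

# An untrained student starts from a uniform distribution over the atoms.
student = softmax([0.0] * len(atoms))

loss = distillation_loss(teacher, student)
teacher_q = expected_value(teacher, atoms)
student_q = expected_value(student, atoms)
```

In this sketch the loss against a uniform student can never exceed log(51), whatever the teacher's distribution looks like. A squared error between unbounded Q-values has no such ceiling.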

Taxonomy

Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Simulation Techniques and Applications