Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning via Incorporating Generalized Human Expertise
Xuefei Wu, Xiao Yin, Yuanyang Zhu, and Chunlin Chen

TL;DR
This paper introduces LIGHT, a framework that incorporates human expertise into multi-agent reinforcement learning to improve exploration efficiency and performance in sparse-reward environments.
Contribution
LIGHT is a novel end-to-end framework that integrates human knowledge into MARL, guiding agents to align actions with human expertise and enhance learning.
Findings
Outperforms baseline methods in sparse-reward tasks
Improves knowledge reusability across different scenarios
Enhances exploration efficiency and learning performance
Abstract
Efficient exploration in multi-agent reinforcement learning (MARL) is challenging when agents receive only a team reward, especially in environments with sparse rewards. A powerful way to mitigate this issue is to craft dense individual rewards that guide the agents toward efficient exploration. However, individual rewards generally rely on manually engineered shaping-reward functions that lack high-order intelligence, so they perform less effectively than humans at learning and generalization in complex problems. To tackle these issues, we combine the above two paradigms and propose a novel framework, LIGHT (Learning Individual Intrinsic reward via Incorporating Generalized Human experTise), which can integrate human knowledge into MARL algorithms in an end-to-end manner. LIGHT guides each agent to avoid unnecessary exploration by considering both individual action…
Taxonomy
Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Mobile Crowdsensing and Crowdsourcing
