BCR-DRL: Behavior- and Context-aware Reward for Deep Reinforcement Learning in Human-AI Coordination
Xin Hao, Bahareh Nakisa, Mohmmad Naim Rastgoo, Gaoyang Pang

TL;DR
This paper introduces BCR-DRL, a novel reward mechanism for deep reinforcement learning in human-AI coordination, which leverages behavior and context to improve exploration, exploitation, and overall performance in complex tasks.
Contribution
It proposes a behavior- and context-aware reward scheme that enhances exploration and exploitation in DRL for human-AI coordination, addressing sparse rewards and unpredictable human behaviors.
Findings
Increases cumulative sparse rewards by ~20%.
Improves sample efficiency by ~38%.
Demonstrates effectiveness in the Overcooked environment.
Abstract
Deep reinforcement learning (DRL) offers a powerful framework for training AI agents to coordinate with human partners. However, DRL faces two critical challenges in human-AI coordination (HAIC): sparse rewards and unpredictable human behaviors. These challenges significantly limit DRL's ability to identify effective coordination policies, because they impair its capability to optimize exploration and exploitation. To address these limitations, we propose an innovative behavior- and context-aware reward (BCR) for DRL, which optimizes exploration and exploitation by leveraging human behaviors and contextual information in HAIC. Our BCR consists of two components: (i) A novel dual intrinsic rewarding scheme to enhance exploration. This scheme comprises an AI self-motivated intrinsic reward and a human-motivated intrinsic reward, which are designed to increase the capture of sparse rewards by a…
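The abstract is truncated here, so the exact form of the two intrinsic rewards is not available. As a rough illustration only, the general idea of adding two intrinsic bonuses to a sparse task reward might be sketched as follows. Everything in this sketch is an assumption rather than the paper's formulation: the count-based novelty bonus, the weighted sum, and the names `CountNovelty`, `shaped_reward`, `w_ai`, and `w_human` are all hypothetical.

```python
import math
from collections import Counter


class CountNovelty:
    """Hypothetical count-based bonus: rarely seen keys earn a larger bonus.

    This is a generic exploration heuristic used as a stand-in; it is NOT
    taken from the BCR-DRL paper.
    """

    def __init__(self):
        self.counts = Counter()

    def bonus(self, key):
        # Record the visit, then return 1 / sqrt(visit count).
        self.counts[key] += 1
        return 1.0 / math.sqrt(self.counts[key])


def shaped_reward(sparse_reward, ai_bonus, human_bonus, w_ai=0.5, w_human=0.5):
    """Combine the sparse task reward with two intrinsic bonuses.

    The weighted sum and the default weights are assumptions for illustration.
    """
    return sparse_reward + w_ai * ai_bonus + w_human * human_bonus


# One bonus tracks the AI agent's own states; the other tracks the human
# partner's observed (context, action) pairs. Both trackers are hypothetical.
ai_novelty = CountNovelty()
human_novelty = CountNovelty()

r = shaped_reward(
    sparse_reward=0.0,
    ai_bonus=ai_novelty.bonus("ai_state_0"),
    human_bonus=human_novelty.bonus(("kitchen_layout_A", "pick_onion")),
)
```

On a step with no sparse reward, the agent still receives a learning signal from the two bonuses, and that signal shrinks as the same states and human actions recur.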
Taxonomy
Topics: EEG and Brain-Computer Interfaces · Reinforcement Learning in Robotics
