Reinforcement Learning from Implicit Neural Feedback for Human-Aligned Robot Control
Suzie Kim

TL;DR
This paper introduces a novel reinforcement learning framework that leverages implicit EEG-based neural feedback to train robots effectively without explicit human input, improving learning efficiency and alignment.
Contribution
The paper presents a reinforcement learning from implicit human feedback (RLIHF) method that decodes EEG error signals into rewards, enabling human-aligned robot control without explicit feedback mechanisms.
Findings
Agents trained with EEG feedback perform comparably to those trained with manual rewards.
The approach reduces user cognitive load during training.
Effective policy learning is achieved with sparse external rewards.
Abstract
Conventional reinforcement learning (RL) approaches often struggle to learn effective policies under sparse reward conditions, necessitating the manual design of complex, task-specific reward functions. To address this limitation, reinforcement learning from human feedback (RLHF) has emerged as a promising strategy that complements hand-crafted rewards with human-derived evaluation signals. However, most existing RLHF methods depend on explicit feedback mechanisms such as button presses or preference labels, which disrupt the natural interaction process and impose a substantial cognitive load on the user. We propose a novel reinforcement learning from implicit human feedback (RLIHF) framework that utilizes non-invasive electroencephalography (EEG) signals, specifically error-related potentials (ErrPs), to provide continuous, implicit feedback without requiring explicit user…
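To make the idea of complementing a sparse task reward with decoded ErrP feedback concrete, here is a minimal sketch in Python. It is not the paper's implementation: the linear mapping from error probability to reward, the function names `implicit_reward` and `shaped_reward`, and the `weight` parameter are all assumptions chosen for illustration. The sketch assumes some EEG decoder (not shown) outputs a probability that the user perceived the robot's last action as an error.

```python
def implicit_reward(p_error: float) -> float:
    """Map a decoded ErrP probability in [0, 1] to a reward in [-1, 1].

    Hypothetical mapping: a confident "error" (p_error = 1) gives -1,
    a confident "no error" (p_error = 0) gives +1.
    """
    if not 0.0 <= p_error <= 1.0:
        raise ValueError("p_error must be a probability in [0, 1]")
    return 1.0 - 2.0 * p_error


def shaped_reward(env_reward: float, p_error: float, weight: float = 0.5) -> float:
    """Add the weighted implicit reward to the sparse environment reward.

    `weight` is an assumed hyperparameter that balances the two signals.
    """
    return env_reward + weight * implicit_reward(p_error)


if __name__ == "__main__":
    # Sparse setting: the environment gives no reward on this step,
    # but the (assumed) decoder is fairly sure the user saw an error.
    print(shaped_reward(env_reward=0.0, p_error=0.9))
```

With a combination like this, the agent receives a learning signal on every step even when the environment reward is zero, which is the situation the abstract describes as sparse reward conditions.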
Taxonomy
Topics: EEG and Brain-Computer Interfaces · Reinforcement Learning in Robotics · Neural and Behavioral Psychology Studies
