Learning Human-Like RL Agents Through Trajectory Optimization With Action Quantization
Jian-Ting Guo, Yu-Cheng Chen, Ping-Chun Hsieh, Kuo-Hao Ho, Po-Wei Huang, Ti-Rong Wu, I-Chen Wu

TL;DR
This paper introduces Macro Action Quantization (MAQ), a novel framework that enhances the human-likeness of reinforcement learning agents by aligning their trajectories with human behavior through trajectory optimization and macro actions.
Contribution
The paper proposes MAQ, a new method that distills human demonstrations into macro actions to produce more human-like RL agents, improving interpretability and trustworthiness.
Findings
MAQ significantly increases trajectory similarity scores.
MAQ achieves the highest human-likeness rankings in evaluations.
MAQ can be integrated into various RL algorithms.
Abstract
Human-like agents have long been one of the goals in pursuing artificial intelligence. Although reinforcement learning (RL) has achieved superhuman performance in many domains, relatively little attention has been focused on designing human-like RL agents. As a result, many reward-driven RL agents often exhibit unnatural behaviors compared to humans, raising concerns for both interpretability and trustworthiness. To achieve human-like behavior in RL, this paper first formulates human-likeness as trajectory optimization, where the objective is to find an action sequence that closely aligns with human behavior while also maximizing rewards, and adapts the classic receding-horizon control to human-like learning as a tractable and efficient implementation. To achieve this, we introduce Macro Action Quantization (MAQ), a human-like RL framework that distills human demonstrations into macro…
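The receding-horizon idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D environment, the hand-written `CODEBOOK`, the `GOAL` value, and every function name are hypothetical, chosen only to show the control loop. The sketch assumes that restricting the planner to a small set of macro actions (standing in for segments distilled from human demonstrations) is what keeps behavior close to the demonstrations, while the reward decides which macro action is chosen.

```python
# Toy illustration only: the environment, codebook, and scoring below are
# hypothetical and are not taken from the paper.

# Hypothetical codebook of macro actions. Each entry is a short sequence of
# primitive actions (-1 = step left, 0 = stay, +1 = step right), standing in
# for segments distilled from human demonstrations.
CODEBOOK = [
    (1, 1, 1),
    (1, 0, 1),
    (0, 0, 0),
    (-1, -1, -1),
]

GOAL = 5  # hypothetical target position


def step(state, action):
    """Toy 1-D dynamics: move by `action`; reward is negative distance to GOAL."""
    next_state = state + action
    return next_state, -abs(GOAL - next_state)


def rollout_return(state, macro):
    """Sum of rewards obtained by simulating one macro action from `state`."""
    total = 0
    for action in macro:
        state, reward = step(state, action)
        total += reward
    return total


def receding_horizon_control(state, n_steps):
    """Plan over the codebook at every step, execute only the first primitive
    action of the best-scoring macro action, then replan from the new state."""
    trajectory = [state]
    for _ in range(n_steps):
        best = max(CODEBOOK, key=lambda macro: rollout_return(state, macro))
        state, _ = step(state, best[0])
        trajectory.append(state)
    return trajectory


trajectory = receding_horizon_control(state=0, n_steps=8)
print(trajectory)
```

Replanning at every step is what makes this receding-horizon control: the agent commits to one primitive action at a time, so a poor macro action chosen early can be corrected once the state changes.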
Taxonomy
Topics: Reinforcement Learning in Robotics · Human Pose and Action Recognition · Multimodal Machine Learning Applications
