Actor-Critic Reinforcement Learning with Simultaneous Human Control and Feedback
Kory W. Mathewson, Patrick M. Pilarski

TL;DR
This study explores how humans deliver control and feedback simultaneously during human-robot interaction, proposing a framework and analyzing experimental results to improve collaborative reinforcement learning.
Contribution
It formalizes a general framework for concurrent human control and feedback in reinforcement learning and provides experimental insights into human behavior in such interactions.
Findings
Humans provide less feedback when controlling and giving feedback simultaneously.
Control signal quality remains unaffected during combined control and feedback.
Humans modify feedback timing and methods when delivering both control and feedback.
Abstract
This paper contributes a first study into how different human users deliver simultaneous control and feedback signals during human-robot interaction. As part of this work, we formalize and present a general interactive learning framework for online cooperation between humans and reinforcement learning agents. In many human-machine interaction settings, there is a growing gap between the degrees-of-freedom of complex semi-autonomous systems and the number of human control channels. Simple human control and feedback mechanisms are required to close this gap and allow for better collaboration between humans and machines on complex tasks. To better inform the design of concurrent control and feedback interfaces, we present experimental results from a human-robot collaborative domain wherein the human must simultaneously deliver both control and feedback signals to interactively train an…
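The abstract is truncated and does not give the learning update itself, so the following is only a toy sketch of the general idea: a one-step actor-critic agent whose state is a human control signal and whose reward is a human feedback signal. Everything here is an assumption made for illustration, not the authors' implementation: the simulated human, the two-valued control signal, the tabular parameters, and the names `train`, `greedy_policy`, and `feedback_prob` are all hypothetical. The `feedback_prob` parameter models a human who gives feedback on only some time steps.

```python
import math
import random


def softmax(prefs):
    """Convert action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=5000, alpha_actor=0.1, alpha_critic=0.1,
          gamma=0.9, feedback_prob=0.5, seed=0):
    """Toy one-step actor-critic driven by simulated human signals.

    Assumption: the 'human' issues a control signal in {0, 1} and, with
    probability feedback_prob, rewards the agent +1 if its action matches
    the control signal and -1 otherwise. No feedback counts as reward 0.
    """
    rng = random.Random(seed)
    n_controls, n_actions = 2, 2
    # Actor: one preference per (control signal, action) pair.
    theta = [[0.0] * n_actions for _ in range(n_controls)]
    # Critic: one value estimate per control signal.
    v = [0.0] * n_controls

    control = rng.randrange(n_controls)
    for _ in range(steps):
        probs = softmax(theta[control])
        action = rng.choices(range(n_actions), weights=probs)[0]

        # Simulated human feedback (hypothetical stand-in for a real user).
        if rng.random() < feedback_prob:
            reward = 1.0 if action == control else -1.0
        else:
            reward = 0.0

        next_control = rng.randrange(n_controls)

        # Temporal-difference error, shared by critic and actor.
        delta = reward + gamma * v[next_control] - v[control]

        # Critic update.
        v[control] += alpha_critic * delta

        # Actor update: gradient of log softmax policy, scaled by delta.
        for a in range(n_actions):
            grad = (1.0 if a == action else 0.0) - probs[a]
            theta[control][a] += alpha_actor * delta * grad

        control = next_control
    return theta, v


def greedy_policy(theta):
    """Most-preferred action for each control signal."""
    return [max(range(len(row)), key=row.__getitem__) for row in theta]


if __name__ == "__main__":
    theta, v = train()
    print("greedy action per control signal:", greedy_policy(theta))
```

In this sketch the control signal and the feedback signal arrive on the same time step, which is the concurrent setting the abstract describes. Lowering `feedback_prob` gives a rough way to explore how sparser feedback, as reported in the findings, would slow learning in this toy setting.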
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning
