Imagining In-distribution States: How Predictable Robot Behavior Can Enable User Control Over Learned Policies
Isaac Sheidlower, Emma Bethel, Douglas Lilly, Reuben M. Aronson, Elaine Schaertl Short

TL;DR
This paper introduces IODA, an algorithm that enables users to better control robots trained with RL by aligning robot behavior with user expectations, improving task success and user satisfaction.
Contribution
The paper formalizes the problem of shared control with RL policies and proposes IODA, a novel algorithm that enhances user control and task performance in robot manipulation.
Findings
IODA improves task success rates in user-robot collaboration.
With IODA, robot behavior aligns more closely with user expectations.
Task performance correlates strongly with how well the robot meets user expectations.
Abstract
It is crucial that users are empowered to take advantage of the functionality of a robot and use their understanding of that functionality to perform novel and creative tasks. Given a robot trained with Reinforcement Learning (RL), a user may wish to leverage that autonomy, along with their familiarity with how they expect the robot to behave, to collaborate with the robot. One technique is for the user to take control of some of the robot's action space through teleoperation, allowing the RL policy to simultaneously control the rest. We formalize this type of shared control as Partitioned Control (PC). However, this may not be possible using an out-of-the-box RL policy. For example, a user's control may bring the robot into a failure state from the policy's perspective, causing it to act unexpectedly and hindering the success of the user's desired task. In this work, we formalize this…
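The abstract describes Partitioned Control as the user teleoperating some action dimensions while the RL policy controls the rest. A minimal sketch of that action merge is given below. It is an illustration under stated assumptions, not the authors' implementation: the function name `partitioned_action`, the flat-vector action representation, and the index-based partition are all hypothetical, and the sketch does not attempt to reproduce IODA itself.

```python
from typing import List, Sequence


def partitioned_action(
    policy_action: Sequence[float],
    user_action: Sequence[float],
    user_dims: Sequence[int],
) -> List[float]:
    """Merge a policy action with user teleoperation input.

    Hypothetical helper: `user_dims` lists the action dimensions the user
    controls, and `user_action` gives the values for those dimensions in
    the same order. All other dimensions keep the policy's output.
    """
    if len(user_action) != len(user_dims):
        raise ValueError("user_action and user_dims must have the same length")
    if len(set(user_dims)) != len(user_dims):
        raise ValueError("user_dims must not contain duplicates")

    combined = list(policy_action)  # copy so the policy output is not mutated
    for dim, value in zip(user_dims, user_action):
        if not 0 <= dim < len(combined):
            raise IndexError(f"action dimension {dim} is out of range")
        combined[dim] = value  # the user overrides this dimension
    return combined


# Example: a 4-dimensional action where the user controls dimensions 0 and 1.
merged = partitioned_action([0.1, -0.2, 0.3, 1.0], [0.5, 0.0], [0, 1])
print(merged)
```

In this sketch the policy still computes a full action at every step and only its output on the user's dimensions is discarded. The failure case the abstract mentions arises because the resulting states may be ones the policy never encountered during training.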
Taxonomy
Topics: Reinforcement Learning in Robotics
