Formulating Reinforcement Learning for Human-Robot Collaboration through Off-Policy Evaluation
Saurav Singh, Rodney Sanchez, Alexander Ororbia, Jamison Heard

TL;DR
This paper introduces an offline reinforcement learning framework that uses off-policy evaluation (OPE) to select state representations and reward functions for human-robot collaboration from logged data, reducing reliance on costly real-time interaction.
Contribution
It proposes an OPE-based method for automatic selection of state spaces and reward functions in RL, validated on simulated and real-world human-robot interaction environments.
Findings
Selects high-performing policies using only logged data
Reduces the need for environment interaction during RL setup
Applies to complex human-robot collaboration scenarios
Abstract
Reinforcement learning (RL) has the potential to transform real-world decision-making systems by enabling autonomous agents to learn from experience. Deploying RL in real-world settings, especially in the context of human-robot interaction, requires defining state representations and reward functions, which are critical for learning efficiency and policy performance. Traditional RL approaches often rely on domain expertise and trial-and-error, necessitating extensive human involvement as well as direct interaction with the environment, which can be costly and impractical, especially in complex and safety-critical applications. This work proposes a novel RL framework that leverages off-policy evaluation (OPE) for state space and reward function selection, using only logged interaction data. This approach eliminates the need for real-time access to the environment or human-in-the-loop…
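The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes weighted importance sampling as the OPE estimator (the summary does not say which estimator the paper uses), assumes the logged data stores the behavior policy's action probabilities, and assumes every candidate is scored against one shared logged evaluation reward so the estimates are comparable. The names `wis_estimate`, `select_formulation`, and the trajectory tuple layout are hypothetical.

```python
import numpy as np


def wis_estimate(trajectories, target_prob, gamma=0.99):
    """Weighted importance sampling estimate of a candidate policy's value.

    trajectories: list of episodes; each episode is a list of
        (obs, action, behavior_prob, reward) tuples (assumed log layout).
    target_prob: callable (obs, action) -> probability the candidate
        policy assigns to that logged action.
    """
    weights, returns = [], []
    for episode in trajectories:
        weight, ret = 1.0, 0.0
        for t, (obs, action, behavior_prob, reward) in enumerate(episode):
            # Cumulative likelihood ratio between candidate and behavior policy.
            weight *= target_prob(obs, action) / behavior_prob
            # Discounted return under the shared evaluation reward.
            ret += (gamma ** t) * reward
        weights.append(weight)
        returns.append(ret)
    weights = np.asarray(weights, dtype=float)
    returns = np.asarray(returns, dtype=float)
    total = weights.sum()
    if total == 0.0:
        # The candidate never agrees with the logged actions: no evidence.
        return float("-inf")
    return float(np.dot(weights, returns) / total)


def select_formulation(trajectories, candidates, gamma=0.99):
    """Score each candidate with OPE and return the best name plus all scores.

    candidates: dict mapping a formulation name (for example, one state space
        and reward function pairing) to the target_prob callable of the
        policy trained offline under that formulation.
    """
    scores = {
        name: wis_estimate(trajectories, prob_fn, gamma)
        for name, prob_fn in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

In use, each candidate policy would first be trained offline under its own state representation and reward function, then passed to `select_formulation` together with the logged episodes. No environment or human interaction is needed at selection time, which is the property the abstract emphasizes.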
Taxonomy
Topics: Reinforcement Learning in Robotics · Human-Automation Interaction and Safety · Social Robot Interaction and HRI
