A Hierarchical Bayesian model for Inverse RL in Partially-Controlled Environments
Kenneth Bogert (University of North Carolina Asheville), Prashant Doshi (University of Georgia)

TL;DR
This paper introduces a hierarchical Bayesian model for inverse reinforcement learning that effectively filters out confounding observations in partially-controlled environments, improving learning accuracy in robotic tasks.
Contribution
It extends an existing IRL algorithm, originally designed to work under partial occlusion of the expert, to explicitly model the diverse, confounding observations a robot may receive in real-world scenarios.
Findings
Outperforms several comparative methods in simulated robotic sorting tasks
Effectively filters out confounding observations in environments with occlusion
Ranks second only to a method with perfect knowledge of the expert's trajectory
Abstract
Robots learning from observations in the real world using inverse reinforcement learning (IRL) may encounter objects or agents in the environment, other than the expert, that cause nuisance observations during the demonstration. These confounding elements are typically removed in fully-controlled environments such as virtual simulations or lab settings. When complete removal is impossible, the nuisance observations must be filtered out. However, identifying the source of observations when a large number of observations is made is difficult. To address this, we present a hierarchical Bayesian model that incorporates both the expert's and the confounding elements' observations, thereby explicitly modeling the diverse observations a robot may receive. We extend an existing IRL algorithm originally designed to work under partial occlusion of the expert to consider the diverse observations. In…
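Illustration
The idea of attributing each observation to a source can be sketched with plain Bayes' rule. This is not the paper's hierarchical model or its IRL algorithm. It is a minimal, hypothetical example that assumes one-dimensional observations, Gaussian observation noise, and known mean positions for the expert and for a single confounding element. The function name, parameters, and data values are all assumptions made for illustration.

```python
import math


def source_posterior(obs, expert_mean, confounder_mean,
                     sigma=1.0, prior_expert=0.5):
    """Posterior probability that `obs` was generated by the expert.

    Hypothetical setup (not from the paper): each source emits
    observations with Gaussian noise of standard deviation `sigma`
    around a known mean.
    """
    def gaussian_pdf(x, mean):
        z = (x - mean) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

    # Joint probability of the observation with each candidate source.
    joint_expert = gaussian_pdf(obs, expert_mean) * prior_expert
    joint_confounder = gaussian_pdf(obs, confounder_mean) * (1.0 - prior_expert)

    # Bayes' rule: normalize over the two candidate sources.
    return joint_expert / (joint_expert + joint_confounder)


# Made-up observations: the expert is near 0.0, the confounder near 5.0.
observations = [0.1, 4.9, 0.3, 5.2, 2.5]
weights = [source_posterior(o, expert_mean=0.0, confounder_mean=5.0)
           for o in observations]

for o, w in zip(observations, weights):
    print(f"obs={o:4.1f}  P(expert | obs)={w:.4f}")
```

Weights like these could be used to down-weight likely nuisance observations instead of discarding them outright. An observation halfway between the two sources gets a weight of 0.5, which reflects genuine ambiguity about its origin.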
Taxonomy
Topics: Reinforcement Learning in Robotics · Machine Learning and Algorithms · Evolutionary Algorithms and Applications
