Null Counterfactual Factor Interactions for Goal-Conditioned Reinforcement Learning
Caleb Chuck, Fan Feng, Carl Qi, Chang Shi, Siddhant Agarwal, Amy Zhang, Scott Niekum

TL;DR
This paper introduces a novel approach combining null counterfactual interaction inference with hindsight relabeling to enhance sample efficiency in goal-conditioned reinforcement learning, especially in object-centric domains.
Contribution
It proposes a new definition of interactions based on null counterfactuals and integrates it with hindsight relabeling to improve learning in object-centric GCRL tasks.
Findings
HInt improves sample efficiency by up to 4x.
NCII (null counterfactual interaction inference) achieves higher interaction inference accuracy.
The method performs well in robotic and dynamic domains.
Abstract
Hindsight relabeling is a powerful tool for overcoming sparsity in goal-conditioned reinforcement learning (GCRL), especially in certain domains such as navigation and locomotion. However, hindsight relabeling can struggle in object-centric domains. For example, suppose that the goal space consists of a robotic arm pushing a particular target block to a goal location. In this case, hindsight relabeling will give high rewards to any trajectory that does not interact with the block. However, these behaviors are only useful when the object is already at the goal -- an extremely rare case in practice. A dataset dominated by these kinds of trajectories can complicate learning and lead to failures. In object-centric domains, one key intuition is that meaningful trajectories are often characterized by object-object interactions such as pushing the block with the gripper. To leverage this…
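The failure mode described in the abstract can be illustrated with a small sketch. It relabels two toy trajectories with the block's final position as the goal, then filters them by whether the block moved. The function names, the tolerance values, and the displacement check are all hypothetical: the displacement check is only a crude stand-in for an interaction detector, not the paper's null counterfactual interaction inference.

```python
import numpy as np


def hindsight_relabel(block_positions, goal_tolerance=0.05):
    """Relabel a trajectory using the block's final position as the goal.

    Returns the relabeled goal and a sparse reward per step: 1.0 when the
    block is within `goal_tolerance` of the goal, else 0.0.
    (Hypothetical helper; the tolerance is an illustrative assumption.)
    """
    goal = block_positions[-1]
    dists = np.linalg.norm(block_positions - goal, axis=-1)
    rewards = (dists <= goal_tolerance).astype(float)
    return goal, rewards


def block_moved(block_positions, min_displacement=0.01):
    """Crude proxy for an interaction: did the block end up somewhere else?

    This is NOT the paper's interaction test, only a placeholder filter.
    """
    displacement = np.linalg.norm(block_positions[-1] - block_positions[0])
    return displacement > min_displacement


# Trajectory where the arm never touches the block: the block stays at the origin.
idle = np.zeros((5, 2))
# Trajectory where the block is pushed along x from 0.0 to 1.0.
pushed = np.stack([np.linspace(0.0, 1.0, 5), np.zeros(5)], axis=1)

_, idle_rewards = hindsight_relabel(idle)
_, pushed_rewards = hindsight_relabel(pushed)

# Keep only trajectories that pass the proxy interaction filter.
kept = [name for name, traj in [("idle", idle), ("pushed", pushed)]
        if block_moved(traj)]
```

In this toy setup the idle trajectory is rewarded at every step after relabeling even though nothing useful happened, while the pushed trajectory is rewarded only at its final step. Filtering on the proxy keeps only the pushed trajectory, which is the kind of selection an interaction-aware relabeling scheme aims for.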
Taxonomy
Topics: Software Engineering Research · Multi-Criteria Decision Making · Reinforcement Learning in Robotics
Methods: Hierarchical Information Threading
