Null Counterfactual Factor Interactions for Goal-Conditioned Reinforcement Learning

Caleb Chuck; Fan Feng; Carl Qi; Chang Shi; Siddhant Agarwal; Amy Zhang; Scott Niekum

arXiv:2505.03172·cs.LG·May 7, 2025

TL;DR

This paper introduces a novel approach combining null counterfactual interaction inference with hindsight relabeling to enhance sample efficiency in goal-conditioned reinforcement learning, especially in object-centric domains.

Contribution

It proposes a new definition of interactions based on null counterfactuals and integrates it with hindsight relabeling to improve learning in object-centric GCRL tasks.
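As a rough illustration of what an interaction test based on null counterfactuals could look like, the sketch below removes ("nulls") one factor from the state and checks whether the predicted next value of another factor changes. This is only one reading of the idea, under stated assumptions: the hand-written `toy_block_dynamics`, the `None`-as-null encoding, and the function name `null_counterfactual_interaction` are all hypothetical and are not taken from the paper, which may define and estimate interactions differently (for example with a learned model).

```python
from typing import Callable, Dict, Optional

# Hypothetical encoding: factor name -> 1-D position, None means the factor is nulled (absent).
State = Dict[str, Optional[float]]


def toy_block_dynamics(state: State) -> float:
    """Hand-written toy dynamics (an assumption, not the paper's model).

    The block moves by +0.1 only when a gripper is present and touching it.
    """
    block = state["block"]
    gripper = state["gripper"]
    if gripper is not None and abs(gripper - block) < 0.05:
        return block + 0.1
    return block


def null_counterfactual_interaction(
    state: State,
    source: str,
    target_dynamics: Callable[[State], float],
    eps: float = 1e-6,
) -> bool:
    """Return True if nulling `source` changes the target's predicted next value."""
    actual = target_dynamics(state)
    nulled = dict(state)
    nulled[source] = None  # counterfactual state in which the source factor is absent
    counterfactual = target_dynamics(nulled)
    return abs(actual - counterfactual) > eps


touching = {"gripper": 1.0, "block": 1.0}
apart = {"gripper": 0.0, "block": 1.0}

gripper_pushes_block = null_counterfactual_interaction(touching, "gripper", toy_block_dynamics)
gripper_far_away = null_counterfactual_interaction(apart, "gripper", toy_block_dynamics)
```

In this toy setup, the gripper counts as interacting with the block only in the `touching` state, since removing it there changes where the block ends up.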

Findings

01. HInt (Hindsight Relabeling using Interactions) improves sample efficiency by up to 4x.

02. NCII (Null Counterfactual Interaction Inference) achieves higher interaction inference accuracy.

03. The method performs well in robotic and dynamic domains.

Abstract

Hindsight relabeling is a powerful tool for overcoming sparsity in goal-conditioned reinforcement learning (GCRL), especially in certain domains such as navigation and locomotion. However, hindsight relabeling can struggle in object-centric domains. For example, suppose that the goal space consists of a robotic arm pushing a particular target block to a goal location. In this case, hindsight relabeling will give high rewards to any trajectory that does not interact with the block. However, these behaviors are only useful when the object is already at the goal -- an extremely rare case in practice. A dataset dominated by these kinds of trajectories can complicate learning and lead to failures. In object-centric domains, one key intuition is that meaningful trajectories are often characterized by object-object interactions such as pushing the block with the gripper. To leverage this…
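The failure mode described in the abstract can be sketched on a toy 1-D pushing task. The example below relabels each trajectory with the block's final position as the goal, then shows that a trajectory in which the block never moves is rewarded at every step. The `Step` record, the `interacted` flag, the tolerance value, and the simple `keep_for_relabeling` filter are hypothetical illustrations; they are assumptions made for this sketch and not the paper's implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    gripper: float    # toy 1-D gripper position
    block: float      # toy 1-D block position
    interacted: bool  # hypothetical flag: did the gripper affect the block at this step?


def sparse_reward(block: float, goal: float, tol: float = 0.05) -> float:
    """Sparse goal-reaching reward: 1 when the block is within `tol` of the goal."""
    return 1.0 if abs(block - goal) <= tol else 0.0


def hindsight_relabel(traj: List[Step]) -> Tuple[float, List[float]]:
    """Relabel a trajectory using the final achieved block position as the goal."""
    goal = traj[-1].block
    return goal, [sparse_reward(s.block, goal) for s in traj]


def keep_for_relabeling(traj: List[Step]) -> bool:
    """Assumed filter: only relabel trajectories containing at least one interaction."""
    return any(s.interacted for s in traj)


# The gripper wanders and never touches the block.
idle = [Step(0.0, 1.0, False), Step(0.2, 1.0, False), Step(0.4, 1.0, False)]
# The gripper reaches the block and pushes it.
push = [Step(0.8, 1.0, False), Step(1.0, 1.2, True), Step(1.2, 1.4, True)]

idle_goal, idle_rewards = hindsight_relabel(idle)  # rewarded at every step
push_goal, push_rewards = hindsight_relabel(push)  # rewarded only once the block arrives

filtered = [t for t in (idle, push) if keep_for_relabeling(t)]
```

Under these assumptions, the idle trajectory receives full reward after relabeling even though nothing useful happened, while the interaction-based filter keeps only the pushing trajectory.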

Taxonomy

Topics: Software Engineering Research · Multi-Criteria Decision Making · Reinforcement Learning in Robotics

Methods: Hierarchical Information Threading