Hierarchical Entity-centric Reinforcement Learning with Factored Subgoal Diffusion

Dan Haramati; Carl Qi; Tal Daniel; Amy Zhang; Aviv Tamar; George Konidaris

arXiv:2602.02722·cs.LG·February 4, 2026

TL;DR

This paper introduces a hierarchical, entity-centric reinforcement learning framework that combines subgoal decomposition with factored structures, significantly improving performance on complex, multi-entity, long-horizon tasks with sparse rewards.

Contribution

It presents a novel modular approach integrating a value-based GCRL agent with a factored diffusion subgoal generator, enhancing scalability and generalization in multi-entity environments.

Findings

01 Achieves over 150% higher success rates on complex tasks

02 Generalizes to longer horizons and more entities

03 Boosts performance of existing GCRL algorithms

Abstract

We propose a hierarchical entity-centric framework for offline Goal-Conditioned Reinforcement Learning (GCRL) that combines subgoal decomposition with factored structure to solve long-horizon tasks in domains with multiple entities. Achieving long-horizon goals in complex environments remains a core challenge in Reinforcement Learning (RL). Domains with multiple entities are particularly difficult due to their combinatorial complexity. GCRL facilitates generalization across goals and the use of subgoal structure, but struggles with high-dimensional observations and combinatorial state-spaces, especially under sparse reward. We employ a two-level hierarchy composed of a value-based GCRL agent and a factored subgoal-generating conditional diffusion model. The RL agent and subgoal generator are trained independently and composed post hoc through selective subgoal generation based on the…
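The composition described in the abstract can be sketched in a few lines. The abstract is cut off before it states the selection criterion, so the rule used here is an assumption: candidates are kept if a goal-conditioned value function rates them as reachable from the current state, and the kept candidate closest to the final goal is chosen. Everything in the block is a hypothetical stand-in and not the authors' code: `toy_value` replaces a learned value function, `toy_subgoal_sampler` replaces the factored diffusion model, and entities are plain 2-D positions.

```python
import random
from typing import Callable, List, Tuple

Entity = Tuple[float, float]   # hypothetical: one entity = one 2-D position
State = Tuple[Entity, ...]     # factored state: one slot per entity


def toy_value(state: State, goal: State) -> float:
    """Stand-in for a learned goal-conditioned value function.

    Returns the negative summed per-entity L1 distance, so values are <= 0
    and larger means "closer".
    """
    return -sum(
        abs(sx - gx) + abs(sy - gy)
        for (sx, sy), (gx, gy) in zip(state, goal)
    )


def toy_subgoal_sampler(
    state: State, goal: State, n: int, rng: random.Random
) -> List[State]:
    """Stand-in for the subgoal generator (NOT a diffusion model).

    Each candidate moves every entity independently by a random fraction
    of the way toward its goal position.
    """
    candidates = []
    for _ in range(n):
        subgoal = []
        for (sx, sy), (gx, gy) in zip(state, goal):
            t = rng.choice([0.0, 0.5, 1.0])
            subgoal.append((sx + t * (gx - sx), sy + t * (gy - sy)))
        candidates.append(tuple(subgoal))
    return candidates


def select_subgoal(
    state: State,
    goal: State,
    candidates: List[State],
    value: Callable[[State, State], float],
    reach_threshold: float,
) -> State:
    """Assumed selection rule: filter by reachability, then rank by progress.

    Falls back to the final goal when no candidate passes the filter.
    """
    reachable = [c for c in candidates if value(state, c) >= reach_threshold]
    if not reachable:
        return goal
    return max(reachable, key=lambda c: value(c, goal))


if __name__ == "__main__":
    rng = random.Random(0)
    state: State = ((0.0, 0.0), (0.0, 0.0))
    goal: State = ((2.0, 0.0), (0.0, 2.0))
    candidates = toy_subgoal_sampler(state, goal, n=8, rng=rng)
    print(select_subgoal(state, goal, candidates, toy_value, reach_threshold=-2.0))
```

Because the generator and the value function only meet inside `select_subgoal`, either one can be swapped without retraining the other, which is the modularity the abstract emphasizes. In a full agent, the selected subgoal would then be passed to the low-level goal-conditioned policy in place of the final goal.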

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

1. Originality: The framework is modular and compatible with various value-based GCRL algorithms. Existing hierarchical diffusers diffuse over global subgoals without explicit entity factorization. EC-Diffuser uses entity-centric diffusion, but for behavior cloning rather than subgoal generation. Hence, the proposed combination of a conditional diffusion model over entity-factored subgoals with a value-based GCRL agent is novel. The paper also makes it clear that their framework builds directly on two

Weaknesses

1. Figure quality and readability: The figures are difficult to read in their current form. In Figure 1, the circles and arrows are thin and low-contrast, and the text labels are very small and faint. As a result, it is hard to discern the structure and details of the illustration. Given that this is the first figure of the paper and is meant to convey the main idea, it should be redesigned with larger fonts, clearer icons, and higher contrast to make the diagram easy to understand for the reade

Reviewer 02 · Rating 6 · Confidence 3

Strengths

This paper is well written, well constructed, and easy to follow. In particular, its illustrations, such as Figure 2, are very helpful in understanding the overall idea. The proposed method effectively addresses challenging long-horizon, multi-entity tasks.

Weaknesses

The proposed components are based on existing methodologies, such as entity-factored subgoals and subgoal diffusers, so the novelty of the approach itself may be limited. However, the meaningful combination of these components has yielded strong performance. Further points are raised in the questions.

Reviewer 03 · Rating 4 · Confidence 4

Strengths

While hierarchical structures have been explored in prior work, utilizing diffusion models to generate subgoals that incorporate information from multiple entities stands out as a novel approach. Additionally, modifying the experimental environments to highlight the advantages of the proposed method and comparing it against baselines strengthens the paper's contributions.

Weaknesses

The paper would benefit from more detailed explanations of model design choices, along with ablation studies to justify them. Although the use of diffusion for subgoal generation is innovative, it remains unclear whether this component is essential for the performance gains. There is a concern that filtering the generated subgoals with the value function, which is not used during diffusion model training, might play a more critical role than the diffusion process itself.

Code & Models

Models

🤗
DanHrmti/hecrl_visual_encoders
model

Datasets

DanHrmti/hecrl
dataset · 23 dl

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications