Task-Oriented Hierarchical Object Decomposition for Visuomotor Control

Jianing Qian; Yunshuang Li; Bernadette Bucher; Dinesh Jayaraman

arXiv:2411.01284·cs.RO·November 5, 2024

Open Access · 3 Reviews

TL;DR

HODOR introduces a hierarchical, scene-entity-based representation that improves sample efficiency and generalization in visuomotor control tasks by selectively assembling task-specific features.

Contribution

The paper presents HODOR, a novel hierarchical object decomposition method that scales representations with scene complexity and enhances task-specific learning.

Findings

01

HODOR outperforms prior representations in imitation learning tasks.

02

HODOR's invariances enable robust zero-shot generalization.

03

HODOR scales with scene and task complexity.

Abstract

Good pre-trained visual representations could enable robots to learn visuomotor policies efficiently. Still, existing representations take a one-size-fits-all-tasks approach that comes with two important drawbacks: (1) being completely task-agnostic, these representations cannot effectively ignore task-irrelevant information in the scene, and (2) they often lack the representational capacity to handle unconstrained/complex real-world scenes. Instead, we propose to train a large combinatorial family of representations organized by scene entities: objects and object parts. This hierarchical object decomposition for task-oriented representations (HODOR) permits selectively assembling different representations specific to each task while scaling in representational capacity with the complexity of the scene and the task. In our experiments, we find that HODOR outperforms prior pre-trained…
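
The idea of selectively assembling a task-specific representation from per-entity features can be illustrated with a minimal sketch. This is not the authors' implementation: the entity names, the feature values, and the function `assemble_task_representation` are all hypothetical, and simple concatenation is assumed as the way to combine features.

```python
from typing import Dict, List, Sequence

# Hypothetical store: scene entity name -> feature vector.
# Object parts are written as "object/part" purely for illustration.
EntityFeatures = Dict[str, List[float]]


def assemble_task_representation(
    entity_features: EntityFeatures,
    task_entities: Sequence[str],
) -> List[float]:
    """Concatenate the features of only the task-relevant entities.

    Entities not listed in `task_entities` are ignored, so the result
    does not change when their features change.
    """
    missing = [name for name in task_entities if name not in entity_features]
    if missing:
        raise KeyError(f"no features for entities: {missing}")

    assembled: List[float] = []
    for name in task_entities:
        assembled.extend(entity_features[name])
    return assembled


# Made-up scene with one object, one object part, and two irrelevant entities.
scene: EntityFeatures = {
    "mug": [0.1, 0.2],
    "mug/handle": [0.3, 0.4],
    "table": [0.5, 0.6],
    "distractor_toy": [0.7, 0.8],
}

# A grasping task might only need the mug and its handle.
grasp_repr = assemble_task_representation(scene, ["mug", "mug/handle"])
```

In this sketch the representation length grows with the number of selected entities, and entities left out of the selection cannot influence the downstream policy input.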

Peer Reviews

Decision·CoRL 2024

Reviewer 01 · Rating 2 · Confidence 5

Reviewer 02 · Rating 3 · Confidence 3

Reviewer 03 · Rating 4 · Confidence 5

Taxonomy

Topics: Human Motion and Animation · Human Pose and Action Recognition