Learning Semantic-Geometric Task Graph-Representations from Human Demonstrations

Franziska Herbert; Vignesh Prasad; Han Liu; Dorothea Koert; Georgia Chalvatzaki

arXiv:2601.11460 · cs.RO · January 19, 2026



TL;DR

This paper introduces a novel semantic-geometric task graph representation learned from human demonstrations, enabling better understanding and transfer of complex manipulation tasks involving multiple objects and actions.

Contribution

The paper proposes a new graph-based task representation and a learning framework that combines a Message Passing Neural Network (MPNN) encoder with a Transformer-based decoder, so that task semantics and geometry are captured jointly from demonstrations.

Findings

01. Graph representations improve task understanding in variable scenarios.

02. Decoupled scene encoding and action reasoning enhance prediction accuracy.

03. Transferred task graphs enable effective robot action planning.

Abstract

Learning structured task representations from human demonstrations is essential for understanding long-horizon manipulation behaviors, particularly in bimanual settings where action ordering, object involvement, and interaction geometry can vary significantly. A key challenge lies in jointly capturing the discrete semantic structure of tasks and the temporal evolution of object-centric geometric relations in a form that supports reasoning over task progression. In this work, we introduce a semantic-geometric task graph-representation that encodes object identities, inter-object relations, and their temporal geometric evolution from human demonstrations. Building on this formulation, we propose a learning framework that combines a Message Passing Neural Network (MPNN) encoder with a Transformer-based decoder, decoupling scene representation learning from action-conditioned reasoning…
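To make the idea of a graph that stores object identities, inter-object relations, and their geometric evolution over time more concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Frame` and `TaskGraph` classes, the use of pairwise Euclidean distance as the only geometric relation, and the `approaching_pairs` helper are all hypothetical choices made for illustration, and the object names and positions are invented toy data.

```python
from dataclasses import dataclass, field
from itertools import combinations
from math import dist
from typing import Dict, List, Tuple

Position = Tuple[float, float, float]


@dataclass
class Frame:
    """One time step of a demonstration: object label -> 3D position.

    Hypothetical input format, assumed for this sketch only.
    """
    positions: Dict[str, Position]


@dataclass
class TaskGraph:
    """Nodes are object labels; each edge stores one distance per frame."""
    nodes: List[str]
    edges: Dict[Tuple[str, str], List[float]] = field(default_factory=dict)


def build_task_graph(frames: List[Frame]) -> TaskGraph:
    """Build a fully connected graph whose edges track distance over time."""
    nodes = sorted(frames[0].positions)
    graph = TaskGraph(nodes=nodes)
    for a, b in combinations(nodes, 2):
        # Assumption: Euclidean distance stands in for the geometric relation.
        graph.edges[(a, b)] = [
            dist(f.positions[a], f.positions[b]) for f in frames
        ]
    return graph


def approaching_pairs(graph: TaskGraph, tol: float = 1e-6) -> List[Tuple[str, str]]:
    """Return object pairs whose distance shrinks from first to last frame."""
    return [
        pair for pair, d in graph.edges.items() if d[-1] < d[0] - tol
    ]


# Toy two-frame demonstration: the left hand moves toward the cup.
demo = [
    Frame({"cup": (0.0, 0.0, 0.0), "left_hand": (0.3, 0.0, 0.0), "bottle": (0.0, 0.4, 0.0)}),
    Frame({"cup": (0.0, 0.0, 0.0), "left_hand": (0.1, 0.0, 0.0), "bottle": (0.0, 0.4, 0.0)}),
]
graph = build_task_graph(demo)
print(approaching_pairs(graph))
```

In a learned pipeline such as the one the abstract describes, per-edge time series like these would plausibly serve as input features to a graph encoder rather than being thresholded by hand. The hand-written `approaching_pairs` rule is only there to show what information the edges carry.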


Taxonomy

Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Reinforcement Learning in Robotics