ConceptACT: Episode-Level Concepts for Sample-Efficient Robotic Imitation Learning

Jakob Karalus; Friedhelm Schwenker

arXiv:2601.17135·cs.LG·January 27, 2026

Open Access

TL;DR

ConceptACT adds episode-level semantic concepts to transformer-based robotic imitation learning. Human-provided concept annotations are used only during training, and they significantly improve sample efficiency and convergence speed.

Contribution

The paper presents a novel transformer-based architecture that integrates semantic concepts during training, improving imitation learning without requiring semantic input at deployment.

Findings

01 Faster convergence and higher sample efficiency compared to standard ACT.

02 Semantic supervision with concepts outperforms naive auxiliary or language-conditioned methods.

03 Architectural integration via attention mechanisms is crucial for performance gains.

Abstract

Imitation learning enables robots to acquire complex manipulation skills from human demonstrations, but current methods rely solely on low-level sensorimotor data while ignoring the rich semantic knowledge humans naturally possess about tasks. We present ConceptACT, an extension of Action Chunking with Transformers that leverages episode-level semantic concept annotations during training to improve learning efficiency. Unlike language-conditioned approaches that require semantic input at deployment, ConceptACT uses human-provided concepts (object properties, spatial relationships, task constraints) exclusively during demonstration collection, adding minimal annotation burden. We integrate concepts using a modified transformer architecture in which the final encoder layer implements concept-aware cross-attention, supervised to align with human annotations. Through experiments on two…
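The concept-aware cross-attention described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the use of one learned query per concept, the binary concept labels, the binary cross-entropy loss, and the `concept_weight` value are all assumptions made for illustration. The abstract only states that the final encoder layer uses concept-aware cross-attention supervised to align with human annotations.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def concept_cross_attention(tokens, concept_queries, w_k, w_v, w_out):
    """Hypothetical concept head: each concept query attends over encoder tokens.

    tokens: (T, D) encoder token features for one episode step.
    concept_queries: (C, D), one assumed learned query per concept.
    w_k, w_v: (D, D) key/value projections; w_out: (C, D) per-concept readout.
    Returns (C,) concept logits and the (C, T) attention map.
    """
    keys = tokens @ w_k
    values = tokens @ w_v
    scores = concept_queries @ keys.T / np.sqrt(keys.shape[-1])
    attn = softmax(scores, axis=-1)          # each row sums to 1 over tokens
    pooled = attn @ values                   # (C, D) concept-specific summary
    logits = (pooled * w_out).sum(axis=-1)   # one scalar logit per concept
    return logits, attn

def bce_with_logits(logits, targets):
    # Stable binary cross-entropy: max(x, 0) - x * t + log(1 + exp(-|x|)).
    return np.mean(np.maximum(logits, 0) - logits * targets
                   + np.log1p(np.exp(-np.abs(logits))))

def training_loss(action_loss, logits, concept_labels, concept_weight=0.1):
    # Assumed combination: imitation loss plus a weighted concept-alignment term.
    return action_loss + concept_weight * bce_with_logits(logits, concept_labels)

# Toy example with random weights (all sizes are arbitrary).
rng = np.random.default_rng(0)
T, D, C = 6, 8, 3
tokens = rng.normal(size=(T, D))
queries = rng.normal(size=(C, D))
w_k, w_v = rng.normal(size=(D, D)), rng.normal(size=(D, D))
w_out = rng.normal(size=(C, D))
labels = np.array([1.0, 0.0, 1.0])  # episode-level annotations, assumed binary

logits, attn = concept_cross_attention(tokens, queries, w_k, w_v, w_out)
total = training_loss(action_loss=0.5, logits=logits, concept_labels=labels)
```

In a setup like this, the concept term only shapes the encoder during training. At deployment the concept logits can simply be ignored, which matches the abstract's point that no semantic input is needed at inference time.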

Taxonomy

Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Reinforcement Learning in Robotics