Language-Conditioned Change-point Detection to Identify Sub-Tasks in Robotics Domains
Divyanshu Raj, Chitta Baral, Nakul Gopalan

TL;DR
This paper introduces a language-conditioned change-point detection method to identify sub-tasks within robot trajectories guided by natural language instructions, improving accuracy over baseline methods and analyzing sample complexity for real-world applicability.
Contribution
The work presents a novel approach that adapts video moment retrieval techniques to robot sub-task segmentation using language, with extensive experiments and sample complexity analysis.
Findings
Achieved a 1.78% improvement over the baseline in sub-task identification accuracy.
Demonstrated the feasibility of language-conditioned change-point detection in robotics.
Provided insights into sample complexity requirements for real robot scenarios.
Abstract
In this work, we present an approach to identify sub-tasks within a demonstrated robot trajectory using language instructions. We identify these sub-tasks using language provided during demonstrations as guidance to identify sub-segments of a longer robot trajectory. Given a sequence of natural language instructions and a long trajectory consisting of image frames and discrete actions, we want to map an instruction to a smaller fragment of the trajectory. Unlike previous instruction following works which directly learn the mapping from language to a policy, we propose a language-conditioned change-point detection method to identify sub-tasks in a problem. Our approach learns the relationship between constituent segments of a long language command and corresponding constituent segments of a trajectory. These constituent trajectory segments can be used to learn subtasks or sub-goals for…
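To make the problem setup concrete, the sketch below shows one simple way to turn per-frame scores into change points. It is not the paper's model. It assumes a hypothetical matrix `scores[k][t]` that rates how well instruction `k` matches frame `t` (in practice this would come from some learned language-vision model), and it assumes that instructions are executed in order and that each one covers at least one frame. Under those assumptions, a small dynamic program picks the segment boundaries that maximize the total score.

```python
from typing import List


def segment_trajectory(scores: List[List[float]]) -> List[int]:
    """Return the frame indices at which each new instruction begins.

    scores[k][t] is a hypothetical similarity between instruction k and
    frame t. Instructions are assumed to occur in order, each covering
    at least one contiguous frame. The result has len(scores) - 1 entries.
    """
    num_instr = len(scores)
    num_frames = len(scores[0])
    if num_frames < num_instr:
        raise ValueError("need at least one frame per instruction")

    neg_inf = float("-inf")
    # best[k][t]: best total score for frames 0..t with frame t assigned to k.
    best = [[neg_inf] * num_frames for _ in range(num_instr)]
    # advanced[k][t]: True if frame t is the first frame of instruction k.
    advanced = [[False] * num_frames for _ in range(num_instr)]
    best[0][0] = scores[0][0]

    for t in range(1, num_frames):
        for k in range(num_instr):
            stay = best[k][t - 1]
            move = best[k - 1][t - 1] if k > 0 else neg_inf
            if move > stay:
                best[k][t] = move + scores[k][t]
                advanced[k][t] = True
            else:
                best[k][t] = stay + scores[k][t]

    # Walk backwards from the last frame of the last instruction.
    change_points: List[int] = []
    k = num_instr - 1
    for t in range(num_frames - 1, 0, -1):
        if advanced[k][t]:
            change_points.append(t)
            k -= 1
    change_points.reverse()
    return change_points


# Toy example: 3 instructions, 5 frames (made-up scores).
example_scores = [
    [0.9, 0.8, 0.1, 0.1, 0.0],
    [0.1, 0.2, 0.9, 0.7, 0.2],
    [0.0, 0.1, 0.1, 0.3, 0.9],
]
print(segment_trajectory(example_scores))  # [2, 4]
```

In the toy example, frames 0-1 are assigned to the first instruction, frames 2-3 to the second, and frame 4 to the third, so the change points are frames 2 and 4.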
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition
