Grounding Predicates through Actions

Toki Migimatsu; Jeannette Bohg

arXiv:2109.14718 · cs.RO · March 7, 2022

TL;DR

This paper introduces a weakly supervised method for automatically labeling symbolic states in videos using action pre- and post-conditions, enabling efficient training of predicate classifiers for robotic reasoning.

Contribution

The paper presents a novel automatic labeling approach that reduces supervision costs, and uses it to train predicate classifiers for symbolic reasoning in robotics.

Findings

1. Predicate classifiers match fully supervised performance
2. Automatic labeling significantly reduces annotation effort
3. Enables real-world task planning with learned predicates

Abstract

Symbols representing abstract states such as "dish in dishwasher" or "cup on table" allow robots to reason over long horizons by hiding details unnecessary for high-level planning. Current methods for learning to identify symbolic states in visual data require large amounts of labeled training data, but manually annotating such datasets is prohibitively expensive due to the combinatorial number of predicates in images. We propose a novel method for automatically labeling symbolic states in large-scale video activity datasets by exploiting known pre- and post-conditions of actions. This automatic labeling scheme only requires weak supervision in the form of an action label that describes which action is demonstrated in each video. We use our framework to train predicate classifiers to identify symbolic relationships between objects when prompted with object bounding boxes, and…
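The labeling idea described in the abstract can be illustrated with a minimal sketch. It assumes that each action comes with a schema listing its pre- and post-conditions, and that the objects involved in the action are known for each video. The `ActionSchema` class, the `put_on` action, and the `holding` and `on` predicates are hypothetical names chosen for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# A literal is (predicate name, parameter names, truth value).
Literal = Tuple[str, Tuple[str, ...], bool]
# A grounded label maps (predicate name, object names) to a truth value.
Labels = Dict[Tuple[str, Tuple[str, ...]], bool]


@dataclass(frozen=True)
class ActionSchema:
    """Hypothetical action definition with known pre- and post-conditions."""
    name: str
    params: Tuple[str, ...]
    pre: Tuple[Literal, ...]
    post: Tuple[Literal, ...]


def ground(literals: Tuple[Literal, ...], binding: Dict[str, str]) -> Labels:
    """Replace schema parameters with concrete object names."""
    return {
        (pred, tuple(binding[p] for p in params)): value
        for pred, params, value in literals
    }


def label_video(schema: ActionSchema, objects: Tuple[str, ...]) -> Tuple[Labels, Labels]:
    """Derive labels for the start and end of a video from its action label.

    Pre-conditions label the state before the action and post-conditions
    label the state after it. Predicates the schema does not mention are
    left unlabeled rather than assumed false.
    """
    binding = dict(zip(schema.params, objects))
    return ground(schema.pre, binding), ground(schema.post, binding)


# Illustrative schema (an assumption, not the paper's action set).
PUT_ON = ActionSchema(
    name="put_on",
    params=("a", "b"),
    pre=(("holding", ("a",), True), ("on", ("a", "b"), False)),
    post=(("holding", ("a",), False), ("on", ("a", "b"), True)),
)

start_labels, end_labels = label_video(PUT_ON, ("cup", "table"))
```

In this sketch, a single action label on a video of placing a cup on a table yields `on(cup, table)` as false at the start and true at the end, with no frame-level annotation. How the paper handles predicates that an action leaves unspecified is not stated in the abstract, so the sketch simply omits them.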

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Robot Manipulation and Learning