Learning Goals from Failure

Dave Epstein; Carl Vondrick

arXiv:2006.15657 · cs.CV · December 17, 2020

Open Access

TL;DR

This paper presents a framework that predicts the goals behind human action in video. Motivated by evidence from developmental psychology, it learns goal representations from video of unintentional action without direct supervision.

Contribution

It introduces an approach that learns goal representations from video of unintentional action. Although trained with minimal supervision, the model is competitive with or outperforms baselines trained on large supervised datasets.

Findings

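01 The trained model predicts the underlying goals in video of unintentional action.

02 The method can "automatically correct" unintentional action by using the model's gradient signals to adjust latent trajectories.

03 The model is competitive with or outperforms baselines trained on large supervised datasets of successfully executed goals.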

Abstract

We introduce a framework that predicts the goals behind observable human action in video. Motivated by evidence in developmental psychology, we leverage video of unintentional action to learn video representations of goals without direct supervision. Our approach models videos as contextual trajectories that represent both low-level motion and high-level action features. Experiments and visualizations show our trained model is able to predict the underlying goals in video of unintentional action. We also propose a method to "automatically correct" unintentional action by leveraging gradient signals of our model to adjust latent trajectories. Although the model is trained with minimal supervision, it is competitive with or outperforms baselines trained on large (supervised) datasets of successfully executed goals, showing that observing unintentional action is crucial to learning about…
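
The "automatic correction" idea in the abstract, adjusting latent trajectories with gradient signals from the model, can be illustrated with a toy sketch. This is not the authors' implementation: the linear-logistic scorer `failure_prob`, its weights `w` and `b`, the step size, and the three-step trajectory are all hypothetical stand-ins for a trained video model and its latent features. The sketch only shows the general technique of holding a scorer fixed and running gradient descent on the latents themselves.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def failure_prob(z, w, b):
    """Hypothetical scorer: probability that latent vector z depicts unintentional action."""
    return sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b)

def correct_trajectory(traj, w, b, step=0.5, iters=50):
    """Gradient descent on the latents (scorer held fixed) to lower the
    summed per-step failure probability along the trajectory."""
    traj = [list(z) for z in traj]  # copy so the caller's trajectory is untouched
    for _ in range(iters):
        for z in traj:
            p = failure_prob(z, w, b)
            # Analytic gradient: d/dz sigmoid(w.z + b) = p * (1 - p) * w
            for i, wi in enumerate(w):
                z[i] -= step * p * (1.0 - p) * wi
    return traj

# Assumed toy values, chosen only for illustration.
w, b = [1.0, -2.0], 0.0
traj = [[2.0, 0.0], [1.5, 0.5], [1.0, -0.5]]

fixed = correct_trajectory(traj, w, b)
before = sum(failure_prob(z, w, b) for z in traj) / len(traj)
after = sum(failure_prob(z, w, b) for z in fixed) / len(fixed)
print(f"mean failure probability: {before:.3f} -> {after:.3f}")
```

In a real system the scorer would presumably be a neural network and the gradient would come from automatic differentiation rather than a closed-form expression; the update rule on the latents keeps the same shape.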

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning