Learning Goals from Failure
Dave Epstein, Carl Vondrick

TL;DR
This paper presents a framework that learns to predict human goals from unintentional actions in videos, leveraging developmental psychology insights to improve goal understanding without extensive supervision.
Contribution
It introduces an approach for learning goal representations from video of unintentional action. Although trained with minimal supervision, the model is competitive with or outperforms baselines trained on large supervised datasets.
Findings
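The trained model is able to predict the underlying goals in video of unintentional action.
The method can "automatically correct" unintentional action by using the model's gradient signals to adjust latent trajectories.
The model is competitive with or outperforms supervised baselines trained on large datasets of successfully executed goals.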
Abstract
We introduce a framework that predicts the goals behind observable human action in video. Motivated by evidence in developmental psychology, we leverage video of unintentional action to learn video representations of goals without direct supervision. Our approach models videos as contextual trajectories that represent both low-level motion and high-level action features. Experiments and visualizations show our trained model is able to predict the underlying goals in video of unintentional action. We also propose a method to "automatically correct" unintentional action by leveraging gradient signals of our model to adjust latent trajectories. Although the model is trained with minimal supervision, it is competitive with or outperforms baselines trained on large (supervised) datasets of successfully executed goals, showing that observing unintentional action is crucial to learning about…
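The "automatic correction" step described in the abstract, which adjusts a latent trajectory using gradient signals from the model, can be illustrated with a small toy sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the function names `unintentional_prob` and `correct_trajectory`, the linear-plus-sigmoid scorer standing in for a trained model, and the parameters `w`, `b`, `lr`, and `steps` are all hypothetical.

```python
import math


def unintentional_prob(traj, w, b):
    """Toy stand-in for a trained model (hypothetical).

    Scores a latent trajectory (a list of per-step vectors) by applying a
    sigmoid to a linear function of the mean latent vector.
    """
    num_steps = len(traj)
    mean = [sum(step[d] for step in traj) / num_steps for d in range(len(w))]
    score = sum(wd * md for wd, md in zip(w, mean)) + b
    return 1.0 / (1.0 + math.exp(-score))


def correct_trajectory(traj, w, b, lr=0.5, steps=100):
    """Gradient descent on the latents to lower the 'unintentional' score.

    The model parameters (w, b) stay fixed; only a copy of the trajectory
    is updated, so the caller's input is left unchanged.
    """
    traj = [list(step) for step in traj]
    num_steps = len(traj)
    for _ in range(steps):
        p = unintentional_prob(traj, w, b)
        # For this toy scorer: d p / d z_t[d] = p * (1 - p) * w[d] / num_steps
        coeff = p * (1.0 - p) / num_steps
        for step in traj:
            for d, wd in enumerate(w):
                step[d] -= lr * coeff * wd
    return traj


# Usage: a three-step trajectory in a two-dimensional latent space.
original = [[1.0, 0.5], [0.8, 0.7], [1.2, 0.4]]
w, b = [2.0, -1.0], 0.0
corrected = correct_trajectory(original, w, b)
before = unintentional_prob(original, w, b)
after = unintentional_prob(corrected, w, b)
```

In this sketch the model weights are held fixed and only the latent trajectory moves, so `after` ends up lower than `before`. A real system would presumably backpropagate through a learned video model rather than use a closed-form gradient.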
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
