Zero-Shot Activity Recognition with Verb Attribute Induction

Rowan Zellers; Yejin Choi

arXiv:1707.09468 · cs.CL · September 5, 2017

TL;DR

This paper presents a method for zero-shot activity recognition by modeling and inferring action verb attributes from language, enabling the recognition of unseen activities based on their linguistic descriptions.

Contribution

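The study introduces an approach that learns to infer action attributes from dictionary definitions and distributed word representations, in contrast to much prior work, which assumes gold-standard attributes for zero-shot classes and focuses primarily on object attributes.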

Findings

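01

Action attributes inferred from language provide a predictive signal for zero-shot prediction of previously unseen activities.

02

Attributes for unseen verbs can be inferred from dictionary definitions and distributed word representations, without gold-standard attribute annotations for the zero-shot classes.

03

Verb attributes serve as the internal mapping between visual and textual representations when reasoning about an unseen action.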

Abstract

In this paper, we investigate large-scale zero-shot activity recognition by modeling the visual and linguistic attributes of action verbs. For example, the verb "salute" has several properties, such as being a light movement, a social act, and short in duration. We use these attributes as the internal mapping between visual and textual representations to reason about a previously unseen action. In contrast to much prior work that assumes access to gold standard attributes for zero-shot classes and focuses primarily on object attributes, our model uniquely learns to infer action attributes from dictionary definitions and distributed word representations. Experimental results confirm that action attributes inferred from language can provide a predictive signal for zero-shot prediction of previously unseen activities.
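
The idea of using attributes as the bridge between images and verbs can be illustrated with a minimal sketch. This is not the authors' model: the attribute names, the hand-written attribute vectors, the cosine-similarity matching rule, and the function names `cosine` and `predict_verb` are all assumptions made for illustration. In the paper the verb attributes would be inferred from language and the image-side scores predicted by a visual model; here both are placeholders.

```python
import math

# Hypothetical attribute names, loosely following the "salute" example
# in the abstract. The paper's actual attribute set may differ.
ATTRIBUTES = ["light_movement", "social_act", "short_duration", "uses_tool"]

# Placeholder attribute vectors for verbs that were never seen with images.
# In the paper these would be inferred from dictionary definitions and
# word embeddings; here they are written by hand.
UNSEEN_VERB_ATTRIBUTES = {
    "salute": [1.0, 1.0, 1.0, 0.0],
    "hug":    [1.0, 1.0, 0.0, 0.0],
    "dig":    [0.0, 0.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_verb(image_attribute_scores, verb_attributes):
    """Return the verb whose attribute vector best matches the image's attribute scores."""
    return max(
        verb_attributes,
        key=lambda verb: cosine(image_attribute_scores, verb_attributes[verb]),
    )


# Attribute scores that a visual model might output for one image (made-up values).
image_scores = [0.9, 0.8, 0.7, 0.1]
print(predict_verb(image_scores, UNSEEN_VERB_ATTRIBUTES))
```

Because the classifier only compares attribute vectors, adding a new verb requires an attribute vector for it but no labeled images, which is what makes the setup zero-shot.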