Weakly Supervised Learning of Heterogeneous Concepts in Videos

Sohil Shah; Kuldeep Kulkarni; Arijit Biswas; Ankit Gandhi; Om Deshmukh; and Larry Davis

arXiv:1607.03240·cs.CV·July 13, 2016

TL;DR

This paper introduces a generalization of the Indian Buffet Process (IBP) for weakly supervised learning in videos. The model classifies and localizes heterogeneous concepts under location constraints, and it outperforms existing methods on the Casablanca and A2D datasets.

Contribution

The paper extends the Indian Buffet Process (IBP) to handle heterogeneous concepts and location constraints in videos, yielding a unified probabilistic framework for weakly supervised learning.

Findings

01. 24% improvement in concept classification on the Casablanca dataset
02. 9% improvement in localization on the A2D dataset
03. Effective integration of heterogeneous concepts and location constraints

Abstract

Typical textual descriptions that accompany online videos are 'weak': i.e., they mention the main concepts in the video but not their corresponding spatio-temporal locations. The concepts in the description are typically heterogeneous (e.g., objects, persons, actions). Certain location constraints on these concepts can also be inferred from the description. The goal of this paper is to present a generalization of the Indian Buffet Process (IBP) that can (a) systematically incorporate heterogeneous concepts in an integrated framework, and (b) enforce location constraints, for efficient classification and localization of the concepts in the videos. Finally, we develop posterior inference for the proposed formulation using mean-field variational approximation. Comparative evaluations on the Casablanca and the A2D datasets show that the proposed approach significantly outperforms other…
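
For readers unfamiliar with the prior the abstract builds on, the following is a minimal sketch of sampling from the standard one-parameter IBP. It is not the paper's generalized model: it has no heterogeneous concept types, no location constraints, and no variational inference. The function name `sample_ibp` and its parameters are illustrative assumptions, as is the reading of rows as video units and columns as latent concepts.

```python
import numpy as np


def sample_ibp(num_items, alpha, rng):
    """Draw a binary item-by-feature matrix from the standard IBP.

    Illustrative only: rows can be read as video units and columns as
    latent concepts, but this is the textbook prior, not the paper's
    generalized formulation.
    """
    counts = []  # counts[k] = number of earlier items that own feature k
    rows = []
    for i in range(1, num_items + 1):
        # Item i takes existing feature k with probability counts[k] / i.
        row = [int(rng.random() < m / i) for m in counts]
        counts = [m + z for m, z in zip(counts, row)]
        # Item i then introduces Poisson(alpha / i) brand-new features.
        num_new = int(rng.poisson(alpha / i))
        row.extend([1] * num_new)
        counts.extend([1] * num_new)
        rows.append(row)

    # Pad earlier rows with zeros for features created by later items.
    Z = np.zeros((num_items, len(counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z


if __name__ == "__main__":
    Z = sample_ibp(num_items=10, alpha=2.0, rng=np.random.default_rng(0))
    print(Z.shape)
    print(Z)
```

The number of columns is not fixed in advance, which is the property that makes the IBP a natural prior when the set of concepts present in a video is unknown. Larger values of `alpha` tend to produce more features per item.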

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning