Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning

Zhekun Luo; Devin Guillory; Baifeng Shi; Wei Ke; Fang Wan; Trevor Darrell; Huijuan Xu

arXiv:2004.00163 · cs.CV · December 23, 2020 · 25 cites

TL;DR

This paper introduces an EM-based multiple instance learning framework for weakly-supervised action localization, explicitly modeling key instance assignment to improve accuracy and achieve state-of-the-art results.

Contribution

It proposes an EM-MIL approach that explicitly models key instance assignment, addressing limitations of attention-based methods under MIL assumptions.

Findings

01. Achieves state-of-the-art performance on THUMOS14.

02. Outperforms previous weakly-supervised methods.

03. More accurately models MIL assumptions.

Abstract

Weakly-supervised action localization requires training a model to localize action segments in a video given only the video-level action label. It can be solved under the Multiple Instance Learning (MIL) framework, where a bag (video) contains multiple instances (action segments). Since only the bag's label is known, the main challenge is determining which key instances within the bag trigger the bag's label. Most previous models use attention-based approaches, applying attention to generate the bag's representation from its instances and then training it via bag-level classification. These models, however, implicitly violate the MIL assumption that instances in negative bags should be uniformly negative. In this work, we explicitly model the key instance assignment as a hidden variable and adopt an Expectation-Maximization (EM) framework. We derive two pseudo-label generation schemes to…
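
The abstract's idea of treating key-instance assignment as a hidden variable and alternating EM steps can be illustrated with a toy sketch. This is not the authors' implementation, and it does not reproduce the paper's two pseudo-label generation schemes, which the truncated abstract does not describe. It assumes a hard-EM variant on hypothetical 1-D instance features with a logistic instance classifier: the E-step assigns pseudo-labels (every instance in a negative bag is negative; in a positive bag, instances scoring above 0.5, plus always the top-scoring one, are marked as key), and the M-step refits the classifier to those pseudo-labels. All names (`e_step`, `m_step`, `fit_em_mil`), the threshold, and the data are assumptions made for illustration.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def e_step(bags, w, b):
    """Assign hard pseudo-labels to instances under the current parameters."""
    pseudo = []
    for feats, bag_label in bags:
        if bag_label == 0:
            # MIL assumption: every instance in a negative bag is negative.
            pseudo.append([0] * len(feats))
            continue
        scores = [sigmoid(w * x + b) for x in feats]
        top = max(range(len(feats)), key=lambda i: scores[i])
        # A positive bag must contain at least one key instance, so the
        # top-scoring instance is always kept (illustrative rule, an assumption).
        pseudo.append([1 if (s > 0.5 or i == top) else 0
                       for i, s in enumerate(scores)])
    return pseudo


def m_step(bags, pseudo, w, b, lr=0.5, steps=200):
    """Fit the instance classifier to the pseudo-labels (gradient descent on BCE)."""
    xs = [x for feats, _ in bags for x in feats]
    zs = [z for labels in pseudo for z in labels]
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, z in zip(xs, zs):
            err = sigmoid(w * x + b) - z
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def fit_em_mil(bags, rounds=5):
    """Alternate E and M steps for a fixed number of rounds."""
    w, b = 1.0, 0.0  # arbitrary initialization (assumption)
    for _ in range(rounds):
        pseudo = e_step(bags, w, b)
        w, b = m_step(bags, pseudo, w, b)
    return w, b


# Hypothetical toy bags: (instance features, bag label).
bags = [
    ([0.1, 2.0, 2.2, -0.3], 1),
    ([-0.5, 0.0, 1.9], 1),
    ([-0.2, 0.3, -0.4], 0),
    ([0.2, -0.1, 0.1], 0),
]
w, b = fit_em_mil(bags)
final_pseudo = e_step(bags, w, b)
```

In this sketch `final_pseudo` holds the per-instance assignments after the last round. By construction, negative bags stay uniformly negative, which is the MIL assumption the abstract says attention-based models implicitly violate.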

Code & Models

Repositories

airmachine/EM-MIL-WeaklyActionDetection (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications