Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning
Zhekun Luo, Devin Guillory, Baifeng Shi, Wei Ke, Fang Wan, Trevor Darrell, Huijuan Xu

TL;DR
This paper introduces an EM-based multiple instance learning framework for weakly-supervised action localization, explicitly modeling key instance assignment to improve accuracy and achieve state-of-the-art results.
Contribution
It proposes an EM-MIL approach that explicitly models key instance assignment, addressing limitations of attention-based methods under MIL assumptions.
Findings
Achieves state-of-the-art performance on THUMOS14.
Outperforms previous weakly-supervised methods.
More accurately models MIL assumptions.
Abstract
Weakly-supervised action localization requires training a model to localize the action segments in a video given only a video-level action label. It can be solved under the Multiple Instance Learning (MIL) framework, where a bag (video) contains multiple instances (action segments). Since only the bag's label is known, the main challenge is determining which key instances within the bag trigger the bag's label. Most previous models use attention-based approaches, applying attention to generate the bag's representation from its instances and then training it via the bag's classification. These models, however, implicitly violate the MIL assumption that instances in negative bags should be uniformly negative. In this work, we explicitly model the key instance assignment as a hidden variable and adopt an Expectation-Maximization (EM) framework. We derive two pseudo-label generation schemes to…
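The EM view of MIL described in the abstract can be illustrated with a small toy sketch. This is not the paper's method: the abstract is truncated before its pseudo-label schemes are described, so the rule used here (threshold each positive bag at its mean score), the logistic-regression instance classifier, the function names `e_step`, `m_step`, `em_mil`, and the 1-D toy data are all assumptions made for illustration. The sketch only shows the general alternation: the E-step guesses which instances are key instances while forcing every instance in a negative bag to be negative, and the M-step refits the instance classifier on those guesses.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def e_step(scores, bag_ids, bag_labels):
    """Assumed pseudo-label rule (not from the paper): negative bags are all 0;
    in a positive bag, instances scoring at or above the bag mean become 1."""
    pseudo = np.zeros_like(scores)
    for b, label in enumerate(bag_labels):
        idx = np.where(bag_ids == b)[0]
        if label == 1:
            s = scores[idx]
            # max(s) >= mean(s), so every positive bag keeps at least one key instance
            pseudo[idx] = (s >= s.mean()).astype(float)
    return pseudo

def m_step(X, pseudo, w, b, lr=0.5, steps=200):
    """Fit a logistic-regression instance classifier to the pseudo-labels
    with plain gradient descent on the mean cross-entropy loss."""
    for _ in range(steps):
        grad = sigmoid(X @ w + b) - pseudo
        w = w - lr * (X.T @ grad) / len(X)
        b = b - lr * grad.mean()
    return w, b

def em_mil(X, bag_ids, bag_labels, rounds=5):
    w = np.zeros(X.shape[1])
    b = 0.0
    # Initialization: every instance inherits its bag's label
    pseudo = bag_labels[bag_ids].astype(float)
    for _ in range(rounds):
        w, b = m_step(X, pseudo, w, b)                             # M-step
        pseudo = e_step(sigmoid(X @ w + b), bag_ids, bag_labels)   # E-step
    return w, b, pseudo

# Hypothetical toy data: 4 bags of 5 one-dimensional instances each.
# Bags 0 and 1 are positive (two high-valued "action" instances each);
# bags 2 and 3 are negative (background only).
X = np.array([
    3.0, 2.5, -1.0, -1.2, -0.8,
    -1.1, 2.8, -0.9, -1.0, 3.2,
    -1.0, -0.7, -1.3, -1.1, -0.9,
    -1.2, -0.8, -1.0, -0.9, -1.1,
]).reshape(-1, 1)
bag_ids = np.repeat(np.arange(4), 5)
bag_labels = np.array([1, 1, 0, 0])

w, b, pseudo = em_mil(X, bag_ids, bag_labels)
print(pseudo.reshape(4, 5))
```

Zeroing out the negative bags in the E-step is what enforces the MIL assumption directly, rather than hoping an attention weight learns it. The mean threshold is only one convenient choice for a toy example; any rule that selects at least one instance per positive bag would fit the same loop.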
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications
