Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images

Chen Sun; Sanketh Shetty; Rahul Sukthankar; Ram Nevatia

arXiv:1504.00983·cs.CV·August 5, 2015

TL;DR

This paper localizes fine-grained actions in temporally untrimmed web videos using only weak video-level labels and noisily labeled web images. Cross-domain transfer between video frames and web images yields localized action frames, which are then used to train action recognition models.

Contribution

A cross-domain transfer method that takes weak video labels and noisy web image labels as input and produces localized action frames as output, using pre-trained deep convolutional neural networks.

Findings

01. Effective localization of actions using web images.

02. High accuracy on FGA-240 and THUMOS 2014 datasets.

03. Robust training with noisy, weakly labeled data.

Abstract

We address the problem of fine-grained action localization from temporally untrimmed web videos. We assume that only weak video-level annotations are available for training. The goal is to use these weak labels to identify temporal segments corresponding to the actions, and learn models that generalize to unconstrained web videos. We find that web images queried by action names serve as well-localized highlights for many actions, but are noisily labeled. To solve this problem, we propose a simple yet effective method that takes weak video labels and noisy image labels as input, and generates localized action frames as output. This is achieved by cross-domain transfer between video frames and web images, using pre-trained deep convolutional neural networks. We then use the localized action frames to train action recognition models with long short-term memory networks. We collect a…
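
The cross-domain transfer described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes features have already been extracted by a pre-trained CNN, and it swaps in a nearest-centroid classifier where the paper uses deep networks. The names `class_centroids`, `label_confidence`, and `mutual_filter`, the threshold, and the toy data are all hypothetical. The idea shown is that frames (weak labels) are used to discard mislabeled web images, and the surviving images are then used to discard video frames that do not look like the labeled action.

```python
import numpy as np

def class_centroids(x, y, keep, num_classes):
    """Mean feature per class over the kept samples (assumed stand-in for a trained model)."""
    centroids = []
    for c in range(num_classes):
        sel = keep & (y == c)
        if not sel.any():          # fall back to every sample of the class
            sel = y == c
        centroids.append(x[sel].mean(axis=0))
    return np.stack(centroids)

def label_confidence(centroids, x, y):
    """Softmax over negative squared distances; returns the probability of each sample's own label."""
    sq_dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -sq_dist
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs[np.arange(len(y)), y]

def mutual_filter(frame_x, frame_y, image_x, image_y, num_classes,
                  threshold=0.8, rounds=2):
    """Alternately filter web images with a frame model and frames with an image model."""
    frame_keep = np.ones(len(frame_y), dtype=bool)
    image_keep = np.ones(len(image_y), dtype=bool)
    for _ in range(rounds):
        # Video frames -> web images: drop images whose noisy label is not supported.
        c = class_centroids(frame_x, frame_y, frame_keep, num_classes)
        image_keep = label_confidence(c, image_x, image_y) >= threshold
        # Web images -> video frames: keep frames that resemble the filtered images.
        c = class_centroids(image_x, image_y, image_keep, num_classes)
        frame_keep = label_confidence(c, frame_x, frame_y) >= threshold
    return frame_keep, image_keep

# Toy 2-D "features": two action frames and two background frames per class.
frame_x = np.array([[5, 0], [5, 0.2], [0.01, 10], [-0.01, 10],
                    [-5, 0], [-5, 0.2], [0.01, 10], [-0.01, 10]], dtype=float)
frame_y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
# Two clean images and one mislabeled image per class.
image_x = np.array([[5, 0.1], [4.8, 0], [-5, 0.1],
                    [-5, 0.1], [-4.8, 0], [5, 0.1]], dtype=float)
image_y = np.array([0, 0, 0, 1, 1, 1])

frame_keep, image_keep = mutual_filter(frame_x, frame_y, image_x, image_y, num_classes=2)
print(frame_keep, image_keep)
```

In this toy setup the background frames look alike across classes, so the image-trained model is uncertain about them and they fall below the threshold, while the mislabeled images are rejected by the frame-trained model. The frames that survive would play the role of the localized action frames used for downstream training.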

Code & Models

Repositories

zhengshou/AutoLoc