Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models

Jan Warchocki; Teodor Oprescu; Yunhan Wang; Alexandru Damacus; Paul Misterka; Robert-Jan Bruintjes; Attila Lengyel; Ombretta Strafforello; Jan van Gemert

arXiv:2308.13082 · cs.CV · August 28, 2023

TL;DR

This paper evaluates the data and computational efficiency of various temporal action localization models, highlighting TemporalMaxer’s superior performance in data-limited scenarios and TriDet’s efficiency during training.

Contribution

The paper systematically benchmarks deep temporal action localization models under data and computational constraints, reports how efficiently each model uses both, and recommends models for resource-limited settings.

Findings

01. TemporalMaxer outperforms the other evaluated models when training data is limited.
02. TemporalMaxer requires the least computational resources during inference.
03. TriDet is recommended when training time is limited.

Abstract

In temporal action localization, given an input video, the goal is to predict which actions it contains, where they begin, and where they end. Training and testing current state-of-the-art deep learning models requires access to large amounts of data and computational power. However, gathering such data is challenging and computational resources might be limited. This work explores and measures how current deep temporal action localization models perform in settings constrained by the amount of data or computational power. We measure data efficiency by training each model on a subset of the training set. We find that TemporalMaxer outperforms other models in data-limited settings. Furthermore, we recommend TriDet when training time is limited. To test the efficiency of the models during inference, we pass videos of different lengths through each model. We find that TemporalMaxer…
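The two measurements described in the abstract, training on a subset of the training set and passing inputs of different lengths through a model, can be sketched as follows. This is a minimal illustration and not the authors' benchmarking code: the function names `sample_training_subset` and `time_inference`, the uniform random sampling, the dummy feature layout, and the stand-in model are all assumptions introduced here. The paper may sample subsets differently (for example per action class) and may measure other quantities besides wall-clock time.

```python
import random
import time
from typing import Callable, Dict, List, Sequence


def sample_training_subset(
    video_ids: Sequence[str], fraction: float, seed: int = 0
) -> List[str]:
    """Return a reproducible random subset holding `fraction` of the videos.

    Assumption: uniform sampling over videos; the paper's exact scheme
    is not stated in the abstract.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    k = max(1, round(len(video_ids) * fraction))
    return sorted(rng.sample(list(video_ids), k))


def time_inference(
    model_fn: Callable[[List[List[float]]], object],
    lengths: Sequence[int],
    feature_dim: int = 8,
    repeats: int = 3,
) -> Dict[int, float]:
    """Time `model_fn` on dummy inputs of each length (best of `repeats`).

    `model_fn` is a hypothetical stand-in for a model's forward pass.
    Each input is a list of `length` feature vectors of size `feature_dim`.
    """
    timings: Dict[int, float] = {}
    for length in lengths:
        features = [[0.0] * feature_dim for _ in range(length)]
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            model_fn(features)
            best = min(best, time.perf_counter() - start)
        timings[length] = best
    return timings


if __name__ == "__main__":
    ids = [f"video_{i:03d}" for i in range(100)]
    for frac in (0.1, 0.5, 1.0):
        print(frac, len(sample_training_subset(ids, frac)))

    # Stand-in "model": sums every feature value, so cost grows with length.
    def dummy_model(feats: List[List[float]]) -> float:
        return sum(sum(row) for row in feats)

    print(time_inference(dummy_model, lengths=[100, 1000, 10000]))
```

Fixing the seed keeps each subset reproducible across models, so differences in accuracy can be attributed to the model and not to the sample. Taking the best of several timed runs reduces noise from other processes on the machine.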

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)