Accuracy and Performance Comparison of Video Action Recognition Approaches

Matthew Hutchinson; Siddharth Samsi; William Arcand; David Bestor; Bill Bergeron; Chansup Byun; Micheal Houle; Matthew Hubbell; Micheal Jones; Jeremy Kepner; Andrew Kirby; Peter Michaleas; Lauren Milechin; Julie Mullen; Andrew Prout; Antonio Rosa; Albert Reuther; Charles Yee; Vijay Gadepally

arXiv:2008.09037 · cs.CV · January 5, 2021

TL;DR

This paper provides a fair comparison of fourteen video action recognition models, evaluating their accuracy and computational performance under consistent training conditions and introducing a new accuracy metric.

Contribution

It offers a comprehensive, standardized comparison of state-of-the-art video action recognition models, including a new accuracy metric and performance analysis on HPC systems.

Findings

01. Models show varying accuracy levels under consistent training conditions.

02. Distributed training scales efficiently from 2 to 64 GPUs.

03. New accuracy metric provides additional insight into model performance.
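
One common way to quantify how well distributed training scales is to compare the measured speedup against the ideal linear speedup, taking the smallest GPU count as the baseline. The sketch below is only an illustration of that calculation: the function name `scaling_efficiency` and every timing value are hypothetical placeholders, not measurements from the paper.

```python
def scaling_efficiency(base_gpus, base_time, gpus, time):
    """Return measured speedup divided by ideal linear speedup.

    A value of 1.0 means perfect linear scaling relative to the baseline;
    values below 1.0 indicate overhead (communication, I/O, and so on).
    """
    if min(base_gpus, gpus) <= 0 or min(base_time, time) <= 0:
        raise ValueError("GPU counts and times must be positive")
    speedup = base_time / time      # how much faster than the baseline run
    ideal = gpus / base_gpus        # speedup expected under linear scaling
    return speedup / ideal


# HYPOTHETICAL per-epoch times in seconds, keyed by GPU count.
# These are made-up numbers for illustration, not results from the paper.
epoch_seconds = {2: 640.0, 4: 320.0, 8: 170.0, 64: 25.0}

base_gpus = min(epoch_seconds)
for gpus in sorted(epoch_seconds):
    eff = scaling_efficiency(
        base_gpus, epoch_seconds[base_gpus], gpus, epoch_seconds[gpus]
    )
    print(f"{gpus:>2} GPUs: efficiency {eff:.2f}")
```

Using the two-GPU run as the baseline mirrors the range quoted above; a single-GPU baseline would work the same way if such a run were available.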

Abstract

Over the past few years, there has been significant interest in video action recognition systems and models. However, direct comparison of accuracy and computational performance results remains clouded by differing training environments, hardware specifications, hyperparameters, pipelines, and inference methods. This article provides a direct comparison between fourteen off-the-shelf and state-of-the-art models by ensuring consistency in these training characteristics in order to provide readers with a meaningful comparison across different types of video action recognition algorithms. Accuracy of the models is evaluated using standard Top-1 and Top-5 accuracy metrics in addition to a proposed new accuracy metric. Additionally, we compare computational performance of distributed training from two to sixty-four GPUs on a state-of-the-art HPC system.
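
For readers unfamiliar with the standard metrics named in the abstract, Top-k accuracy is the fraction of samples whose true label appears among the k highest-scoring predicted classes (k=1 for Top-1, k=5 for Top-5). The following is a minimal sketch of that definition in plain Python; the function name `top_k_accuracy` and the toy scores are assumptions made for illustration, and the sketch does not cover the paper's proposed new metric, which is not defined on this page.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k top-scoring classes.

    scores: list of per-sample lists of class scores (higher is better)
    labels: list of integer class indices, one per sample
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equal length")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy, made-up scores for four clips over three classes (illustration only).
scores = [
    [0.1, 0.7, 0.2],  # true class 1: ranked first
    [0.5, 0.3, 0.2],  # true class 1: ranked second
    [0.2, 0.3, 0.5],  # true class 0: ranked third
    [0.6, 0.1, 0.3],  # true class 0: ranked first
]
labels = [1, 1, 0, 0]

print(top_k_accuracy(scores, labels, k=1))  # 0.5
print(top_k_accuracy(scores, labels, k=2))  # 0.75
```

With a realistic action recognition dataset the number of classes is in the hundreds, so k=5 is a meaningful relaxation; with only three classes, as in this toy example, k=2 plays the same role.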
