SMART-Vision: Survey of Modern Action Recognition Techniques in Vision

Ali K. AlShami; Ryan Rabinowitz; Khang Lam; Yousra Shleibik; Melkamu Mersha; Terrance Boult; and Jugal Kalita

arXiv:2501.13066 · cs.CV · January 23, 2025

TL;DR

This survey introduces the SMART-Vision taxonomy for human action recognition, emphasizing hybrid deep learning approaches, datasets, and emerging open-set challenges in the field.

Contribution

It presents a novel taxonomy that categorizes hybrid HAR methods and discusses how different architectures and modalities are integrated in modern systems.

Findings

01. Highlights the importance of hybrid approaches in HAR.
02. Provides a comprehensive overview of datasets used in HAR.
03. Discusses emerging open-HAR challenges and future directions.

Abstract

Human Action Recognition (HAR) is a challenging domain in computer vision, involving recognizing complex patterns by analyzing the spatiotemporal dynamics of individuals' movements in videos. These patterns arise in sequential data, such as video frames, which are often essential to accurately distinguish actions that would be ambiguous in a single image. HAR has garnered considerable interest due to its broad applicability, ranging from robotics and surveillance systems to sports motion analysis, healthcare, and the burgeoning field of autonomous vehicles. While several taxonomies have been proposed to categorize HAR approaches in surveys, they often overlook hybrid methodologies and fail to demonstrate how different models incorporate various architectures and modalities. In this comprehensive survey, we present the novel SMART-Vision taxonomy, which illustrates how innovations in…
