SMART-Vision: Survey of Modern Action Recognition Techniques in Vision
Ali K. AlShami, Ryan Rabinowitz, Khang Lam, Yousra Shleibik, Melkamu Mersha, Terrance Boult, and Jugal Kalita

TL;DR
This survey introduces the SMART-Vision taxonomy for human action recognition, emphasizing hybrid deep learning approaches, datasets, and emerging open-set challenges in the field.
Contribution
It presents a novel taxonomy that categorizes hybrid HAR methods and discusses how different architectures and modalities are integrated in modern systems.
Findings
Highlights the importance of hybrid approaches in HAR
Provides a comprehensive overview of datasets used in HAR
Discusses emerging open-HAR challenges and future directions
Abstract
Human Action Recognition (HAR) is a challenging domain in computer vision, involving recognizing complex patterns by analyzing the spatiotemporal dynamics of individuals' movements in videos. These patterns arise in sequential data, such as video frames, which are often essential to accurately distinguish actions that would be ambiguous in a single image. HAR has garnered considerable interest due to its broad applicability, ranging from robotics and surveillance systems to sports motion analysis, healthcare, and the burgeoning field of autonomous vehicles. While several taxonomies have been proposed to categorize HAR approaches in surveys, they often overlook hybrid methodologies and fail to demonstrate how different models incorporate various architectures and modalities. In this comprehensive survey, we present the novel SMART-Vision taxonomy, which illustrates how innovations in…
