Visual intelligence for efficient human action recognition in human computers interaction applications
Noorah Alghasham, Waleed Albattah

TL;DR
This paper introduces a deep learning model combining CNNs and RNNs for efficient and accurate human action recognition in human-computer interaction.
Contribution
A novel HAR model using EfficientNetB7 and LSTM for high accuracy and low computational cost without data augmentation.
Findings
The model achieved 97.8% accuracy on the UCF101 dataset.
It outperformed existing models on the HMDB51 dataset with 80.1% accuracy.
The model reduces computational complexity and avoids the need for data augmentation.
Abstract
Human Action Recognition (HAR) is a pivotal area in computer vision, video surveillance, and human-computer interaction (HCI), driven by the need for efficient and accurate models to enhance HCI experiences. Traditional HAR methods often rely on hand-crafted features and shallow learning techniques, which limits their ability to capture complex patterns. In contrast, this study proposes an efficient HAR model that leverages deep neural networks, specifically a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), to enhance HCI through AI-powered action understanding. The model employs a pre-trained EfficientNetB7 network to extract rich spatial features from video frames, followed by a Long Short-Term Memory (LSTM) network to capture long-range temporal dependencies. This architecture enhances recognition accuracy while reducing computational…
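The data flow described in the abstract (per-frame spatial features, then an LSTM over the frame sequence, then a class prediction) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The per-frame feature vectors here are random stand-ins for EfficientNetB7 outputs, the weights are untrained, the final softmax classifier is an assumption (the abstract does not describe the output layer), and the names `LSTMCell`, `classify_clip`, and all dimensions are hypothetical toy choices made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class LSTMCell:
    """Standard LSTM cell with input (i), forget (f), output (o) gates
    and a candidate cell state (g)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate, applied to the concatenation [x; h].
        self.W = {k: init_matrix(hidden_dim, input_dim + hidden_dim, rng)
                  for k in "ifog"}
        self.b = {k: [0.0] * hidden_dim for k in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])]
               for k in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        # New cell state: keep part of the old state, add gated candidate.
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def classify_clip(frame_features, cell, W_out):
    # Run the LSTM over the frames in order, then classify from the
    # final hidden state.
    h = [0.0] * cell.hidden_dim
    c = [0.0] * cell.hidden_dim
    for x in frame_features:
        h, c = cell.step(x, h, c)
    return softmax(matvec(W_out, h))


# Toy sizes (assumptions): 5 frames, 8-dim features, 4 hidden units, 3 classes.
FEATURE_DIM, HIDDEN_DIM, NUM_CLASSES, NUM_FRAMES = 8, 4, 3, 5
rng = random.Random(0)
# Stand-in for per-frame features from a pre-trained CNN.
clip = [[rng.random() for _ in range(FEATURE_DIM)] for _ in range(NUM_FRAMES)]
cell = LSTMCell(FEATURE_DIM, HIDDEN_DIM, rng)
W_out = init_matrix(NUM_CLASSES, HIDDEN_DIM, rng)
probs = classify_clip(clip, cell, W_out)
print(len(probs), round(sum(probs), 6))
```

In a real pipeline the stand-in features would be replaced by the pooled output of the pre-trained CNN for each sampled frame, and the number of classes would match the dataset (101 for UCF101, 51 for HMDB51).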
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Multimodal Machine Learning Applications
