A Novel Deep Hybrid Framework with Ensemble-Based Feature Optimization for Robust Real-Time Human Activity Recognition
Wasi Ullah, Yasir Noman Khalid, Saddam Hussain Khan

TL;DR
This paper introduces a deep hybrid framework that combines a customized CNN, an attention-augmented LSTM, and a novel feature selection method to improve the accuracy and efficiency of real-time human activity recognition in challenging environments.
Contribution
It proposes a novel adaptive feature selection mechanism integrated with a hybrid deep learning model for robust, real-time HAR with reduced computational costs.
Findings
Achieved up to 99.65% accuracy with only seven features.
Enhanced robustness and efficiency in complex, real-world scenarios.
Improved inference time on the UCF-YouTube dataset.
Abstract
Real-time Human Activity Recognition (HAR) has wide-ranging applications in areas such as context-aware environments, public safety, assistive technologies, and autonomous monitoring and surveillance systems. However, existing real-time HAR systems face significant challenges, including limited scalability and high computational costs arising from redundant features. To address these issues, the Inception-V3 model was customized with region-based and boundary-aware operations, using average pooling and max pooling, respectively, to enhance region homogeneity, suppress noise, and capture discriminative local features, while improving robustness through down-sampling. Furthermore, to effectively encode motion dynamics, an Attention-Augmented Long Short-Term Memory (AA-LSTM) network was employed to learn temporal dependencies across video frames. Features are extracted from video dataset…
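To make the distinction between the two pooling operations named in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' customized Inception-V3 block: the helper `pool2d`, the 4x4 feature map, and the 2x2 window are all hypothetical and chosen only for illustration. It shows that average pooling smooths a region (an isolated spike is diluted), while max pooling keeps the strongest local response.

```python
import numpy as np

def pool2d(x, k, mode):
    """Non-overlapping k x k pooling over a 2-D feature map.

    Hypothetical helper for illustration; assumes both dimensions
    of x are divisible by k.
    """
    h, w = x.shape
    # Split the map into (h//k) x (w//k) blocks of size k x k.
    blocks = x.reshape(h // k, k, w // k, k)
    if mode == "avg":
        return blocks.mean(axis=(1, 3))  # region-style smoothing
    if mode == "max":
        return blocks.max(axis=(1, 3))   # strongest local response
    raise ValueError(f"unknown mode: {mode}")

# Toy feature map: the top-left block contains one isolated spike (9.0).
fmap = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 9.0, 0.0, 0.0],
    [2.0, 2.0, 4.0, 4.0],
    [2.0, 2.0, 4.0, 4.0],
])

avg = pool2d(fmap, 2, "avg")  # spike is averaged down to 3.0
mx = pool2d(fmap, 2, "max")   # spike survives as 9.0
print(avg)
print(mx)
```

Both calls also halve each spatial dimension, which is the down-sampling effect the abstract mentions.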
