A Multi-Modal CNN-LSTM Framework with Multi-Head Attention and Focal Loss for Real-Time Elderly Fall Detection

Lijie Zhou; Luran Wang

arXiv:2603.22313·cs.LG·March 25, 2026

TL;DR

This paper introduces a multi-modal deep learning framework that combines a CNN, an LSTM, multi-head attention, and focal loss for real-time elderly fall detection on wearable sensor data, reporting a 98.7% F1-score on the SisFall dataset and sub-50 ms inference latency on edge devices.

Contribution

MultiModalFallDetector integrates a multi-scale CNN, multi-head attention, and transfer learning to improve fall detection accuracy and efficiency over existing methods.

Findings

1. Achieved a 98.7% F1-score on the SisFall dataset

2. Maintains sub-50 ms inference latency on edge devices

3. Outperforms traditional machine learning and standard deep learning approaches

Abstract

The increasing global aging population has intensified the demand for reliable health monitoring systems, particularly those capable of detecting critical events such as falls among elderly individuals. Traditional fall detection approaches relying on single-modality acceleration data suffer from high false alarm rates, while conventional machine learning methods require extensive hand-crafted feature engineering. This paper proposes a novel multi-modal deep learning framework, MultiModalFallDetector, designed for real-time elderly fall detection using wearable sensors. Our approach integrates multiple innovations: a multi-scale CNN-based feature extractor capturing motion dynamics at varying temporal resolutions; fusion of tri-axial accelerometer, gyroscope, and four-channel physiological signals; incorporation of a multi-head self-attention mechanism for dynamic temporal weighting;…
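
The abstract is truncated before it reaches the focal loss named in the title, so the paper's exact formulation and hyperparameters are not visible here. As a minimal sketch, assuming the standard binary focal loss FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), the function below shows how confidently classified samples are down-weighted, which is the usual motivation when one class (here, presumably falls) is rare. The function name `binary_focal_loss` and the defaults `alpha=0.25` and `gamma=2.0` are illustrative assumptions, not values taken from the paper.

```python
import math


def binary_focal_loss(probs, labels, alpha=0.25, gamma=2.0, eps=1e-7):
    """Mean binary focal loss over a batch (illustrative sketch).

    probs  -- predicted probabilities of the positive class (assumed: "fall")
    labels -- ground-truth labels, 1 for positive and 0 for negative
    alpha  -- weight on the positive class; 1 - alpha weights the negative class
    gamma  -- focusing parameter; gamma = 0 removes the down-weighting
    All defaults are assumptions, not the paper's settings.
    """
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equal in length")

    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0) on saturated predictions.
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability the model assigns to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t) ** gamma shrinks the loss of well-classified samples.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    # A confident correct prediction contributes far less than an uncertain one.
    print(binary_focal_loss([0.9], [1]))
    print(binary_focal_loss([0.6], [1]))
```

With `gamma=0` and `alpha=0.5` the function reduces to half of the ordinary binary cross-entropy, which is a convenient sanity check when tuning the two parameters.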

Taxonomy

Topics: Context-Aware Activity Recognition Systems · Balance, Gait, and Falls Prevention · Human Pose and Action Recognition