Multimodal Transformer for Nursing Activity Recognition
Momal Ijaz, Renato Diaz, Chen Chen

TL;DR
This paper introduces a multimodal transformer network that combines skeletal and acceleration data for nurse activity recognition, achieving state-of-the-art accuracy of 81.8% on the NCRC dataset.
Contribution
The paper proposes a novel multimodal transformer architecture that fuses skeletal and acceleration data for improved nurse activity recognition performance.
Findings
Achieved 81.8% accuracy on the NCRC dataset.
Outperformed state-of-the-art methods by 1.6%.
Demonstrated the effectiveness of multimodal fusion over single modalities.
Abstract
In an aging population, elderly patient safety is a primary concern at hospitals and nursing homes, which demands increased nurse care. Nurse activity recognition can help ensure that all patients receive the same desired level of care, and it can also free nurses from manually documenting the activities they perform, leading to a fair and safe place of care for the elderly. In this work, we present a multimodal transformer-based network, which extracts features from skeletal joints and acceleration data, and fuses them to perform nurse activity recognition. Our method achieves state-of-the-art performance of 81.8% accuracy on the benchmark dataset available for nurse activity recognition from the Nurse Care Activity Recognition Challenge. We perform ablation studies to show that our fusion model is better than single modality transformer variants (using only acceleration…
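The pipeline the abstract describes (one feature extractor per modality, followed by fusion and classification) can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the single-head attention encoder, the concatenation-based fusion, the random untrained weights, and every dimension below (sequence lengths, input sizes, D_MODEL, NUM_CLASSES) are assumptions chosen only to show the data flow.

```python
import numpy as np

# All sizes are hypothetical placeholders, not values from the paper.
D_MODEL = 16        # shared embedding width (assumption)
NUM_CLASSES = 6     # placeholder class count (assumption)


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a (T, d) sequence."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def init_encoder(rng, in_dim, d_model):
    """Random, untrained weights for one modality encoder (illustration only)."""
    return {
        "w_in": rng.normal(0.0, 0.1, (in_dim, d_model)),
        "w_q": rng.normal(0.0, 0.1, (d_model, d_model)),
        "w_k": rng.normal(0.0, 0.1, (d_model, d_model)),
        "w_v": rng.normal(0.0, 0.1, (d_model, d_model)),
    }


def encode(x, params):
    """Embed each time step, attend over time, then mean-pool to one vector."""
    h = x @ params["w_in"]
    h = h + self_attention(h, params["w_q"], params["w_k"], params["w_v"])
    return h.mean(axis=0)


def fuse_and_classify(skel, acc, skel_params, acc_params, w_out):
    """Concatenate the two modality embeddings and apply a softmax classifier."""
    fused = np.concatenate([encode(skel, skel_params), encode(acc, acc_params)])
    logits = fused @ w_out
    logits = logits - logits.max()
    probs = np.exp(logits)
    return probs / probs.sum()


rng = np.random.default_rng(0)
skel = rng.normal(size=(30, 50))   # 30 frames of flattened joint coordinates (assumed shape)
acc = rng.normal(size=(60, 3))     # 60 samples of 3-axis acceleration (assumed shape)
skel_params = init_encoder(rng, 50, D_MODEL)
acc_params = init_encoder(rng, 3, D_MODEL)
w_out = rng.normal(0.0, 0.1, (2 * D_MODEL, NUM_CLASSES))

probs = fuse_and_classify(skel, acc, skel_params, acc_params, w_out)
print(probs.shape)
```

The two modalities may have different sequence lengths and sampling rates; pooling each encoder's output to a fixed-size vector before concatenation is one simple way to make them compatible, and other fusion schemes are equally plausible.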
Taxonomy
Topics: Context-Aware Activity Recognition Systems · Health and Well-being Studies
Methods: Gated Recurrent Unit
