Classification of Infant Sleep/Wake States: Cross-Attention among Large Scale Pretrained Transformer Networks using Audio, ECG, and IMU Data
Kai Chieh Chang, Mark Hasegawa-Johnson, Nancy L. McElwain, Bashima Islam

TL;DR
This study introduces a multi-modal transformer-based neural network that combines audio, ECG, and IMU data from wearable devices to accurately classify infant sleep and wake states, surpassing single-modality methods.
Contribution
The paper presents a novel multi-modal transformer architecture with cross-attention for infant sleep/wake classification using audio, ECG, and IMU data from wearable devices.
Findings
Combining all modalities raises accuracy to 0.880, up from 0.732 with a single modality.
Pretraining individual branches enhances model performance.
Cross-attention fusion effectively integrates multi-modal information.
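As a rough illustration of what cross-attention fusion means here, the sketch below lets each modality's frames (the queries) attend to the frames of the other two modalities (the keys and values), then pools and concatenates the results. It is a minimal sketch under several assumptions, not the authors' architecture: the function names (`cross_attention`, `fuse`, `mean_pool`), the residual-plus-mean-pool layout, and the shared feature dimension are all hypothetical, and the learned query/key/value projections and multi-head structure of a real transformer layer are omitted for brevity.

```python
import math

def softmax(row):
    # Numerically stable softmax over one list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    # Scaled dot-product attention: queries come from one modality,
    # keys/values from another. No learned projections (assumption).
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def mean_pool(frames):
    # Average a sequence of feature vectors over time.
    n = len(frames)
    return [sum(f[j] for f in frames) / n for j in range(len(frames[0]))]

def fuse(audio, ecg, imu):
    # Hypothetical fusion layout: each stream attends to the frames of the
    # other two streams, adds the result back (residual), is mean-pooled
    # over time, and the three pooled vectors are concatenated.
    streams = [audio, ecg, imu]
    fused = []
    for i, own in enumerate(streams):
        others = [f for j, s in enumerate(streams) if j != i for f in s]
        attended = cross_attention(own, others, others)
        residual = [[a + b for a, b in zip(x, y)] for x, y in zip(own, attended)]
        fused.extend(mean_pool(residual))
    return fused

# Toy inputs: sequences of different lengths sharing feature dimension 2.
audio = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
ecg = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
imu = [[0.5, 0.5], [0.2, 0.9]]
features = fuse(audio, ecg, imu)  # one vector of length 3 * 2
```

In a full model, the fused vector would feed a small classification head that outputs the sleep/wake label, and the per-modality inputs would be the outputs of the pretrained transformer branches rather than raw toy values.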
Abstract
Infant sleep is critical to brain and behavioral development. Prior studies on infant sleep/wake classification have been largely limited to reliance on expensive and burdensome polysomnography (PSG) tests in the laboratory or wearable devices that collect single-modality data. To facilitate data collection and accuracy of detection, we aimed to advance this field of study by using a multi-modal wearable device, LittleBeats (LB), to collect audio, electrocardiogram (ECG), and inertial measurement unit (IMU) data among a cohort of 28 infants. We employed a 3-branch (audio/ECG/IMU) large scale transformer-based neural network (NN) to demonstrate the potential of such multi-modal data. We pretrained each branch independently with its respective modality, then finetuned the model by fusing the pretrained transformer layers with cross-attention. We show that multi-modal data significantly…
Taxonomy
Topics: Obstructive Sleep Apnea Research · Infant Health and Development · Speech and Audio Processing
