EgoEMS: A High-Fidelity Multimodal Egocentric Dataset for Cognitive Assistance in Emergency Medical Services
Keshara Weerasinghe, Xueren Ge, Tessa Heick, Lahiru Nuwan Wijayasingha, Anthony Cortez, Abhishek Satpathy, John Stankovic, Homa Alemzadeh

TL;DR
EgoEMS is a comprehensive, high-fidelity multimodal dataset capturing realistic emergency medical scenarios from an egocentric perspective, designed to facilitate AI development for cognitive assistance in EMS.
Contribution
This paper introduces EgoEMS, the first end-to-end, high-fidelity, multimodal, multiperson dataset of EMS activities, together with annotations, benchmarks, and a low-cost, open-source data collection system.
Findings
EgoEMS includes over 20 hours of EMS activities across 233 simulated emergency scenarios, performed by 62 participants, 46 of them EMS professionals.
Benchmarks for keystep recognition and action quality estimation are provided.
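The scenarios were developed with EMS experts and aligned with national standards, and the annotations include keysteps and timestamped audio transcripts with speaker diarization.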
Abstract
Emergency Medical Services (EMS) are critical to patient survival in emergencies, but first responders often face intense cognitive demands in high-stakes situations. AI cognitive assistants, acting as virtual partners, have the potential to ease this burden by supporting real-time data collection and decision making. In pursuit of this vision, we introduce EgoEMS, the first end-to-end, high-fidelity, multimodal, multiperson dataset capturing over 20 hours of realistic, procedural EMS activities from an egocentric view in 233 simulated emergency scenarios performed by 62 participants, including 46 EMS professionals. Developed in collaboration with EMS experts and aligned with national standards, EgoEMS is captured using an open-source, low-cost, and replicable data collection system and is annotated with keysteps, timestamped audio transcripts with speaker diarization, action quality…
Taxonomy
Topics: Social Robot Interaction and HRI · Multimodal Machine Learning Applications · Speech and dialogue systems
