Respiratory Status Detection with Video Transformers
Thomas Savage, Evan Madill

TL;DR
This paper explores using advanced video transformer models to automatically detect respiratory distress from videos, potentially aiding early clinical intervention.
Contribution
It demonstrates that a ViViT-based model with Lie Relative Encodings and Motion Guided Masking effectively recognizes respiratory distress from video clips.
Findings
Achieved an F1 score of 0.81 on respiratory distress detection.
Video transformers can detect subtle respiratory mechanics changes.
Proposed model outperforms baseline approaches.
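For readers unfamiliar with the metric reported above, F1 is the harmonic mean of precision and recall. The sketch below computes it for binary labels. The function name and the toy labels are illustrative assumptions, not the authors' evaluation code or data, and this summary does not state how the paper aggregates its score.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 with 1 as the positive class (hypothetical helper)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, made up purely for illustration.
toy_true = [1, 1, 1, 0, 0]
toy_pred = [1, 1, 0, 1, 0]
toy_f1 = f1_score(toy_true, toy_pred)
```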
Abstract
Recognition of respiratory distress through visual inspection is a life-saving clinical skill. Clinicians can detect early signs of respiratory deterioration, creating a valuable window for earlier intervention. In this study, we evaluate whether recent advances in video transformers can enable Artificial Intelligence systems to recognize the signs of respiratory distress from video. We collected videos of healthy volunteers recovering after strenuous exercise and used the natural recovery of each participant's respiratory status to create a labeled dataset for respiratory distress. Splitting the video into short clips, with earlier clips corresponding to more shortness of breath, we designed a temporal ordering challenge to assess whether an AI system can detect respiratory distress. We found a ViViT encoder augmented with Lie Relative Encodings (LieRE) and Motion Guided Masking,…
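The temporal ordering setup described in the abstract can be sketched roughly as follows: a recovery video is cut into fixed-length clips, and pairs of clips are labeled by which one came first (and so, by assumption, shows more shortness of breath). The clip length, the all-pairs sampling, the random presentation order, and every name below are assumptions made for illustration. This excerpt does not specify how the authors built their pairs.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Clip:
    """A time window within one participant's recovery video (hypothetical type)."""
    start_s: float
    end_s: float


def split_into_clips(duration_s: float, clip_len_s: float) -> List[Clip]:
    """Cut a video into consecutive, non-overlapping clips; drop any remainder."""
    n_clips = int(duration_s // clip_len_s)
    return [Clip(i * clip_len_s, (i + 1) * clip_len_s) for i in range(n_clips)]


def make_ordering_pairs(
    clips: List[Clip], rng: random.Random
) -> List[Tuple[Clip, Clip, int]]:
    """Build (clip_a, clip_b, label) triples from every pair of clips.

    label == 1 means clip_a was recorded earlier than clip_b, i.e. closer
    to the end of exercise. Presentation order is randomized so a model
    cannot rely on position alone.
    """
    pairs = []
    for i in range(len(clips)):
        for j in range(i + 1, len(clips)):
            earlier, later = clips[i], clips[j]
            if rng.random() < 0.5:
                pairs.append((earlier, later, 1))
            else:
                pairs.append((later, earlier, 0))
    return pairs


# Example: a 60-second video cut into 10-second clips (values are assumptions).
example_clips = split_into_clips(60.0, 10.0)
example_pairs = make_ordering_pairs(example_clips, random.Random(0))
```

A classifier trained on such pairs would take the two clips as input and predict the label. Comparing clips only within one participant, as done here, keeps each person as their own baseline.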
Taxonomy
Topics: Non-Invasive Vital Sign Monitoring · Phonocardiography and Auscultation Techniques · Healthcare Technology and Patient Monitoring
