TL;DR
This paper investigates using Transformer neural networks to detect respiratory insufficiency (RI) in COVID-19 patients from speech samples, reaching 96.53% accuracy with a self-supervised acoustic model.
Contribution
It introduces a Transformer-based approach with self-supervised pretraining that significantly improves respiratory insufficiency detection accuracy over previous CNN methods.
Findings
Transformer models achieve 96.53% accuracy in respiratory insufficiency (RI) detection.
Self-supervised pretraining enhances model performance.
Speech analysis can serve as a biomarker for respiratory insufficiency.
Abstract
This work explores speech as a biomarker and investigates the detection of respiratory insufficiency (RI) by analyzing speech samples. Previous work \cite{spira2021} constructed a dataset of utterances from COVID-19 patients with respiratory insufficiency and analyzed it with a convolutional neural network, reporting an accuracy that validated the hypothesis that RI can be detected through speech. Here, we study how Transformer neural network architectures can improve performance on RI detection. This approach enables the construction of an acoustic model. By choosing the correct pretraining technique, we generate a self-supervised acoustic model, leading to improved performance (96.53% accuracy) of Transformers for RI detection.
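The Transformer architectures mentioned in the abstract are built around scaled dot-product attention. The following is a minimal sketch of that operation over a short sequence of acoustic feature frames, not the authors' model: it uses a single head with identity projections (queries, keys, and values are the frames themselves), and the names `softmax`, `self_attention`, and `toy_frames` as well as the input values are hypothetical.

```python
import math


def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Single-head self-attention with identity projections (Q = K = V = frames)."""
    dim = len(frames[0])
    scale = math.sqrt(dim)
    weights, outputs = [], []
    for query in frames:
        # Similarity of this frame to every frame, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in frames]
        row = softmax(scores)
        weights.append(row)
        # Each output frame is a weighted average of all input frames.
        outputs.append(
            [sum(w * value[d] for w, value in zip(row, frames)) for d in range(dim)]
        )
    return outputs, weights


# Hypothetical toy input: 4 frames with 3 features each (not real speech data).
toy_frames = [
    [0.1, 0.3, -0.2],
    [0.0, 0.5, 0.4],
    [-0.3, 0.2, 0.1],
    [0.6, -0.1, 0.0],
]
attended, attention_weights = self_attention(toy_frames)
```

In a full model, learned linear projections would produce the queries, keys, and values, several heads would run in parallel, and a classification layer on top would output the RI decision; those parts are omitted here.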
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Dense Connections · Softmax · Position-Wise Feed-Forward Layer · Label Smoothing · Absolute Position Encodings · Layer Normalization
