Audio MFCC-gram Transformers for respiratory insufficiency detection in COVID-19

Marcelo Matheus Gauy; Marcelo Finger

arXiv:2210.14085·cs.SD·October 26, 2022

TL;DR

This paper investigates Transformer neural networks for detecting respiratory insufficiency in COVID-19 patients from speech samples, reaching 96.53% accuracy with a self-supervised acoustic model.

Contribution

It introduces a Transformer-based approach with self-supervised pretraining that raises respiratory insufficiency detection accuracy to 96.53%, up from the 87.04% of the previous CNN method.

Findings

01

With self-supervised pretraining, Transformer models achieve 96.53% accuracy in respiratory insufficiency (RI) detection.

02

Choosing a suitable self-supervised pretraining technique improves Transformer performance on RI detection.

03

Speech samples can serve as a biomarker for respiratory insufficiency, supporting the earlier hypothesis that RI is detectable through speech.

Abstract

This work explores speech as a biomarker and investigates the detection of respiratory insufficiency (RI) by analyzing speech samples. Previous work \cite{spira2021} constructed a dataset of utterances from COVID-19 patients with respiratory insufficiency and analyzed it with a convolutional neural network, achieving an accuracy of $87.04\%$ and validating the hypothesis that one can detect RI through speech. Here, we study how Transformer neural network architectures can improve the performance on RI detection. This approach enables the construction of an acoustic model. By choosing the correct pretraining technique, we generate a self-supervised acoustic model, leading to improved performance ($96.53\%$) of Transformers for RI detection.
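
To put the two accuracies from the abstract side by side, the short Python sketch below works out the absolute gain and the relative reduction in error rate. It assumes both figures were measured on comparable test data, which the abstract does not state. The function and variable names are illustrative and do not come from the paper or its repository.

```python
# Accuracies quoted in the abstract, in percent.
cnn_accuracy = 87.04          # previous CNN result
transformer_accuracy = 96.53  # Transformer with self-supervised pretraining


def error_rate(accuracy_percent: float) -> float:
    """Return the error rate in percent for a given accuracy in percent."""
    return 100.0 - accuracy_percent


def relative_error_reduction(baseline_acc: float, improved_acc: float) -> float:
    """Fraction of the baseline error removed by the improved model.

    Assumption: both accuracies refer to comparable evaluation data.
    """
    baseline_err = error_rate(baseline_acc)
    improved_err = error_rate(improved_acc)
    return (baseline_err - improved_err) / baseline_err


absolute_gain = transformer_accuracy - cnn_accuracy
reduction = relative_error_reduction(cnn_accuracy, transformer_accuracy)

print(f"absolute gain: {absolute_gain:.2f} percentage points")
print(f"relative error reduction: {reduction:.1%}")
```

Under that assumption, the error rate drops from 12.96% to 3.47%, so the 9.49-point accuracy gain corresponds to removing roughly 73% of the CNN baseline's errors.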

Code & Models

Repositories

marcelomatheusgauy/audio_mfcc_gram_transformers
PyTorch · Official

Taxonomy

Methods

Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Dense Connections · Softmax · Position-Wise Feed-Forward Layer · Label Smoothing · Absolute Position Encodings · Layer Normalization