Using Self-Supervised Feature Extractors with Attention for Automatic COVID-19 Detection from Speech

John Mendonça; Rubén Solera-Ureña; Alberto Abad; Isabel Trancoso

arXiv:2107.00112 · eess.AS · July 2, 2021 · 1 citation

Open Access

TL;DR

This study evaluates self-supervised speech feature extractors combined with attention mechanisms for automatic COVID-19 detection from speech, showing performance competitive with or superior to traditional methods.

Contribution

The paper applies pre-trained self-supervised speech representations with self-attention pooling to COVID-19 detection and reports similar or better Unweighted Average Recall (UAR) than models built on traditional handcrafted features.

Findings

01. Self-supervised features match or outperform handcrafted features.

02. Attention pooling enhances utterance-level information aggregation.

03. Best model achieves 72.3% UAR on the development set.
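
The UAR figure above is the mean of the per-class recalls, so each class counts equally no matter how many samples it has. The following is a minimal sketch of that computation, not the challenge's official scoring script; the function name and the toy labels are hypothetical and are not data from the paper.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (illustration only).
y_true = ["neg", "neg", "neg", "neg", "pos", "pos"]
y_pred = ["neg", "neg", "neg", "pos", "pos", "neg"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR={uar:.3f} accuracy={accuracy:.3f}")
```

In this toy case the recall is 3/4 for the negative class and 1/2 for the positive class, so UAR is 0.625 while plain accuracy is 4/6. That gap is why UAR is a common choice when one class is much rarer than the other.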

Abstract

The ComParE 2021 COVID-19 Speech Sub-challenge provides a test-bed for the evaluation of automatic detectors of COVID-19 from speech. Such models can be of value by providing test triaging capabilities to health authorities, working alongside traditional testing methods. Herein, we leverage pre-trained, problem-agnostic speech representations and evaluate their use for this task. We compare the obtained results against a CNN architecture trained from scratch and traditional frequency-domain representations. We also evaluate the use of Self-Attention Pooling as an utterance-level information aggregation method. Experimental results demonstrate that models trained on features extracted from self-supervised models perform similarly to or outperform fully-supervised models and models based on handcrafted features. Our best model improves the Unweighted Average Recall (UAR) from…
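
The abstract does not spell out the pooling layer, so the following is only a sketch of one common formulation of self-attention pooling, assuming a single learnable scoring vector `w`: each frame gets a scalar score, the scores are normalized with a softmax over time, and the utterance embedding is the weighted sum of the frames. The function name, the vector `w`, and the toy frame values are all hypothetical; in a real model `w` would be trained and the frames would come from the feature extractor.

```python
import math


def self_attention_pooling(frames, w):
    """Pool a sequence of frame vectors into one utterance-level vector.

    frames: list of T frame vectors, each of dimension D
    w:      scoring vector of dimension D (assumed learnable in a real model)
    Returns (pooled vector of dimension D, list of T attention weights).
    """
    # One scalar score per frame: dot product with the scoring vector.
    scores = [sum(wi * xi for wi, xi in zip(w, frame)) for frame in frames]
    # Numerically stable softmax over the time axis.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    # Weighted sum of the frames using the attention weights.
    dim = len(frames[0])
    pooled = [
        sum(a * frame[d] for a, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return pooled, weights


# Hypothetical toy input: three 2-dimensional frames.
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
w = [0.5, -0.25]
pooled, weights = self_attention_pooling(frames, w)
```

With `w` set to all zeros every frame receives the same weight and the result reduces to plain mean pooling, which makes the relation to simple averaging explicit.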

Taxonomy

Topics: COVID-19 diagnosis using AI · Misinformation and Its Impacts · Speech Recognition and Synthesis