Fine-tuning Pre-trained Audio Models for COVID-19 Detection: A Technical Report

Daniel Oliveira de Brito; Letícia Gabriella de Souza; Marcelo Matheus Gauy; Marcelo Finger; Arnaldo Candido Junior

arXiv:2511.14939 · cs.SD · November 20, 2025

Open Access

TL;DR

This study evaluates pre-trained audio models for COVID-19 detection, revealing limited generalization across datasets and emphasizing the importance of demographic controls and larger datasets for reliable performance.

Contribution

It demonstrates the impact of demographic balancing on model evaluation and highlights the challenges of developing generalizable audio-based COVID-19 detection models.

Findings

01. Moderate intra-dataset performance, with Audio-MAE reaching 0.82 AUC on Coswara

02. Limited cross-dataset generalization (AUC 0.43-0.68)

03. Demographic balancing reduces performance metrics but offers a more realistic assessment

Abstract

This technical report investigates the performance of pre-trained audio models on COVID-19 detection tasks using established benchmark datasets. We fine-tuned Audio-MAE and three PANN architectures (CNN6, CNN10, CNN14) on the Coswara and COUGHVID datasets, evaluating both intra-dataset and cross-dataset generalization. We implemented strict demographic stratification by age and gender to prevent models from exploiting spurious correlations between demographic characteristics and COVID-19 status. Intra-dataset results showed moderate performance, with Audio-MAE achieving the strongest result on Coswara (0.82 AUC, 0.76 F1-score), while all models showed limited performance on COUGHVID (AUC 0.58-0.63). Cross-dataset evaluation revealed severe generalization failure across all models (AUC 0.43-0.68), with Audio-MAE showing strong performance degradation (F1-score 0.00-0.08). Our…
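The abstract does not spell out how the demographic stratification was carried out. As an illustration only, the sketch below shows one common way to balance a dataset by age and gender: group recordings into (age bin, gender) strata and downsample so each stratum holds equal numbers of positive and negative cases. The record keys `age`, `gender`, and `label`, the 10-year bin width, and both function names are assumptions made for this example, not details taken from the report.

```python
import random
from collections import defaultdict


def age_bin(age, width=10):
    """Map an age in years to the lower edge of its bin (e.g. 37 -> 30)."""
    return (age // width) * width


def balance_by_demographics(records, seed=0):
    """Downsample so every (age bin, gender) stratum has equal class counts.

    `records` is a list of dicts with hypothetical keys 'age', 'gender'
    and 'label' (1 = COVID-19 positive, 0 = negative). Strata that lack
    one of the two classes contribute nothing to the output.
    """
    rng = random.Random(seed)
    strata = defaultdict(lambda: {0: [], 1: []})
    for rec in records:
        key = (age_bin(rec["age"]), rec["gender"])
        strata[key][rec["label"]].append(rec)

    balanced = []
    for key in sorted(strata):
        negatives, positives = strata[key][0], strata[key][1]
        n = min(len(negatives), len(positives))
        # Keep n randomly chosen samples of each class in this stratum.
        balanced.extend(rng.sample(negatives, n))
        balanced.extend(rng.sample(positives, n))
    return balanced
```

After balancing in this way, age and gender alone carry no information about the label within the retained data, which is the property the abstract's stratification aims for. The cost is a smaller dataset, since unmatched samples are discarded.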

Taxonomy

Topics: COVID-19 diagnosis using AI · Face recognition and analysis · Domain Adaptation and Few-Shot Learning