# Tuberculosis Detection from Cough Recordings Using Bag-of-Words Classifiers

**Authors:** Irina Pavel, Iulian B. Ciocoiu

PMC · DOI: 10.3390/s25196133 · 2025-10-03

## TL;DR

This paper applies Bag-of-Words classifiers to detect tuberculosis from cough recordings, reaching up to 0.77 accuracy and 0.84 AUC on two large datasets.

## Contribution

The paper proposes Bag-of-Words classifiers for detecting tuberculosis infection from cough recordings and evaluates independent and combined feature extraction procedures and encoding strategies.
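
For readers unfamiliar with the representation, the sketch below shows the general Bag-of-Words idea for audio: frame-level feature vectors are assigned to the nearest entry of a codebook, and a recording is summarized by the histogram of those assignments. This summary does not state which features, codebook learning method, or encoding the authors used, so the hard-assignment encoding, the toy 2-D features, and the names `nearest_codeword` and `bow_histogram` are all illustrative assumptions.

```python
import math


def nearest_codeword(frame, codebook):
    """Return the index of the codeword closest to `frame` (squared Euclidean)."""
    best_index, best_dist = 0, math.inf
    for index, word in enumerate(codebook):
        dist = sum((f - w) ** 2 for f, w in zip(frame, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(frames, codebook):
    """Encode one recording as an L1-normalised histogram of codeword counts."""
    counts = [0] * len(codebook)
    for frame in frames:
        counts[nearest_codeword(frame, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Toy 2-D "frame features" and a hypothetical 3-word codebook (assumed values;
# a real codebook would typically be learned, e.g. by clustering training frames).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
frames = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
histogram = bow_histogram(frames, codebook)
```

The resulting fixed-length histogram can be passed to any standard classifier, regardless of how long the original recording was.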

## Key findings

- The approach achieved up to 0.77 accuracy and 0.84 AUC on two large cough datasets.
- The method is robust to different feature extraction and encoding combinations.
- Performance was validated using repeated k-fold cross-validation and external datasets.
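
As a rough illustration of the validation scheme named above, repeated k-fold cross-validation reshuffles the data and re-partitions it into k folds several times, so each sample is tested once per repetition. The sketch below is a minimal stdlib version; the function name `repeated_kfold`, the values of `k` and `n_repeats`, and the seed are assumptions, since this summary does not give the paper's settings. It is also neither stratified nor grouped by subject, which a real cough study would likely need in order to avoid leakage between folds.

```python
import random


def repeated_kfold(n_samples, k, n_repeats, seed=0):
    """Yield (train_idx, test_idx) for each fold of each repetition."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [indices[i::k] for i in range(k)]
        for i, test_idx in enumerate(folds):
            train_idx = [
                idx for j, fold in enumerate(folds) if j != i for idx in fold
            ]
            yield train_idx, test_idx


# Hypothetical setup: 10 recordings, 5 folds, 3 repetitions.
splits = list(repeated_kfold(n_samples=10, k=5, n_repeats=3))
```

Scores are typically averaged over all `k * n_repeats` test folds, which gives a less split-dependent estimate than a single k-fold run.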

## Abstract

The paper proposes the use of Bag-of-Words classifiers for the reliable detection of tuberculosis infection from cough recordings. The effect of using both independent and combined distinct feature extraction procedures and encoding strategies is evaluated in terms of standard performance metrics such as the Area Under Curve (AUC), accuracy, sensitivity, and F1-score. Experiments were conducted on two distinct large datasets, using both the original recordings and extended versions obtained by augmentation techniques. Performances were assessed by repeated k-fold cross-validation and by employing external datasets. An extensive ablation study revealed that the proposed approach yields up to 0.77 accuracy and 0.84 AUC values, comparing favorably against existing solutions and exhibiting robustness against various combinations of the setup parameters.
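
The metrics listed in the abstract can be computed from binary labels and classifier scores as in the following minimal sketch. It assumes label 1 denotes the tuberculosis-positive class and uses a 0.5 decision threshold; both choices, the toy data, and the function names are illustrative and not taken from the paper. AUC is computed here via its pairwise-ranking interpretation (the probability that a positive sample scores higher than a negative one, with ties counted as one half).

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def sensitivity(y_true, y_pred):
    """True-positive rate: fraction of positive cases that were detected."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if tp + fn else 0.0


def f1_score(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores (assumed values, not results from the paper).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
sens = sensitivity(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
area = auc(y_true, scores)
```

Unlike accuracy, sensitivity, and F1, AUC does not depend on the decision threshold, which is one reason the two are usually reported together.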

## Linked entities

- **Diseases:** tuberculosis (MONDO:0018076)

## Full-text entities

- **Diseases:** Tuberculosis (MESH:D014376), Cough (MESH:D003371)

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12526601/full.md

---
Source: https://tomesphere.com/paper/PMC12526601