# Machine learning augmented diagnostic testing to identify sources of variability in test performance

**Authors:** Christopher Jon Banks, Aeron Sanchez, Vicki Stewart, Kate Bowen, Thomas Doherty, Oliver Tearne, Graham Smith, Rowland R. Kao

PMC · DOI: 10.1371/journal.pcbi.1013651 · 2025-11-04

## TL;DR

This paper shows how machine learning can improve diagnostic testing for bovine tuberculosis, helping detect more infected herds without increasing false positives.

## Contribution

The paper introduces a novel use of machine learning to augment diagnostic testing for bovine tuberculosis, improving detection rates without compromising specificity.

## Key findings

- Machine learning improved detection of infected herds by over 5 percentage points, equivalent to 240 additional herds per year.
- The model can reduce unnecessary restrictions on over 5,000 uninfected herds when tuned for specificity.
- Simulation models suggest the approach could reduce infected animals and outbreaks in high-risk areas over time.
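As a rough consistency check on the first finding (a back-of-envelope reading, not a figure reported in this summary): let $\Delta Se$ be the gain in herd-level sensitivity and $N_{\text{inf}}$ the number of truly infected herds tested in a year. Both symbols are introduced here for illustration only. The extra detections are then their product:

$$
\text{additional herds detected} = \Delta Se \times N_{\text{inf}}, \qquad
\Delta Se > 0.05 \ \text{and}\ 240 \ \text{herds} \;\Rightarrow\; N_{\text{inf}} < \frac{240}{0.05} = 4800 .
$$

Assuming the 240 herds correspond exactly to the reported gain, the two figures together imply fewer than about 4,800 infected herds per year in the evaluated population.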

## Abstract

Diagnostic tests that can detect pre-clinical or sub-clinical infection are one of the most powerful tools in our armoury of weapons to control infectious diseases. Considerable effort has been devoted to improving diagnostic testing for human, plant and animal diseases, including strategies for targeting the use of diagnostic tests towards individuals who are more likely to be infected. We use machine learning to assess the risk landscape in which a diagnostic test is applied and so augment its interpretation. We develop this to predict the occurrence of bovine tuberculosis incidents in cattle herds, exploiting the availability of exceptionally detailed testing records. We show that, without compromising test specificity, test sensitivity can be improved so that the proportion of infected herds detected improves by over 5 percentage points, or 240 additional infected herds detected in one year beyond those detected by the skin test alone. We also use feature importance testing to assess the weighting of risk factors. While many factors are associated with increased risk of incidents, of note are several factors that suggest that in some herds there is a higher risk of infection going undetected.

Bovine tuberculosis (bTB) remains a major challenge for cattle farming in Great Britain, causing significant economic and animal welfare impacts. The standard skin test used to detect bTB is highly specific but can miss some infected herds. In this study, we used machine learning to combine detailed national testing records with herd information, creating a model that improves the detection of infected herds. Our approach increases the proportion of infected herds identified by over 5 percentage points—equivalent to 240 additional herds detected in one year—without increasing the number of false positives. Alternatively, if the model is tuned to focus on specificity, it can reduce unnecessary restrictions on over 5,000 herds that are not truly infected. We also used a simulation model to show that these improvements could potentially reduce the number of infected animals and outbreaks in high-risk areas over time. Our results demonstrate that machine learning can enhance existing disease testing strategies, offering practical benefits for disease control and farming communities.
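The trade-off described above, raising sensitivity while holding specificity fixed, can be illustrated with a generic threshold search over a model's risk scores. The sketch below is not the authors' pipeline: the function names, the toy labels, the skin-test outcomes and the scores are all hypothetical, and it only assumes that some model assigns each herd a risk score between 0 and 1.

```python
from typing import Optional, Sequence, Tuple


def sensitivity_specificity(labels: Sequence[bool],
                            predictions: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    tn = sum(1 for y, p in pairs if not y and not p)
    fp = sum(1 for y, p in pairs if not y and p)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def pick_threshold(labels: Sequence[bool],
                   scores: Sequence[float],
                   min_specificity: float) -> Optional[Tuple[float, float, float]]:
    """Find the lowest score threshold whose specificity stays >= min_specificity.

    Returns (threshold, sensitivity, specificity), or None if no observed
    threshold meets the specificity constraint.
    """
    best = None
    # Lowering the threshold can only add positives, so specificity never
    # increases; we can stop at the first threshold that violates the bound.
    for t in sorted(set(scores), reverse=True):
        preds = [s >= t for s in scores]
        sens, spec = sensitivity_specificity(labels, preds)
        if spec < min_specificity:
            break
        best = (t, sens, spec)
    return best


# Hypothetical toy data: True means the herd is truly infected.
labels = [True, True, True, True, False, False, False, False, False, False]
# Hypothetical skin-test outcomes: perfectly specific but misses two herds.
skin_test = [True, True, False, False, False, False, False, False, False, False]
# Hypothetical model risk scores for the same herds.
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2, 0.1, 0.1, 0.05, 0.2]

baseline_sens, baseline_spec = sensitivity_specificity(labels, skin_test)
result = pick_threshold(labels, scores, min_specificity=baseline_spec)

print("skin test alone:", baseline_sens, baseline_spec)
print("score threshold:", result)
```

On this toy data the search keeps specificity at the skin test's level while flagging one more infected herd. Tuning in the other direction, fixing sensitivity and maximising specificity, would follow the same pattern with the roles of the two metrics swapped.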

## Linked entities

- **Diseases:** bovine tuberculosis (MONDO:0025136)

## Full-text entities

- **Diseases:** infectious diseases (MESH:D003141), infected (MESH:D007239)
- **Species:** Bos taurus (bovine, species) [taxon 9913], Homo sapiens (human, species) [taxon 9606]

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12646444/full.md

---
Source: https://tomesphere.com/paper/PMC12646444