# Adaptive modelling approach for predicting causes of death: insights from verbal autopsy data in Tanzania

**Authors:** Mahadia Tunga, James Chambua, Juma Lungo

PMC · DOI: 10.1093/inthealth/ihaf123 · 2025-11-17

## TL;DR

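This paper introduces an adaptive Bayesian networks model that predicts causes of death from verbal autopsy data collected in Iringa, Tanzania, reaching an average accuracy of 97% and outperforming Support Vector Machine and Naïve Bayesian baselines.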

## Contribution

The study introduces an adaptive Bayesian networks model guided by a CoD decision flow, an operational guide that, to the authors' knowledge, no previous VA study has provided.

## Key findings

- The model achieved an average accuracy of 97%, outperforming the Support Vector Machine and Naïve Bayesian models.
- It demonstrated high specificity (97%) and sensitivity (94%), indicating strong performance in CoD classification.
- The model's adaptability allows for improved predictions as datasets expand.

## Abstract

The World Health Organization (WHO) has approved the use of verbal autopsy (VA), a survey-based approach for determining causes of death (CoDs) that occur outside hospitals. In this study, an adaptive Bayesian networks machine learning model was developed and tested. The model is scalable and adaptable for predicting new causes as the dataset expands.

The 2016 WHO questionnaire was used to collect data from Iringa, Tanzania, and data augmentation was performed using the Synthetic Minority Oversampling Technique for nominal features to increase the dataset size and reduce bias in the CoD classification. The model development was guided by a CoD decision flow that integrates essential factors and steps for accurate CoD prediction. To our knowledge, no previous study has provided this operational guide for VA cause of death prediction.
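To illustrate the oversampling idea described above, here is a minimal sketch of SMOTE-style augmentation for nominal features. It is not the authors' pipeline: the function name `oversample_nominal`, the yes/no symptom values and the cause labels are hypothetical, and plain Hamming distance stands in for the value difference metric that SMOTE for nominal features normally uses. Each synthetic record takes, feature by feature, the majority value among a minority record and its nearest minority neighbours.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of positions where two nominal records differ."""
    return sum(x != y for x, y in zip(a, b))


def oversample_nominal(rows, labels, target_label, n_new, k=3, seed=0):
    """Create n_new synthetic records for target_label.

    Simplified SMOTE for nominal data (assumption: Hamming distance
    replaces the value difference metric).
    """
    rng = random.Random(seed)
    minority = [r for r, l in zip(rows, labels) if l == target_label]
    if len(minority) < 2:
        raise ValueError("need at least two minority records")

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [r for r in minority if r is not base]
        neighbours = sorted(others, key=lambda r: hamming(base, r))[:k]
        group = [base] + neighbours

        new_row = []
        for j in range(len(base)):
            counts = Counter(r[j] for r in group)
            top = max(counts.values())
            # Majority vote per feature; ties favour the base record's value.
            if counts[base[j]] == top:
                new_row.append(base[j])
            else:
                new_row.append(max(counts, key=counts.get))
        synthetic.append(tuple(new_row))
    return synthetic


# Hypothetical symptom indicators: (fever, cough, weight_loss)
rows = [
    ("yes", "no", "no"),
    ("yes", "no", "yes"),
    ("no", "yes", "yes"),
    ("no", "yes", "yes"),
    ("yes", "yes", "yes"),
    ("no", "yes", "no"),
]
labels = ["cause_a", "cause_a", "cause_b", "cause_b", "cause_b", "cause_b"]

new_rows = oversample_nominal(rows, labels, "cause_a", n_new=2, k=1)
```

Because every synthetic value is copied from an existing minority record, the augmented data never contains a category that was absent from the original responses.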

The model was evaluated using accuracy, sensitivity, specificity and F1 score and compared with Support Vector Machine and Naïve Bayesian models. It achieved an average accuracy of 97%, specificity of 97%, sensitivity of 94% and F1 score of 94%, all superior to the corresponding results for the Naïve Bayesian and Support Vector Machine models.
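The four metrics named above can be computed for a multi-class CoD classifier by treating each cause one-vs-rest and averaging across causes. The sketch below assumes macro-averaging; this summary does not state how the paper averaged its figures, and the cause labels and predictions are invented for illustration.

```python
def per_class_metrics(y_true, y_pred):
    """One-vs-rest sensitivity, specificity and F1 for each class."""
    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    out = {}
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        tn = n - tp - fp - fn
        sens = tp / (tp + fn) if tp + fn else 0.0   # true positive rate
        spec = tn / (tn + fp) if tn + fp else 0.0   # true negative rate
        prec = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
        out[c] = {"sensitivity": sens, "specificity": spec, "f1": f1}
    return out


def macro_average(metrics):
    """Unweighted mean of each metric across classes."""
    names = ["sensitivity", "specificity", "f1"]
    return {m: sum(v[m] for v in metrics.values()) / len(metrics) for m in names}


def accuracy(y_true, y_pred):
    """Share of records whose predicted cause matches the reference cause."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical reference and predicted causes
y_true = ["malaria", "malaria", "tb", "tb", "hiv", "hiv"]
y_pred = ["malaria", "tb", "tb", "tb", "hiv", "hiv"]

metrics = per_class_metrics(y_true, y_pred)
summary = macro_average(metrics)
overall_accuracy = accuracy(y_true, y_pred)
```

Reporting specificity alongside sensitivity matters in multi-class CoD work because each one-vs-rest split has many more negatives than positives, so specificity tends to be high even when some causes are often missed.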

The reported performance demonstrates the model's potential to enhance VA-based CoD data by integrating a machine learning approach with physician expertise. The results highlight the effectiveness of combining Bayesian networks with physician Symptom Cause Information as a valuable tool for improving CoD predictions.

## Full-text entities

- **Diseases:** Symptom (MESH:D012816), death (MESH:D003643)

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12766446/full.md

---
Source: https://tomesphere.com/paper/PMC12766446