# Composite dietary antioxidant index and HPV infection: from single and mixed associations to SHAP-interpreted machine learning predictions

**Authors:** Pei Zhang

PMC · DOI: 10.3389/fnut.2025.1619742 · 2025-07-31

## TL;DR

Higher dietary antioxidant intake, especially of vitamin E, is associated with lower odds of HPV infection in U.S. adult women, and a gradient boosting model that includes CDAI among its 12 features predicts infection with an AUC of 0.685.

## Contribution

This study introduces a machine learning model (GBM) with SHAP interpretation to predict HPV infection risk using dietary antioxidant data.

## Key findings

- CDAI was independently negatively associated with HPV infection (OR: 0.98, 95% CI: 0.97–0.99).
- Vitamin E showed the strongest negative association with HPV infection in mixture analyses.
- The Gradient Boosting Machine (GBM) model achieved an AUC of 0.685 in predicting HPV infection.
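An odds ratio of 0.98 per unit of CDAI looks small, so it can help to see what it implies over a larger difference in CDAI. The sketch below compounds the reported per-unit OR over several units, assuming the log-odds are linear in CDAI. That linearity is an assumption of this illustration only: the abstract reports an L-shaped RCS curve, so the scaled values are a rough guide, not a result from the paper. The function name `scale_odds_ratio` is hypothetical.

```python
import math


def scale_odds_ratio(or_per_unit: float, delta: float) -> float:
    """Odds ratio implied by a `delta`-unit change in exposure.

    Assumes a logistic model with a linear exposure term, so the
    per-unit log-odds simply add up: OR(delta) = exp(delta * ln(OR)).
    """
    return math.exp(delta * math.log(or_per_unit))


# Point estimate and 95% CI bounds reported for CDAI in the summary above.
reported = {"OR": 0.98, "CI low": 0.97, "CI high": 0.99}

for delta in (1, 5, 10):
    scaled = {name: round(scale_odds_ratio(value, delta), 3)
              for name, value in reported.items()}
    print(f"CDAI +{delta}: {scaled}")
```

Scaling the confidence bounds the same way is only valid under the same linearity assumption, because both bounds come from the same log-odds coefficient.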

## Abstract

Some studies have shown that dietary antioxidants may prevent the occurrence of Human Papillomavirus (HPV) infection. However, the relationship between the composite dietary antioxidant index (CDAI) and HPV infection among adult women in the United States remains unknown.

Participants from the National Health and Nutrition Examination Survey (NHANES) during 2003–2016 were included. Multivariable logistic regression, restricted cubic spline (RCS) regression, weighted quantile sum (WQS) regression, and Bayesian kernel machine regression (BKMR) were used to analyze the associations between CDAI and its sub-components and HPV infection. In addition, nine machine learning (ML) methods were employed to construct predictive models, and SHapley Additive exPlanations (SHAP) was used to further interpret the optimal model.
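The abstract does not state how CDAI is computed. A commonly used definition sums the standardized intakes (z-scores) of six dietary antioxidants, and the minimal sketch below follows that definition. The component list (vitamins A, C, E, zinc, selenium, carotenoids) is an assumption here, since the abstract names only vitamin E; the intake values are made up for illustration, and NHANES survey weights are ignored.

```python
from statistics import mean, stdev

# Assumed component list; the abstract only says "six antioxidants".
COMPONENTS = ("vitamin_a", "vitamin_c", "vitamin_e",
              "zinc", "selenium", "carotenoids")


def cdai_scores(records):
    """Return one CDAI value per participant.

    Each component is standardized across the sample as
    (intake - sample mean) / sample SD, then the six z-scores are summed.
    """
    stats = {}
    for comp in COMPONENTS:
        values = [rec[comp] for rec in records]
        stats[comp] = (mean(values), stdev(values))
    return [
        sum((rec[comp] - stats[comp][0]) / stats[comp][1] for comp in COMPONENTS)
        for rec in records
    ]


# Hypothetical daily intakes for three participants (illustrative only).
toy_records = [
    {"vitamin_a": 400, "vitamin_c": 40, "vitamin_e": 4,
     "zinc": 6, "selenium": 60, "carotenoids": 3000},
    {"vitamin_a": 600, "vitamin_c": 80, "vitamin_e": 8,
     "zinc": 10, "selenium": 100, "carotenoids": 8000},
    {"vitamin_a": 900, "vitamin_c": 120, "vitamin_e": 12,
     "zinc": 14, "selenium": 140, "carotenoids": 13000},
]

toy_scores = cdai_scores(toy_records)
print([round(score, 2) for score in toy_scores])
```

Because every component is centered on the sample mean, the scores sum to zero across the sample, and a participant with above-average intake of all six components gets a positive CDAI.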

This study enrolled 9,224 adult female participants. After adjusting for multiple confounding variables, CDAI was independently negatively associated with HPV infection (OR: 0.98, 95% CI: 0.97–0.99, p = 0.01). RCS indicated an L-shaped association between CDAI and HPV infection. In the WQS model, the WQS index of CDAI remained robustly negatively associated with HPV infection (OR: 0.78, 95% CI: 0.71–0.86, p < 0.0001). In the mixture analysis, BKMR confirmed the negative association between the six antioxidants and HPV infection. Both WQS and BKMR identified vitamin E as having the strongest negative association with HPV infection. Additionally, among the nine machine learning models, the Gradient Boosting Machine (GBM) showed the best predictive performance [area under curve (AUC) = 0.685]. SHAP analysis indicated that marital status, smoking, drinking, race, age, and CDAI had a significant impact on the model’s prediction.
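To put the reported AUC in context: an AUC of 0.685 means that a randomly chosen HPV-positive participant receives a higher predicted score than a randomly chosen HPV-negative participant about 68.5% of the time, where 0.5 corresponds to chance and 1.0 to perfect ranking. The sketch below computes AUC directly from that pairwise definition. The labels and scores are invented for illustration and are not the study's data; the function name `pairwise_auc` is hypothetical.

```python
def pairwise_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win,
    which matches the Mann-Whitney formulation of AUC.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = HPV-positive, 0 = HPV-negative.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.4, 0.5, 0.1, 0.7]
print(round(pairwise_auc(toy_labels, toy_scores), 3))
```

The double loop is quadratic in sample size, which is fine for illustration; a rank-based implementation would be the usual choice for a cohort of several thousand participants.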

Antioxidant-rich diets, especially increased intake of vitamin E, are significantly negatively associated with HPV infection. A GBM model with 12 features can effectively predict the occurrence of HPV infection, among which CDAI is an important factor in the model.

## Linked entities

- **Diseases:** Human Papillomavirus infection (MONDO:0005161)

## Full-text entities

- **Diseases:** HPV infection (MESH:D030361)
- **Chemicals:** vitamin E (MESH:D014810)
- **Species:** Homo sapiens (human, species) [taxon 9606], Human papillomavirus (species) [taxon 10566]

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12350107/full.md

---
Source: https://tomesphere.com/paper/PMC12350107