# Using buccal methylomic data to create explainable aging clocks as well as classifiers and regressors for lifestyle and demographic factors

**Authors:** Maxim N. Shokhirev, Adiv A. Johnson

PMC · DOI: 10.3389/fgene.2025.1637186 · 2025-10-01

## TL;DR

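DNA methylation from cheek (buccal) swabs of 8,045 adults can be used to predict chronological age, smoking status, race/ethnicity, body mass index, and alcohol intake. The most informative CpGs in each model point to associated biological pathways and transcription factor targets.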

## Contribution

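The study introduces buccal methylomic classifiers for smoking status and race/ethnicity, novel regressors for body mass index, alcohol intake, and chronological age, and proof-of-concept explainable deep learning models that estimate age through Reactome pathways or transcription factor target sets.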

## Key findings

- Classifiers for smoking status and race/ethnicity were successfully built using buccal methylomic data.
- Regressors for BMI, alcohol intake, and chronological age were developed, and the 1,000 most important CpGs of each model were analyzed for enriched biological pathways and transcription factor targets.
- Explainable deep learning models linked DNA methylation sites to Reactome pathways and transcription factors for age estimation.
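One common way to obtain this kind of interpretability is a sparsely connected hidden layer in which each hidden unit stands for one pathway and only receives input from CpGs annotated to genes in that pathway. The NumPy sketch below illustrates that idea under stated assumptions; it is not the authors' implementation. The function names `build_mask` and `forward`, the ReLU activation, and all CpG, gene, and pathway identifiers are hypothetical.

```python
import numpy as np

def build_mask(cpg_to_gene, gene_to_pathways, cpgs, pathways):
    """Binary (n_cpgs, n_pathways) mask: 1 where a CpG's gene belongs to the pathway."""
    col = {p: j for j, p in enumerate(pathways)}
    mask = np.zeros((len(cpgs), len(pathways)))
    for i, cpg in enumerate(cpgs):
        gene = cpg_to_gene.get(cpg)  # None if the CpG has no gene annotation
        for p in gene_to_pathways.get(gene, ()):
            if p in col:
                mask[i, col[p]] = 1.0
    return mask

def forward(X, W, b, w_out, b_out, mask):
    """Pathway-constrained forward pass.

    X: (n_samples, n_cpgs) methylation values.
    W: (n_cpgs, n_pathways) weights; entries where mask == 0 are ignored.
    Returns (age estimates, per-pathway activations).
    """
    hidden = np.maximum(0.0, X @ (W * mask) + b)  # ReLU is an assumption
    return hidden @ w_out + b_out, hidden

# Toy example with hypothetical identifiers.
cpgs = ["cg_a", "cg_b", "cg_c"]
pathways = ["pathway_1", "pathway_2"]
mask = build_mask(
    {"cg_a": "GENE1", "cg_b": "GENE2", "cg_c": "GENE3"},
    {"GENE1": ["pathway_1"], "GENE2": ["pathway_1", "pathway_2"]},
    cpgs,
    pathways,
)
```

Because the mask zeroes every CpG-to-pathway weight that lacks an annotation, each hidden activation can be read as a pathway-level score, which is what makes the resulting age estimate inspectable.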

## Abstract

In human blood, it has been demonstrated that methylomic information can be used to predict smoking status, alcohol intake, and chronological age. While it is possible to robustly predict chronological age using DNA methylation information derived from buccal tissue, it remains to be determined if other variables can be directly predicted in cheek swabs. Here, we demonstrate that classifiers for smoking status and race/ethnicity can be built in a buccal methylomic dataset derived from 8,045 adults spanning an age range of 18–93 years. Furthermore, we build novel regressors for body mass index, alcohol intake, and chronological age. For each of these models, we identify the 1,000 most important CpGs and perform enrichment analyses on them to expose associated biological pathways and transcription factor targets. We additionally explore how the architecture of an epigenetic aging clock, specifically how many hidden layers are present, influences model accuracy. Finally, we build proof-of-concept, explainable deep learning models that connect DNA methylation sites annotated to genes to Reactome pathways or to transcription factors. These pathways and target sets are then used to estimate age, a feature that provides interpretability. Altogether, these findings further emphasize the usability of buccal data for epigenetic predictions.
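The "top 1,000 CpGs followed by enrichment analysis" step can be sketched as follows. The abstract does not say how importance was scored or which enrichment statistic was used, so this example assumes ranking by absolute importance and a one-sided hypergeometric over-representation test. The function names `top_k_features` and `hypergeom_enrichment` and every identifier in the toy data are hypothetical.

```python
from math import comb

def top_k_features(importance, k=1000):
    """Return the k feature names with the largest absolute importance."""
    ranked = sorted(importance, key=lambda name: abs(importance[name]), reverse=True)
    return ranked[:k]

def hypergeom_enrichment(selected, gene_set, background):
    """One-sided hypergeometric test for over-representation.

    Returns (overlap, p_value), where p_value = P(overlap >= observed) when
    len(selected) items are drawn without replacement from the background.
    """
    background = set(background)
    selected = set(selected) & background
    gene_set = set(gene_set) & background
    N, K, n = len(background), len(gene_set), len(selected)
    x = len(selected & gene_set)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(x, min(K, n) + 1))
    return x, tail / comb(N, n)

# Toy example: 10 background genes, a 3-gene pathway, 3 selected genes.
background = [f"g{i}" for i in range(10)]
overlap, p_value = hypergeom_enrichment(
    selected=["g0", "g1", "g5"],
    gene_set=["g0", "g1", "g2"],
    background=background,
)
```

In practice the selected CpGs would first be mapped to their annotated genes, and p-values across many pathways would need multiple-testing correction; both steps are omitted here for brevity.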

## Full-text entities

- **Chemicals:** alcohol (MESH:D000438)
- **Species:** Homo sapiens (human; NCBI Taxonomy ID 9606)

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12521809/full.md

---
Source: https://tomesphere.com/paper/PMC12521809