# A multi-dataset exploratory framework for understanding digital behavior and substance use risk profiles

**Authors:** Perla Shiva Sindhu, Modigari Narendra

PMC · DOI: 10.3389/fpubh.2026.1771271 · Frontiers in Public Health · 2026-03-18

## TL;DR

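This study explores how digital behavior and psychological factors influence substance use, applying machine learning to the NHANES dataset and a Kaggle social media psychology dataset, with a supplementary survey (N = 236) as an interpretive layer, to identify risk profiles.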

## Contribution

The paper introduces a multi-dataset framework combining quantitative and qualitative methods to explore digital behavior and substance use risk.

## Key findings

- Nonlinear relationships were found between social media engagement, anxiety, and loneliness.
- Anxiety scores plateaued at higher digital engagement levels, challenging linear dose-response assumptions.
- Hyperparameter tuning improved the predictive performance of the machine learning models on both the NHANES and Kaggle datasets.

## Abstract

Substance use continues to evolve as a multidimensional public health challenge influenced by traditional behavioral triggers and emerging digital interactions. This study investigates how demographic factors, psychological states, and patterns of digital engagement shape substance use behaviors using multiple behavioral data sources.

Quantitative analyses were conducted using the NHANES dataset and a Kaggle social media psychology dataset to identify statistical relationships and train predictive machine learning models for substance use indicators and digital behavioral patterns. Random Forest, XGBoost, AdaBoost, Support Vector Regression (SVR), and Logistic Regression models were evaluated, with hyperparameter tuning applied to improve predictive performance. In addition, a supplementary survey (N = 236) was collected and used as a qualitative interpretive layer to contextualize the relationship between digital behavior and substance use risk.
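The abstract does not say which tuning procedure or search space was used. As a minimal sketch of one common approach, the block below runs a grid search scored by k-fold cross-validation. Everything in it is an assumption made for illustration: the data are synthetic, the model is a simple k-nearest-neighbors regressor standing in for the paper's models (it is not one of them), and the names `knn_predict`, `cv_mse`, and `grid_search` are hypothetical.

```python
import random
from statistics import mean


def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)


def cv_mse(data, k, n_folds=5):
    """Mean squared error of k-NN regression, averaged over n_folds folds."""
    fold_errors = []
    for f in range(n_folds):
        # Every n_folds-th point (offset f) is held out; the rest is training data.
        held_out = data[f::n_folds]
        train = [p for i, p in enumerate(data) if i % n_folds != f]
        fold_errors.append(
            mean((knn_predict(train, x, k) - y) ** 2 for x, y in held_out)
        )
    return mean(fold_errors)


def grid_search(data, candidates, n_folds=5):
    """Score every candidate k by cross-validation and return the best one."""
    scores = {k: cv_mse(data, k, n_folds) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data (NOT from the paper): a noisy linear trend, in random order.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]
data = [(x, 2.0 * x + rng.gauss(0, 1.0)) for x in xs]

candidates = [1, 3, 5, 9, 15]
best_k, scores = grid_search(data, candidates)
print("best k:", best_k)
print("cross-validated MSE per k:", {k: round(v, 3) for k, v in scores.items()})
```

The same loop applies to any model and any hyperparameter grid: only the prediction function and the candidate list change. Scoring on held-out folds rather than on the training data is what keeps the search from simply rewarding the most flexible setting.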

The analysis revealed nonlinear relationships between social media engagement, anxiety, and loneliness. Contrary to the widely cited linear dose–response assumption, anxiety scores plateaued at higher levels of digital engagement, suggesting that the qualitative nature of online interactions may exert greater influence on psychological distress than usage duration alone. Machine learning models demonstrated improved predictive performance after hyperparameter tuning across both datasets.
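One way to check for a plateau like the one described above is to compare a straight-line fit with a fit that is allowed to flatten past some engagement level. The sketch below does this with ordinary least squares on a capped predictor. It is an illustration under stated assumptions, not the authors' analysis: the data are synthetic, and the variable names (`hours`, `anxiety`), the candidate knot grid, and the functions `fit_line` and `fit_plateau` are hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def fit_plateau(xs, ys, knots):
    """Fit y = a + b*min(x, c) for each candidate knot c; keep the lowest SSE."""
    best = None
    for c in knots:
        capped = [min(x, c) for x in xs]  # the response is flat beyond c
        a, b, sse = fit_line(capped, ys)
        if best is None or sse < best[3]:
            best = (c, a, b, sse)
    return best


# Synthetic data (NOT from the paper): the score rises with usage up to
# 4 hours and stays flat afterwards, plus Gaussian noise.
rng = random.Random(1)
hours = [rng.uniform(0, 10) for _ in range(300)]
anxiety = [2.0 + 1.5 * min(h, 4.0) + rng.gauss(0, 0.5) for h in hours]

_, _, linear_sse = fit_line(hours, anxiety)
knots = [k / 2 for k in range(2, 20)]  # candidate plateau onsets: 1.0 to 9.5
knot, intercept, slope, plateau_sse = fit_plateau(hours, anxiety, knots)

print("linear SSE:", round(linear_sse, 1))
print("plateau SSE:", round(plateau_sse, 1), "with estimated knot at", knot)
```

Because the plateau model has one extra parameter (the knot), a lower in-sample error on its own is weak evidence. On real data the comparison would more reasonably rest on held-out error or an information criterion.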

These findings highlight the importance of considering digital engagement patterns alongside traditional behavioral and demographic factors in substance use research. The results support the development of platform-specific digital well-being strategies, nuanced behavioral modeling approaches, and culturally sensitive interventions that integrate both objective behavioral data and subjective user experiences. The proposed multi-source evidence framework provides a foundation for future exploratory behavioral risk profiling and prevention systems.

## Full-text entities

- **Diseases:** psychological distress (MESH:D012128), Substance use (MESH:D019966), anxiety (MESH:D001007)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13038959/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13038959/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/PMC13038959/full.md

---
Source: https://tomesphere.com/paper/PMC13038959