# Leveraging Large Language Models to Identify Engagement-Driving Features in Vaping-Related TikTok Videos: Cross-Sectional Study

**Authors:** Zidian Xie, Nanda Kishore Korrapolu, Amisha Dubey, Luchuan Song, Chenliang Xu, Karen M Wilson, AnaPaula Cupertino, Dongmei Li

PMC · DOI: 10.2196/76265 · Journal of Medical Internet Research · 2025-11-20

## TL;DR

This study identifies features of TikTok videos that increase user engagement, aiming to improve vaping prevention campaigns.

## Contribution

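The study applies two large language models, GPT-4 and Video-LLaMA, to extract features from e-cigarette-related TikTok videos and identifies which of those features are associated with higher user engagement, to inform video design for vaping prevention campaigns.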

## Key findings

- Videos filmed in cars (compared with public spaces) and videos featuring young adults, talking, emojis, or funny and silly content were associated with higher engagement.
- Promotional content in videos is associated with lower user engagement.
- GPT-4 identified video features more accurately than Video-LLaMA (83%‐100% vs 24%‐88%) on 25 hand-coded videos.

## Abstract

Electronic cigarette (e-cigarette) use is prevalent in youth and young adults in the United States. TikTok (ByteDance), a popular social media platform among youth and young adults, has become a key avenue for disseminating e-cigarette-related videos, with promotional videos constituting the predominant form.

This study aimed to identify key e-cigarette-related TikTok video features associated with high user engagement to assist with future video design for vaping prevention campaigns.

We collected 1487 e-cigarette-related TikTok videos and related metadata posted between January 2023 and January 2024 using the TikTok API (application programming interface). We applied the large language models GPT-4 and Video-LLaMA to extract video features (eg, promotion content, background, perceived sex, lifestyle, talking, cartoon, vaping tricks, and containing emojis) from these videos. We randomly selected and hand-coded 25 videos to check the accuracy of the 2 models in identifying these video features. We used a linear mixed effects model with a random intercept to identify video features significantly associated with high TikTok user engagement ([likes+shares+comments]/views).
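The engagement metric and the hand-coded accuracy check described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the sample numbers are hypothetical, and treating a zero-view video as having no defined engagement rate is an assumption.

```python
from typing import Optional


def engagement_rate(likes: int, shares: int, comments: int, views: int) -> Optional[float]:
    """Engagement as defined in the abstract: (likes + shares + comments) / views."""
    if views <= 0:
        # Assumption: a video with no views has no defined engagement rate.
        return None
    return (likes + shares + comments) / views


def feature_accuracy(model_labels: list, hand_labels: list) -> float:
    """Share of videos where the model's label matches the hand-coded label."""
    if len(model_labels) != len(hand_labels) or not hand_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(m == h for m, h in zip(model_labels, hand_labels))
    return matches / len(hand_labels)


# Hypothetical numbers, for illustration only.
rate = engagement_rate(likes=120, shares=10, comments=20, views=3000)
acc = feature_accuracy(
    ["car", "home", "car", "public"],      # labels from a model (made up)
    ["car", "home", "public", "public"],   # hand-coded labels (made up)
)
print(rate, acc)
```

Computing accuracy separately for each feature, as in the second function, is what would yield a per-feature range such as the one reported for the two models.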

Compared to the Video-LLaMA model, the GPT-4 model exhibited higher accuracy (83%‐100% vs 24%‐88%) in video feature identification. Notably, video backgrounds in cars (rate ratio [RR]=3.91, 95% CI 1.25‐12.20; P=.009) demonstrated significantly higher user engagement than in public spaces. Moreover, videos featuring young adults (RR=1.24, 95% CI 1.00‐1.53; P=.048), talking (RR=1.63, 95% CI 1.30‐2.05; P<.001), containing emojis (RR=1.88, 95% CI 1.48‐2.38; P<.001), or funny and silly content (RR=1.61, 95% CI 1.29‐2.00; P<.001) exhibited heightened user engagement. Conversely, videos with promotional content (RR=0.60, 95% CI 0.45‐0.81; P=.001) experienced lower engagement.
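Rate ratios of this kind are typically obtained by exponentiating a coefficient estimated on the log scale, with the 95% CI formed as exp(β ± 1.96·SE). The sketch below assumes the engagement outcome was modeled on the log scale, which the abstract does not state, and uses a hypothetical coefficient and standard error rather than values from the paper. It also includes a quick consistency check: because such an interval is symmetric on the log scale, the point estimate should sit near the geometric mean of the two bounds.

```python
import math


def rate_ratio_ci(beta: float, se: float, z: float = 1.96):
    """Exponentiate a log-scale coefficient and its Wald interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def ci_midpoint(lower: float, upper: float) -> float:
    """Geometric mean of the CI bounds; equals the RR for a symmetric log-scale interval."""
    return math.sqrt(lower * upper)


# Hypothetical coefficient and standard error, not taken from the paper.
rr, lo, hi = rate_ratio_ci(beta=0.5, se=0.1)
print(round(rr, 2), round(lo, 2), round(hi, 2))

# Consistency check against a reported interval (talking: 95% CI 1.30-2.05).
print(round(ci_midpoint(1.30, 2.05), 2))
```

For the talking feature, the geometric mean of the reported bounds rounds to 1.63, matching the reported RR. The same check is a cheap way to catch transcription errors in any reported estimate.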

TikTok video features such as background setting, young adult presence, talking, emojis, and funny or silly content are associated with substantially higher user engagement. These insights offer valuable guidance for designing compelling videos in vaping prevention campaigns to improve social media user engagement.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12634013/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12634013/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/PMC12634013/full.md

---
Source: https://tomesphere.com/paper/PMC12634013