# HybridSense-LLM: A Structured Multimodal Framework for Large-Language-Model–Based Wellness Prediction from Wearable Sensors with Contextual Self-Reports

**Authors:** Cheng-Huan Yu, Mohammad Masum

PMC · DOI: 10.3390/bioengineering13010120 · Bioengineering · 2026-01-20

## TL;DR

HybridSense-LLM embeds daily statistical features and short waveform segments from wearable sensors into structured prompts for large language models, predicting stress, fatigue, readiness, and sleep quality across seven prompting strategies and three model families.

## Contribution

Introduces HybridSense, a framework that pairs raw wearable signals and Random Forest-ranked statistical descriptors with LLM reasoning to produce interpretable estimates of stress, fatigue, readiness, and sleep quality.

## Key findings

- Zero-shot prompting performs best for fatigue and stress prediction.
- Few-shot prompting improves sleep-quality estimation.
- Combining high-level statistical descriptors with short waveform segments improves readiness prediction.

## Abstract

Wearable sensors generate continuous physiological and behavioral data at a population scale, yet wellness prediction remains limited by noisy measurements, irregular sampling, and subjective outcomes. We introduce HybridSense, a unified framework that integrates raw wearable signals and their statistical descriptors with large language model–based reasoning to produce accurate and interpretable estimates of stress, fatigue, readiness, and sleep quality. Using the PMData dataset, minute-level heart rate and activity logs are transformed into daily statistical features, whose relevance is ranked using a Random Forest model. These features, together with short waveform segments, are embedded into structured prompts and evaluated across seven prompting strategies using three large language model families: OpenAI 4o-mini, Gemini 2.0 Flash, and DeepSeek Chat. Bootstrap analyses demonstrate robust, task-dependent performance. Zero-shot prompting performs best for fatigue and stress, while few-shot prompting improves sleep-quality estimation. HybridSense further enhances readiness prediction by combining high-level descriptors with waveform context, and self-consistency and tree-of-thought prompting stabilize predictions for highly variable targets. All evaluated models exhibit low inference cost and practical latency. These results suggest that prompt-driven large language model reasoning, when paired with interpretable signal features, offers a scalable and transparent approach to wellness prediction from consumer wearable data.
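To make the pipeline described in the abstract concrete, here is a minimal sketch of how minute-level heart-rate samples might be reduced to daily descriptors and placed in a structured prompt alongside a short waveform segment. It is an illustration only, not the authors' implementation: the function names `daily_descriptors` and `build_prompt`, the choice of descriptors, the prompt wording, and the 1-5 rating scale are all assumptions, and the Random Forest ranking step and the call to a language model are left out.

```python
from statistics import mean, pstdev


def daily_descriptors(heart_rate):
    """Summarize one day of minute-level heart-rate samples (bpm).

    The four descriptors are illustrative; the summary does not list
    which statistics the paper actually uses.
    """
    if not heart_rate:
        raise ValueError("heart_rate must contain at least one sample")
    return {
        "hr_mean": round(mean(heart_rate), 1),
        "hr_std": round(pstdev(heart_rate), 1),
        "hr_min": min(heart_rate),
        "hr_max": max(heart_rate),
    }


def build_prompt(descriptors, waveform, target="stress", scale=(1, 5)):
    """Build a structured prompt from descriptors plus a raw segment.

    The layout and the rating scale are hypothetical placeholders.
    """
    feature_lines = "\n".join(
        f"- {name}: {value}" for name, value in descriptors.items()
    )
    segment = ", ".join(str(v) for v in waveform)
    return (
        f"Task: estimate the participant's {target} "
        f"on a {scale[0]}-{scale[1]} scale.\n"
        f"Daily descriptors:\n{feature_lines}\n"
        f"Heart-rate segment (bpm, 1-minute steps): {segment}\n"
        "Answer with a single integer."
    )


# Toy data for illustration only, not taken from PMData.
samples = [60, 62, 64, 70, 80, 76]
features = daily_descriptors(samples)
prompt = build_prompt(features, waveform=samples[-3:])
print(prompt)
```

A zero-shot strategy would send a prompt like this as is, while a few-shot strategy would prepend a handful of labeled days in the same layout before the query.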

## Full-text entities

- **Diseases:** fatigue (MESH:D005221)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12837951/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12837951/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/PMC12837951/full.md

---
Source: https://tomesphere.com/paper/PMC12837951