LifeAgentBench: A Multi-dimensional Benchmark and Agent for Personal Health Assistants in Digital Health

Ye Tian; Zihao Wang; Onat Gungor; Xiaoran Fan; Tajana Rosing

arXiv:2601.13880 · cs.AI · January 21, 2026

Open Access

TL;DR

LifeAgentBench is a benchmark for evaluating large language models on personalized digital health support. The paper uses it to expose current limitations of these models and proposes a multi-step agent, LifeAgent, to improve health reasoning in real-world scenarios.

Contribution

The paper introduces LifeAgentBench, a large-scale benchmark for health reasoning, and proposes LifeAgent, a multi-step agent that enhances LLM performance in digital health tasks.

Findings

01. An evaluation of 11 leading LLMs reveals bottlenecks in their reasoning capabilities, including long-horizon aggregation.

02. LifeAgent outperforms baseline models on complex health reasoning tasks.

03. The benchmark and the agent are publicly available for future research.

Abstract

Personalized digital health support requires long-horizon, cross-dimensional reasoning over heterogeneous lifestyle signals, and recent advances in mobile sensing and large language models (LLMs) make such support increasingly feasible. However, the capabilities of current LLMs in this setting remain unclear due to the lack of systematic benchmarks. In this paper, we introduce LifeAgentBench, a large-scale QA benchmark for long-horizon, cross-dimensional, and multi-user lifestyle health reasoning, containing 22,573 questions spanning from basic retrieval to complex reasoning. We release an extensible benchmark construction pipeline and a standardized evaluation protocol to enable reliable and scalable assessment of LLM-based health assistants. We then systematically evaluate 11 leading LLMs on LifeAgentBench and identify key bottlenecks in long-horizon aggregation and cross-dimensional…

Taxonomy

Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education · Topic Modeling