Temporally Phenotyping GLP-1RA Case Reports with Large Language Models: A Textual Time Series Corpus and Risk Modeling

Sayantan Kumar; Jeremy C. Weiss

arXiv:2604.06197·cs.CL·April 9, 2026

TL;DR

This paper introduces a textual time-series corpus of 136 single-patient case reports involving GLP-1 receptor agonists, evaluates LLM timeline extraction against expert-annotated gold-standard timelines, and demonstrates time-to-event risk modeling of respiratory outcomes on the extracted data.

Contribution

It builds a corpus of clinical timelines from PubMed Open Access case reports, measures how well LLMs recover clinical events and their timings, and applies the extracted timelines to time-to-event risk modeling in diabetes.

Findings

01

GPT5, the best-performing LLM, reached an event coverage of 0.871 and a temporal sequencing score of 0.843.

02

Time-to-event analysis suggested a lower risk of respiratory sequelae among GLP-1RA users (HR = 0.259).

03

Temporal annotations and code will be publicly released.
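
As a reading aid for the hazard ratio in finding 02: the summary says only "time-to-event analysis", so the Cox proportional hazards form below is an assumption, and the symbols $h_1$, $h_0$, and $\beta$ are introduced here for illustration. Under that assumption, the ratio compares the instantaneous event rate in GLP-1RA users with that in non-users:

```latex
\[
\mathrm{HR} \;=\; \frac{h_1(t)}{h_0(t)} \;=\; e^{\beta},
\qquad
\mathrm{HR} = 0.259
\;\Longrightarrow\;
\beta = \ln 0.259 \approx -1.35,
\qquad
1 - \mathrm{HR} = 0.741 .
\]
```

Read this way, the estimate corresponds to roughly a 74% lower hazard of respiratory sequelae at any given time. No confidence interval appears in this summary, so the precision of the estimate cannot be judged from it.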

Abstract

Type 2 diabetes case reports describe complex clinical courses, but their timelines are often expressed in language that is difficult to reuse in longitudinal modeling. To address this gap, we developed a textual time-series corpus of 136 PubMed Open Access single-patient case reports involving glucagon-like peptide 1 receptor agonists, with clinical events associated with their most probable reference times. We evaluated automated LLM timeline extraction against gold-standard timelines annotated by clinical domain experts, assessing how well systems recovered clinical events and their timings. The best-performing LLM produced high event coverage (GPT5; 0.871) and reliable temporal sequencing (GPT5; 0.843) across symptoms, diagnoses, treatments, laboratory tests, and outcomes. As a downstream demonstration, time-to-event analyses in diabetes suggested lower risk of respiratory sequelae…
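
To make the two evaluation quantities concrete, here is a minimal Python sketch of how event coverage and temporal ordering agreement could be computed for one case report. It is not the authors' implementation. It assumes a timeline is a list of `(event_text, time_in_hours)` pairs, that events are matched by exact text after lowercasing and whitespace cleanup (the paper may well use a softer match), and that sequencing is scored as pairwise ordering concordance. The function names and the toy timelines are hypothetical.

```python
from itertools import combinations


def normalize(text):
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(text.lower().split())


def event_coverage(gold, pred):
    """Fraction of gold events that also appear in the predicted timeline.

    Assumption: exact match on normalized text.
    """
    if not gold:
        return 0.0
    pred_events = {normalize(event) for event, _ in pred}
    hits = sum(1 for event, _ in gold if normalize(event) in pred_events)
    return hits / len(gold)


def ordering_concordance(gold, pred):
    """Fraction of matched event pairs whose predicted order agrees with gold.

    Pairs tied in gold time are skipped; pairs tied in predicted time
    count as disagreement.
    """
    pred_time = {normalize(event): t for event, t in pred}
    matched = [
        (t, pred_time[normalize(event)])
        for event, t in gold
        if normalize(event) in pred_time
    ]
    agree = 0
    total = 0
    for (g1, p1), (g2, p2) in combinations(matched, 2):
        if g1 == g2:
            continue
        total += 1
        if p1 != p2 and (g1 < g2) == (p1 < p2):
            agree += 1
    return agree / total if total else 0.0


# Toy timelines (hypothetical); times are hours from a reference point.
gold = [
    ("started semaglutide", 0),
    ("nausea", 336),
    ("abdominal pain", 340),
    ("admitted", 360),
]
pred = [
    ("Started semaglutide", 0),
    ("nausea", 300),
    ("admitted", 280),
]

coverage = event_coverage(gold, pred)
concordance = ordering_concordance(gold, pred)
```

In this toy case the extractor misses one of four gold events and reverses the order of one of the three matched pairs. Averaging such per-report scores over a corpus is one plausible way to arrive at corpus-level figures like those quoted in the abstract.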
