A Dataset and Benchmark for Consumer Healthcare Question Summarization

Abhishek Basu; Deepak Gupta; Dina Demner-Fushman; Shweta Yadav

arXiv:2512.23637·cs.CL·December 30, 2025

TL;DR

This paper introduces CHQ-Sum, a new domain-expert annotated dataset of consumer health questions and summaries, to advance healthcare question summarization research.

Contribution

It provides the first large-scale, domain-expert annotated dataset for consumer healthcare question summarization, enabling better model development.

Findings

01 State-of-the-art models perform variably on the dataset

02 The dataset improves understanding of consumer health questions

03 Benchmark results highlight future research directions

Abstract

The quest for seeking health information has swamped the web with consumers' health-related questions. Generally, consumers use overly descriptive and peripheral information to express their medical condition or other healthcare needs, contributing to the challenges of natural language understanding. One way to address this challenge is to summarize the questions and distill the key information of the original question. Recently, large-scale datasets have significantly propelled the development of several summarization tasks, such as multi-document summarization and dialogue summarization. However, a lack of a domain-expert annotated dataset for the consumer healthcare questions summarization task inhibits the development of an efficient summarization system. To address this issue, we introduce a new dataset, CHQ-Sum, that contains 1507 domain-expert annotated consumer health questions…

Taxonomy

Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Advanced Text Analysis Techniques