CHQ-Summ: A Dataset for Consumer Healthcare Question Summarization
Shweta Yadav, Deepak Gupta, and Dina Demner-Fushman

TL;DR
This paper introduces CHQ-Summ, a new dataset of consumer health questions with summaries, to improve natural language understanding and summarization of health-related social media posts.
Contribution
The paper presents a novel, expert-annotated dataset for consumer health question summarization and benchmarks it with multiple state-of-the-art models.
Findings
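The dataset contains 1507 consumer health questions, each paired with a summary annotated by domain experts.
Multiple state-of-the-art summarization models are benchmarked on the dataset.
Because it is derived from a community question-answering forum, the dataset is a resource for understanding consumer health-related posts on social media.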
Abstract
The search for health information has flooded the web with consumers' health-related questions. Consumers generally use overly descriptive and peripheral information to express their medical conditions or other healthcare needs, which adds to the challenges of natural language understanding. One way to address this challenge is to summarize the questions and distill the key information of the original question. To that end, we introduce a new dataset, CHQ-Summ, that contains 1507 domain-expert annotated consumer health questions and corresponding summaries. The dataset is derived from a community question-answering forum and therefore provides a valuable resource for understanding consumer health-related posts on social media. We benchmark multiple state-of-the-art summarization models on the dataset to show its effectiveness.
Taxonomy
Topics: Topic Modeling · Expert finding and Q&A systems
