Determining Health Utilities through Data Mining of Social Media
Christopher Thompson, Josh Introne, and Clint Young

TL;DR
This paper introduces a novel method using social media data and natural language processing to estimate health utilities and disease severity, reducing reliance on costly patient surveys.
Contribution
It presents a new approach to characterize health utilities by analyzing social media conversations with machine learning, enabling scalable and cost-effective health assessments.
Findings
Successfully distinguished mild from severe diseases using social media data.
Demonstrated potential to predict health utilities where data is lacking.
Enabled estimation of temporal and geographic disease severity patterns.
Abstract
'Health utilities' measure patient preferences for perfect health compared with specific unhealthy states, such as asthma, a fractured hip, or colon cancer. When integrated over time, these estimates are called quality-adjusted life years (QALYs). Until now, characterizing health utilities (HUs) has required detailed patient interviews or written surveys. While reliable and specific, these data remain costly because of the effort needed to locate, enlist, and coordinate participants. Thus the scope, context, and temporality of the diseases examined have remained limited. Now that more than a billion people use social media, we propose a novel strategy: use natural language processing to analyze public online conversations for signals of the severity of medical conditions, and correlate these signals with known HUs using machine learning. In this work, we filter a dataset that originally contained 2 billion tweets for…
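To make the relationship between health utilities and QALYs concrete, the following is a minimal sketch of the standard calculation: each health state contributes its utility multiplied by the time spent in that state. It assumes utilities on the conventional scale from 0 (death) to 1 (perfect health). The function name `qalys` and the example values are hypothetical illustrations and are not taken from the paper.

```python
from typing import Iterable, Tuple


def qalys(states: Iterable[Tuple[float, float]]) -> float:
    """Sum utility * duration over a sequence of health states.

    Each state is a (utility, years) pair. Utilities are assumed to lie
    on the conventional 0 (death) to 1 (perfect health) scale.
    """
    total = 0.0
    for utility, years in states:
        if not 0.0 <= utility <= 1.0:
            raise ValueError(f"utility must be in [0, 1], got {utility}")
        if years < 0:
            raise ValueError(f"duration must be non-negative, got {years}")
        total += utility * years
    return total


# Hypothetical example: 2 years in perfect health, then 4 years in a
# state with utility 0.5 (illustrative numbers only).
example = [(1.0, 2.0), (0.5, 4.0)]
print(qalys(example))  # 1.0 * 2 + 0.5 * 4 = 4.0
```

In this sketch, six calendar years yield four QALYs, which is why a lower utility for a condition translates directly into fewer quality-adjusted years.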
Taxonomy
Topics: Health Literacy and Information Accessibility · Mental Health via Writing · Data-Driven Disease Surveillance
