Automatically Identifying Comparator Groups on Twitter for Digital Epidemiology of Pregnancy Outcomes
Ari Z. Klein, Abeselom Gebreyesus, Graciela Gonzalez-Hernandez

TL;DR
This paper presents a natural language processing pipeline that automatically identifies Twitter users who have reported that their pregnancy reached term and their baby was born at a normal weight, so that comparator groups can be selected for large-scale observational studies of pregnancy outcomes using social media data.
Contribution
The study develops and evaluates a pipeline, built on supervised machine learning, for detecting women on Twitter who have reported a term pregnancy with a normal birth weight, facilitating digital epidemiology research.
Findings
Achieved a user-level F1-score of 0.933 in identifying women who reported that their pregnancy reached term and their baby was born at a normal weight.
At the user level, the pipeline had a precision of 0.947 and a recall of 0.920.
Pipeline enables large-scale identification of comparator groups for pregnancy outcome studies.
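As a quick consistency check on the figures above, the F1-score is the harmonic mean of precision and recall. The short Python sketch below recomputes it from the reported precision (0.947) and recall (0.920); the function name `f1_score` is illustrative and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# User-level values reported for the pipeline.
reported_precision = 0.947
reported_recall = 0.920

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 3))  # 0.933
```

The recomputed value matches the reported user-level F1-score of 0.933.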
Abstract
Despite the prevalence of adverse pregnancy outcomes such as miscarriage, stillbirth, birth defects, and preterm birth, their causes are largely unknown. We seek to advance the use of social media for observational studies of pregnancy outcomes by developing a natural language processing pipeline for automatically identifying users from which to select comparator groups on Twitter. We annotated 2361 tweets by users who have announced their pregnancy on Twitter, which were used to train and evaluate supervised machine learning algorithms as a basis for automatically detecting women who have reported that their pregnancy had reached term and their baby was born at a normal weight. Upon further processing the tweet-level predictions of a majority voting-based ensemble classifier, the pipeline achieved a user-level F1-score of 0.933, with a precision of 0.947 and a recall of 0.920. Our…
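The abstract mentions a majority voting-based ensemble whose tweet-level predictions are processed further to produce user-level results. The Python sketch below illustrates the general idea only. The abstract does not state the aggregation rule, so the rule used here (flag a user when at least one of their tweets is voted positive) is an assumption, and the names `majority_vote` and `user_level_predictions` and the sample data are hypothetical.

```python
from collections import defaultdict


def majority_vote(votes):
    """Return 1 if more than half of the binary classifier votes are 1, else 0."""
    return 1 if sum(votes) * 2 > len(votes) else 0


def user_level_predictions(tweet_votes):
    """Aggregate tweet-level ensemble votes into one label per user.

    tweet_votes: iterable of (user_id, [0/1 vote from each classifier]).
    Assumed rule (not from the paper): a user is positive if any of
    their tweets receives a positive majority vote.
    """
    users = defaultdict(int)
    for user_id, votes in tweet_votes:
        users[user_id] = max(users[user_id], majority_vote(votes))
    return dict(users)


# Hypothetical votes from three classifiers on four tweets by two users.
sample = [
    ("user_a", [1, 1, 0]),  # majority positive
    ("user_a", [0, 0, 0]),  # majority negative
    ("user_b", [0, 1, 0]),  # majority negative
    ("user_b", [0, 0, 1]),  # majority negative
]
print(user_level_predictions(sample))  # {'user_a': 1, 'user_b': 0}
```

Other aggregation rules, such as requiring several positive tweets per user, would trade recall for precision; which rule the authors used is not recoverable from the truncated abstract.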
Taxonomy
Topics: Pregnancy and preeclampsia studies · Global Maternal and Child Health · Gestational Diabetes Research and Management
