Predicting pregnancy using large-scale data from a women's health tracking mobile application
Bo Liu, Shuyang Shi, Yongshang Wu, Daniel Thomas, Laura Symul, Emma Pierson, Jure Leskovec

TL;DR
This study shows that mobile health tracking data can be used to predict a woman's probability of becoming pregnant: the predicted probabilities meaningfully stratify women, and interpretability analyses recover trends consistent with prior fertility research, suggesting broad potential for women's health applications.
Contribution
Developed and evaluated a logistic regression model and three LSTM models for pregnancy prediction using large-scale mobile app data, and applied interpretability techniques whose results align with fertility science.
Findings
Women in the top 10% of predicted probabilities have an 89% chance of becoming pregnant over 6 menstrual cycles
Models significantly stratify pregnancy likelihood across populations
Interpretability techniques reveal trends consistent with fertility research
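The stratification finding can be illustrated with a small sketch: rank women by predicted probability, split them into equal-sized groups, and compare the observed pregnancy rate in each group. This is not the authors' evaluation code; the function name `rates_by_predicted_group` and the toy data are hypothetical and chosen only to show the idea.

```python
def rates_by_predicted_group(probs, outcomes, n_bins=10):
    """Observed outcome rate per group, ordered from highest to lowest predicted probability.

    probs    -- predicted probabilities, one per woman (hypothetical model output)
    outcomes -- 1 if a pregnancy was observed in the follow-up window, else 0
    n_bins   -- number of equal-sized groups (10 gives deciles)
    """
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    if len(probs) < n_bins:
        raise ValueError("need at least one sample per group")

    # Rank samples from most to least likely according to the model.
    ranked = sorted(zip(probs, outcomes), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)

    rates = []
    for b in range(n_bins):
        # Integer arithmetic keeps groups contiguous and as even as possible.
        start = b * n // n_bins
        end = (b + 1) * n // n_bins
        group = ranked[start:end]
        rates.append(sum(outcome for _, outcome in group) / len(group))
    return rates


# Toy data (made up for illustration, not taken from the paper).
toy_probs = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
toy_outcomes = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
toy_rates = rates_by_predicted_group(toy_probs, toy_outcomes, n_bins=5)
```

With `n_bins=10`, the first entry of the returned list is the observed rate in the top 10% of predictions, which is the kind of quantity the first finding reports. A model that stratifies well shows a large gap between the first and last entries.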
Abstract
Predicting pregnancy has been a fundamental problem in women's health for more than 50 years. Previous datasets have been collected via carefully curated medical studies, but the recent growth of women's health tracking mobile apps offers potential for reaching a much broader population. However, the feasibility of predicting pregnancy from mobile health tracking data is unclear. Here we develop four models -- a logistic regression model and 3 LSTM models -- to predict a woman's probability of becoming pregnant using data from a women's health tracking app, Clue by BioWink GmbH. Evaluating our models on a dataset of 79 million logs from 65,276 women with ground truth pregnancy test data, we show that our predicted pregnancy probabilities meaningfully stratify women: women in the top 10% of predicted probabilities have an 89% chance of becoming pregnant over 6 menstrual cycles, as…
Taxonomy
Topics: Pregnancy and preeclampsia studies · Demographic modeling and climate adaptation · Machine Learning in Healthcare
Methods: Logistic Regression
