Data Augmentation for Classification of Negative Pregnancy Outcomes in Imbalanced Data

Md Badsha Biswas

arXiv:2512.22732 · cs.CL · December 30, 2025

Open Access

TL;DR

This paper presents a novel NLP-based data augmentation approach using social media data to improve classification of negative pregnancy outcomes in imbalanced datasets, aiding epidemiological research.

Contribution

It introduces a new NLP pipeline for identifying and categorizing pregnancy experiences from social media, addressing data imbalance and noise challenges.

Findings

01. Effective identification of women sharing pregnancy experiences

02. Enhanced dataset quality for negative pregnancy outcomes

03. Potential for improved epidemiological analysis

Abstract

Infant mortality remains a significant public health concern in the United States, with birth defects identified as a leading cause. Despite ongoing efforts to understand the causes of negative pregnancy outcomes such as miscarriage, stillbirth, birth defects, and premature birth, there is still a need for more comprehensive research and strategies for intervention. This paper introduces a novel approach that uses publicly available social media data, especially from platforms like Twitter, to enhance current datasets for studying negative pregnancy outcomes through observational research. The inherent challenges in using social media data, including imbalance, noise, and lack of structure, necessitate robust preprocessing techniques and data augmentation strategies. By constructing a natural language processing (NLP) pipeline, we aim to automatically identify women sharing their…
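
The abstract names noise and class imbalance as the main obstacles but, in the portion shown here, does not say which preprocessing or augmentation techniques the pipeline uses. As a rough illustration only, the sketch below assumes two common stand-ins: regex-based cleaning of posts and random oversampling of the minority class. The function names `clean_post` and `oversample_minority`, the label values, and the toy posts are hypothetical and are not taken from the paper.

```python
import random
import re
from collections import Counter

# Assumed noise sources in social media text: links and @mentions.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def clean_post(text):
    """Strip URLs and @mentions, lowercase, and collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return " ".join(text.lower().split())


def oversample_minority(texts, labels, seed=0):
    """Randomly duplicate examples until every class matches the largest one.

    This is plain random oversampling, used here as a stand-in for
    whatever augmentation strategy the paper actually applies.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_texts, out_labels = list(texts), list(labels)
    for label, n in counts.items():
        pool = [t for t, l in zip(texts, labels) if l == label]
        for _ in range(target - n):
            out_texts.append(rng.choice(pool))
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data: label 1 marks the rare class, label 0 the common one.
raw_posts = [
    "Example post A https://example.com/a",
    "@someone example post B",
    "Example   post C",
    "Example post D",
    "Example post E (rare class)",
]
raw_labels = [0, 0, 0, 0, 1]

cleaned = [clean_post(p) for p in raw_posts]
balanced_texts, balanced_labels = oversample_minority(cleaned, raw_labels)
balanced_counts = Counter(balanced_labels)
```

Random oversampling only repeats existing minority examples, so it balances class counts without adding new wording; a text-level augmentation method would be needed to add lexical variety.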

Taxonomy

Topics: Pregnancy and Medication Impact · Mental Health via Writing · Social Media in Health Education