Academic case reports lack diversity: Assessing the presence and diversity of sociodemographic and behavioral factors related to Post COVID-19 Condition
Juan Andres Medina Florez, Shaina Raza, Rashida Lynn, Zahra Shakeri, Brendan T. Smith, Elham Dolatabadi

TL;DR
This study develops an NLP framework to analyze sociodemographic and behavioral factors in Post COVID-19 Condition case reports, revealing disparities and underrepresented groups to improve understanding and care.
Contribution
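It introduces an NLP pipeline that combines named entity recognition (NER), natural language inference (NLI), and data augmentation to analyze social determinants of health (SDOH) in a corpus of over 7,000 PCC case reports, highlighting disparities and variability in how SDOH entities are represented.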
Findings
Encoder-only BERT models outperform RNNs in NER tasks.
Significant underrepresentation of race and housing status in reports.
High entailment for violence and insurance attributes, high contradiction for gender and marital status.
Abstract
Understanding the prevalence, disparities, and symptom variations of Post COVID-19 Condition (PCC) for vulnerable populations is crucial to improving care and addressing intersecting inequities. This study aims to develop a comprehensive framework for integrating social determinants of health (SDOH) into PCC research by leveraging NLP techniques to analyze disparities and variations in SDOH representation within PCC case reports. Following construction of a PCC Case Report Corpus, comprising over 7,000 case reports from the LitCOVID repository, a subset of 709 reports was annotated with 26 core SDOH-related entity types using pre-trained named entity recognition (NER) models, human review, and data augmentation to improve the quality, diversity, and representation of entity types. An NLP pipeline integrating NER, natural language inference (NLI), trigram and frequency analyses was developed…
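To make the trigram and frequency analyses mentioned in the abstract more concrete, here is a minimal sketch of word-trigram counting over text snippets. It is an illustration only and not the authors' implementation: the function name `trigram_counts`, the simple regex tokenizer, and the example snippets are all assumptions introduced here.

```python
from collections import Counter
import re


def trigram_counts(texts):
    """Count word trigrams across a list of text snippets (hypothetical helper)."""
    counts = Counter()
    for text in texts:
        # Naive tokenizer (an assumption): lowercase alphanumeric runs and apostrophes.
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        # Zip three offset views of the token list to form consecutive trigrams.
        counts.update(zip(tokens, tokens[1:], tokens[2:]))
    return counts


# Made-up snippets for illustration; not taken from the PCC Case Report Corpus.
snippets = [
    "The patient lives alone and has no health insurance.",
    "She lives alone and has no family support.",
]

counts = trigram_counts(snippets)
for trigram, freq in counts.most_common(3):
    print(" ".join(trigram), freq)
```

In a pipeline like the one described, such counts would presumably be computed over text surrounding extracted SDOH entities rather than over raw sentences, but the abstract does not specify this.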
Taxonomy
Topics: COVID-19 Pandemic Impacts
Methods: Attention Is All You Need · Layer Normalization · Dense Connections · Adam · Softmax · Linear Warmup With Linear Decay · Residual Connection · Dropout · WordPiece
