Task formulation for Extracting Social Determinants of Health from Clinical Narratives
Manabu Torii, Ian M. Finn, Son Doan, Paul Wang, Elly W. Yang, Daniel S. Zisook

TL;DR
This paper compares three NLP systems developed for extracting social determinants of health (SDOH) from clinical narratives in the 2022 n2c2 NLP Challenge, highlighting how each formulates the task, how it performed, and its practical trade-offs.
Contribution
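It introduces and evaluates three distinct task formulations for SDOH information extraction from clinical text: independent identification of target items with machine learning classifiers, document-level structured output generation with a large language model, and machine-learned candidate phrase extraction combined with hand-crafted relation rules.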
Findings
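The three systems achieved F1 scores of 0.884, 0.831, and 0.663 in Subtask A of the 2022 n2c2 NLP Challenge, ranking third, seventh, and eighth among the 15 participating teams.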
Independent phrase extraction outperformed relation-based methods.
Large language models provided a good balance of performance and flexibility.
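For reference, an F1 score such as those reported above is the harmonic mean of precision and recall. The following is a minimal sketch, assuming exact-match comparison between sets of extracted items; the Challenge's actual matching criteria are not described in this summary, and the function name and example tuples are hypothetical.

```python
def f1_score(gold: set, predicted: set) -> float:
    """F1 over exact-match items; returns 0.0 when there are no true positives."""
    true_pos = len(gold & predicted)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)  # share of predictions that are correct
    recall = true_pos / len(gold)          # share of gold items that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical (document id, SDOH type, text span) tuples, for illustration only.
gold = {
    ("doc1", "Tobacco", "smokes"),
    ("doc1", "Alcohol", "drinks"),
    ("doc2", "Employment", "retired"),
}
predicted = {
    ("doc1", "Tobacco", "smokes"),
    ("doc2", "Employment", "retired"),
    ("doc2", "Alcohol", "wine"),
}
score = f1_score(gold, predicted)
```

In this toy case two of three predictions match two of three gold items, so precision, recall, and F1 all equal 2/3.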
Abstract
Objective: The 2022 n2c2 NLP Challenge posed the task of identifying social determinants of health (SDOH) in clinical narratives. We present three systems that we developed for the Challenge and discuss the distinctive task formulation used in each.
Materials and Methods: The first system identifies target pieces of information independently using machine learning classifiers. The second system uses a large language model (LLM) to extract complete structured outputs per document. The third system extracts candidate phrases using machine learning and identifies target relations with hand-crafted rules.
Results: The three systems achieved F1 scores of 0.884, 0.831, and 0.663 in Subtask A of the Challenge, ranking third, seventh, and eighth among the 15 participating teams. The review of the extraction results from our systems reveals characteristics of each…
Taxonomy
Topics: Topic Modeling · Health Sciences Research and Education
