Improving Drug Identification in Overdose Death Surveillance using Large Language Models

Arthur J. Funnell; Panayiotis Petousis; Fabrice Harel-Canada; Ruby Romero; Alex A. T. Bui; Adam Koncsol; Hritika Chaturvedi; Chelsea Shover; David Goodman-Meza

arXiv:2507.12679 · cs.CL · July 18, 2025

TL;DR

This study shows that fine-tuned clinical language models, especially BioClinicalBERT, can accurately classify drug involvement in overdose death reports, with macro F1 of at least 0.998 on internal testing and 0.966 on external validation, which could make overdose surveillance faster and more reliable than traditional methods.

Contribution

The paper applies fine-tuned clinical language models to drug identification in overdose death reports and reports higher accuracy and robustness than traditional machine learning and general NLP models.

Findings

01. BioClinicalBERT achieved macro F1 >= 0.998 on internal testing.

02. External validation on 3,335 records from 2023-2024 yielded macro F1 = 0.966, supporting the models' robustness.

03. The fine-tuned clinical language models outperformed traditional machine learning and general NLP models.
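The macro F1 scores reported in the findings average per-label F1 so that each drug counts equally regardless of how often it appears. The sketch below shows one common way to compute it for multi-label predictions. It is illustrative only: the drug labels, the toy records, and the function names `per_label_f1` and `macro_f1` are assumptions made for this example, and the paper's exact evaluation protocol may differ.

```python
from typing import Sequence, Set


def per_label_f1(
    y_true: Sequence[Set[str]], y_pred: Sequence[Set[str]], label: str
) -> float:
    """F1 for one label, treating it as a binary present/absent decision."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if label in t and label in p)
    fp = sum(1 for t, p in pairs if label not in t and label in p)
    fn = sum(1 for t, p in pairs if label in t and label not in p)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0.0 when the label never occurs
    # in either the reference or the predictions.
    return 2 * tp / denom if denom else 0.0


def macro_f1(
    y_true: Sequence[Set[str]], y_pred: Sequence[Set[str]], labels: Sequence[str]
) -> float:
    """Unweighted mean of per-label F1 scores."""
    return sum(per_label_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical labels and records, not taken from the paper's datasets.
labels = ["fentanyl", "heroin", "cocaine"]
y_true = [{"fentanyl"}, {"fentanyl", "cocaine"}, {"heroin"}, set()]
y_pred = [{"fentanyl"}, {"fentanyl"}, {"heroin"}, {"cocaine"}]

score = macro_f1(y_true, y_pred, labels)
print(f"macro F1 = {score:.3f}")
```

Because every label contributes equally to the mean, a rarely mentioned drug that the model misses lowers macro F1 as much as a common one would, which is why the metric suits surveillance tasks with imbalanced drug frequencies.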

Abstract

The rising rate of drug-related deaths in the United States, largely driven by fentanyl, requires timely and accurate surveillance. However, critical overdose data are often buried in free-text coroner reports, leading to delays and information loss when coded into International Classification of Diseases, 10th Revision (ICD-10) classifications. Natural language processing (NLP) models may automate and enhance overdose surveillance, but prior applications have been limited. A dataset of 35,433 death records from multiple U.S. jurisdictions in 2020 was used for model training and internal testing. External validation was conducted using a novel separate dataset of 3,335 records from 2023-2024. Multiple NLP approaches were evaluated for classifying specific drug involvement from unstructured death certificate text. These included traditional single- and multi-label classifiers, as well as fine-tuned…
