Validation of a Small Language Model for DSM-5 Substance Category Classification in Child Welfare Records
Brian E. Perron, Dragan Stoll, Bryan G. Victor, Zia Qia, Andreas Jud, Joseph P. Ryan

TL;DR
This study validates a small, locally hosted language model's ability to accurately classify specific DSM-5 substance categories in child welfare investigation narratives, demonstrating high reliability and precision for most categories.
Contribution
It demonstrates that a 20-billion-parameter local language model can classify multiple substance types from child welfare texts, extending binary detection to detailed multi-label classification.
Findings
High inter-method agreement (kappa 0.94-1.00) for the alcohol, cannabis, opioid, stimulant, and sedative categories
Classification precision ranged from 92% to 100% for most categories
Test-retest reliability ranged from 92.1% to 99.1% across categories
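The three statistics reported above (kappa, precision, and test-retest agreement) can each be computed per category from paired binary labels. The following is a minimal sketch, assuming each category is coded 0/1 per record; the function names and the toy data are illustrative and are not taken from the paper, which may have used different software or estimators.

```python
from typing import Sequence


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two binary (0/1) label sequences of equal length."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of records on which the two methods agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each method's marginal rate of positive labels.
    p_a1 = sum(a) / n
    p_b1 = sum(b) / n
    expected = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
    if expected == 1.0:
        return 1.0  # both methods assigned the same single label to every record
    return (observed - expected) / (1 - expected)


def precision(predicted: Sequence[int], truth: Sequence[int]) -> float:
    """Share of positive predictions that are confirmed by the reference labels."""
    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    if tp + fp == 0:
        return float("nan")  # undefined when nothing was predicted positive
    return tp / (tp + fp)


def percent_agreement(run1: Sequence[int], run2: Sequence[int]) -> float:
    """Test-retest agreement: percentage of records labeled identically in two runs."""
    if len(run1) != len(run2) or len(run1) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(run1, run2)) / len(run1)


# Toy labels for one category (hypothetical, not study data).
method_a = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
method_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]

kappa = cohen_kappa(method_a, method_b)
prec = precision(method_a, method_b)
retest = percent_agreement(method_a, method_b)
```

In a multi-label setting, these functions would be applied once per substance category, giving one kappa, one precision, and one agreement figure for each.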
Abstract
Background: Recent studies have demonstrated that large language models (LLMs) can perform binary classification tasks on child welfare narratives, detecting the presence or absence of constructs such as substance-related problems, domestic violence, and firearms involvement. Whether smaller, locally deployable models can move beyond binary detection to classify specific substance types from these narratives remains untested.
Objective: To validate a locally hosted LLM classifier for identifying specific substance types aligned with DSM-5 categories in child welfare investigation narratives.
Methods: A locally hosted 20-billion-parameter LLM classified child maltreatment investigation narratives from a Midwestern U.S. state. Records previously identified as containing substance-related problems were passed to a second classification stage targeting seven DSM-5 substance categories.…
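The two-stage design described in the Methods (a binary substance screen followed by category-level classification) can be sketched as a gated multi-label function. This is an illustrative outline only: the two callables stand in for the LLM calls, the keyword stubs are hypothetical placeholders used solely to make the example run, and the category list contains only the five categories named in the Findings, since the full set of seven is not listed in this summary.

```python
from typing import Callable, Dict, List

# Only the five categories named in the Findings; the remaining two of the
# seven DSM-5 categories are not listed in this summary and are omitted here.
CATEGORIES: List[str] = ["alcohol", "cannabis", "opioid", "stimulant", "sedative"]


def classify_record(
    narrative: str,
    detect_substance: Callable[[str], bool],
    detect_category: Callable[[str, str], bool],
) -> Dict[str, bool]:
    """Stage 1 gates stage 2: categories are assessed only for flagged records."""
    if not detect_substance(narrative):
        return {category: False for category in CATEGORIES}
    # Multi-label: each category is decided independently, so one record
    # may be positive for several substances at once.
    return {category: bool(detect_category(narrative, category)) for category in CATEGORIES}


# Hypothetical keyword stubs standing in for the model calls (assumption).
def toy_stage1(narrative: str) -> bool:
    return any(category in narrative.lower() for category in CATEGORIES)


def toy_stage2(narrative: str, category: str) -> bool:
    return category in narrative.lower()


flagged = classify_record(
    "Caregiver reported alcohol and cannabis use in the home.", toy_stage1, toy_stage2
)
unflagged = classify_record("No concerns were noted during the visit.", toy_stage1, toy_stage2)
```

Passing the classifiers in as callables keeps the gating logic separate from whichever model performs each stage, so the same function works with a local model client or with test stubs.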
Taxonomy
Topics: Child Abuse and Trauma · Mental Health via Writing · Prenatal Substance Exposure Effects
