RaX-Crash: A Resource Efficient and Explainable Small Model Pipeline with an Application to City Scale Injury Severity Prediction
Di Zhu, Chen Xie, Ziwei Wang, Haoyun Zhang

TL;DR
RaX-Crash is a resource-efficient, explainable small model pipeline that predicts injury severity in NYC vehicle collisions with about 78% accuracy, combining tree-based models with interpretability tools to support city-scale injury analysis.
Contribution
The paper introduces RaX-Crash, a resource-efficient, explainable small model pipeline that outperforms locally deployed small language models in injury severity prediction and offers insights into key vulnerability factors.
Findings
XGBoost and Random Forest achieve accuracies of 0.7828 and 0.7794 on a temporally held-out test set, outperforming small language models (0.594 and 0.496).
Class weighting improves fatal injury recall with minimal accuracy loss.
SHAP analysis identifies vulnerability factors, timing, and location as key severity drivers.
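The class-weighting finding above can be illustrated with a minimal sketch. The summary does not say which weighting scheme the authors used; this example assumes the common inverse-frequency heuristic, n_samples / (n_classes * class_count), which is the formula behind scikit-learn's `class_weight="balanced"` option. The label names and counts are hypothetical toy values, not figures from the NYC dataset.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count_of_class).

    Assumption: this heuristic stands in for the unspecified weighting
    scheme in the paper; rare classes receive larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` cases that were predicted as `positive`."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


# Hypothetical, heavily imbalanced toy labels (not real data).
labels = ["none"] * 90 + ["injury"] * 9 + ["fatal"] * 1
weights = balanced_class_weights(labels)

# Recall on the rare class for a toy set of predictions.
fatal_recall = recall(["fatal", "fatal", "none"], ["fatal", "none", "none"], "fatal")
```

Weights like these are typically passed to a tree ensemble as per-class or per-sample weights, so that a missed fatal case costs more during training than a missed majority-class case. Fatal recall, rather than overall accuracy, is then the metric to watch.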
Abstract
New York City reports over one hundred thousand motor vehicle collisions each year, creating a substantial injury and public health burden. We present RaX-Crash, a resource-efficient and explainable small model pipeline for structured injury severity prediction on the official NYC Motor Vehicle Collisions dataset. RaX-Crash integrates three linked tables with tens of millions of records, builds a unified feature schema in partitioned storage, and trains compact tree-based ensembles (Random Forest and XGBoost) on engineered tabular features, which are compared against locally deployed small language models (SLMs) prompted with textual summaries. On a temporally held-out test set, XGBoost and Random Forest achieve accuracies of 0.7828 and 0.7794, clearly outperforming SLMs (0.594 and 0.496); class imbalance analysis shows that simple class weighting improves fatal recall with modest…
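As a rough illustration of the temporally held-out evaluation mentioned in the abstract, the sketch below splits records by date so that every test record occurs on or after a cutoff and every training record occurs before it. The field name `crash_date`, the cutoff date, and the toy records are all hypothetical; the abstract does not state the actual split boundary or schema.

```python
from datetime import date


def temporal_split(records, cutoff, date_key="crash_date"):
    """Split records into (train, test) by date.

    Records strictly before `cutoff` go to train; records on or after it
    go to test. `date_key` is a hypothetical field name.
    """
    train = [r for r in records if r[date_key] < cutoff]
    test = [r for r in records if r[date_key] >= cutoff]
    return train, test


# Hypothetical toy records (not real collision data).
records = [
    {"crash_date": date(2021, 3, 1), "severity": "none"},
    {"crash_date": date(2022, 7, 15), "severity": "injury"},
    {"crash_date": date(2023, 1, 1), "severity": "none"},
    {"crash_date": date(2023, 9, 30), "severity": "fatal"},
]

train, test = temporal_split(records, cutoff=date(2023, 1, 1))
```

Splitting by time instead of at random keeps future collisions out of the training set, so the reported accuracy better reflects how the model would behave on new data.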
Taxonomy
Topics: Traffic and Road Safety · Injury Epidemiology and Prevention · Autonomous Vehicle Technology and Safety
