CrashSage: A Large Language Model-Centered Framework for Contextual and Interpretable Traffic Crash Analysis
Hao Zhen, Jidong J. Yang

TL;DR
CrashSage is an LLM-based framework that converts tabular crash data into structured narratives, improving both the interpretability and the accuracy of traffic crash analysis.
Contribution
The paper presents an LLM-centered approach that combines tabular-to-text data transformation, context-aware data augmentation, fine-tuning, and gradient-based explainability to improve crash severity inference.
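To make the data transformation step concrete, here is a minimal sketch of how one tabular crash record might be rendered as a short narrative. The field names, sentence templates, and the function `crash_record_to_narrative` are all hypothetical and chosen for illustration; the paper's actual schema and text format are not shown on this page.

```python
# Hypothetical mapping from column names to sentence templates.
# These fields are illustrative assumptions, not the paper's schema.
FIELD_TEMPLATES = {
    "num_vehicles": "The crash involved {} vehicle(s).",
    "weather": "Weather conditions were {}.",
    "road_surface": "The road surface was {}.",
    "lighting": "Lighting was {}.",
}


def crash_record_to_narrative(record):
    """Render one tabular crash record as a short text narrative.

    Fields that are missing or empty are skipped, so no sentence is
    generated for information the record does not contain.
    """
    sentences = []
    for field, template in FIELD_TEMPLATES.items():
        value = record.get(field)
        if value is None or value == "":
            continue
        sentences.append(template.format(value))
    return " ".join(sentences)


# Example record with one empty field (road_surface).
example = {"num_vehicles": 2, "weather": "rainy", "road_surface": "", "lighting": "dark"}
narrative = crash_record_to_narrative(example)
print(narrative)
```

Skipping empty fields, rather than emitting a placeholder such as "unknown", is one possible design choice; it keeps the narrative limited to what the record actually states.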
Findings
LLaMA3-8B outperforms baseline models in crash severity prediction.
Context-aware data augmentation improves narrative quality.
Gradient-based explainability reveals key risk factors.
Abstract
Road crashes claim over 1.3 million lives annually worldwide and incur global economic losses exceeding $1.8 trillion. Such profound societal and financial impacts underscore the urgent need for road safety research that uncovers crash mechanisms and delivers actionable insights. Conventional statistical models and tree ensemble approaches typically rely on structured crash data, overlooking contextual nuances and struggling to capture complex relationships and underlying semantics. Moreover, these approaches tend to incur significant information loss, particularly in narrative elements related to multi-vehicle interactions, crash progression, and rare event characteristics. This study presents CrashSage, a novel Large Language Model (LLM)-centered framework designed to advance crash analysis and modeling through four key innovations. First, we introduce a tabular-to-text…
Taxonomy
Methods: Balanced Selection
