CrashChat: A Multimodal Large Language Model for Multitask Traffic Crash Video Analysis
Kaidi Liang, Ke Li, Xianbiao Hu, Ruwen Qin

TL;DR
CrashChat is a multimodal large language model for traffic crash video analysis. It handles multiple tasks, including crash recognition, localization, and description, within a single unified framework aimed at traffic safety research.
Contribution
The paper introduces CrashChat, a multimodal LLM trained with a multitask learning strategy. The authors report improved performance across crash analysis tasks and results that exceed those of existing models on public datasets.
Findings
Near-perfect accuracy in crash recognition
176% improvement in crash localization
40% improvement in pre-crash localization
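The percentage gains above are relative improvements over a baseline, so a 176% improvement means the new score is 2.76 times the baseline score. The sketch below illustrates that arithmetic together with temporal intersection-over-union (IoU), a common way to score temporal localization. This summary does not say which metric the paper uses, so the choice of IoU, the function names, and every number here are assumptions for illustration only, not values from the paper.

```python
def temporal_iou(pred, truth):
    """Intersection over union of two (start, end) intervals, in seconds."""
    (ps, pe), (ts, te) = pred, truth
    inter = max(0.0, min(pe, te) - max(ps, ts))   # overlap length, 0 if disjoint
    union = (pe - ps) + (te - ts) - inter          # combined covered length
    return inter / union if union > 0 else 0.0


def relative_improvement(baseline, new):
    """Percentage gain of `new` over `baseline`."""
    return 100.0 * (new - baseline) / baseline


# Hypothetical example: predicted crash window vs. annotated window.
iou = temporal_iou((2.0, 5.0), (3.0, 6.0))

# Hypothetical scores chosen so the gain is about 176%.
gain = relative_improvement(0.25, 0.69)

print(f"temporal IoU: {iou:.2f}")
print(f"relative improvement: {gain:.0f}%")
```

Under this reading, a large relative gain can still correspond to a modest absolute score if the baseline was low, so the baseline value matters when interpreting the findings.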
Abstract
Automating crash video analysis is essential to leverage the growing availability of driving video data for traffic safety research and accountability attribution in autonomous driving. Crash video analysis is a challenging multitask problem due to the complex spatiotemporal dynamics of crash events in video data and the diverse analytical requirements involved. It requires capabilities spanning crash recognition, temporal grounding, and high-level video understanding. Existing models, however, cannot perform all these tasks within a unified framework, and effective training strategies for such models remain underexplored. To fill these gaps, this paper proposes CrashChat, a multimodal large language model (MLLM) for multitask traffic crash analysis, built upon VideoLLaMA3. CrashChat acquires domain-specific knowledge through instruction fine-tuning and employs a novel multitask…
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Domain Adaptation and Few-Shot Learning · Traffic and Road Safety
