HazardNet: A Small-Scale Vision Language Model for Real-Time Traffic Safety Detection at Edge Devices

Mohammad Abu Tami, Mohammed Elhenawy, and Huthaifa I. Ashqar

arXiv:2502.20572 · cs.CV · March 3, 2025

TL;DR

HazardNet is a compact vision-language model fine-tuned on a new VQA dataset, HazardQA, for real-time traffic safety detection on edge devices. It performs comparably to much larger models such as GPT-4o, and in some cases better, while remaining small enough for efficient edge inference.

Contribution

This work introduces HazardNet, a small-scale vision-language model optimized for edge deployment, and HazardQA, a specialized dataset for safety-critical traffic scenarios.

Findings

01

HazardNet achieved up to 89% improvement in F1-Score over the base model.

02

HazardNet performs comparably to larger models such as GPT-4o and, in some cases, outperforms them by up to 6%.

03

The model enables real-time traffic safety detection on resource-constrained edge devices.
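
For readers unfamiliar with the metric behind the first finding, the sketch below shows how an F1-Score and a percentage improvement over a baseline are typically computed. It assumes the reported improvement is relative to the base model's score, which this summary does not state. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the score is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def relative_improvement(new: float, base: float) -> float:
    """Percentage change of `new` relative to `base`."""
    if base == 0:
        raise ValueError("base score must be non-zero")
    return 100.0 * (new - base) / base


# Illustrative counts only (hypothetical, not results from the paper).
base_f1 = f1_score(tp=40, fp=30, fn=30)    # base model before fine-tuning
tuned_f1 = f1_score(tp=80, fp=10, fn=10)   # fine-tuned model
gain = relative_improvement(tuned_f1, base_f1)

print(f"base F1 = {base_f1:.3f}, tuned F1 = {tuned_f1:.3f}, gain = {gain:.1f}%")
```

A relative measure like this grows quickly when the baseline score is low, so a large percentage gain is best read alongside the absolute F1 values.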

Abstract

Traffic safety remains a vital concern in contemporary urban settings, intensified by the increase of vehicles and the complicated nature of road networks. Traditional safety-critical event detection systems predominantly rely on sensor-based approaches and conventional machine learning algorithms, necessitating extensive data collection and complex training processes to adhere to traffic safety regulations. This paper introduces HazardNet, a small-scale Vision Language Model designed to enhance traffic safety by leveraging the reasoning capabilities of advanced language and vision models. We built HazardNet by fine-tuning the pre-trained Qwen2-VL-2B model, chosen for its superior performance among open-source alternatives and its compact size of two billion parameters. This helps to facilitate deployment on edge devices with efficient inference throughput. In addition, we present…
