V-RoAst: Visual Road Assessment. Can VLM be a Road Safety Assessor Using the iRAP Standard?

Natchapon Jongwiriyanurak; Zichao Zeng; June Moh Goo; Xinglei Wang; Ilya Ilyankou; Kerkritt Sriroongvikrai; Nicola Christie; Meihui Wang; Huanfa Chen; James Haworth

arXiv:2408.10872·cs.CV·May 7, 2026

TL;DR

This paper explores using Vision-Language Models for zero-shot road safety assessment, introducing a new dataset and benchmarking their performance against traditional methods for low-cost infrastructure risk analysis.

Contribution

It presents the first open-source dataset from ThaiRAP and evaluates VLMs as flexible, zero-shot tools for road safety classification without retraining.

Findings

1. VLMs generalise well to unseen safety classes
2. They outperform traditional CNN baselines in zero-shot settings
3. Code and dataset are publicly available for further research

Abstract

Road safety assessments are critical yet costly, especially in Low- and Middle-Income Countries (LMICs), where most roads remain unrated. Traditional methods require expert annotation and training data, while supervised learning-based approaches struggle to generalise across regions. In this paper, we introduce V-RoAst, a zero-shot Visual Question Answering (VQA) framework using Vision-Language Models (VLMs) to classify road safety attributes defined by the iRAP standard. We introduce the first open-source dataset from ThaiRAP, consisting of over 2,000 curated street-level images from Thailand annotated for this task. We evaluate Gemini-1.5-flash and GPT-4o-mini on this dataset and benchmark their performance against VGGNet and ResNet baselines. While VLMs underperform on spatial awareness, they generalise well to unseen classes and offer flexible prompt-based reasoning without…
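As a rough illustration of what zero-shot, prompt-based attribute classification can look like, the sketch below builds one multiple-choice question per road attribute and maps the model's reply back to a class label. It is not the authors' implementation: the attribute names, the class labels, and the `ask_vlm` callable (a stand-in for whatever VLM client is used) are all hypothetical and do not reproduce the iRAP coding or the ThaiRAP schema.

```python
import re
from typing import Callable, Dict, List, Optional

# Hypothetical attributes and class labels, for illustration only.
# They are NOT the actual iRAP attribute codes or the ThaiRAP schema.
ATTRIBUTES: Dict[str, List[str]] = {
    "median_type": ["no median", "painted centre line", "physical barrier"],
    "street_lighting": ["not present", "present"],
}


def build_prompt(attribute: str, classes: List[str]) -> str:
    """Format a single multiple-choice question with numbered options."""
    options = "\n".join(
        f"{i}: {label}" for i, label in enumerate(classes, start=1)
    )
    return (
        f"You are assessing a street-level image for the road attribute "
        f"'{attribute}'.\n"
        "Choose exactly one option and answer with its number only.\n"
        f"{options}"
    )


def parse_answer(reply: str, classes: List[str]) -> Optional[str]:
    """Map the first integer in the reply to a class label, or None."""
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    index = int(match.group())
    if 1 <= index <= len(classes):
        return classes[index - 1]
    return None  # The number is outside the list of offered options.


def classify(
    image_path: str,
    ask_vlm: Callable[[str, str], str],
) -> Dict[str, Optional[str]]:
    """Ask one question per attribute; ask_vlm(image_path, prompt) is assumed."""
    results: Dict[str, Optional[str]] = {}
    for attribute, classes in ATTRIBUTES.items():
        reply = ask_vlm(image_path, build_prompt(attribute, classes))
        results[attribute] = parse_answer(reply, classes)
    return results
```

Because the class list lives in the prompt rather than in model weights, adding or renaming a class in this sketch only requires editing `ATTRIBUTES`, which is the kind of retraining-free flexibility the abstract attributes to prompt-based reasoning.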

Code & Models

Repositories

PongNJ/V-RoAst (GitHub)
