PaveBench: A Versatile Benchmark for Pavement Distress Perception and Interactive Vision-Language Analysis
Dexiang Li, Zhenning Che, Haijun Zhang, Dongliang Zhou, Zhao Zhang, Yahong Han

TL;DR
PaveBench is a benchmark for pavement distress perception and interactive vision-language analysis. It covers multiple tasks on real-world highway inspection images, with the aim of improving pavement inspection systems.
Contribution
The paper introduces PaveBench, a large-scale, multi-task benchmark built on real-world data and annotations, together with a novel agent-augmented QA framework for pavement inspection.
Findings
State-of-the-art methods are evaluated on PaveBench.
PaveVQA enables multi-turn, fact-grounded reasoning.
The agent-augmented framework improves visual question answering.
Abstract
Pavement condition assessment is essential for road safety and maintenance. Existing research has made significant progress, but most studies focus on conventional computer vision tasks such as classification, detection, and segmentation. In real-world applications, pavement inspection requires more than visual recognition: it also requires quantitative analysis, explanation, and interactive decision support. Current datasets are limited: they focus on unimodal perception, lack support for multi-turn interaction and fact-grounded reasoning, and do not connect perception with vision-language analysis. To address these limitations, we introduce PaveBench, a large-scale benchmark for pavement distress perception and interactive vision-language analysis on real-world highway inspection images. PaveBench supports four core tasks: classification, object detection, semantic…
