Video-based Vehicle Surveillance in the Wild: License Plate, Make, and Model Recognition with Self Reflective Vision-Language Models

Pouya Parsa; Keya Li; Kara M. Kockelman; Seongjin Choi

arXiv:2508.01387 · cs.CV · August 5, 2025

Open Access

TL;DR

This paper explores using large vision-language models for license plate, make, and model recognition in videos captured by handheld devices, addressing challenges of motion, occlusion, and viewpoint variation.

Contribution

The paper introduces a novel VLM-based recognition pipeline with a self-reflection module, which improves accuracy on unconstrained, in-motion video for traffic analysis.

Findings

01. Achieved 91.67% ALPR accuracy on the campus dataset

02. Attained 66.67% make and model recognition accuracy

03. Self-reflection module improved results by 5.72% on average

Abstract

Automatic license plate recognition (ALPR) and vehicle make and model recognition underpin intelligent transportation systems, supporting law enforcement, toll collection, and post-incident investigation. Applying these methods to videos captured by handheld smartphones or non-static vehicle-mounted cameras presents unique challenges compared to fixed installations, including frequent camera motion, varying viewpoints, occlusions, and unknown road geometry. Traditional ALPR solutions, dependent on specialized hardware and handcrafted OCR pipelines, often degrade under these conditions. Recent advances in large vision-language models (VLMs) enable direct recognition of textual and semantic attributes from arbitrary imagery. This study evaluates the potential of VLMs for ALPR and make and model recognition using monocular videos captured with handheld smartphones and non-static mounted…

Taxonomy

Topics: Vehicle License Plate Recognition · Advanced Neural Network Applications · Automated Road and Building Extraction