Video-based Vehicle Surveillance in the Wild: License Plate, Make, and Model Recognition with Self Reflective Vision-Language Models
Pouya Parsa, Keya Li, Kara M. Kockelman, Seongjin Choi

TL;DR
This paper explores using large vision-language models for license plate, make, and model recognition in videos captured by handheld devices, addressing challenges of motion, occlusion, and viewpoint variation.
Contribution
It introduces a novel VLM-based recognition pipeline with self-reflection that improves accuracy on unconstrained, in-motion video for traffic analysis.
Findings
Achieved 91.67% ALPR accuracy on a campus dataset
Attained 66.67% make and model recognition accuracy
Self-reflection module improved results by 5.72% on average
Abstract
Automatic license plate recognition (ALPR) and vehicle make and model recognition underpin intelligent transportation systems, supporting law enforcement, toll collection, and post-incident investigation. Applying these methods to videos captured by handheld smartphones or non-static vehicle-mounted cameras presents unique challenges compared to fixed installations, including frequent camera motion, varying viewpoints, occlusions, and unknown road geometry. Traditional ALPR solutions, dependent on specialized hardware and handcrafted OCR pipelines, often degrade under these conditions. Recent advances in large vision-language models (VLMs) enable direct recognition of textual and semantic attributes from arbitrary imagery. This study evaluates the potential of VLMs for ALPR and make and model recognition using monocular videos captured with handheld smartphones and non-static mounted…
Taxonomy
Topics: Vehicle License Plate Recognition · Advanced Neural Network Applications · Automated Road and Building Extraction
