Neural Sentinel: Unified Vision Language Model (VLM) for License Plate Recognition with Human-in-the-Loop Continual Learning
Karthik Sivakoti

TL;DR
Neural Sentinel replaces the usual detector-plus-OCR pipeline for license plate recognition with a single fine-tuned vision language model that reaches 92.3% plate recognition accuracy, handles multiple tasks in one model, and adapts over time through human-in-the-loop continual learning.
Contribution
The paper demonstrates that a fine-tuned VLM can perform license plate recognition, vehicle attribute extraction, and zero-shot auxiliary tasks within a single model, with continual learning capabilities.
Findings
Achieves 92.3% license plate recognition accuracy
Runs inference in 152 ms
Enables zero-shot auxiliary tasks such as vehicle color detection
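This summary does not say how plate recognition accuracy is scored. A common convention, assumed here purely for illustration, is exact match over normalized plate strings: a prediction counts only if every character is correct. The function names `normalize_plate` and `exact_match_accuracy` are hypothetical and not taken from the paper.

```python
def normalize_plate(text: str) -> str:
    """Uppercase the text and drop anything that is not a letter or digit.

    Assumption: separators such as spaces and hyphens are not scored.
    """
    return "".join(ch for ch in text.upper() if ch.isalnum())


def exact_match_accuracy(predictions, references):
    """Fraction of plates whose normalized prediction equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one example is required")
    hits = sum(
        normalize_plate(pred) == normalize_plate(ref)
        for pred, ref in zip(predictions, references)
    )
    return hits / len(references)


if __name__ == "__main__":
    # Made-up strings for demonstration only; not data from the paper.
    preds = ["abc-123", "XYZ 789", "AAA111"]
    refs = ["ABC123", "XYZ789", "AAA112"]
    print(exact_match_accuracy(preds, refs))
```

Under this convention a single wrong character makes the whole plate a miss, so the score is stricter than a per-character accuracy would be.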
Abstract
Traditional Automatic License Plate Recognition (ALPR) systems employ multi-stage pipelines in which an object detection network is followed by a separate Optical Character Recognition (OCR) module, which introduces compounding errors, added latency, and architectural complexity. This research presents Neural Sentinel, a unified approach that uses Vision Language Models (VLMs) to perform license plate recognition, state classification, and vehicle attribute extraction in a single forward pass. Our primary contribution is to demonstrate that a fine-tuned PaliGemma 3B model, adapted via Low-Rank Adaptation (LoRA), can simultaneously answer multiple visual questions about vehicle images, achieving 92.3% plate recognition accuracy, a 14.1% improvement over the EasyOCR baseline and a 9.9% improvement over the PaddleOCR baseline. We introduce a Human-in-the-Loop (HITL) continual…
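The abstract names Low-Rank Adaptation (LoRA) as the fine-tuning method. As a rough illustration of the general technique, not of the paper's training code, the sketch below keeps a weight matrix W frozen and adds a trainable low-rank update scaled by alpha / rank, so the effective weight is W + (alpha / rank) * B A. The class name `LoRALinear`, the matrix sizes, and the values of `rank` and `alpha` are all assumptions chosen for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a trainable low-rank update B @ A."""

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen: never updated during fine-tuning
        # A starts with small random values, B starts at zero, so the
        # adapted layer initially behaves exactly like the frozen one.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scaling = alpha / rank

    def effective_weight(self):
        """Return W + scaling * (B @ A)."""
        delta = matmul(self.B, self.A)
        return [
            [w + self.scaling * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(self.weight, delta)
        ]

    def forward(self, x):
        """Apply the adapted layer to an input vector x of length d_in."""
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.effective_weight()]

    def trainable_params(self):
        """Count only the entries of A and B; W is frozen."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


if __name__ == "__main__":
    # Toy 8x8 identity weight; a real model layer would be far larger.
    W = [[float(i == j) for j in range(8)] for i in range(8)]
    layer = LoRALinear(W, rank=1, alpha=2.0)
    print("frozen parameters:", 8 * 8)
    print("trainable parameters:", layer.trainable_params())
```

Because only A and B are trained, the number of trainable parameters grows with rank * (d_in + d_out) rather than d_in * d_out, which is what makes adapting a multi-billion-parameter model practical on modest hardware.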
Taxonomy
Topics: Vehicle License Plate Recognition · Advanced Neural Network Applications · Smart Parking Systems Research
