LP-LLM: End-to-End Real-World Degraded License Plate Text Recognition via Large Multimodal Models
Haoyan Gong, Hongbin Liu

TL;DR
This paper introduces LP-LLM, an end-to-end multimodal framework that improves license plate recognition under severe degradation by explicitly modeling character structure and leveraging large vision-language models.
Contribution
It proposes a novel structure-aware reasoning module with learnable character queries, enabling explicit character sequence modeling within a large multimodal model for degraded license plate recognition.
Findings
Significantly outperforms existing methods on synthetic and real-world datasets.
Effectively models character structure to improve recognition accuracy.
Demonstrates the benefit of integrating explicit structural priors into large models.
Abstract
Real-world License Plate Recognition (LPR) faces significant challenges from severe degradations such as motion blur, low resolution, and complex illumination. The prevailing "restoration-then-recognition" two-stage paradigm suffers from a fundamental flaw: the pixel-level optimization objectives of image restoration models are misaligned with the semantic goals of character recognition, leading to artifact interference and error accumulation. While Vision-Language Models (VLMs) have demonstrated powerful general capabilities, they lack explicit structural modeling for license plate character sequences (e.g., fixed length, specific order). To address this, we propose an end-to-end structure-aware multimodal reasoning framework based on Qwen3-VL. The core innovation lies in the Character-Aware Multimodal Reasoning Module (CMRM), which introduces a set of learnable Character Slot Queries.…
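The abstract is truncated before it explains how the Character Slot Queries are used, so the following is only a minimal sketch of the general idea, not the authors' CMRM. It assumes one learnable query vector per character position, that each query cross-attends over the visual tokens (single head, no learned projections), and that each resulting slot feature is classified independently. The slot count of 7, the alphabet, the dimensions, and all function names are hypothetical, and random values stand in for learned parameters and for the Qwen3-VL visual features.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def slot_cross_attention(slot_queries, visual_tokens):
    """Each slot query attends over all visual tokens (scaled dot-product).

    Assumption: single head, no key/value projections, for brevity.
    Returns one pooled feature and one attention map per slot.
    """
    dim = len(slot_queries[0])
    scale = 1.0 / math.sqrt(dim)
    slot_features, attn_maps = [], []
    for q in slot_queries:
        weights = softmax([dot(q, t) * scale for t in visual_tokens])
        feature = [
            sum(w * t[j] for w, t in zip(weights, visual_tokens))
            for j in range(dim)
        ]
        slot_features.append(feature)
        attn_maps.append(weights)
    return slot_features, attn_maps


def decode_slots(slot_features, class_weights, alphabet):
    """Pick the highest-scoring character for each slot with a shared linear head."""
    chars = []
    for f in slot_features:
        logits = [dot(f, w) for w in class_weights]
        chars.append(alphabet[logits.index(max(logits))])
    return "".join(chars)


# Hypothetical sizes: 7 character slots, 16-dim features, 32 visual tokens.
rng = random.Random(0)
NUM_SLOTS, DIM, NUM_TOKENS = 7, 16, 32
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ0123456789"

# Random stand-ins for learned slot queries, image tokens, and classifier weights.
slot_queries = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_SLOTS)]
visual_tokens = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_TOKENS)]
class_weights = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in ALPHABET]

slot_features, attn_maps = slot_cross_attention(slot_queries, visual_tokens)
plate = decode_slots(slot_features, class_weights, ALPHABET)
print(plate)  # untrained weights, so the string is arbitrary but has NUM_SLOTS characters
```

Because the number of slots is fixed and each slot has a fixed position, the output always has the expected length and order, which is the kind of structural prior the abstract says general-purpose VLMs lack.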
Taxonomy
Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Advanced Neural Network Applications
