Leveraging Model Fusion for Improved License Plate Recognition
Rayson Laroca, Luiz A. Zanlorensi, Valter Estevam, Rodrigo Minetto, David Menotti

TL;DR
This paper investigates combining multiple deep learning models for license plate recognition, demonstrating that fusion improves accuracy and robustness across datasets, especially when balancing speed and performance.
Contribution
It introduces a straightforward fusion approach for up to 12 models, showing significant performance gains and practical strategies for balancing speed and accuracy.
Findings
Fusion considerably reduces the likelihood of subpar performance on any particular dataset.
Combining 4-6 models balances speed and accuracy effectively.
Model fusion enhances robustness in license plate recognition.
Abstract
License Plate Recognition (LPR) plays a critical role in various applications, such as toll collection, parking management, and traffic law enforcement. Although LPR has witnessed significant advancements through the development of deep learning, there has been a noticeable lack of studies exploring the potential improvements in results by fusing the outputs from multiple recognition models. This research aims to fill this gap by investigating the combination of up to 12 different models using straightforward approaches, such as selecting the most confident prediction or employing majority vote-based strategies. Our experiments encompass a wide range of datasets, revealing substantial benefits of fusion approaches in both intra- and cross-dataset setups. Essentially, fusing multiple models reduces considerably the likelihood of obtaining subpar performance on a particular…
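The two fusion strategies named in the abstract (selecting the most confident prediction, and majority voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(text, confidence)` input format, and the tie-breaking rule for the vote are all assumptions made for illustration.

```python
from collections import Counter

# Each model's output is assumed to be a (plate_text, confidence) pair.
# This input format is hypothetical, chosen only for the sketch.


def fuse_highest_confidence(predictions):
    """Return the plate text predicted with the highest confidence."""
    return max(predictions, key=lambda p: p[1])[0]


def fuse_majority_vote(predictions):
    """Return the plate text predicted by the most models.

    Assumption: ties are broken by the highest single confidence among
    the tied candidates. The abstract does not specify a tie-break rule.
    """
    counts = Counter(text for text, _ in predictions)
    top_count = max(counts.values())
    tied = {text for text, count in counts.items() if count == top_count}
    best = max((p for p in predictions if p[0] in tied), key=lambda p: p[1])
    return best[0]


# Hypothetical outputs from three recognition models for one plate image.
preds = [("ABC1234", 0.90), ("ABC1284", 0.95), ("ABC1234", 0.80)]
print(fuse_highest_confidence(preds))  # ABC1284
print(fuse_majority_vote(preds))       # ABC1234
```

In this made-up example the two strategies disagree: one model is very confident in a reading that the other two models outvote.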
Taxonomy
Topics: Vehicle License Plate Recognition · Handwritten Text Recognition Techniques · Advanced Neural Network Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
