YOLO for Knowledge Extraction from Vehicle Images: A Baseline Study
Saraa Al-Saddik, Manna Elizabeth Philip, Ali Haidar

TL;DR
This study evaluates YOLO-based deep learning models for vehicle attribute extraction in real-world conditions, demonstrating their effectiveness and efficiency for law enforcement applications.
Contribution
It provides a baseline comparison of YOLO-v11, YOLO-World, and YOLO-Classification on a large, challenging vehicle image dataset, highlighting the importance of multi-view inference.
Findings
YOLO-v11 and YOLO-World outperform classification-only models in make and shape extraction.
Smaller YOLO variants achieve accuracy comparable to that of larger models, enabling real-time use.
Multi-view inference significantly improves model performance on complex datasets.
Abstract
Accurate identification of vehicle attributes such as make, colour, and shape is critical for law enforcement and intelligence applications. This study evaluates the effectiveness of three state-of-the-art deep learning approaches (YOLO-v11, YOLO-World, and YOLO-Classification) on a real-world vehicle image dataset collected under challenging, unconstrained conditions by NSW Police Highway Patrol vehicles. A multi-view inference (MVI) approach was deployed to improve the models' predictions. For the analyses, a dataset of more than 100,000 images was created for each of the three metadata prediction tasks: make, shape, and colour. The models were tested on a separate dataset of 29,937 images belonging to 1,809 number plates. Several sets of experiments were run with varying model sizes. A classification accuracy…
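The abstract does not spell out how multi-view inference combines predictions, so the following is only a minimal sketch under stated assumptions: each image yields one label with a confidence score, images are grouped by number plate (the test set averages roughly 16.5 images per plate, i.e. 29,937 / 1,809), and the per-plate label is chosen by confidence-weighted voting. The function name `multi_view_predict`, the tuple layout, and the example plates and labels are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def multi_view_predict(per_image_preds):
    """Combine per-image predictions into one label per number plate.

    per_image_preds: iterable of (plate_id, label, confidence) tuples,
    one tuple per image. This layout is an assumption for illustration.
    Returns a dict mapping plate_id -> label with the highest summed
    confidence across all images (views) of that plate.
    """
    # scores[plate_id][label] accumulates confidence over all views.
    scores = defaultdict(lambda: defaultdict(float))
    for plate_id, label, confidence in per_image_preds:
        scores[plate_id][label] += confidence

    # For each plate, keep the label with the largest accumulated score.
    return {
        plate_id: max(label_scores, key=label_scores.get)
        for plate_id, label_scores in scores.items()
    }


# Hypothetical per-image outputs for two plates (illustrative values only).
preds = [
    ("ABC123", "Toyota", 0.62),
    ("ABC123", "Toyota", 0.55),
    ("ABC123", "Mazda", 0.91),
    ("XYZ789", "Ford", 0.80),
]
result = multi_view_predict(preds)
print(result)
```

In this sketch, two moderately confident views agreeing on a label can outweigh a single high-confidence outlier. Averaging full class-probability vectors per plate would be another reasonable choice; which scheme the authors used is not stated in the text available here.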
Taxonomy
Topics: Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis · Face recognition and analysis
