TL;DR
This paper introduces the FGVD dataset, a challenging real-world fine-grained vehicle detection dataset captured from a moving car, with hierarchical labels and complex traffic scenarios, highlighting the need for improved hierarchical models.
Contribution
The paper presents the first in-the-wild fine-grained vehicle detection dataset with hierarchical labels and evaluates baseline detectors, demonstrating the dataset's difficulty and the potential of hierarchical models.
Findings
Baseline detectors such as YOLOv5 and Faster R-CNN perform poorly on FGVD.
Hierarchical models improve classification accuracy.
FGVD is more challenging than earlier fine-grained vehicle datasets, which were mostly built for classification in controlled capture setups.
Abstract
Previous fine-grained datasets mainly focus on classification and are often captured in a controlled setup, with the camera focusing on the objects. We introduce the first Fine-Grained Vehicle Detection (FGVD) dataset in the wild, captured from a moving camera mounted on a car. It contains 5502 scene images with 210 unique fine-grained labels of multiple vehicle types organized in a three-level hierarchy. While previous classification datasets also include makes for different kinds of cars, the FGVD dataset introduces new class labels for categorizing two-wheelers, autorickshaws, and trucks. The FGVD dataset is challenging as it has vehicles in complex traffic scenarios with intra-class and inter-class variations in types, scale, pose, occlusion, and lighting conditions. Current object detectors like YOLOv5 and Faster R-CNN perform poorly on our dataset due to a lack of…
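To make the three-level hierarchy concrete, here is a minimal Python sketch of how such labels might be represented and scored level by level. It is not the authors' code: the level names (`vehicle_type`, `make`, `model`), the underscore-separated label string, the `HierLabel` class, and both helper functions are assumptions made for illustration, and the label values are placeholders rather than actual FGVD classes.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class HierLabel:
    """One three-level label. The field names are assumed, not taken from FGVD."""
    vehicle_type: str  # level 1, coarsest
    make: str          # level 2
    model: str         # level 3, finest

    def levels(self) -> Tuple[str, str, str]:
        return (self.vehicle_type, self.make, self.model)


def parse_label(raw: str, sep: str = "_") -> HierLabel:
    """Split a hypothetical 'type_make_model' string into its three levels."""
    parts = raw.split(sep, 2)
    if len(parts) != 3:
        raise ValueError(f"expected three levels in {raw!r}")
    return HierLabel(*parts)


def per_level_accuracy(preds: Sequence[HierLabel],
                       truths: Sequence[HierLabel]) -> List[float]:
    """Accuracy at each depth: a prediction counts as correct at depth d
    only if it matches the ground truth on every level up to and including d."""
    if len(preds) != len(truths):
        raise ValueError("preds and truths must have the same length")
    n = len(truths)
    if n == 0:
        return [0.0, 0.0, 0.0]
    hits = [0, 0, 0]
    for pred, truth in zip(preds, truths):
        for depth in range(3):
            if pred.levels()[: depth + 1] == truth.levels()[: depth + 1]:
                hits[depth] += 1
    return [h / n for h in hits]


# Placeholder labels, not real FGVD class names.
truths = [parse_label("car_MakeA_ModelX"), parse_label("motorcycle_MakeB_ModelY")]
preds = [parse_label("car_MakeA_ModelZ"), parse_label("motorcycle_MakeB_ModelY")]
level_acc = per_level_accuracy(preds, truths)
print(level_acc)  # [1.0, 1.0, 0.5]
```

Scoring each depth separately shows where a detector's errors concentrate: in this toy example the coarse vehicle type and the make are both right, and only the finest level is confused.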
