A 2-Stage Model for Vehicle Class and Orientation Detection with Photo-Realistic Image Generation

Youngmin Kim; Donghwa Kang; Hyeongboo Baek

arXiv:2506.01338·cs.CV·June 3, 2025


TL;DR

This paper introduces a two-stage model that uses photo-realistic image generation to improve vehicle class and orientation detection from synthetic data, addressing class imbalance and domain adaptation issues.

Contribution

It presents a two-stage detection framework that transforms synthetic images into real-world-style images and combines pre-extracted location information with predicted classes to detect vehicle class and orientation.

Findings

01 Achieved 4th place in IEEE BigData Challenge 2022 VOD

02 Improved detection accuracy with photo-realistic image transformation

03 Effective handling of class imbalance in synthetic training data

Abstract

We aim to detect the class and orientation of a vehicle by training a model on synthetic data. However, the class distribution in the training data is imbalanced, and a model trained on synthetic images struggles to make accurate predictions on real-world images. We propose a two-stage detection model with photo-realistic image generation to tackle these issues. Our model takes four main steps to detect the class and orientation of a vehicle: (1) it builds a meta table containing the image, class, and location information of the objects in each image; (2) it transforms the synthetic images into a real-world image style and merges them into the meta table; (3) it classifies vehicle class and orientation using images from the meta table; and (4) it detects the vehicle class and orientation by combining the pre-extracted location information with the predicted classes. We achieved 4th place in IEEE…
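
The four steps in the abstract can be read as a simple data flow. The following is a minimal sketch of that flow only, not the authors' implementation: the `MetaRow` fields, the bounding-box format, the label naming, and the `translate_id` and `classify` callables are all hypothetical placeholders, and the abstract does not say which style-transfer model, classifier, or localizer is used.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max). Not specified in the source.
BBox = Tuple[int, int, int, int]


@dataclass(frozen=True)
class MetaRow:
    """One object in one image (hypothetical schema for the meta table)."""
    image_id: str
    bbox: BBox
    label: str                # combined class/orientation label, e.g. "car_front"
    style: str = "synthetic"  # "synthetic" or "real_like"


def build_meta_table(annotations: List[Tuple[str, BBox, str]]) -> List[MetaRow]:
    """Step 1: collect image, location, and class information per object."""
    return [MetaRow(image_id=i, bbox=b, label=l) for i, b, l in annotations]


def add_stylized_rows(table: List[MetaRow],
                      translate_id: Callable[[str], str]) -> List[MetaRow]:
    """Step 2: merge a real-world-style copy of every synthetic row into the table.

    translate_id stands in for an image-to-image translation model; here it
    only maps a synthetic image id to the id of its translated counterpart.
    """
    stylized = [replace(r, image_id=translate_id(r.image_id), style="real_like")
                for r in table]
    return table + stylized


def detect(locations: List[Tuple[str, BBox]],
           classify: Callable[[str, BBox], str]) -> List[Tuple[str, BBox, str]]:
    """Steps 3 and 4: classify each pre-extracted box, then pair box and label."""
    return [(img, box, classify(img, box)) for img, box in locations]


# Usage with placeholder data and stand-in models.
table = build_meta_table([
    ("img0", (0, 0, 10, 10), "car_front"),
    ("img0", (5, 5, 20, 20), "bus_back"),
])
table = add_stylized_rows(table, lambda image_id: image_id + "_real")
detections = detect([("test0", (1, 2, 3, 4))], lambda img, box: "car_front")
```

Keeping localization separate from classification, as the abstract describes, means the classifier in this sketch only ever sees one object at a time, and the final detection is just the pairing of a box with its predicted label.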
