Analysis of Classifier Training on Synthetic Data for Cross-Domain Datasets

Andoni Cortés; Clemente Rodríguez; Gorka Velez; Javier Barandiarán; Marcos Nieto

arXiv:2410.22748·cs.CV·October 31, 2024

TL;DR

This paper investigates the effectiveness of synthetic data for training traffic sign recognition models in autonomous driving, demonstrating that synthetic data can outperform real data in cross-domain scenarios and improve generalization.

Contribution

It introduces a novel augmentation pipeline with structured shadows and highlights, and a semi-supervised method for generating synthetic images, enhancing cross-domain model performance.

Findings

01 Synthetic training data outperforms real data in cross-domain tests (+10% precision)

02 Proposed augmentation improves model robustness and generalization

03 Synthetic data reduces the need for extensive real data collection

Abstract

A major challenge of deep learning (DL) is the need to collect huge amounts of training data. Often, the lack of a sufficiently large dataset discourages the use of DL in certain applications. Typically, acquiring the required amount of data costs considerable time, material, and effort. To mitigate this problem, the use of synthetic images combined with real data is a popular approach, widely adopted in the scientific community to effectively train various detectors. In this study, we examine the potential of synthetic-data-based training in the field of intelligent transportation systems. Our focus is on camera-based traffic sign recognition applications for advanced driver assistance systems and autonomous driving. The proposed augmentation pipeline for synthetic datasets includes novel augmentation processes such as structured shadows and Gaussian specular highlights. A…
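To make the two augmentations named in the abstract more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names (`add_gaussian_highlight`, `add_half_plane_shadow`), the parameters, and the choice of a straight-edged half-plane as the "structured" shadow are all assumptions made for illustration. Images are assumed to be float arrays of shape (H, W, 3) with values in [0, 1].

```python
import numpy as np


def add_gaussian_highlight(img, center, sigma, strength=0.6):
    """Blend the image toward white under a 2D Gaussian blob.

    Hypothetical helper: img is float (H, W, 3) in [0, 1];
    center is (row, col); sigma is the blob width in pixels.
    """
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = center
    # Gaussian weight: 1 at the centre, falling toward 0 with distance.
    g = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))
    alpha = (strength * g)[..., None]
    # Alpha-blend each pixel with white (1.0).
    return np.clip(img * (1.0 - alpha) + alpha, 0.0, 1.0)


def add_half_plane_shadow(img, point, normal, darkness=0.5):
    """Darken all pixels on one side of a straight line.

    Hypothetical helper: the line passes through point (row, col) and
    is perpendicular to normal (d_row, d_col). A straight edge is only
    one possible reading of a "structured" shadow.
    """
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    py, px = point
    ny, nx = normal
    # Pixels with a positive signed distance to the line are shadowed.
    mask = ((ys - py) * ny + (xs - px) * nx) > 0
    out = img.copy()
    out[mask] *= 1.0 - darkness
    return out


# Usage on a flat grey stand-in for a rendered sign template.
sign = np.full((32, 32, 3), 0.5)
augmented = add_half_plane_shadow(
    add_gaussian_highlight(sign, center=(16, 16), sigma=4.0),
    point=(16, 16),
    normal=(0, 1),
)
```

In a real pipeline the centre, width, edge orientation, and intensity would presumably be sampled at random per image; the fixed values here only keep the sketch deterministic.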
