A Scaling Law for Synthetic-to-Real Transfer: How Much Is Your Pre-training Effective?
Hiroaki Mikami, Kenji Fukumizu, Shogo Murai, Shuji Suzuki, Yuta, Kikuchi, Taiji Suzuki, Shin-ichi Maeda, Kohei Hayashi

TL;DR
This paper introduces a scaling law that predicts the effectiveness of synthetic pre-training for real vision tasks, helping determine whether to increase data or improve synthesis methods.
Contribution
It derives a simple, predictive scaling law for synthetic-to-real transfer performance and validates it across multiple experimental settings.
Findings
The scaling law accurately predicts transfer performance.
Increasing synthetic data size has diminishing returns when domain gap is large.
The generalization bound aligns with empirical results.
Abstract
Synthetic-to-real transfer learning is a framework in which a synthetically generated dataset is used to pre-train a model to improve its performance on real vision tasks. The most significant advantage of using synthetic images is that the ground-truth labels are automatically available, enabling unlimited expansion of the data size without human cost. However, synthetic data may have a huge domain gap, in which case increasing the data size does not improve the performance. How can we know that? In this study, we derive a simple scaling law that predicts the performance from the amount of pre-training data. By estimating the parameters of the law, we can judge whether we should increase the data or change the setting of image synthesis. Further, we analyze the theory of transfer learning by considering learning dynamics and confirm that the derived generalization bound is consistent…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsDomain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications
