TL;DR
This paper systematically evaluates style transfer strategies for domain generalization in computer vision along three design axes (style pool diversity, texture complexity, and style source) and proposes StyleMixDG, a simple augmentation method that improves real-world scene understanding.
Contribution
It provides a comprehensive empirical analysis of style transfer factors and introduces StyleMixDG, a lightweight augmentation approach that enhances domain generalization without architectural changes.
Findings
Expanding the style pool yields larger gains than repeated augmentation with few styles.
Texture complexity has no significant effect when the style pool is sufficiently large.
Diverse artistic styles outperform domain-aligned styles.
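To make the idea of a "style pool" concrete, the following is a minimal sketch of pool-based style augmentation. It is not StyleMixDG and not the authors' pipeline: the summary does not describe their style transfer model, so a simple per-channel mean/std matching in pixel space (in the spirit of AdaIN) stands in for it here. The function names `match_channel_stats` and `augment_with_style_pool`, the pool size of 16, and the random arrays used as images are all assumptions made for illustration.

```python
import numpy as np


def match_channel_stats(content, style, eps=1e-6):
    """Stand-in for style transfer (assumption): shift and scale each channel
    of `content` so its mean/std match those of `style`. Arrays are H x W x C."""
    c_mean = content.mean(axis=(0, 1), keepdims=True)
    c_std = content.std(axis=(0, 1), keepdims=True)
    s_mean = style.mean(axis=(0, 1), keepdims=True)
    s_std = style.std(axis=(0, 1), keepdims=True)
    return (content - c_mean) / (c_std + eps) * s_std + s_mean


def augment_with_style_pool(images, style_pool, rng):
    """Pair every training image with a style drawn uniformly from the pool.
    A larger pool means more distinct styles are seen across an epoch."""
    picks = rng.integers(0, len(style_pool), size=len(images))
    stylized = [match_channel_stats(img, style_pool[k])
                for img, k in zip(images, picks)]
    return stylized, picks


# Hypothetical data: random arrays stand in for synthetic frames and style images.
rng = np.random.default_rng(0)
images = [rng.random((8, 8, 3)) for _ in range(4)]
style_pool = [rng.random((8, 8, 3)) * (i + 1) for i in range(16)]

stylized, picks = augment_with_style_pool(images, style_pool, rng)
```

Under this sketch, the first finding corresponds to increasing `len(style_pool)` rather than re-stylizing the same images many times with a handful of styles; the third corresponds to what kind of images populate `style_pool`.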
Abstract
Deep learning models for computer vision often generalize poorly when deployed in real-world settings, especially when trained on synthetic data, because of the well-known Sim2Real gap. Despite the growing popularity of style transfer as a data augmentation strategy for domain generalization, the literature contains unresolved contradictions regarding three key design axes: the diversity of the style pool, the role of texture complexity, and the choice of style source. We present a systematic empirical study that isolates and evaluates each of these factors for driving scene understanding, resolving inconsistencies in prior work. Our findings show that (i) expanding the style pool yields larger gains than repeated augmentation with few styles, (ii) texture complexity has no significant effect when the pool is sufficiently large, and (iii) diverse artistic styles outperform…
