What Makes Synthetic Data Effective in Image Segmentation

Jinjin Zhang; Xiefan Guo; Yizhou Jin; Nan Zhou; Di Huang

arXiv:2605.19289·cs.CV·May 20, 2026

TL;DR

This paper systematically analyzes what makes diffusion-generated synthetic images useful for image segmentation and proposes SENSE, a framework that improves segmentation performance across multiple datasets and architectures.

Contribution

The paper identifies scene density and instance fidelity as key factors that make synthetic data effective, and introduces SENSE, a scalable, model-agnostic framework for improving segmentation.

Findings

01 Synthetic images with dense scenes and fine details improve segmentation.

02 SENSE significantly boosts performance on Cityscapes, COCO, and ADE20K.

03 The approach is compatible with various models and scales well.
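
To make the notion of a "dense scene" in the first finding more concrete, here is a minimal Python sketch of one possible density proxy: the number of distinct instances in an instance-ID mask, together with the fraction of pixels they cover. This is an illustrative assumption only. The function name `scene_density`, the `background_id` convention, and the choice of these two statistics are hypothetical and are not taken from the paper or the SENSE repository.

```python
from typing import Sequence, Tuple


def scene_density(
    instance_mask: Sequence[Sequence[int]],
    background_id: int = 0,
) -> Tuple[int, float]:
    """Return (instance_count, foreground_fraction) for a 2D instance-ID mask.

    Hypothetical proxy for scene density, not the paper's definition.
    Each pixel holds an integer instance ID; `background_id` marks
    pixels that belong to no instance.
    """
    instance_ids = set()
    foreground_pixels = 0
    total_pixels = 0
    for row in instance_mask:
        for pixel_id in row:
            total_pixels += 1
            if pixel_id != background_id:
                instance_ids.add(pixel_id)
                foreground_pixels += 1
    if total_pixels == 0:
        # An empty mask has no instances and no coverage.
        return 0, 0.0
    return len(instance_ids), foreground_pixels / total_pixels


# Toy 2x3 mask with two instances (IDs 1 and 2) on a background of 0.
toy_mask = [
    [0, 1, 1],
    [2, 2, 0],
]
count, coverage = scene_density(toy_mask)
```

Under this proxy, an image with more instances and higher coverage would count as denser. Any threshold for selecting synthetic images would need to be chosen empirically.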

Abstract

Driven by rapid advances in large-scale generative models, synthetic data has emerged as a promising solution for visual understanding. While modern diffusion models achieve remarkable photorealistic image synthesis, their potential in complex visual segmentation tasks remains underexplored. In this work, we conduct a systematic analysis of synthetic images from state-of-the-art diffusion models to uncover the factors governing their utility. In particular, synthetic images characterized by dense scene composition and fine instance fidelity demonstrate distinctive benefits, yielding significantly more discriminative spatial representations. Building on these insights, we propose SENSE, a unified framework that leverages flexible and scalable synthetic data to substantially enhance segmentation performance. Notably, SENSE is model-agnostic, compatible with diverse architectures (e.g.,…

Code & Models

Repositories

zhang0jhon/SENSE (GitHub)
