Benchmarking Semantic Segmentation Models via Appearance and Geometry Attribute Editing

Zijin Yin; Bing Li; Kongming Liang; Hao Sun; Zhongjiang He; Zhanyu Ma; Jun Guo

arXiv:2603.01535 · cs.CV · March 17, 2026

TL;DR

This paper introduces Gen4Seg, an automatic pipeline using diffusion models to generate challenging, attribute-edited images for stress-testing semantic segmentation models, revealing robustness limitations and improving performance.

Contribution

We propose a novel attribute editing pipeline with diffusion models to create diverse benchmarks and analyze segmentation model robustness comprehensively.

Findings

01. Open-vocabulary models are not more robust to geometric variations.

02. Data augmentation techniques have limited effectiveness against appearance changes.

03. The pipeline can serve as an effective data augmentation tool to enhance model performance.

Abstract

Semantic segmentation plays a pivotal role in applications such as autonomous driving and medical image analysis. Before deploying segmentation models in practice, it is critical to test their behavior in varied and complex scenes. In this paper, we construct an automatic data generation pipeline, Gen4Seg, to stress-test semantic segmentation models by generating challenging samples with different attribute changes. Going beyond previous evaluation paradigms, which focus solely on global weather and style transfer, we investigate variations in both appearance and geometry attributes at the object and image level. These include object color, material, size, and position, as well as image-level variations such as weather and style. To achieve this, we propose to edit visual attributes of existing real images with precise control of structural information, empowered by diffusion…
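As a rough illustration of what stress-testing on attribute-edited images can look like, the sketch below compares mean IoU on an original image with mean IoU on its edited counterpart. It is a minimal sketch and not the paper's evaluation protocol. It assumes that an appearance edit preserves scene structure so the original label map can be reused, that label maps are flattened to lists of class ids, and that 255 marks ignored pixels. The function names `mean_iou` and `robustness_drop` are hypothetical.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean IoU over classes present in the prediction or the ground truth.

    Assumes every non-ignored target label lies in [0, num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    inter = [0] * num_classes  # true positives per class
    union = [0] * num_classes  # TP + FP + FN per class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip unlabeled pixels
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[t] += 1  # false negative for the true class
            if 0 <= p < num_classes:
                union[p] += 1  # false positive for the predicted class
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


def robustness_drop(pred_orig: Sequence[int], pred_edit: Sequence[int],
                    target: Sequence[int], num_classes: int) -> float:
    """mIoU on the original image minus mIoU on the edited image.

    Assumption: the edit keeps the label map unchanged, so `target`
    is shared by both predictions.
    """
    return (mean_iou(pred_orig, target, num_classes)
            - mean_iou(pred_edit, target, num_classes))


if __name__ == "__main__":
    target = [0, 0, 1, 1]
    pred_orig = [0, 0, 1, 1]  # perfect prediction on the original image
    pred_edit = [0, 1, 1, 1]  # one pixel flips after the edit
    print(robustness_drop(pred_orig, pred_edit, target, num_classes=2))
```

A larger drop indicates a model that is more sensitive to the edited attribute. For geometry edits such as size or position changes, the label map would need to be transformed together with the image, which this sketch does not cover.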

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Face recognition and analysis