Towards Any-Quality Image Segmentation via Generative and Adaptive Latent Space Enhancement

Guangqian Guo; Aixi Ren; Yong Guo; Xuehui Yu; Jiacheng Tian; Wenli Li; Chaowei Wang; Yaoxing Wang; Shan Gao

arXiv:2601.02018·cs.CV·April 28, 2026

TL;DR

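GleSAM++ makes zero-shot segmentation with Segment Anything Models more robust on low-quality images while maintaining performance on high-quality ones. It combines generative latent space enhancement with adaptive mechanisms and two compatibility techniques, Feature Distribution Alignment (FDA) and Channel Replication and Expansion (CRE), that fit the pre-trained diffusion model to the segmentation framework.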

Contribution

This work introduces GleSAM++, a novel framework that improves segmentation on degraded images through latent space enhancement and adaptive strategies, with minimal additional parameters.

Findings

01. Significant robustness improvement on degraded images.

02. Maintains high performance on high-quality images.

03. Effective on unseen degradation types.

Abstract

Segment Anything Models (SAMs), known for their exceptional zero-shot segmentation performance, have garnered significant attention in the research community. Nevertheless, their performance drops significantly on severely degraded, low-quality images, limiting their effectiveness in real-world scenarios. To address this, we propose GleSAM++, which utilizes Generative Latent space Enhancement to boost robustness on low-quality images, thus enabling generalization across various image qualities. Additionally, to improve compatibility between the pre-trained diffusion model and the segmentation framework, we introduce two techniques, i.e., Feature Distribution Alignment (FDA) and Channel Replication and Expansion (CRE). However, the above components lack explicit guidance regarding the degree of degradation. The model is forced to implicitly fit a complex noise distribution that spans…
