MSSSeg: Learning Multi-Scale Structural Complexity for Self-Supervised Segmentation

Haotang Li; Zhenyu Qi; Hao Qin; Huanrui Yang; Kebin Peng; Qing Guo; Sen He

arXiv:2512.23997 · cs.CV · March 17, 2026

Open Access

TL;DR

MSSSeg introduces a novel framework that explicitly models multi-scale structural complexity using semantic and depth information, significantly improving self-supervised segmentation accuracy.

Contribution

MSSSeg proposes three components (Differentiable Box-Counting, Structural Augmentation, and Persistent Homology Loss) to explicitly learn and supervise structural complexity in segmentation.
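For readers unfamiliar with box counting, the sketch below shows the classical, non-differentiable estimator of box-counting dimension on a binary mask. It is only an illustration of the underlying idea, not the paper's Differentiable Box-Counting module, whose formulation is not given in this summary. The function names `box_count` and `box_counting_dimension` and the default box sizes are assumptions made for this example.

```python
import numpy as np


def box_count(mask: np.ndarray, size: int) -> int:
    """Count the size x size boxes that contain at least one foreground pixel."""
    h, w = mask.shape
    # Zero-pad so both dimensions divide evenly by the box size.
    pad_h, pad_w = (-h) % size, (-w) % size
    padded = np.pad(mask.astype(bool), ((0, pad_h), (0, pad_w)))
    rows, cols = padded.shape
    # View the image as a grid of boxes, then test each box for occupancy.
    blocks = padded.reshape(rows // size, size, cols // size, size)
    return int(blocks.any(axis=(1, 3)).sum())


def box_counting_dimension(mask: np.ndarray, sizes=(1, 2, 4, 8, 16)) -> float:
    """Estimate the box-counting dimension of a non-empty binary mask.

    Fits a line to log N(s) against log(1/s); the slope is the estimate.
    Assumes the mask has at least one foreground pixel (otherwise log(0)).
    """
    counts = np.array([box_count(mask, s) for s in sizes], dtype=float)
    inv_scale = 1.0 / np.array(sizes, dtype=float)
    slope, _intercept = np.polyfit(np.log(inv_scale), np.log(counts), 1)
    return float(slope)


if __name__ == "__main__":
    filled = np.ones((64, 64), dtype=bool)   # a solid region
    line = np.zeros((64, 64), dtype=bool)
    line[10, :] = True                       # a one-pixel-wide line
    print(box_counting_dimension(filled), box_counting_dimension(line))
```

A solid region yields an estimate close to 2 and a thin line close to 1, which is the sense in which this quantity measures structural complexity across scales. The hard occupancy test (`any`) has no useful gradient, which is presumably why a differentiable variant is needed for training.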

Findings

01. Achieves state-of-the-art results on the COCO-Stuff-27, Cityscapes, and Potsdam datasets.

02. Demonstrates the importance of explicit structural complexity modeling.

03. Maintains computational efficiency while improving segmentation quality.

Abstract

Self-supervised semantic segmentation methods often suffer from structural errors, including merging distinct objects or fragmenting coherent regions, because they rely primarily on low-level appearance cues such as color and texture. These cues lack structural discriminability: they carry no information about the structural organization of a region, making it difficult to distinguish boundaries between similar-looking objects or maintain coherence within internally varying regions. Recent approaches attempt to address this by incorporating depth priors, yet remain limited by not explicitly modeling structural complexity that persists even when appearance cues are ambiguous. To bridge this gap, we present MSSSeg, a framework that explicitly learns multi-scale structural complexity from both semantic and depth domains, via three coupled components: (1) a Differentiable Box-Counting (DBC)…

Taxonomy

Topics: 3D Shape Modeling and Analysis · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis