Use square root affinity to regress labels in semantic segmentation
Lumeng Cao, Zhouwang Yang

TL;DR
This paper introduces a novel supervised affinity-based loss function for semantic segmentation that leverages label and output affinities with a square root kernel, improving segmentation performance with minimal additional computation.
Contribution
It proposes a new affinity regression loss using label and output affinities with a square root kernel, enhancing segmentation accuracy without increasing inference complexity.
Findings
Improved segmentation accuracy on NYUv2 and Cityscapes datasets.
The proposed AR loss effectively promotes pair-wise label consistency.
Model training remains simple and computationally efficient.
Abstract
Semantic segmentation is a basic but non-trivial task in computer vision. Many previous works focus on utilizing affinity patterns to enhance segmentation networks. Most of these studies use the affinity matrix as a kind of feature-fusion weight, as part of modules embedded in the network, such as attention models and non-local models. In this paper, we associate the affinity matrix with labels, exploiting the affinity in a supervised way. Specifically, we use the labels to generate a multi-scale label affinity matrix as structural supervision, and we use a square root kernel to compute a non-local affinity matrix on the output layers. With these two affinities, we define a novel loss called Affinity Regression loss (AR loss), which can serve as an auxiliary loss providing a pair-wise similarity penalty. Our model is easy to train and adds little computational burden without run-time…
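The abstract's idea of regressing an output affinity onto a label affinity can be sketched roughly as follows. This is a minimal single-scale illustration, not the authors' implementation. It assumes the square root kernel is the sum over classes of sqrt(p_i * p_j) on softmax outputs, that the label affinity is 1 for same-class pixel pairs and 0 otherwise, and that the regression penalty is a mean squared error. None of these details are confirmed by the text above, and all function names are hypothetical.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def label_affinity(labels):
    # labels: (N,) integer class ids for N flattened pixels.
    # Returns (N, N) with 1.0 where two pixels share a class (assumed definition).
    return (labels[:, None] == labels[None, :]).astype(np.float64)

def sqrt_affinity(probs):
    # probs: (N, C) class probabilities per pixel.
    # Assumed square root kernel: A[i, j] = sum_c sqrt(p[i, c] * p[j, c]).
    r = np.sqrt(probs)
    return r @ r.T

def affinity_regression_loss(logits, labels):
    # Assumed penalty: mean squared error between the two affinity matrices.
    diff = sqrt_affinity(softmax(logits)) - label_affinity(labels)
    return float(np.mean(diff ** 2))

labels = np.array([0, 0, 1])
confident = 50.0 * np.eye(2)[labels]   # logits that match the labels
uniform = np.zeros((3, 2))             # logits with no class preference
loss_confident = affinity_regression_loss(confident, labels)
loss_uniform = affinity_regression_loss(uniform, labels)
```

Under these assumptions the affinity of a pixel with itself is always 1, so the penalty comes entirely from pixel pairs whose predicted distributions disagree with their label relationship. Because the loss is computed only from the outputs and labels, it would add no cost at inference time, which is consistent with the abstract's claim.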
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection
