Right Regions, Wrong Labels: Semantic Label Flips in Segmentation under Correlation Shift

Akshit Achara; Yovin Yathathugoda; Nick Byrne; Michela Antonelli; Esther Puyol Anton; Alexander Hammers; Andrew P. King

arXiv:2604.13326·cs.CV·April 16, 2026

TL;DR

This paper investigates how semantic segmentation models can assign plausible but wrong class labels to otherwise well-delineated regions under correlation shift, and introduces diagnostic tools and a flip-risk score to assess robustness.

Contribution

The paper identifies and quantifies semantic label-flip failures in segmentation under correlation shift, and proposes a Flip diagnostic and a flip-risk score for robustness evaluation.

Findings

01

Strengthening the spurious correlation in the training data increases label-flip errors at test time, when that correlation no longer holds.

02

The flip-risk score can effectively flag flip-prone cases at inference.

03

Decomposing segmentation errors exposes label-flip failures that traditional overlap metrics alone can hide.

Abstract

The robustness of machine learning models can be compromised by spurious correlations between non-causal features in the input data and target labels. A common way to test for such correlations is to train on data where the label is strongly tied to some non-causal cue, then evaluate on examples where that tie no longer holds. This idea is well established for classification tasks, but for semantic segmentation the specific failure modes are not well understood. We show that a model may achieve reasonable overlap while assigning the wrong semantic label, swapping one plausible foreground class for another, even when object boundaries are largely correct. We focus on this semantic label-flip behaviour and quantify it with a simple diagnostic (Flip) that counts how often ground truth foreground pixels are assigned the wrong foreground identity while remaining predicted as foreground. In a…
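The Flip diagnostic described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `flip_stats`, the convention that label `0` is background, and the choice to normalise by the number of ground-truth foreground pixels are all assumptions made for illustration.

```python
import numpy as np


def flip_stats(gt, pred, background=0):
    """Count label flips between two integer label maps (illustrative sketch).

    A pixel counts as flipped when it is foreground in the ground truth,
    is still predicted as foreground, but carries a different foreground class.
    The background index and the normalisation are assumptions, not taken
    from the paper.
    """
    gt = np.asarray(gt)
    pred = np.asarray(pred)
    if gt.shape != pred.shape:
        raise ValueError("gt and pred must have the same shape")

    gt_fg = gt != background        # ground-truth foreground pixels
    pred_fg = pred != background    # pixels predicted as any foreground class
    flipped = gt_fg & pred_fg & (gt != pred)

    n_fg = int(gt_fg.sum())
    n_flip = int(flipped.sum())
    # Rate is undefined when the ground truth has no foreground.
    rate = n_flip / n_fg if n_fg > 0 else float("nan")
    return n_flip, rate


# Toy example: four foreground pixels, one flipped (1 -> 2), one missed (2 -> 0).
gt = np.array([[0, 1, 1],
               [0, 2, 2]])
pred = np.array([[0, 2, 1],
                 [0, 2, 0]])
count, rate = flip_stats(gt, pred)
print(count, rate)
```

In the toy example, the pixel predicted as background is a missed detection rather than a flip, which is the distinction the diagnostic is meant to isolate from ordinary overlap errors.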

Code & Models

Repositories

acharaakshit/label-flips (GitHub)
