SoftPQ: Robust Instance Segmentation Evaluation via Soft Matching and Tunable Thresholds
Ranit Karmakar, Simon F. Nørrelykke

TL;DR
SoftPQ is a novel, flexible instance segmentation evaluation metric that uses soft matching and tunable thresholds to provide more nuanced, robust, and informative assessments of segmentation quality.
Contribution
We introduce SoftPQ, a new metric that replaces binary matching with graded evaluation using tunable thresholds and sublinear penalties, improving robustness and feedback for segmentation models.
Findings
SoftPQ exhibits smoother score behavior than traditional metrics.
It is more robust to structural segmentation errors.
SoftPQ captures meaningful quality differences overlooked by existing metrics.
Abstract
Segmentation evaluation metrics traditionally rely on binary decision logic: predictions are either correct or incorrect, based on rigid IoU thresholds. Detection-based metrics such as F1 and mAP determine correctness at the object level using fixed overlap cutoffs, while overlap-based metrics like Intersection over Union (IoU) and Dice operate at the pixel level, often overlooking instance-level structure. Panoptic Quality (PQ) attempts to unify detection and segmentation assessment, but it remains dependent on hard-threshold matching, treating predictions below the threshold as entirely incorrect. This binary framing obscures important distinctions between qualitatively different errors and fails to reward gradual model improvements. We propose SoftPQ, a flexible and interpretable instance segmentation metric that redefines evaluation as a graded continuum rather than a binary…
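To make the hard-threshold behavior described in the abstract concrete, here is a minimal sketch of standard Panoptic Quality, computed from a precomputed IoU matrix. It is not SoftPQ, whose formula is not given in this excerpt. The function name `panoptic_quality`, the matrix layout, and the example values are illustrative assumptions. The sketch uses the usual PQ definition: a ground-truth/prediction pair counts as a match only when its IoU exceeds 0.5, and PQ is the sum of matched IoUs divided by TP + 0.5 FP + 0.5 FN.

```python
def panoptic_quality(iou, threshold=0.5):
    """Standard PQ from an IoU matrix (illustrative helper, not SoftPQ).

    iou[i][j] is the IoU between ground-truth instance i and prediction j.
    Assumes non-overlapping instances, so with threshold >= 0.5 each
    instance has at most one partner above the threshold.
    """
    if threshold < 0.5:
        raise ValueError("unique matching is only guaranteed for threshold >= 0.5")

    n_gt = len(iou)
    n_pred = len(iou[0]) if iou else 0

    # Hard matching: a pair either counts fully or not at all.
    matched = [v for row in iou for v in row if v > threshold]

    tp = len(matched)
    fp = n_pred - tp   # predictions with no matched ground truth
    fn = n_gt - tp     # ground-truth instances with no matched prediction

    denom = tp + 0.5 * fp + 0.5 * fn
    return sum(matched) / denom if denom else 0.0


# Two ground-truth objects, two predictions. Only the second pair differs,
# and only by 0.02 IoU, but it falls on opposite sides of the threshold.
just_above = [[0.9, 0.0],
              [0.0, 0.51]]
just_below = [[0.9, 0.0],
              [0.0, 0.49]]

pq_above = panoptic_quality(just_above)  # about 0.705
pq_below = panoptic_quality(just_below)  # 0.45
```

In this toy case a 0.02 change in overlap moves one pair from a true positive to a false positive plus a false negative, and the score drops from roughly 0.705 to 0.45. That discontinuity is the kind of binary behavior the abstract says SoftPQ is designed to soften.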
Taxonomy
Topics: Industrial Vision Systems and Defect Detection
