BREPS: Bounding-Box Robustness Evaluation of Promptable Segmentation

Andrey Moskalenko; Danil Kuznetsov; Irina Dudko; Anastasiia Iasakova; Nikita Boldyrev; Denis Shepelev; Andrei Spiridonov; Andrey Kuznetsov; Vlad Shakhuro

arXiv:2601.15123 · cs.CV · April 17, 2026

TL;DR

This paper evaluates how robust promptable segmentation models such as SAM are to natural variations in bounding box prompts. It introduces BREPS to generate adversarial prompts and benchmarks models across diverse datasets.

Contribution

The paper presents BREPS, a white-box optimization method for generating adversarial bounding boxes, and provides a robustness benchmark for promptable segmentation models.

Findings

01. SAM-like models are highly sensitive to natural prompt noise.

02. Adversarial bounding boxes can significantly alter segmentation quality.

03. Robustness varies across datasets and applications.
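To make the notion of sensitivity to prompt noise concrete, here is a minimal sketch of one way such a measurement might be set up. It is not the BREPS method and is not taken from the paper: the uniform edge jitter, the `scale` parameter, and the names `jitter_box`, `box_iou`, and `robustness_gap` are all illustrative assumptions, and `segment_quality` is a hypothetical callable standing in for "run a promptable model on this box and score the resulting mask".

```python
import random


def jitter_box(box, scale=0.1, rng=None):
    """Shift each edge of (x0, y0, x1, y1) by uniform noise proportional
    to the box size. Assumed noise model; keep scale below 0.5 so the
    perturbed box stays non-degenerate."""
    rng = rng or random.Random()
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    return (
        x0 + rng.uniform(-scale, scale) * w,
        y0 + rng.uniform(-scale, scale) * h,
        x1 + rng.uniform(-scale, scale) * w,
        y1 + rng.uniform(-scale, scale) * h,
    )


def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def robustness_gap(segment_quality, box, n=100, scale=0.1, seed=0):
    """Return (worst-case drop, mean drop) in quality over n jittered boxes.

    segment_quality is a hypothetical callable: box -> score in [0, 1],
    for example mask IoU against ground truth."""
    rng = random.Random(seed)
    base = segment_quality(box)
    scores = [segment_quality(jitter_box(box, scale, rng)) for _ in range(n)]
    return base - min(scores), base - sum(scores) / n


if __name__ == "__main__":
    gt = (10.0, 10.0, 50.0, 40.0)
    # Stand-in quality function: box IoU with the ground-truth box.
    worst, mean = robustness_gap(lambda b: box_iou(gt, b), gt)
    print(f"worst-case drop: {worst:.3f}, mean drop: {mean:.3f}")
```

The stand-in quality function in the usage example only compares boxes. In a real evaluation it would be replaced by a call to a segmentation model plus a mask metric. Random sampling like this probes typical noise only, whereas an adversarial search looks for the worst case directly.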

Abstract

Promptable segmentation models such as SAM have established a powerful paradigm, enabling strong generalization to unseen objects and domains with minimal user input, including points, bounding boxes, and text prompts. Among these, bounding boxes stand out as particularly effective, often outperforming points while significantly reducing annotation costs. However, current training and evaluation protocols typically rely on synthetic prompts generated through simple heuristics, offering limited insight into real-world robustness. In this paper, we investigate the robustness of promptable segmentation models to natural variations in bounding box prompts. First, we conduct a controlled user study and collect thousands of real bounding box annotations. Our analysis reveals substantial variability in segmentation quality across users for the same model and instance, indicating that SAM-like…

Code & Models

Repositories

emb-ai/BREPS (GitHub)

Videos

BREPS: Bounding-Box Robustness Evaluation of Promptable Segmentation