BLO-Inst: Bi-Level Optimization Based Alignment of YOLO and SAM for Robust Instance Segmentation

Li Zhang; Pengtao Xie

arXiv:2601.22061 · cs.CV · January 30, 2026

Open Access

TL;DR

BLO-Inst introduces a bi-level optimization framework to align object detection with segmentation objectives, enabling more effective and automated instance segmentation by transforming detectors into segmentation-aware prompt generators.

Contribution

The paper proposes a bi-level optimization approach that aligns the detector's objective with the segmentation objective, addressing the objective mismatch and alignment overfitting found in existing detector-plus-SAM pipelines.

Findings

01. Outperforms standard baselines in general and biomedical domains

02. Improves segmentation accuracy by aligning detection with segmentation objectives

03. Demonstrates robustness across diverse datasets

Abstract

The Segment Anything Model (SAM) has revolutionized image segmentation with its zero-shot capabilities, yet its reliance on manual prompts hinders fully automated deployment. While integrating object detectors as prompt generators offers a pathway to automation, existing pipelines suffer from two fundamental limitations: objective mismatch, where detectors optimized for geometric localization do not correspond to the optimal prompting context required by SAM, and alignment overfitting in standard joint training, where the detector simply memorizes specific prompt adjustments for training samples rather than learning a generalizable policy. To bridge this gap, we introduce BLO-Inst, a unified framework that aligns detection and segmentation objectives via bi-level optimization. We formulate the alignment as a nested optimization problem over disjoint data splits. In the lower level, the SAM is…
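The idea of a nested optimization over disjoint data splits can be illustrated with a toy sketch. It is not the authors' method: the abstract is truncated before it says what each level updates, so the assignment below (a stand-in "segmenter" weight `w` updated on split A in the lower level, a stand-in "prompt generator" offset `theta` updated on split B in the upper level) is an assumption. The scalar model, the names `loss_and_grads` and `alternate_bilevel`, and the alternating first-order update (which replaces a true hypergradient) are all hypothetical choices made for illustration.

```python
import random


def loss_and_grads(w, theta, data):
    """Mean squared error of the toy pipeline y_hat = w * (x + theta), with gradients."""
    n = len(data)
    loss = grad_w = grad_theta = 0.0
    for x, y in data:
        prompt = x + theta            # toy "prompt generator": shifts the input
        err = w * prompt - y          # toy "segmenter": scales the prompt
        loss += err * err / n
        grad_w += 2.0 * err * prompt / n
        grad_theta += 2.0 * err * w / n
    return loss, grad_w, grad_theta


def alternate_bilevel(split_a, split_b, steps=500, lr=0.05):
    """Alternating first-order approximation of a bi-level problem (assumed setup)."""
    w, theta = 1.0, 0.0
    for _ in range(steps):
        # Lower level (assumption): fit w on split A with theta held fixed.
        _, grad_w, _ = loss_and_grads(w, theta, split_a)
        w -= lr * grad_w
        # Upper level (assumption): fit theta on the disjoint split B with w held fixed.
        _, _, grad_theta = loss_and_grads(w, theta, split_b)
        theta -= lr * grad_theta
    return w, theta


# Synthetic, noise-free data generated with w = 2 and theta = 0.5.
rng = random.Random(0)
points = [(x, 2.0 * (x + 0.5)) for x in (rng.uniform(-1.0, 1.0) for _ in range(40))]
split_a, split_b = points[:20], points[20:]   # disjoint splits for the two levels

initial_loss, _, _ = loss_and_grads(1.0, 0.0, split_b)
w, theta = alternate_bilevel(split_a, split_b)
final_loss, _, _ = loss_and_grads(w, theta, split_b)
print(f"split-B loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

Because `theta` is only ever scored on data that `w` was not fitted to, it cannot reduce its loss by memorizing adjustments for the samples `w` trained on, which is the intuition behind using disjoint splits against alignment overfitting.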

Taxonomy

Topics: Advanced Neural Network Applications · Medical Image Segmentation Techniques · Domain Adaptation and Few-Shot Learning