BLO-Inst: Bi-Level Optimization Based Alignment of YOLO and SAM for Robust Instance Segmentation
Li Zhang, Pengtao Xie

TL;DR
BLO-Inst introduces a bi-level optimization framework to align object detection with segmentation objectives, enabling more effective and automated instance segmentation by transforming detectors into segmentation-aware prompt generators.
Contribution
It proposes a novel bi-level optimization approach that jointly aligns detection and segmentation tasks, addressing objective mismatch and overfitting issues in existing pipelines.
Findings
Outperforms standard baselines in general and biomedical domains
Improves segmentation accuracy by aligning detection with segmentation objectives
Demonstrates robustness across diverse datasets
Abstract
The Segment Anything Model (SAM) has revolutionized image segmentation with its zero-shot capabilities, yet its reliance on manual prompts hinders fully automated deployment. While integrating object detectors as prompt generators offers a pathway to automation, existing pipelines suffer from two fundamental limitations: objective mismatch, where detectors optimized for geometric localization do not produce the optimal prompting context required by SAM, and alignment overfitting in standard joint training, where the detector simply memorizes specific prompt adjustments for training samples rather than learning a generalizable policy. To bridge this gap, we introduce BLO-Inst, a unified framework that aligns detection and segmentation objectives via bi-level optimization. We formulate the alignment as a nested optimization problem over disjoint data splits. In the lower level, the SAM is…
Taxonomy
Topics: Advanced Neural Network Applications · Medical Image Segmentation Techniques · Domain Adaptation and Few-Shot Learning
