Instruct2See: Learning to Remove Any Obstructions Across Distributions

Junhang Li; Yu Guo; Chuhua Xian; Shengfeng He

arXiv:2505.17649 · cs.CV · May 26, 2025

TL;DR

Instruct2See is a zero-shot framework that removes both seen and unseen obstructions from images by casting removal as a soft-hard mask restoration task, guided by multi-modal prompts and a tunable mask adapter.

Contribution

The paper introduces a unified approach to obstruction removal that handles both seen and unseen obstacles using multi-modal prompts and a tunable mask adapter.

Findings

01. Achieves strong generalization on out-of-distribution obstacles

02. Performs well on both in-distribution and out-of-distribution data

03. Demonstrates effective real-time mask adjustment capabilities

Abstract

Images are often obstructed by various obstacles due to capture limitations, hindering the observation of objects of interest. Most existing methods address occlusions from specific elements like fences or raindrops, but are constrained by the wide range of real-world obstructions, making comprehensive data collection impractical. To overcome these challenges, we propose Instruct2See, a novel zero-shot framework capable of handling both seen and unseen obstacles. The core idea of our approach is to unify obstruction removal by treating it as a soft-hard mask restoration problem, where any obstruction can be represented using multi-modal prompts, such as visual semantics and textual instructions, processed through a cross-attention unit to enhance contextual understanding and improve mode control. Additionally, a tunable mask adapter allows for dynamic soft masking, enabling real-time…
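
To make the soft-hard mask idea concrete, the sketch below contrasts a binary (hard) obstruction mask with a softened one and shows how either can be used to blend a restored image back into the original. It is only an illustration under stated assumptions: the paper's mask adapter is a tunable component of the model, whereas here a fixed 3x3 box blur stands in for it, and the names `box_blur`, `soften_mask`, `composite`, and the `softness` parameter are hypothetical and do not come from the paper.

```python
from typing import List

Grid = List[List[float]]  # a single-channel image or mask, values in [0, 1]


def box_blur(mask: Grid) -> Grid:
    """Apply a 3x3 mean filter, clamping indices at the borders."""
    h, w = len(mask), len(mask[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += mask[yy][xx]
            out[y][x] = total / 9.0
    return out


def soften_mask(hard_mask: Grid, softness: float) -> Grid:
    """Blend a binary mask with its blurred version.

    softness = 0.0 returns the hard mask unchanged;
    softness = 1.0 returns the fully blurred mask.
    (Assumption: a fixed blur stands in for a learned adapter.)
    """
    if not 0.0 <= softness <= 1.0:
        raise ValueError("softness must lie in [0, 1]")
    blurred = box_blur(hard_mask)
    return [
        [(1.0 - softness) * hv + softness * bv for hv, bv in zip(hrow, brow)]
        for hrow, brow in zip(hard_mask, blurred)
    ]


def composite(image: Grid, restored: Grid, mask: Grid) -> Grid:
    """Keep `image` where the mask is 0 and use `restored` where it is 1."""
    return [
        [(1.0 - m) * i + m * r for i, r, m in zip(irow, rrow, mrow)]
        for irow, rrow, mrow in zip(image, restored, mask)
    ]


if __name__ == "__main__":
    hard = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    soft = soften_mask(hard, softness=0.5)
    image = [[0.2] * 3 for _ in range(3)]
    restored = [[0.8] * 3 for _ in range(3)]
    for row in composite(image, restored, soft):
        print(row)
```

With a hard mask, only pixels flagged as obstruction are replaced. Raising `softness` lets the restored content bleed into neighbouring pixels, which is one plausible reason a soft mask would suit semi-transparent or fuzzy-edged obstructions better than a binary one.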

Videos

Instruct2See: Learning to Remove Any Obstructions Across Distributions · SlidesLive