DFR: A Decompose-Fuse-Reconstruct Framework for Multi-Modal Few-Shot Segmentation
Shuai Chen, Fanman Meng, Xiwei Zhang, Haoran Wei, Chenhao Wu, Qingbo Wu, Hongliang Li

TL;DR
This paper introduces DFR, a framework that combines visual, textual, and audio guidance for multi-modal few-shot segmentation. It follows a three-stage strategy (decompose, fuse, reconstruct) and reports improved performance over existing methods.
Contribution
The paper proposes a multi-modal framework that integrates three modalities (visual, textual, and audio) using hierarchical decomposition, contrastive fusion, and dual-path reconstruction.
Findings
DFR outperforms existing methods on multiple benchmarks.
The hierarchical decomposition improves semantic extraction.
Contrastive fusion maintains cross-modal consistency.
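The cross-modal consistency idea behind contrastive fusion can be illustrated with a small sketch. The summary does not say which loss DFR uses, so the symmetric InfoNCE objective below is an assumption, chosen because it is a common way to align paired embeddings from two modalities. The function names, the temperature value, and the toy embeddings are all hypothetical and are not taken from the paper.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce(anchors, positives, temperature=0.07):
    """Mean InfoNCE loss: anchors[i] should match positives[i] and no other row.

    This is a generic contrastive loss, assumed here for illustration;
    it is not confirmed to be the loss used by DFR.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Similarity of anchor i to every candidate in the other modality.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching pair as the correct class.
        total += log_denom - logits[i]
    return total / len(a)

def symmetric_contrastive_loss(mod_a, mod_b, temperature=0.07):
    """Average the loss in both directions (a -> b and b -> a)."""
    return 0.5 * (info_nce(mod_a, mod_b, temperature)
                  + info_nce(mod_b, mod_a, temperature))

# Toy 2-D embeddings (hypothetical): two classes seen through two modalities.
visual = [[1.0, 0.0], [0.0, 1.0]]
text_aligned = [[0.9, 0.1], [0.1, 0.9]]   # each text row matches its visual row
text_swapped = [[0.1, 0.9], [0.9, 0.1]]   # pairs deliberately mismatched

loss_aligned = symmetric_contrastive_loss(visual, text_aligned)
loss_swapped = symmetric_contrastive_loss(visual, text_swapped)
print(loss_aligned < loss_swapped)
```

Correctly paired embeddings give a lower loss than mismatched ones, which is the property a contrastive fusion stage would rely on to keep modalities consistent.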
Abstract
This paper presents DFR (Decompose, Fuse and Reconstruct), a novel framework that addresses the fundamental challenge of effectively utilizing multi-modal guidance in few-shot segmentation (FSS). While existing approaches primarily rely on visual support samples or textual descriptions, their single or dual-modal paradigms limit exploitation of rich perceptual information available in real-world scenarios. To overcome this limitation, the proposed approach leverages the Segment Anything Model (SAM) to systematically integrate visual, textual, and audio modalities for enhanced semantic understanding. The DFR framework introduces three key innovations: 1) Multi-modal Decompose: a hierarchical decomposition scheme that extracts visual region proposals via SAM, expands textual semantics into fine-grained descriptors, and processes audio features for contextual enrichment; 2) Multi-modal…
Taxonomy
Topics: Advanced Neural Network Applications · Medical Imaging Techniques and Applications · Anomaly Detection Techniques and Applications
