Mask-aware inference with State-Space Models
Ignasi Mas, Ramon Morros, Javier Ruiz-Hidalgo, Ivan Huerta

TL;DR
This paper introduces Partial Vision Mamba, a new architecture that integrates mask-aware inference into State Space Models, improving handling of missing data in vision tasks like depth completion and inpainting.
Contribution
We propose PVM, a novel component that adapts partial operations to Mamba, enabling effective mask-aware inference in SSMs for various vision tasks.
Findings
PVM improves depth completion accuracy when the input contains invalid data.
The approach generalizes well across inpainting and classification tasks.
State Space Models can effectively handle arbitrary missing data with PVM.
Abstract
Many real-world computer vision tasks, such as depth completion, must handle inputs with arbitrarily shaped regions of missing or invalid data. For Convolutional Neural Networks (CNNs), Partial Convolutions address this with a mask-aware re-normalization conditioned only on valid pixels. Recently, State Space Models (SSMs) like Mamba have emerged, offering high performance with linear complexity. However, these architectures lack an inherent mechanism for handling such arbitrarily shaped invalid data at inference time. To bridge this gap, we introduce Partial Vision Mamba (PVM), a novel architectural component that ports the principles of partial operations to the Mamba backbone. We also define a series of rules to design architectures using PVM. We show the efficacy and generalizability of our approach in the tasks of depth completion, image inpainting, and classification with invalid…
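The mask-aware re-normalization that the abstract attributes to Partial Convolutions can be illustrated with a minimal pure-Python sketch. It is not the PVM component and not the authors' code. It only shows the CNN-side idea that PVM is said to port: sum over valid pixels only, rescale by the ratio of window size to valid-pixel count, and mark an output location as valid if its window saw at least one valid pixel. The function name `partial_conv2d`, the list-of-lists layout, and the single-channel, stride-1, no-padding setting are assumptions made for illustration.

```python
def partial_conv2d(image, mask, kernel, bias=0.0):
    """Single-channel partial convolution (stride 1, no padding).

    image, mask: H x W lists of numbers; mask is 1 for valid, 0 for invalid.
    kernel: kH x kW list of weights.
    Returns (output, updated_mask), both (H-kH+1) x (W-kW+1) lists.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    window_size = kh * kw
    out, new_mask = [], []
    for i in range(h - kh + 1):
        out_row, mask_row = [], []
        for j in range(w - kw + 1):
            acc = 0.0   # weighted sum over valid pixels only
            valid = 0   # number of valid pixels in this window
            for di in range(kh):
                for dj in range(kw):
                    m = mask[i + di][j + dj]
                    acc += kernel[di][dj] * image[i + di][j + dj] * m
                    valid += m
            if valid > 0:
                # Re-normalize so the response does not shrink with the
                # number of missing pixels, then add the bias.
                out_row.append(acc * window_size / valid + bias)
                mask_row.append(1)
            else:
                # No valid input in the window: output stays invalid.
                out_row.append(0.0)
                mask_row.append(0)
        out.append(out_row)
        new_mask.append(mask_row)
    return out, new_mask


# A constant image of 2.0 with two corrupted pixels (value 999, mask 0).
image = [[2.0, 2.0, 999.0],
         [2.0, 999.0, 2.0],
         [2.0, 2.0, 2.0]]
mask = [[1, 1, 0],
        [1, 0, 1],
        [1, 1, 1]]
mean_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]

output, updated_mask = partial_conv2d(image, mask, mean_kernel)
```

With an averaging kernel, the corrupted values are ignored and the re-normalized response stays close to the true constant of 2.0, whereas an ordinary convolution over the same input would be dominated by the two invalid entries.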
Taxonomy
Topics: Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging
