TL;DR
BVI-Mamba is a video enhancement framework that uses a visual state-space model to enhance low-light and underwater videos efficiently, outperforming existing Transformer- and convolution-based methods.
Contribution
It introduces a framework based on the Visual State Space (VSS) model, with a feature alignment module and a UNet-like enhancement module, that reduces memory usage and computational time while improving video quality.
Findings
Outperforms Transformer- and convolution-based models in enhancement quality.
Reduces memory usage and computational time compared to existing methods.
Effective in both low-light and underwater environments.
Abstract
Videos captured in low-light and underwater conditions often suffer from distortions such as noise, low contrast, color imbalance, and blur. These issues not only limit visibility but also degrade automatic tasks like detection. Post-processing is typically required but can be time-consuming. AI-based tools for video enhancement also demand significantly more computational resources compared to image-based methods. This paper introduces a novel framework, Visual Mamba, designed to reduce memory usage and computational time by leveraging the Visual State Space (VSS) model. The framework consists of two modules: (i) a feature alignment module, where spatio-temporal displacement between input frames is registered in the feature space, and (ii) an enhancement module, where noise removal and brightness adjustment are performed using a UNet-like architecture, with all convolutional layers…
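The efficiency claim rests on the state-space formulation, which processes a sequence with a recurrence whose cost grows linearly with sequence length. The following is a minimal sketch of that generic idea only, not the paper's VSS block: it assumes a scalar hidden state, fixed hypothetical parameters `a`, `b`, `c`, and a single row-by-row scan of one frame. Mamba-style models typically use learned, input-dependent parameters and scan in several directions; the function names `ssm_scan` and `raster_scan` are illustrative.

```python
from typing import List


def raster_scan(frame: List[List[float]]) -> List[float]:
    """Flatten a 2D frame row by row into a 1D sequence of values."""
    return [pixel for row in frame for pixel in row]


def ssm_scan(x: List[float], a: float, b: float, c: float) -> List[float]:
    """Run a scalar discrete linear state-space recurrence over a sequence.

        h_t = a * h_{t-1} + b * x_t
        y_t = c * h_t

    Each step touches only the previous state, so time is O(len(x))
    and the working memory for the state is O(1).
    """
    h = 0.0  # initial hidden state (assumed zero)
    y = []
    for x_t in x:
        h = a * h + b * x_t
        y.append(c * h)
    return y


# Toy 2x2 "frame": a single bright pixel in the top-left corner.
frame = [[1.0, 0.0], [0.0, 0.0]]

# Hypothetical parameters chosen for illustration only.
out = ssm_scan(raster_scan(frame), a=0.5, b=1.0, c=1.0)
print(out)  # [1.0, 0.5, 0.25, 0.125]
```

The output shows the first pixel's influence decaying along the scan order, which is how a state-space layer carries context across positions without the pairwise comparisons that self-attention requires.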
