Mamba-based Light Field Super-Resolution with Efficient Subspace Scanning
Ruisheng Gao, Zeyu Xiao, Zhiwei Xiong

TL;DR
This paper introduces MLFSR, a Mamba-based light field super-resolution method whose efficient subspace scanning strategy models long-range spatial-angular correlations, targeting accuracy comparable to Transformer-based methods with faster inference and lower memory use.
Contribution
The paper proposes a Mamba-based light field super-resolution framework built on a subspace scanning strategy, together with new modules designed around that strategy, improving efficiency while keeping accuracy competitive.
Findings
Outperforms CNN-based models in accuracy.
Rivals Transformer-based methods in performance.
Achieves faster inference and lower memory usage than Transformer-based methods.
Abstract
Transformer-based methods have demonstrated impressive performance in 4D light field (LF) super-resolution by effectively modeling long-range spatial-angular correlations, but their quadratic complexity hinders the efficient processing of high-resolution 4D inputs, resulting in slow inference and high memory cost. As a compromise, most prior work adopts a patch-based strategy, which fails to leverage the full information in the input LF. The recently proposed selective state-space model, Mamba, has gained popularity for its efficient long-range sequence modeling. In this paper, we propose a Mamba-based Light Field Super-Resolution method, named MLFSR, by designing an efficient subspace scanning strategy. Specifically, we tokenize 4D LFs into subspace sequences and conduct bi-directional scanning on each subspace. Based on our scanning strategy, we then design the…
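The tokenization and bi-directional scanning described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes an LF tensor laid out as (U, V, H, W, C), assumes two illustrative subspaces (spatial and angular) since the truncated abstract does not list the ones MLFSR uses, and substitutes a causal running mean for the Mamba selective scan. All function names are hypothetical.

```python
import numpy as np


def spatial_tokens(lf):
    """Flatten each view into one spatial sequence.

    lf: (U, V, H, W, C) -> (U*V, H*W, C). Assumed layout, not from the paper.
    """
    U, V, H, W, C = lf.shape
    return lf.reshape(U * V, H * W, C)


def angular_tokens(lf):
    """Flatten each pixel location into one angular sequence.

    lf: (U, V, H, W, C) -> (H*W, U*V, C).
    """
    U, V, H, W, C = lf.shape
    return lf.transpose(2, 3, 0, 1, 4).reshape(H * W, U * V, C)


def causal_mean(seqs):
    """Stand-in for a selective state-space scan (assumption).

    Each output token is the mean of all tokens up to and including it,
    so information flows in one direction only, as in a causal scan.
    """
    csum = np.cumsum(seqs, axis=1)
    counts = np.arange(1, seqs.shape[1] + 1).reshape(1, -1, 1)
    return csum / counts


def bidirectional_scan(seqs, scan_fn):
    """Run scan_fn forward and on the reversed sequence, then sum.

    The backward result is flipped back so both outputs are aligned
    token by token before they are combined.
    """
    forward = scan_fn(seqs)
    backward = scan_fn(seqs[:, ::-1])[:, ::-1]
    return forward + backward


if __name__ == "__main__":
    lf = np.zeros((5, 5, 32, 32, 8))      # 5x5 views, 32x32 pixels, 8 channels
    spa = spatial_tokens(lf)              # shape (25, 1024, 8)
    ang = angular_tokens(lf)              # shape (1024, 25, 8)
    out = bidirectional_scan(spa, causal_mean)
    print(spa.shape, ang.shape, out.shape)
```

Under these assumptions, a 5×5×32×32 LF yields 25,600 tokens if flattened into a single sequence, whereas each spatial subspace sequence has 1,024 tokens and each angular one has 25. Shorter sequences are the reason a subspace decomposition can reduce cost, although the actual savings in MLFSR depend on design details not shown in the truncated abstract.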
Taxonomy
Topics: Advanced Vision and Imaging · Optical Coherence Tomography Applications · Advanced Optical Sensing Technologies
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
