Mamba-based Light Field Super-Resolution with Efficient Subspace Scanning

Ruisheng Gao; Zeyu Xiao; Zhiwei Xiong

arXiv:2406.16083·eess.IV·June 25, 2024

Open Access

TL;DR

This paper introduces MLFSR, a light field super-resolution method that uses Mamba-based sequence modeling with an efficient subspace scanning strategy. It reaches accuracy competitive with Transformer-based methods while running faster and using less memory.

Contribution

The paper proposes a Mamba-based light field super-resolution framework built on a subspace scanning strategy and new modules designed around it, improving efficiency while maintaining competitive accuracy.

Findings

01. Outperforms CNN-based models in accuracy.

02. Rivals Transformer-based methods in performance.

03. Achieves faster inference and lower memory usage.

Abstract

Transformer-based methods have demonstrated impressive performance in 4D light field (LF) super-resolution by effectively modeling long-range spatial-angular correlations, but their quadratic complexity hinders efficient processing of high-resolution 4D inputs, resulting in slow inference and high memory cost. As a compromise, most prior work adopts a patch-based strategy, which fails to leverage the full information in the input LFs. The recently proposed selective state-space model, Mamba, has gained popularity for its efficient long-range sequence modeling. In this paper, we propose a Mamba-based Light Field Super-Resolution method, named MLFSR, by designing an efficient subspace scanning strategy. Specifically, we tokenize 4D LFs into subspace sequences and conduct bi-directional scanning on each subspace. Based on our scanning strategy, we then design the…
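The tokenize-then-scan idea in the abstract can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration, not the paper's implementation: the `(U, V, H, W, C)` array layout, the choice of a spatial and an angular subspace, the helper names, and above all `linear_scan`, which is a fixed-decay toy recurrence standing in for Mamba's selective state-space layer.

```python
import numpy as np

def spatial_tokens(lf):
    """(U, V, H, W, C) -> (U*V, H*W, C): one pixel sequence per view."""
    U, V, H, W, C = lf.shape
    return lf.reshape(U * V, H * W, C)

def angular_tokens(lf):
    """(U, V, H, W, C) -> (H*W, U*V, C): one view sequence per pixel."""
    U, V, H, W, C = lf.shape
    return lf.transpose(2, 3, 0, 1, 4).reshape(H * W, U * V, C)

def linear_scan(seq, decay=0.9):
    """Toy recurrence h_t = decay * h_{t-1} + x_t along axis 1.

    Hypothetical stand-in for a selective SSM: it costs O(L) in the
    sequence length L, but it has no learned or input-dependent parameters.
    """
    out = np.empty_like(seq)
    h = np.zeros_like(seq[:, 0])
    for t in range(seq.shape[1]):
        h = decay * h + seq[:, t]
        out[:, t] = h
    return out

def bidirectional_scan(seq, decay=0.9):
    """Scan forward and backward, then sum, so every token sees both sides."""
    fwd = linear_scan(seq, decay)
    bwd = linear_scan(seq[:, ::-1], decay)[:, ::-1]
    return fwd + bwd

# Toy light field: 5x5 views, 8x8 pixels, 4 feature channels (assumed sizes).
lf = np.random.default_rng(0).standard_normal((5, 5, 8, 8, 4))
spa = bidirectional_scan(spatial_tokens(lf))   # shape (25, 64, 4)
ang = bidirectional_scan(angular_tokens(lf))   # shape (64, 25, 4)
```

In this toy setting the longest sequence has 64 tokens, whereas flattening the whole light field would give a single sequence of 5 × 5 × 8 × 8 = 1,600 tokens. Scanning short subspace sequences is one plausible reading of why the strategy is called efficient.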

Taxonomy

Topics: Advanced Vision and Imaging · Optical Coherence Tomography Applications · Advanced Optical Sensing Technologies

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings