Hi-Mamba: Hierarchical Mamba for Efficient Image Super-Resolution
Junbo Qiao, Jincheng Liao, Wei Li, Yulun Zhang, Yong Guo, Yi Wen, Zhangxizi Qiu, Jiao Xie, Jie Hu, Shaohui Lin

TL;DR
Hi-Mamba introduces a hierarchical network for image super-resolution that efficiently models spatial dependencies with single-direction scanning, outperforming existing methods in accuracy and computational efficiency.
Contribution
The paper proposes a novel Hierarchical Mamba network combining local and region SSMs with directional alternation to improve efficiency and performance in image super-resolution.
Findings
Achieves 0.29 dB PSNR improvement on Manga109 for 3x SR.
Demonstrates superior performance across five benchmark datasets.
Reduces computational overhead compared to multi-direction scanning methods.
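For readers unfamiliar with the metric behind the dB figure above, here is a minimal NumPy sketch of PSNR. The function names `psnr` and `mse_ratio_for_gain` are illustrative and not from the paper. SR benchmarks commonly evaluate on the luminance channel after cropping borders; this sketch assumes plain arrays and skips those steps.

```python
import numpy as np

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    ref = np.asarray(reference, dtype=np.float64)
    out = np.asarray(restored, dtype=np.float64)
    mse = np.mean((ref - out) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)

# Hypothetical data: a flat image and a copy that is off by one gray level.
reference = np.zeros((4, 4))
restored = np.ones((4, 4))
print(psnr(reference, restored))   # MSE is 1, so this is 20 * log10(255)
print(mse_ratio_for_gain(0.29))    # about 0.935
```

Because PSNR is logarithmic in the mean squared error, a 0.29 dB gain corresponds to roughly 6.5% lower MSE at the same peak value.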
Abstract
State Space Models (SSM), such as Mamba, have shown strong representation ability in modeling long-range dependency with linear complexity, achieving successful applications from high-level to low-level vision tasks. However, SSM's sequential nature necessitates multiple scans in different directions to compensate for the loss of spatial dependency when unfolding the image into a 1D sequence. This multi-direction scanning strategy significantly increases the computation overhead and is unbearable for high-resolution image processing. To address this problem, we propose a novel Hierarchical Mamba network, namely, Hi-Mamba, for image super-resolution (SR). Hi-Mamba consists of two key designs: (1) The Hierarchical Mamba Block (HMB) assembled by a Local SSM (L-SSM) and a Region SSM (R-SSM) both with the single-direction scanning, aggregates multi-scale representations to enhance the…
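The scanning overhead described in the abstract can be illustrated with a small NumPy sketch. It only shows how a feature map is unfolded into 1D sequences and how many tokens a sequence model would then have to process. It is not the authors' implementation; all function names are hypothetical, and the four-direction layout (horizontal, vertical, and their reverses) is an assumption about what "multi-direction scanning" typically means.

```python
import numpy as np

def scan_row_major(feat):
    """Unfold an (H, W, C) feature map into an (H*W, C) sequence, row by row."""
    h, w, c = feat.shape
    return feat.reshape(h * w, c)

def scan_column_major(feat):
    """Unfold an (H, W, C) feature map column by column."""
    h, w, c = feat.shape
    return feat.transpose(1, 0, 2).reshape(h * w, c)

def multi_direction_scans(feat):
    """Assumed four-direction scheme: rows, reversed rows, columns, reversed columns."""
    rows = scan_row_major(feat)
    cols = scan_column_major(feat)
    return [rows, rows[::-1], cols, cols[::-1]]

def tokens_processed(scans):
    """Total number of tokens the sequence model must consume."""
    return sum(len(s) for s in scans)

# Hypothetical 2x3 single-channel feature map with values 0..5.
feat = np.arange(6).reshape(2, 3, 1)
single = [scan_row_major(feat)]
multi = multi_direction_scans(feat)

print(scan_row_major(feat).ravel())     # [0 1 2 3 4 5]
print(scan_column_major(feat).ravel())  # [0 3 1 4 2 5]
print(tokens_processed(single), tokens_processed(multi))  # 6 24
```

In this sketch a single-direction scan processes each pixel once, while the four-direction variant processes it four times, which is the kind of overhead that grows with image resolution.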
Peer Reviews
Decision: ICLR 2025 Conference Withdrawn Submission
1. Various techniques are proposed to improve the modeling ability of Mamba-based SR models, with comprehensive ablation studies validating their effectiveness. 2. Hi-Mamba-L strikes a better balance between performance and computational cost than the previous Mamba-based method MambaIR.
1. The results of SRFormer are presented in Table 3; however, its model complexity and GPU latency results are not included in Table 4 and Figure 5. 2. Most citations that should be in parentheses are not, which hurts readability. 3. The novelty seems limited.
1. The paper is well-structured and easy to follow. The figures provide clear structure information about different key components of the proposed method. Moreover, the figures are clean and well-organized, with appropriate color schemes that enhance clarity and readability. 2. The visual comparisons on the Urban100 benchmark demonstrate the effectiveness of the proposed method. Hi-Mamba restores better local textures than competitive methods, while maintaining higher LAM attributions and area
1. The motivation for this paper largely draws from previous works, such as MambaIR [1], LocalMamba [2], VMamba [3], and MSVMamba [4], with limited innovation. First, the introduction section does not adequately elaborate on the paper’s motivation. For instance, the reasons why Local SSM and Region SSM enhance modeling capabilities are not sufficiently explained. Additionally, using Local SSM and multi-scale features is not novel within Mamba-related studies (e.g., LocalMamba and MSVMamba). 2.
Dual feature extraction capability: By combining the L-SSM and R-SSM modules, Hi-Mamba can extract local and regional features of the image at the same time, focusing on small-scale and large-scale contextual relationships respectively, thereby enhancing the expression ability of features. Superior performance and computational efficiency: Experimental results show that Hi-Mamba performs well on multiple benchmark data sets. For example, in the ×2 SR task of the Urban100 data set, Hi-Mamba-S ca
1. The proposed Hi-Mamba-T and Hi-Mamba-S variants do not show significant differences in PSNR and SSIM evaluation metrics compared to Hi-Mamba-T, aside from the differences in parameters and FLOPs. The advantages and roles of the two variants are not clearly stated. 2. As the model variants Hi-Mamba-T, Hi-Mamba-S, and Hi-Mamba-L increase in depth and width, their parameters and FLOPs also increase. When using more complex models, the optimization performance in terms of computation may not be e
1. The Hi-Mamba architecture introduces a hierarchical structure and a new directional alternation technique, which is a creative approach to enhancing Mamba's performance in SR tasks. 2. The paper addresses the high computational cost that multi-directional scanning imposes on SSM-based SR models and provides an alternative that maintains performance with less computational demand.
1. While the proposed changes are effective, they appear as relatively incremental modifications over existing SSM-based models, particularly MambaIR. The hierarchical structure and directional alternation are valuable but do not constitute a fundamentally new approach, which may limit the overall impact. 2. The performance improvement of Hi-Mamba compared to other state-of-the-art models, such as MambaIR and Transformer-based SR models, is relatively small. While Hi-Mamba demonstrates slight g
Taxonomy
Topics: Advanced Image Processing Techniques · Advanced Vision and Imaging · Medical Image Segmentation Techniques
Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces
