MGHF: Multi-Granular High-Frequency Perceptual Loss for Image Super-Resolution
Shoaib Meraj Sami, Md Mahedi Hasan, Mohammad Saeed Ebrahimi Saadabadi, Jeremy Dawson, Nasser Nasrabadi, Raghuveer Rao

TL;DR
This paper introduces MGHF, a novel multi-granular high-frequency perceptual loss framework based on invertible neural networks, which enhances image super-resolution by preserving detailed textures, styles, and content across multiple perspectives.
Contribution
The paper proposes an INN-based multi-granular perceptual loss framework with adaptive feature reweighting and comprehensive constraints for improved super-resolution quality.
Findings
Significant performance improvements across various super-resolution methods.
Effective preservation of textures, styles, and content details.
Enhanced local detail retention using modulated PatchNCE.
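For readers unfamiliar with PatchNCE, the following is a minimal sketch of the standard (unmodulated) InfoNCE objective that PatchNCE applies to patch features. It is not the paper's modulated variant, whose details are not given here. The function names, the toy feature vectors, and the temperature `tau=0.07` are illustrative assumptions; the pairing of a super-resolved patch (query) with the same-location ground-truth patch (positive) is likewise an assumption about how such a loss would be used in super-resolution.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    # L2-normalize so that dot products become cosine similarities.
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def info_nce(query, positive, negatives, tau=0.07):
    """Cross-entropy over similarities, with the positive as the target class.

    query:     feature of one patch (assumed: from the super-resolved image)
    positive:  feature of the matching patch (assumed: same location, ground truth)
    negatives: features of non-matching patches (assumed: other locations)
    tau:       temperature; 0.07 is an assumed value, not taken from the paper
    """
    q = normalize(query)
    logits = [dot(q, normalize(positive)) / tau]
    logits += [dot(q, normalize(n)) / tau for n in negatives]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]

# Toy example: the loss is small when the query matches its positive
# and large when the "positive" is actually dissimilar.
loss_good = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
loss_bad = info_nce([1.0, 0.0], [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(loss_good < loss_bad)
```

Minimizing this quantity pulls each patch feature toward its matching patch and away from the others, which is why it is a natural fit for local detail retention.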
Abstract
While different variants of perceptual losses have been employed in super-resolution literature to synthesize more realistic, appealing, and detailed high-resolution images, most are convolutional neural network (CNN)-based, causing information loss during guidance and often relying on complicated architectures and training procedures. We propose an invertible neural network (INN)-based naive Multi-Granular High-Frequency (MGHF-n) perceptual loss trained on ImageNet to overcome these issues. Furthermore, we develop a comprehensive framework (MGHF-c) with several constraints to preserve, prioritize, and regularize information across multiple perspectives: texture and style preservation, content preservation, regional detail preservation, and joint content-style regularization. Information is prioritized through adaptive entropy-based pruning and reweighting…
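To make the "invertible, hence no information loss" premise concrete, here is a minimal sketch of an additive coupling layer, a common INN building block, together with a feature-space L1 distance computed through it. This is not the authors' architecture or loss: the `shift` function is a hypothetical stand-in for a learned sub-network, and the four-element lists stand in for image features.

```python
import math

def shift(x1):
    # Hypothetical stand-in for a learned sub-network t(.).
    # It does not need to be invertible itself.
    return [math.tanh(3.0 * v) + 0.5 * v * v for v in x1]

def coupling_forward(x):
    """Additive coupling: y1 = x1, y2 = x2 + t(x1). Assumes even length."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    y2 = [b + s for b, s in zip(x2, shift(x1))]
    return x1 + y2

def coupling_inverse(y):
    """Inverse: x1 = y1, x2 = y2 - t(y1). The input is recovered from the features."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    x2 = [b - s for b, s in zip(y2, shift(y1))]
    return y1 + x2

def feature_l1(sr, hr):
    """Mean absolute difference between coupling-layer features of two inputs."""
    f_sr, f_hr = coupling_forward(sr), coupling_forward(hr)
    return sum(abs(a - b) for a, b in zip(f_sr, f_hr)) / len(f_sr)

x = [0.2, -1.3, 0.7, 2.5]
x_rec = coupling_inverse(coupling_forward(x))
# Reconstruction error is zero up to floating-point rounding.
max_err = max(abs(a - b) for a, b in zip(x, x_rec))
print(max_err < 1e-12)
```

Because the mapping can be undone, two distinct inputs can never share the same features. A CNN feature extractor with pooling or strided layers offers no such guarantee, which is the contrast the abstract draws.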
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
- Utilizing INNs to improve conventional perceptual loss is interesting, and to the best of my knowledge, it is also novel.
- MGHF often outperforms baseline methods in terms of quantitative evaluation.
My primary concern is about the fundamentals of this work, considering the **perception-distortion trade-off** and **information preservation with INNs**. Please counter-argue and provide corresponding experimental results if necessary.

---

**Weakness 1** I appreciate the authors' effort. However, I have doubts about the fundamentals of this work. The authors propose to use an INN as a tool to preserve all information (theoretically proven); thereby "addressing the perception distortion trad
1. The design of MGHF-n and MGHF-c is technically sound and creative, combining INN-based feature extraction with entropy-based pruning, adaptive weighting, content-style consistency, and contrastive local information preservation (PatchNCE). The hierarchical integration of these components is well-motivated.
2. The experiments cover a wide range of SR models (GANs, diffusion, transformers) and datasets (RealSR, DrealSR, DIV2K, etc.), using both reference (PSNR, SSIM, LPIPS) and non-reference me
1. The writing is often dense and notation-heavy, making it difficult to follow, especially in Section 2 and the appendix. Key concepts (e.g., AWDFE, LIP) could be explained more clearly. The structure of the DFE and the exact role of each loss component could be better modularized and summarized.
2. It is unclear whether the enhanced baselines (e.g., OSEDiff+MGHF-c) are trained from scratch or fine-tuned from pre-trained models. Training details (e.g., dataset splits, optimization settings) are
The paper is quite dense and contains a substantial amount of material.
I found the way the paper was presented to be very confusing and unnecessarily complex, making it difficult to understand what was going on. For example, why are the information loss and unwanted harmonics introduced by CNNs a problem? How do they affect the results? An invertible neural network (INN) is lossless by definition, so why is it useful to include the very complex theorems? Why is it necessary to introduce diffeomorphism here? I'm not an expert on diffeomorphisms, and this paper is very
Taxonomy
Topics: Advanced Image Processing Techniques · Image and Signal Denoising Methods · Image and Video Quality Assessment
