MSDS: Deep Structural Similarity with Multiscale Representation
Danling Kang, Xue-Hua Chen, Bin Liu, Keke Zhang, Weiling Chen, Tiesong Zhao

TL;DR
This paper introduces MSDS, a multiscale extension of DeepSSIM, which improves deep perceptual similarity modeling by explicitly incorporating spatial scale, leading to better alignment with human visual perception.
Contribution
The paper presents a minimal multiscale framework that isolates and demonstrates the importance of spatial scale in deep-feature-based perceptual similarity models.
Findings
MSDS outperforms single-scale DeepSSIM on benchmark datasets.
Incorporating multiscale information yields statistically significant improvements.
The approach introduces negligible additional complexity.
Abstract
Deep-feature-based perceptual similarity models have demonstrated strong alignment with human visual perception in Image Quality Assessment (IQA). However, most existing approaches operate at a single spatial scale, implicitly assuming that structural similarity at a fixed resolution is sufficient. The role of spatial scale in deep-feature similarity modeling thus remains insufficiently understood. In this letter, we isolate spatial scale as an independent factor using a minimal multiscale extension of DeepSSIM, referred to as Deep Structural Similarity with Multiscale Representation (MSDS). The proposed framework decouples deep feature representation from cross-scale integration by computing DeepSSIM independently across pyramid levels and fusing the resulting scores with a lightweight set of learnable global weights. Experiments on multiple benchmark datasets demonstrate consistent…
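The pipeline the abstract describes (build a pyramid, score each level independently, fuse the per-level scores with a few global weights) can be illustrated with a minimal sketch. It is not the authors' implementation. The names `downsample2x`, `build_pyramid`, `placeholder_similarity`, and `msds_score` are hypothetical, the 2x2 average-pooling pyramid and the softmax normalization of the weights are assumptions, and `placeholder_similarity` is a pixel-domain SSIM-style statistic standing in for DeepSSIM, whose deep-feature computation is not specified in this summary.

```python
import numpy as np


def downsample2x(img):
    """Halve resolution with 2x2 average pooling (assumed pyramid operator)."""
    h, w = img.shape[:2]
    h2, w2 = h - h % 2, w - w % 2          # crop odd rows/columns
    cropped = img[:h2, :w2]
    blocks = cropped.reshape(h2 // 2, 2, w2 // 2, 2, *img.shape[2:])
    return blocks.mean(axis=(1, 3))


def build_pyramid(img, levels):
    """Return [full-res, 1/2-res, 1/4-res, ...] with `levels` entries."""
    pyramid = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))
    return pyramid


def placeholder_similarity(ref, dist, c1=1e-4, c2=9e-4):
    """Stand-in for single-scale DeepSSIM (NOT the real model).

    Computes a global SSIM-style score on raw pixels; a real implementation
    would compare deep features from a pretrained network instead.
    """
    mu_r, mu_d = ref.mean(), dist.mean()
    var_r, var_d = ref.var(), dist.var()
    cov = ((ref - mu_r) * (dist - mu_d)).mean()
    num = (2 * mu_r * mu_d + c1) * (2 * cov + c2)
    den = (mu_r**2 + mu_d**2 + c1) * (var_r + var_d + c2)
    return float(num / den)


def softmax(logits):
    """Map unconstrained parameters to positive weights that sum to 1."""
    z = np.exp(logits - np.max(logits))
    return z / z.sum()


def msds_score(ref, dist, logits, similarity=placeholder_similarity):
    """Score each pyramid level independently, then fuse with global weights.

    `logits` holds one learnable scalar per level; softmax normalization is
    an assumption of this sketch, not a detail taken from the paper.
    """
    logits = np.asarray(logits, dtype=np.float64)
    refs = build_pyramid(ref, len(logits))
    dists = build_pyramid(dist, len(logits))
    scores = np.array([similarity(r, d) for r, d in zip(refs, dists)])
    return float(np.dot(softmax(logits), scores))
```

In this sketch the only parameters added on top of the single-scale model are the per-level scalars in `logits`, which mirrors the "lightweight set of learnable global weights" in the abstract. In practice they would be fitted against subjective quality scores, and any single-scale model could be passed in through the `similarity` argument.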
