Human Perceptual Evaluations for Image Compression
Yash Patel, Srikar Appalaraju, R. Manmatha

TL;DR
This paper investigates the reliability of perceptual similarity metrics like MS-SSIM in evaluating image compression quality, revealing that higher MS-SSIM scores do not always correlate with better perceptual quality according to user studies.
Contribution
The paper shows through user studies that optimizing a compression method for MS-SSIM can be misleading: a higher score does not necessarily mean better perceptual quality.
Findings
Higher MS-SSIM does not always mean better perceptual quality.
Deep learning compression methods optimized for MS-SSIM can be perceptually worse.
User studies are essential for accurate evaluation of image compression quality.
Abstract
Recently, there has been much interest in deep learning techniques to do image compression and there have been claims that several of these produce better results than engineered compression schemes (such as JPEG, JPEG2000 or BPG). A standard way of comparing image compression schemes today is to use perceptual similarity metrics such as PSNR or MS-SSIM (multi-scale structural similarity). This has led to some deep learning techniques which directly optimize for MS-SSIM by choosing it as a loss function. While this leads to a higher MS-SSIM for such techniques, we demonstrate using user studies that the resulting improvement may be misleading. Deep learning techniques for image compression with a higher MS-SSIM may actually be perceptually worse than engineered compression schemes with a lower MS-SSIM.
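For readers unfamiliar with the metrics named in the abstract, the simpler of the two, PSNR, can be sketched in a few lines. This is a minimal illustration using the standard PSNR definition, not code from the paper; the function name `psnr` and the `max_value` parameter (assumed to be 255 for 8-bit images) are hypothetical choices made for this example. MS-SSIM is considerably more involved and is not shown.

```python
import numpy as np


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    max_value is the largest possible pixel value (assumed 255 for 8-bit data).
    """
    ref = np.asarray(reference, dtype=np.float64)
    dist = np.asarray(distorted, dtype=np.float64)
    if ref.shape != dist.shape:
        raise ValueError("images must have the same shape")

    # Mean squared error over all pixels and channels.
    mse = np.mean((ref - dist) ** 2)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


if __name__ == "__main__":
    original = np.zeros((4, 4), dtype=np.uint8)
    decoded = np.ones((4, 4), dtype=np.uint8)  # every pixel off by 1
    print(f"PSNR: {psnr(original, decoded):.2f} dB")
```

A score like this depends only on pixel-wise error, which is one reason the abstract's point matters: a numerically higher metric value is not by itself evidence that viewers will prefer the image.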
Taxonomy
Topics: Image and Signal Denoising Methods · Advanced Image Processing Techniques · Advanced Data Compression Techniques
