Learning to Generate Images with Perceptual Similarity Metrics

Jake Snell; Karl Ridgeway; Renjie Liao; Brett D. Roads; Michael C. Mozer; Richard S. Zemel

arXiv:1511.06409·cs.LG·January 25, 2017

TL;DR

This paper proposes the multiscale structural-similarity score (MS-SSIM) as a perceptually aligned loss function for training image-synthesis networks. Human observers prefer the resulting images over those produced by networks trained with a traditional pixel-wise loss.

Contribution

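Because MS-SSIM is differentiable, it can serve directly as a training loss in gradient-descent learning. The paper demonstrates that doing so yields synthesized images that align better with human judgments of image quality than images from networks trained with a pixel-wise loss.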

Findings

01

Humans prefer images generated with an MS-SSIM loss over those generated with a pixel-wise loss.

02

MS-SSIM-optimized models outperform pixel-wise models in image reconstruction quality.

03

Perceptually-optimized representations enhance performance in image classification and super-resolution.

Abstract

Deep networks are increasingly being applied to problems involving image synthesis, e.g., generating images from textual descriptions and reconstructing an input image from a compact representation. Supervised training of image-synthesis networks typically uses a pixel-wise loss (PL) to indicate the mismatch between a generated image and its corresponding target image. We propose instead to use a loss function that is better calibrated to human perceptual judgments of image quality: the multiscale structural-similarity score (MS-SSIM). Because MS-SSIM is differentiable, it is easily incorporated into gradient-descent learning. We compare the consequences of using MS-SSIM versus PL loss on training deterministic and stochastic autoencoders. For three different architectures, we collected human judgments of the quality of image reconstructions. Observers reliably prefer images synthesized…
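To make the contrast between the two losses concrete, the following is a minimal NumPy sketch, not the authors' implementation. It rests on several assumptions that the abstract does not state: grayscale images scaled to [0, 1], global image statistics in place of the usual sliding Gaussian window, 2x2 average pooling between scales, the five scale weights from the original MS-SSIM formulation, and mean squared error as the pixel-wise loss. The names `ms_ssim`, `ms_ssim_loss`, and `pixel_loss` are hypothetical.

```python
import numpy as np

# Assumed scale weights (original MS-SSIM formulation) and the standard
# stabilizing constants for images with dynamic range 1.
WEIGHTS = (0.0448, 0.2856, 0.3001, 0.2363, 0.1333)
C1, C2 = 0.01 ** 2, 0.03 ** 2


def _luminance_and_cs(x, y):
    """Luminance and contrast-structure terms from global statistics.

    Assumption: a full implementation computes these per local window
    and averages them; global statistics keep the sketch short.
    """
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    lum = (2 * mu_x * mu_y + C1) / (mu_x ** 2 + mu_y ** 2 + C1)
    cs = (2 * cov + C2) / (var_x + var_y + C2)
    return lum, cs


def _downsample(img):
    """Halve the resolution with 2x2 average pooling (odd edges are cropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def ms_ssim(x, y, weights=WEIGHTS):
    """Simplified multiscale SSIM for 2-D images of at least 16x16 pixels."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    score = 1.0
    for i, w in enumerate(weights):
        lum, cs = _luminance_and_cs(x, y)
        cs = max(cs, 0.0)  # avoid a fractional power of a negative number
        if i == len(weights) - 1:
            # Luminance enters only at the coarsest scale.
            score *= (lum ** w) * (cs ** w)
        else:
            score *= cs ** w
            x, y = _downsample(x), _downsample(y)
    return score


def ms_ssim_loss(target, generated):
    """Perceptual loss: 0 for identical images, larger for dissimilar ones."""
    return 1.0 - ms_ssim(target, generated)


def pixel_loss(target, generated):
    """Pixel-wise loss, taken here to be mean squared error (an assumption)."""
    target = np.asarray(target, dtype=np.float64)
    generated = np.asarray(generated, dtype=np.float64)
    return float(np.mean((target - generated) ** 2))
```

Every step in `ms_ssim` is ordinary arithmetic (means, products, powers), which is why the score is differentiable. NumPy itself computes no gradients, so training a network would require expressing the same operations in an automatic-differentiation framework and minimizing `1 - MS-SSIM`.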

Code & Models

Repositories

clementchadebec/benchmark_VAE
pytorch
