Generating Images with Perceptual Similarity Metrics based on Deep Networks
Alexey Dosovitskiy, Thomas Brox

TL;DR
This paper introduces deep perceptual similarity metrics (DeePSiM) that improve image generation quality by using deep network features for loss functions, resulting in sharper, more natural images across various applications.
Contribution
The paper proposes DeePSiM, a novel class of loss functions based on deep network features, which enhances image generation by better aligning with perceptual similarity.
Findings
Generated images are sharper and more natural.
DeePSiM improves autoencoder and VAE training.
Inversion of deep networks yields more realistic images.
Abstract
Image-generating machine learning models are typically trained with loss functions based on distance in the image space. This often leads to over-smoothed results. We propose a class of loss functions, which we call deep perceptual similarity metrics (DeePSiM), that mitigate this problem. Instead of computing distances in the image space, we compute distances between image features extracted by deep neural networks. This metric better reflects the perceptual similarity of images and thus leads to better results. We show three applications: autoencoder training, a modification of a variational autoencoder, and inversion of deep convolutional networks. In all cases, the generated images look sharp and resemble natural images.
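Illustrative sketch
The contrast the abstract draws between image-space and feature-space distance can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: `make_extractor` is a hypothetical stand-in that uses a fixed random linear map followed by a ReLU in place of a pretrained deep network, images are flat lists of floats, and the squared Euclidean distance is assumed as the distance in both spaces.

```python
import random


def make_extractor(in_dim, out_dim, seed=0):
    """Hypothetical stand-in for a pretrained network.

    Returns a function applying a fixed random linear map followed by a ReLU.
    A real setup would use features from a trained deep network instead.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def extract(image):
        # One ReLU unit per weight row.
        return [max(0.0, sum(w * x for w, x in zip(row, image)))
                for row in weights]

    return extract


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def image_space_loss(generated, target):
    """Distance measured directly on pixel values."""
    return squared_distance(generated, target)


def feature_space_loss(generated, target, extract):
    """Distance measured between extracted features of the two images."""
    return squared_distance(extract(generated), extract(target))


# Toy "images" with four pixels each (assumed values, for illustration only).
extract = make_extractor(in_dim=4, out_dim=3)
target = [0.1, 0.5, 0.9, 0.3]
generated = [0.2, 0.4, 0.8, 0.35]

print("image-space loss:  ", image_space_loss(generated, target))
print("feature-space loss:", feature_space_loss(generated, target, extract))
```

Both losses are zero when the generated image equals the target. They differ in which discrepancies they penalize: the feature-space loss depends only on what the extractor responds to, which is the property the abstract relies on when the extractor is a deep network.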
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques · Advanced Image Processing Techniques
