Wasserstein Distortion: Unifying Fidelity and Realism
Yang Qiu, Aaron B. Wagner, Johannes Ballé, Lucas Theis

TL;DR
This paper introduces Wasserstein distortion, a new image metric that unifies fidelity and realism, enabling improved texture generation and perceptual quality assessment.
Contribution
The paper proposes Wasserstein distortion as a unified, mathematically grounded metric that generalizes existing measures of image fidelity and realism.
Findings
Under different parameter choices, Wasserstein distortion reduces to a pure fidelity constraint or a pure realism constraint.
It measures similarity between images in a way that captures both pixel-level fidelity and perceptual quality.
Generated textures match a reference texture with high fidelity at one location and transition smoothly to an independent realization of the texture away from that point.
Abstract
We introduce a distortion measure for images, Wasserstein distortion, that simultaneously generalizes pixel-level fidelity on the one hand and realism or perceptual quality on the other. We show how Wasserstein distortion reduces to a pure fidelity constraint or a pure realism constraint under different parameter choices and discuss its metric properties. Pairs of images that are close under Wasserstein distortion illustrate its utility. In particular, we generate random textures that have high fidelity to a reference texture in one location of the image and smoothly transition to an independent realization of the texture as one moves away from this point. Wasserstein distortion attempts to generalize and unify prior work on texture generation, image realism and distortion, and models of the early human visual system, in the form of an optimizable metric in the mathematical sense.
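The two limiting cases described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a 1-D signal, raw sample values as features, a Gaussian pooling kernel of width `sigma`, and a Gaussian approximation of each pooled distribution, so that the squared 2-Wasserstein distance at each location has the closed form (difference of means)² + (difference of standard deviations)². The names `pooled_stats` and `wasserstein_distortion_1d` are hypothetical.

```python
import numpy as np

def pooled_stats(x, sigma):
    """Weighted mean and standard deviation of x around every position.

    Assumption: Gaussian pooling weights of width `sigma` (in samples).
    sigma == 0 is treated as a point mass on the sample itself.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if sigma == 0:
        return x, np.zeros(n)
    pos = np.arange(n)
    # w[i, j]: weight of sample j in the pool centred at position i
    d = pos[:, None] - pos[None, :]
    w = np.exp(-0.5 * (d / sigma) ** 2)
    w /= w.sum(axis=1, keepdims=True)
    mean = w @ x
    var = w @ (x ** 2) - mean ** 2
    return mean, np.sqrt(np.maximum(var, 0.0))

def wasserstein_distortion_1d(x, y, sigma):
    """Average over positions of the squared 2-Wasserstein distance between
    the (Gaussian-approximated) pooled distributions of x and y."""
    mx, sx = pooled_stats(x, sigma)
    my, sy = pooled_stats(y, sigma)
    return float(np.mean((mx - my) ** 2 + (sx - sy) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=128)
    y = rng.permutation(x)  # same sample statistics, different arrangement
    print("sigma = 0   :", wasserstein_distortion_1d(x, y, 0))    # equals the MSE
    print("sigma = 1e6 :", wasserstein_distortion_1d(x, y, 1e6))  # close to zero
```

In this toy setting, `sigma = 0` compares the signals sample by sample (a pure fidelity measure, identical to mean squared error), while a very large `sigma` compares only global statistics, so a shuffled copy of the signal incurs almost no distortion (a pure realism measure). Intermediate values, or a `sigma` that varies with position, interpolate between the two.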
Peer Reviews
Decision·Submitted to ICLR 2024
The paper is well written and clear. The objective of proposing a distortion measure associated with characteristics of the HVS is interesting and novel. The formulation of their distortion measure based on Wasserstein distances is sound and also practical. The authors achieve good and efficient texture synthesis results, but their synthesis of realistic images is not as impressive as that of recent diffusion-based methods. Nonetheless, their approach is much more interpretable. They do use features extracted f
The paper is mostly a proof of concept at this stage. The authors mention using their method for image encoding, but this is not part of this work, and it is not clear that it would work well: like recent super-resolution approaches based on single-image patches, it could tend to fill in details with elements that look realistic but are not real.
- the paper is well grounded in its field, the introduction covers broadly the existing literature
- the structure of the paper is clear and easy to follow
- theoretical claims are illustrated by numerical experiments
- the Wasserstein distortion as introduced in section 2 is not really new and is in fact very close to the work of Freeman et al. (2012), the main innovation being that the parameter sigma can freely be fixed at any position in the image instead of being constrained by the eccentricity of the visual receptive fields.
- the maths of section 2 are overly complicated for a naive reader: in the end the authors use discrete optimal transport between empirical distributions and assume Gaussianity whic
The idea of using nonuniform weights over pixels in distributions enables incorporating fidelity and realism into a common framework. The proposed scheme enables smooth interpolation between fidelity and realism. The proposed methodology is grounded in theories of the HVS.
The main contribution of this work over prior works is that the proposed formulation considers distributions using nonuniform weights over pixels while existing works use equal weights. It is not clear to me if the contribution is significant enough. Can one apply simpler modifications to existing approaches to achieve fidelity in specific regions and realism in the other regions? For instance, one could adapt the approach of [Ref1] which can control the spatial distribution of textures accordin
Taxonomy
Topics: Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging
