TL;DR
This paper introduces OverLORD, a two-stage framework for disentangled, high-fidelity image translation: latent optimization learns the disentangled representations, and adversarial training then tunes the generator for synthesis. The framework is designed to remain effective when attributes are correlated.
Contribution
OverLORD combines disentanglement and high-fidelity synthesis in a single framework. Unlike previous approaches, its disentanglement stage relies on neither adversarial training nor architectural biases.
Findings
Better disentanglement than state-of-the-art methods.
Higher translation quality and output diversity.
Effective modeling of correlated attributes.
Abstract
Image translation methods typically aim to manipulate a set of labeled attributes (given as supervision at training time, e.g. a domain label) while leaving the unlabeled attributes intact. Current methods achieve either (i) disentanglement, which exhibits low visual fidelity and can only be satisfied where the attributes are perfectly uncorrelated, or (ii) visually plausible translations, which are clearly not disentangled. In this work, we propose OverLORD, a single framework for disentangling labeled and unlabeled attributes as well as synthesizing high-fidelity images, which is composed of two stages: (i) Disentanglement: learning disentangled representations with latent optimization. Unlike previous approaches, we do not rely on adversarial training or any architectural biases. (ii) Synthesis: training feed-forward encoders for inferring the learned attributes and tuning the…
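To make the idea of latent optimization in stage (i) concrete, here is a minimal toy sketch, not the paper's model. It assumes scalar "images", a generator that simply adds a shared per-class embedding to a per-sample code, and an L2 penalty on the codes as a stand-in for whatever capacity constraint the actual method uses. The function name `latent_optimize` and all hyperparameters are hypothetical. The point it illustrates is that the per-sample codes are free parameters optimized directly by gradient descent rather than produced by an encoder.

```python
def latent_optimize(xs, ys, num_classes, lam=1.0, lr=0.05, steps=2000):
    """Toy latent optimization: fit x_i ~ class_emb[y_i] + codes[i].

    Assumptions (not from the paper): scalar data, additive generator,
    and an L2 penalty `lam` on the per-sample codes.
    """
    class_emb = [0.0] * num_classes   # shared embedding of the labeled attribute
    codes = [0.0] * len(xs)           # one free latent code per training sample

    for _ in range(steps):
        grad_emb = [0.0] * num_classes
        grad_codes = [0.0] * len(xs)
        for i, (x, y) in enumerate(zip(xs, ys)):
            residual = class_emb[y] + codes[i] - x
            # Gradients of (residual^2 + lam * code^2) w.r.t. each parameter.
            grad_emb[y] += 2.0 * residual
            grad_codes[i] = 2.0 * residual + 2.0 * lam * codes[i]
        # Plain gradient descent on embeddings and codes jointly.
        for c in range(num_classes):
            class_emb[c] -= lr * grad_emb[c]
        for i in range(len(xs)):
            codes[i] -= lr * grad_codes[i]
    return class_emb, codes


# Two classes whose samples differ only by a class-level offset.
xs = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
ys = [0, 0, 0, 1, 1, 1]
class_emb, codes = latent_optimize(xs, ys, num_classes=2)

# "Translate" sample 0 to class 1: keep its code, swap the class embedding.
translated = class_emb[1] + codes[0]
```

In this toy setting the penalty pushes class-level information into the shared embeddings, so the per-sample codes keep only the within-class variation. Stage (ii) of the abstract would then train feed-forward encoders to predict such learned codes, which this sketch does not cover.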
