3D GAN Inversion with Pose Optimization
Jaehoon Ko, Kyusun Cho, Daewon Choi, Kwangrok Ryoo, Seungryong Kim

TL;DR
This paper presents a 3D GAN inversion method that jointly infers camera pose and latent code from a single image, using pre-trained estimators and NeRF-derived depth, to enable multi-view consistent editing and 3D reconstruction.
Contribution
It introduces a generalizable approach for 3D GAN inversion that does not rely on ground-truth camera viewpoints, improving multi-view consistency and image reconstruction quality.
Findings
Enhanced multi-view consistent image editing.
Improved 3D reconstruction accuracy.
Superior results compared to 2D GAN-based editing.
Abstract
With recent advances in the quality of NeRF-based 3D-aware GANs, projecting an image into the latent space of these GANs has a natural advantage over 2D GAN inversion: not only does it allow multi-view consistent editing of the projected image, but it also enables 3D reconstruction and novel view synthesis from a single image. However, explicit viewpoint control is a major obstacle in the 3D GAN inversion process, as both camera pose and latent code have to be optimized simultaneously to reconstruct the given image. Most works that explore the latent space of 3D-aware GANs rely on a ground-truth camera viewpoint or a deformable 3D model, which limits their applicability. In this work, we introduce a generalizable 3D GAN inversion method that infers camera viewpoint and latent code simultaneously to enable multi-view consistent semantic image editing. The key…
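The simultaneous optimization of camera pose and latent code described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the "generator" here is a hypothetical linear map blended by a single yaw angle, standing in for a NeRF-based 3D-aware GAN, and the coarse initial yaw stands in for the output of a pre-trained pose estimator. All names (`render`, `invert`, `A`, `B`) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
IMG_DIM, LATENT_DIM = 16, 4

# Hypothetical toy "generator" weights; a real 3D-aware GAN would render
# an image from a latent code and a full camera pose instead.
A = rng.normal(size=(IMG_DIM, LATENT_DIM)) / np.sqrt(IMG_DIM)
B = rng.normal(size=(IMG_DIM, LATENT_DIM)) / np.sqrt(IMG_DIM)


def render(w, yaw):
    """Toy renderer: the 'image' depends on both latent code w and yaw."""
    return (np.cos(yaw) * A + np.sin(yaw) * B) @ w


def invert(target, w_init, yaw_init, lr=0.02, steps=500):
    """Jointly update latent code and pose by gradient descent on the
    squared reconstruction error. Returns final estimates and loss history."""
    w, yaw = w_init.copy(), float(yaw_init)
    losses = []
    for _ in range(steps):
        residual = render(w, yaw) - target
        losses.append(float(residual @ residual))
        # Analytic gradients of the squared error w.r.t. w and yaw.
        grad_w = 2.0 * (np.cos(yaw) * A + np.sin(yaw) * B).T @ residual
        grad_yaw = 2.0 * residual @ ((-np.sin(yaw) * A + np.cos(yaw) * B) @ w)
        w -= lr * grad_w
        yaw -= lr * grad_yaw
    return w, yaw, losses


# Synthetic target rendered from a known latent code and pose.
w_true = rng.normal(size=LATENT_DIM)
target = render(w_true, 0.6)

# yaw_init=0.4 plays the role of a coarse estimate from a pose estimator.
w_hat, yaw_hat, losses = invert(target, np.zeros(LATENT_DIM), yaw_init=0.4)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.6f}, estimated yaw: {yaw_hat:.3f}")
```

Because both unknowns are fitted to the same image, different (latent, pose) pairs can explain it almost equally well; this is one reason a reasonable pose initialization matters in this kind of joint optimization.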
Videos
3D GAN inversion with Pose Optimization · YouTube
Taxonomy
Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Advanced Image and Video Retrieval Techniques
