Multi-View Image-to-Image Translation Supervised by 3D Pose

Idit Diamant; Oranit Dror; Hai Victor Habi; Arnon Netzer

arXiv:2104.05779·cs.CV·April 14, 2021

TL;DR

This paper introduces an end-to-end multi-view image translation framework that uses 3D pose constraints to generate consistent, photo-realistic person images across multiple viewpoints with new poses.

Contribution

It proposes a novel joint learning approach for unpaired image translation models guided by 3D human pose constraints to ensure multi-view pose consistency.

Findings

01 Improved multi-view pose consistency in generated images

02 Enhanced photo-realism over baseline methods

03 Effective in generating images with new poses across views

Abstract

We address the task of multi-view image-to-image translation for person image generation. The goal is to synthesize photo-realistic multi-view images with pose consistency across all views. Our proposed end-to-end framework is based on joint learning of multiple unpaired image-to-image translation models, one per camera viewpoint. The joint learning is imposed by constraints on the shared 3D human pose, which encourage the 2D pose projections in all views to be consistent. Experimental results on the CMU-Panoptic dataset demonstrate the effectiveness of the suggested framework in generating photo-realistic images of persons with new poses that are more consistent across all views than those of a standard image-to-image baseline. The code is available at: https://github.com/sony-si/MultiView-Img2Img
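To make the idea of a shared 3D pose constraining per-view 2D poses more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names `project` and `pose_consistency_loss`, the toy camera setup, the 15-joint skeleton, and the choice of a mean squared reprojection error are all assumptions made for illustration. The sketch projects one 3D pose into each view with a 3x4 camera matrix and measures how far the per-view 2D poses deviate from those projections.

```python
import numpy as np


def project(points_3d, camera):
    """Project (J, 3) world points to (J, 2) pixels with a 3x4 camera matrix."""
    ones = np.ones((points_3d.shape[0], 1))
    homogeneous = np.hstack([points_3d, ones])   # (J, 4)
    projected = homogeneous @ camera.T           # (J, 3)
    return projected[:, :2] / projected[:, 2:3]  # perspective divide


def pose_consistency_loss(pose_3d, cameras, poses_2d):
    """Mean squared reprojection error of one shared 3D pose over all views.

    Hypothetical loss for illustration; the paper's exact formulation may differ.
    """
    per_view = [
        np.mean(np.sum((project(pose_3d, cam) - pose_2d) ** 2, axis=1))
        for cam, pose_2d in zip(cameras, poses_2d)
    ]
    return float(np.mean(per_view))


# Toy two-camera rig (assumed intrinsics and translations, identity rotation).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])


def make_camera(tx):
    """Build a 3x4 camera matrix K [I | t] with t = (tx, 0, 4)."""
    extrinsics = np.hstack([np.eye(3), np.array([[tx], [0.0], [4.0]])])
    return K @ extrinsics


cameras = [make_camera(-0.5), make_camera(0.5)]

rng = np.random.default_rng(0)
pose_3d = rng.uniform(-0.5, 0.5, size=(15, 3))  # 15 joints, assumed skeleton size

# 2D poses that agree with the shared 3D pose in every view.
consistent = [project(pose_3d, cam) for cam in cameras]
# Second view shifted by 5 pixels in x and y: no longer multi-view consistent.
inconsistent = [consistent[0], consistent[1] + 5.0]

loss_consistent = pose_consistency_loss(pose_3d, cameras, consistent)
loss_inconsistent = pose_consistency_loss(pose_3d, cameras, inconsistent)
print(loss_consistent, loss_inconsistent)
```

In a training setup along the lines the abstract describes, the 2D poses would presumably come from the images generated by the per-view translation models, and a differentiable version of such a term would be added to the translation losses so that all views are pushed toward the same underlying 3D pose.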

Code & Models

Repositories

sony-si/MultiView-Img2Img (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Advanced Image Processing Techniques