UMFuse: Unified Multi View Fusion for Human Editing applications
Rishabh Jain, Mayur Hemani, Duygu Ceylan, Krishna Kumar Singh, Jingwan Lu, Mausoom Sarkar, Balaji Krishnamurthy

TL;DR
UMFuse introduces a multi-view fusion approach that leverages multiple source images and pose information to improve human image editing accuracy and coherence, especially when target poses differ significantly from input images.
Contribution
The paper presents a novel multi-view fusion network that combines pose key points and textures from multiple images, enabling more accurate and coherent human editing compared to single-view methods.
Findings
Effective multi-view human reposing demonstrated.
Enhanced image coherence in Mix&Match Human Image generation.
Limitations of single-view editing highlighted.
Abstract
Numerous pose-guided human editing methods have been explored by the vision community due to their extensive practical applications. However, most of these methods still use an image-to-image formulation in which a single image is given as input to produce an edited image as output. This objective becomes ill-defined in cases when the target pose differs significantly from the input pose. Existing methods then resort to in-painting or style transfer to handle occlusions and preserve content. In this paper, we explore the utilization of multiple views to minimize the issue of missing information and generate an accurate representation of the underlying human model. To fuse knowledge from multiple viewpoints, we design a multi-view fusion network that takes the pose key points and texture from multiple source images and generates an explainable per-pixel appearance retrieval map.…
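The per-pixel appearance retrieval map mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that the map is a softmax over the source views at each pixel and that fusion is a weighted sum of per-view feature maps. The function name `fuse_views` and the array layouts are hypothetical.

```python
import numpy as np

def fuse_views(features, logits):
    """Blend K per-view feature maps using a per-pixel weight map.

    Assumption (not from the paper): the retrieval map is a softmax over
    views, and fusion is a per-pixel weighted sum.

    features: array of shape (K, H, W, C), one feature map per source view
    logits:   array of shape (K, H, W), unnormalized per-pixel view scores
    Returns (fused, weights) with shapes (H, W, C) and (K, H, W).
    """
    features = np.asarray(features, dtype=np.float64)
    logits = np.asarray(logits, dtype=np.float64)

    # Numerically stable softmax over the view axis (axis 0).
    shifted = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)

    # Broadcast the weights over the channel axis and sum across views.
    fused = (weights[..., None] * features).sum(axis=0)
    return fused, weights
```

Because the weights sum to one at every pixel, the returned `weights` array can be read directly as "which source view contributed how much here", which is the sense in which such a map is interpretable.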
Videos
UMFuse: Unified Multi View Fusion for Human Editing Applications (YouTube)
Taxonomy
Topics: Advanced Vision and Imaging · Visual Attention and Saliency Detection · Generative Adversarial Networks and Image Synthesis
