Multi-view Image Prompted Multi-view Diffusion for Improved 3D Generation
Seungwook Kim, Yichun Shi, Kejie Li, Minsu Cho, Peng Wang

TL;DR
This paper introduces MultiImageDream, a multi-view diffusion model that takes multiple image prompts to improve 3D object generation. It outperforms single-image prompting without fine-tuning the pre-trained ImageDream model.
Contribution
The work extends ImageDream to support multiple image prompts, improving multi-view and 3D generation quality without fine-tuning the pre-trained model.
Findings
Multi-image prompts improve 3D generation performance.
The method outperforms single-image prompting on both quantitative metrics and qualitative assessments.
No fine-tuning of the pre-trained model is required.
Abstract
Using images as prompts for 3D generation demonstrates particularly strong performance compared to using text prompts alone, since images provide more intuitive guidance for the 3D generation process. In this work, we delve into the potential of using multiple image prompts, instead of a single image prompt, for 3D generation. Specifically, we build on ImageDream, a novel image-prompt multi-view diffusion model, to support multi-view images as the input prompt. Our method, dubbed MultiImageDream, reveals that transitioning from a single-image prompt to multiple-image prompts enhances the performance of multi-view and 3D object generation according to various quantitative evaluation metrics and qualitative assessments. This advancement is achieved without fine-tuning the pre-trained ImageDream multi-view diffusion model.
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Optical Imaging Technologies · Image and Video Stabilization
Methods: Diffusion
