Turning Text and Imagery into Captivating Visual Video

Mingming Wang; Elijah Miller

arXiv:2406.01851 · cs.HC · June 5, 2024

Open Access

TL;DR

This paper presents a generative-model-based approach for creating multi-perspective architectural videos from single images or text descriptions, with the aim of improving design visualization and communication.

Contribution

It introduces a novel application of generative models for architectural visualization, enabling multi-view and text-to-video synthesis from single images or descriptions.

Findings

01  Enables consistent multi-view architectural videos from single images

02  Generates design videos directly from textual descriptions

03  Improves speed and creativity in architectural visualization

Abstract

The ability to visualize a structure from multiple perspectives is crucial for comprehensive planning and presentation. This paper introduces an advanced application of generative models, akin to Stable Video Diffusion, tailored for architectural visualization. We explore the potential of these models to create consistent multi-perspective videos of buildings from single images and to generate design videos directly from textual descriptions. The proposed method enhances the design process by offering rapid prototyping, cost and time efficiency, and an enriched creative space for architects and designers. By harnessing the power of AI, our approach not only accelerates the visualization of architectural concepts but also enables a more interactive and immersive experience for clients and stakeholders. This advancement in architectural visualization represents a significant leap forward,…

Taxonomy

Topics: Subtitles and Audiovisual Media · Video Analysis and Summarization · Multimedia Communication and Technology