DynVFX: Augmenting Real Videos with Dynamic Content
Danah Yatim, Rafail Fridman, Omer Bar-Tal, Tali Dekel

TL;DR
DynVFX is a zero-shot framework that seamlessly augments real videos with dynamic content based on simple text instructions, integrating new objects or effects naturally into existing scenes.
Contribution
DynVFX introduces a training-free, inference-time method that manipulates attention features to augment video realistically and automatically, leveraging a pre-trained text-to-video diffusion transformer and a pre-trained vision-language model.
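This summary does not spell out how the attention features are manipulated. As a rough illustration only, the sketch below shows one pattern common in training-free editing: letting the tokens of the edited stream attend to keys and values taken from the source video in addition to their own. This is an assumption and not necessarily the paper's mechanism. The function names (`attention`, `extended_attention`) and the toy vectors are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(queries[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


def extended_attention(q_edit, k_edit, v_edit, k_src, v_src):
    """Hypothetical sketch: the edited stream attends to its own tokens
    plus tokens from the source video (keys/values concatenated)."""
    return attention(q_edit, k_edit + k_src, v_edit + v_src)


# Toy example: one query that matches an edited token and a source token equally.
result = extended_attention(
    q_edit=[[1.0, 0.0]],
    k_edit=[[1.0, 0.0]], v_edit=[[0.0, 0.0]],
    k_src=[[1.0, 0.0]], v_src=[[2.0, 4.0]],
)
```

Because both keys match the query equally, the output is the average of the two value vectors, which shows how source-video features can be blended into the generated content without any training.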
Findings
Effective augmentation of real videos with diverse dynamic content
Seamless integration accounting for camera motion and occlusions
Automated process requiring only simple user instructions
Abstract
We present a method for augmenting real-world videos with newly generated dynamic content. Given an input video and a simple user-provided text instruction describing the desired content, our method synthesizes dynamic objects or complex scene effects that naturally interact with the existing scene over time. The position, appearance, and motion of the new content are seamlessly integrated into the original footage while accounting for camera motion, occlusions, and interactions with other dynamic objects in the scene, resulting in a cohesive and realistic output video. We achieve this via a zero-shot, training-free framework that harnesses a pre-trained text-to-video diffusion transformer to synthesize the new content and a pre-trained vision-language model to envision the augmented scene in detail. Specifically, we introduce a novel inference-based method that manipulates features…
Taxonomy
Topics: Video Analysis and Summarization · Multimedia Communication and Technology · Image and Video Quality Assessment
Methods: Softmax · Attention Is All You Need · Diffusion
