P-Flow: Prompting Visual Effects Generation

Rui Zhao; Mike Zheng Shou

arXiv:2603.22091 · cs.CV · March 24, 2026

Open Access

TL;DR

P-Flow is a training-free framework that refines text prompts at test time to accurately generate dynamic visual effects in videos, leveraging vision-language models for high-fidelity customization without modifying the underlying generative model.
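The general idea of refining a prompt at test time while keeping the generator frozen can be sketched as a simple feedback loop. This is a minimal illustration, not the authors' algorithm: the function `refine_prompt`, its three callables (`generate`, `critique`, `rewrite`), the scoring scale, and the stopping rule are all hypothetical names and assumptions introduced here. In a real setup, `generate` would wrap a video model and `critique` a vision-language model; below they are replaced by toy string-based stand-ins so the loop can run on its own.

```python
from typing import Callable, List, Tuple


def refine_prompt(
    initial_prompt: str,
    generate: Callable[[str], object],
    critique: Callable[[object], Tuple[float, str]],
    rewrite: Callable[[str, str], str],
    max_rounds: int = 5,
    target_score: float = 0.9,
) -> Tuple[str, float, List[str]]:
    """Hypothetical test-time prompt refinement loop (generator stays frozen).

    generate: prompt -> video (assumed wrapper around a video model)
    critique: video -> (score, textual feedback) (assumed VLM-based judge)
    rewrite:  (prompt, feedback) -> new prompt (assumed prompt editor)
    """
    prompt = initial_prompt
    best_prompt, best_score = initial_prompt, float("-inf")
    history: List[str] = []
    for _ in range(max_rounds):
        history.append(prompt)
        video = generate(prompt)          # no weights are updated here
        score, feedback = critique(video)
        if score > best_score:            # remember the best prompt seen so far
            best_prompt, best_score = prompt, score
        if score >= target_score:         # good enough, stop early
            break
        prompt = rewrite(prompt, feedback)
    return best_prompt, best_score, history


# Toy stand-ins (assumptions for illustration only): the "video" is just the
# prompt string, and the "critic" checks which effect keywords are present.
KEYWORDS = ["crumbles", "dust"]


def toy_generate(prompt: str) -> str:
    return prompt


def toy_critique(video: str) -> Tuple[float, str]:
    missing = [k for k in KEYWORDS if k not in video]
    score = 1 - len(missing) / len(KEYWORDS)
    feedback = missing[0] if missing else ""
    return score, feedback


def toy_rewrite(prompt: str, feedback: str) -> str:
    return f"{prompt}, {feedback}"


best_prompt, best_score, history = refine_prompt(
    "a vase", toy_generate, toy_critique, toy_rewrite
)
print(best_prompt, best_score, len(history))
```

The loop only ever changes the prompt string, which is what makes an approach of this kind training-free: the cost is paid in extra generation and critique calls at inference time rather than in fine-tuning.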

Contribution

P-Flow introduces a test-time prompt optimization method for dynamic visual effects in video generation, enabling high-quality customization without retraining the underlying model.

Findings

01. Outperforms existing methods in visual effect fidelity and diversity

02. Effective in both text-to-video and image-to-video tasks

03. Achieves high-quality effects without model modification

Abstract

Recent advancements in video generation models have significantly improved their ability to follow text prompts. However, the customization of dynamic visual effects, defined as temporally evolving and appearance-driven visual phenomena like object crushing or explosion, remains underexplored. Prior works on motion customization or control mainly focus on low-level motions of the subject or camera, which can be guided using explicit control signals such as motion trajectories. In contrast, dynamic visual effects involve higher-level semantics that are more naturally suited for control via text prompts. However, it is hard and time-consuming for humans to craft a single prompt that accurately specifies these effects, as they require complex temporal reasoning and iterative refinement over time. To address this challenge, we propose P-Flow, a novel training-free framework for customizing…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Human Motion and Animation