Mind-of-Director: Multi-modal Agent-Driven Film Previsualization via Collaborative Decision-Making
Shufeng Nan, Mengtian Li, Sixiao Zheng, Yuwei Lu, Han Zhang, Yanwei Fu

TL;DR
Mind-of-Director is a multi-agent framework that automates film previsualization: specialized agents collaborate on script, scene, character, and camera design to generate previz sequences inside a game engine.
Contribution
It introduces a novel multi-modal agent system that models collaborative decision-making for automated and interactive film previsualization.
Findings
Generates high-quality, semantically grounded previz sequences in about 25 minutes per idea.
Demonstrates effective agent collaboration for automated prototyping and human-in-the-loop filmmaking.
Achieves positive results in extensive experiments and human evaluations.
Abstract
We present Mind-of-Director, a multi-modal agent-driven framework for film previz that models the collaborative decision-making process of a film production team. Given a creative idea, Mind-of-Director orchestrates multiple specialized agents to produce previz sequences within a game engine. The framework consists of four cooperative modules: Script Development, where agents iteratively draft and refine the screenplay; Virtual Scene Design, which transforms text into semantically aligned 3D environments; Character Behaviour Control, which determines character blocking and motion; and Camera Planning, which optimizes framing, movement, and composition for cinematic camera effects. A real-time visual editing system built in the game engine further enables interactive inspection and synchronized timeline adjustment across scenes, behaviours, and cameras. Extensive experiments and human…
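The four-module flow described in the abstract can be pictured as a staged pipeline over a shared state. The sketch below is only an illustration under stated assumptions: every name in it (PrevizState, run_pipeline, the stage functions, the rounds parameter) is hypothetical, each stage is a placeholder stub rather than an agent, and nothing here reflects the authors' actual implementation or any game-engine API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PrevizState:
    """Shared state handed from module to module (hypothetical structure)."""
    idea: str
    script: List[str] = field(default_factory=list)
    scene: dict = field(default_factory=dict)
    behaviours: List[dict] = field(default_factory=list)
    cameras: List[dict] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


def script_development(state: PrevizState, rounds: int = 2) -> PrevizState:
    # Placeholder for iterative drafting and refinement by agents.
    draft = [f"Shot 1: {state.idea}"]
    for revision in range(1, rounds + 1):
        draft = [f"{line} [rev {revision}]" for line in draft]
    state.script = draft
    state.log.append("script")
    return state


def virtual_scene_design(state: PrevizState) -> PrevizState:
    # Placeholder: a real system would build a 3D environment from the script.
    state.scene = {"source_lines": len(state.script), "objects": []}
    state.log.append("scene")
    return state


def character_behaviour_control(state: PrevizState) -> PrevizState:
    # Placeholder: one blocking/motion record per script line.
    state.behaviours = [
        {"shot": i, "blocking": "placeholder"} for i, _ in enumerate(state.script)
    ]
    state.log.append("behaviour")
    return state


def camera_planning(state: PrevizState) -> PrevizState:
    # Placeholder: one camera setup per behaviour record.
    state.cameras = [
        {"shot": b["shot"], "framing": "placeholder"} for b in state.behaviours
    ]
    state.log.append("camera")
    return state


# Assumed ordering: script, then scene, then behaviour, then camera.
STAGES: List[Callable[[PrevizState], PrevizState]] = [
    script_development,
    virtual_scene_design,
    character_behaviour_control,
    camera_planning,
]


def run_pipeline(idea: str) -> PrevizState:
    """Run every stage in order on a fresh state and return the result."""
    state = PrevizState(idea=idea)
    for stage in STAGES:
        state = stage(state)
    return state
```

Calling run_pipeline("a heist at dawn") returns a PrevizState whose log records the order in which the stub stages ran. Passing one shared state object through all stages is a design choice of this sketch; it loosely mirrors the synchronized editing across scenes, behaviours, and cameras that the abstract mentions, but the paper may organize the modules differently.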
