AniMaker: Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation

Haoyuan Shi; Yunxin Li; Xinyu Chen; Longyue Wang; Baotian Hu; Min Zhang

arXiv:2506.10540 · cs.MA · October 3, 2025

Open Access · 1 Repo

TL;DR

AniMaker is a multi-agent framework that generates coherent storytelling animations from text by using MCTS-driven clip generation and a specialized evaluation system, improving quality and efficiency over existing methods.

Contribution

This paper introduces AniMaker, the first multi-agent system combining MCTS-based clip generation and a novel animation evaluation framework for story-coherent video creation from text.

Findings

01 Achieves higher-quality animations as measured by the VBench and AniEval metrics.

02 Significantly improves the efficiency of multi-candidate clip generation.

03 Creates more coherent and story-consistent animations from text input.

Abstract

Despite rapid advancements in video generation models, generating coherent storytelling videos that span multiple scenes and characters remains challenging. Current methods often rigidly convert pre-generated keyframes into fixed-length clips, resulting in disjointed narratives and pacing issues. Furthermore, the inherent instability of video generation models means that even a single low-quality clip can significantly degrade the entire output animation's logical coherence and visual continuity. To overcome these obstacles, we introduce AniMaker, a multi-agent framework enabling efficient multi-candidate clip generation and storytelling-aware clip selection, thus creating globally consistent and story-coherent animation solely from text input. The framework is structured around specialized agents, including the Director Agent for storyboard generation, the Photography Agent for video…

Code & Models

Repositories

hitsz-tmg/anim-director (PyTorch)

Taxonomy

Topics: Human Motion and Animation · Artificial Intelligence in Games · Video Analysis and Summarization

Methods: Contrastive Language-Image Pre-training