MoZoo: Unleashing Video Diffusion power in animal fur and muscle simulation

Dongxia Liu; Jie Ma; Xiaochen Yang; Jiancheng Zhang; Bin Xia; Zhehan Kan; Nisha Huang; Jun Liang; Wenming Yang; Jin Li

arXiv:2605.13857·cs.GR·May 15, 2026

TL;DR

MoZoo is a generative dynamics solver that synthesizes high-fidelity animal fur and muscle video from coarse meshes, using multimodal guidance, role-aware synchronization, and synthetic data to improve realism and efficiency.

Contribution

The paper presents MoZoo, which combines Role-Aware RoPE (RAR-RoPE) for motion synchronization, Asymmetric Decoupled Attention for unidirectional information flow, and a synthetic data pipeline for realistic animal simulation.

Findings

01. MoZoo achieves high-fidelity fur and muscle simulation across various animals.

02. The method maintains superior temporal and structural consistency.

03. MoZoo outperforms existing techniques in realism and computational efficiency.

Abstract

The creation of cinematic-quality animal effects necessitates the precise modeling of muscle and fur dynamics, a process that remains both labor-intensive and computationally expensive within traditional production workflows. While generative diffusion models have shown promise in diverse artistic workflows, their capacity for high-fidelity animal simulation remains largely unexploited. We present MoZoo, a generative dynamics solver that bypasses conventional refinement to synthesize high-fidelity animal videos from coarse meshes under multimodal guidance. We propose Role-Aware RoPE (RAR-RoPE), which employs role-based index remapping to synchronize motion alignment while decoupling reference information via fixed temporal offsets. Complementing this, Asymmetric Decoupled Attention partitions the latent sequence to enforce a unidirectional information flow, effectively preventing feature…
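The two mechanisms named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract is truncated and does not specify the token roles, the offset value, or the exact masking rule. The role names (`target`, `condition`, `reference`), the `ref_offset` default, and both function names are hypothetical, chosen only to show what role-based index remapping and a one-way attention mask might look like.

```python
from typing import List

# Hypothetical role labels; the abstract does not name the roles.
TARGET = "target"        # latent tokens of the video being generated
CONDITION = "condition"  # e.g. tokens from the coarse-mesh guidance
REFERENCE = "reference"  # e.g. tokens from an appearance reference


def role_aware_temporal_indices(
    roles: List[str], frames: List[int], ref_offset: int = 1000
) -> List[int]:
    """Remap each token's temporal position index according to its role.

    Assumption: target and condition tokens of the same frame share one
    index (so their motion stays aligned), while reference tokens are
    shifted by a fixed offset so they never coincide with a video frame.
    """
    remapped = []
    for role, frame in zip(roles, frames):
        if role == REFERENCE:
            remapped.append(ref_offset + frame)
        else:
            remapped.append(frame)
    return remapped


def asymmetric_mask(roles: List[str]) -> List[List[bool]]:
    """Build a mask where mask[q][k] is True if query q may attend to key k.

    Assumption: target tokens attend to every token, while guidance tokens
    (condition, reference) attend only to other guidance tokens. Information
    then flows one way, from guidance into the target.
    """
    n = len(roles)
    return [
        [roles[q] == TARGET or roles[k] != TARGET for k in range(n)]
        for q in range(n)
    ]


# Two target frames, two matching condition frames, one reference image.
roles = [TARGET, TARGET, CONDITION, CONDITION, REFERENCE]
frames = [0, 1, 0, 1, 0]

indices = role_aware_temporal_indices(roles, frames)
mask = asymmetric_mask(roles)
```

In this toy sequence the condition tokens receive the same temporal indices as the target frames they guide, the reference token lands at the offset index, and no guidance token can read from a target token. The remapped indices would then feed the rotary embedding, and the mask would be passed to the attention layer; both of those steps are omitted here.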
