R-DMesh: Video-Guided 3D Animation via Rectified Dynamic Mesh Flow
Zijie Wu, Lixin Xu, Puhua Jiang, Sicong Liu, Chunchao Guo, Xiang Bai

TL;DR
R-DMesh is a novel framework that rectifies pose misalignment in video-guided 3D animation, enabling high-fidelity dynamic mesh generation by disentangling motion components and leveraging a diffusion transformer.
Contribution
The paper introduces a unified approach with a VAE and Triflow Attention to automatically align arbitrary input meshes with video context, addressing a key challenge in 3D animation.
Findings
R-DMesh addresses pose misalignment between a user-provided static mesh and the starting frame of the reference video.
The method enables robust pose retargeting and 4D generation.
The authors constructed a large-scale dataset with over 500k dynamic mesh sequences.
Abstract
Video-guided 3D animation holds immense potential for content creation, offering intuitive and precise control over dynamic assets. However, practical deployment faces a critical yet frequently overlooked hurdle: the pose misalignment dilemma. In real-world scenarios, the initial pose of a user-provided static mesh rarely aligns with the starting frame of a reference video. Naively forcing a mesh to follow a mismatched trajectory inevitably leads to severe geometric distortion or animation failure. To address this, we present Rectified Dynamic Mesh (R-DMesh), a unified framework designed to generate high-fidelity 4D meshes that are "rectified" to align with video context. Unlike standard motion transfer approaches, our method introduces a novel VAE that explicitly disentangles the input into a conditional base mesh, relative motion trajectories, and a crucial rectification jump…
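The three-way split named in the abstract (base mesh, relative motion trajectories, rectification jump) can be pictured with a toy additive model. The sketch below is only an illustration under stated assumptions, not the paper's VAE: it assumes a fixed vertex correspondence between the base mesh and every frame, treats the jump as the per-vertex offset from the base mesh to the first frame, and treats relative motion as the per-vertex offset from the first frame. The names `decompose` and `recompose` are hypothetical.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Mesh = List[Vec3]  # one position per vertex; topology is assumed fixed


def _sub(a: Mesh, b: Mesh) -> Mesh:
    """Per-vertex difference a - b."""
    return [(ax - bx, ay - by, az - bz)
            for (ax, ay, az), (bx, by, bz) in zip(a, b)]


def _add(a: Mesh, b: Mesh) -> Mesh:
    """Per-vertex sum a + b."""
    return [(ax + bx, ay + by, az + bz)
            for (ax, ay, az), (bx, by, bz) in zip(a, b)]


def decompose(base: Mesh, frames: List[Mesh]) -> Tuple[Mesh, Mesh, List[Mesh]]:
    """Split a vertex sequence into (base, jump, relative).

    Assumption (not from the paper): a purely additive model where
      jump        = frames[0] - base       (aligns the base pose to frame 0)
      relative[t] = frames[t] - frames[0]  (motion measured from frame 0)
    """
    jump = _sub(frames[0], base)
    relative = [_sub(frame, frames[0]) for frame in frames]
    return base, jump, relative


def recompose(base: Mesh, jump: Mesh, relative: List[Mesh]) -> List[Mesh]:
    """Invert decompose: frames[t] = base + jump + relative[t]."""
    start = _add(base, jump)  # rectified starting pose
    return [_add(start, offset) for offset in relative]


# Toy example: a two-vertex "mesh" whose first frame is shifted up by 1
# relative to the base pose, then moves up by another 1.
base_mesh: Mesh = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
video_frames: List[Mesh] = [
    [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],
]
_, jump, relative = decompose(base_mesh, video_frames)
reconstructed = recompose(base_mesh, jump, relative)
```

In this toy model, skipping the jump term and adding the relative offsets directly to the base mesh would start the animation from the wrong pose, which is the misalignment the abstract describes. How R-DMesh actually represents and predicts each component is not specified in the text above.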
