TapMo: Shape-aware Motion Generation of Skeleton-free Characters
Jiaxu Zhang, Shaoli Huang, Zhigang Tu, Xin Chen, Xiaohang Zhan, Gang Yu, Ying Shan

TL;DR
TapMo introduces a shape-aware, text-driven motion generation pipeline for skeleton-free 3D characters, enabling realistic animations without traditional rigging, and outperforms existing methods in quality and generalizability.
Contribution
The paper presents TapMo, a novel shape deformation-aware diffusion model that generates mesh-specific motions for non-rigged characters, eliminating the need for skeletal rigging.
Findings
Outperforms existing auto-animation methods in quality.
Works effectively on both seen and unseen 3D characters.
Handles diverse non-human meshes with shape-aware features.
Abstract
Previous motion generation methods are limited to the pre-rigged 3D human model, hindering their applications in the animation of various non-rigged characters. In this work, we present TapMo, a Text-driven Animation Pipeline for synthesizing Motion in a broad spectrum of skeleton-free 3D characters. The pivotal innovation in TapMo is its use of shape deformation-aware features as a condition to guide the diffusion model, thereby enabling the generation of mesh-specific motions for various characters. Specifically, TapMo comprises two main components - Mesh Handle Predictor and Shape-aware Diffusion Module. Mesh Handle Predictor predicts the skinning weights and clusters mesh vertices into adaptive handles for deformation control, which eliminates the need for traditional skeletal rigging. Shape-aware Motion Diffusion synthesizes motion with mesh-specific adaptations. This module…
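To make the handle-based deformation idea in the abstract concrete, here is a minimal NumPy sketch of how skinning weights and per-handle rigid transforms could move a mesh without a skeleton. It is an illustration under stated assumptions, not the authors' implementation: the function names (`handle_centers`, `deform`), the choice of weighted vertex centroids as handle positions, and the linear blending of per-handle rotations and translations are all assumptions. In TapMo the weights would come from the Mesh Handle Predictor and the transforms from the diffusion module; here they are plain arrays.

```python
import numpy as np

def handle_centers(vertices, weights):
    """Weighted centroid of the vertices attached to each handle (assumed definition)."""
    # vertices: (V, 3); weights: (V, K), each row sums to 1
    mass = np.maximum(weights.sum(axis=0), 1e-8)     # (K,), guard against empty handles
    return (weights.T @ vertices) / mass[:, None]    # (K, 3)

def deform(vertices, weights, rotations, translations):
    """Blend per-handle rigid transforms using the skinning weights."""
    # rotations: (K, 3, 3); translations: (K, 3)
    centers = handle_centers(vertices, weights)      # (K, 3)
    # Offset of every vertex from every handle center: (V, K, 3)
    local = vertices[:, None, :] - centers[None, :, :]
    # Rotate about each handle center, then translate: (V, K, 3)
    moved = np.einsum("kij,vkj->vki", rotations, local) + centers + translations
    # Weighted sum over handles gives the deformed vertices: (V, 3)
    return np.einsum("vk,vki->vi", weights, moved)

# Hypothetical toy mesh: 4 vertices softly assigned to 2 handles.
verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]])
skin = np.array([[1.0, 0.0], [0.5, 0.5], [0.5, 0.5], [0.0, 1.0]])
identity = np.stack([np.eye(3), np.eye(3)])
lift = np.tile(np.array([0.0, 0.0, 2.0]), (2, 1))   # move both handles up by 2
lifted = deform(verts, skin, identity, lift)
```

Because each row of the weights sums to 1, identity rotations with zero translations leave the mesh unchanged, and giving every handle the same translation shifts the whole mesh rigidly. Both properties are a quick sanity check for any handle-based deformer.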
Peer Reviews
Decision: ICLR 2024 poster
1. The generated animations for a variety of 3D characters are impressive. The structure of each 3D mesh is well recognized and reflected in its animation. 2. The application of the shape deformation feature to animation is nice.
There are still penetrations between the feet and the ground in the generated animations, which degrade the animation quality.
The research addresses an interesting and promising problem; as far as I know, it is the first attempt to enable text-driven motion synthesis for skeleton-free characters. Comprehensive experiments are conducted, yielding impressive results across diverse shapes. Supplementary videos and a user study further validate the naturalness of the generated results. The combination of diffusion-based motion synthesis and skeleton-free mesh deformation is interesting and novel.
Some details are not clearly explained, such as the mesh deformation feature, what exactly is f_ and how it's obtained, and its dimensions, which are not reflected in the main text. From the appendix, it seems to be a 512-dimensional vector. Further explanation from the authors is desired. And how does mesh-specific adaptation affect the vertices, it is not included in the equations. How is the Discriminator implemented? Are the two modules trained jointly or separately? What are the visualizat
1. It is good to study the generation of shape-aware motions, especially for non-humanoid 3D characters. 2. The proposed method seems reasonable and looks promising for generating motions for unseen characters.
1. The proposed mesh handle predictor is simple and straightforward, but it is not clear how the proposed method resolves different characters that have different topologies with different semantics. Currently, the manuscript mentions that "each handle is dynamically assigned to vertices with the same semantics across different meshes", but it is not clear how the method will select those handles. Also, it is unclear how the method will choose the number of handles since different topologies te
Taxonomy
Topics: Human Motion and Animation · 3D Shape Modeling and Analysis · Human Pose and Action Recognition
Methods: Diffusion
