Skin Tokens: A Learned Compact Representation for Unified Autoregressive Rigging

Jia-peng Zhang; Cheng-Feng Pu; Meng-Hao Guo; Yan-Pei Cao; Shi-Min Hu

arXiv:2602.04805·cs.GR·February 5, 2026

Open Access

TL;DR

This paper introduces SkinTokens, a learned discrete representation of skinning weights, and TokenRig, a unified autoregressive framework that models skeletons and skin deformations as a single sequence, improving rigging accuracy and robustness for 3D models.

Contribution

The paper proposes SkinTokens, a compact, discrete skinning representation, and develops TokenRig, a unified autoregressive rigging model that combines sequence modeling with reinforcement learning for better generalization.

Findings

01 Achieves a 98-133% improvement in skinning accuracy over state-of-the-art methods.

02 Improves bone prediction accuracy by 17-22% with reinforcement learning.

03 Provides a scalable, unified approach to 3D rigging with higher fidelity.

Abstract

The rapid proliferation of generative 3D models has created a critical bottleneck in animation pipelines: rigging. Existing automated methods are fundamentally limited by their approach to skinning, treating it as an ill-posed, high-dimensional regression task that is inefficient to optimize and is typically decoupled from skeleton generation. We posit this is a representation problem and introduce SkinTokens: a learned, compact, and discrete representation for skinning weights. By leveraging an FSQ-CVAE to capture the intrinsic sparsity of skinning, we reframe the task from continuous regression to a more tractable token sequence prediction problem. This representation enables TokenRig, a unified autoregressive framework that models the entire rig as a single sequence of skeletal parameters and SkinTokens, learning the complicated dependencies between skeletons and skin deformations.…
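
The move from continuous regression to token prediction rests on quantizing a latent vector into a small discrete code. The sketch below illustrates generic finite scalar quantization (FSQ) only; it is not the authors' FSQ-CVAE. The level counts in `LEVELS` and the helper names `fsq_quantize`, `codes_to_token`, and `token_to_codes` are assumptions made for illustration, and the encoder, decoder, and conditioning are omitted.

```python
import numpy as np

# Hypothetical per-dimension level counts (assumption, not taken from the paper).
# Odd counts keep the rounding grid symmetric around zero.
LEVELS = np.array([5, 5, 5])

def fsq_quantize(z, levels=LEVELS):
    """Bound each latent dimension with tanh, then round to the nearest integer code."""
    half = (levels - 1) / 2.0
    bounded = np.tanh(z) * half              # each value lies in (-half, half)
    return np.round(bounded).astype(np.int64)

def codes_to_token(codes, levels=LEVELS):
    """Pack per-dimension codes into one token id using a mixed-radix basis."""
    digits = codes + (levels - 1) // 2       # shift codes to [0, levels - 1]
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))
    return (digits * basis).sum(axis=-1)

def token_to_codes(token, levels=LEVELS):
    """Invert codes_to_token: recover per-dimension codes from a token id."""
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))
    digits = (np.asarray(token)[..., None] // basis) % levels
    return digits - (levels - 1) // 2

# Two example latent vectors, one per row.
z = np.array([[0.0, 10.0, -10.0],
              [0.3, -0.3, 1.0]])
codes = fsq_quantize(z)
tokens = codes_to_token(codes)
```

With these assumed levels the vocabulary has 5 × 5 × 5 = 125 entries, so each latent vector becomes a single integer that an autoregressive model can predict like any other token. In a trained model the rounding step would typically be paired with a straight-through gradient estimator, which this NumPy sketch leaves out.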

Taxonomy

Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Human Motion and Animation