MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion
Lehong Wu, Lilang Lin, Jiahang Zhang, Yiyang Ma, Jiaying Liu

TL;DR
MacDiff introduces a novel diffusion-based framework for skeleton modeling that improves representation learning and data augmentation, outperforming existing methods on key benchmarks.
Contribution
The paper pioneers the use of diffusion models for skeleton representation learning, integrating contrastive and generative objectives for enhanced performance.
Findings
Achieves state-of-the-art results on representation benchmarks.
Enhances fine-tuning performance with diffusion-based data augmentation.
Demonstrates the effectiveness of masked diffusion in skeleton modeling.
Abstract
Self-supervised learning has proved effective for skeleton-based human action understanding. However, previous works either rely on contrastive learning, which suffers from the false-negative problem, or are based on reconstruction, which learns too many unessential low-level cues, leading to limited representations for downstream tasks. Recently, great advances have been made in generative learning, which is naturally a challenging yet meaningful pretext task for modeling the general underlying data distributions. However, the representation learning capacity of generative models is under-explored, especially for skeletons with spatial sparsity and temporal redundancy. To this end, we propose Masked Conditional Diffusion (MacDiff) as a unified framework for human skeleton modeling. For the first time, we leverage diffusion models as effective skeleton representation learners. Specifically, we…
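The abstract is cut off before it describes the method, so the sketch below is not the authors' implementation. It only illustrates, in generic form, the two ingredients named in the title: random masking of skeleton tokens and the standard DDPM forward noising step x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. The function names, the toy skeleton shape, the linear noise schedule, and the 0.75 mask ratio are all assumptions chosen for illustration.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (a common DDPM default; assumed here)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def forward_noise(x0, alpha_bar_t, rng):
    """Sample x_t from q(x_t | x_0); returns the noisy sample and the noise used."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    scale_signal = math.sqrt(alpha_bar_t)
    scale_noise = math.sqrt(1.0 - alpha_bar_t)
    xt = [scale_signal * v + scale_noise * e for v, e in zip(x0, eps)]
    return xt, eps


def random_mask(num_tokens, mask_ratio, rng):
    """Return the sorted indices of tokens that stay visible after masking."""
    num_keep = max(1, round(num_tokens * (1.0 - mask_ratio)))
    return sorted(rng.sample(range(num_tokens), num_keep))


rng = random.Random(0)

# Toy skeleton clip (hypothetical shape): 4 frames x 3 joints x 3 coordinates, flattened.
num_frames, num_joints, num_coords = 4, 3, 3
x0 = [rng.uniform(-1.0, 1.0) for _ in range(num_frames * num_joints * num_coords)]

alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
t = 500  # an arbitrary diffusion timestep
xt, eps = forward_noise(x0, alpha_bars[t], rng)

# Treat each (frame, joint) pair as one token and hide 75% of them (assumed ratio).
visible = random_mask(num_frames * num_joints, 0.75, rng)
```

In a conditional setup of this kind, an encoder would typically see only the `visible` tokens of the clean clip, and a denoiser would be trained to predict `eps` from `xt` given the encoder's output. How MacDiff wires these pieces together is not stated in the truncated abstract above.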
Taxonomy
Topics: Elasticity and Material Modeling · Protein Structure and Dynamics · Bone health and osteoporosis research
Methods: Diffusion · Contrastive Learning
