CA$^2$T-Net: Category-Agnostic 3D Articulation Transfer from Single Image
Jasmine Collins, Anqi Liang, Jitendra Malik, Hao Zhang, Frédéric Devernay

TL;DR
This paper introduces CA$^2$T-Net, a neural network that transfers articulation from a single image to 3D models across arbitrary categories, independent of topology, enabling automatic animation and motion inference.
Contribution
The method is the first to transfer articulation from a single image to 3D models regardless of object category or topology, trained solely on synthetic data.
Findings
Successfully transfers articulation to diverse 3D models
Works with real images for motion inference
Operates across arbitrary object categories
Abstract
We present a neural network approach to transfer the motion from a single image of an articulated object to a rest-state (i.e., unarticulated) 3D model. Our network learns to predict the object's pose, part segmentation, and corresponding motion parameters to reproduce the articulation shown in the input image. The network is composed of three distinct branches that take a shared joint image-shape embedding as input, and it is trained end-to-end. Unlike previous methods, our approach is independent of the topology of the object and can work with objects from arbitrary categories. Our method, trained with only synthetic data, can be used to automatically animate a mesh, infer motion from real images, and transfer articulation to functionally similar but geometrically distinct 3D models at test time.
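To make the "animate a mesh" step concrete, here is a minimal sketch of how a predicted part segmentation and motion parameters could be applied to a rest-state mesh. It is not the authors' implementation. The abstract does not say how motion parameters are represented, so the joint dictionary (type, origin, axis, amount), the revolute/prismatic split, and the function names `rotate_about_axis` and `articulate` are all assumptions made for illustration.

```python
import math


def _normalize(axis):
    """Return `axis` scaled to unit length."""
    norm = math.sqrt(sum(a * a for a in axis))
    return [a / norm for a in axis]


def rotate_about_axis(point, origin, axis, angle):
    """Rotate `point` by `angle` radians about the line through `origin`
    with direction `axis`, using Rodrigues' rotation formula."""
    k = _normalize(axis)
    v = [p - o for p, o in zip(point, origin)]
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    return [r + o for r, o in zip(rotated, origin)]


def articulate(vertices, part_labels, moving_part, joint):
    """Move only the vertices labelled `moving_part`; leave the rest fixed.

    `joint` is a hypothetical container for predicted motion parameters:
    {"type": "revolute" | "prismatic", "origin": [x, y, z],
     "axis": [x, y, z], "amount": radians or distance}.
    """
    k = _normalize(joint["axis"])
    out = []
    for v, label in zip(vertices, part_labels):
        if label != moving_part:
            out.append(list(v))  # static part: copy unchanged
        elif joint["type"] == "revolute":
            out.append(
                rotate_about_axis(v, joint["origin"], joint["axis"], joint["amount"])
            )
        else:  # prismatic: slide along the unit axis
            out.append([v[i] + joint["amount"] * k[i] for i in range(3)])
    return out


# Toy example: one static vertex and one vertex on a hinged part.
rest_vertices = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
labels = [0, 1]
hinge = {"type": "revolute", "origin": [0.0, 0.0, 0.0],
         "axis": [0.0, 0.0, 1.0], "amount": math.pi / 2}
posed = articulate(rest_vertices, labels, moving_part=1, joint=hinge)
```

Because the transform is keyed only on per-vertex labels and a joint description, the same routine applies to any mesh regardless of its topology, which is the property the abstract emphasizes.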
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · 3D Surveying and Cultural Heritage
Methods: Test
