Statistical transformer networks: learning shape and appearance models via self supervision
Anil Bas, William A. P. Smith

TL;DR
The paper introduces Statistical Transformer Networks (StaTN), which replace the fixed sampling grid of a Spatial Transformer Network with a learnt deformable statistical shape model. Trained end-to-end, a StaTN learns task-specific nonrigid alignment, and its shape and appearance models are learnt without direct supervision and can be reused for other tasks.
Contribution
It proposes StaTN, an extension of Spatial Transformer Networks in which the sampling grid is a deformable statistical shape model learnt without direct supervision (such as landmarks).
Findings
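When a network containing a StaTN is trained end-to-end for a task, it learns the optimal nonrigid alignment of the input data for that task.
The statistical shape model is learnt with no direct supervision (such as landmarks), and an appearance model can also be learnt from scratch using a loss inspired by the minimum description length principle.
The learnt shape model can be reused for other tasks.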
Abstract
We generalise Spatial Transformer Networks (STN) by replacing the parametric transformation of a fixed, regular sampling grid with a deformable, statistical shape model which is itself learnt. We call this a Statistical Transformer Network (StaTN). By training a network containing a StaTN end-to-end for a particular task, the network learns the optimal nonrigid alignment of the input data for the task. Moreover, the statistical shape model is learnt with no direct supervision (such as landmarks) and can be reused for other tasks. Besides training for a specific task, we also show that a StaTN can learn a shape model using generic loss functions. This includes a loss inspired by the minimum description length principle in which an appearance model is also learnt from scratch. In this configuration, our model learns an active appearance model and a means to fit the model from scratch with…
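The core idea in the abstract, sampling an image on a grid generated by a deformable shape model rather than by a parametric warp of a regular grid, can be illustrated with a minimal NumPy sketch. It assumes a linear shape model (mean shape plus a weighted sum of deformation modes), which the abstract does not specify, and the names `shape_model_grid`, `bilinear_sample`, `mean_shape`, `basis`, and `params` are hypothetical. In an actual StaTN the modes would be learnt and the parameters predicted by a network; here both are fixed by hand.

```python
import numpy as np

def shape_model_grid(mean_shape, basis, params):
    """Sampling grid as a mean shape plus a linear combination of deformation modes.

    mean_shape: (N, 2) array of (x, y) sample locations in pixel coordinates.
    basis:      (K, N, 2) array of K deformation modes (assumed linear model).
    params:     (K,) shape parameters weighting the modes.
    """
    return mean_shape + np.tensordot(params, basis, axes=1)

def bilinear_sample(image, points):
    """Bilinearly interpolate a 2D image at (x, y) points, clamping to the border."""
    h, w = image.shape
    x = np.clip(points[:, 0], 0, w - 1)
    y = np.clip(points[:, 1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx = x - x0
    fy = y - y0
    top = (1 - fx) * image[y0, x0] + fx * image[y0, x1]
    bottom = (1 - fx) * image[y1, x0] + fx * image[y1, x1]
    return (1 - fy) * top + fy * bottom

# Toy example: a 4x4 image with image[y, x] = 4*y + x.
image = np.arange(16, dtype=float).reshape(4, 4)

# Mean shape: a regular 2x2 grid at (0,0), (1,0), (0,1), (1,1).
ys, xs = np.meshgrid(np.arange(2.0), np.arange(2.0), indexing="ij")
mean_shape = np.stack([xs.ravel(), ys.ravel()], axis=1)

# One hand-made deformation mode that shifts every point along x.
basis = np.zeros((1, 4, 2))
basis[0, :, 0] = 1.0
params = np.array([0.5])

grid = shape_model_grid(mean_shape, basis, params)
aligned = bilinear_sample(image, grid)
```

With `params` set to zero the grid reduces to the regular mean shape, which mirrors the fixed grid of a standard STN; nonzero parameters deform it. Because both functions are differentiable in `params` and `basis` almost everywhere, the same construction could in principle be trained by gradient descent in an autodiff framework.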
Taxonomy
Topics: Face recognition and analysis · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Spatial Transformer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam
