AETv2: AutoEncoding Transformations for Self-Supervised Representation Learning by Minimizing Geodesic Distances in Lie Groups
Feng Lin, Haohang Xu, Houqiang Li, Hongkai Xiong, Guo-Jun Qi

TL;DR
AETv2 introduces a novel self-supervised learning method that encodes transformations on Lie groups using geodesic distances, improving representation learning by better capturing the transformation manifold.
Contribution
The paper measures the deviation between estimated and ground-truth transformations by geodesic distance on a Lie group, instead of the Euclidean distance used by earlier AET models.
Findings
AETv2 outperforms previous models in multiple tasks.
Using geodesic distances improves transformation estimation accuracy.
The method effectively captures the manifold structure of transformations.
Abstract
Self-supervised learning by predicting transformations has demonstrated outstanding performance in both unsupervised and (semi-)supervised tasks. Among the state-of-the-art methods is AutoEncoding Transformations (AET), which decodes transformations from the learned representations of original and transformed images. Both deterministic and probabilistic AETs rely on the Euclidean distance to measure the deviation of estimated transformations from their ground-truth counterparts. However, this assumption is questionable, as a group of transformations often resides on a curved manifold rather than in a flat Euclidean space. For this reason, we should use the geodesic to characterize how an image transforms along the manifold of a transformation group, and adopt its length to measure the deviation between transformations. In particular, we propose to autoencode a Lie group of homography…
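To make the contrast between Euclidean and geodesic deviation concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's method: it uses the rotation group SO(3) as a stand-in Lie group (the abstract concerns a group of homographies, whose details are truncated above), and the helper names `rot_z`, `euclidean_distance`, and `geodesic_distance` are hypothetical. On SO(3), the geodesic distance between two rotations is the angle of the relative rotation, recovered from its trace.

```python
import math


def rot_z(theta):
    """3x3 rotation about the z-axis by `theta` radians (illustrative helper)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    """Transpose of a 3x3 matrix stored as nested lists."""
    return [[a[j][i] for j in range(3)] for i in range(3)]


def euclidean_distance(a, b):
    """Frobenius norm of a - b: treats the matrices as points in flat R^9."""
    return math.sqrt(sum((a[i][j] - b[i][j]) ** 2
                         for i in range(3) for j in range(3)))


def geodesic_distance(a, b):
    """Geodesic distance on SO(3): the rotation angle of a^T b.

    Assumption: SO(3) is used only as an example of a curved Lie group.
    """
    rel = matmul(transpose(a), b)
    trace = rel[0][0] + rel[1][1] + rel[2][2]
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_angle)


# Compare both distances from the identity for growing rotation angles.
identity = rot_z(0.0)
for angle in (0.1, 1.0, math.pi):
    r = rot_z(angle)
    print(angle, euclidean_distance(identity, r), geodesic_distance(identity, r))
```

For rotations, the Euclidean (chord) distance equals 2√2·sin(θ/2), where θ is the geodesic distance. It is roughly proportional to θ for small angles but saturates at 2√2 as θ approaches π, so it under-penalizes large deviations along the manifold.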
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition
