Quantised Transforming Auto-Encoders: Achieving Equivariance to Arbitrary Transformations in Deep Networks

Jianbo Jiao; João F. Henriques

arXiv:2111.12873·cs.CV·November 29, 2021

TL;DR

This paper introduces a novel auto-encoder architecture that learns to be equivariant to arbitrary transformations, including non-geometric ones, purely from data, enhancing interpretability and robustness of deep networks.

Contribution

It proposes a flexible auto-encoder model that enforces multiple equivariance relations simultaneously without explicit transformation models.
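Stated formally, an equivariance relation says that transforming the input changes the embedding in a predictable way. The block below is a sketch of that definition only. The symbols (the encoder \phi, the input-space transformation T, the embedding-space action \rho, and the index k over transformation types) are illustrative notation assumed here, not taken from the paper.

```latex
% Assumed notation: \phi is the encoder, T^{(k)}_{g} applies the k-th kind of
% transformation (e.g. translation, rotation, colour change) by amount g to
% the input x, and \rho^{(k)}_{g} is the corresponding action on the embedding.
\[
  \phi\bigl(T^{(k)}_{g}\, x\bigr) \;=\; \rho^{(k)}_{g}\, \phi(x)
  \qquad \text{for all inputs } x,\ \text{all amounts } g,\ \text{and } k = 1, \dots, K .
\]
% Enforcing several equivariance relations "simultaneously" means that this
% identity holds for every k with one and the same encoder \phi.
```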

Findings

01. Successfully re-renders transformed images on synthetic and real datasets.

02. Achieves object pose estimation with improved robustness.

03. Reduces to a CNN for translation-equivariance.

Abstract

In this work we investigate how to achieve equivariance to input transformations in deep networks, purely from data, without being given a model of those transformations. Convolutional Neural Networks (CNNs), for example, are equivariant to image translation, a transformation that can be easily modelled (by shifting the pixels vertically or horizontally). Other transformations, such as out-of-plane rotations, do not admit a simple analytic model. We propose an auto-encoder architecture whose embedding obeys an arbitrary set of equivariance relations simultaneously, such as translation, rotation, colour changes, and many others. This means that it can take an input image, and produce versions transformed by a given amount that were not observed before (e.g. a different point of view of the same object, or a colour variation). Despite extending to many (even non-geometric)…
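The translation case mentioned in the abstract can be checked numerically. The following is a minimal sketch, not the paper's model: it assumes a 1D signal standing in for a row of pixels, wrap-around boundaries, and a single linear filter with no bias or nonlinearity. The function names `circular_shift` and `circular_correlate` are hypothetical helpers introduced for illustration.

```python
def circular_shift(signal, amount):
    """Shift a 1D signal right by `amount` positions, wrapping around."""
    n = len(signal)
    return [signal[(i - amount) % n] for i in range(n)]


def circular_correlate(signal, kernel):
    """Apply one linear filter at every position, with wrap-around boundaries.

    This plays the role of a single convolutional layer (no bias, no
    nonlinearity) in this illustration.
    """
    n = len(signal)
    return [
        sum(kernel[k] * signal[(i + k) % n] for k in range(len(kernel)))
        for i in range(n)
    ]


# Illustrative values only.
signal = [0.0, 1.0, 3.0, -2.0, 0.5, 4.0, -1.0, 2.0]
kernel = [0.25, 0.5, -1.0]
amount = 3

# Equivariance: shifting the input and then filtering gives the same result
# as filtering first and then shifting the output by the same amount.
shift_then_filter = circular_correlate(circular_shift(signal, amount), kernel)
filter_then_shift = circular_shift(circular_correlate(signal, kernel), amount)

max_gap = max(abs(a - b) for a, b in zip(shift_then_filter, filter_then_shift))
is_equivariant = max_gap < 1e-12
```

Here the transformation acts on the output in the same way as on the input (a shift), which is what makes translation easy to model analytically. For transformations such as out-of-plane rotation there is no comparably simple operation on pixels, which is the gap the abstract describes.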

Taxonomy

Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications