A Mechanism for Producing Aligned Latent Spaces with Autoencoders
Saachi Jain, Adityanarayanan Radhakrishnan, Caroline Uhler

TL;DR
This paper provides a theoretical framework showing how autoencoders produce aligned latent spaces, and demonstrates their application in aligning biological and linguistic data.
Contribution
It characterizes the stretching behavior of linear and nonlinear autoencoders and introduces an initialization scheme for controlled alignment.
Findings
Linear autoencoders stretch along the left singular vectors of the data.
An initialization scheme enables arbitrary stretching in linear autoencoders.
Autoencoders can align drug signatures across cell types in gene expression space and semantic shifts in word embedding spaces.
Abstract
Aligned latent spaces, where meaningful semantic shifts in the input space correspond to a translation in the embedding space, play an important role in the success of downstream tasks such as unsupervised clustering and data imputation. In this work, we prove that linear and nonlinear autoencoders produce aligned latent spaces by stretching along the left singular vectors of the data. We fully characterize the amount of stretching in linear autoencoders and provide an initialization scheme to arbitrarily stretch along the top directions using these networks. We also quantify the amount of stretching in nonlinear autoencoders in a simplified setting. We use our theoretical results to align drug signatures across cell types in gene expression space and semantic shifts in word embedding spaces.
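As a rough illustration of what "stretching along the left singular vectors" means, the sketch below estimates the top left singular vector of a small toy dataset and measures how much a linear map scales that direction compared with the orthogonal one. It is not the paper's method: the map W is constructed by hand rather than learned by an autoencoder, the data are made up, and the helper names (top_left_singular_vector, stretch_map, stretch) are hypothetical.

```python
import math

def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def top_left_singular_vector(samples, iters=200):
    """Power iteration on X X^T, where the columns of X are the samples.

    The top eigenvector of X X^T is the top left singular vector of X.
    """
    d = len(samples[0])
    gram = [[sum(s[i] * s[j] for s in samples) for j in range(d)]
            for i in range(d)]
    v = [1.0] + [0.0] * (d - 1)
    for _ in range(iters):
        v = matvec(gram, v)
        n = norm(v)
        v = [x / n for x in v]
    return v

def stretch_map(u, alpha):
    """Hand-built matrix I + (alpha - 1) u u^T (assumes u has unit norm).

    It scales by alpha along u and leaves orthogonal directions unchanged.
    """
    d = len(u)
    return [[(1.0 if i == j else 0.0) + (alpha - 1.0) * u[i] * u[j]
             for j in range(d)] for i in range(d)]

def stretch(W, direction):
    """Factor by which W lengthens the given direction."""
    return norm(matvec(W, direction)) / norm(direction)

# Toy 2-D samples (made up) that vary mostly along the diagonal.
samples = [[2.0, 2.1], [-1.0, -0.9], [3.0, 2.9], [-2.0, -2.1], [0.1, -0.1]]

u1 = top_left_singular_vector(samples)   # dominant data direction
u2 = [-u1[1], u1[0]]                     # orthogonal direction in 2-D

W = stretch_map(u1, alpha=3.0)           # stand-in for a learned linear map
top_stretch = stretch(W, u1)
orth_stretch = stretch(W, u2)

print("top direction:", u1)
print("stretch along top direction:", top_stretch)
print("stretch along orthogonal direction:", orth_stretch)
```

In this toy setup, inputs that differ along the dominant data direction end up further apart after applying W than inputs that differ along the orthogonal direction, which is the kind of direction-dependent scaling the abstract calls stretching. How much a trained autoencoder actually stretches each direction is what the paper characterizes; the value alpha=3.0 here is arbitrary.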
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Machine Learning in Healthcare · Neural Networks and Applications
