Lifting Architectural Constraints of Injective Flows
Peter Sorrenson, Felix Draxler, Armand Rousselot, Sander Hummerich, Lea Zimmermann, Ullrich Köthe

TL;DR
This paper introduces an efficient method for training Injective Flows that jointly learn data manifolds and distributions, overcoming previous architectural and computational limitations, and demonstrating competitive results across various data types.
Contribution
It proposes a new stable maximum likelihood training objective for Injective Flows compatible with flexible architectures, enabling better manifold learning.
Findings
Efficient estimator for maximum likelihood compatible with free-form architectures
Stable training method prevents divergence when learning data manifolds
Competitive performance demonstrated on toy, tabular, and image datasets
Abstract
Normalizing Flows explicitly maximize a full-dimensional likelihood on the training data. However, real data is typically only supported on a lower-dimensional manifold, leading the model to expend significant compute on modeling noise. Injective Flows fix this by jointly learning a manifold and the distribution on it. So far, they have been limited by restrictive architectures and/or high computational cost. We lift both constraints by a new efficient estimator for the maximum likelihood loss, compatible with free-form bottleneck architectures. We further show that naively learning both the data manifold and the distribution on it can lead to divergent solutions, and use this insight to motivate a stable maximum likelihood training objective. We perform extensive experiments on toy, tabular and image data, demonstrating the competitive performance of the resulting model.
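To make the objective concrete: an Injective Flow evaluates likelihood only along the learned manifold, using the injective change-of-variables formula log p(x) = log p_Z(z) - 1/2 log det(J^T J), where J is the decoder Jacobian at the latent code z. The following is a minimal sketch of that quantity on a toy one-dimensional manifold in the plane. It is not the paper's estimator: the parabola decoder, the encoder, and all function names are hypothetical choices made for illustration, and the Jacobian term is written in closed form rather than estimated.

```python
import math


def decoder(z):
    """Hypothetical injective decoder g: R -> R^2, tracing the parabola (z, z^2)."""
    return (z, z * z)


def encoder(x):
    """Hypothetical encoder f: R^2 -> R. It is a left inverse of g: f(g(z)) == z."""
    return x[0]


def log_standard_normal(z):
    """Log-density of a 1D standard normal latent (an assumed latent prior)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def manifold_log_likelihood(x):
    """log p(x) = log p_Z(z) - 0.5 * log det(J^T J), evaluated at z = f(x)."""
    z = encoder(x)
    # The Jacobian of g at z is the column vector (1, 2z), so J^T J = 1 + 4 z^2.
    jtj = 1.0 + 4.0 * z * z
    return log_standard_normal(z) - 0.5 * math.log(jtj)


def reconstruction_error(x):
    """Squared distance between x and its projection g(f(x)) onto the manifold."""
    x_hat = decoder(encoder(x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


if __name__ == "__main__":
    on_manifold = decoder(1.0)   # the point (1.0, 1.0), lying on the parabola
    off_manifold = (1.0, 3.0)    # same latent code, but off the parabola
    print(manifold_log_likelihood(on_manifold), reconstruction_error(on_manifold))
    print(manifold_log_likelihood(off_manifold), reconstruction_error(off_manifold))
```

Both points in the usage example receive the same on-manifold likelihood because they share a latent code; only the reconstruction error tells them apart, which is why the two terms are trained jointly. For realistic decoders the log-determinant has no closed form, and evaluating it exactly is a main source of the computational cost the abstract refers to.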
Peer Reviews
Decision: ICLR 2024 poster
I'll enumerate the strengths below for ease of reference in discussion. These are not listed in order of importance.
1. The paper is generally very well-written and comes across as quite polished. I'll outline below:
- The introduction is very clean. The motivation for the method is clearly laid out.
- The paper is well-situated amongst the related work.
- The background is easy to digest.
- The path to the final model in Section 4 is laid out well.
- The appendix is quite thorough.
I'll write weaknesses in a list as well. Again, this list is not ordered in terms of importance.
1. In the end, this paper could be summarized as simply training autoencoders with a different training loss, with the loss motivated by previous work in injective flows. The novelty and significance of this particular choice of loss over other types of autoencoder losses is not completely clear for a couple of reasons: (i) Table 3 is not convincing, as the best results are still produced by other au
The presented techniques are original, with contributions on removing limitations from prior methods. The presented techniques are interesting and potentially valuable.
The clarity should be significantly improved. For example, many important derivations should be moved to the main manuscript, and important assumptions should be highlighted.
This paper has several strengths:
1. It is well written and easy to follow, and I think the authors did a good job of deciding which material to include in the main manuscript and which details to include in the appendix.
2. It is well motivated, as I agree with the authors that the current injective flow literature uses overly restrictive architectures and/or computationally intensive training procedures.
3. The simple observation that, when the encoder $f$ is the left inverse of the decoder
5. In my view, the main weakness of the paper is the lack of ablations. As mentioned, the paper proposes 3 improvements over rectangular flows, and it is unclear how much each of these contributes to the empirical performance of the proposed method. I think Table 1 provides a perfect test bed to carry out these ablations: results using the same architecture as rectangular flows should be added to the table, both (a) using the gradient estimator from Eq. 10, and (b) that from Eq. 16. Ideally, using
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Machine Learning in Healthcare · Anomaly Detection Techniques and Applications
Methods: Normalizing Flows
