Linear combinations of latents in generative models: subspaces and beyond
Erik Bodin, Alexandru Stere, Dragos D. Margineantu, Carl Henrik Ek, Henry Moss

TL;DR
This paper introduces LOL, a versatile method for forming linear combinations of latent variables in generative models, enabling more controlled and expressive data generation across different modalities.
Contribution
It proposes LOL, a general-purpose, easy-to-implement approach for linear combinations of latents that works broadly across models and data types.
Findings
LOL simplifies the creation of low-dimensional latent representations
It effectively constructs subspaces within the latent space
LOL enhances control over generative processes
Abstract
Sampling from generative models has become a crucial tool for applications like data synthesis and augmentation. Diffusion, Flow Matching and Continuous Normalising Flows have shown effectiveness across various modalities, and rely on latent variables for generation. For experimental design or creative applications that require more control over the generation process, it has become common to manipulate the latent variable directly. However, existing approaches for performing such manipulations (e.g. interpolation or forming low-dimensional representations) only work well in special cases or are network or data-modality specific. We propose Latent Optimal Linear combinations (LOL) as a general-purpose method to form linear combinations of latent variables that adhere to the assumptions of the generative model. As LOL is easy to implement and naturally addresses the broader task of…
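The idea of a linear combination that still adheres to the model's assumptions can be illustrated in the simplest case. The sketch below assumes the latents are i.i.d. draws from a standard Gaussian prior N(0, I), which is an assumption made here and not something the abstract states. Under that assumption, a weighted sum has per-coordinate variance equal to the sum of squared weights, so dividing by its square root restores unit variance. The function name `combine_latents` is hypothetical, and this is not the authors' implementation; the full LOL method may cover more general settings.

```python
import numpy as np

def combine_latents(latents, weights):
    """Combine i.i.d. N(0, I) latents linearly, then rescale back to N(0, I).

    Hypothetical helper for illustration; assumes a zero-mean, identity-covariance prior.
    """
    latents = np.asarray(latents, dtype=float)   # shape (K, D)
    weights = np.asarray(weights, dtype=float)   # shape (K,)
    y = weights @ latents                        # raw linear combination, shape (D,)
    beta = np.sum(weights ** 2)                  # per-coordinate variance of y under the assumption
    if beta == 0.0:
        raise ValueError("weights must not all be zero")
    return y / np.sqrt(beta)                     # unit variance again

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 10_000))             # two latents of dimension 10,000

naive = 0.5 * x[0] + 0.5 * x[1]                  # plain midpoint interpolation
z = combine_latents(x, [0.5, 0.5])               # rescaled combination

print("naive std:   ", naive.std())              # noticeably below 1
print("rescaled std:", z.std())                  # close to 1
```

The plain midpoint shrinks the coordinates (standard deviation near 0.71 instead of 1), which is the kind of mismatch with the prior that the abstract says existing manipulations can suffer from.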
Peer Reviews
Decision: ICLR 2025 Poster
- Well-motivated problem. The proposed method is novel, reasonable, and, most importantly, relatively simple, yet seems to perform quite well in (certain) empirical benchmarks.
- The writing of the paper can be improved. Some parts are ambiguously written. For example, the notation for the latent $x_T$ could be unified for flow matching and diffusion at time $T$, instead of writing both $x(0)$ and $x_T$, to avoid confusion. The important observation starting at line 206 ("Having a latent vector with a norm that is likely for a sample from...") is extremely unclear. In fact, the sentence does not make sense.
- I have a big question mark about the usage of the univariate te
1. The core argument of the paper, "the norm is not a sufficient statistic to look at when determining whether the latent is good for generations", is intuitive and sound. Figure 3 in particular presents a convincing argument. 2. The paper has many illustrations of the results of the different interpolation methods. The ones of low-dimensional subspaces are especially nice.
1. The contribution of this paper is limited. That the norm is not enough is already known, and the proposed interpolation method, in the diffusion/flow model setting, just rescales the sample's norm: $\alpha=0$ and we rescale the sample by the expected norm gain $\sqrt{\beta}$. 2. The normality testing part feels out of place and under-explored. I assume it is there to justify that the norm is not enough and that broader characteristics are needed. In that case, Figure 3 alone is enough. Then the paper never
The method is simple and easy to understand.
I think the examples in the paper are not convincing that the proposed method is much better than the baselines. Some numbers indicate this but there is no human evaluation in the results. I think that better examples would have made the paper stronger.
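The reviewers' point that the norm alone does not tell whether a latent looks like a prior sample can be checked with a toy case. The sketch below assumes a standard Gaussian prior N(0, I) in D dimensions (an assumption made here, with D chosen arbitrarily) and compares a genuine sample with a constant vector that has exactly the typical norm sqrt(D). It is an illustration only, not the paper's normality test.

```python
import numpy as np

D = 4096                                  # latent dimension, chosen arbitrarily
rng = np.random.default_rng(0)

gaussian = rng.standard_normal(D)         # genuine sample from N(0, I)
constant = np.ones(D)                     # norm is exactly sqrt(D), yet clearly not Gaussian

for name, v in [("gaussian", gaussian), ("constant", constant)]:
    norm_ratio = np.linalg.norm(v) / np.sqrt(D)   # close to 1 for a typical N(0, I) sample
    print(f"{name:9s} norm/sqrt(D)={norm_ratio:.3f} "
          f"coord mean={v.mean():.3f} coord std={v.std():.3f}")
```

Both vectors have a norm ratio near 1, but the constant vector has coordinate mean 1 and standard deviation 0, so statistics beyond the norm are needed to tell the two apart.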
Taxonomy
Topics: Theoretical and Computational Physics · Advanced Mathematical Modeling in Engineering
Methods: Diffusion · Normalizing Flows
