What We Don't C: Manifold Disentanglement for Structured Discovery
Brian Rogers, Micah Bowles, Chris J. Lintott, Steve Croft, Oliver N. F. King, James Kostas Ray

TL;DR
This paper presents What We Don't C, a method that disentangles latent representations by removing conditioned information, enabling better exploration of unrepresented factors in high-dimensional data.
Contribution
It introduces a novel latent flow matching technique for explicit disentanglement of latent subspaces by removing conditional information.
Findings
Enhances interpretability of latent representations.
Facilitates discovery of unmodeled data factors.
Provides a simple mechanism for analyzing and controlling generative models.
Abstract
Accessing information in learned representations is critical for annotation, discovery, and data filtering in disciplines where high-dimensional datasets are common. We introduce What We Don't C, a novel approach based on latent flow matching that disentangles latent subspaces by explicitly removing information included in conditional guidance, resulting in meaningful residual representations. This allows factors of variation which have not already been captured in conditioning to become more readily available. We show how guidance in the flow path necessarily represses the information from the guiding, conditioning variables. Our results highlight this approach as a simple yet powerful mechanism for analyzing, controlling, and repurposing latent representations, providing a pathway toward using generative models to explore what we don't capture, consider, or catalog.
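The abstract does not spell out the training objective, so the following is only a minimal sketch of a generic conditional flow matching loss over latent vectors, not the authors' implementation. It assumes a straight-line path from a latent z (t = 0) to Gaussian noise (t = 1) and a velocity model that also receives the conditioning variable c. The names flow_matching_pair, conditional_fm_loss, and velocity_fn are hypothetical, and the direction of the flow is an assumption.

```python
import numpy as np


def flow_matching_pair(z, eps, t):
    """Return the point x_t on the straight line from z (t=0) to eps (t=1)
    and the constant velocity of that line. Assumed path, not from the paper."""
    t = np.asarray(t, dtype=float).reshape(-1, 1)  # one time value per sample
    x_t = (1.0 - t) * z + t * eps
    target_velocity = eps - z
    return x_t, target_velocity


def conditional_fm_loss(velocity_fn, z, c, rng):
    """Monte Carlo estimate of a conditional flow matching loss.

    velocity_fn(x_t, t, c) is a hypothetical model that predicts the velocity
    at x_t given time t and the conditioning variable c.
    """
    eps = rng.standard_normal(z.shape)   # noise endpoint of the path
    t = rng.uniform(size=z.shape[0])     # one random time per sample
    x_t, target = flow_matching_pair(z, eps, t)
    pred = velocity_fn(x_t, t, c)
    # Mean over the batch of the squared error summed over latent dimensions.
    return float(np.mean(np.sum((pred - target) ** 2, axis=1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.standard_normal((8, 2))      # stand-in for encoder latents
    c = rng.integers(0, 3, size=8)       # stand-in for known labels
    zero_model = lambda x_t, t, c: np.zeros_like(x_t)  # untrained placeholder
    print(conditional_fm_loss(zero_model, z, c, rng))
```

In this sketch the conditioning variable enters only through the velocity model. The abstract's claim is that conditioning the flow in this way removes the conditioned information from the resulting representation; the sketch illustrates the shape of such an objective and does not demonstrate that claim.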
Peer Reviews
Decision · Submitted to ICLR 2026
1. The idea of combining flow matching with variational autoencoders (VAEs) is interesting and has the potential to inspire further exploration in disentangled representation learning.
2. The paper is well-structured.
1. Since the method is built on top of VAEs and relies on the approximately Gaussian distribution of their latent space, its use is restricted to this specific class of generative models.
2. The paper lacks sufficient supporting evidence. In the experimental section, the author evaluates the method on synthetic 2D Gaussian data, CMNIST, and a real-world dataset. All three datasets are relatively simple, and other existing disentanglement methods are known to perform well on them—particularly o
- addresses an important problem in an interesting way (including allowing further disentanglement of pretrained models)
- reasonable breadth of experiments, from simple controlled to complex real datasets, including intuitive results in figures 6 and 7
- no reproducibility statement or open-source code, which is especially important for less theoretical contributions like this
- no clear contextualization or comparison against existing disentanglement approaches, and no argument for why they are missing
- hard-to-follow theory presentation in sections 2 and 3; maybe I just lack the background, but I guess I'm not the only reader who would benefit from gentler, more precise guidance through it
- unpolished writing
The paper has a clear and compelling motivation: instead of continuing to reinforce information we already understand in a dataset, it focuses on uncovering what remains after known factors are removed. This conceptual reframing is refreshing and feels genuinely useful, especially for exploratory scientific analysis. The authors present the idea in an intuitive way, and the progression of experiments (from synthetic data to real astrophysics imagery) helps build trust in the approach. The qualit
The main limitation is that the evaluation remains largely qualitative, making it difficult to assess how well the method performs relative to established baselines in representation learning or disentanglement research. The paper would benefit from more systematic quantitative comparisons or metrics to support its claims. Some of the theoretical explanations around how information is preserved or removed during the latent flow process are also hard to follow and could use clearer intuition rath
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Domain Adaptation and Few-Shot Learning
