Pullback Flow Matching on Data Manifolds

Friso de Kruiff; Erik Bekkers; Ozan Öktem; Carola-Bibiane Schönlieb; Willem Diepeveen

arXiv:2410.04543·cs.LG·July 10, 2025

Open Access · 3 Reviews

TL;DR

Pullback Flow Matching (PFM) introduces a new framework for generative modeling on data manifolds, leveraging pullback geometry and isometric learning to improve manifold learning and sample generation.

Contribution

PFM is the first method to use pullback geometry and isometric learning for scalable, efficient generative modeling on data manifolds with designable latent spaces.

Findings

01. Enhanced manifold learning and interpolation in latent space.

02. Successful application to protein data for generating novel proteins.

03. Improved generative performance over existing methods.

Abstract

We propose Pullback Flow Matching (PFM), a novel framework for generative modeling on data manifolds. Unlike existing methods that assume or learn restrictive closed-form manifold mappings for training Riemannian Flow Matching (RFM) models, PFM leverages pullback geometry and isometric learning to preserve the underlying manifold's geometry while enabling efficient generation and precise interpolation in latent space. This approach not only facilitates closed-form mappings on the data manifold but also allows for designable latent spaces, using assumed metrics on both data and latent manifolds. By enhancing isometric learning through Neural ODEs and proposing a scalable training objective, we achieve a latent space more suitable for interpolation, leading to improved manifold learning and generative performance. We demonstrate PFM's effectiveness through applications in synthetic data,…
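The pullback idea in the abstract can be illustrated with a toy sketch. It assumes a Euclidean latent metric and a hand-written diffeomorphism `phi` on the plane; both are hypothetical stand-ins chosen for illustration, not the learned Neural ODE map or the metrics used in the paper. Under these assumptions, distances and geodesics in data space have closed forms: map to latent space, measure or interpolate linearly there, and map back.

```python
import math

# Hypothetical toy diffeomorphism of the plane (an assumption for illustration,
# not the paper's learned map): it flattens the parabola x2 = x1^2 onto z2 = 0.
def phi(x):
    x1, x2 = x
    return (x1, x2 - x1 ** 2)

def phi_inv(z):
    z1, z2 = z
    return (z1, z2 + z1 ** 2)

def pullback_geodesic(x, y, t):
    """Geodesic under the pullback of the (assumed) Euclidean latent metric:
    interpolate linearly in latent space, then map back to data space."""
    zx, zy = phi(x), phi(y)
    z = tuple((1.0 - t) * a + t * b for a, b in zip(zx, zy))
    return phi_inv(z)

def pullback_distance(x, y):
    """Pullback distance: Euclidean distance between the latent images."""
    return math.dist(phi(x), phi(y))

# Two points on the parabola; the geodesic between them stays on the parabola.
x = (-1.0, 1.0)
y = (2.0, 4.0)
mid = pullback_geodesic(x, y, 0.5)
print(mid)                      # (0.5, 0.25)
print(pullback_distance(x, y))  # 3.0
```

In this sketch the midpoint lies on the parabola, whereas a straight-line midpoint in data space, (0.5, 2.5), would not. This is the sense in which interpolation in latent space respects the data manifold.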

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 3 · Confidence 4

Strengths

- This paper demonstrates strong mathematical foundations and theoretical rigor, which helps ensure the method’s reliability and provides theoretical guarantees for its performance.
- The neural ODE-based diffeomorphisms and the design of the product manifold eliminate the need for expensive Riemannian metric tensor calculations and require only geodesic distance approximations.

Weaknesses

- While PFM is presented as an improvement over RFM, there’s no quantitative evidence showing the relative performance. The paper does not provide direct experimental comparisons between PFM and RFM.
- The paper mainly compares PFM with CFM and VAE, and misses comparison with more state-of-the-art models, such as Diffusion, GAN, SLERP.
- While the paper claims geometric preservation through pullback geometry and isometric learning, no formal theoretical proof is provided showing that PFM actua

Reviewer 02 · Rating 5 · Confidence 3

Strengths

The introduction and background read well, and I find the method especially well motivated. Overall, the manuscript is clear and the presentation is easy to follow. The authors validate their method on toy and real data, and they conduct an ablation study on various loss terms.

Weaknesses

- The contributions of the paper appear somewhat limited, as it builds upon Diepeveen (2024). I think it would be beneficial if, in the main body of the text, the authors explained in more detail how their work differs from Diepeveen (2024).
- To improve on the current experiments, it would be great to add another toy dataset with known geodesics, e.g. a swiss roll rotated in higher dimensions.
- On line 200, the authors mention that the method requires fewer epochs, but I don't see an exper

Reviewer 03 · Rating 3 · Confidence 4

Strengths

- The technical aspects of the paper seem to be solid.
- The approach for learning a Riemannian metric in the data space is interesting.
- The experiments demonstrate the effectiveness of the proposed model.

Weaknesses

- The paper seems to rely heavily on Diepeveen (2024), with the primary difference being an extension of the diffeomorphism, while the Riemannian flow matching part is a direct application of Chen & Lipman (2024). Although the experiment with protein data is interesting, the technical contribution feels somewhat limited.
- While the writing is technically accurate, the clarity could be improved. For example, adding figures would aid in explaining the modeling approach.
- The objective function i

Taxonomy

Topics: Time Series Analysis and Forecasting · Anomaly Detection Techniques and Applications