Mixed Continuous and Categorical Flow Matching for 3D De Novo Molecule Generation
Ian Dunn, David Ryan Koes

TL;DR
This paper introduces FlowMol, a flow matching model for 3D de novo molecule generation that outperforms prior flow matching methods, and investigates the challenges of modeling categorical data within flow matching frameworks.
Contribution
The paper extends flow matching to categorical data via SimplexFlow and shows that a simpler approach, one that makes no accommodation for categorical data, can match or outperform this extension in 3D molecule generation.
Findings
SimplexFlow effectively models categorical data on the probability simplex.
A simpler approach that makes no accommodation for categorical data performs as well as or better than SimplexFlow.
FlowMol achieves improved performance over prior flow matching methods.
Abstract
Deep generative models that produce novel molecular structures have the potential to facilitate chemical discovery. Diffusion models currently achieve state of the art performance for 3D molecule generation. In this work, we explore the use of flow matching, a recently proposed generative modeling framework that generalizes diffusion models, for the task of de novo molecule generation. Flow matching provides flexibility in model design; however, the framework is predicated on the assumption of continuously-valued data. 3D de novo molecule generation requires jointly sampling continuous and categorical variables such as atom position and atom type. We extend the flow matching framework to categorical data by constructing flows that are constrained to exist on a continuous representation of categorical data known as the probability simplex. We call this extension SimplexFlow. We explore…
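The idea of a flow constrained to the probability simplex can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a straight-line interpolant between a prior sample and the data, a uniform Dirichlet prior over atom types, and a Gaussian prior over positions, and every function name and coordinate below is hypothetical. Because the simplex is convex, any point on the line between a simplex-valued prior sample and a one-hot atom type is itself a valid probability vector.

```python
import random


def sample_simplex_point(rng, n_types):
    """Draw from Dirichlet(1, ..., 1) by normalizing i.i.d. Exp(1) draws.

    Assumption: a uniform prior over the simplex, chosen for illustration.
    """
    draws = [rng.expovariate(1.0) for _ in range(n_types)]
    total = sum(draws)
    return [d / total for d in draws]


def one_hot(index, n_types):
    """Encode a categorical value as a vertex of the simplex."""
    return [1.0 if i == index else 0.0 for i in range(n_types)]


def interpolate(x0, x1, t):
    """Straight-line path from x0 (t = 0) to x1 (t = 1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def conditional_velocity(x0, x1):
    """Time derivative of the straight-line path, constant in t."""
    return [b - a for a, b in zip(x0, x1)]


rng = random.Random(0)
n_types = 4  # hypothetical number of atom types

# Categorical variable: atom type, relaxed to a point on the simplex.
type_prior = sample_simplex_point(rng, n_types)
type_target = one_hot(2, n_types)

# Continuous variable: atom position (made-up target coordinates).
pos_prior = [rng.gauss(0.0, 1.0) for _ in range(3)]
pos_target = [0.5, -1.2, 0.3]

t = 0.3
type_t = interpolate(type_prior, type_target, t)
pos_t = interpolate(pos_prior, pos_target, t)
type_velocity = conditional_velocity(type_prior, type_target)
```

In this sketch the entries of `type_velocity` sum to zero, so moving along it never leaves the plane in which the entries sum to one. A model trained to regress such velocities would handle positions and types with the same interpolation rule, which is the joint continuous and categorical setting the abstract describes.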
Taxonomy
Topics: Innovative Microfluidic and Catalytic Techniques Innovation · Statistical and Computational Modeling · Machine Learning and Data Classification
Methods: Diffusion
