FlexiFlow: decomposable flow matching for generation of flexible molecular ensemble
Riccardo Tedoldi, Ola Engkvist, Patrick Bryant, Hossein Azizpour, Jon Paul Janet, Alessandro Tibo

TL;DR
FlexiFlow is a novel generative model that efficiently produces diverse molecular conformations and ensembles, improving over existing methods by capturing conformational diversity and enabling faster inference for drug discovery.
Contribution
We introduce FlexiFlow, a flow-matching architecture that jointly samples molecules and multiple conformations, preserving equivariance and permutation invariance, advancing 3D molecular generation techniques.
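For readers unfamiliar with flow matching, the following is a minimal sketch of the generic technique that FlexiFlow builds on. It is not the authors' architecture: it handles a single flat coordinate vector, ignores equivariance and the joint molecule/conformer factorization, and all names (`interpolate`, `fm_loss`, `euler_sample`, `oracle`) are hypothetical. It assumes the common straight-line probability path between a noise sample `x0` and a data sample `x1`.

```python
# Generic flow-matching sketch (assumption: linear path, plain Python lists).
# This illustrates the training target and sampling loop only; it is NOT FlexiFlow.

def interpolate(x0, x1, t):
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Velocity of the straight-line path, constant in t: x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]

def fm_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity at time t."""
    x_t = interpolate(x0, x1, t)
    v_pred = predict_velocity(x_t, t)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(v_true)

def euler_sample(predict_velocity, x0, steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt  # always strictly below 1, so the oracle never divides by zero
        v = predict_velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# Toy check with a hand-written "oracle" field that points at one known target.
x0 = [0.0, 0.0, 0.0]
x1 = [1.0, -2.0, 0.5]

def oracle(x, t):
    # On the linear path, (x1 - x_t) / (1 - t) equals x1 - x0.
    return [(b - a) / (1.0 - t) for a, b in zip(x, x1)]

loss = fm_loss(oracle, x0, x1, t=0.3)
sample = euler_sample(oracle, x0, steps=20)
```

In a real model, `predict_velocity` would be a learned network and `fm_loss` would be averaged over random `t`, noise samples, and training molecules; the oracle here only confirms that the target and the integrator agree.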
Findings
Achieves state-of-the-art results on QM9 and GEOM Drugs datasets.
Generates valid, unique, and diverse molecules with high fidelity.
Provides conformational ensembles comparable to physics-based methods at lower inference cost.
Abstract
Sampling useful three-dimensional molecular structures along with their most favorable conformations is a key challenge in drug discovery. Current state-of-the-art 3D de-novo design flow matching or diffusion-based models are limited to generating a single conformation. However, the conformational landscape of a molecule determines its observable properties and how tightly it is able to bind to a given protein target. By generating a representative set of low-energy conformers, we can more directly assess these properties and potentially improve the ability to generate molecules with desired thermodynamic observables. Towards this aim, we propose FlexiFlow, a novel architecture that extends flow-matching models, allowing for the joint sampling of molecules along with multiple conformations while preserving both equivariance and permutation invariance. We demonstrate the effectiveness of…
Peer Reviews
Decision · Submitted to ICLR 2026
1. Molecule generation (Significance): FlexiFlow shows competitive performance compared with prior works, as shown in Table 1 and Table 2. 2. Ligand pose generation (Significance): Qualitative results on ligand pose generation show better performance than the given data. A performance table alongside Figure 5 would strengthen the paper.
1. Other data modality: While the application to other data modalities, such as images, is mentioned in the introduction, it appears only in the appendix and not in the main paper. A short result in the main paper would have been better. Alternatively, the authors could highlight the protein conditioning experiments in the introduction. 2. Methodology (training on a set of vectors): The main purpose of this method is to generate diverse conformers given a molecule
1. The model innovatively employs conditional independence to factorize the flow matching objective, allowing it to generate multiple conformations per molecule and directly addressing the limitation of single-conformation output in existing methods. 2. The architecture effectively manages the interaction between a representative conformation and other conformations while maintaining equivariance, a design that adheres to the physical constraints of molecular systems. 3. The model's performance
1. The related work section lacks depth. It primarily emphasizes the motivation for FlexiFlow but fails to detail the core challenges and specific innovations of the proposed model, such as the use of conditional independence for objective factorization. 2. The model diagram is inadequately presented. The separation of the architecture across Figures 1 and 2 creates a disjointed narrative. Moreover, Figure 2 does not clearly illustrate the key differences between FlexiFlow and the baseline model
The method presented, FlexiFlow, is an innovative variant of standard flow matching methods, and exploits an interesting conditional independence of the flow matching objective to augment the model in a meaningful way. The theory behind the method, as well as the theoretical claims, are established well by the authors. It is promising to see this model achieve SOTA accuracy on QM9 and for several properties of GEOM-Drugs, and the exploration of the quality and diversity of the generated conforme
The paper's motivation is not entirely clear. The authors claim that generation with conformers is an open problem; however, there is no clear benefit to FlexiFlow over another molecular generation method followed sequentially by a conformer generation algorithm. If there is a significant advantage here, please elaborate on why concurrent generation is particularly beneficial. The clarity of presentation of some of the results could also be improved.
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Innovative Microfluidic and Catalytic Techniques Innovation
