TL;DR
FRIGID introduces a diffusion-based framework for molecular generation from mass spectra, leveraging large-scale training and inference-time scaling to significantly improve accuracy in molecular structure prediction.
Contribution
The paper presents FRIGID, a framework built around a novel diffusion language model that generates molecular structures from mass spectra, conditioning on intermediate fingerprint representations and chemical formulae and refining candidates at inference time.
Findings
Achieves over 18% Top-1 accuracy on the MassSpecGym benchmark.
Triples the Top-1 accuracy of the leading methods on NPLIB1.
Performance scales log-linearly with inference-time compute.
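To make the log-linear claim concrete: it means accuracy grows roughly as a + b·log(compute), so each doubling of inference compute adds about the same increment. The sketch below fits that form by ordinary least squares. The function name `fit_log_linear` and the data points are illustrative assumptions; the numbers are synthetic placeholders, not results from the paper.

```python
import math

def fit_log_linear(compute, accuracy):
    """Least-squares fit of accuracy = a + b * log(compute)."""
    xs = [math.log(c) for c in compute]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(accuracy) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracy))
    b = sxy / sxx            # slope: accuracy gained per unit of log-compute
    a = mean_y - b * mean_x  # intercept: accuracy at compute == 1
    return a, b

# Synthetic placeholder data (NOT from the paper): an exact log-linear curve
# with hypothetical intercept 0.10 and slope 0.02.
compute = [1, 2, 4, 8, 16]
accuracy = [0.10 + 0.02 * math.log(c) for c in compute]

a, b = fit_log_linear(compute, accuracy)
gain_per_doubling = b * math.log(2)  # constant gain for every 2x compute
```

On real measurements the fit would not be exact, and the quality of the fit (for example, the residuals) would indicate how well the log-linear description holds.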
Abstract
In this work, we present FRIGID, a framework with a novel diffusion language model that generates molecular structures conditioned on mass spectra via intermediate fingerprint representations and determined chemical formulae, training at the scale of hundreds of millions of unlabeled structures. We then demonstrate how forward fragmentation models enable inference-time scaling by identifying spectrum-inconsistent fragments and refining them through targeted remasking and denoising. While FRIGID already achieves strong performance with its diffusion base, inference-time scaling significantly improves its accuracy, surpassing 18% Top-1 accuracy on the challenging MassSpecGym benchmark and tripling the Top-1 accuracy of the leading methods on NPLIB1. Further empirical analyses show that FRIGID exhibits log-linear performance scaling with increasing inference-time compute, opening a…
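The refinement step described in the abstract (flag spectrum-inconsistent parts, remask them, denoise again) can be sketched as a simple loop. This is a toy illustration under stated assumptions, not the authors' implementation: `refine`, `find_inconsistent`, `denoise`, and the `MASK` token are hypothetical names, the molecule is treated as a flat token list, and the two callbacks are stand-ins for a forward fragmentation model and a diffusion denoiser.

```python
MASK = "[MASK]"  # hypothetical mask token

def refine(tokens, find_inconsistent, denoise, rounds=4):
    """Repeatedly remask flagged positions and denoise them again.

    find_inconsistent(tokens) -> list of positions judged inconsistent
    denoise(tokens)           -> tokens with every MASK filled in
    Each extra round spends more inference-time compute.
    """
    tokens = list(tokens)
    for _ in range(rounds):
        bad = find_inconsistent(tokens)
        if not bad:
            break  # nothing flagged, stop early
        for i in bad:
            tokens[i] = MASK  # targeted remasking
        tokens = denoise(tokens)
    return tokens

# Toy stand-ins (assumptions for illustration only): the "checker" compares
# against a known reference, and the "denoiser" fills masks from it. In the
# real setting neither component has access to the true structure.
reference = list("CCO")

def toy_find_inconsistent(tokens):
    return [i for i, t in enumerate(tokens) if t != reference[i]]

def toy_denoise(tokens):
    return [reference[i] if t == MASK else t for i, t in enumerate(tokens)]

refined = refine(list("CNO"), toy_find_inconsistent, toy_denoise)
```

Only the flagged positions are regenerated in each round, which is what makes the remasking "targeted" rather than a full resample.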
