Latent Diffusion-Based 3D Molecular Recovery from Vibrational Spectra
Wenjin Wu, Aleš Leonardis, Linjiang Chen, Jianbo Jiao

TL;DR
This paper introduces IR-GeoDiff, a latent diffusion model that accurately recovers 3D molecular structures from IR spectra, capturing complex spectral-structural relationships and focusing on characteristic functional groups.
Contribution
The novel IR-GeoDiff model integrates spectral data into 3D molecular structure generation, addressing limitations of prior methods that used 1D or 2D representations.
Findings
Successfully recovers 3D molecular geometries from IR spectra.
Demonstrates focus on characteristic functional groups in spectra.
Validates the model's ability to generate molecular distributions consistent with spectra.
Abstract
Infrared (IR) spectroscopy, a type of vibrational spectroscopy, is widely used for molecular structure determination and provides critical structural information for chemists. However, existing approaches for recovering molecular structures from IR spectra typically rely on one-dimensional SMILES strings or two-dimensional molecular graphs, which fail to capture the intricate relationship between spectral features and three-dimensional molecular geometry. Recent advances in diffusion models have greatly enhanced the ability to generate molecular structures in 3D space. Yet, no existing model has explored the distribution of 3D molecular geometries corresponding to a single IR spectrum. In this work, we introduce IR-GeoDiff, a latent diffusion model that recovers 3D molecular geometries from IR spectra by integrating spectral information into both node and edge representations of…
Peer Reviews
Decision · Submitted to ICLR 2026
1. The idea of reconstructing 3D molecular geometries from vibrational spectra is scientifically interesting and potentially impactful for computational chemistry and molecular spectroscopy. 2. The paper is overall well written and logically organized. It is easy to follow, and its contributions are easy to identify.
1. The motivation for adopting a latent diffusion model is not sufficiently convincing. Latent diffusion models rely heavily on strong VAE encoders/decoders to build meaningful latent representations, but such powerful autoencoders are not yet well established for molecular 3D geometry. In contrast, existing non-latent 3D molecular diffusion models already achieve strong and stable results directly in coordinate space. The paper should clearly justify why the latent-space formulation is preferab
1. The work is innovative in its application of a latent diffusion model to the challenging task of recovering molecular geometry from infrared spectra. The model architecture is novel, particularly in its use of a multi-head cross-attention mechanism for feature fusion and a Transformer-based classifier for joint feature extraction. 2. The model design is sound and the experimental evaluation is comprehensive, encompassing both spectral and structural dimensions. The inclusion of attention visu
1. The experimental setup has limitations. The chosen baseline models, EDM and GeoLDM, were proposed approximately three years ago. Given the rapid pace of development in this field, more recent and potentially superior models should be included for comparison. Diffusion models are also too slow, and many new models have been developed in recent years; these should be compared as well. 2. The study relies on a single dataset, QM9S, for validation. This dataset is limited to small molec
1. The task is novel and introduces a fresh perspective to the field. 2. The *Background* and *Related Works* sections are well-organized and provide a thorough summary, which is appropriate for a benchmark-focused study. 3. The overall framework maintains good SE(3)-equivariance properties throughout the model design. 4. The integration of spectral features is reasonable and well-supported by the ablation studies.
1. The paper makes two main claims regarding the task: (1) it aims to model the distribution of 3D molecular geometries corresponding to a single IR spectrum, as stated in the *Abstract* and *Introduction*; and (2) it seeks to learn a probabilistic model $\theta$ that captures the conditional distribution of molecular geometries given an IR spectrum, i.e., $p_{\theta}(G|S)$, as described in the *Preliminaries* section. However, I question the accuracy of this task formulation, since in the actua
Taxonomy
Topics: Spectroscopy and Chemometric Analyses · Spectroscopy and Quantum Chemical Studies · Molecular spectroscopy and chirality
