Graph VQ-Transformer (GVT): Fast and Accurate Molecular Generation via High-Fidelity Discrete Latents

Haozhuo Zheng; Cheng Wang; Yang Liu

arXiv:2512.02667·cs.LG·December 3, 2025

TL;DR

The paper introduces GVT, a two-stage molecular generation framework: a high-fidelity Graph VQ-VAE first compresses molecular graphs into discrete latent sequences, and an autoregressive Transformer then models those sequences, targeting both accuracy and efficiency.

Contribution

The paper proposes the Graph VQ-Transformer framework, which maps molecular graphs to discrete latent sequences with a novel Graph VQ-VAE and then generates molecules by autoregressive sequence modeling over those latents.
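To make the idea of mapping a graph to a discrete sequence concrete, here is a minimal sketch of the vector-quantization step in general: each continuous node embedding is replaced by the index of its nearest codebook vector. This is generic VQ, not the authors' implementation; the function name `quantize`, the toy codebook, and the embedding values are all hypothetical.

```python
def quantize(embeddings, codebook):
    """Return, for each embedding, the index of its nearest codebook vector."""
    codes = []
    for vec in embeddings:
        # Squared Euclidean distance from this embedding to every codebook entry.
        dists = [sum((v - c) ** 2 for v, c in zip(vec, code)) for code in codebook]
        # The discrete latent is the position of the closest entry.
        codes.append(dists.index(min(dists)))
    return codes


# Toy 2-D codebook with four entries (hypothetical values).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical encoder outputs for three nodes (atoms) of a graph.
node_embeddings = [[0.9, 0.1], [0.1, 0.8], [0.2, 0.1]]

codes = quantize(node_embeddings, codebook)
print(codes)  # [1, 2, 0]
```

The resulting list of integer codes is the kind of discrete sequence a second-stage sequence model could be trained on. A real VQ-VAE would also learn the codebook and the encoder jointly, which this sketch leaves out.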

Findings

01. Achieves near-perfect reconstruction rates with the VQ-VAE.

02. Outperforms diffusion models on key molecular similarity metrics.

03. Sets new benchmarks on ZINC250k, MOSES, and GuacaMol datasets.

Abstract

The de novo generation of molecules with desirable properties is a critical challenge, where diffusion models are computationally intensive and autoregressive models struggle with error propagation. In this work, we introduce the Graph VQ-Transformer (GVT), a two-stage generative framework that achieves both high accuracy and efficiency. The core of our approach is a novel Graph Vector Quantized Variational Autoencoder (VQ-VAE) that compresses molecular graphs into high-fidelity discrete latent sequences. By synergistically combining a Graph Transformer with canonical Reverse Cuthill-McKee (RCM) node ordering and Rotary Positional Embeddings (RoPE), our VQ-VAE achieves near-perfect reconstruction rates. An autoregressive Transformer is then trained on these discrete latents, effectively converting graph generation into a well-structured sequence modeling problem. Crucially, this mapping…
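The abstract relies on Reverse Cuthill-McKee (RCM) node ordering to turn a graph into a sequence. As a rough illustration, the sketch below implements a simplified RCM: breadth-first search from a lowest-degree node, visiting neighbors by increasing degree, then reversing the visit order. The tie-breaking by node label and the helper names `rcm_order` and `bandwidth` are assumptions made for this example; the summary does not say how the paper makes the ordering canonical.

```python
from collections import deque


def rcm_order(adjacency):
    """Simplified Reverse Cuthill-McKee ordering of an undirected graph.

    adjacency maps each node to a list of its neighbors.
    Ties are broken by node label (an assumption of this sketch).
    """
    degree = {node: len(set(nbrs)) for node, nbrs in adjacency.items()}
    visited = set()
    order = []
    # Process each connected component, starting from its lowest-degree node.
    for start in sorted(adjacency, key=lambda n: (degree[n], n)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            # Enqueue unvisited neighbors in order of increasing degree.
            for nbr in sorted(set(adjacency[node]), key=lambda n: (degree[n], n)):
                if nbr not in visited:
                    visited.add(nbr)
                    queue.append(nbr)
    order.reverse()  # the "reverse" in Reverse Cuthill-McKee
    return order


def bandwidth(adjacency, order):
    """Largest distance in the ordering between two connected nodes."""
    position = {node: i for i, node in enumerate(order)}
    return max(
        (abs(position[u] - position[v]) for u in adjacency for v in adjacency[u]),
        default=0,
    )


# A path graph with scrambled labels: 0 - 3 - 1 - 4 - 2
path = {0: [3], 3: [0, 1], 1: [3, 4], 4: [1, 2], 2: [4]}
order = rcm_order(path)
print(order)                           # [2, 4, 1, 3, 0]
print(bandwidth(path, sorted(path)))   # 3 with the original labels
print(bandwidth(path, order))          # 1 after reordering
```

The ordering places bonded atoms close together in the sequence, which is the property that makes position-aware components such as RoPE meaningful on the resulting sequence.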

Taxonomy

Topics: Advanced Graph Neural Networks · Machine Learning in Materials Science · Generative Adversarial Networks and Image Synthesis