MolGrapher: Graph-based Visual Recognition of Chemical Structures
Lucas Morin, Martin Danelljan, Maria Isabel Agea, Ahmed Nassar, Valery Weber, Ingmar Meijer, Peter Staar, Fisher Yu

TL;DR
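MolGrapher is a graph-based visual recognition system that converts images of chemical structures into molecular graphs. It detects atoms with a deep keypoint detector, classifies atoms and bonds with a graph neural network, and relies on synthetic training data and a new benchmark to improve the parsing of figures in chemical literature.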
Contribution
Introduces MolGrapher, combining deep keypoint detection and graph neural networks for chemical structure recognition, along with a synthetic data pipeline and a large-scale benchmark dataset.
Findings
Outperforms classical and learning-based methods on multiple datasets
Effective recognition of atoms and bonds in diverse chemical images
Provides a new benchmark dataset for future research
Abstract
The automatic analysis of chemical literature has immense potential to accelerate the discovery of new materials and drugs. Much of the critical information in patent documents and scientific articles is contained in figures, depicting the molecule structures. However, automatically parsing the exact chemical structure is a formidable challenge, due to the amount of detailed information, the diversity of drawing styles, and the need for training data. In this work, we introduce MolGrapher to recognize chemical structures visually. First, a deep keypoint detector detects the atoms. Second, we treat all candidate atoms and bonds as nodes and put them in a graph. This construct allows a natural graph representation of the molecule. Last, we classify atom and bond nodes in the graph with a Graph Neural Network. To address the lack of real training data, we propose a synthetic data…
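The graph construction step described in the abstract, where candidate atoms and bonds both become nodes, can be illustrated with a minimal sketch. It is not the authors' implementation. The `Node` class, the `build_supergraph` function, and the `max_bond_length` distance rule for proposing candidate bonds are all hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist


@dataclass
class Node:
    kind: str        # "atom" or "bond" (a candidate, not yet classified)
    position: tuple  # (x, y) in image pixels


def build_supergraph(keypoints, max_bond_length):
    """Turn detected atom keypoints into a graph of atom and bond nodes.

    Assumption (not from the paper summary): every pair of keypoints closer
    than `max_bond_length` yields one candidate bond node at the midpoint.
    """
    nodes = [Node("atom", tuple(p)) for p in keypoints]
    edges = []
    for i, j in combinations(range(len(keypoints)), 2):
        if dist(keypoints[i], keypoints[j]) <= max_bond_length:
            mid = ((keypoints[i][0] + keypoints[j][0]) / 2,
                   (keypoints[i][1] + keypoints[j][1]) / 2)
            b = len(nodes)          # index of the new bond node
            nodes.append(Node("bond", mid))
            edges.append((i, b))    # atom i -- bond node
            edges.append((b, j))    # bond node -- atom j
    return nodes, edges


# Three collinear keypoints, as a keypoint detector might return them.
keypoints = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
nodes, edges = build_supergraph(keypoints, max_bond_length=12.0)
```

In this sketch the two neighbouring pairs each receive a bond node, while the distant pair does not. A node classifier such as a graph neural network would then assign a label to every atom node and every bond node. How spurious candidates are rejected is left open here.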
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Machine Learning in Bioinformatics
Methods: Graph Neural Network
