Molecular Identification from AFM images using the IUPAC Nomenclature and Attribute Multimodal Recurrent Neural Networks
Jaime Carracedo-Cosme, Carlos Romero-Muñiz, Pablo Pou, Rubén Pérez

TL;DR
This paper introduces a deep learning approach that identifies the structure and composition of molecules from 3D AFM image stacks. It frames identification as an image captioning task: two multimodal recurrent neural networks, trained on the large QUAM-AFM dataset, output each molecule's IUPAC name, and the authors report high accuracy.
Contribution
It presents a novel deep learning architecture that translates AFM images into IUPAC molecular names, rather than following a traditional classification approach that can only identify a finite number of molecules.
Findings
High accuracy in molecular identification from AFM images
Successful application of image captioning techniques to molecular data
Large-scale training with the QUAM-AFM dataset
Abstract
Despite being the main tool to visualize molecules at the atomic scale, AFM with CO-functionalized metal tips is unable to chemically identify the observed molecules. Here we present a strategy to address this challenging task using deep learning techniques. Instead of identifying a finite number of molecules following a traditional classification approach, we define the molecular identification as an image captioning problem. We design an architecture, composed of two multimodal recurrent neural networks, capable of identifying the structure and composition of an unknown molecule using a 3D-AFM image stack as input. The neural network is trained to provide the name of each molecule according to the IUPAC nomenclature rules. To train and test this algorithm we use the novel QUAM-AFM dataset, which contains almost 700,000 molecules and 165 million AFM images. The accuracy of the…
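
To illustrate what treating identification as image captioning means for the training targets, here is a minimal sketch in Python. It splits an IUPAC-style name into a token sequence and builds the (prefix, next token) pairs that a captioning decoder is typically trained on. The term vocabulary `TERMS`, the `<start>`/`<end>` markers, and the helper names `tokenize` and `teacher_forcing_pairs` are assumptions made for this example; the summary above does not specify how the authors tokenize names.

```python
# Hypothetical term vocabulary, for illustration only; the paper's actual
# decomposition of IUPAC names is not given in this summary.
TERMS = ["benzene", "chloro", "methyl", "meth", "eth", "ane", "tri", "di", "ol", "yl"]
START, END = "<start>", "<end>"  # assumed sequence markers


def tokenize(name):
    """Split an IUPAC-style name into known terms, locants, and punctuation."""
    ordered = sorted(TERMS, key=len, reverse=True)  # try the longest term first
    tokens = []
    i = 0
    while i < len(name):
        match = next((t for t in ordered if name.startswith(t, i)), None)
        if match is not None:
            tokens.append(match)
            i += len(match)
        elif name[i].isdigit():
            # Read a whole locant such as "1" or "12".
            j = i
            while j < len(name) and name[j].isdigit():
                j += 1
            tokens.append(name[i:j])
            i = j
        elif name[i] in ",-()[]":
            tokens.append(name[i])
            i += 1
        else:
            raise ValueError(f"no known term at position {i} in {name!r}")
    return tokens


def teacher_forcing_pairs(tokens):
    """Return (prefix, next_token) pairs for training a sequence decoder."""
    seq = [START] + tokens + [END]
    return [(seq[:k], seq[k]) for k in range(1, len(seq))]


if __name__ == "__main__":
    for prefix, target in teacher_forcing_pairs(tokenize("1,2-dichlorobenzene")):
        print(prefix, "->", target)
```

In a captioning setup, each prefix would be combined with features extracted from the AFM image stack, and the network would be trained to predict the next token. Because names are composed from a finite set of terms, the output space is not limited to a fixed list of molecules.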
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods
