Mapping the Space of Chemical Reactions Using Attention-Based Neural Networks
Philippe Schwaller, Daniel Probst, Alain C. Vaucher, Vishnu H. Nair, David Kreutter, Teodoro Laino, Jean-Louis Reymond

TL;DR
This paper demonstrates that transformer-based neural networks can accurately classify chemical reactions from simple text representations and generate meaningful reaction fingerprints, facilitating navigation and understanding of chemical reaction space.
Contribution
The study introduces a transformer model that classifies reactions with 98.2% accuracy and produces reaction fingerprints that capture fine-grained differences between reaction classes better than traditional reaction fingerprints.
Findings
Achieved 98.2% reaction classification accuracy.
Generated reaction fingerprints that capture fine-grained differences between reaction classes.
Created an interactive reaction atlas for visualization and similarity search.
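As a rough illustration of what a fingerprint-based similarity search involves, the sketch below ranks stored reaction fingerprints by cosine similarity to a query. The toy vectors, the function names, and the choice of cosine similarity are assumptions made for illustration; the summary above does not say which metric or index the reaction atlas uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_reactions(query, fingerprints, k=3):
    """Return the k (index, similarity) pairs most similar to the query, best first."""
    scored = [(i, cosine_similarity(query, fp)) for i, fp in enumerate(fingerprints)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 3-dimensional fingerprints; learned fingerprints would be much longer.
toy_fingerprints = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
top_hits = nearest_reactions([1.0, 0.0, 0.0], toy_fingerprints, k=2)
print(top_hits)
```

A brute-force scan like this is only practical for small collections; a large reaction set would typically need an approximate nearest-neighbor index instead.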
Abstract
Organic reactions are usually assigned to classes containing reactions with similar reagents and mechanisms. Reaction classes facilitate the communication of complex concepts and efficient navigation through chemical reaction space. However, the classification process is a tedious task. It requires the identification of the corresponding reaction class template via annotation of the number of molecules in the reactions, the reaction center, and the distinction between reactants and reagents. This work shows that transformer-based models can infer reaction classes from non-annotated, simple text-based representations of chemical reactions. Our best model reaches a classification accuracy of 98.2%. We also show that the learned representations can be used as reaction fingerprints that capture fine-grained differences between reaction classes better than traditional reaction fingerprints.…
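To make "non-annotated, simple text-based representations" concrete, the sketch below splits a reaction SMILES string into tokens, the kind of sequence a transformer could take as input. The regular expression is one commonly used for SMILES tokenization and is an assumption here, as are the function name and the example reaction; the abstract does not specify the tokenizer the authors used.

```python
import re

# Assumption: a regex commonly used to tokenize SMILES; not confirmed by the abstract.
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>>?|\*|\$|%[0-9]{2}|[0-9])"
)


def tokenize_reaction_smiles(rxn):
    """Split a reaction SMILES into tokens; fail if any character is left unmatched."""
    tokens = SMILES_TOKEN_PATTERN.findall(rxn)
    if "".join(tokens) != rxn:
        raise ValueError(f"could not tokenize the full string: {rxn!r}")
    return tokens


# Hypothetical example: acetic acid + ethanol giving ethyl acetate.
# No atom mapping and no reactant/reagent role annotation is supplied.
example = "CC(=O)O.OCC>>CC(=O)OCC"
tokens = tokenize_reaction_smiles(example)
print(" ".join(tokens))
```

The round-trip check (joining the tokens must reproduce the input) guards against silently dropping characters the pattern does not cover.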
Taxonomy
Topics: Advanced Text Analysis Techniques · Machine Learning in Materials Science · Biomedical Text Mining and Ontologies
Methods: Linear Layer · Residual Connection · Weight Decay · Attention Dropout · Linear Warmup With Linear Decay · WordPiece · Adam · Dropout · Softmax · Dense Connections
