A Pre-trained Reaction Embedding Descriptor Capturing Bond Transformation Patterns
Weiqi Liu, Fenglei Cao, Yuan Qi, Li-Cheng Xu

TL;DR
This paper introduces RXNEmb, a pre-trained reaction embedding descriptor that captures bond transformation patterns, improving reaction classification, visualization, and mechanistic understanding in data-driven chemistry models.
Contribution
The study presents RXNEmb, a reaction-level descriptor derived from the pre-trained RXNGraphormer model. It supports data-driven reaction classification and visualization that reflect bond-change similarity more directly than rule-based categories.
Findings
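Re-clustering USPTO-50k with RXNEmb groups reactions by bond-change similarity more directly than rule-based categories.
Combined with dimensionality reduction, RXNEmb enables visualization of reaction space diversity.
Attention weight analysis shows that the model focuses on chemically critical sites, offering mechanistic insights.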
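To make the clustering finding concrete, here is a minimal sketch of grouping reaction embeddings with k-means. It is an illustration only: the vectors in `toy_embeddings` are invented placeholders rather than RXNEmb outputs, the helper names are hypothetical, and the summary does not say which clustering algorithm the authors used.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def kmeans(embeddings, k, n_iter=20, seed=0):
    """Plain k-means: assign each vector to its nearest centroid, then update centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(embeddings, k)  # start from k distinct data points
    labels = [0] * len(embeddings)
    for _ in range(n_iter):
        labels = [
            min(range(k), key=lambda c: squared_distance(e, centroids[c]))
            for e in embeddings
        ]
        for c in range(k):
            members = [e for e, label in zip(embeddings, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = mean_vector(members)
    return labels, centroids

# Invented 4-dimensional placeholders standing in for reaction embeddings.
# The first three mimic one bond-change pattern, the last three another.
toy_embeddings = [
    [0.9, 0.1, 0.0, 0.1],
    [1.0, 0.0, 0.1, 0.0],
    [0.8, 0.2, 0.1, 0.0],
    [0.0, 0.1, 0.9, 1.0],
    [0.1, 0.0, 1.0, 0.9],
    [0.0, 0.2, 0.8, 1.0],
]

labels, centroids = kmeans(toy_embeddings, k=2)
print(labels)
```

With real descriptors, the same assignment step would run on one embedding vector per reaction, and the resulting labels could be compared against the rule-based reaction classes.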
Abstract
With the rise of data-driven reaction prediction models, effective reaction descriptors are crucial for bridging the gap between real-world chemistry and digital representations. However, general-purpose, reaction-wise descriptors remain scarce. This study introduces RXNEmb, a novel reaction-level descriptor derived from RXNGraphormer, a model pre-trained to distinguish real reactions from fictitious ones with erroneous bond changes, thereby learning intrinsic bond formation and cleavage patterns. We demonstrate its utility by data-driven re-clustering of the USPTO-50k dataset, yielding a classification that more directly reflects bond-change similarities than rule-based categories. Combined with dimensionality reduction, RXNEmb enables visualization of reaction space diversity. Furthermore, attention weight analysis reveals the model's focus on chemically critical sites, providing…
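The pre-training idea described in the abstract, telling real reactions apart from fictitious ones with erroneous bond changes, can be illustrated with a toy data-construction sketch. Everything here is an assumption made for illustration: the abstract does not say how the fictitious reactions are generated, and the bond-change tuples, `make_fictitious`, and `build_pretraining_pairs` are hypothetical names, not part of RXNGraphormer.

```python
import random

# Toy representation (assumed): a reaction is a set of bond changes
# (atom_a, atom_b, delta_order) with atom_a < atom_b.
# delta_order > 0 means a bond forms, delta_order < 0 means a bond breaks.

def make_fictitious(bond_changes, n_atoms, rng):
    """Return a corrupted copy in which one bond change is moved to a wrong atom pair."""
    changes = sorted(bond_changes)
    victim = rng.randrange(len(changes))
    delta = changes[victim][2]
    used_pairs = {(a, b) for a, b, _ in changes}
    candidates = [
        (a, b)
        for a in range(n_atoms)
        for b in range(a + 1, n_atoms)
        if (a, b) not in used_pairs
    ]
    wrong_a, wrong_b = rng.choice(candidates)
    changes[victim] = (wrong_a, wrong_b, delta)
    return frozenset(changes)

def build_pretraining_pairs(reactions, n_atoms, seed=0):
    """Pair every real reaction (label 1) with one fictitious variant (label 0)."""
    rng = random.Random(seed)
    examples = []
    for rxn in reactions:
        examples.append((frozenset(rxn), 1))
        examples.append((make_fictitious(rxn, n_atoms, rng), 0))
    return examples

# Two invented "real" bond-change patterns over six atoms.
real_reactions = [
    {(0, 1, -1), (1, 2, +1)},
    {(2, 3, +1), (4, 5, -1)},
]

pairs = build_pretraining_pairs(real_reactions, n_atoms=6)
for bond_changes, label in pairs:
    print(label, sorted(bond_changes))
```

A binary classifier trained on such pairs has to attend to which bonds plausibly form and break, which is the intuition the abstract gives for why the resulting embedding encodes bond transformation patterns.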
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Asymmetric Hydrogenation and Catalysis
