# Mechanism-Aware Deep Learning for Polar Reaction Prediction

**Authors:** Ryan J. Miller, Alexander E. Dashuta, Brayden Rudisill, David Van Vranken, Pierre Baldi

PMC · DOI: 10.1021/jacs.5c16838 · Journal of the American Chemical Society · 2025-10-22

## TL;DR

This paper introduces PMechRP, a deep learning reaction predictor trained on polar elementary steps, and ArrowFinder, which annotates predictions with arrow-pushing mechanisms, so that predictions are both accurate and mechanistically interpretable.

## Contribution

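The paper introduces PMechRP (Polar Mechanistic Reaction Predictor), trained on the PMechDB data set of polar elementary steps, and ArrowFinder, which predicts the arrow-pushing (electron flow) mechanism for a given set of reactants and products.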

## Key findings

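- The best-performing PMechRP approach is a hybrid pipeline that combines an ensemble of Chemformer models with a two-stage Siamese framework; it achieves strong predictive accuracy while filtering out "alchemical" products.
- ArrowFinder directly predicts arrow-pushing mechanisms for a given set of reactants and products, supplying mechanistic annotations for the pipeline's predictions.
- Performance is evaluated on PMechDB test splits, a curated USPTO subset from the Open Reaction Database, and a human benchmark of mechanistic pathways from an intermediate-level textbook.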

## Abstract

Accurately predicting chemical reactions is essential for driving
innovation in synthetic chemistry, with broad applications in medicine,
manufacturing, and agriculture. Yet reaction prediction remains a
complex problem that is both time-consuming and resource-intensive
for chemists to solve. Deep learning offers an appealing solution
by enabling high-throughput prediction, but most existing models are
trained on the US Patent Office data set and treat reactions as recipes
or overall transformations, mapping reactants directly to products
with limited mechanistic insight. To address this, we introduce PMechRP
(Polar Mechanistic Reaction Predictor), trained on the PMechDB data
set of polar elementary steps that capture electron flow and mechanistic
detail. To broaden coverage and improve generalization, we augment
PMechDB with combinatorially generated reactions and train models
spanning transformer, graph, and two-stage Siamese architectures.
In addition to reaction prediction models, we also develop ArrowFinder,
a new model that directly predicts arrow-pushing mechanisms for a
set of reactants and products. Our best-performing approach is a hybrid
pipeline that combines an ensemble of Chemformer models with a two-stage
Siamese framework, leveraging the accuracy of transformers while filtering
away “alchemical” products using the two-step network
and generating mechanistic annotations using ArrowFinder. This approach
achieves strong predictive accuracy while also providing interpretable
predictions. We evaluate performance across multiple benchmarks: PMechDB
test splits, a curated USPTO subset from the Open Reaction Database,
and a human benchmark of mechanistic pathways from an intermediate-level
textbook.
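
The generate-then-filter shape of the hybrid pipeline can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say how the Chemformer ensemble's outputs are merged, so summed reciprocal rank is used here as a stand-in, and the `is_plausible` callback is a hypothetical placeholder for the two-stage Siamese filter. The function names and the toy candidate labels are likewise invented and do not come from the paper.

```python
from typing import Callable, Dict, List


def aggregate_candidates(ranked_lists: List[List[str]]) -> List[str]:
    """Merge ranked candidate lists from several models.

    Assumption: candidates are scored by summed reciprocal rank.
    The paper's actual ensembling rule is not given in this summary.
    """
    scores: Dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, candidate in enumerate(ranking, start=1):
            scores[candidate] = scores.get(candidate, 0.0) + 1.0 / rank
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))


def hybrid_predict(
    ranked_lists: List[List[str]],
    is_plausible: Callable[[str], bool],
    top_k: int = 3,
) -> List[str]:
    """Aggregate ensemble candidates, then drop implausible ones.

    `is_plausible` is a hypothetical stand-in for the learned filter
    that removes "alchemical" products.
    """
    merged = aggregate_candidates(ranked_lists)
    kept = [candidate for candidate in merged if is_plausible(candidate)]
    return kept[:top_k]


if __name__ == "__main__":
    # Toy candidate labels; a real pipeline would use product SMILES.
    ensemble_outputs = [
        ["P1", "P2", "P3"],
        ["P2", "P1"],
        ["P2", "P4"],
    ]
    # Pretend the filter rejects P4.
    print(hybrid_predict(ensemble_outputs, lambda c: c != "P4"))
```

In this toy run, `P2` ranks first because two of the three models place it at the top, and `P4` is removed by the filter even though it outscores `P3`. A mechanism-annotation step in the role of ArrowFinder would then be applied to the surviving candidates.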

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12593370/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12593370/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/PMC12593370/full.md

---
Source: https://tomesphere.com/paper/PMC12593370