Learning Continuous Solvent Effects from Transient Flow Data: A Graph Neural Network Benchmark on Catechol Rearrangement
Hongsheng Xing, Qiuxin Si

TL;DR
This paper introduces a new benchmark dataset and a graph neural network model for predicting reaction yields across continuous solvent mixtures; the hybrid GNN reaches an MSE of 0.0039 and outperforms the baseline methods.
Contribution
The work presents the Catechol Benchmark dataset and a hybrid GNN architecture that effectively models continuous solvent effects in reaction outcome prediction.
Findings
Hybrid GNN achieves an MSE of 0.0039, outperforming baselines.
Explicit molecular graph message-passing is crucial for generalization.
Continuous mixture encoding enhances model robustness.
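One simple way to picture a continuous mixture encoding is a volume-fraction-weighted blend of per-solvent descriptor vectors. The sketch below is an illustration only, not the paper's encoder: the function name `encode_mixture`, the descriptor values, and the choice of linear blending are all assumptions made for this example.

```python
from typing import List, Sequence


def encode_mixture(
    desc_a: Sequence[float],
    desc_b: Sequence[float],
    frac_b: float,
) -> List[float]:
    """Blend two solvent descriptor vectors by the volume fraction of solvent B.

    Hypothetical helper: linear interpolation is an assumed, minimal choice,
    not necessarily the encoding used in the paper.
    """
    if not 0.0 <= frac_b <= 1.0:
        raise ValueError("frac_b must lie in [0, 1]")
    if len(desc_a) != len(desc_b):
        raise ValueError("descriptor vectors must have the same length")
    # frac_b = 0 recovers pure solvent A; frac_b = 1 recovers pure solvent B.
    return [(1.0 - frac_b) * a + frac_b * b for a, b in zip(desc_a, desc_b)]


# Toy descriptors (made-up numbers, not taken from the dataset).
pure_a = [1.0, 0.0]
pure_b = [3.0, 4.0]
half_mix = encode_mixture(pure_a, pure_b, 0.5)
```

Because the fraction enters as a real number, a model fed such features can be queried at compositions that were never measured, which a one-hot solvent label cannot express.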
Abstract
Predicting reaction outcomes across continuous solvent composition ranges remains a critical challenge in organic synthesis and process chemistry. Traditional machine learning approaches often treat solvent identity as a discrete categorical variable, which prevents systematic interpolation and extrapolation across the solvent space. This work introduces the Catechol Benchmark, a high-throughput transient flow chemistry dataset comprising 1,227 experimental yield measurements for the rearrangement of allyl-substituted catechol in 24 pure solvents and their binary mixtures, parameterized by continuous volume fractions. We evaluate various architectures under rigorous leave-one-solvent-out and leave-one-mixture-out protocols to test generalization to unseen chemical environments. Our results demonstrate that classical tabular methods (e.g., Gradient-Boosted Decision…
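The leave-one-solvent-out and leave-one-mixture-out protocols named in the abstract can both be read as group-wise splits: every record belonging to one group is held out for testing while the rest is used for training. The following is a minimal sketch under that reading; the record fields (`solvent`, `yield`), the helper name `leave_one_group_out`, and the toy rows are hypothetical and do not come from the dataset.

```python
from typing import Callable, Dict, Hashable, Iterator, List, Tuple

Record = Dict[str, object]


def leave_one_group_out(
    records: List[Record],
    group_of: Callable[[Record], Hashable],
) -> Iterator[Tuple[Hashable, List[Record], List[Record]]]:
    """Yield (held_out_group, train, test) for each distinct group.

    Hypothetical helper: pass a function returning the solvent name for a
    leave-one-solvent-out split, or the solvent pair for leave-one-mixture-out.
    """
    groups = sorted({group_of(r) for r in records}, key=repr)
    for g in groups:
        test = [r for r in records if group_of(r) == g]
        train = [r for r in records if group_of(r) != g]
        yield g, train, test


# Toy rows with placeholder solvent names and made-up yields.
toy = [
    {"solvent": "solvent_A", "yield": 0.10},
    {"solvent": "solvent_A", "yield": 0.12},
    {"solvent": "solvent_B", "yield": 0.30},
    {"solvent": "solvent_C", "yield": 0.55},
]
splits = list(leave_one_group_out(toy, lambda r: r["solvent"]))
```

Each fold's test set contains only the held-out solvent, so the reported error reflects performance on a chemical environment the model never saw during training.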
Taxonomy
Topics: Machine Learning in Materials Science · Advanced Graph Neural Networks · Computational Drug Discovery Methods
