# Conditional Variational AutoEncoder to Predict Suitable Conditions for Hydrogenation Reactions

**Authors:** Daniyar Mazitov, Timur Gimadiev, Assima Poyezzhayeva, Valentina Afonina, Timur Madzhidov

PMC · DOI: 10.3390/molecules31010075 · Molecules · 2025-12-24

## TL;DR

This paper introduces a conditional variational autoencoder (CVAE) that predicts suitable reaction conditions (catalysts, additives, temperature, and pressure) for hydrogenation reactions, generating diverse valid condition sets for chemical synthesis planning.

## Contribution

A conditional variational autoencoder (CVAE) is proposed for generating diverse and accurate reaction conditions without exhaustive search.
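The idea of producing candidates by sampling, rather than enumerating every combination, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tiny vocabularies, the `toy_decoder` stand-in for a trained decoder, and the choice to rank candidates by how often they are sampled. The paper's actual ranking procedure is not described in this summary.

```python
import random
from collections import Counter

# Hypothetical, tiny condition vocabularies (illustrative only).
CATALYSTS = ["Pd/C", "PtO2", "Raney Ni"]
ADDITIVES = ["none", "acid", "base"]


def toy_decoder(z):
    """Stand-in for a trained, reaction-conditioned decoder.

    Maps a latent vector to one condition set. A real decoder would be a
    neural network that also receives the reaction encoding.
    """
    catalyst = CATALYSTS[int(abs(z[0]) * 1.5) % len(CATALYSTS)]
    additive = ADDITIVES[int(abs(z[1]) * 1.5) % len(ADDITIVES)]
    return (catalyst, additive)


def top_k_conditions(decoder, k=3, n_samples=500, latent_dim=2, seed=0):
    """Sample latent vectors, decode each, and rank condition sets by frequency."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        counts[decoder(z)] += 1
    # most_common returns (item, count) pairs sorted by descending count.
    return [conditions for conditions, _ in counts.most_common(k)]


ranked = top_k_conditions(toy_decoder, k=3)
```

The cost of this procedure scales with the number of samples drawn, not with the size of the condition space, which is the property the contribution statement emphasizes.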

## Key findings

- The CVAE model successfully predicted catalysts, additives, temperature, and pressure for hydrogenation reactions.
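- The h-CVAE variant (hyperspherical uniform latent distribution) showed robust overall performance and was the preferred choice when high accuracy is needed across multiple top-k predictions.
- In benchmarking analyses, the CVAE models performed well against state-of-the-art reaction condition prediction approaches.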
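To make the difference between a Gaussian and a hyperspherical uniform latent space concrete, the following sketch draws one sample from each. It relies on the standard fact that normalizing an isotropic Gaussian vector gives a point distributed uniformly on the unit sphere. The latent dimension of 8 and the function names are assumptions, and only the prior side is shown; how the paper parameterizes the posterior is not covered in this summary.

```python
import math
import random


def sample_gaussian(dim, rng):
    """Draw from a standard isotropic Gaussian in `dim` dimensions."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def sample_hypersphere(dim, rng):
    """Draw uniformly from the unit sphere by normalizing a Gaussian draw."""
    v = sample_gaussian(dim, rng)
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


rng = random.Random(0)
z_gauss = sample_gaussian(8, rng)       # unbounded; norm varies per sample
z_sphere = sample_hypersphere(8, rng)   # always has unit norm
norm_sphere = math.sqrt(sum(x * x for x in z_sphere))
```

Samples from the hyperspherical prior differ only in direction, whereas Gaussian samples also differ in magnitude.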

## Abstract

Reaction conditions (RCs) are a crucial part of reaction definition, and their accurate prediction is an important component of chemical synthesis planning. The existence of multiple combinations of RCs capable of achieving the desired result complicates the task of condition recommendation. Herein, we propose a conditional variational autoencoder (CVAE) generative model to predict suitable RCs. The CVAE model has been customized to generate diverse sets of valid conditions, ensuring high flexibility and accuracy, while circumventing the necessity for enumeration or combinatorial search of potential RCs. The efficacy of the CVAE approaches was evaluated using hydrogenation reactions and other H2-mediated reactions, predicting the set of catalysts, additives (acid, base, and catalytic poison), and ranges of temperature and pressure. The CVAE models predicted conditions with different “heads”, each corresponding to a specific condition component and having its own loss. CVAE models were tested on two datasets: a small one containing 31K reactions with 2232 potential condition combinations and a big one containing 196K reactions with ~7 × 10⁴² potential condition combinations, to evaluate the models' ability to predict conditions of varying complexity and diversity. To optimize the accuracy of the models, we experimented with three latent distribution variants: Gaussian (g-CVAE), Riemannian Normalizing Flow (rnf-CVAE), and Hyperspherical Uniform (h-CVAE). In our experiments, the h-CVAE model demonstrated robust overall performance, making it the optimal choice for scenarios requiring high accuracy across multiple top-k predictions. Benchmarking analyses demonstrated the high performance of the CVAE models compared to state-of-the-art reaction condition prediction approaches.
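The multi-head setup described in the abstract, with one loss per condition component, might be combined along the following lines. This is a sketch under stated assumptions, not the paper's implementation: treating catalysts and additives as multi-label targets with binary cross-entropy, treating temperature and pressure ranges as categorical bins with softmax cross-entropy, summing the heads with equal weight, and adding a weighted KL term are all choices made here for illustration.

```python
import math


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy for a single categorical target (log-sum-exp form)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_index]


def binary_cross_entropy(logits, targets):
    """Mean binary cross-entropy over labels, in a numerically stable form."""
    total = 0.0
    for x, t in zip(logits, targets):
        total += max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


def multi_head_loss(outputs, targets, kl_term, kl_weight=1.0):
    """Sum per-head losses and add a weighted KL term (assumed weighting)."""
    per_head = {
        "catalyst": binary_cross_entropy(outputs["catalyst"], targets["catalyst"]),
        "additive": binary_cross_entropy(outputs["additive"], targets["additive"]),
        "temperature": softmax_cross_entropy(outputs["temperature"], targets["temperature"]),
        "pressure": softmax_cross_entropy(outputs["pressure"], targets["pressure"]),
    }
    return sum(per_head.values()) + kl_weight * kl_term, per_head


# Hypothetical logits and targets for a single reaction.
outputs = {
    "catalyst": [0.0, 0.0, 0.0],     # multi-label logits
    "additive": [0.0, 0.0, 0.0],     # acid, base, catalytic poison
    "temperature": [0.0, 0.0, 0.0],  # logits over three temperature bins
    "pressure": [0.0, 0.0],          # logits over two pressure bins
}
targets = {
    "catalyst": [1, 0, 0],
    "additive": [0, 1, 0],
    "temperature": 1,
    "pressure": 0,
}
total, per_head = multi_head_loss(outputs, targets, kl_term=0.5)
```

Keeping the per-head values separate makes it possible to monitor each condition component on its own during training.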

## Full-text entities

- **Chemicals:** H2

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12786955/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12786955/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/PMC12786955/full.md

---
Source: https://tomesphere.com/paper/PMC12786955