# Bayesian Meta‐Learning for Few‐Shot Reaction Outcome Prediction of Asymmetric Hydrogenation of Olefins

**Authors:** Sukriti Singh, José Miguel Hernández‐Lobato

PMC · DOI: 10.1002/anie.202503821 · 2025-05-02

## TL;DR

This paper introduces a Bayesian meta-learning framework that predicts reaction outcomes from limited data. On asymmetric hydrogenation of olefins, it outperforms single-task baselines such as random forest and graph neural networks.

## Contribution

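The main contribution is a Bayesian meta-learning workflow, pretrained on literature-mined data, that identifies highly enantioselective asymmetric hydrogenation reactions from only a few examples. The paper also proposes a new variant, ADKF-prior, for low-data settings.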

## Key findings

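- Two Bayesian meta-learning methods, deep kernel transfer (DKT) and adaptive deep kernel fitting (ADKF), outperformed the prototypical network and single-task models (random forest, graph neural network, deep kernel learning), even when the single-task models were trained on the full training data.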
- The proposed ADKF-prior method further improved performance in low-data scenarios.
- The meta-model generalized well on substrate- and time-based splits.

## Abstract

Recent years have witnessed the increasing application of machine learning (ML) in chemical reaction development. These ML methods, in general, require large training sets. The published literature has large amounts of data, but there are modelling challenges due to the sparse nature of these datasets. Herein, we report a meta‐learning workflow that can utilize the literature‐mined data and return accurate predictions with limited data. A literature dataset comprising over 12 000 transition‐metal‐catalyzed asymmetric hydrogenation of olefins (AHO) reactions is chosen to demonstrate the utility of our protocol. A meta‐model is trained in a binary classification setting to identify highly enantioselective AHO reactions. Two Bayesian meta‐learning approaches are considered, namely, deep kernel transfer (DKT) and adaptive deep kernel fitting (ADKF). Both methods returned better predictions than the prototypical network, which is another popular meta‐learning approach. Single‐task methods, such as random forest, graph neural network, and deep kernel learning, performed worse than meta‐learning methods even when trained on full training data. Additionally, we propose another meta‐learning approach called ADKF‐prior that is shown to further improve the performance in low‐data settings. The generalizability of our meta‐model is also evaluated on substrate‐ and time‐based splits. Our meta‐learning workflow can be utilized to build a pretrained meta‐model for any reaction of interest, which can then be used to predict the outcome of new but related reactions in a few‐shot manner.

A Bayesian meta‐learning framework is proposed for reaction outcome prediction in low‐data settings. It utilizes a literature‐mined dataset on asymmetric hydrogenation of olefins to build a pre‐trained model. This meta‐model is shown to return better predictions than single‐task methods, such as random forest and graph neural networks, given minimal training data.
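
The few-shot binary classification setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The 80% ee cutoff for "highly enantioselective", the function names `binarize_ee` and `sample_episode`, and the toy reaction records are all assumptions made for illustration, since the abstract does not specify the threshold or how tasks are defined.

```python
import random

def binarize_ee(ee_percent, threshold=80.0):
    """Return 1 if the enantiomeric excess meets the cutoff, else 0.

    The default threshold of 80% ee is an assumed value, not one taken from the paper.
    """
    return 1 if ee_percent >= threshold else 0

def sample_episode(task_reactions, n_support, seed=None):
    """Split one task's labeled reactions into a small support set and a query set.

    The support set plays the role of the few labeled examples available for a
    new task; the query set holds the reactions to be predicted.
    """
    rng = random.Random(seed)
    shuffled = list(task_reactions)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    return shuffled[:n_support], shuffled[n_support:]

# Hypothetical toy task: (reaction identifier, ee in percent).
toy_task = [
    ("rxn_a", 95.0), ("rxn_b", 40.0), ("rxn_c", 88.0),
    ("rxn_d", 12.0), ("rxn_e", 99.0), ("rxn_f", 79.9),
]

# Convert continuous ee values to binary labels, then draw a 4-shot episode.
labeled = [(rid, binarize_ee(ee)) for rid, ee in toy_task]
support, query = sample_episode(labeled, n_support=4, seed=0)
```

In a meta-learning setup, many such episodes drawn from related tasks would be used to train the shared model, and a new task would supply only its support set at prediction time.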

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12207359/full.md

---
Source: https://tomesphere.com/paper/PMC12207359