Bayesian Optimization of Catalysis With In-Context Learning

Mayk Caldas Ramos; Shane S. Michtavy; Marc D. Porosoff; Andrew D. White

arXiv:2304.05341 · physics.chem-ph · May 16, 2025 · 31 cites

Open Access · 2 Repos

TL;DR

This paper uses frozen large language models as the surrogate for Bayesian optimization in materials discovery, enabling catalyst design without explicit model training or feature engineering.

Contribution

The paper extends in-context learning from classification to regression with uncertainty estimation, so that Bayesian optimization can run directly on natural language descriptions of materials science experiments.

Findings

01

BO-ICL matches or outperforms Gaussian processes on benchmarks.

02

It identifies near-optimal catalysts within six iterations in live experiments.

03

The method operates effectively without structural or electronic descriptors.

Abstract

Large language models (LLMs) can perform accurate classification with zero or few examples through in-context learning. We extend this capability to regression with uncertainty estimation using frozen LLMs (e.g., GPT-3.5, Gemini), enabling Bayesian optimization (BO) in natural language without explicit model training or feature engineering. We apply this to materials discovery by representing experimental catalyst synthesis and testing procedures as natural language prompts. A key challenge in materials discovery is the need to characterize suboptimal candidates, which slows progress. While BO is effective for navigating large design spaces, standard surrogate models like Gaussian processes assume smoothness and continuity, an assumption that fails in highly non-linear domains such as heterogeneous catalysis. Our task-agnostic BO workflow overcomes this by operating directly in language…
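The workflow described in the abstract can be sketched as a pool-based loop: labeled procedures become few-shot context, a surrogate returns a prediction with an uncertainty for each unlabeled procedure, and an acquisition function picks the next experiment. This is a minimal sketch under stated assumptions, not the authors' implementation. The prompt template, the `toy_surrogate` stand-in (token overlap instead of a frozen LLM), the upper-confidence-bound acquisition, and the procedure strings and scores are all hypothetical and chosen only so the example runs without a model call.

```python
import statistics
from typing import Callable, List, Tuple

Example = Tuple[str, float]
# Hypothetical surrogate interface: (labeled examples, candidate text) -> (mean, std).
Surrogate = Callable[[List[Example], str], Tuple[float, float]]


def build_prompt(examples: List[Example], candidate: str) -> str:
    """Few-shot prompt: labeled procedures first, then the unlabeled query."""
    shots = [f"Procedure: {text}\nOutcome: {value:.2f}" for text, value in examples]
    shots.append(f"Procedure: {candidate}\nOutcome:")
    return "\n\n".join(shots)


def mean_and_std(samples: List[float]) -> Tuple[float, float]:
    """Collapse repeated numeric completions into a mean and a spread."""
    return statistics.fmean(samples), statistics.pstdev(samples)


def toy_surrogate(examples: List[Example], candidate: str) -> Tuple[float, float]:
    """Stand-in for the frozen LLM (assumption): token-overlap weighting."""
    query = set(candidate.lower().split())
    sims = []
    for text, _ in examples:
        tokens = set(text.lower().split())
        sims.append(len(query & tokens) / (len(query | tokens) or 1))
    total = sum(sims)
    if total == 0:
        mean = statistics.fmean(value for _, value in examples)
    else:
        mean = sum(s * value for s, (_, value) in zip(sims, examples)) / total
    return mean, 1.0 - max(sims)  # less overlap with known data -> more uncertainty


def bo_loop(
    pool: List[str],
    oracle: Callable[[str], float],
    surrogate: Surrogate,
    n_init: int = 2,
    n_iter: int = 3,
    beta: float = 1.0,
) -> List[Example]:
    """Label n_init candidates, then choose by upper confidence bound each round."""
    remaining = list(pool)
    examples = [(text, oracle(text)) for text in remaining[:n_init]]
    remaining = remaining[n_init:]
    for _ in range(n_iter):
        if not remaining:
            break

        def ucb(text: str) -> float:
            mean, std = surrogate(examples, text)
            return mean + beta * std

        chosen = max(remaining, key=ucb)
        remaining.remove(chosen)
        examples.append((chosen, oracle(chosen)))  # run the "experiment"
    return examples


# Placeholder pool and scores, invented for illustration (not data from the paper).
TOY_SCORES = {
    "support alpha metal one low temperature": 0.10,
    "support alpha metal two low temperature": 0.30,
    "support beta metal one high temperature": 0.50,
    "support beta metal two high temperature": 0.90,
    "support alpha metal two high temperature": 0.60,
    "support beta metal one low temperature": 0.20,
}

history = bo_loop(list(TOY_SCORES), TOY_SCORES.__getitem__, toy_surrogate)
print(build_prompt(history[:2], "support beta metal two low temperature"))
```

To move from the toy to a language model, one option is to send `build_prompt` output to the model several times, parse each completion as a number, and pass the results through `mean_and_std`; the function that does this would replace `toy_surrogate` without changing `bo_loop`.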

Taxonomy

Topics: Machine Learning in Materials Science · Topic Modeling · Machine Learning and Data Classification

Methods: Attention Is All You Need · Linear Layer · Residual Connection · Cosine Annealing · Linear Warmup With Cosine Annealing · Dense Connections · Attention Dropout