Gaussian Process Bandit Optimization with Machine Learning Predictions and Application to Hypothesis Generation
Xin Jennifer Chen, Yunjin Tong

TL;DR
This paper introduces PA-GP-UCB, a Bayesian optimization method that effectively combines expensive ground-truth evaluations, cheap predictions, and offline data to improve sample efficiency in hypothesis generation tasks.
Contribution
The paper proposes PA-GP-UCB, an algorithm that leverages an expensive ground-truth oracle, a cheap prediction oracle, and offline data, with theoretical regret guarantees and empirical validation showing faster convergence.
Findings
Faster convergence than baseline methods on synthetic benchmarks.
Provable regret bounds with improved constants.
Effective in real-world hypothesis evaluation with language models.
Abstract
Many real-world optimization problems involve an expensive ground-truth oracle (e.g., human evaluation, physical experiments) and a cheap, low-fidelity prediction oracle (e.g., machine learning models, simulations). Meanwhile, abundant offline data (e.g., past experiments and predictions) are often available and can be used to pretrain powerful predictive models, as well as to provide an informative prior. We propose Prediction-Augmented Gaussian Process Upper Confidence Bound (PA-GP-UCB), a novel Bayesian optimization algorithm that leverages both oracles and offline data to achieve provable gains in sample efficiency for the ground-truth oracle queries. PA-GP-UCB employs a control-variates estimator derived from a joint Gaussian process posterior to correct prediction bias and reduce uncertainty. We prove that PA-GP-UCB preserves the standard regret rate of GP-UCB while achieving a…
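The control-variates idea mentioned in the abstract can be illustrated with a minimal scalar sketch. This is not the paper's estimator, which is derived from a joint Gaussian process posterior; it is the classical control-variates mean estimator, shown under assumed toy distributions. The function names, noise levels, prediction bias, and sample sizes below are all hypothetical choices made for illustration.

```python
import random
import statistics


def control_variate_mean(y, p, p_mean_cheap):
    """Estimate E[y] from paired samples (y, p) plus a cheap estimate of E[p].

    y: expensive ground-truth observations
    p: cheap predictions at the same points (may be biased)
    p_mean_cheap: estimate of E[p] from many extra cheap predictions
    """
    y_bar = statistics.fmean(y)
    p_bar = statistics.fmean(p)
    # c = Cov(y, p) / Var(p) is the variance-minimizing coefficient.
    cov_yp = sum((yi - y_bar) * (pi - p_bar) for yi, pi in zip(y, p)) / (len(y) - 1)
    c = cov_yp / statistics.variance(p)
    # Only the difference p_bar - p_mean_cheap enters, so a constant
    # bias in the predictions cancels out.
    return y_bar - c * (p_bar - p_mean_cheap)


def simulate(trials=1000, n_expensive=10, n_cheap=500, seed=0):
    """Compare mean squared error of the plain mean vs. the control-variates mean."""
    rng = random.Random(seed)
    true_mean = 1.0
    err_plain, err_cv = [], []
    for _ in range(trials):
        y, p = [], []
        for _ in range(n_expensive):
            z = rng.gauss(0.0, 1.0)                        # shared latent signal
            y.append(true_mean + z + rng.gauss(0.0, 0.3))  # ground truth (assumed model)
            p.append(0.5 + z + rng.gauss(0.0, 0.3))        # biased prediction (assumed model)
        # Extra cheap predictions, used only to estimate E[p].
        cheap = [0.5 + rng.gauss(0.0, 1.0) + rng.gauss(0.0, 0.3) for _ in range(n_cheap)]
        err_plain.append((statistics.fmean(y) - true_mean) ** 2)
        err_cv.append((control_variate_mean(y, p, statistics.fmean(cheap)) - true_mean) ** 2)
    return statistics.fmean(err_plain), statistics.fmean(err_cv)


if __name__ == "__main__":
    mse_plain, mse_cv = simulate()
    print(f"plain MSE: {mse_plain:.4f}  control-variates MSE: {mse_cv:.4f}")
```

In this toy setting the predictions are strongly correlated with the ground truth, so the control-variates estimate of the mean should have noticeably lower error than the plain average of the ten expensive samples, even though the predictions are offset by a constant bias.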
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Advanced Bandit Algorithms Research · Advanced Multi-Objective Optimization Algorithms
