Exploring validation metrics for offline model-based optimisation with diffusion models
Christopher Beckham, Alexandre Piche, David Vazquez, Christopher Pal

TL;DR
This paper investigates how different validation metrics correlate with true reward in offline model-based optimisation using diffusion models, aiming to improve evaluation methods without relying on expensive real-world testing.
Contribution
It proposes a comprehensive evaluation framework for validation metrics in offline MBO and assesses their effectiveness specifically for diffusion models, providing insights into metric ranking and hyperparameter effects.
Findings
Certain validation metrics show higher correlation with ground truth rewards.
The evaluation framework effectively compares metrics across multiple datasets.
Hyperparameters significantly influence metric performance.
Abstract
In model-based optimisation (MBO) we are interested in using machine learning to design candidates that maximise some measure of reward with respect to a black box function called the (ground truth) oracle, which is expensive to compute since it involves executing a real-world process. In offline MBO we wish to do so without assuming access to such an oracle during training or validation, which makes evaluation non-straightforward. While an approximation to the ground truth oracle can be trained and used in place of it during model validation to measure the mean reward over generated candidates, the evaluation is approximate and vulnerable to adversarial examples. Measuring the mean reward of generated candidates over this approximation is one such 'validation metric', whereas we are interested in a more fundamental question, which is finding which validation metrics correlate the most with the…
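The comparison described above can be illustrated with a minimal sketch. It assumes that each trained model (for example, one per hyperparameter setting) yields one value per validation metric and one ground truth reward, and that metrics are ranked by their Pearson correlation with that reward. The function names (`mean_proxy_reward`, `pearson`, `rank_metrics`) and the toy numbers are hypothetical and are not taken from the paper; the paper's actual protocol may differ.

```python
import numpy as np


def mean_proxy_reward(candidates, proxy_oracle):
    """Validation metric: mean reward of generated candidates under an
    approximate (learned) oracle. `proxy_oracle` is any callable."""
    return float(np.mean([proxy_oracle(x) for x in candidates]))


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_c = a - a.mean()
    b_c = b - b.mean()
    denom = np.sqrt((a_c ** 2).sum() * (b_c ** 2).sum())
    if denom == 0.0:
        return float("nan")  # undefined when either sequence is constant
    return float((a_c * b_c).sum() / denom)


def rank_metrics(metric_values, true_rewards):
    """Sort metrics by correlation with the ground truth reward, highest first.

    metric_values: dict mapping metric name -> one value per trained model.
    true_rewards:  ground truth reward per trained model, in the same order.
    """
    scores = {name: pearson(vals, true_rewards)
              for name, vals in metric_values.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic toy values, for illustration only (not results from the paper).
true_rewards = [1.0, 2.0, 3.0, 4.0]
metric_values = {
    "metric_a": [2.0, 4.0, 6.0, 8.0],   # moves with the true reward
    "metric_b": [4.0, 3.0, 2.0, 1.0],   # moves against the true reward
}
ranking = rank_metrics(metric_values, true_rewards)
```

For a metric where lower values are better, its sign would need to be flipped before ranking. A rank correlation could be substituted for Pearson if only the ordering of models matters.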
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning in Materials Science · Advanced Multi-Objective Optimization Algorithms
Methods: Diffusion
