TL;DR
This paper introduces a novel Bayesian experimental design method that leverages diffusion models and a pooled posterior distribution to efficiently optimize expected information gain, enabling practical high-dimensional experimental planning.
Contribution
It presents a new EIG gradient expression and a joint sampling-optimization approach using diffusion models, extending BOED to complex, high-dimensional problems.
Findings
Demonstrates improved efficiency over existing methods
Enables application of BOED with generative diffusion models
Shows promising results in numerical experiments
Abstract
Bayesian Optimal Experimental Design (BOED) is a powerful tool to reduce the cost of running a sequence of experiments. When based on the Expected Information Gain (EIG), design optimization corresponds to the maximization of some intractable expected contrast between prior and posterior distributions. Scaling this maximization to high-dimensional and complex settings has been an issue due to BOED's inherent computational complexity. In this work, we introduce a pooled posterior distribution with cost-effective sampling properties and provide tractable access to the EIG contrast maximization via a new EIG gradient expression. Diffusion-based samplers are used to compute the dynamics of the pooled posterior, and ideas from bi-level optimization are leveraged to derive an efficient joint sampling-optimization loop. The resulting efficiency gain allows extending BOED to the well-tested…
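To make the "intractable expected contrast" concrete, here is a minimal sketch of the standard nested Monte Carlo (NMC) estimator of the EIG. It is a common baseline, not the method proposed in the paper. The toy model (prior θ ~ N(0, 1), observation y = ξθ + noise with standard deviation `sigma`) and the names `nmc_eig` and `exact_eig` are assumptions made for illustration only; the toy model is chosen because its EIG has the closed form 0.5·log(1 + ξ²/σ²).

```python
import numpy as np


def log_normal_pdf(y, mean, sigma):
    """Log density of N(mean, sigma^2) evaluated at y."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - mean) / sigma) ** 2


def nmc_eig(xi, sigma=1.0, n_outer=2000, n_inner=2000, seed=0):
    """Nested Monte Carlo estimate of EIG(xi) for the assumed toy model.

    EIG(xi) = E_{p(theta) p(y|theta, xi)}[log p(y|theta, xi) - log p(y|xi)],
    where the marginal p(y|xi) is approximated with fresh prior samples.
    """
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(n_outer)                    # outer prior draws
    y = xi * theta + sigma * rng.standard_normal(n_outer)   # simulated outcomes
    theta_inner = rng.standard_normal(n_inner)              # inner prior draws

    log_lik = log_normal_pdf(y, xi * theta, sigma)          # shape (n_outer,)
    # inner[i, j] = log p(y_i | theta_inner_j, xi)
    inner = log_normal_pdf(y[:, None], xi * theta_inner[None, :], sigma)
    m = inner.max(axis=1, keepdims=True)                    # stabilise log-mean-exp
    log_marginal = m[:, 0] + np.log(np.mean(np.exp(inner - m), axis=1))
    return float(np.mean(log_lik - log_marginal))


def exact_eig(xi, sigma=1.0):
    """Closed-form EIG of the assumed linear-Gaussian toy model."""
    return 0.5 * np.log1p(xi**2 / sigma**2)
```

The estimator costs `n_outer * n_inner` likelihood evaluations per design, which is the kind of nested cost that motivates cheaper schemes in high dimensions.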
Peer Reviews
Decision·ICLR 2025 Spotlight
The paper advances the field of BOED by making it more computationally efficient, whilst broadening the scope of problems that can be tackled by BOED by extending it to diffusion-based generative models. The idea of performing joint sampling and optimization using bi-level optimization seems novel to me, although I am not very familiar with this literature. The experimental results are significant and demonstrate the efficacy of CoDiff compared to other baselines. The method is presented clearly al
Some minor points:
* It would be nice to see some uncertainty bars around the results in Figure 3.
* Parts of Section 4 and 5 were a bit difficult for me to follow, but that could just be because I am not familiar with the relevant literature. However, if possible, it might be a good exercise for the authors to think about making these parts a bit more accessible to the BOED community who are not familiar with the sampling-as-optimization literature.
* It would have been nice to empirically test
- rigorously cites and discusses related ideas
- merging multiple nested loops into a single loop is an intuitive idea and has been applied in multiple separate fields (training a VAE is effectively the EM algorithm simplified into a single step; in Bayesian optimization, few-shot Knowledge Gradient)
- the writing is clear and each idea is easy enough to follow.
### Technical Comments
- __expected posterior__: the paper states that they aim to find one proposal distribution that covers all possible future posteriors and proposes a distribution that is a geometric mixing of hallucinated posteriors $q(\theta) \propto \prod_i p(\theta|y_i, \xi)^{\nu_i}$. The prior is the true expected posterior $p(\theta) = \mathbb{E}_Y[p(\theta|y)]$, exactly the average density of all possible posterior densities. Why not use the prior $p(\theta)$ as the one global propos
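The contrast the reviewer draws between the geometric pool $q(\theta) \propto \prod_i p(\theta|y_i, \xi)^{\nu_i}$ and the prior as an arithmetic average of posteriors can be checked numerically in a conjugate case. The sketch below assumes a toy linear-Gaussian model (prior θ ~ N(0, 1), y = ξθ + noise) and weights ν_i that sum to one; neither assumption comes from the review or the paper, and the function names are hypothetical. For Gaussian components, the geometric pool is again Gaussian with precision equal to the weighted sum of component precisions.

```python
import numpy as np


def posterior_params(y, xi, sigma=1.0):
    """Mean and variance of p(theta | y, xi) in the assumed toy model."""
    precision = 1.0 + xi**2 / sigma**2
    mean = (xi * y / sigma**2) / precision
    return mean, 1.0 / precision


def geometric_pool(means, variances, weights):
    """Gaussian obtained from prod_i N(mean_i, var_i)^{weight_i}, renormalised."""
    means, variances, weights = map(np.asarray, (means, variances, weights))
    precision = np.sum(weights / variances)
    mean = np.sum(weights * means / variances) / precision
    return float(mean), float(1.0 / precision)


def compare_pools(xi=2.0, sigma=1.0, n=5000, seed=0):
    """Return (geometric-pool variance, arithmetic-mixture variance)."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(n)
    y = xi * theta + sigma * rng.standard_normal(n)   # hallucinated outcomes
    means, var = posterior_params(y, xi, sigma)
    weights = np.full(n, 1.0 / n)                     # assumed uniform nu_i
    _, pooled_var = geometric_pool(means, np.full(n, var), weights)
    # Law of total variance for the arithmetic mixture of posteriors.
    mixture_var = var + float(np.var(means))
    return pooled_var, mixture_var
```

In this toy setting the arithmetic mixture recovers the prior variance (about 1), while the geometric pool stays as concentrated as a single posterior (variance 1 / (1 + ξ²/σ²)), so the two candidate proposals behave quite differently.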
- The authors make a great connection to optimization by sampling via diffusion models.
- Thorough gradient analysis.
- The connection to inverse problems, e.g. inpainting in MNIST images as a BOED problem, is an interesting new task in the BOED setting and could see BOED objectives (really mutual information objectives) gain further adoption in generative modeling.
While the authors make a great connection to optimization by sampling, a theoretical analysis of convergence to the desired target distribution is missing. This would be especially helpful in, e.g., the multi-step BOED setting to show how to improve bounds on the mutual information. I list my critiques and questions below:
- The authors miss that they are also optimizing a lower bound of the mutual information by the InfoNCE bound, thus their comments that their approximation is exact is inco
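For readers unfamiliar with the InfoNCE bound the reviewer mentions, here is a minimal sketch of the generic estimator, using the true likelihood as the critic. It is not the paper's estimator. The toy linear-Gaussian model (θ ~ N(0, 1), y = ξθ + noise) and the names `info_nce` and `log_lik` are assumptions for illustration. Because the positive pair appears in the denominator, each term is at most log K for batch size K, so the estimate is a lower bound that can never exceed log K.

```python
import numpy as np


def log_lik(y, theta, xi, sigma):
    """log p(y | theta, xi) for the assumed toy model y = xi * theta + noise."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - xi * theta) / sigma) ** 2


def info_nce(xi, sigma=1.0, batch=64, n_batches=200, seed=0):
    """InfoNCE lower bound on I(theta; y) with the likelihood as critic."""
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(n_batches):
        theta = rng.standard_normal(batch)
        y = xi * theta + sigma * rng.standard_normal(batch)
        # scores[i, j] = log p(y_i | theta_j); the diagonal holds the positives.
        scores = log_lik(y[:, None], theta[None, :], xi, sigma)
        m = scores.max(axis=1, keepdims=True)          # stabilise log-mean-exp
        log_mean = m[:, 0] + np.log(np.mean(np.exp(scores - m), axis=1))
        values.append(np.mean(np.diag(scores) - log_mean))
    return float(np.mean(values))
```

In the toy model the exact mutual information is 0.5·log(1 + ξ²/σ²), so the gap between that value and `info_nce(xi)` gives a rough sense of how loose the bound is at a given batch size.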
Taxonomy
Methods: Diffusion
