Three Costs of Amortizing Gaussian Process Inference with Neural Processes
Robin Young

TL;DR
This paper analyzes the cost of amortizing Gaussian process (GP) inference with neural processes, bounding the approximation error and attributing it to three distinct sources.
Contribution
It decomposes a bound on the KL divergence between the GP and neural process predictives into three interpretable sources, links architecture size to kernel properties, and offers practical architectural recommendations.
Findings
Bound the KL divergence and decompose it into three sources: label contamination, information bottleneck, and amortization error.
Show that the bottleneck truncation term decays exponentially with representation dimension for squared-exponential kernels.
Identify a persistent label contamination cost and suggest architectural improvements to mitigate the amortization gap.
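To give some intuition for the second finding, the sketch below looks at how quickly the eigenvalues of a squared-exponential Gram matrix fall off. This is an illustrative proxy only, not the paper's bound: the input grid, the lengthscale, and the helper name tail_fraction are all assumptions made for the example, and the connection between spectral tail mass and the bottleneck term is assumed here rather than taken from the paper.

```python
import numpy as np

# Hypothetical setup: 50 evenly spaced inputs on [0, 1], lengthscale 0.5.
x = np.linspace(0.0, 1.0, 50)
lengthscale = 0.5
K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / lengthscale) ** 2)

# Eigenvalues in descending order; clip tiny negative values from round-off.
eigvals = np.clip(np.linalg.eigvalsh(K)[::-1], 0.0, None)

def tail_fraction(d):
    """Fraction of total spectral mass left after keeping the top d eigenvalues."""
    return eigvals[d:].sum() / eigvals.sum()

for d in (1, 2, 4, 8):
    print(d, tail_fraction(d))
```

In this toy setting the leftover mass shrinks rapidly as d grows, which is at least consistent with a small representation dimension sufficing for smooth kernels.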
Abstract
Neural processes amortize Gaussian process (GP) inference, replacing the exact posterior with a learned map from context sets to predictive distributions. For a class of latent neural processes (LNPs), we bound the Kullback-Leibler (KL) divergence between the GP and LNP predictives, decomposing it into three interpretable sources: label contamination, because the neural process uses label values to estimate a quantity that is label-independent in the exact GP; an information bottleneck, because the finite-dimensional representation cannot resolve the full context geometry; and amortization error, from a single encoder network shared across all contexts. The bottleneck truncation term decays exponentially in the representation dimension for squared-exponential kernels, with a kernel-dependent constant; for…
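The abstract's remark that a quantity is label-independent in the exact GP can be checked with the standard GP regression formulas, where the predictive variance depends only on input locations. The sketch below is a minimal illustration under assumed settings (a unit-variance squared-exponential kernel, lengthscale 0.3, noise variance 1e-2, and the hypothetical helper names se_kernel and gp_posterior); it is not the paper's code.

```python
import numpy as np

def se_kernel(a, b, lengthscale=0.3):
    """Squared-exponential kernel matrix between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_posterior(x_ctx, y_ctx, x_tgt, noise=1e-2):
    """Exact GP predictive mean and latent variance at x_tgt given a context set."""
    K = se_kernel(x_ctx, x_ctx) + noise * np.eye(len(x_ctx))
    K_s = se_kernel(x_tgt, x_ctx)
    L = np.linalg.cholesky(K)
    # The mean depends on the labels through alpha = K^{-1} y.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_ctx))
    mean = K_s @ alpha
    # The variance uses only input locations: k(x*, x*) - k_*^T K^{-1} k_*.
    v = np.linalg.solve(L, K_s.T)
    var = 1.0 - np.sum(v ** 2, axis=0)
    return mean, var

rng = np.random.default_rng(0)
x_ctx = rng.uniform(0.0, 1.0, size=8)
x_tgt = np.linspace(0.0, 1.0, 5)
y_a = np.sin(6.0 * x_ctx)   # one set of labels
y_b = rng.normal(size=8)    # a different set of labels at the same inputs

mean_a, var_a = gp_posterior(x_ctx, y_a, x_tgt)
mean_b, var_b = gp_posterior(x_ctx, y_b, x_tgt)
```

Swapping the labels changes the predictive mean but leaves the variance untouched, so any dependence of a learned model's predictive uncertainty on label values has no counterpart in the exact GP with fixed hyperparameters.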
