Three Costs of Amortizing Gaussian Process Inference with Neural Processes

Robin Young

arXiv:2605.21798·cs.LG·May 22, 2026

TL;DR

This paper analyzes the cost of amortizing Gaussian process (GP) inference with neural processes. It bounds the KL divergence between the GP and latent neural process (LNP) predictives and attributes the approximation error to three distinct sources.

Contribution

The paper decomposes the KL divergence between the GP and LNP predictives into three interpretable sources, relates the required representation dimension to properties of the kernel, and offers practical architectural recommendations.

Findings

01

Bound the KL divergence between the GP and LNP predictives, decomposing it into three sources: label contamination, an information bottleneck, and amortization error.

02

Show that the bottleneck truncation term decays in the representation dimension $d$ as $O(e^{-c d^{2/d_x}})$ for squared-exponential kernels on $\mathbb{R}^{d_x}$.

03

Identify a persistent label contamination cost and suggest architectural improvements to mitigate the amortization gap.
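
To get a feel for the rate in finding 02, the sketch below evaluates $e^{-c d^{2/d_x}}$ for a few input dimensions $d_x$. The function name `se_bottleneck_rate` is hypothetical, and `c = 1.0` is an arbitrary placeholder: the paper's constant is kernel-dependent and is not given on this page. The values show only the shape of the decay, not the actual bound.

```python
import math

def se_bottleneck_rate(d, d_x, c=1.0):
    """Evaluate exp(-c * d**(2 / d_x)), the decay rate quoted for
    squared-exponential kernels. c is a placeholder, not the paper's constant."""
    return math.exp(-c * d ** (2.0 / d_x))

# Same representation dimension d, increasing input dimension d_x:
# the exponent d**(2 / d_x) shrinks, so the rate decays more slowly.
for d_x in (1, 2, 4, 8):
    print(d_x, [se_bottleneck_rate(d, d_x) for d in (4, 16, 64)])
```

For $d_x = 2$ the exponent is linear in $d$, so the decay is a plain exponential. For larger $d_x$ the same $d$ buys a much smaller reduction.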

Abstract

Neural processes amortize Gaussian process inference, replacing the exact $O(n^3)$ posterior with a learned $O(n)$ map from context sets to predictive distributions. For a class of latent neural processes, we bound the Kullback–Leibler (KL) divergence between the GP and LNP predictives, decomposing it into three interpretable sources: label contamination, because the neural process uses label values to estimate a quantity that is label-independent in the exact GP; an information bottleneck, because the finite-dimensional representation cannot resolve the full context geometry; and amortization error, from a single encoder network shared across all contexts. The bottleneck truncation term decays in the representation dimension $d$ as $O(e^{-c d^{2/d_x}})$ for squared-exponential kernels on $\mathbb{R}^{d_x}$, where $c > 0$ is a kernel-dependent constant, and as $O(d^{-2\nu/d_x})$ for…
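
The label-independence behind the "label contamination" term can be checked directly on an exact GP. The following is a minimal NumPy sketch, not the paper's code: it computes the standard zero-mean GP predictive through a Cholesky factorization (the $O(n^3)$ step) and shows that changing the context labels moves the predictive mean but leaves the predictive variance untouched. The helper names, the unit lengthscale, and the noise level are illustrative assumptions. The closed-form KL divergence between univariate Gaussians is included because it is the quantity one would evaluate pointwise when comparing a GP predictive with an approximate one.

```python
import numpy as np

def se_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel matrix between row-wise inputs a and b."""
    sq_dists = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)

def gp_predict(x_ctx, y_ctx, x_tgt, noise=1e-2):
    """Exact zero-mean GP predictive mean and variance at x_tgt."""
    K = se_kernel(x_ctx, x_ctx) + noise * np.eye(len(x_ctx))
    K_s = se_kernel(x_ctx, x_tgt)
    L = np.linalg.cholesky(K)                       # the O(n^3) step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_ctx))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha                            # depends on the labels
    var = 1.0 + noise - np.sum(v**2, axis=0)        # does not depend on the labels
    return mean, var

def gaussian_kl(mean_p, var_p, mean_q, var_q):
    """KL(N(mean_p, var_p) || N(mean_q, var_q)) for univariate Gaussians."""
    return 0.5 * (np.log(var_q / var_p)
                  + (var_p + (mean_p - mean_q) ** 2) / var_q - 1.0)

rng = np.random.default_rng(0)
x_ctx = rng.uniform(-2, 2, size=(20, 1))            # shared context inputs
x_tgt = np.linspace(-2, 2, 5)[:, None]
y_a = np.sin(x_ctx[:, 0])                           # one set of labels
y_b = rng.normal(size=20)                           # a different set of labels

mean_a, var_a = gp_predict(x_ctx, y_a, x_tgt)
mean_b, var_b = gp_predict(x_ctx, y_b, x_tgt)
print("variances equal:", np.allclose(var_a, var_b))
print("means equal:", np.allclose(mean_a, mean_b))
```

An approximate predictive whose variance changes with the labels at fixed context inputs therefore incurs a nonzero `gaussian_kl` against the GP even when its mean is exact.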
