Wasserstein Neural Processes
Andrew Carr, Jared Nielsen, David Wingate

TL;DR
Wasserstein Neural Processes (WNPs) replace the KL divergence term used to train Neural Processes with approximations of the Wasserstein distance, which stays well defined for distributions with disjoint support. The paper reports experiments supporting the change.
Contribution
The paper introduces Wasserstein Neural Processes, which train NPs with the Wasserstein distance in place of the KL divergence, addressing the failure of the KL-based loss on distributions with disjoint support.
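For reference, the two quantities being swapped can be written out as below. These are the standard textbook definitions, shown here for the order-1 Wasserstein distance as an assumption; this page does not state which order or which approximation the paper uses. The point-mass example is a common illustration, not one taken from the paper.

```latex
% KL divergence: infinite whenever p puts mass where q has none
\mathrm{KL}(p \,\|\, q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx

% Wasserstein-1 distance: infimum over couplings \gamma with marginals p and q
W_1(p, q) = \inf_{\gamma \in \Pi(p, q)} \mathbb{E}_{(x, y) \sim \gamma} \big[ \lVert x - y \rVert \big]

% Illustration with point masses p = \delta_0 and q = \delta_\theta, \theta \neq 0:
\mathrm{KL}(\delta_0 \,\|\, \delta_\theta) = +\infty, \qquad W_1(\delta_0, \delta_\theta) = |\theta|
```

In the point-mass case the KL divergence is infinite for every nonzero offset, so it carries no information about how far apart the two distributions are, while the Wasserstein distance shrinks smoothly as the offset goes to zero.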
Findings
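NPs trained with maximum likelihood and a KL divergence regularization term fail to learn any reasonable distribution on some desirable classes of problems.
Approximations of the Wasserstein distance give usable optimal transport distances even for distributions with disjoint support.
The paper gives experimental justification for WNPs and states that they keep the benefits of traditional NPs.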
Abstract
Neural Processes (NPs) are a class of models that learn a mapping from a context set of input-output pairs to a distribution over functions. They are traditionally trained using maximum likelihood with a KL divergence regularization term. We show that there are desirable classes of problems where NPs, with this loss, fail to learn any reasonable distribution. We also show that this drawback is solved by using approximations of the Wasserstein distance, which yield optimal transport distances even for distributions of disjoint support. We give experimental justification for our method and demonstrate its performance. These Wasserstein Neural Processes (WNPs) maintain all of the benefits of traditional NPs while being able to approximate a new class of function mappings.
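The disjoint-support issue described in the abstract can be seen numerically in one dimension. The sketch below is an illustration only, not the paper's training loss: the function names `histogram_kl`, `wasserstein_1d`, and `compare` are hypothetical, the KL divergence is estimated from histograms on shared bins, and the Wasserstein-1 distance uses the exact sorted-sample formula that holds for equal-size 1D samples.

```python
import numpy as np


def histogram_kl(p, q):
    """KL(p || q) for two discrete distributions defined on the same bins."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    # If p has mass in a bin where q has none, the divergence is infinite.
    if np.any(q[mask] == 0):
        return float("inf")
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def wasserstein_1d(x, y):
    """Wasserstein-1 distance between two equal-size 1D empirical samples."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    # In 1D the optimal transport plan matches sorted samples in order.
    return float(np.mean(np.abs(x - y)))


def compare(shift, n=1000, bins=50, seed=0):
    """Return (KL estimate, W1) between uniform samples and a shifted copy."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=n)
    y = x + shift  # target samples: the source samples translated by `shift`
    lo, hi = min(x.min(), y.min()), max(x.max(), y.max())
    p, _ = np.histogram(x, bins=bins, range=(lo, hi))
    q, _ = np.histogram(y, bins=bins, range=(lo, hi))
    return histogram_kl(p / p.sum(), q / q.sum()), wasserstein_1d(x, y)


if __name__ == "__main__":
    for shift in (2.0, 3.0):
        kl, w1 = compare(shift)
        print(f"shift={shift}: KL={kl}, W1={w1:.3f}")
```

For both shifts the supports do not overlap, so the histogram KL estimate is infinite in each case and cannot distinguish them, while the Wasserstein-1 value tracks the size of the shift.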
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks
