Wasserstein Neural Processes

Andrew Carr; Jared Nielsen; David Wingate

arXiv:1910.00668·cs.LG·January 13, 2020

Open Access

TL;DR

Wasserstein Neural Processes (WNPs) train Neural Processes with approximations of the Wasserstein distance, which remains well defined for distributions with disjoint support. The paper reports that this lets WNPs learn classes of problems on which traditionally trained NPs fail.

Contribution

The paper introduces Wasserstein Neural Processes, an approach that trains NPs with approximations of the Wasserstein distance in place of the KL divergence, addressing the failure of the standard loss on distributions with disjoint support.

Findings

01. WNPs outperform traditional NPs on certain tasks.

02. WNPs effectively handle distributions with disjoint support.

03. Experimental results validate the advantages of WNPs.

Abstract

Neural Processes (NPs) are a class of models that learn a mapping from a context set of input-output pairs to a distribution over functions. They are traditionally trained using maximum likelihood with a KL divergence regularization term. We show that there are desirable classes of problems where NPs, with this loss, fail to learn any reasonable distribution. We also show that this drawback is solved by using approximations of the Wasserstein distance, which calculates optimal transport distances even for distributions with disjoint support. We give experimental justification for our method and demonstrate its performance. These Wasserstein Neural Processes (WNPs) maintain all of the benefits of traditional NPs while being able to approximate a new class of function mappings.
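The abstract's point about disjoint support can be illustrated with a small numerical sketch. The code below is not the authors' method: the page does not say which Wasserstein approximation the paper uses, so this example assumes the simplest case, the exact 1-D Wasserstein-1 distance between equal-size samples computed by sorted matching. The helper names `kl_divergence`, `wasserstein_1d`, and `hist`, along with the sample values and the bin grid, are hypothetical choices made for illustration only.

```python
import numpy as np


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions on a shared grid.

    Returns inf when q assigns zero mass to a bin where p has mass,
    which is always the case for distributions with disjoint support.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    if np.any(q[mask] == 0):
        return float("inf")
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def wasserstein_1d(x, y):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    In one dimension the optimal transport plan matches sorted points,
    so the distance is the mean absolute gap between sorted samples.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal-size samples")
    return float(np.mean(np.abs(x - y)))


# Illustrative samples (assumed values): one base cloud and two shifted
# copies whose supports do not overlap with it.
x = np.array([0.0, 0.1, 0.2, 0.3])
near = x + 1.0
far = x + 5.0

# Shared histogram grid (assumed) so that KL can be evaluated at all.
bins = np.linspace(0.0, 6.0, 13)


def hist(samples):
    counts, _ = np.histogram(samples, bins=bins)
    return counts / counts.sum()


kl_near = kl_divergence(hist(x), hist(near))
kl_far = kl_divergence(hist(x), hist(far))
w_near = wasserstein_1d(x, near)
w_far = wasserstein_1d(x, far)

print("KL  near / far:", kl_near, kl_far)
print("W1  near / far:", w_near, w_far)
```

In this toy setting the KL divergence is infinite for both shifted clouds, so it cannot tell the near one from the far one, while the transport distance grows with the size of the shift. That ordering is the kind of signal the abstract attributes to Wasserstein-based training.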

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks