How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks

Keyulu Xu; Mozhi Zhang; Jingling Li; Simon S. Du; Ken-ichi Kawarabayashi; Stefanie Jegelka

arXiv:2009.11848·cs.LG·March 4, 2021·110 cites

TL;DR

This paper investigates how neural networks, including feedforward and graph neural networks, extrapolate beyond their training data, providing theoretical conditions for successful extrapolation and highlighting the importance of architecture and feature encoding.

Contribution

It offers a theoretical framework connecting neural tangent kernels to extrapolation capabilities and explains when GNNs succeed in extrapolating complex tasks.

Findings

01

ReLU MLPs quickly converge to linear functions along any direction from the origin, so they do not extrapolate most nonlinear functions.

02

ReLU MLPs can provably learn a linear target function when the training distribution is sufficiently diverse.

03

GNNs' success in extrapolation depends on encoding task-specific nonlinearities.
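
One way to picture "encoding task-specific nonlinearities" is a message-passing step whose aggregation is a min rather than something the network must learn. The sketch below is an illustration chosen here, not code from the paper: it uses the standard Bellman-Ford relaxation for shortest paths, where the only nonlinearity is the min and the remaining computation, d[v] + w, is linear. The function name `min_aggregation_step` and the toy graph are hypothetical.

```python
import math

def min_aggregation_step(dist, edges):
    """One round of d[u] <- min(d[u], d[v] + w) over directed edges (v, u, w)."""
    new_dist = dict(dist)
    for v, u, w in edges:
        candidate = dist[v] + w          # linear in (d[v], w)
        if candidate < new_dist[u]:      # min: nonlinearity fixed by the architecture
            new_dist[u] = candidate
    return new_dist

# Toy directed graph (hypothetical): edges are (source, target, weight).
edges = [("s", "a", 1.0), ("s", "b", 4.0), ("a", "b", 2.0), ("b", "t", 1.0)]
nodes = {"s", "a", "b", "t"}

dist = {n: math.inf for n in nodes}
dist["s"] = 0.0
for _ in range(len(nodes) - 1):          # |V| - 1 rounds suffice for Bellman-Ford
    dist = min_aggregation_step(dist, edges)

print(dist)
```

Because the min is built into the step, the part left over is linear in the edge weights, which fits the finding that MLP modules handle linear targets better than nonlinear ones when extrapolating.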

Abstract

We study how neural networks trained by gradient descent extrapolate, i.e., what they learn outside the support of the training distribution. Previous works report mixed empirical results when extrapolating with neural networks: while feedforward neural networks, a.k.a. multilayer perceptrons (MLPs), do not extrapolate well in certain simple tasks, Graph Neural Networks (GNNs) -- structured networks with MLP modules -- have shown some success in more complex tasks. Working towards a theoretical explanation, we identify conditions under which MLPs and GNNs extrapolate well. First, we quantify the observation that ReLU MLPs quickly converge to linear functions along any direction from the origin, which implies that ReLU MLPs do not extrapolate most nonlinear functions. But, they can provably learn a linear target function when the training distribution is sufficiently "diverse". Second,…
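
The linearity claim in the abstract can be illustrated on a small scale. The sketch below is an assumption-laden toy, not the paper's analysis: it uses an untrained one-hidden-layer ReLU MLP with random weights (the paper studies networks trained by gradient descent) and relies only on the fact that such a network is piecewise linear. Along a ray x = t v, each hidden unit switches on or off at most once, so past the last switch point the output is exactly affine in t. The names `relu_mlp` and `last_breakpoint` are hypothetical.

```python
import numpy as np

def relu_mlp(x, W, b, a):
    """One-hidden-layer ReLU MLP: f(x) = a . relu(W x + b)."""
    return a @ np.maximum(W @ x + b, 0.0)

def last_breakpoint(v, W, b):
    """Largest t > 0 at which a hidden unit changes state along x = t v."""
    slope = W @ v
    nonzero = slope != 0
    t = -b[nonzero] / slope[nonzero]     # pre-activation t * slope + b crosses zero here
    t = t[t > 0]
    return float(t.max()) if t.size else 0.0

rng = np.random.default_rng(0)
d, h = 3, 16                             # input and hidden widths (arbitrary choices)
W = rng.normal(size=(h, d))
b = rng.normal(size=h)
a = rng.normal(size=h)
v = rng.normal(size=d)
v /= np.linalg.norm(v)                   # unit direction from the origin

t0 = last_breakpoint(v, W, b)
ts = t0 + np.linspace(1.0, 10.0, 10)     # equally spaced points past the last breakpoint
vals = np.array([relu_mlp(t * v, W, b, a) for t in ts])

# For an affine function on an equally spaced grid, second differences vanish.
second_diff = np.diff(vals, n=2)
is_linear = np.allclose(second_diff, 0.0, atol=1e-8 * max(1.0, np.abs(vals).max()))
print(is_linear)
```

This only shows that the function is eventually linear along a ray. The abstract's stronger statement, that trained ReLU MLPs converge quickly to linear functions, is a quantitative result that this toy does not reproduce.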

Code & Models

Repositories

Videos

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks · slideslive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques · Machine Learning and ELM
