Global Convergence of Sobolev Training for Overparameterized Neural Networks

Jorio Cocola; Paul Hand

arXiv:2006.07928·cs.LG·August 18, 2020

TL;DR

This paper proves that an overparameterized two-layer ReLU neural network, trained on the Sobolev loss with gradient flow from random initialization, can fit any given function values and directional derivatives at a prescribed set of input points, provided the input data satisfy a separation condition.

Contribution

The paper provides a theoretical convergence guarantee for Sobolev training of overparameterized two-layer ReLU networks, covering the fit of both function values and directional derivatives.

Findings

01

Overparameterized two-layer ReLU networks trained on the Sobolev loss can fit both function values and directional derivatives at the training points.

02

Gradient flow from random initialization converges under a separation condition on the input data.

03

The guarantee holds for any given target values and any given directional derivatives at the prescribed input points.

Abstract

Sobolev loss is used when training a network to approximate the values and derivatives of a target function at a prescribed set of input points. Recent works have demonstrated its successful applications in various tasks such as distillation or synthetic gradient prediction. In this work we prove that an overparameterized two-layer ReLU neural network trained on the Sobolev loss with gradient flow from random initialization can fit any given function values and any given directional derivatives, under a separation condition on the input data.
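To make the objective concrete, here is a minimal sketch of a Sobolev loss for a two-layer ReLU network, written in plain Python. The parameterization (no biases, a 1/sqrt(m) output scaling, one direction per input point, equal weighting of the two terms) and every name in it (`relu_net`, `directional_derivative`, `sobolev_loss`) are illustrative assumptions, not the paper's exact setup.

```python
import math

def _dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def relu_net(x, W, a):
    """Assumed model: f(x) = (1/sqrt(m)) * sum_r a[r] * relu(<W[r], x>)."""
    m = len(W)
    return sum(a_r * max(_dot(w_r, x), 0.0) for w_r, a_r in zip(W, a)) / math.sqrt(m)

def directional_derivative(x, v, W, a):
    """Derivative of relu_net at x along direction v.

    Valid wherever no pre-activation <W[r], x> is exactly zero;
    only neurons that are active at x contribute.
    """
    m = len(W)
    total = 0.0
    for w_r, a_r in zip(W, a):
        if _dot(w_r, x) > 0.0:
            total += a_r * _dot(w_r, v)
    return total / math.sqrt(m)

def sobolev_loss(W, a, xs, ys, vs, hs):
    """Half squared error on values ys plus half squared error on
    directional derivatives hs (direction vs[i] at point xs[i])."""
    loss = 0.0
    for x, y, v, h in zip(xs, ys, vs, hs):
        loss += 0.5 * (relu_net(x, W, a) - y) ** 2
        loss += 0.5 * (directional_derivative(x, v, W, a) - h) ** 2
    return loss

# Toy example: m = 2 hidden neurons, one 2-D input point, one direction.
W = [[1.0, 0.0], [0.0, -1.0]]   # hidden-layer weights (hypothetical values)
a = [1.0, -1.0]                 # output-layer weights (hypothetical values)
xs = [[1.0, 1.0]]
vs = [[1.0, 0.0]]

loss_zero_targets = sobolev_loss(W, a, xs, [0.0], vs, [0.0])

# Targets that the network already matches give zero loss.
fit_y = [relu_net(xs[0], W, a)]
fit_h = [directional_derivative(xs[0], vs[0], W, a)]
loss_matched = sobolev_loss(W, a, xs, fit_y, vs, fit_h)
```

In this toy case only the first neuron is active at the input point, so both the network value and the directional derivative equal 1/sqrt(2). The result stated in the abstract concerns driving a loss of this kind to zero with gradient flow; the sketch only evaluates the loss and does not implement the training dynamics.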
