Convergence of backpropagation with momentum for network architectures with skip connections

Chirag Agarwal, Joe Klobusicky, and Dan Schonfeld

arXiv:1705.07404 · cs.CV · January 22, 2020 · 1 citation

Open Access

TL;DR

This paper proves convergence of backpropagation with momentum in deep DAG neural networks and demonstrates the effectiveness of such architectures through an autoencoder example.

Contribution

It extends convergence results to deep DAG architectures with skip connections, generalizing previous work on shallow networks.

Findings

1. Weights converge for a large class of nonlinear activations
2. DAG architectures outperform sequential networks in compression tasks
3. Autoencoders with skip connections are effective for data compression

Abstract

We study a class of deep neural networks with architectures that form a directed acyclic graph (DAG). For backpropagation defined by gradient descent with adaptive momentum, we show that the weights converge for a large class of nonlinear activation functions. The proof generalizes the results of Wu et al. (2008), who showed convergence for a feedforward network with one hidden layer. To illustrate the effectiveness of DAG architectures, we describe an example of compression through an autoencoder and compare it against sequential feedforward networks under several metrics.
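To make the setting in the abstract concrete, the following is a minimal sketch of gradient descent with momentum on a tiny network whose graph contains a skip connection. Everything in it is an illustrative assumption and not taken from the paper: the three-weight architecture, the tanh activation, the toy data, the function names, and the hyperparameters `lr` and `mu`. It also uses a fixed momentum coefficient for simplicity, whereas the paper analyzes an adaptive one, so it should be read as a simplified illustration, not the authors' method.

```python
import math

def forward(w, x):
    """Tiny DAG network: x -> tanh hidden unit -> output, plus a skip edge x -> output."""
    w1, w2, w3 = w
    h = math.tanh(w1 * x)      # hidden activation
    y = w2 * h + w3 * x        # w3 * x is the skip connection
    return h, y

def loss_and_grad(w, data):
    """Sum-of-squares loss and its gradient, computed by backpropagation."""
    _, w2, _ = w
    loss = 0.0
    grad = [0.0, 0.0, 0.0]
    for x, t in data:
        h, y = forward(w, x)
        err = y - t
        loss += 0.5 * err * err
        grad[0] += err * w2 * (1.0 - h * h) * x   # path through the hidden unit
        grad[1] += err * h                        # hidden-to-output weight
        grad[2] += err * x                        # skip-connection weight
    return loss, grad

def train(data, w0=(0.1, 0.1, 0.0), lr=0.05, mu=0.5, steps=500):
    """Gradient descent with a FIXED momentum coefficient mu (a simplification)."""
    w = list(w0)
    v = [0.0, 0.0, 0.0]        # previous weight update
    losses = []
    for _ in range(steps):
        loss, grad = loss_and_grad(w, data)
        losses.append(loss)
        v = [mu * vi - lr * gi for vi, gi in zip(v, grad)]
        w = [wi + vi for wi, vi in zip(w, v)]
    return w, losses

# Toy targets that this architecture can represent exactly (hypothetical data).
data = [(k / 4.0, math.tanh(k / 4.0) + 0.5 * (k / 4.0)) for k in range(-4, 5)]
weights, losses = train(data)
print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.6f}")
```

The skip edge appears in two places: as the extra term `w3 * x` in the forward pass and as its own gradient component in the backward pass. This is the kind of structure that separates a DAG architecture from a purely sequential one.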

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications