How far can we go without convolution: Improving fully-connected networks

Zhouhan Lin; Roland Memisevic; Kishore Konda

arXiv:1511.02580·cs.LG·November 10, 2015·31 cites

TL;DR

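This paper explores ways to improve fully connected neural networks. Using linear bottleneck layers and pre-training with autoencoders that have no hidden-unit biases, it reaches approximately 70% accuracy on permutation-invariant CIFAR-10 without convolutional layers, and 78% when deformations are added to the training data.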

Contribution

It identifies two techniques, linear bottleneck layers and unsupervised pre-training with autoencoders that have no hidden-unit biases, that strongly improve the performance of fully connected networks.
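To make the first technique concrete, here is a minimal sketch of one way to read "linear bottleneck layer": a narrow projection with no activation and no bias, placed in front of an ordinary nonlinear layer. The function names, the ReLU activation, the layer sizes, and the random initialization are all illustrative assumptions, not details taken from the paper.

```python
import random

def matvec(W, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(v):
    """Elementwise rectifier (assumed activation, not confirmed by the source)."""
    return [max(0.0, a) for a in v]

def random_matrix(rows, cols, rng, scale=0.1):
    """Small Gaussian weights; the scale is an arbitrary illustrative choice."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def bottleneck_layer(x, W_down, W_up, bias):
    """Project to a narrow linear code, expand, then apply the nonlinearity."""
    code = matvec(W_down, x)      # linear step: no activation, no bias
    pre = matvec(W_up, code)      # expand back to the layer width
    return relu([p + b for p, b in zip(pre, bias)])

def param_counts(n_in, n_out, n_code):
    """Weights in a full n_in x n_out layer vs. the factored version."""
    full = n_in * n_out
    factored = n_in * n_code + n_code * n_out
    return full, factored

# Hypothetical toy sizes, chosen only to keep the example small.
rng = random.Random(0)
n_in, n_code, n_out = 8, 2, 6
W_down = random_matrix(n_code, n_in, rng)
W_up = random_matrix(n_out, n_code, rng)
bias = [0.0] * n_out
x = [rng.gauss(0.0, 1.0) for _ in range(n_in)]

h = bottleneck_layer(x, W_down, W_up, bias)
print(len(h), param_counts(n_in, n_out, n_code))
```

Because the narrow step is linear, the two matrices together act as a low-rank factorization of a single weight matrix, which is why the factored layer needs fewer weights whenever the code is narrow enough.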

Findings

01. Achieved approximately 70% accuracy on the permutation-invariant CIFAR-10 task without convolution.

02. Raised accuracy to 78% by adding deformations to the training data, about 10% short of a decent convolutional network.

03. Related both improvements to better gradient flow and reduced sparsity in the network.

Abstract

We propose ways to improve the performance of fully connected networks. We found that two approaches in particular have a strong effect on performance: linear bottleneck layers and unsupervised pre-training using autoencoders without hidden unit biases. We show how both approaches can be related to improving gradient flow and reducing sparsity in the network. We show that a fully connected network can yield approximately 70% classification accuracy on the permutation-invariant CIFAR-10 task, which is much higher than the current state-of-the-art. By adding deformations to the training data, the fully connected network achieves 78% accuracy, which is just 10% short of a decent convolutional network.
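As an illustration of the second approach, the sketch below shows an autoencoder whose hidden units have no bias term. The ReLU encoder, the tied decoder weights, the visible bias, and the squared-error loss are assumptions made for this example; the abstract specifies only that the hidden units carry no biases.

```python
def matvec(W, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def transpose(W):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*W)]

def encode(x, W):
    """Hidden code with no bias term: h = relu(W x)."""
    return [max(0.0, a) for a in matvec(W, x)]

def decode(h, W, visible_bias):
    """Tied-weight linear reconstruction: x_hat = W^T h + c (assumed form)."""
    return [a + c for a, c in zip(matvec(transpose(W), h), visible_bias)]

def reconstruction_error(x, W, visible_bias):
    """Squared reconstruction error for a single input vector."""
    x_hat = decode(encode(x, W), W, visible_bias)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

# Hypothetical toy weights: 2 hidden units, 3 visible units.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
c = [0.0, 0.0, 0.0]

x = [1.0, 2.0, 3.0]
print(encode(x, W), reconstruction_error(x, W, c))
```

With no hidden bias, the encoder in this sketch maps the zero input to the zero code, and scaling the input by a positive factor scales the code by the same factor.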

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition