Training CNNs with Low-Rank Filters for Efficient Image Classification

Yani Ioannou; Duncan Robertson; Jamie Shotton; Roberto Cipolla; Antonio Criminisi

arXiv:1511.06744·cs.CV·November 30, 2016·ICLR

TL;DR

This paper introduces a method for training CNNs with low-rank filters from scratch, leading to models that are computationally efficient and maintain high accuracy across various datasets and architectures.

Contribution

The authors develop a novel low-rank filter learning approach with a new weight initialization scheme, enabling efficient CNN training from scratch with reduced computation and parameters.

Findings

01  Achieved similar or higher accuracy than conventional CNNs with less compute on the CIFAR, ILSVRC and MIT Places datasets.

02  Reduced parameters and computation by up to 55% in tested architectures.

03  Maintained high accuracy while significantly decreasing model size.
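
To make the source of such savings concrete, the following sketch counts weights and multiply-accumulates for one convolutional layer. The layer sizes (64 input channels, 64 output channels, a 32x32 output map) and the split into 1x3 and 3x1 filters are assumptions chosen for illustration only; they are not taken from the paper, and the printed reduction is not one of its reported results. The helper `conv_cost` is hypothetical.

```python
def conv_cost(kh, kw, c_in, c_out, out_h, out_w):
    """Return (weights, multiply-accumulates) for one conv layer, biases ignored."""
    weights = kh * kw * c_in * c_out
    macs = weights * out_h * out_w  # every weight is used once per output position
    return weights, macs

# Assumed, illustrative layer dimensions (not from the paper).
C_IN, C_OUT, OUT_H, OUT_W = 64, 64, 32, 32

# Conventional layer: 64 full 3x3 filters.
full_params, full_macs = conv_cost(3, 3, C_IN, C_OUT, OUT_H, OUT_W)

# Hypothetical low-rank layer: half the filters are 1x3, half are 3x1.
h_params, h_macs = conv_cost(1, 3, C_IN, C_OUT // 2, OUT_H, OUT_W)
v_params, v_macs = conv_cost(3, 1, C_IN, C_OUT // 2, OUT_H, OUT_W)
lowrank_params = h_params + v_params
lowrank_macs = h_macs + v_macs

reduction = 1.0 - lowrank_params / full_params
print(f"full:     {full_params} weights, {full_macs} MACs")
print(f"low-rank: {lowrank_params} weights, {lowrank_macs} MACs")
print(f"reduction: {reduction:.1%}")
```

Any extra layer used to combine the basis responses would add its own cost to the low-rank total, so the saving for a real architecture depends on how the layers are arranged.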

Abstract

We propose a new method for creating computationally efficient convolutional neural networks (CNNs) by using low-rank representations of convolutional filters. Rather than approximating filters in previously-trained networks with more efficient versions, we learn a set of small basis filters from scratch; during training, the network learns to combine these basis filters into more complex filters that are discriminative for image classification. To train such networks, a novel weight initialization scheme is used. This allows effective initialization of connection weights in convolutional layers composed of groups of differently-shaped filters. We validate our approach by applying it to several existing CNN architectures and training these networks from scratch using the CIFAR, ILSVRC and MIT Places datasets. Our results show similar or higher accuracy than conventional CNNs with much…
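
The idea of combining small basis filters into more complex ones can be illustrated with a minimal NumPy sketch. It assumes one 1x3 and one 3x1 basis filter and two scalar combination weights; these shapes, the weights, and the helper `conv2d_valid` are assumptions made for illustration, not the authors' implementation. The sketch checks that a weighted sum of the two basis responses equals the response of a single cross-shaped 3x3 filter, which follows from the linearity of convolution.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation, written out as loops for clarity."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
horizontal = rng.standard_normal((1, 3))  # assumed 1x3 basis filter
vertical = rng.standard_normal((3, 1))    # assumed 3x1 basis filter
w_h, w_v = 0.7, -1.3                      # hypothetical combination weights

# Basis responses, cropped so both align with the 3x3 'valid' output grid.
resp_h = conv2d_valid(image[1:-1, :], horizontal)
resp_v = conv2d_valid(image[:, 1:-1], vertical)
combined = w_h * resp_h + w_v * resp_v

# The equivalent single 3x3 filter: basis filters placed on the centre row/column.
composite = np.zeros((3, 3))
composite[1, :] += w_h * horizontal[0]
composite[:, 1] += w_v * vertical[:, 0]
direct = conv2d_valid(image, composite)

max_abs_diff = float(np.max(np.abs(direct - combined)))
print("max abs difference:", max_abs_diff)
```

In a trained network the combination weights would be learned rather than fixed, and combinations across many channels and layers can represent filters richer than this single cross shape.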
