A guide to convolution arithmetic for deep learning

Vincent Dumoulin; Francesco Visin

arXiv:1603.07285 · stat.ML · January 12, 2018 · 1.2k cites

TL;DR

This paper is a guide to convolution arithmetic. It explains how input shape, kernel shape, zero padding, and strides determine the output shape of convolutional, pooling, and transposed convolutional layers, and it illustrates each derived relationship.

Contribution

It gives a detailed, intuitive explanation of how convolutional and transposed convolutional layers relate to each other, helping practitioners design and understand CNN architectures.

Findings

01 Derived formulas for convolutional layer output shapes

02 Clarified relationships between convolutional and transposed convolutional layers

03 Provided illustrative examples for intuitive understanding
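
The output-shape relationship named in the first finding can be sketched in a few lines. This is a minimal sketch, assuming a single spatial axis, a square kernel, and the same amount of zero padding on both sides; the function name `conv_output_size` and its argument names are hypothetical and chosen only for illustration. It uses the general relationship o = floor((i + 2p - k) / s) + 1, where i is the input size, k the kernel size, s the stride, and p the zero padding.

```python
def conv_output_size(i: int, k: int, s: int = 1, p: int = 0) -> int:
    """Output size along one axis of a convolution (hypothetical helper).

    i: input size, k: kernel size, s: stride, p: zero padding per side.
    Pooling follows the same relationship with p = 0.
    """
    if k < 1 or s < 1 or p < 0:
        raise ValueError("need k >= 1, s >= 1, p >= 0")
    if i + 2 * p < k:
        raise ValueError("kernel does not fit inside the padded input")
    # Integer division implements the floor in the formula.
    return (i + 2 * p - k) // s + 1


# No padding, unit stride: a 3-wide kernel on a 4-wide input.
print(conv_output_size(4, 3))              # 2
# Half ("same") padding with unit stride preserves the input size.
print(conv_output_size(5, 3, s=1, p=1))    # 5
# Stride 2 with padding 1.
print(conv_output_size(5, 3, s=2, p=1))    # 3
```

For inputs with several spatial axes, the same helper would be applied to each axis independently.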

Abstract

We introduce a guide to help deep learning practitioners understand and manipulate convolutional neural network architectures. The guide clarifies the relationship between various properties (input shape, kernel shape, zero padding, strides and output shape) of convolutional, pooling and transposed convolutional layers, as well as the relationship between convolutional and transposed convolutional layers. Relationships are derived for various cases, and are illustrated in order to make them intuitive.
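
The relationship between a convolution and its transposed counterpart can be illustrated with a round trip over sizes. The following is a hedged sketch under the same single-axis, symmetric-padding assumptions; the function names are hypothetical. It uses the relationship o' = s(i' - 1) + a + k - 2p for the transposed convolution, where a = (i + 2p - k) mod s accounts for input positions that the strided forward convolution never reaches.

```python
def transposed_conv_output_size(i_t: int, k: int, s: int = 1,
                                p: int = 0, a: int = 0) -> int:
    """Output size of the transposed convolution (hypothetical helper).

    i_t: input size of the transposed convolution, i.e. the output size
         of the forward convolution it is paired with.
    a:   extra size that disambiguates the forward input, 0 <= a < s.
    """
    if not 0 <= a < s:
        raise ValueError("need 0 <= a < s")
    return s * (i_t - 1) + a + k - 2 * p


def round_trip(i: int, k: int, s: int, p: int) -> int:
    """Apply the forward size formula, then map the size back."""
    o = (i + 2 * p - k) // s + 1      # forward convolution output size
    a = (i + 2 * p - k) % s           # positions dropped by the floor
    return transposed_conv_output_size(o, k, s, p, a)


# Inputs of size 5 and 6 both give a forward output of size 3
# (k=3, s=2, p=1); the value of `a` recovers the original size.
print(round_trip(5, k=3, s=2, p=1))   # 5
print(round_trip(6, k=3, s=2, p=1))   # 6
```

The sketch only tracks sizes: a transposed convolution restores the shape of the forward input, not its values.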

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image and Signal Denoising Methods · Advanced Image Processing Techniques