A guide to convolution arithmetic for deep learning
Vincent Dumoulin, Francesco Visin

TL;DR
This paper is a guide to convolution arithmetic. It explains how input shape, kernel size, zero padding, and strides determine the output shape of convolutional, pooling, and transposed convolutional layers, with derivations and illustrations for each case.
Contribution
It gives a detailed, intuitive account of how convolutional layers and transposed convolutional layers relate to each other, which helps practitioners design and understand CNN architectures.
Findings
Derived formulas for convolutional layer output shapes
Clarified relationships between convolutional and transposed convolutional layers
Provided illustrative examples for intuitive understanding
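As a minimal sketch of the kind of output-shape relationship the guide derives, the snippet below computes the output size along one axis from the input size i, kernel size k, stride s, and zero padding p, using the standard relation o = floor((i + 2p - k) / s) + 1. The function names are illustrative assumptions, not identifiers from the paper.

```python
def conv_output_size(i: int, k: int, s: int = 1, p: int = 0) -> int:
    """Output size along one axis: floor((i + 2p - k) / s) + 1.

    i: input size, k: kernel size, s: stride, p: zero padding per side.
    The name and signature are hypothetical, chosen for this sketch.
    """
    if s < 1:
        raise ValueError("stride must be at least 1")
    if i + 2 * p < k:
        raise ValueError("kernel does not fit in the padded input")
    return (i + 2 * p - k) // s + 1


def pool_output_size(i: int, k: int, s: int = 1) -> int:
    """Pooling follows the same relation with no zero padding."""
    return conv_output_size(i, k, s, p=0)


# No padding, unit stride: the output shrinks by k - 1.
print(conv_output_size(4, 3))            # 2
# Half ("same") padding with odd k: p = k // 2 keeps the size.
print(conv_output_size(5, 3, p=1))       # 5
# Full padding: p = k - 1 grows the output by k - 1.
print(conv_output_size(5, 3, p=2))       # 7
# Stride 2: inputs of size 5 and 6 map to the same output size.
print(conv_output_size(5, 3, s=2, p=1))  # 3
print(conv_output_size(6, 3, s=2, p=1))  # 3
```

The last two calls show why strided convolutions are not uniquely invertible in shape: the floor discards the remainder, so more than one input size can produce the same output size.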
Abstract
We introduce a guide to help deep learning practitioners understand and manipulate convolutional neural network architectures. The guide clarifies the relationship between various properties (input shape, kernel shape, zero padding, strides and output shape) of convolutional, pooling and transposed convolutional layers, as well as the relationship between convolutional and transposed convolutional layers. Relationships are derived for various cases, and are illustrated in order to make them intuitive.
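The relationship between a convolution and its transposed counterpart can be sketched in terms of shapes alone. The snippet below assumes the standard relation o' = s(i' - 1) + a + k - 2p for a transposed convolution, where a = (i + 2p - k) mod s is the remainder that the forward floor discards. The function names are hypothetical and the code is a shape check only, not an implementation of either layer.

```python
def transposed_conv_output_size(i_t: int, k: int, s: int = 1,
                                p: int = 0, a: int = 0) -> int:
    """Output size of a transposed convolution along one axis.

    i_t: input size of the transposed layer (the forward layer's output),
    k, s, p: kernel size, stride, and padding of the forward convolution,
    a: extra size added on one side to resolve the stride ambiguity.
    Names are assumptions made for this sketch.
    """
    return s * (i_t - 1) + a + k - 2 * p


def round_trip(i: int, k: int, s: int = 1, p: int = 0) -> int:
    """Forward shape followed by the transposed shape; should return i."""
    o = (i + 2 * p - k) // s + 1      # forward output size
    a = (i + 2 * p - k) % s           # remainder dropped by the floor
    return transposed_conv_output_size(o, k, s, p, a)


# Transpose of a 3-wide, unit-stride, unpadded convolution on a size-4 input:
print(transposed_conv_output_size(2, 3))                 # 4
# Stride 2, padding 1: a = 1 recovers the even input size 6.
print(transposed_conv_output_size(3, 3, s=2, p=1, a=1))  # 6
print(round_trip(6, 3, s=2, p=1))                        # 6
```

With a left at 0, the second call would return 5 instead of 6, which is the shape ambiguity that the extra term resolves.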
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image and Signal Denoising Methods · Advanced Image Processing Techniques
