SIRe-Networks: Convolutional Neural Networks Architectural Extension for Information Preservation via Skip/Residual Connections and Interlaced Auto-Encoders
Danilo Avola, Luigi Cinque, Alessio Fagioli, Gian Luca Foresti

TL;DR
This paper introduces SIRe, an interlaced multi-task learning strategy that enhances CNN architectures by preserving input information and reducing vanishing gradients, leading to improved performance across multiple datasets.
Contribution
The paper proposes SIRe, a strategy that combines interlaced auto-encoders with skip/residual connections to improve CNN training and classification accuracy while mitigating vanishing gradients.
Findings
SIRe improves CNN performance on multiple datasets
Enhanced architectures outperform baseline models
Method effectively reduces vanishing gradient issues
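To make the vanishing gradient point concrete, here is a toy illustration, not the paper's architecture. It assumes a chain of scalar sigmoid layers (the names `input_gradient`, `weight`, and `x` are hypothetical) and compares the gradient of the output with respect to the input, with and without an identity skip connection around each layer.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(depth, weight=1.0, x=0.5, residual=False):
    """Return d(output)/d(input) for a chain of `depth` scalar sigmoid layers.

    Toy model only: each layer is sigmoid(weight * h); with residual=True
    the layer becomes h + sigmoid(weight * h).
    """
    grad = 1.0
    h = x
    for _ in range(depth):
        s = sigmoid(weight * h)
        local = weight * s * (1.0 - s)  # derivative of sigmoid(weight * h) w.r.t. h
        if residual:
            grad *= 1.0 + local  # identity path contributes the constant 1
            h = h + s
        else:
            grad *= local  # each factor is at most 0.25 when weight == 1
            h = s
    return grad


plain = input_gradient(depth=30)
skip = input_gradient(depth=30, residual=True)
print(f"plain chain:    {plain:.3e}")
print(f"residual chain: {skip:.3e}")
```

In the plain chain every factor is at most 0.25, so the product shrinks geometrically with depth. With the skip connection every factor is at least 1, so the gradient reaching the input does not decay.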
Abstract
Improving existing neural network architectures can involve several design choices such as manipulating the loss functions, employing a diverse learning strategy, exploiting gradient evolution at training time, optimizing the network hyper-parameters, or increasing the architecture depth. The latter approach is a straightforward solution, since it directly enhances the representation capabilities of a network; however, the increased depth generally incurs the well-known vanishing gradient problem. In this paper, borrowing from different methods addressing this issue, we introduce an interlaced multi-task learning strategy, named SIRe, to reduce the vanishing gradient in relation to the object classification task. The presented methodology directly improves a convolutional neural network (CNN) by preserving information from the input image through interlaced auto-encoders (AEs), and…
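The abstract is truncated here, so the exact SIRe objective is not shown. One plausible reading of "interlaced multi-task learning" is a classification loss combined with reconstruction losses from the interlaced AEs. The sketch below assumes that form; the function names, the mean-squared-error reconstruction term, and the `ae_weight` factor are all hypothetical and may differ from the paper.

```python
import math


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one sample, computed in a numerically stable way."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target_index]


def mse(reconstruction, original):
    """Mean squared error between two equal-length flat pixel lists."""
    return sum((r - o) ** 2 for r, o in zip(reconstruction, original)) / len(original)


def multitask_loss(logits, target_index, reconstructions, image, ae_weight=0.1):
    """Classification loss plus a weighted sum of per-AE reconstruction losses.

    ae_weight is an assumed hyper-parameter, not a value from the paper.
    """
    cls = cross_entropy(logits, target_index)
    rec = sum(mse(r, image) for r in reconstructions)
    return cls + ae_weight * rec


# Toy data: a 4-pixel "image" and the outputs of two hypothetical interlaced AEs.
image = [0.0, 0.5, 1.0, 0.5]
reconstructions = [[0.1, 0.4, 0.9, 0.5], [0.0, 0.5, 1.0, 0.5]]
logits = [2.0, 0.5, -1.0]

loss = multitask_loss(logits, 0, reconstructions, image)
print(f"combined loss: {loss:.4f}")
```

Under this assumed form, every AE that reconstructs the input poorly adds to the total loss, which pushes intermediate features to retain information about the input image alongside the classification signal.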
