MaskConnect: Connectivity Learning by Gradient Descent
Karim Ahmed, Lorenzo Torresani

TL;DR
MaskConnect learns the connectivity between modules of a deep network during training, by gradient descent, instead of relying on predefined connectivity rules. The learned connectivity improves accuracy and, in some settings, parameter efficiency.
Contribution
This work presents a novel algorithm for learning module connectivity in deep networks jointly with weight optimization, removing reliance on manually designed connectivity rules.
Findings
Achieves higher accuracy than traditional connectivity rules.
Reduces the number of parameters in certain settings.
Demonstrates effectiveness on multiple datasets with ResNet and ResNeXt.
Abstract
Although deep networks have recently emerged as the model of choice for many computer vision problems, in order to yield good results they often require time-consuming architecture search. To combat the complexity of design choices, prior work has adopted the principle of modularized design which consists in defining the network in terms of a composition of topologically identical or similar building blocks (a.k.a. modules). This reduces architecture search to the problem of determining the number of modules to compose and how to connect such modules. Again, for reasons of design complexity and training cost, previous approaches have relied on simple rules of connectivity, e.g., connecting each module to only the immediately preceding module or perhaps to all of the previous ones. Such simple connectivity rules are unlikely to yield the optimal architecture for the given problem. In…
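The two fixed rules named in the abstract (connect each module only to its immediate predecessor, or to all previous modules) can be written as binary masks over module outputs, which makes the alternative of a learned mask easier to picture. The sketch below is illustrative only: the abstract is truncated before it describes the method, so the function names, the mask layout, and in particular the `top_k_mask` binarization of real-valued scores are assumptions, not the authors' implementation. Index 0 is assumed to stand for the network input.

```python
def chain_mask(n):
    """Fixed rule 1: module j reads only from module j - 1."""
    return [[1 if k == j - 1 else 0 for k in range(n)] for j in range(n)]


def dense_mask(n):
    """Fixed rule 2: module j reads from every earlier module k < j."""
    return [[1 if k < j else 0 for k in range(n)] for j in range(n)]


def top_k_mask(scores, k):
    """Hypothetical learned rule: module j keeps its k highest-scoring
    predecessors, where scores[j][i] is a real-valued, trainable score
    for the connection i -> j (an assumption made for illustration)."""
    n = len(scores)
    mask = [[0] * n for _ in range(n)]
    for j in range(n):
        # Only earlier modules (i < j) are valid inputs to module j.
        candidates = sorted(range(j), key=lambda i: scores[j][i], reverse=True)
        for i in candidates[:k]:
            mask[j][i] = 1
    return mask


def module_input(outputs, mask_row):
    """Input to one module: sum of the predecessor outputs its mask selects."""
    return sum(m * o for m, o in zip(mask_row, outputs))


if __name__ == "__main__":
    # Made-up scores for a 4-node toy network (not taken from the paper).
    scores = [
        [0.0, 0.0, 0.0, 0.0],
        [0.2, 0.0, 0.0, 0.0],
        [0.9, 0.1, 0.0, 0.0],
        [0.3, 0.8, 0.5, 0.0],
    ]
    for name, mask in [("chain", chain_mask(4)),
                       ("dense", dense_mask(4)),
                       ("top-2", top_k_mask(scores, 2))]:
        print(name, mask)
```

In this framing, the fixed rules are two particular choices of mask, and learning connectivity amounts to treating the scores as parameters updated alongside the network weights. How the paper actually parameterizes and optimizes the masks is not recoverable from the truncated abstract.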
Taxonomy
Methods: Average Pooling · ResNeXt Block · Grouped Convolution · ResNeXt · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block
