Disentangling Neural Architectures and Weights: A Case Study in Supervised Classification
Nicolo Colombo, Yang Gao

TL;DR
This paper studies how to separate the role of a neural architecture from that of its weights in supervised classification. It shows that well-trained architectures can perform well without link-specific fine-tuning of the weights, and it introduces a spectral method for comparing network structures together with an optimization method for finding binary architectures.
Contribution
The paper introduces a new method for optimizing binary, weight-agnostic neural architectures, and provides theoretical guarantees and an empirical evaluation for disentangling structure from weights.
Findings
Weight-free architectures can achieve competitive performance.
A novel spectral method measures structural similarities.
The proposed optimization converges with theoretical guarantees.
Abstract
The history of deep learning has shown that human-designed problem-specific networks can greatly improve the classification performance of general neural models. In most practical cases, however, choosing the optimal architecture for a given task remains a challenging problem. Recent architecture-search methods are able to automatically build neural models with strong performance but fail to fully appreciate the interaction between neural architecture and weights. This work investigates the problem of disentangling the role of the neural structure and its edge weights, by showing that well-trained architectures may not need any link-specific fine-tuning of the weights. We compare the performance of such weight-free networks (in our case these are binary networks with {0, 1}-valued weights) with random, weight-agnostic, pruned and standard fully connected networks. To find the optimal…
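To make the notion of a weight-free network concrete, the following is a minimal sketch of a forward pass in which every weight is either 0 (no link) or 1 (link present), so that the architecture alone determines the output. The layer sizes, the ReLU activation, the example masks, and all function names (`binary_layer`, `forward`, `predict`) are illustrative assumptions and are not taken from the paper.

```python
# Hypothetical sketch: forward pass of a network with {0, 1}-valued weights.
# Layer sizes, the ReLU activation, and the mask values are illustrative
# assumptions, not details from the paper.

def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def binary_layer(x, mask):
    """Apply a {0, 1}-valued weight matrix to the input vector x.

    Each output unit simply sums the inputs it is linked to, so the
    layer is fully described by which links exist.
    """
    for row in mask:
        if any(w not in (0, 1) for w in row):
            raise ValueError("mask entries must be 0 or 1")
    return [sum(w * xi for w, xi in zip(row, x)) for row in mask]


def forward(x, masks):
    """Run x through all layers; ReLU after every layer except the last."""
    h = x
    for mask in masks[:-1]:
        h = relu(binary_layer(h, mask))
    return binary_layer(h, masks[-1])


# Example architecture: 3 inputs -> 2 hidden units -> 2 class scores.
MASKS = [
    [[1, 0, 1],
     [0, 1, 0]],
    [[1, 0],
     [1, 1]],
]


def predict(x, masks=MASKS):
    """Return the index of the largest class score."""
    scores = forward(x, masks)
    return max(range(len(scores)), key=scores.__getitem__)
```

In this sketch, changing a mask entry from 0 to 1 adds a link rather than tuning a weight, which is the sense in which the structure carries all of the model's capacity.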
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
