TL;DR
The paper introduces SWAP-Score, a novel zero-shot metric applicable to both CNNs and Transformers, which accurately predicts neural network performance across vision and language tasks, improving efficiency in neural architecture search.
Contribution
Proposes SWAP-Score, a universal zero-shot metric that outperforms existing metrics in predicting network performance across multiple domains and architectures.
Findings
SWAP-Score achieves a Spearman's correlation of 0.93 with CIFAR-10 accuracy for DARTS CNNs.
SWAP-Score attains a correlation of 0.71 with GLUE tasks for FlexiBERT Transformers.
SWAP-NAS reduces neural architecture search time to approximately 6 minutes of GPU time on CIFAR-10 and approximately 9 minutes on ImageNet.
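The correlation figures above are Spearman rank correlations between a proxy score and measured accuracy. The following is a minimal sketch of how such a number is computed, not the authors' evaluation code. The function names and the five score/accuracy pairs are hypothetical and were chosen only for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(scores, accuracies):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(scores) != len(accuracies):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(scores), average_ranks(accuracies))


# Hypothetical data: one zero-shot score and one accuracy per architecture.
proxy_scores = [12.1, 15.4, 9.8, 20.3, 17.6]
test_accuracy = [91.2, 93.0, 90.1, 94.5, 92.7]

rho = spearman(proxy_scores, test_accuracy)
print(f"Spearman's rho = {rho:.2f}")
```

Because only ranks matter, a proxy does not need to predict accuracy values themselves; it only needs to order architectures the same way trained accuracy does, which is what a search procedure relies on.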
Abstract
Zero-shot proxies, also known as training-free metrics, are widely adopted to reduce the computational overhead in neural network evaluation for scenarios such as Neural Architecture Search (NAS), as they do not require any training. Existing zero-shot metrics have several limitations, including weak correlation with the true performance and poor generalisation across different networks or downstream tasks. For example, most of these metrics apply only to either convolutional neural networks (CNNs) or Transformers, but not both. To address these limitations, we propose Sample-Wise Activation Patterns (SWAP), and its derivative, SWAP-Score, a novel and highly effective zero-shot metric. SWAP-Score is broadly applicable across both architecture families and task domains, demonstrating strong predictive performance in the majority of tasks. This metric measures the expressivity of neural…
