Unleashing the power of Neural Collapse for Transferability Estimation

Yuhe Ding; Bo Jiang; Lijun Sheng; Aihua Zheng; Jian Liang

arXiv:2310.05754·cs.LG·October 10, 2023

Open Access 3 Reviews

TL;DR

This paper introduces FaCe, a transferability estimation method based on Neural Collapse, which effectively predicts the suitability of pre-trained models for various downstream tasks by measuring class separation, fairness, and neural collapse.

Contribution

The paper proposes FaCe, a novel neural collapse-based metric for transferability estimation, demonstrating state-of-the-art performance across multiple tasks and architectures.

Findings

01. FaCe correlates strongly with transfer performance.

02. FaCe outperforms existing transferability metrics.

03. The method generalizes across tasks and architectures.

Abstract

Transferability estimation aims to provide heuristics for quantifying how suitable pre-trained models are for a specific downstream task, without fine-tuning them all. Prior studies have revealed that well-trained models exhibit the phenomenon of Neural Collapse. Based on a widely used neural collapse metric in existing literature, we observe a strong correlation between the neural collapse of pre-trained models and their corresponding fine-tuned models. Inspired by this observation, we propose a novel method termed Fair Collapse (FaCe) for transferability estimation by comprehensively measuring the degree of neural collapse in the pre-trained model. Specifically, FaCe comprises two different terms: the variance collapse term, which assesses the class separation and within-class compactness, and the class fairness term, which quantifies the fairness of the pre-trained model towards each…
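To make the variance collapse idea concrete, here is a minimal sketch. It assumes the "widely used neural collapse metric" is the within-class to between-class variability ratio $\mathrm{tr}(\Sigma_W \Sigma_B^{\dagger})/C$ (often called NC1); the abstract does not state the formula, so the function name `variance_collapse`, the normalization, and the pseudo-inverse tolerance are illustrative assumptions, not the authors' implementation. The class fairness term is not sketched because its definition is not given here.

```python
import numpy as np


def variance_collapse(features, labels):
    """NC1-style score tr(Sigma_W @ pinv(Sigma_B)) / C (assumed form).

    features: (N, D) array of penultimate-layer features.
    labels:   (N,) array of integer class labels.
    Lower values mean tighter classes relative to their separation.
    """
    classes = np.unique(labels)
    num_classes = len(classes)
    dim = features.shape[1]
    global_mean = features.mean(axis=0)

    sigma_w = np.zeros((dim, dim))  # within-class covariance
    sigma_b = np.zeros((dim, dim))  # between-class covariance
    for c in classes:
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        centered = class_feats - class_mean
        sigma_w += centered.T @ centered
        diff = (class_mean - global_mean)[:, None]
        sigma_b += diff @ diff.T
    sigma_w /= features.shape[0]
    sigma_b /= num_classes

    # Sigma_B has rank at most C - 1, so a pseudo-inverse is needed.
    # The cutoff 1e-10 is an illustrative choice to drop numerical noise.
    sigma_b_pinv = np.linalg.pinv(sigma_b, rcond=1e-10)
    return float(np.trace(sigma_w @ sigma_b_pinv) / num_classes)


# Usage on synthetic features: three classes with tight or loose clusters.
rng = np.random.default_rng(0)
centers = np.array([[5.0, 0, 0, 0], [0, 5.0, 0, 0], [0, 0, 5.0, 0]])
demo_labels = np.repeat(np.arange(3), 50)
tight = centers[demo_labels] + 0.05 * rng.standard_normal((150, 4))
loose = centers[demo_labels] + 2.0 * rng.standard_normal((150, 4))
print(variance_collapse(tight, demo_labels), variance_collapse(loose, demo_labels))
```

Under this assumed form, candidate models would be ranked by computing the score on target-task features extracted from each pre-trained backbone, with lower scores suggesting stronger collapse.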

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 6 (marginally above the acceptance threshold) · Confidence 2

Strengths

- Motivation is clear and well supported by method design.
- Paper is well organized and easy to understand.
- The performance of FaCe looks promising.

Weaknesses

- Between-class covariance, $\Sigma_B$, is defined as the distance between the class avg and the global avg, which does not align with the definition of between-class, a distance between classes.
- Class Fairness term might be related to Variance Collapse term. Class Fairness term uses intra-class and inter-class covariance, which are also used in Variance Collapse term. But, the relationship between the two terms is not studied enough.
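As background for the first point, the two views of $\Sigma_B$ can be related by a standard identity. The following is a sketch under assumptions: the $1/C$ normalization and taking the global mean $\mu_G$ as the unweighted mean of the $C$ class means $\mu_c$ (which coincides with the sample mean when classes are balanced) are not confirmed by the text here.

$$
\Sigma_B \;=\; \frac{1}{C}\sum_{c=1}^{C}(\mu_c-\mu_G)(\mu_c-\mu_G)^{\top}
\;=\; \frac{1}{2C^{2}}\sum_{c=1}^{C}\sum_{c'=1}^{C}(\mu_c-\mu_{c'})(\mu_c-\mu_{c'})^{\top},
\qquad \mu_G=\frac{1}{C}\sum_{c=1}^{C}\mu_c .
$$

The second equality follows by writing $\mu_c-\mu_{c'}=(\mu_c-\mu_G)-(\mu_{c'}-\mu_G)$ and using $\sum_c(\mu_c-\mu_G)=0$, so the cross terms vanish. Under these assumptions, the class-to-global-mean form is a scaled sum over pairwise class differences.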

Reviewer 02 · Rating 5 (marginally below the acceptance threshold) · Confidence 5

Strengths

- The authors explore the impact of Neural Collapse (NC) in the "pre-training then fine-tuning" paradigm and observe that the ranking of NC in the pre-trained models remains mostly consistent during the fine-tuning process.
- The authors employ a metric Fair Collapse (FaCe) to estimate the transferability of pre-trained models.

Weaknesses

- The first term of NC is common. This idea is commonly used in various classification tasks.
- The class fairness score F is used to make every class distribution have a similar overlap with the distributions of other classes. This is quite similar to making the distances between these distributions equal.
- The fine-tuning hyperparameters of different pre-trained models may have a significant impact, and the phenomena observed in the paper might lack persuasiveness.
- The paper lacks experiments

Reviewer 03 · Rating 6 (marginally above the acceptance threshold) · Confidence 3

Strengths

1. The paper addresses a significant challenge in transfer learning: the selection of the optimal pre-trained model for a designated downstream task.
2. The correlation between neural collapse in pre-trained models and their subsequent fine-tuned versions is intriguing and serves as the foundation for the method proposed.
3. The introduced FaCe method is innovative and incorporates both class separation and class fairness, potentially preventing biases during model selection.
4. The array of experim

Weaknesses

1. The introduction contains repetitive statements regarding transferability estimation and its objectives. For clarity and brevity, such redundancy should be avoided.
2. The meaning of the variable 't' in Equation (6) is not defined in the surrounding context.
3. When calculating the Variance Collapse, the authors assign equal weight to each category. However, during performance evaluation, categories with larger sample sizes play a more significant role. To ensure FaCe genuinely represents the po

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Explainable Artificial Intelligence (XAI)