Redesigning the classification layer by randomizing the class representation vectors
Gabi Shalev, Gal-Lev Shalev, Joseph Keshet

TL;DR
This paper proposes drawing the class vectors of a neural image classifier's classification layer at random and keeping them fixed during training. Doing so removes the visual similarities that learned class vectors would otherwise encode, which increases class separability and improves accuracy while maintaining robustness.
Contribution
The paper introduces a simple yet effective method of fixing randomly drawn class vectors during training, challenging the conventional design of learned class representations.
Findings
Fixing class vectors increases inter-class separability.
The method improves overall model accuracy.
It maintains both robustness to image corruptions and generalization.
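One intuition behind the separability finding is that vectors drawn at random in a high-dimensional space are nearly orthogonal to one another. The sketch below illustrates that general fact only; it is not the paper's measurement. The sizes (100 classes, 512 dimensions), the Gaussian sampling, and the helper name max_abs_offdiag_cosine are assumptions made for illustration.

```python
import numpy as np

def max_abs_offdiag_cosine(vectors):
    """Largest absolute cosine similarity between any two distinct rows."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    cos = unit @ unit.T
    np.fill_diagonal(cos, 0.0)  # ignore each vector's similarity with itself
    return float(np.abs(cos).max())

rng = np.random.default_rng(0)
num_classes, dim = 100, 512  # hypothetical sizes, not taken from the paper
class_vectors = rng.standard_normal((num_classes, dim))

worst = max_abs_offdiag_cosine(class_vectors)
print(f"max |cosine| between distinct class vectors: {worst:.3f}")
```

In this setting even the most similar pair of random class vectors stays far from collinear, whereas learned class vectors are free to drift toward each other for visually similar classes.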
Abstract
Neural image classification models typically consist of two components. The first is an image encoder, which is responsible for encoding a given raw image into a representative vector. The second is the classification component, which is often implemented by projecting the representative vector onto target class vectors. The target class vectors, along with the rest of the model parameters, are estimated so as to minimize the loss function. In this paper, we analyze how simple design choices for the classification layer affect the learning dynamics. We show that the standard cross-entropy training implicitly captures visual similarities between different classes, which might deteriorate accuracy or even prevent some models from converging. We propose to draw the class vectors randomly and set them as fixed during training, thus invalidating the visual similarities encoded in these…
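The two-component design described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the "encoder" is a single trainable linear map, the data is synthetic, and the sizes, learning rate, unit-length normalization of the class vectors, and all variable names are hypothetical. The point it shows is that logits come from projecting the encoded vector onto class vectors that are drawn once and never updated, so cross-entropy gradients reach only the encoder.

```python
import numpy as np

rng = np.random.default_rng(0)
n, in_dim, embed_dim, num_classes = 200, 20, 64, 5  # hypothetical sizes

# Toy data: labels come from a random linear teacher (illustrative only).
x = rng.standard_normal((n, in_dim))
y = np.argmax(x @ rng.standard_normal((in_dim, num_classes)), axis=1)

# Class vectors: drawn once at random, normalized (an assumption), then frozen.
class_vectors = rng.standard_normal((num_classes, embed_dim))
class_vectors /= np.linalg.norm(class_vectors, axis=1, keepdims=True)
frozen_copy = class_vectors.copy()

# Stand-in encoder: one trainable linear map from inputs to embeddings.
w_enc = np.zeros((in_dim, embed_dim))

def cross_entropy_and_grad(w):
    """Mean cross-entropy and its gradient with respect to the encoder only."""
    z = x @ w                                    # encoded vectors
    logits = z @ class_vectors.T                 # projection onto class vectors
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(n), y]).mean()
    d_logits = probs.copy()
    d_logits[np.arange(n), y] -= 1.0
    d_logits /= n
    grad_w = x.T @ (d_logits @ class_vectors)    # no gradient for class_vectors
    return loss, grad_w

initial_loss, _ = cross_entropy_and_grad(w_enc)
for _ in range(200):
    _, grad = cross_entropy_and_grad(w_enc)
    w_enc -= 0.1 * grad                          # only the encoder is updated
final_loss, _ = cross_entropy_and_grad(w_enc)
print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}")
```

In the conventional design, class_vectors would receive its own gradient update in the loop; here it is deliberately left out of the update step.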
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
