How Does Fine-tuning Affect the Geometry of Embedding Space: A Case Study on Isotropy
Sara Rajaee, Mohammad Taher Pilehvar

TL;DR
This paper investigates how fine-tuning changes the geometric structure of the embedding space. It finds that fine-tuning often increases the number of elongated directions and alters local structures, which poses a challenge for existing isotropy-enhancement methods.
Contribution
It provides a detailed analysis of the structural changes in embedding space after fine-tuning, highlighting limitations of current isotropy enhancement techniques.
Findings
Fine-tuning does not necessarily improve isotropy.
Local structures in embeddings change significantly after fine-tuning.
The number of elongated directions grows after fine-tuning, and these directions carry essential linguistic information.
Abstract
It is widely accepted that fine-tuning pre-trained language models usually brings about performance improvements in downstream tasks. However, there are limited studies on the reasons behind this effectiveness, particularly from the viewpoint of structural changes in the embedding space. Trying to fill this gap, in this paper, we analyze the extent to which the isotropy of the embedding space changes after fine-tuning. We demonstrate that, even though isotropy is a desirable geometrical property, fine-tuning does not necessarily result in isotropy enhancements. Moreover, local structures in pre-trained contextual word representations (CWRs), such as those encoding token types or frequency, undergo a massive change during fine-tuning. Our experiments show dramatic growth in the number of elongated directions in the embedding space, which, in contrast to pre-trained CWRs, carry the…
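To make the two geometric notions in the abstract concrete, here is a minimal Python sketch. It assumes isotropy is scored with the partition-function ratio often used in the isotropy literature, min Z(u) / max Z(u) with Z(u) = sum_i exp(u^T w_i) and u ranging over the principal directions of the embedding matrix. The listing itself does not state which metric the paper uses. The helper `count_elongated_directions` and its `factor` threshold are hypothetical, a simple heuristic for illustration rather than the authors' procedure.

```python
import numpy as np


def isotropy_score(embeddings):
    """Partition-function isotropy in (0, 1]; values near 1 mean more isotropic.

    Assumed metric: min_u Z(u) / max_u Z(u), where Z(u) = sum_i exp(u . w_i)
    and u ranges over the right singular vectors of the embedding matrix.
    """
    W = np.asarray(embeddings, dtype=np.float64)
    # Right singular vectors of W are the eigenvectors of W^T W.
    _, _, vt = np.linalg.svd(W, full_matrices=False)
    logits = W @ vt.T  # shape: (num_vectors, num_directions)
    # Compute log Z(u) with a log-sum-exp shift for numerical stability.
    shift = logits.max(axis=0)
    log_z = shift + np.log(np.exp(logits - shift).sum(axis=0))
    return float(np.exp(log_z.min() - log_z.max()))


def count_elongated_directions(embeddings, factor=5.0):
    """Hypothetical heuristic: count principal directions whose variance
    exceeds `factor` times the median variance across directions."""
    W = np.asarray(embeddings, dtype=np.float64)
    variances = np.linalg.eigvalsh(np.cov(W, rowvar=False))
    return int((variances > factor * np.median(variances)).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    isotropic = rng.standard_normal((2000, 16))
    stretched = isotropic.copy()
    stretched[:, 0] *= 10.0  # stretch one axis to create an elongated direction
    print(isotropy_score(isotropic), count_elongated_directions(isotropic))
    print(isotropy_score(stretched), count_elongated_directions(stretched))
```

In practice the same two functions could be applied to contextual word representations extracted from a model before and after fine-tuning, so that the two sets of scores can be compared directly.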
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
