Layer-Wise Evolution of Representations in Fine-Tuned Transformers: Insights from Sparse AutoEncoders
Suneel Nadipalli

TL;DR
This paper investigates how representations in fine-tuned BERT models evolve across layers, revealing a progression from general to task-specific features through activation-similarity analysis and Sparse AutoEncoder (SAE) experiments.
Contribution
The paper traces how representations change layer by layer during fine-tuning, combining activation-similarity analysis, Sparse AutoEncoders (SAEs) trained on BERT activations, and token-level activation visualizations to characterize how the model adapts.
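A layer-wise activation comparison of the kind described above can be sketched as follows. This is a minimal illustration, not the paper's code: the summary does not say which similarity measure is used, so cosine similarity is an assumption here, and the function names (`cosine_similarity`, `layerwise_similarity`) and the toy activations are hypothetical. In practice the inputs would be hidden states extracted from the base and fine-tuned models for the same tokens.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def layerwise_similarity(base_layers, tuned_layers):
    """Mean per-token cosine similarity for each layer.

    Each argument is a list over layers; each layer is a list of token
    activation vectors. Both models must see the same tokens.
    """
    scores = []
    for base_tokens, tuned_tokens in zip(base_layers, tuned_layers):
        sims = [cosine_similarity(b, t) for b, t in zip(base_tokens, tuned_tokens)]
        scores.append(sum(sims) / len(sims))
    return scores


# Toy activations (assumed, for illustration): 2 layers, 2 tokens, 3 dimensions.
base = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
]
tuned = [
    [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]],  # same directions as the base model
    [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]],  # first token rotated to an orthogonal direction
]

scores = layerwise_similarity(base, tuned)
print(scores)
```

Under this measure, a score near 1 for a layer would suggest fine-tuning left its representations largely unchanged, while lower scores would indicate layers that moved away from the pre-trained model.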
Findings
Early layers retain general representations
Middle layers transition between general and task-specific features
Later layers fully specialize in task adaptation
Abstract
Fine-tuning pre-trained transformers is a powerful technique for enhancing the performance of base models on specific tasks. From early applications in models like BERT to fine-tuning Large Language Models (LLMs), this approach has been instrumental in adapting general-purpose architectures for specialized downstream tasks. Understanding the fine-tuning process is crucial for uncovering how transformers adapt to specific objectives, retain general representations, and acquire task-specific features. This paper explores the underlying mechanisms of fine-tuning, specifically in the BERT transformer, by analyzing activation similarity, training Sparse AutoEncoders (SAEs), and visualizing token-level activations across different layers. Based on experiments conducted across multiple datasets and BERT layers, we observe a steady progression in how features adapt to the task at hand: early…
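For readers unfamiliar with Sparse AutoEncoders, the sketch below shows one common formulation: a ReLU encoder, a linear decoder, and a loss that adds an L1 sparsity penalty on the hidden code to the reconstruction error. The abstract does not give the paper's architecture or hyperparameters, so everything here (the function names, the weight shapes, the `l1_coeff` value, the toy weights) is an assumption chosen for illustration.

```python
def relu(x):
    """Rectified linear unit for a single float."""
    return x if x > 0.0 else 0.0


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sae_loss(x, W_enc, b_enc, W_dec, b_dec, l1_coeff):
    """Forward pass and loss of a one-hidden-layer sparse autoencoder.

    Returns (loss, hidden_code, reconstruction), where
    loss = mean squared reconstruction error + l1_coeff * sum(|hidden_code|).
    """
    hidden = [relu(z + b) for z, b in zip(matvec(W_enc, x), b_enc)]
    x_hat = [z + b for z, b in zip(matvec(W_dec, hidden), b_dec)]
    recon = sum((a - r) ** 2 for a, r in zip(x, x_hat)) / len(x)
    sparsity = l1_coeff * sum(abs(h) for h in hidden)
    return recon + sparsity, hidden, x_hat


# Toy example (assumed values): a 2-dimensional activation and 3 hidden features.
x = [1.0, -2.0]
W_enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 x 2 encoder weights
b_enc = [0.0, 0.0, 0.0]
W_dec = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]    # 2 x 3 decoder weights
b_dec = [0.0, 0.0]

loss, hidden, x_hat = sae_loss(x, W_enc, b_enc, W_dec, b_dec, l1_coeff=0.1)
print(loss, hidden, x_hat)
```

In an analysis like the one the abstract describes, a separate autoencoder of this kind would presumably be trained on the activations of each layer, and the learned sparse features compared across layers; training code is omitted here.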
Taxonomy
Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Parallel Computing and Optimization Techniques
Methods: Attention Is All You Need · Adam · Softmax · Dropout · Weight Decay · Linear Layer · Layer Normalization · WordPiece · Dense Connections
