Layer-Wise Evolution of Representations in Fine-Tuned Transformers: Insights from Sparse AutoEncoders

Suneel Nadipalli

arXiv:2502.16722·cs.CL·February 25, 2025

TL;DR

This paper investigates how representations in fine-tuned BERT transformers evolve across layers, revealing a progression from general to task-specific features through activation analysis and autoencoder experiments.

Contribution

The paper characterizes how representations change layer by layer during fine-tuning, using activation similarity and Sparse AutoEncoder analysis to explain how the model adapts to a task.

Findings

01 Early layers retain general representations

02 Middle layers transition between general and task-specific features

03 Later layers fully specialize in task adaptation
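
A layer-wise trend like the one listed above is what an activation-similarity comparison between a base model and its fine-tuned counterpart would surface. The following is a minimal sketch, assuming cosine similarity as the metric (this page does not state which metric the paper uses) and hand-made toy vectors standing in for real BERT activations. The function names and all numbers are hypothetical and are not results from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def layerwise_similarity(base_acts, tuned_acts):
    """Mean per-token cosine similarity for each layer.

    base_acts[layer][token] and tuned_acts[layer][token] are activation vectors
    for the same input, taken from the base and fine-tuned model respectively.
    """
    per_layer = []
    for base_layer, tuned_layer in zip(base_acts, tuned_acts):
        sims = [cosine_similarity(b, t) for b, t in zip(base_layer, tuned_layer)]
        per_layer.append(sum(sims) / len(sims))
    return per_layer


# Toy data (illustrative only): 3 layers, 2 tokens, 2-dimensional activations.
base = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
tuned = [
    [[1.0, 0.0], [0.0, 1.0]],  # unchanged from the base model
    [[1.0, 1.0], [1.0, 1.0]],  # partially rotated away
    [[0.0, 1.0], [1.0, 0.0]],  # orthogonal to the base model
]

similarities = layerwise_similarity(base, tuned)
for layer, sim in enumerate(similarities):
    print(f"layer {layer}: mean cosine similarity = {sim:.4f}")
```

In this toy setup the similarity drops from the first layer to the last, mirroring the general-to-specific ordering described in the findings. With a real model, the vectors would come from the hidden states of each layer for the same input text.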

Abstract

Fine-tuning pre-trained transformers is a powerful technique for enhancing the performance of base models on specific tasks. From early applications in models like BERT to fine-tuning Large Language Models (LLMs), this approach has been instrumental in adapting general-purpose architectures for specialized downstream tasks. Understanding the fine-tuning process is crucial for uncovering how transformers adapt to specific objectives, retain general representations, and acquire task-specific features. This paper explores the underlying mechanisms of fine-tuning, specifically in the BERT transformer, by analyzing activation similarity, training Sparse AutoEncoders (SAEs), and visualizing token-level activations across different layers. Based on experiments conducted across multiple datasets and BERT layers, we observe a steady progression in how features adapt to the task at hand: early…
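
The abstract does not specify the Sparse AutoEncoder architecture or loss. One common formulation, assumed here purely for illustration, encodes an activation vector with a ReLU layer into a wider hidden code, decodes it linearly, and scores the result with reconstruction error plus an L1 penalty that pushes most hidden units to zero. The weights, dimensions, and the `l1_coeff` value below are hypothetical.

```python
def relu(x):
    """Rectified linear unit for a single float."""
    return x if x > 0.0 else 0.0


def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    """One forward pass of a single-hidden-layer sparse autoencoder.

    w_enc has shape (hidden_dim, input_dim); w_dec has shape (input_dim, hidden_dim).
    Returns the sparse hidden code and the reconstruction of x.
    """
    hidden = [
        relu(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_enc, b_enc)
    ]
    recon = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_dec, b_dec)
    ]
    return hidden, recon


def sae_loss(x, hidden, recon, l1_coeff):
    """Mean squared reconstruction error plus an L1 sparsity penalty on the code."""
    mse = sum((r - xi) ** 2 for r, xi in zip(recon, x)) / len(x)
    sparsity = sum(abs(h) for h in hidden)
    return mse + l1_coeff * sparsity


# Toy example (assumed values): 2-dimensional input, 3 hidden units.
x = [2.0, 3.0]
w_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b_enc = [0.0, 0.0, 0.0]
w_dec = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b_dec = [0.0, 0.0]

hidden, recon = sae_forward(x, w_enc, b_enc, w_dec, b_dec)
loss = sae_loss(x, hidden, recon, l1_coeff=0.1)
print("hidden code:", hidden)
print("reconstruction:", recon)
print("loss:", loss)
```

Here the third hidden unit is inactive for this input, which is the kind of sparsity the L1 term encourages. In practice such a model would be trained by gradient descent on activations collected from a chosen BERT layer, typically with a deep learning framework rather than plain Python lists.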

Taxonomy

Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Parallel Computing and Optimization Techniques

Methods: Attention Is All You Need · Adam · Softmax · Dropout · Weight Decay · Linear Layer · Layer Normalization · WordPiece · Dense Connections