Echo-LoRA: Parameter-Efficient Fine-Tuning via Cross-Layer Representation Injection
Yihang Peng, Peng Jin, Jie Gong, Xingyuan Chen, Lingjiao Xu, Ning Su, and Yan Ran

TL;DR
Echo-LoRA introduces a cross-layer representation injection technique for parameter-efficient fine-tuning of large language models, significantly improving performance on commonsense reasoning benchmarks without increasing inference costs.
Contribution
The paper proposes a cross-layer injection method that enhances LoRA-style fine-tuning by using representations from deeper layers during training.
Findings
Echo-LoRA outperforms LoRA baselines by 5.7 percentage points on average.
The method achieves a 3.0 point average gain under reproduced baselines.
Combining Echo-LoRA with DoRA yields an additional 2.7 point improvement.
Abstract
Parameter-efficient fine-tuning (PEFT) has become a practical route for adapting large language models to downstream tasks, with LoRA-style methods being particularly attractive because they are inexpensive to train and easy to deploy. Most LoRA variants, however, revise the update rule within the weight space of each layer and leave the intermediate representations formed by deeper layers largely unused. We propose Echo-LoRA, a cross-layer representation injection method for parameter-efficient fine-tuning. During training, Echo-LoRA collects boundary hidden states from deeper source layers, aggregates them into a sample-level echo representation, and uses lightweight projection and gating networks to inject the resulting signal into shallow LoRA or DoRA modules. Answer-only masking, masked distillation, and stochastic routing are used to keep this auxiliary path stable and to reduce…
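To make the mechanism in the abstract concrete, here is a minimal NumPy sketch of one possible reading: hidden states from a deeper layer are pooled over answer tokens into a per-sample echo vector, which is then projected, gated, and added to a shallow LoRA module's low-rank activations. Everything specific here is an assumption for illustration rather than the authors' implementation: the names `masked_mean` and `echo_lora_forward`, the matrices `P` and `G`, the sigmoid gate, and the choice to inject in the rank-`r` space. Masked distillation and stochastic routing are not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)

def masked_mean(hidden, mask):
    """Pool token states into one vector per sample, using only masked positions.

    hidden: (batch, seq, d_model); mask: (batch, seq) with 1 on answer tokens.
    """
    weights = mask[..., None].astype(hidden.dtype)
    return (hidden * weights).sum(axis=1) / np.maximum(weights.sum(axis=1), 1.0)

def echo_lora_forward(x, W, A, B, echo=None, P=None, G=None, scale=1.0):
    """Frozen linear layer plus a LoRA update, optionally shifted by an echo signal.

    x: (batch, seq, d_in); W: (d_in, d_out); A: (d_in, r); B: (r, d_out)
    echo: (batch, d_model) or None; P: (d_model, r); G: (d_model, 1)
    P and G are hypothetical projection and gating weights.
    """
    base = x @ W                      # frozen pretrained path
    low_rank = x @ A                  # (batch, seq, r)
    if echo is not None:
        gate = 1.0 / (1.0 + np.exp(-(echo @ G)))   # (batch, 1), sigmoid gate (assumed)
        inject = gate * (echo @ P)                 # (batch, r)
        low_rank = low_rank + inject[:, None, :]   # same shift for every token
    return base + scale * (low_rank @ B)

batch, seq, d_model, r = 2, 5, 8, 2
x = rng.normal(size=(batch, seq, d_model))
deep_hidden = rng.normal(size=(batch, seq, d_model))  # stand-in for a deeper layer's states
answer_mask = np.array([[0, 0, 1, 1, 1],
                        [0, 1, 1, 0, 0]])

W = rng.normal(size=(d_model, d_model))
A = rng.normal(size=(d_model, r)) * 0.1
B = rng.normal(size=(r, d_model)) * 0.1
P = rng.normal(size=(d_model, r)) * 0.1
G = rng.normal(size=(d_model, 1)) * 0.1

echo = masked_mean(deep_hidden, answer_mask)
train_out = echo_lora_forward(x, W, A, B, echo, P, G)  # training: echo path active
infer_out = echo_lora_forward(x, W, A, B)              # inference: plain LoRA
```

In this sketch the echo path is simply switched off at inference, so the deployed layer reduces to an ordinary LoRA forward pass. That matches the summary's claim of no added inference cost, though how the paper itself achieves this is not stated in the visible part of the abstract.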
