PARA: Parameter-Efficient Fine-tuning with Prompt Aware Representation Adjustment
Zequan Liu, Yi Zhao, Ming Tan, Wei Zhu, Aaron Xuxiang Tian

TL;DR
PARA is a new parameter-efficient fine-tuning method that uses prompt-aware representation adjustment, outperforming existing methods like LoRA in efficiency and performance for multi-tenant transformer applications.
Contribution
Introduces PARA, a lightweight PEFT technique with prompt-aware vector generators, improving efficiency and performance in single-backbone multi-tenant scenarios.
Findings
Outperforms current PEFT baselines.
More efficient than LoRA in multi-tenant settings.
Effective across diverse tasks.
Abstract
In the realm of parameter-efficient fine-tuning (PEFT) methods, while options like LoRA are available, there is a persistent demand in the industry for a PEFT approach that excels in both efficiency and performance within the context of single-backbone multi-tenant applications. This paper introduces a new and straightforward PEFT technique, termed Prompt Aware Representation Adjustment (PARA). The core of our proposal is to integrate a lightweight vector generator within each Transformer layer. This generator produces vectors that are responsive to input prompts, thereby adjusting the hidden representations accordingly. Our extensive experimentation across diverse tasks has yielded promising results. Firstly, the PARA method has been shown to surpass current PEFT benchmarks in terms of performance, despite having a similar number of…
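The mechanism the abstract describes (a small per-layer generator that turns prompt hidden states into a vector, which then adjusts the layer's hidden representations) can be illustrated with a minimal NumPy sketch. The abstract does not specify the generator's architecture or how the vector is applied, so everything below is an assumption made for illustration: the mean pooling over prompt tokens, the tanh bottleneck, the elementwise scaling, and the names `init_generator`, `generate_vector`, and `adjust`. It should not be read as the authors' implementation.

```python
import numpy as np


def init_generator(d_model, bottleneck, rng):
    """Hypothetical per-layer generator: a small bottleneck MLP (assumed design)."""
    return {
        "w_down": rng.normal(0.0, 0.02, size=(d_model, bottleneck)),
        # Zero-initialised up-projection, so the initial adjustment is the identity.
        "w_up": np.zeros((bottleneck, d_model)),
    }


def generate_vector(params, prompt_hidden):
    """Map prompt hidden states (prompt_len, d_model) to one vector (d_model,)."""
    pooled = prompt_hidden.mean(axis=0)       # assumed pooling: mean over prompt tokens
    z = np.tanh(pooled @ params["w_down"])    # assumed activation: tanh
    return 1.0 + z @ params["w_up"]           # offset by 1 so zero weights leave inputs unchanged


def adjust(hidden, vector):
    """Scale every token's hidden state elementwise by the generated vector (assumed)."""
    return hidden * vector                    # broadcasts (seq_len, d_model) * (d_model,)


rng = np.random.default_rng(0)
d_model, bottleneck = 8, 2
params = init_generator(d_model, bottleneck, rng)

hidden = rng.normal(size=(5, d_model))        # hidden states for a 5-token input
prompt_hidden = hidden[:3]                    # assume the first 3 tokens form the prompt
adjusted = adjust(hidden, generate_vector(params, prompt_hidden))
```

In this sketch only the two small generator matrices would be trained per tenant, while the backbone weights stay shared and frozen. Because the vector is computed from the prompt, different prompts yield different adjustments without changing any backbone weight matrix.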
Taxonomy
Topics: Embedded Systems Design Techniques · Parallel Computing and Optimization Techniques · Interconnection Networks and Systems
