Efficient Attribute Injection for Pretrained Language Models

Reinald Kim Amplayo; Kang Min Yoo; Sang-Woo Lee

arXiv:2109.07953·cs.CL·September 17, 2021

TL;DR

This paper introduces a lightweight, memory-efficient method for injecting metadata attributes (e.g., user and product IDs) into pretrained language models. It extends adapters to take attributes into account and uses low-rank approximations and hypercomplex multiplications to limit the growth in parameters.

Contribution

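The paper proposes an adapter-based attribute injection approach in which attributes are included either independently of or jointly with the text. Low-rank approximations and hypercomplex multiplications keep the parameter count manageable when the attribute vocabulary is large.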

Findings

01 Outperforms previous attribute injection methods

02 Achieves state-of-the-art results on multiple datasets

03 Reduces parameter increase with low-rank and hypercomplex approximations

Abstract

Metadata attributes (e.g., user and product IDs from reviews) can be incorporated as additional inputs to neural-based NLP models, by modifying the architecture of the models, in order to improve their performance. Recent models, however, rely on pretrained language models (PLMs), where previously used techniques for attribute injection are either nontrivial or ineffective. In this paper, we propose a lightweight and memory-efficient method to inject attributes into PLMs. We extend adapters, i.e., tiny plug-in feed-forward modules, to include attributes either independently of or jointly with the text. To limit the increase of parameters, especially when the attribute vocabulary is large, we use low-rank approximations and hypercomplex multiplications, significantly decreasing the total parameters. We also introduce training mechanisms to handle domains in which attributes can be multi-labeled…
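To make the parameter argument in the abstract concrete, the sketch below counts weights for three ways of giving every attribute value its own projection matrix: a dense matrix, a low-rank factorization, and a sum of Kronecker products as used in parameterized hypercomplex multiplication layers. This is a back-of-the-envelope illustration, not the paper's exact parameterization, which this summary does not spell out. The function names, the 768-to-64 adapter shape, the vocabulary of 10,000 attribute values, the rank, and the hypercomplex dimension `n` are all assumptions chosen for illustration.

```python
def full_params(vocab_size: int, d_in: int, d_out: int) -> int:
    """Naive baseline: one dense d_in x d_out matrix per attribute value."""
    return vocab_size * d_in * d_out


def low_rank_params(vocab_size: int, d_in: int, d_out: int, rank: int) -> int:
    """Each matrix is factored as (d_in x rank) times (rank x d_out)."""
    return vocab_size * rank * (d_in + d_out)


def phm_params(vocab_size: int, d_in: int, d_out: int, n: int) -> int:
    """Each matrix is a sum of n Kronecker products kron(A_i, B_i),
    with A_i of shape n x n and B_i of shape (d_in/n) x (d_out/n)."""
    if d_in % n != 0 or d_out % n != 0:
        raise ValueError("n must divide both d_in and d_out")
    # n matrices A_i contribute n**3 weights; n matrices B_i contribute d_in*d_out/n.
    return vocab_size * (n ** 3 + (d_in * d_out) // n)


if __name__ == "__main__":
    # Hypothetical setting: 10,000 attribute values, adapter down-projection 768 -> 64.
    vocab, d_in, d_out = 10_000, 768, 64
    print("dense:   ", full_params(vocab, d_in, d_out))
    print("low-rank:", low_rank_params(vocab, d_in, d_out, rank=4))
    print("PHM:     ", phm_params(vocab, d_in, d_out, n=4))
```

Under these assumed sizes, the dense variant needs 491,520,000 weights, the rank-4 factorization 33,280,000, and the hypercomplex variant with n = 4 needs 123,520,000. The gap widens as the attribute vocabulary grows, which is the regime the abstract singles out.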

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis