Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators

Peiyu Liu; Ze-Feng Gao; Wayne Xin Zhao; Z.Y. Xie; Zhong-Yi Lu; Ji-Rong Wen

arXiv:2106.02205·cs.LG·June 7, 2021·1 cites

TL;DR

This paper introduces a novel method for compressing pre-trained language models using matrix product operators from quantum physics, enabling significant parameter reduction during fine-tuning.

Contribution

It proposes a new MPO-based decomposition and fine-tuning strategy that reduces the number of parameters needed to adapt pre-trained models.
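
To make the parameter saving concrete, the sketch below counts the parameters held by each tensor in an MPO chain and reports the share that sits outside the central tensor. Everything in it is an assumption for illustration and is not taken from the paper or its repository: the function names, the factorization of a 768 x 768 matrix into (4, 4, 3, 4, 4) on each side, and the rule of treating the largest tensor as the central one. The bond dimensions are the untruncated upper bounds, so the resulting fraction is a toy figure, not the reported result.

```python
from math import prod


def bond_dims(in_shape, out_shape, max_bond=None):
    """Upper bounds on the bond dimensions d_0..d_n of an MPO chain.

    d_k is limited by the product of local dimensions on either side
    of the cut; max_bond is a hypothetical truncation knob.
    """
    local = [i * j for i, j in zip(in_shape, out_shape)]
    dims = [1]
    for k in range(1, len(local)):
        d = min(prod(local[:k]), prod(local[k:]))
        if max_bond is not None:
            d = min(d, max_bond)
        dims.append(d)
    dims.append(1)
    return dims


def tensor_param_counts(in_shape, out_shape, max_bond=None):
    """Parameters per tensor; tensor k has shape (d_k, i_k, j_k, d_{k+1})."""
    d = bond_dims(in_shape, out_shape, max_bond)
    return [
        d[k] * in_shape[k] * out_shape[k] * d[k + 1]
        for k in range(len(in_shape))
    ]


def auxiliary_fraction(counts):
    """Share of parameters outside the largest (assumed central) tensor."""
    central = max(counts)
    return (sum(counts) - central) / sum(counts)


# Hypothetical factorization of a 768 x 768 weight matrix.
shape = (4, 4, 3, 4, 4)
counts = tensor_param_counts(shape, shape)
print("parameters per tensor:", counts)
print("fraction updated if only auxiliary tensors are trained:",
      round(auxiliary_fraction(counts), 4))
```

Passing a smaller `max_bond` shrinks every tensor, which is the compression side of the trade-off; freezing the central tensor is the fine-tuning side.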

Findings

01 Achieves a 91% average reduction in fine-tuning parameters.

02 Applies to both the original and already-compressed models.

03 Largely maintains model performance after compression.

Abstract

This paper presents a novel pre-trained language model (PLM) compression approach based on the matrix product operator (MPO) from quantum many-body physics. It decomposes an original matrix into central tensors (containing the core information) and auxiliary tensors (with only a small proportion of parameters). With the decomposed MPO structure, we propose a novel fine-tuning strategy that updates only the parameters of the auxiliary tensors, and design an optimization algorithm for MPO-based approximation over stacked network architectures. Our approach can be applied to the original or the compressed PLMs in a general way, which yields a lighter network and significantly reduces the number of parameters to be fine-tuned. Extensive experiments have demonstrated the effectiveness of the proposed approach in model compression, especially the reduction in fine-tuning parameters (91%…
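
The decomposition described in the abstract can be sketched with NumPy as a sequence of SVDs over a reshaped weight matrix. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `max_bond` truncation parameter, and the 64 x 64 example with factors (4, 4, 4) are all hypothetical. In this example the middle tensor plays the role of the central tensor and the two outer tensors are the auxiliary ones, so a fine-tuning scheme in the spirit of the abstract would update only the outer two.

```python
import numpy as np


def mpo_decompose(W, in_shape, out_shape, max_bond=None):
    """Split matrix W into a chain of 4-way tensors by sequential SVD.

    Tensor k has shape (d_{k-1}, in_shape[k], out_shape[k], d_k).
    max_bond (hypothetical knob) truncates each bond dimension.
    """
    n = len(in_shape)
    assert W.shape == (int(np.prod(in_shape)), int(np.prod(out_shape)))
    # Reshape to (i1..in, j1..jn), then interleave to (i1, j1, ..., in, jn).
    T = W.reshape(*in_shape, *out_shape)
    T = T.transpose([ax for k in range(n) for ax in (k, n + k)])
    tensors, bond, rest = [], 1, T.reshape(1, -1)
    for k in range(n - 1):
        # Rows: left bond plus current local indices; columns: the rest.
        rest = rest.reshape(bond * in_shape[k] * out_shape[k], -1)
        U, S, Vt = np.linalg.svd(rest, full_matrices=False)
        new_bond = len(S) if max_bond is None else min(len(S), max_bond)
        tensors.append(
            U[:, :new_bond].reshape(bond, in_shape[k], out_shape[k], new_bond)
        )
        # Carry the singular values into the remainder of the chain.
        rest = S[:new_bond, None] * Vt[:new_bond]
        bond = new_bond
    tensors.append(rest.reshape(bond, in_shape[-1], out_shape[-1], 1))
    return tensors


def mpo_reconstruct(tensors, in_shape, out_shape):
    """Contract the chain back into a 2-D matrix."""
    n = len(tensors)
    out = tensors[0]
    for t in tensors[1:]:
        # Contract the shared bond index between neighbouring tensors.
        out = np.tensordot(out, t, axes=([out.ndim - 1], [0]))
    # Drop the two dummy boundary bonds: shape becomes (i1, j1, ..., in, jn).
    out = out.reshape([d for pair in zip(in_shape, out_shape) for d in pair])
    # Undo the interleaving: all input indices first, then all output indices.
    out = out.transpose(list(range(0, 2 * n, 2)) + list(range(1, 2 * n, 2)))
    return out.reshape(int(np.prod(in_shape)), int(np.prod(out_shape)))


rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
tensors = mpo_decompose(W, (4, 4, 4), (4, 4, 4))
W_back = mpo_reconstruct(tensors, (4, 4, 4), (4, 4, 4))
print("tensor shapes:", [t.shape for t in tensors])
print("exact without truncation:", np.allclose(W, W_back))
```

Without truncation the chain reproduces the matrix exactly; setting `max_bond` trades reconstruction accuracy for fewer parameters.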

Code & Models

Repositories

RUCAIBox/MPOP (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Tensor decomposition and applications · Quantum many-body systems