Reweighted Proximal Pruning for Large-Scale Language Representation

Fu-Ming Guo; Sijia Liu; Finlay S. Mungall; Xue Lin; Yanzhi Wang

arXiv:1909.12486 · cs.LG · December 24, 2019 · 45 cites

TL;DR

This paper introduces Reweighted Proximal Pruning (RPP), a novel method for compressing large-scale language models like BERT, maintaining high accuracy across tasks and enabling deployment on diverse devices.

Contribution

The paper presents RPP, a new pruning technique tailored for large language models, demonstrating effective compression while preserving performance on multiple NLP benchmarks.
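This page does not spell out the steps of the algorithm. As a rough illustration of what the name suggests, and assuming (this is not stated here) a reweighted L1 penalty applied through a soft-thresholding proximal step, a minimal sketch might look like the following. The function names, the `eps` smoothing term, the hyperparameter values, and the toy weights are all hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def soft_threshold(w, thresh):
    """Proximal operator of thresh * |w|: shrink w toward zero by thresh."""
    if w > thresh:
        return w - thresh
    if w < -thresh:
        return w + thresh
    return 0.0


def reweighted_proximal_step(weights, lam, lr, eps=1e-3):
    """One illustrative pruning step over a flat list of weights.

    Assumption: each weight gets its own penalty coefficient
    1 / (|w| + eps), so small weights are penalized heavily and snapped
    to exactly zero, while large weights are only slightly shrunk.
    """
    alphas = [1.0 / (abs(w) + eps) for w in weights]
    return [soft_threshold(w, lr * lam * a) for w, a in zip(weights, alphas)]


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Toy example (hypothetical values, not taken from the paper).
weights = [0.8, -0.02, 0.001, -0.5, 0.03]
pruned = reweighted_proximal_step(weights, lam=0.01, lr=0.1)
print(pruned, sparsity(pruned))
```

In a real training loop a step like this would typically follow the ordinary gradient update of the loss, so the sparsity penalty is handled separately from the task objective; that ordering is also an assumption of this sketch.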

Findings

01 · RPP maintains high accuracy at high prune ratios.

02 · Pruned BERT performs well on SQuAD and GLUE tasks.

03 · Enables deployment of large models on various devices.

Abstract

Recently, pre-trained language representations such as BERT have become the mainstay of the natural language understanding community. These pre-trained language representations achieve state-of-the-art results on a wide range of downstream tasks. As performance keeps improving, the size and complexity of these pre-trained neural models continue to increase rapidly. Is it possible to compress these large-scale language representation models? How will a pruned language representation affect the downstream multi-task transfer learning objectives? In this paper, we propose Reweighted Proximal Pruning (RPP), a new pruning method specifically designed for large-scale language representation models. Through experiments on SQuAD and the GLUE benchmark suite, we show that proximal pruned BERT keeps high accuracy for both the pre-training task and the downstream…

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques

Methods: Pruning · Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece