ElitePLM: An Empirical Study on General Language Ability Evaluation of Pretrained Language Models

Junyi Li; Tianyi Tang; Zheng Gong; Lixin Yang; Zhuohao Yu; Zhipeng Chen; Jingyuan Wang; Wayne Xin Zhao; Ji-Rong Wen

arXiv:2205.01523·cs.CL·May 4, 2022

TL;DR

This paper conducts a comprehensive empirical evaluation of ten popular pretrained language models across four key language ability dimensions, providing insights into their strengths, limitations, and transferability for NLP tasks.

Contribution

It introduces a large-scale, systematic evaluation framework for assessing the general language abilities of various PLMs across multiple dimensions.

Findings

01. PLMs excel in different ability tests based on their training objectives.

02. Fine-tuning sensitivity varies with data size and distribution.

03. PLMs show strong transferability between similar tasks.

Abstract

Pretrained language models (PLMs) now dominate the majority of NLP tasks. However, little research has been conducted on systematically evaluating the language abilities of PLMs. In this paper, we present a large-scale empirical study on general language ability evaluation of PLMs (ElitePLM). In our study, we design four evaluation dimensions, i.e., memory, comprehension, reasoning, and composition, to measure ten widely used PLMs within five categories. Our empirical results demonstrate that: (1) PLMs with varying training objectives and strategies are good at different ability tests; (2) fine-tuning PLMs on downstream tasks is usually sensitive to the data size and distribution; (3) PLMs have excellent transferability between similar tasks. Moreover, the prediction results of PLMs in our experiments are released as an open resource for deeper and more detailed analysis on the…

Code & Models

Repositories

rucaibox/eliteplm (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification