Intrinsic Knowledge Evaluation on Chinese Language Models

Zhiruo Wang; Renfen Hu

arXiv:2011.14277 · cs.CL · December 1, 2020 · 1 cites

TL;DR

This paper introduces a knowledge evaluation benchmark for Chinese language models that covers syntactic, semantic, commonsense, and factual knowledge, showing which kinds of knowledge the models encode and how well.

Contribution

It presents four new evaluation tasks and a dataset of 39,308 questions to assess how Chinese language models encode different kinds of knowledge, addressing a gap left by evaluations that focus on downstream performance.

Findings

01

Proposed a reliable benchmark for Chinese LM knowledge evaluation

02

Demonstrated models' strengths and weaknesses across knowledge types

03

Provided publicly available dataset for future research

Abstract

Recent NLP tasks have benefited greatly from pre-trained language models (LMs), which can encode knowledge of various kinds. However, current LM evaluations focus on downstream performance and therefore do not comprehensively examine which kinds of knowledge the models have encoded, or to what extent. This paper addresses both questions by proposing four tasks on syntactic, semantic, commonsense, and factual knowledge, totaling 39,308 questions that cover both linguistic and world knowledge in Chinese. In our experiments, the probes and knowledge data prove to be a reliable benchmark for evaluating pre-trained Chinese LMs. Our work is publicly available at https://github.com/ZhiruoWang/ChnEval.

Code & Models

Repositories

ZhiruoWang/ChnEval
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications