CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark
Ningyu Zhang, Mosha Chen, Zhen Bi, Xiaozhuan Liang, Lei Li, Xin Shang, Kangping Yin, Chuanqi Tan, Jian Xu, Fei Huang, Luo Si, Yuan Ni, Guotong Xie, Zhifang Sui, Baobao Chang, Hui Zong, Zheng Yuan, Linfeng Li, Jun Yan, Hongying Zan, Kunli Zhang, Buzhou Tang, Qingcai Chen

TL;DR
CBLUE is the first comprehensive Chinese biomedical language understanding benchmark. It provides a diverse set of tasks and an online evaluation platform to advance Chinese medical NLP research, and its results show that current models lag behind human performance.
Contribution
This paper introduces CBLUE, a new Chinese biomedical NLP benchmark with multiple tasks and an evaluation platform, filling a gap in non-English biomedical AI research.
Findings
Current models perform significantly worse than humans on CBLUE tasks.
The benchmark covers diverse biomedical NLP tasks, including named entity recognition (NER), information extraction, clinical diagnosis normalization, and single-sentence and sentence-pair classification.
Empirical results highlight the need for improved Chinese biomedical language models.
Abstract
Artificial Intelligence (AI), along with the recent progress in biomedical language understanding, is gradually changing medical practice. With the development of biomedical language understanding benchmarks, AI applications are widely used in the medical field. However, most benchmarks are limited to English, which makes it challenging to replicate many of the successes in English for other languages. To facilitate research in this direction, we collect real-world biomedical data and present the first Chinese Biomedical Language Understanding Evaluation (CBLUE) benchmark: a collection of natural language understanding tasks including named entity recognition, information extraction, clinical diagnosis normalization, single-sentence/sentence-pair classification, and an associated online platform for model evaluation, comparison, and analysis. To establish evaluation on these tasks, we…
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques
Methods: ALBERT · RoBERTa · BERT
