Overview of the PromptCBLUE Shared Task in CHIP2023
Wei Zhu, Xiaoling Wang, Mosha Chen, Buzhou Tang

TL;DR
This paper gives an overview of the PromptCBLUE shared task at CHIP-2023, which evaluates large language models on Chinese medical NLP through two tracks, prompt tuning and in-context learning, and summarizes the evaluation methods and top-performing systems.
Contribution
It introduces a new benchmark reformulation of CBLUE for Chinese medical NLP and details the shared task setup, datasets, evaluation metrics, and top-performing approaches.
Findings
Top teams achieved high test results
Effective prompt tuning strategies were identified
Open-source LLMs showed promising in-context learning capabilities
Abstract
This paper presents an overview of the PromptCBLUE shared task (http://cips-chip.org.cn/2023/eval1) held at the CHIP-2023 conference. The shared task reformulates the CBLUE benchmark and provides a testbed for Chinese open-domain or medical-domain large language models (LLMs) on general medical natural language processing. Two tracks were held: (a) a prompt tuning track, investigating the multitask prompt tuning of LLMs, and (b) an in-context learning track, probing the in-context learning capabilities of open-source LLMs. Many teams from both industry and academia participated, and the top teams achieved strong test results. This paper describes the tasks, the datasets, the evaluation metrics, and the top systems for both tracks. Finally, it summarizes the techniques and evaluation results of the approaches explored by the participating teams.
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare · Natural Language Processing Techniques
