CORECODE: A Common Sense Annotated Dialogue Dataset with Benchmark Tasks for Chinese Large Language Models
Dan Shi, Chaobin You, Jiantao Huang, Taihao Li, Deyi Xiong

TL;DR
CORECODE is a comprehensive Chinese dialogue dataset with annotated commonsense knowledge designed to evaluate and improve large language models' reasoning and conflict detection capabilities in everyday conversations.
Contribution
The paper introduces CORECODE, a large, annotated dialogue dataset with benchmark tasks for evaluating commonsense reasoning in Chinese LLMs, including a standardized annotation scheme and diverse reasoning tasks.
Findings
Existing Chinese LLMs perform poorly on CORECODE tasks.
ChatGPT achieves only 0.275 accuracy on domain identification in the zero-shot setting.
The dataset facilitates future research in commonsense reasoning for LLMs.
Abstract
As an indispensable ingredient of intelligence, commonsense reasoning is crucial for large language models (LLMs) in real-world scenarios. In this paper, we propose CORECODE, a dataset that contains abundant commonsense knowledge manually annotated on dyadic dialogues, to evaluate the commonsense reasoning and commonsense conflict detection capabilities of Chinese LLMs. We categorize commonsense knowledge in everyday conversations into three dimensions: entity, event, and social interaction. For easy and consistent annotation, we standardize the form of commonsense knowledge annotation in open-domain dialogues as "domain: slot = value". A total of 9 domains and 37 slots are defined to capture diverse commonsense knowledge. With these pre-defined domains and slots, we collect 76,787 commonsense knowledge annotations from 19,700 dialogues through crowdsourcing. To evaluate and enhance the…
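The "domain: slot = value" form described in the abstract lends itself to a simple structured record. The following is a minimal sketch, not the authors' tooling: the `Annotation` class, the `parse_annotation` and `format_annotation` helpers, and the example domain and slot names ("event", "cause") are all hypothetical, and the exact serialization used in the released dataset may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One commonsense annotation (hypothetical container, not from the paper)."""
    domain: str
    slot: str
    value: str


def parse_annotation(text: str) -> Annotation:
    """Parse a string of the assumed form 'domain: slot = value'."""
    # Split on the first ':' so that colons inside the value are preserved.
    domain, sep, rest = text.partition(":")
    if not sep:
        raise ValueError(f"missing ':' in annotation: {text!r}")
    # Split on the first '=' so that '=' inside the value is preserved.
    slot, sep, value = rest.partition("=")
    if not sep:
        raise ValueError(f"missing '=' in annotation: {text!r}")
    domain, slot, value = domain.strip(), slot.strip(), value.strip()
    if not (domain and slot and value):
        raise ValueError(f"empty field in annotation: {text!r}")
    return Annotation(domain, slot, value)


def format_annotation(annotation: Annotation) -> str:
    """Serialize an Annotation back to 'domain: slot = value'."""
    return f"{annotation.domain}: {annotation.slot} = {annotation.value}"


if __name__ == "__main__":
    # Illustrative input only; the domain and slot names are made up.
    example = parse_annotation("event: cause = it was raining")
    print(example)
    print(format_annotation(example))
```

Splitting on the first ":" and the first "=" keeps free-text values intact even when they contain those characters, which matters if values are open-ended phrases rather than fixed labels.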
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
