Fostering Natural Conversation in Large Language Models with NICO: a Natural Interactive COnversation dataset
Renliang Sun, Mengyuan Liu, Shiping Yang, Rui Wang, Junqing He, Jiaxing Zhang

TL;DR
This paper introduces NICO, a Chinese dataset of natural dialogues created to improve the human-likeness of responses generated by large language models, especially in conversational and social contexts.
Contribution
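The paper presents NICO, a new dataset of natural Chinese dialogues drafted with GPT-4-turbo and revised by human workers, and evaluates open-source and closed-source LLMs on four natural conversation tasks: two at the dialogue level and two at the sentence level (identifying and rewriting unnatural sentences).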
Findings
The NICO dataset covers 20 daily-life topics and 5 types of social interaction.
LLMs face significant challenges in generating natural, colloquial responses.
NICO helps improve LLMs' ability to produce human-like dialogues.
Abstract
Benefiting from diverse instruction datasets, contemporary Large Language Models (LLMs) perform effectively as AI assistants in collaborating with humans. However, LLMs still struggle to generate natural and colloquial responses in real-world applications such as chatbots and psychological counseling that require more human-like interactions. To address these limitations, we introduce NICO, a Natural Interactive COnversation dataset in Chinese. We first use GPT-4-turbo to generate dialogue drafts and make them cover 20 daily-life topics and 5 types of social interactions. Then, we hire workers to revise these dialogues to ensure that they are free of grammatical errors and unnatural utterances. We define two dialogue-level natural conversation tasks and two sentence-level tasks for identifying and rewriting unnatural sentences. Multiple open-source and closed-source LLMs are tested and…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
