Baichuan2-Sum: Instruction Finetune Baichuan2-7B Model for Dialogue Summarization

Jianfei Xiao; Yancan Chen; Yimin Ou; Hanyi Yu; Kai Shu; Yiyong Xiao

arXiv:2401.15496 · cs.CL · April 5, 2024 · 1 cites

Open Access

TL;DR

This paper introduces Baichuan2-Sum, an instruction fine-tuned large language model designed for role-oriented dialogue summarization, achieving state-of-the-art results on public datasets.

Contribution

The paper presents a novel instruction fine-tuning approach for large models in dialogue summarization, incorporating role-specific instructions and NEFTune noise techniques.
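
To make the idea of role-specific instructions concrete, here is a minimal Python sketch of how one dialogue could be paired with a different instruction for each target role. The role names, the instruction wording, and the `build_prompt` helper are all illustrative assumptions; this page does not show the paper's actual templates.

```python
# Hypothetical instruction templates, one per summary role.
# The wording is an assumption for illustration, not the paper's prompts.
ROLE_INSTRUCTIONS = {
    "user": "Summarize the dialogue from the user's perspective.",
    "agent": "Summarize the dialogue from the agent's perspective.",
    "overall": "Summarize the whole dialogue.",
}


def build_prompt(dialogue_turns, role):
    """Combine a role-specific instruction with the dialogue text.

    dialogue_turns: list of (speaker, utterance) pairs.
    role: key into ROLE_INSTRUCTIONS selecting the target summary type.
    """
    if role not in ROLE_INSTRUCTIONS:
        raise ValueError(f"unknown role: {role!r}")
    dialogue = "\n".join(f"{speaker}: {text}" for speaker, text in dialogue_turns)
    return f"{ROLE_INSTRUCTIONS[role]}\n\nDialogue:\n{dialogue}\n\nSummary:"


# Toy dialogue (made up) turned into one training prompt per role.
turns = [
    ("user", "My order has not arrived yet."),
    ("agent", "I will check the shipping status for you."),
]
prompts = {role: build_prompt(turns, role) for role in ROLE_INSTRUCTIONS}
```

In this sketch the same dialogue yields three prompts that differ only in the instruction line, so a single fine-tuned model could be steered toward each role's summary by the instruction alone.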

Findings

1. Achieves new state-of-the-art results on CSDS and SAMSUM datasets.
2. Demonstrates effectiveness of role-specific instructions in dialogue summarization.
3. Provides open-source model and code for future research.

Abstract

Large language models (LLMs) such as Llama, Baichuan, and Bloom show remarkable ability on many natural language tasks after instruction fine-tuning. Nevertheless, for the dialogue summarization task, which aims to generate summaries for different roles in a dialogue, most state-of-the-art methods are built on small models (e.g., BART and BERT). Existing methods add task-specific optimizations to small models, such as a global-local centrality score. In this paper, we propose an instruction fine-tuned model, Baichuan2-Sum, for role-oriented dialogue summarization. By setting different instructions for different roles, the model can learn from the dialogue interactions and output the expected summaries. Furthermore, we apply the NEFTune technique to add suitable noise during training to improve the results. The experiments demonstrate that the proposed model achieves…
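
As a rough illustration of the noise step mentioned in the abstract, the sketch below follows the NEFTune recipe of adding uniform noise to token embeddings during training, scaled by alpha / sqrt(L * d) for sequence length L and embedding dimension d. It uses plain Python lists rather than a deep learning framework, and the `alpha` value, the `neftune_noise` helper name, and the toy matrix are assumptions, not settings taken from this paper.

```python
import math
import random


def neftune_noise(embeddings, alpha=5.0, rng=None):
    """Return a noisy copy of a (seq_len x dim) embedding matrix.

    Each entry receives Uniform(-1, 1) noise scaled by
    alpha / sqrt(seq_len * dim). The noise is meant for training only;
    at inference time the embeddings are used unchanged.
    """
    rng = rng or random.Random()
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    scale = alpha / math.sqrt(seq_len * dim)
    return [
        [value + scale * rng.uniform(-1.0, 1.0) for value in row]
        for row in embeddings
    ]


# Toy 3-token, 4-dimensional embedding matrix (all zeros, made up).
toy = [[0.0] * 4 for _ in range(3)]
# alpha=5.0 is an assumed value chosen only for this example.
noisy = neftune_noise(toy, alpha=5.0, rng=random.Random(0))
```

Because the scale shrinks with sequence length and embedding size, longer inputs receive smaller per-entry perturbations for the same `alpha`.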

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems

Methods: Attention Is All You Need · Linear Layer · Dropout · Layer Normalization · Multi-Head Attention · Byte Pair Encoding · Residual Connection · Adam · Softmax