Effectiveness of French Language Models on Abstractive Dialogue Summarization Task
Yongxin Zhou, François Portet, Fabien Ringeval

TL;DR
This study evaluates French-specific and multilingual pre-trained language models for abstractive dialogue summarization. On the DECODA call center corpus, the French-specific models outperform the multilingual ones. The study also highlights current limitations and future challenges.
Contribution
The paper provides a comparative analysis of French and multilingual pre-trained models for spontaneous dialogue summarization and reports new state-of-the-art results on the DECODA dataset.
Findings
BARThez models significantly outperform the other models evaluated.
French-specific models outperform multilingual models.
The study identifies limitations and future challenges in spontaneous dialogue summarization.
Abstract
Pre-trained language models have established the state of the art on various natural language processing tasks, including dialogue summarization, which allows the reader to quickly access key information from long conversations in meetings, interviews or phone calls. However, such dialogues are still difficult to handle with current models because the spontaneity of the language involves expressions that are rarely present in the corpora used for pre-training the language models. Moreover, the vast majority of the work accomplished in this field has focused on English. In this work, we present a study on the summarization of spontaneous oral dialogues in French using several language-specific pre-trained models (BARThez and BelGPT-2) as well as multilingual pre-trained models (mBART, mBARThez, and mT5). Experiments were performed on the DECODA (Call Center) dialogue corpus whose…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Layer Normalization · Gated Linear Unit · Dense Connections · SentencePiece · Inverse Square Root Schedule · Softmax
