BARThez: a Skilled Pretrained French Sequence-to-Sequence Model
Moussa Kamal Eddine, Antoine J.-P. Tixier, Michalis Vazirgiannis

TL;DR
This paper introduces BARThez, a large-scale pretrained French sequence-to-sequence model based on BART. It performs strongly on both discriminative and generative NLP tasks, and continuing the pretraining of a multilingual BART on BARThez' corpus (mBARThez) further improves generative performance.
Contribution
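The paper presents BARThez, the first large-scale pretrained French seq2seq model, and evaluates it on five discriminative tasks from the FLUE benchmark and two generative tasks from OrangeSum, a new summarization dataset. It also presents mBARThez, obtained by continuing the pretraining of a multilingual BART on BARThez' corpus.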
Findings
BARThez is competitive with state-of-the-art French BERT-based models such as CamemBERT and FlauBERT.
Continuing the pretraining of a multilingual BART on BARThez' corpus (mBARThez) improves generative performance over BARThez.
BARThez performs well on both discriminative and generative tasks.
Abstract
Inductive transfer learning has taken the entire NLP field by storm, with models such as BERT and BART setting new state of the art on countless NLU tasks. However, most of the available models and research have been conducted for English. In this work, we introduce BARThez, the first large-scale pretrained seq2seq model for French. Being based on BART, BARThez is particularly well-suited for generative tasks. We evaluate BARThez on five discriminative tasks from the FLUE benchmark and two generative tasks from a novel summarization dataset, OrangeSum, that we created for this research. We show BARThez to be very competitive with state-of-the-art BERT-based French language models such as CamemBERT and FlauBERT. We also continue the pretraining of a multilingual BART on BARThez' corpus, and show our resulting model, mBARThez, to significantly boost BARThez' generative performance. Code,…
Code & Models
- 🤗 ccdv/lsg-barthez-4096 · model · 9 dl · ♡ 1
- 🤗 lincoln/barthez-squadFR-fquad-piaf-question-generation · model · 26 dl · ♡ 4
- 🤗 moussaKam/barthez-orangesum-abstract · model · 1.1k dl · ♡ 7
- 🤗 moussaKam/barthez-orangesum-title · model · 19 dl · ♡ 3
- 🤗 moussaKam/barthez-sentiment-classification · model · 9 dl · ♡ 2
- 🤗 moussaKam/barthez · model · 763 dl · ♡ 19
- 🤗 moussaKam/mbarthez · model · 123k dl · ♡ 7
- 🤗 moussaKam/mbarthez-dialogue-summarization · model · 9 dl · ♡ 1
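The checkpoints listed above can typically be loaded through the Hugging Face `transformers` library. The following is a minimal sketch, not the authors' reference code: it assumes `transformers`, `torch`, and `sentencepiece` are installed, and the `summarize` helper, its 512-token input limit, and its decoding settings are illustrative choices rather than values taken from the paper.

```python
# Hypothetical helper for abstractive summarization of French text with
# the OrangeSum-finetuned checkpoint from the list above.
CHECKPOINT = "moussaKam/barthez-orangesum-abstract"


def summarize(text, checkpoint=CHECKPOINT, max_new_tokens=64, num_beams=4):
    """Return a generated summary of `text`.

    The checkpoint is downloaded from the Hugging Face Hub on first use.
    The decoding settings are assumptions, not the paper's configuration.
    """
    # Imported lazily so that defining the helper has no heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

    # Truncate long articles; 512 tokens is an assumed, conservative limit.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` on a French news article returns a short abstract. Passing `checkpoint="moussaKam/barthez-orangesum-title"` should produce a title instead, assuming that checkpoint exposes the same seq2seq interface.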
Taxonomy
Methods: Linear Layer · mBARThez · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence · Byte Pair Encoding · Adam · Softmax · Layer Normalization
