GreekBART: The First Pretrained Greek Sequence-to-Sequence Model

Iakovos Evdaimon; Hadi Abdine; Christos Xypolopoulos; Stamatis Outsios; Michalis Vazirgiannis; Giorgos Stamou

arXiv:2304.00869 · cs.CL · April 4, 2023 · 6 cites

TL;DR

GreekBART is the first pretrained sequence-to-sequence model for Greek, demonstrating strong performance on various NLP tasks and introducing a new Greek summarization dataset.

Contribution

The paper introduces GreekBART, the first Greek-specific Seq2Seq model based on BART, evaluates it on multiple NLP tasks, and presents a new Greek summarization dataset.

Findings

01. GreekBART outperforms Greek-BERT and XLM-R on discriminative tasks.

02. GreekBART shows promising results on Greek summarization tasks.

03. The new GreekSUM dataset facilitates future research in Greek NLP.

Abstract

The era of transfer learning has revolutionized the fields of Computer Vision and Natural Language Processing, bringing powerful pretrained models with exceptional performance across a variety of tasks. Specifically, Natural Language Processing tasks have been dominated by transformer-based language models. In Natural Language Inference and Natural Language Generation tasks, the BERT model and its variants, as well as the GPT model and its successors, demonstrated exemplary performance. However, the majority of these models are pretrained and assessed primarily for the English language or on a multilingual corpus. In this paper, we introduce GreekBART, the first Seq2Seq model based on BART-base architecture and pretrained on a large-scale Greek corpus. We evaluate and compare GreekBART against BART-random, Greek-BERT, and XLM-R on a variety of discriminative tasks. In addition, we…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Sigmoid Activation · Tanh Activation · Cosine Annealing · Linear Layer · Long Short-Term Memory · Linear Warmup With Cosine Annealing · Byte Pair Encoding