Sequence-to-Sequence Spanish Pre-trained Language Models
Vladimir Araujo, Maria Mihaela Trusca, Rodrigo Tufiño, Marie-Francine Moens

TL;DR
This paper introduces and evaluates Spanish versions of BART, T5, and BERT2BERT-style encoder-decoder pre-trained language models, demonstrating their effectiveness across various sequence-to-sequence tasks and making them publicly available.
Contribution
It presents the first Spanish encoder-decoder models trained specifically for sequence-to-sequence tasks, filling a gap in Spanish NLP resources.
Findings
BART- and T5-based models outperform others across tasks
Models achieve competitive results in summarization, QA, and translation
All models are publicly available for research use
Abstract
In recent years, significant advancements in pre-trained language models have driven the creation of numerous non-English language variants, with a particular emphasis on encoder-only and decoder-only architectures. While Spanish language models based on BERT and GPT have demonstrated proficiency in natural language understanding and generation, there remains a noticeable scarcity of encoder-decoder models explicitly designed for sequence-to-sequence tasks, which aim to map input sequences to generate output sequences conditionally. This paper breaks new ground by introducing the implementation and evaluation of renowned encoder-decoder architectures exclusively pre-trained on Spanish corpora. Specifically, we present Spanish versions of BART, T5, and BERT2BERT-style models and subject them to a comprehensive assessment across various sequence-to-sequence tasks, including summarization,…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Attention Is All You Need · WordPiece · Linear Warmup With Linear Decay · Inverse Square Root Schedule · Attention Dropout · RoBERTa · BART · BERT · Residual Connection
