Multilingual Sequence-to-Sequence Models for Hebrew NLP
Matan Eyal, Hila Noga, Roee Aharoni, Idan Szpektor, Reut Tsarfaty

TL;DR
This paper advocates using multilingual sequence-to-sequence models such as mT5 for Hebrew NLP. By treating tasks as text-to-text problems, an approach it argues is especially well suited to morphologically rich languages, it demonstrates significant improvements over previous models.
Contribution
It casts Hebrew NLP tasks as text-to-text problems and solves them with multilingual seq2seq models, outperforming previous encoder-only models.
Findings
Casting tasks as text-to-text yields substantial performance improvements on Hebrew NLP benchmarks.
Seq2seq models are more suitable for morphologically rich languages.
Multilingual models can effectively replace specialized Hebrew models.
Abstract
Recent work attributes progress in NLP to large language models (LMs) with increased model size and large quantities of pretraining data. Despite this, current state-of-the-art LMs for Hebrew are both under-parameterized and under-trained compared to LMs in other languages. Additionally, previous work on pretrained Hebrew LMs focused on encoder-only models. While the encoder-only architecture is beneficial for classification tasks, it does not cater well for sub-word prediction tasks, such as Named Entity Recognition, when considering the morphologically rich nature of Hebrew. In this paper we argue that sequence-to-sequence generative architectures are more suitable for LLMs in the case of morphologically rich languages (MRLs) such as Hebrew. We demonstrate that by casting tasks in the Hebrew NLP pipeline as text-to-text tasks, we can leverage powerful multilingual, pretrained…
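To make the text-to-text casting concrete, here is a minimal sketch of how a Named Entity Recognition example could be turned into an input/target string pair and decoded back. The "ner: " task prefix, the "text [LABEL]" target format, the " ; " separator, and the helper names to_text_pair and parse_target are all illustrative assumptions, not the format used in the paper. The point it shows is that the target can name an entity that is only part of a whitespace token (here the prefix ל, "to", is left out), which token-level tagging cannot express directly.

```python
from typing import List, Tuple

# Hypothetical format (an assumption for illustration, not the paper's):
#   input  = "ner: " + raw sentence
#   target = "entity text [LABEL] ; entity text [LABEL] ; ..."
SEPARATOR = " ; "


def to_text_pair(
    sentence: str,
    entities: List[Tuple[str, str]],
    task_prefix: str = "ner: ",
) -> Tuple[str, str]:
    """Build an (input, target) string pair for a text-to-text model."""
    source = task_prefix + sentence
    target = SEPARATOR.join(f"{text} [{label}]" for text, label in entities)
    return source, target


def parse_target(target: str) -> List[Tuple[str, str]]:
    """Decode a generated target string back into (text, label) pairs.

    Chunks that do not match the "text [LABEL]" shape are skipped, since a
    generative model is not guaranteed to produce well-formed output.
    """
    entities = []
    for chunk in target.split(SEPARATOR):
        chunk = chunk.strip()
        if not chunk.endswith("]") or " [" not in chunk:
            continue
        text, label = chunk[:-1].rsplit(" [", 1)
        entities.append((text, label))
    return entities


# "I arrived in Jerusalem": the entity is the token without its prefix.
source, target = to_text_pair("הגעתי לירושלים", [("ירושלים", "LOC")])
```

The resulting pair could then be used to fine-tune any seq2seq model. parse_target is deliberately tolerant because a generated string may not follow the expected format.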
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Gated Linear Unit · Multi-Head Attention · Attention Is All You Need · Linear Layer · Inverse Square Root Schedule · Dense Connections · Attention Dropout · Residual Connection · Dropout
