DIETA: A Decoder-only transformer-based model for Italian-English machine TrAnslation
Pranav Kasela, Marco Braga, Alessandro Ghiotto, Andrea Pilzer, Marco Viviani, Alessandro Raganato

TL;DR
DIETA is a decoder-only transformer model with 0.5 billion parameters, trained on a large Italian-English corpus, achieving competitive translation performance and providing resources for future research.
Contribution
Introduces DIETA, a small (0.5B-parameter) decoder-only Transformer for Italian-English translation, together with a large curated parallel corpus and a new 450-sentence evaluation set.
Findings
DIETA ranks in the second quartile on a 32-system leaderboard.
Outperforms most sub-3B models on four out of five benchmarks.
Provides publicly available training data, models, and evaluation resources.
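To make the "second quartile on a 32-system leaderboard" finding concrete, the sketch below maps a quartile to a range of ranks. It assumes quartiles are equal-sized and counted from the top of the leaderboard (first quartile = best ranks); the summary does not state this convention, and the function name is hypothetical.

```python
def quartile_rank_range(n_systems: int, quartile: int) -> tuple[int, int]:
    """Return the inclusive 1-based rank range covered by a quartile.

    Assumption (not stated in the source): quartile 1 holds the best ranks
    and the leaderboard size is divisible by four.
    """
    if not 1 <= quartile <= 4:
        raise ValueError("quartile must be between 1 and 4")
    if n_systems <= 0 or n_systems % 4 != 0:
        raise ValueError("n_systems must be a positive multiple of 4")
    size = n_systems // 4                  # systems per quartile
    first = (quartile - 1) * size + 1      # best rank in this quartile
    last = quartile * size                 # worst rank in this quartile
    return first, last


# Second quartile of a 32-system leaderboard
print(quartile_rank_range(32, 2))  # → (9, 16)
```

Under that assumption, a second-quartile system sits somewhere between rank 9 and rank 16 of the 32 systems compared.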
Abstract
In this paper, we present DIETA, a small, decoder-only Transformer model with 0.5 billion parameters, specifically designed and trained for Italian-English machine translation. We collect and curate a large parallel corpus consisting of approximately 207 million Italian-English sentence pairs across diverse domains, including parliamentary proceedings, legal texts, web-crawled content, subtitles, news, and literature, as well as 352 million back-translated examples generated with pretrained models. Additionally, we create and release a new small-scale evaluation set, consisting of 450 sentences, based on 2025 WikiNews articles, enabling assessment of translation quality on contemporary text. Comprehensive evaluations show that DIETA achieves competitive performance on multiple Italian-English benchmarks, consistently ranking in the second quartile of a 32-system leaderboard and outperforming most other sub-3B…
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling
