Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities
Shaltiel Shmidman, Avi Shmidman, Amir DN Cohen, Moshe Koppel

TL;DR
This paper presents DictaLM 2.0 and DictaLM 2.0-Instruct, two Hebrew LLMs derived from Mistral and trained on approximately 200 billion tokens of Hebrew and English, together with the techniques used to adapt a pre-trained model to Hebrew and a new benchmark suite for evaluating Hebrew LLMs.
Contribution
New training methodologies for adapting pre-trained LLMs to Hebrew, the DictaLM 2.0 and DictaLM 2.0-Instruct models, and a comprehensive Hebrew NLP benchmark.
Findings
- The models perform well across diverse Hebrew NLP tasks.
- The adaptation techniques improve how a pre-trained LLM learns Hebrew.
- The benchmark provides a new standard for Hebrew NLP evaluation.
Abstract
Training large language models (LLMs) in low-resource languages such as Hebrew poses unique challenges. In this paper, we introduce DictaLM2.0 and DictaLM2.0-Instruct, two LLMs derived from the Mistral model, trained on a substantial corpus of approximately 200 billion tokens in both Hebrew and English. Adapting a pre-trained model to a new language involves specialized techniques that differ significantly from training a model from scratch or further training existing models on well-resourced languages such as English. We outline these novel training methodologies, which facilitate effective learning and adaptation to the linguistic properties of Hebrew. Additionally, we fine-tuned DictaLM2.0-Instruct on a comprehensive instruct dataset to enhance its performance on task-specific instructions. To rigorously evaluate our models, we introduce a new benchmark suite for Hebrew LLM…
Code & Models
- 🤗 dicta-il/dictalm2.0 (model) · 5.4k dl · ♡ 26
- 🤗 dicta-il/dictalm2.0-AWQ (model) · 7 dl
- 🤗 dicta-il/dictalm2.0-GPTQ (model) · 7 dl
- 🤗 dicta-il/dictalm2.0-instruct-AWQ (model) · 36 dl
- 🤗 dicta-il/dictalm2.0-instruct-GPTQ (model) · 25 dl
- 🤗 dicta-il/dictalm2.0-instruct (model) · 6.8k dl · ♡ 25
- 🤗 dicta-il/dictalm2.0-instruct-GGUF (model) · 293 dl · ♡ 7
- 🤗 dicta-il/dictalm2.0-GGUF (model) · 184 dl · ♡ 5
- 🤗 RichardErkhov/dicta-il_-_dictalm2.0-instruct-gguf (model) · 120 dl
- 🤗 RichardErkhov/dicta-il_-_dictalm2.0-gguf (model) · 73 dl
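The unquantized checkpoints listed above can presumably be loaded with the Hugging Face `transformers` library in the usual way. The following is a minimal sketch under that assumption, not the authors' documented usage. The repository ids are copied from the list. The helper name `generate`, its parameters, and the default generation length are hypothetical and chosen for illustration only. Running it downloads the full model weights, so it needs network access, `torch`, and enough memory.

```python
# Repository ids copied from the list above; everything else is illustrative.
BASE_MODEL_ID = "dicta-il/dictalm2.0"
INSTRUCT_MODEL_ID = "dicta-il/dictalm2.0-instruct"


def generate(prompt: str, model_id: str = BASE_MODEL_ID, max_new_tokens: int = 32) -> str:
    """Hypothetical helper: load a checkpoint and continue `prompt` with it."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the full sequence (prompt plus continuation) back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("שלום", model_id=INSTRUCT_MODEL_ID)` would select the instruction-tuned variant. The prompt format that variant expects is not stated here, so check its model card before relying on the output. The AWQ, GPTQ, and GGUF repositories are quantized variants and generally need additional runtime support beyond this sketch.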
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Mathematics, Computing, and Information Processing
Methods: Sparse Evolutionary Training
