Knesset-DictaBERT: A Hebrew Language Model for Parliamentary Proceedings
Gili Goldin (1), Shuly Wintner (1) ((1) Department of Computer Science, University of Haifa, Israel)

TL;DR
Knesset-DictaBERT is a Hebrew language model tailored for parliamentary texts, showing enhanced understanding of parliamentary language through fine-tuning on Israeli Knesset proceedings.
Contribution
This paper introduces Knesset-DictaBERT, a novel Hebrew language model specifically fine-tuned on parliamentary data, improving NLP performance in this domain.
Findings
Significant perplexity reduction over the baseline DictaBERT model
Improved accuracy on the masked language modeling (MLM) task
Enhanced understanding of parliamentary language
Abstract
We present Knesset-DictaBERT, a large Hebrew language model fine-tuned on the Knesset Corpus, which comprises Israeli parliamentary proceedings. The model is based on the DictaBERT architecture and demonstrates significant improvements in understanding parliamentary language, as measured on the masked language modeling (MLM) task. We provide a detailed evaluation of the model's performance, showing improvements in perplexity and accuracy over the baseline DictaBERT model.
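As a rough illustration of the two metrics named in the abstract, the sketch below computes perplexity and top-k accuracy over masked positions from per-position probability distributions. It is not the authors' evaluation code: the function name `mlm_metrics`, the dictionary-based input format, the toy tokens, and the probability floor for unseen tokens are all assumptions made for this example.

```python
import math


def mlm_metrics(masked_predictions, gold_tokens, top_k=1):
    """Return (perplexity, top-k accuracy) over masked positions.

    masked_predictions: one dict per masked position, mapping each
        candidate token to the probability the model assigns it.
    gold_tokens: the original token at each masked position.
    """
    if not gold_tokens or len(masked_predictions) != len(gold_tokens):
        raise ValueError("need one non-empty prediction per gold token")

    total_nll = 0.0
    hits = 0
    for dist, gold in zip(masked_predictions, gold_tokens):
        # Assumption: a tiny floor avoids log(0) when the gold token
        # is absent from the (possibly truncated) distribution.
        p_gold = max(dist.get(gold, 0.0), 1e-12)
        total_nll += -math.log(p_gold)

        # Top-k accuracy: is the gold token among the k most probable?
        ranked = sorted(dist, key=dist.get, reverse=True)[:top_k]
        hits += gold in ranked

    n = len(gold_tokens)
    # Perplexity is the exponential of the mean negative log-likelihood.
    return math.exp(total_nll / n), hits / n


# Toy usage with made-up distributions for two masked positions.
predictions = [
    {"a": 0.5, "b": 0.3, "c": 0.2},
    {"a": 0.25, "b": 0.75},
]
gold = ["a", "a"]
perplexity, accuracy = mlm_metrics(predictions, gold)
print(perplexity, accuracy)
```

Lower perplexity and higher accuracy on held-out parliamentary text would indicate a better fit to the domain; comparing a fine-tuned model against its baseline amounts to running the same computation on both models' predictions for the same masked positions.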
Taxonomy
Topics: Natural Language Processing Techniques · Legal Language and Interpretation
