Large Malaysian Language Model Based on Mistral for Enhanced Local Language Understanding
Husein Zolkepli, Aisyah Razak, Kamarul Adha, Ariff Nazhan

TL;DR
This paper introduces Malaysian Mistral, a Malaysian language model built by continued pretraining of Mistral 7B with extended context lengths. It reports improved language understanding and better results than ChatGPT3.5 and Claude 2 on grammar tests.
Contribution
The paper presents the development of Malaysian Mistral with extended context lengths and instruction tuning, advancing Malaysian language processing capabilities.
Findings
Extended context lengths improve language understanding.
Malaysian Mistral outperforms ChatGPT3.5 and Claude 2 on grammar tests.
Instruction tuning enhances model performance.
Abstract
In this paper, we present significant advancements in the pretraining of Mistral 7B, a large-scale language model, using a dataset of 32.6 GB, equivalent to 1.1 billion tokens. We explore the impact of extending the context length, releasing models with context lengths of 4096 and 32768 tokens, and further refine performance with a specialized instruction-tuned model with a 16384-token context length, which we call Malaysian Mistral. Our experiments demonstrate the efficacy of continued pretraining and the influence of extended context lengths on Mistral 7B's language understanding capabilities. Additionally, we release a model instruction-tuned with a 16384-token context length, showcasing its potential for capturing nuanced language intricacies. Furthermore, our research contributes to the benchmarking of Malaysian Mistral against prominent language models, including ChatGPT3.5 and…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
