Komodo: A Linguistic Expedition into Indonesia's Regional Languages
Louis Owen, Vishesh Tripathi, Abhay Kumar, Biddwan Ahmed

TL;DR
Komodo-7B is a new multilingual large language model designed to support Indonesian and 11 regional languages, achieving state-of-the-art performance and improving cross-language understanding and translation for under-resourced languages.
Contribution
Introduces Komodo-7B, a family of LLMs that excels in linguistic diversity and outperforms existing models in Indonesian and regional languages.
Findings
Komodo-7B-Instruct achieves state-of-the-art performance across various tasks and languages.
The model demonstrates superior cross-language understanding.
Translation for regional languages improves significantly.
Abstract
The recent breakthroughs in Large Language Models (LLMs) have mostly focused on languages with easily available and sufficient resources, such as English. However, there remains a significant gap for languages that lack sufficient linguistic resources in the public domain. Our work introduces Komodo-7B, 7-billion-parameter Large Language Models designed to address this gap by seamlessly operating across Indonesian, English, and 11 regional languages in Indonesia. Komodo-7B is a family of LLMs that consists of Komodo-7B-Base and Komodo-7B-Instruct. Komodo-7B-Instruct stands out by achieving state-of-the-art performance in various tasks and languages, outperforming the benchmarks set by OpenAI's GPT-3.5, Cohere's Aya-101, Llama-2-Chat-13B, Mixtral-8x7B-Instruct-v0.1, Gemma-7B-it, and many more. This model not only demonstrates superior performance in both language-specific and overall…
Taxonomy
Topics: Linguistic Variation and Morphology
Methods: Attention Is All You Need · Sparse Evolutionary Training · Residual Connection · Weight Decay · Dropout · Softmax · Linear Layer
