Personal Intelligence System UniLM: Hybrid On-Device Small Language Model and Server-Based Large Language Model for Malay Nusantara
Azree Nazri, Olalekan Agbolade, Faisal Aziz

TL;DR
This paper presents a hybrid personal intelligence system combining a small on-device language model and a large server-based model to efficiently support Malay language tasks with limited resources.
Contribution
It introduces a novel hybrid system integrating SLiM-34M and MANYAK-1.3B models, optimized for resource-constrained environments and specific Malay language applications.
Findings
SLiM-34M outperforms other LLMs in accuracy with half the pre-training tokens
The system effectively handles machine translation and question-answering tasks
Resource efficiency is achieved without sacrificing performance
Abstract
In contexts with limited computational and data resources, high-resource language models often prove inadequate, particularly when addressing the specific needs of Malay languages. This paper introduces a Personal Intelligence System designed to efficiently integrate both on-device and server-based models. The system incorporates SLiM-34M for on-device processing, optimized for low memory and power usage, and MANYAK-1.3B for server-based tasks, allowing for scalable, high-performance language processing. The models achieve significant results across various tasks, such as machine translation, question answering, and a translated IndoMMLU. Particularly noteworthy is that SLiM-34M achieves higher accuracy than other LLMs while using half as many pre-training tokens. This work challenges the prevailing assumption that large-scale computational resources are…
Taxonomy
Topics: Educational Technology Systems · Cognitive Computing and Networks
