SlimLM: An Efficient Small Language Model for On-Device Document Assistance
Thang M. Pham, Phat T. Nguyen, Seunghyun Yoon, Viet Dac Lai, Franck Dernoncourt, Trung Bui

TL;DR
SlimLM introduces a series of small, efficient language models optimized for document assistance on mobile devices, balancing size, speed, and performance for practical on-device deployment.
Contribution
The paper presents SlimLM, a new family of small language models tailored for mobile document assistance, with extensive experiments and a practical Android application demonstrating real-world viability.
Findings
The smallest SlimLM model runs efficiently on a high-end smartphone (Samsung Galaxy S24).
Larger variants offer enhanced capabilities while staying within mobile constraints.
SlimLM performs comparably to or better than existing small language models in the paper's evaluation.
Abstract
While small language models (SLMs) show promise for mobile deployment, their real-world performance and applications on smartphones remain underexplored. We present SlimLM, a series of SLMs optimized for document assistance tasks on mobile devices. Through extensive experiments on a Samsung Galaxy S24, we identify the optimal trade-offs between model size (ranging from 125M to 7B parameters), context length, and inference time for efficient on-device processing. SlimLM is pre-trained on SlimPajama-627B and fine-tuned on DocAssist, our constructed dataset for summarization, question answering, and suggestion tasks. Our smallest model demonstrates efficient performance on the S24, while larger variants offer enhanced capabilities within mobile constraints. We evaluate SlimLM against existing SLMs, showing comparable or superior performance and offering a benchmark for future research in…
Taxonomy
Topics: Service-Oriented Architecture and Web Services · Peer-to-Peer Network Technologies · Scientific Computing and Data Management
