Benchmarking On-Device Machine Learning on Apple Silicon with MLX
Oluwaseun A. Ajayi, Ogundepo Odunayo

TL;DR
This paper evaluates MLX, a framework optimized for on-device machine learning on Apple Silicon. It benchmarks transformer inference latency, comparing MLX against PyTorch implementations and against NVIDIA GPU performance.
Contribution
It introduces MLX-transformers, enabling seamless execution of transformer models on Apple Silicon without checkpoint conversion, and provides a comprehensive performance evaluation against NVIDIA GPUs.
Findings
MLX achieves lower inference latency on Apple Silicon than the equivalent PyTorch implementations.
Transformer models run efficiently on Apple Silicon, approaching GPU performance levels.
MLX facilitates easier deployment of transformer models directly from Hugging Face.
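The latency findings above depend on how latency is measured. The following is a minimal sketch of a timing harness, not the authors' benchmark code: the function name `measure_latency`, its parameters, and the warmup and repeat counts are all illustrative assumptions. It discards a few warmup runs, then reports summary statistics over the timed runs. The optional `sync` hook matters because some frameworks return before the computation has finished.

```python
import statistics
import time
from typing import Callable, Optional


def measure_latency(
    run_once: Callable[[], object],
    sync: Optional[Callable[[object], None]] = None,
    warmup: int = 3,
    repeats: int = 10,
) -> dict:
    """Time `run_once` and return latency statistics in milliseconds.

    `sync` is a hypothetical hook that blocks until the result is actually
    computed; it is needed for lazy or asynchronous backends.
    """

    def timed_call() -> float:
        start = time.perf_counter()
        result = run_once()
        if sync is not None:
            sync(result)  # force pending work to finish before stopping the clock
        return (time.perf_counter() - start) * 1000.0

    # Warmup runs absorb one-time costs (compilation, caching, allocation).
    for _ in range(warmup):
        timed_call()

    samples = [timed_call() for _ in range(repeats)]
    return {
        "median_ms": statistics.median(samples),
        "mean_ms": statistics.fmean(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
        "repeats": repeats,
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a real model forward pass.
    stats = measure_latency(lambda: sum(i * i for i in range(100_000)))
    print(stats)
```

With MLX, which evaluates lazily, one would likely pass `sync=mx.eval` (from `import mlx.core as mx`), assuming the forward pass returns an array or a nested container of arrays. With PyTorch on an NVIDIA GPU, `sync=lambda _: torch.cuda.synchronize()` serves the same purpose. Without such a hook, the timer may capture only the cost of scheduling the work rather than running it.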
Abstract
The recent widespread adoption of Large Language Models (LLMs) and machine learning in general has sparked research interest in exploring the possibilities of deploying these models on smaller devices such as laptops and mobile phones. This creates a need for frameworks and approaches that are capable of taking advantage of on-device hardware. The MLX framework was created to address this need. It is a framework optimized for machine learning (ML) computations on Apple silicon devices, facilitating easier research, experimentation, and prototyping. This paper presents a performance evaluation of MLX, focusing on inference latency of transformer models. We compare the performance of different transformer architecture implementations in MLX with their PyTorch counterparts. For this research we create a framework called MLX-transformers which includes different transformer…
Taxonomy
Topics: Big Data and Digital Economy · Advanced Neural Network Applications · Machine Learning and Algorithms
