MaLV-OS: Rethinking the Operating System Architecture for Machine Learning in Virtualized Clouds
Stella Bitchebe, Oana Balmau

TL;DR
MaLV-OS proposes a novel OS architecture tailored for machine learning workloads in virtualized clouds, integrating GPU virtualization and ML-specific resource management to enhance ML performance.
Contribution
This work introduces MaLV-OS, the first OS architecture designed specifically to optimize machine learning workloads in virtualized cloud environments.
Findings
Design of a micro-kernel, Micro-LAKE, enabling GPU access in kernel space.
Integration of an MLaaS subsystem for improved resource management.
Open-source GPU virtualization merged into the hypervisor for flexibility.
Abstract
A large body of research has employed Machine Learning (ML) models to develop learned operating systems (OSes) and kernels. These systems dynamically adapt to the job load and adjust resource allocation (CPU, I/O, memory, network bandwidth) to match actual user demand. What this body of work has in common is that it uses ML to improve kernel decisions. To date, and to the best of our knowledge, no work has taken the opposite direction, i.e., using the OS to improve ML. While some work applies system-level optimizations to ML algorithms, it does not tailor the OS to the ML context. To address this limitation, we take an orthogonal approach in this paper by leveraging the OS to enhance the performance of ML models and algorithms. We explore the path towards an ML-specialized OS, MaLV-OS. MaLV-OS rethinks the OS architecture to make it specifically…
