MLOS: An Infrastructure for Automated Software Performance Engineering

Carlo Curino; Neha Godwal; Brian Kroth; Sergiy Kuryata; Greg Lapinski; Siqi Liu; Slava Oks; Olga Poppe; Adam Smiechowski; Ed Thayer; Markus Weimer; Yiwen Zhu

arXiv:2006.02155 · cs.DC · June 5, 2020

TL;DR

MLOS is an ML-driven infrastructure designed to automate and improve software performance engineering, enabling continuous and instance-specific system optimization to unlock significant performance gains.

Contribution

The paper introduces MLOS, a novel ML-powered framework that automates and democratizes software performance tuning, addressing the manual and fragile state of current Software Performance Engineering (SPE) practice.

Findings

1. Component-level optimizations yield 20%-90% performance improvements.
2. MLOS enables continuous, robust, and trackable system tuning.
3. Open-sourcing MLOS fosters community and academic engagement.

Abstract

Developing modern systems software is a complex task that combines business logic programming and Software Performance Engineering (SPE). The latter is an experimental and labor-intensive activity focused on optimizing the system for a given hardware, software, and workload (hw/sw/wl) context. Today's SPE is performed during build/release phases by specialized teams, and cursed by: 1) a lack of standardized and automated tools, 2) significant repeated work as the hw/sw/wl context changes, and 3) fragility induced by "one-size-fits-all" tuning (where improvements on one workload or component may impact others). The net result: despite costly investments, system software is often outside its optimal operating point, anecdotally leaving 30% to 40% of performance on the table. The recent developments in Data Science (DS) hint at an opportunity: combining DS tooling and methodologies with a new…

Code & Models

Repositories

microsoft/MLOS (official)
