cedar: Optimized and Unified Machine Learning Input Data Pipelines
Mark Zhao, Emanuel Adamiak, Christos Kozyrakis

TL;DR
cedar is a unified programming framework for ML input data pipelines. Its optimizer automatically applies multiple performance optimizations, with reported performance improvements of up to 10.65x over existing systems.
Contribution
The paper introduces cedar, a flexible framework whose optimizer systematically applies multiple optimizations to improve the performance of ML input data pipelines.
Findings
Performance improvements of up to 10.65x over existing systems
Supports arbitrary ML frameworks and libraries
Automatically orchestrates optimizations without user intervention
Abstract
The input data pipeline is an essential component of each machine learning (ML) training job. It is responsible for reading massive amounts of training data, processing batches of samples using complex transformations, and loading them onto training nodes at low latency and high throughput. Performant input data systems are becoming increasingly critical, driven by skyrocketing data volumes and training throughput demands. Unfortunately, current input data systems cannot fully leverage key performance optimizations, resulting in hugely inefficient infrastructures that require significant resources - or worse - underutilize expensive accelerators. To address these demands, we present cedar, an optimized and unified programming framework for ML input data pipelines. cedar allows users to define input data pipelines using composable operators that support arbitrary ML frameworks and…
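To make the idea of "composable operators" concrete, here is a minimal Python sketch of a pipeline built by chaining small, framework-agnostic operators. It is an illustration only, not cedar's actual API: the names `map_op`, `batch_op`, and `compose` are hypothetical, and the sketch performs no optimization of its own.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical operator type (an assumption, not cedar's interface):
# an operator consumes a stream of samples and yields a new stream.
Operator = Callable[[Iterable], Iterator]


def map_op(fn: Callable) -> Operator:
    """Build an operator that applies fn to every sample."""
    def op(samples: Iterable) -> Iterator:
        for sample in samples:
            yield fn(sample)
    return op


def batch_op(size: int) -> Operator:
    """Build an operator that groups samples into lists of at most `size`."""
    def op(samples: Iterable) -> Iterator:
        batch = []
        for sample in samples:
            batch.append(sample)
            if len(batch) == size:
                yield batch
                batch = []
        if batch:  # emit the final, possibly smaller, batch
            yield batch
    return op


def compose(*ops: Operator) -> Operator:
    """Chain operators left to right into a single pipeline."""
    def pipeline(source: Iterable) -> Iterator:
        stream = iter(source)
        for op in ops:
            stream = op(stream)
        return stream
    return pipeline


# Two per-sample transformations followed by batching.
pipeline = compose(
    map_op(lambda x: x * 2),
    map_op(lambda x: x + 1),
    batch_op(3),
)
batches = list(pipeline(range(7)))
print(batches)  # [[1, 3, 5], [7, 9, 11], [13]]
```

Because the pipeline is declared as a sequence of operators rather than as one opaque function, a system could in principle inspect and rearrange that sequence before running it; how cedar's optimizer does this is described in the paper itself, not in this sketch.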
Taxonomy
Topics: Advanced Neural Network Applications · IoT and Edge/Fog Computing · Advanced Data Storage Technologies
Methods: Sparse Evolutionary Training
