Helix: Accelerating Human-in-the-loop Machine Learning

Doris Xin; Litian Ma; Jialin Liu; Stephen Macke; Shuchen Song; Aditya Parameswaran

arXiv:1808.01095·cs.LG·August 6, 2018

TL;DR

Helix is a declarative system that accelerates iterative machine learning development by optimizing workflow execution end-to-end, reusing results from previous iterations, and providing visualization tools, reducing total runtime by up to 10x.

Contribution

Helix introduces end-to-end optimization and result reuse for iterative ML workflows, addressing the dynamic nature of ML development.

Findings

01. Achieved up to a 10x reduction in total runtime.

02. Enabled faster iterative development with visualization tools.

03. Demonstrated effectiveness on classification and structured prediction tasks.

Abstract

Data application developers and data scientists spend an inordinate amount of time iterating on machine learning (ML) workflows -- by modifying the data pre-processing, model training, and post-processing steps -- via trial-and-error to achieve the desired model performance. Existing work on accelerating machine learning focuses on speeding up one-shot execution of workflows, failing to address the incremental and dynamic nature of typical ML development. We propose Helix, a declarative machine learning system that accelerates iterative development by optimizing workflow execution end-to-end and across iterations. Helix minimizes the runtime per iteration via program analysis and intelligent reuse of previous results, which are selectively materialized -- trading off the cost of materialization for potential future benefits -- to speed up future iterations. Additionally, Helix offers a…
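
The reuse idea in the abstract can be illustrated with a minimal sketch. It is not Helix's actual optimizer: the `Node` class, the cost fields, the "load whenever a valid stored copy exists" policy, and the `factor=2.0` materialization threshold are all assumptions made for illustration. The sketch marks every node affected by a code change as stale, loads stored results for unaffected nodes instead of recomputing them, and uses a simple rule of thumb to decide whether a result is worth storing.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One step of a workflow DAG (hypothetical structure, not Helix's API)."""
    name: str
    compute_cost: float   # time to recompute this step from its parents
    load_cost: float      # time to read a stored copy of its result
    parents: list = field(default_factory=list)


def stale_nodes(nodes, changed):
    """Names of nodes whose own code, or any ancestor's code, changed."""
    stale = set()

    def is_stale(n):
        if n.name in stale:
            return True
        if n.name in changed or any(is_stale(p) for p in n.parents):
            stale.add(n.name)
            return True
        return False

    for n in nodes:
        is_stale(n)
    return stale


def plan(outputs, stale, materialized):
    """Walk back from the outputs and choose 'load' or 'compute' per node.

    A node with a valid stored result is loaded, so its ancestors are never
    visited. This ignores the case where loading is slower than recomputing.
    """
    decisions = {}

    def visit(n):
        if n.name in decisions:
            return
        if n.name in materialized and n.name not in stale:
            decisions[n.name] = "load"
            return
        decisions[n.name] = "compute"
        for p in n.parents:
            visit(p)

    for out in outputs:
        visit(out)
    return decisions


def plan_cost(nodes, decisions):
    """Total cost of a plan: load cost for loaded nodes, compute cost otherwise."""
    by_name = {n.name: n for n in nodes}
    return sum(
        by_name[name].load_cost if action == "load" else by_name[name].compute_cost
        for name, action in decisions.items()
    )


def cumulative_cost(n):
    """Compute cost of a node plus all of its distinct ancestors."""
    seen = {}

    def collect(m):
        if m.name not in seen:
            seen[m.name] = m.compute_cost
            for p in m.parents:
                collect(p)

    collect(n)
    return sum(seen.values())


def should_materialize(n, factor=2.0):
    """Assumed rule of thumb: store a result only if rebuilding it from
    scratch costs more than `factor` times what loading it would cost."""
    return cumulative_cost(n) > factor * n.load_cost


# Example: only the model step was edited since the previous iteration.
raw = Node("raw", compute_cost=10, load_cost=8)
features = Node("features", compute_cost=30, load_cost=2, parents=[raw])
model = Node("model", compute_cost=60, load_cost=1, parents=[features])
metrics = Node("metrics", compute_cost=5, load_cost=1, parents=[model])
workflow = [raw, features, model, metrics]

stale = stale_nodes(workflow, changed={"model"})
decisions = plan([metrics], stale, materialized={"features", "model"})
```

In this example the stored `features` result is still valid, so the plan loads it and never touches `raw`, while `model` and `metrics` are recomputed because the change invalidated them. A real optimizer would also weigh storage limits and the chance that a result is reused at all, which this sketch leaves out.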
