BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees
Yongjoo Park, Jingyi Qing, Xiaoyang Shen, Barzan Mozafari

TL;DR
BlinkML is a system that enables fast, approximate machine learning model training with probabilistic guarantees on prediction accuracy, significantly reducing training time for large datasets while maintaining model quality.
Contribution
It introduces a novel approach for training ML models on samples with probabilistic guarantees, applicable to models based on maximum likelihood estimation.
Findings
Speeds up training by up to 629x on large datasets
Guarantees, with high probability, that the model trained on a sample makes the same predictions as a model trained on the full dataset
Supports a wide range of MLE-based models
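To make "prediction consistency" concrete, the following minimal sketch fits a logistic regression (one MLE-based model) on a full synthetic dataset and on a 10% uniform sample, then measures how often the two models predict the same label. This is an illustration only, not BlinkML's method: it trains the full model purely to measure agreement, which a practical system would need to avoid. The synthetic data, the sample size, and the helper names (`fit_logistic`, `predict`) are all assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, steps=200, lr=0.5):
    """Maximum likelihood fit of logistic regression by batch gradient ascent."""
    w = [0.0] * (len(rows[0]) + 1)  # last entry is the bias term
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            # zip stops at the shorter sequence, so the bias is added separately.
            score = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
            err = y - sigmoid(score)
            for j, xj in enumerate(x):
                grad[j] += err * xj
            grad[-1] += err
        # Step along the average gradient of the log-likelihood.
        w = [wi + lr * g / n for wi, g in zip(w, grad)]
    return w


def predict(w, x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
    return 1 if score >= 0 else 0


# Hypothetical synthetic data: two Gaussian features, labels drawn from a
# known logistic model.
rng = random.Random(0)
rows = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]
labels = [1 if rng.random() < sigmoid(2 * a - b + 0.5) else 0 for a, b in rows]

# Uniform random sample of 10% of the data.
sample_idx = rng.sample(range(len(rows)), 200)
sample_rows = [rows[i] for i in sample_idx]
sample_labels = [labels[i] for i in sample_idx]

full_w = fit_logistic(rows, labels)
sample_w = fit_logistic(sample_rows, sample_labels)

# Fraction of points on which both models predict the same label.
agreement = sum(predict(full_w, x) == predict(sample_w, x) for x in rows) / len(rows)
print(f"prediction agreement: {agreement:.3f}")
```

A guarantee of the kind described above would bound this agreement from below with a stated probability, rather than measuring it after the fact as the sketch does.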
Abstract
The rising volume of datasets has made training machine learning (ML) models a major computational cost in the enterprise. Given the iterative nature of model and parameter tuning, many analysts use a small sample of their entire data during their initial stage of analysis to make quick decisions (e.g., what features or hyperparameters to use) and use the entire dataset only in later stages (i.e., when they have converged to a specific model). This sampling, however, is performed in an ad-hoc fashion. Most practitioners cannot precisely capture the effect of sampling on the quality of their model, and eventually on their decision-making process during the tuning phase. Moreover, without systematic support for sampling operators, many optimizations and reuse opportunities are lost. In this paper, we introduce BlinkML, a system for fast, quality-guaranteed ML training. BlinkML allows…
