Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines

Chaokun Chang; Eric Lo; Chunxiao Ye

arXiv:2405.11191·cs.DB·May 21, 2024

TL;DR

Biathlon is a new machine learning serving system that exploits model resilience to approximate input features, significantly accelerating inference pipelines while maintaining accuracy within guaranteed bounds.

Contribution

The paper introduces an approach that speeds up ML inference by exploiting model resilience and adaptively approximating aggregation-based input features.

Findings

01. Achieves 5.3x to 16.6x speedup in real pipelines
02. Maintains near-original accuracy with approximation
03. Demonstrates effectiveness on industry and competition datasets

Abstract

Machine learning inference pipelines commonly encountered in data science and industries often require real-time responsiveness due to their user-facing nature. However, meeting this requirement becomes particularly challenging when certain input features require aggregating a large volume of data online. Recent literature on interpretable machine learning reveals that most machine learning models exhibit a notable degree of resilience to variations in input. This suggests that machine learning models can effectively accommodate approximate input features with minimal discernible impact on accuracy. In this paper, we introduce Biathlon, a novel ML serving system that leverages the inherent resilience of models and determines the optimal degree of approximation for each aggregation feature. This approach enables maximum speedup while ensuring a guaranteed bound on accuracy loss. We…
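The core idea in the abstract, approximating an aggregation feature only as precisely as the model needs, can be illustrated with a minimal sketch. This is not Biathlon's algorithm, which the text above does not spell out; it is a simple progressive-sampling loop under stated assumptions. The names `toy_model` and `approximate_inference`, the threshold of 50, the synthetic data, and the confidence multiplier `z` are all hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def toy_model(avg_amount: float) -> int:
    """Hypothetical classifier: outputs 1 when the average exceeds 50."""
    return 1 if avg_amount > 50.0 else 0


def approximate_inference(values, model, z=2.576, start=100, seed=0):
    """Grow a random sample until the model output is the same at both ends
    of the confidence interval of the sampled mean; otherwise fall back to
    the exact mean over all rows. Returns (prediction, rows_used)."""
    rng = random.Random(seed)
    shuffled = values[:]  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    n = min(start, len(shuffled))
    while n < len(shuffled):
        sample = shuffled[:n]
        mean = statistics.fmean(sample)
        stderr = statistics.stdev(sample) / math.sqrt(n)
        lo, hi = mean - z * stderr, mean + z * stderr
        # Assumption: the model is monotone in this feature, so agreeing
        # at both interval ends means it agrees everywhere inside it.
        if model(lo) == model(hi):
            return model(mean), n
        n = min(2 * n, len(shuffled))
    return model(statistics.fmean(shuffled)), len(shuffled)


# Synthetic stand-in for a large table that must be aggregated online.
data_rng = random.Random(42)
values = [data_rng.gauss(60.0, 15.0) for _ in range(200_000)]

prediction, rows_used = approximate_inference(values, toy_model)
exact = toy_model(statistics.fmean(values))
```

In this toy setting the approximate prediction matches the exact one while reading only a small fraction of the rows, because the feature sits far from the model's decision boundary. A feature close to the boundary would force the loop to read more rows, which is the intuition behind choosing a different degree of approximation per feature.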

Code & Models

Repositories

ChaokunChang/Biathlon (official)

Taxonomy

Topics: Scientific Computing and Data Management · Topic Modeling · Explainable Artificial Intelligence (XAI)