Parallelizing Machine Learning as a Service for the End-User
Daniela Loreti, Marco Lippi, Paolo Torroni

TL;DR
This paper presents a distributed architecture to parallelize machine learning services, enabling scalable, efficient processing for growing user bases, demonstrated through a text mining case study.
Contribution
It introduces a scalable distributed architecture for ML services and validates its effectiveness through extensive experiments on a text mining application.
Findings
Significant computational gains with distributed architecture
Effective parallelization of ML pipeline stages
Generalizable approach for similar ML applications
Abstract
As ML applications become ever more pervasive, fully trained systems are increasingly made available to a wide public, allowing end-users to submit queries with their own data and to retrieve results efficiently. As these services grow more sophisticated, a new challenge is how to scale up to ever-growing user bases. In this paper, we present a distributed architecture that could be exploited to parallelize a typical ML system pipeline. We propose a case study consisting of a text mining service and discuss how the method can be generalized to many similar applications. We demonstrate the significance of the computational gain provided by the distributed architecture by way of an extensive experimental evaluation.
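The abstract does not describe the architecture in detail, so the following is only an illustrative sketch of the general idea: independent end-user queries are pushed through the same pipeline stages in parallel. Everything here is an assumption made for illustration. The stage functions `tokenize` and `count_tokens`, the `serve_queries` helper, and the use of a local thread pool in place of distributed workers are hypothetical and are not taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def tokenize(doc: str) -> List[str]:
    # Hypothetical first pipeline stage: lowercase and split on whitespace.
    return doc.lower().split()


def count_tokens(tokens: List[str]) -> int:
    # Hypothetical second stage: a stand-in for a trained model's prediction.
    return len(tokens)


def run_pipeline(doc: str) -> int:
    # Each query flows through the stages sequentially.
    return count_tokens(tokenize(doc))


def serve_queries(docs: List[str], workers: int = 4) -> List[int]:
    # Queries are independent, so they can be processed concurrently.
    # A local thread pool stands in for distributed workers (an assumption).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(run_pipeline, docs))


if __name__ == "__main__":
    print(serve_queries(["a b c", "Hello world", ""]))
```

Because each query is handled independently, adding workers in a sketch like this raises throughput without changing per-query logic. A real deployment would replace the thread pool with a cluster framework and would need to handle data distribution and fault tolerance.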
