Parallelization of Machine Learning Algorithms Respectively on Single Machine and Spark

Jiajun Shen

arXiv:2206.07090 · cs.DC · April 14, 2023

TL;DR

This paper explores the parallelization of classic machine learning algorithms on single machines and Spark, demonstrating significant improvements in runtime and efficiency for large data analysis.

Contribution

It presents a comparative study of traditional versus parallelized machine learning algorithms on a single machine and on the Spark platform, highlighting the performance gains from parallelization.

Findings

01

Parallelized algorithms outperform traditional ones in runtime.

02

Significant efficiency improvements are observed on the Spark platform.

03

Parallelization benefits are consistent across different algorithms.

Abstract

With the rapid development of big data technologies, extracting useful information from massive data has become an essential problem. However, analyzing large data with machine learning algorithms can be time-consuming and inefficient on a traditional single machine. To address this, the paper studies the parallelization of several classic machine learning algorithms, both on a single machine and on the big data platform Spark. We compare the runtime and efficiency of traditional machine learning algorithms with those of their parallelized counterparts on each platform. The results show a significant improvement in runtime and efficiency for the parallelized algorithms.
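
A comparison of runtime and efficiency like the one described in the abstract is often summarized with two standard metrics: speedup (serial runtime divided by parallel runtime) and parallel efficiency (speedup divided by the number of workers). The abstract does not say which definitions the paper uses, so the sketch below is an assumption. The function names `speedup` and `parallel_efficiency` are hypothetical, and the timings in the usage lines are illustrative values, not results from the paper.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Return how many times faster the parallel run is than the serial run."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("runtimes must be positive")
    return serial_seconds / parallel_seconds


def parallel_efficiency(serial_seconds: float, parallel_seconds: float, workers: int) -> float:
    """Return speedup per worker; 1.0 would mean ideal linear scaling."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return speedup(serial_seconds, parallel_seconds) / workers


# Illustrative timings only (assumed, not taken from the paper).
serial_time = 120.0    # seconds for a hypothetical serial run
parallel_time = 30.0   # seconds for the same job on 8 workers
print(speedup(serial_time, parallel_time))
print(parallel_efficiency(serial_time, parallel_time, workers=8))
```

Reporting both numbers is useful because a run can be much faster in absolute terms while still using its workers poorly, which matters when comparing a multi-core single machine with a Spark cluster.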

Taxonomy

Topics: Machine Learning and Data Classification · Big Data Technologies and Applications · Data Mining Algorithms and Applications