A Big Data Analysis Framework Using Apache Spark and Deep Learning

Anand Gupta; Hardeo Thakur; Ritvik Shrivastava; Pulkit Kumar; Sreyashi Nag

arXiv:1711.09279·cs.DB·November 28, 2017

TL;DR

This paper introduces a novel framework combining Apache Spark's distributed computing with deep learning via cascade learning, enabling efficient analysis of large datasets with improved performance over traditional methods.

Contribution

The paper presents a new framework integrating Spark and deep multi-layer perceptrons using cascade learning, enhancing big data analysis capabilities.
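
The summary above does not spell out how the cascade is wired, so the following is only a minimal sketch under stated assumptions: a two-stage cascade in which a first-stage classifier (here a plain logistic regression standing in for a Spark MLlib model) produces a class probability that is appended to each feature vector before a second-stage MLP is trained on the augmented data. It runs on a single machine in pure Python with toy data; the function names (`train_stage1`, `augment`, `train_mlp`) and all hyperparameters are hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_stage1(rows, labels, epochs=200, lr=0.5):
    """Stage 1 (assumed): logistic regression trained with per-sample SGD."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return lambda x: sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def augment(rows, stage1):
    """Cascade step (assumed): append the stage-1 probability to each row."""
    return [list(x) + [stage1(x)] for x in rows]


def train_mlp(rows, labels, hidden=4, epochs=300, lr=0.5, seed=0):
    """Stage 2: one-hidden-layer MLP with sigmoid units, trained with SGD."""
    rng = random.Random(seed)
    n_in = len(rows[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [sigmoid(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
             for j in range(hidden)]
        out = sigmoid(sum(W2[j] * h[j] for j in range(hidden)) + b2)
        return h, out

    for _ in range(epochs):
        for x, y in zip(rows, labels):
            h, out = forward(x)
            d_out = out - y  # cross-entropy gradient at the output unit
            for j in range(hidden):
                # Backpropagate through hidden unit j before updating W2[j].
                d_h = d_out * W2[j] * h[j] * (1.0 - h[j])
                W2[j] -= lr * d_out * h[j]
                for i in range(n_in):
                    W1[j][i] -= lr * d_h * x[i]
                b1[j] -= lr * d_h
            b2 -= lr * d_out
    return lambda x: forward(x)[1]


# Toy, linearly separable data (illustrative only, not from the paper).
rows = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]]
labels = [0, 0, 1, 1]

stage1 = train_stage1(rows, labels)
augmented = augment(rows, stage1)      # original features + stage-1 output
stage2 = train_mlp(augmented, labels)
predictions = [1 if stage2(x) >= 0.5 else 0 for x in augmented]
```

In a distributed setting, the first stage would presumably be trained and applied across a Spark cluster, with only the augmented features handed to the deep model; that split is an assumption of this sketch rather than something the summary confirms.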

Findings

01 Empirical results show improved analysis efficiency.

02 The framework outperforms approaches that use Spark or deep learning alone.

03 Results on real-world datasets are encouraging.

Abstract

With the spreading prevalence of Big Data, many advances have recently been made in this field. Frameworks such as Apache Hadoop and Apache Spark have gained a lot of traction over the past decade and have become massively popular, especially in industry. It is becoming increasingly evident that effective big data analysis is key to solving artificial intelligence problems. Thus, a multi-algorithm library called MLlib was implemented in the Spark framework. While this library supports multiple machine learning algorithms, there is still scope to use the Spark setup efficiently for highly time-intensive and computationally expensive procedures like deep learning. In this paper, we propose a novel framework that combines the distributed computing capabilities of Apache Spark and the advanced machine learning architecture of a deep multi-layer perceptron (MLP), using the popular…
