Scalable Formal Concept Analysis algorithm for large datasets using Spark

Raghavendra K Chunduri; Aswani Kumar Cherukuri

arXiv:1807.02258 · cs.AI · July 9, 2018

Open Access

TL;DR

This paper introduces scalable distributed algorithms for formal concept analysis on large datasets using Apache Spark, addressing the inefficiency of earlier methods built on MapReduce and OpenMP, which are poorly suited to iterative computation.

Contribution

The paper presents novel Spark-based algorithms for formal concept generation and lattice construction, optimized for large datasets and iterative processes.

Findings

01

Proposed algorithms outperform existing methods in efficiency.

02

Algorithms effectively handle large formal contexts.

03

Evaluation shows improved performance metrics.

Abstract

In knowledge discovery and representation over large datasets using formal concept analysis, complexity plays a major role in identifying all the formal concepts and constructing the concept lattice (the digraph of the concepts). Various distributed algorithms are available in the literature for identifying the formal concepts and constructing the digraph from them in very large datasets. However, the existing distributed algorithms are not well suited to concept generation because it is an iterative process. They are implemented using distributed frameworks such as MapReduce and OpenMP, which are not appropriate for iterative applications. Hence, in this paper we propose efficient distributed algorithms for both formal concept generation and concept lattice digraph construction in large formal contexts using Apache Spark.…
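
To make the two tasks named in the abstract concrete, here is a minimal single-machine sketch in Python of what "formal concept generation" and "concept lattice digraph construction" compute. It is not the authors' Spark algorithm: it uses a naive enumeration that closes every attribute subset, which is exponential and only suitable for toy contexts. The context data and all function names (`common_attributes`, `objects_having`, `all_concepts`, `lattice_edges`) are assumptions made for illustration.

```python
from itertools import combinations

# Toy formal context (hypothetical data): object -> set of attributes it has.
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"b", "c"},
    "o3": {"a", "b", "c"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def common_attributes(objects):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    result = set(ATTRIBUTES)
    for obj in objects:
        result &= CONTEXT[obj]
    return frozenset(result)


def objects_having(attributes):
    """Objects that possess every attribute in `attributes`."""
    return frozenset(o for o, attrs in CONTEXT.items() if attributes <= attrs)


def all_concepts():
    """Naive enumeration: close every attribute subset into an (extent, intent) pair."""
    concepts = set()
    attrs = sorted(ATTRIBUTES)
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            extent = objects_having(frozenset(subset))
            intent = common_attributes(extent)
            concepts.add((extent, intent))
    return concepts


def lattice_edges(concepts):
    """Digraph edges (lower, upper): lower's extent is a proper subset of upper's,
    and no third concept's extent lies strictly between them."""
    edges = set()
    for low in concepts:
        for high in concepts:
            if low[0] < high[0] and not any(
                low[0] < mid[0] < high[0] for mid in concepts
            ):
                edges.add((low, high))
    return edges


if __name__ == "__main__":
    found = all_concepts()
    for extent, intent in sorted(found, key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
    print(len(lattice_edges(found)), "edges in the lattice digraph")
```

Each concept pairs a set of objects (the extent) with exactly the attributes they all share (the intent). Concept generation repeatedly applies the two derivation operators above, which is the iterative workload that the abstract argues suits Spark better than MapReduce.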

Taxonomy

Topics: Rough Sets and Fuzzy Logic · Data Mining Algorithms and Applications · Cognitive Computing and Networks