Triclustering in Big Data Setting

Dmitry Egurnov, Dmitry I. Ignatov, and Dmitry Tochilkin

arXiv:2010.12933·cs.DC·October 27, 2020·Complex Data Analytics with Formal Concept Analysi

Open Access

TL;DR

This paper presents scalable triclustering algorithms adapted for distributed computing environments, demonstrating their efficiency, parallelization capabilities, and scalability through complexity analysis and performance comparisons.

Contribution

The paper introduces distributed versions of the OAC family of triclustering algorithms, adapted to MapReduce and parallel environments, together with a complexity analysis and a performance evaluation.

Findings

01 Algorithms show good parallelization capabilities.

02 Distributed implementation improves performance and scalability.

03 Complexity analysis justifies algorithm efficiency.

Abstract

In this paper, we describe versions of triclustering algorithms adapted for efficient computation in distributed environments, using either the MapReduce model or the parallelisation mechanisms provided by modern programming languages. The OAC family of triclustering algorithms parallelises well because each triple of a triadic formal context is processed independently. We provide the time and space complexity of the algorithms and justify their relevance. We also compare the performance gain obtained from a distributed system and assess scalability.
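The independence of triples mentioned in the abstract can be illustrated with a minimal sketch. It assumes the prime-based OAC variant, in which the tricluster generated by a triple (g, m, b) is ((m, b)', (g, b)', (g, m)'), where each prime set collects the third components of all triples that share the given pair. The function names are hypothetical, and this is single-machine Python for illustration, not the authors' MapReduce implementation.

```python
from collections import defaultdict


def prime_sets(triples):
    """Group every triple under its three pair keys (a map/reduce-like pass)."""
    objs = defaultdict(set)   # (m, b) -> objects g with (g, m, b) in the context
    attrs = defaultdict(set)  # (g, b) -> attributes m with (g, m, b) in the context
    conds = defaultdict(set)  # (g, m) -> conditions b with (g, m, b) in the context
    for g, m, b in triples:
        objs[(m, b)].add(g)
        attrs[(g, b)].add(m)
        conds[(g, m)].add(b)
    return objs, attrs, conds


def oac_prime_triclusters(triples):
    """Build one tricluster per triple; duplicates collapse in the result set."""
    objs, attrs, conds = prime_sets(triples)
    result = set()
    for g, m, b in triples:
        # Each iteration reads only the precomputed dictionaries, so the
        # triples could be handled in any order or by separate workers.
        result.add((
            frozenset(objs[(m, b)]),
            frozenset(attrs[(g, b)]),
            frozenset(conds[(g, m)]),
        ))
    return result
```

In a distributed setting, the first function would correspond roughly to a map step that emits three key-value pairs per triple followed by a reduce step that merges values by key, and the second to a per-triple assembly step. That mapping is an assumption made here for illustration.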

Taxonomy

Topics: Cloud Computing and Resource Management · Data Management and Algorithms · Advanced Database Systems and Queries