# Empirical Analysis on Comparing the Performance of Alpha Miner Algorithm in SQL Query Language and NoSQL Column-Oriented Databases Using Apache Phoenix

**Authors:** Kunal Gupta, Astha Sachdev, Ashish Sureka

arXiv: 1703.05481 · 2017-03-17

## TL;DR

This paper compares the performance of the Alpha Miner algorithm implemented in database query languages on a relational (row-oriented) database and on a NoSQL column-oriented database accessed through Apache Phoenix, to determine which better supports process discovery on large event logs.

## Contribution

It provides an empirical performance comparison of Alpha Miner in relational versus NoSQL column-oriented databases for process mining tasks.

## Key findings

- NoSQL databases outperform relational databases in handling large event logs.
- Alpha Miner runs significantly faster on NoSQL column-oriented databases.
- The study offers insights into database selection for scalable process mining applications.

## Abstract

A Process-Aware Information System (PAIS) is an IT system that supports business processes and generates large amounts of event logs from their execution. Each event in a log is represented as a tuple of CaseID, Timestamp, Activity and Actor. Process Mining is a new and emerging field that aims at analyzing event logs to discover, enhance and improve business processes and to check conformance between run-time and design-time business processes. The large volume of event logs generated is stored in databases. Relational databases perform well for a certain class of applications. However, there is a class of applications for which relational databases are not able to scale. To handle this class of applications, NoSQL database systems emerged. Discovering a process model (workflow model) from event logs is one of the most challenging and important Process Mining tasks. The $\alpha$-miner algorithm is one of the first and most widely used Process Discovery techniques. Our objective is to investigate which of the databases (relational or NoSQL) performs better for a Process Discovery application under Process Mining. We implement the $\alpha$-miner algorithm on relational (row-oriented) and NoSQL (column-oriented) databases in database query languages so that our algorithm is tightly coupled to the database. We present a performance benchmark and comparison of the $\alpha$-miner algorithm on a row-oriented database and a NoSQL column-oriented database, to determine which database can efficiently store massive event logs and analyze them in seconds to discover a process model.
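
To make the event-log tuple and the opening steps of the $\alpha$-miner concrete, here is a minimal in-memory Python sketch. It is not the authors' implementation, which is written in database query languages; it only illustrates the standard first steps of the algorithm: grouping events into traces, deriving the direct-succession relation, and building the footprint (causal, parallel, and unrelated activity pairs). The toy log, the actor names, and the function names `build_traces`, `direct_succession`, and `footprint` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy event log: (case_id, timestamp, activity, actor).
EVENT_LOG = [
    (1, 1, "A", "ann"), (1, 2, "B", "bob"), (1, 3, "C", "cat"), (1, 4, "D", "dan"),
    (2, 1, "A", "ann"), (2, 2, "C", "cat"), (2, 3, "B", "bob"), (2, 4, "D", "dan"),
]


def build_traces(log):
    """Group events by case and order each case's activities by timestamp."""
    cases = defaultdict(list)
    for case_id, timestamp, activity, _actor in log:
        cases[case_id].append((timestamp, activity))
    return [
        [activity for _, activity in sorted(events)]
        for _, events in sorted(cases.items())
    ]


def direct_succession(traces):
    """Return every pair (a, b) where b immediately follows a in some trace."""
    return {
        (trace[i], trace[i + 1])
        for trace in traces
        for i in range(len(trace) - 1)
    }


def footprint(traces):
    """Classify each ordered activity pair as causal, reverse-causal, parallel or unrelated."""
    follows = direct_succession(traces)
    activities = sorted({activity for trace in traces for activity in trace})
    relation = {}
    for a in activities:
        for b in activities:
            forward = (a, b) in follows
            backward = (b, a) in follows
            if forward and not backward:
                relation[(a, b)] = "->"   # a causes b
            elif backward and not forward:
                relation[(a, b)] = "<-"   # b causes a
            elif forward and backward:
                relation[(a, b)] = "||"   # a and b can occur in either order
            else:
                relation[(a, b)] = "#"    # a and b never directly follow each other
    return relation


if __name__ == "__main__":
    traces = build_traces(EVENT_LOG)
    for pair, symbol in sorted(footprint(traces).items()):
        print(pair, symbol)
```

In a database-coupled implementation of the kind the abstract describes, each of these steps would presumably be expressed as queries over an event table rather than as in-memory loops; the remaining $\alpha$-miner steps (deriving places and the final Petri net from the footprint) are omitted here.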

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.05481/full.md

## Figures

35 figures with captions in the complete paper: https://tomesphere.com/paper/1703.05481/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/1703.05481/full.md

---
Source: https://tomesphere.com/paper/1703.05481