# Leveraging the process mining technique to optimize data preparation time in a database used as an automated data delivery center

**Authors:** Seyed Hossein Abrehdari

PMC · DOI: 10.1016/j.mex.2025.103428 · MethodsX · 2025-06-12

## TL;DR

This paper shows how process mining, combined with custom automation scripts, can cut database creation time in seismological data centers from 25 days to about 8 days.

## Contribution

An innovative process mining approach was applied to optimize database creation time in seismological data centers.

## Key findings

- Process mining reduced database creation time from 25 days with manual processing to approximately 8 days.
- Custom scripts were used to automate time-consuming database tasks.
- The method is applicable to large scientific datasets in seismology and other fields.
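As a quick check on the headline figure, the reported drop from 25 days to roughly 8 days can be expressed as a percentage reduction and a speed-up factor. The sketch below is illustrative only: the helper name `time_reduction` is hypothetical, and the 8-day value is the approximate figure given in the abstract.

```python
def time_reduction(before_days: float, after_days: float) -> tuple[float, float]:
    """Return (percent reduction, speed-up factor) between two durations."""
    if before_days <= 0 or after_days <= 0:
        raise ValueError("durations must be positive")
    percent = (before_days - after_days) / before_days * 100
    speedup = before_days / after_days
    return percent, speedup


# 25 days (manual) and about 8 days (with process mining), as reported.
percent, speedup = time_reduction(25, 8)
print(f"{percent:.0f}% shorter, {speedup:.3f}x faster")
```

With the reported values this works out to a reduction of about 68%, or a little over three times faster.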

## Abstract

- Technical diagrams (e.g., performance analysis) were depicted for process mining.
- An innovative approach was applied to optimize the time of process operations.
- A specialized process mining map was depicted using the particular event logs.

This study investigates the development and implementation of a seismic database using process mining techniques. Such data are generated and stored in seismic centers such as the U.S. Geological Survey (USGS). The study examines the stages involved in preparing, delivering, and processing a database of almost 900 earthquake waveform records (treated here as big data). The data were sourced from the USGS and cover a region of 388,111.5 km² between 44°–51°E and 38°–42.5°N over the period 1999 to 2018. The findings indicate that process mining reduces the time needed for database creation (request, collection, preparation, and delivery) from 25 days with manual processing to approximately 8 days. In parallel, custom-built software scripts were deployed as unattended tools to streamline the time-consuming phases of database creation. The approach can help optimize the time for creating, storing, and delivering databases in seismological centers or other data centers, especially as efficient management of large scientific datasets becomes increasingly vital. Overall, process mining was used to analyze the workflow of creating a large database, from data request through preparation to delivery.
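To make the idea of mining a workflow from event logs concrete, here is a minimal sketch of a directly-follows analysis, a common building block in process mining. It is not the paper's implementation: the event log below is invented for illustration, the function name `directly_follows` is hypothetical, and only the stage names (request, collection, preparation, delivery) come from the abstract.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical event log: (case id, activity, ISO date). All values are
# invented for illustration; they are not taken from the paper.
EVENT_LOG = [
    ("req-1", "request", "2020-01-01"),
    ("req-1", "collection", "2020-01-03"),
    ("req-1", "preparation", "2020-01-06"),
    ("req-1", "delivery", "2020-01-09"),
    ("req-2", "request", "2020-01-02"),
    ("req-2", "collection", "2020-01-03"),
    ("req-2", "preparation", "2020-01-08"),
    ("req-2", "delivery", "2020-01-10"),
]


def directly_follows(log):
    """Count activity pairs (a, b) where b directly follows a within a case,
    and compute the mean elapsed time in days for each pair."""
    traces = defaultdict(list)
    for case_id, activity, stamp in log:
        traces[case_id].append((datetime.fromisoformat(stamp), activity))

    counts = Counter()
    waits = defaultdict(list)
    for events in traces.values():
        events.sort()  # order each case's events by timestamp
        for (t0, a), (t1, b) in zip(events, events[1:]):
            counts[(a, b)] += 1
            waits[(a, b)].append((t1 - t0).total_seconds() / 86400)

    mean_wait = {pair: sum(v) / len(v) for pair, v in waits.items()}
    return counts, mean_wait


counts, mean_wait = directly_follows(EVENT_LOG)
for pair, n in counts.items():
    print(pair, n, f"{mean_wait[pair]:.1f} days")
```

Transitions with the longest mean wait are the natural candidates for automation, which is the kind of bottleneck a performance-analysis diagram is meant to expose.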

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12268689/full.md

## Figures

16 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12268689/full.md

## References

11 references — full list in the complete paper: https://tomesphere.com/paper/PMC12268689/full.md

---
Source: https://tomesphere.com/paper/PMC12268689