PM-LLM-Benchmark: Evaluating Large Language Models on Process Mining Tasks

Alessandro Berti; Humam Kourani; Wil M.P. van der Aalst

arXiv:2407.13244 · cs.CL · July 19, 2024 · 1 citation

Open Access 1 Repo

TL;DR

This paper introduces PM-LLM-Benchmark, a comprehensive evaluation framework for assessing large language models' capabilities in process mining tasks, highlighting current strengths and limitations of open-source models.

Contribution

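The paper presents the first comprehensive benchmark for LLMs on process mining tasks, covering domain knowledge (both process-mining-specific and process-specific) and different implementation strategies. It also discusses the challenges of building such a benchmark: the public availability of the data and evaluation biases by the LLMs.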

Findings

01 Most of the considered LLMs perform some process mining tasks at a satisfactory level

02 Tiny models that would run on edge devices are still inadequate

03 Evaluation biases affect the ranking of LLMs in process mining

Abstract

Large Language Models (LLMs) have the potential to semi-automate some process mining (PM) analyses. While commercial models are already adequate for many analytics tasks, the competitive level of open-source LLMs in PM tasks is unknown. In this paper, we propose PM-LLM-Benchmark, the first comprehensive benchmark for PM focusing on domain knowledge (process-mining-specific and process-specific) and on different implementation strategies. We focus also on the challenges in creating such a benchmark, related to the public availability of the data and on evaluation biases by the LLMs. Overall, we observe that most of the considered LLMs can perform some process mining tasks at a satisfactory level, but tiny models that would run on edge devices are still inadequate. We also conclude that while the proposed benchmark is useful for identifying LLMs that are adequate for process mining tasks,…

Code & Models

Repositories

fit-alessandro-berti/pm-llm-benchmark (Official)

Taxonomy

Topics: Business Process Modeling and Analysis · Big Data and Business Intelligence