PM-LLM-Benchmark: Evaluating Large Language Models on Process Mining Tasks
Alessandro Berti, Humam Kourani, Wil M.P. van der Aalst

TL;DR
This paper introduces PM-LLM-Benchmark, a comprehensive evaluation framework for assessing large language models' capabilities in process mining tasks, highlighting current strengths and limitations of open-source models.
Contribution
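It presents the first comprehensive benchmark for process mining with LLMs, covering domain knowledge (process-mining-specific and process-specific) and different implementation strategies, and discusses the challenges of building such a benchmark: public data availability and evaluation biases by the LLMs.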
Findings
Most of the considered LLMs perform some process mining tasks at a satisfactory level
Tiny models that would run on edge devices are still inadequate for process mining tasks
Evaluation biases affect the ranking of LLMs in process mining
Abstract
Large Language Models (LLMs) have the potential to semi-automate some process mining (PM) analyses. While commercial models are already adequate for many analytics tasks, the competitive level of open-source LLMs in PM tasks is unknown. In this paper, we propose PM-LLM-Benchmark, the first comprehensive benchmark for PM focusing on domain knowledge (process-mining-specific and process-specific) and on different implementation strategies. We focus also on the challenges in creating such a benchmark, related to the public availability of the data and on evaluation biases by the LLMs. Overall, we observe that most of the considered LLMs can perform some process mining tasks at a satisfactory level, but tiny models that would run on edge devices are still inadequate. We also conclude that while the proposed benchmark is useful for identifying LLMs that are adequate for process mining tasks,…
Taxonomy
Topics: Business Process Modeling and Analysis · Big Data and Business Intelligence
Methods: Focus
