Human-Centered Evaluation of an LLM-Based Process Modeling Copilot: A Mixed-Methods Study with Domain Experts
Chantale Lauer, Peter Pfeiffer, Nijat Mehdiyev

TL;DR
This study evaluates an LLM-based BPMN modeling assistant through human-centered methods, revealing usability-trust gaps and emphasizing the importance of human factors in assessing AI tools for process modeling.
Contribution
It provides the first human-centered evaluation of an LLM-powered process modeling copilot, highlighting trust issues and output quality challenges in real-world expert use.
Findings
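Perceived usability was moderate (mean CUQ score: 67.2/100)
Trust in the system was notably lower (mean score: 48.8%), with reliability rated as the most critical concern (M=1.8/5)
Experts identified output-quality issues, prompting difficulties, and a need for the LLM to ask more in-depth clarifying questions about the process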
Abstract
Integrating Large Language Models (LLMs) into business process management tools promises to democratize Business Process Model and Notation (BPMN) modeling for non-experts. While automated frameworks assess syntactic and semantic quality, they miss human factors like trust, usability, and professional alignment. We conducted a mixed-methods evaluation of our proposed solution, an LLM-powered BPMN copilot, with five process modeling experts using focus groups and standardized questionnaires. Our findings reveal a critical tension between acceptable perceived usability (mean CUQ score: 67.2/100) and notably lower trust (mean score: 48.8%), with reliability rated as the most critical concern (M=1.8/5). Furthermore, we identified output-quality issues, prompting difficulties, and a need for the LLM to ask more in-depth clarifying questions about the process. We envision five use cases…
Taxonomy
Topics: Business Process Modeling and Analysis · Artificial Intelligence in Law · Robotic Process Automation Applications
