Leveraging Generative AI for Extracting Process Models from Multimodal Documents

Marvin Voelter; Raheleh Hadian; Timotheus Kampik; Marius Breitmayer; Manfred Reichert

arXiv:2406.04959 · cs.SE · June 10, 2024

TL;DR

This paper explores the use of Generative Pre-trained Transformers (GPTs) to automatically generate graphical process models from combined text and image inputs, providing a new dataset and evaluation framework.

Contribution

It introduces a novel multi-modal dataset, evaluation metrics, and open-source code for assessing GPTs in process model generation from multimodal data.

Findings

01 GPTs show potential for semi-automated process modeling

02 Evaluation metrics enable systematic assessment

03 Open-source tools facilitate future research

Abstract

This paper presents an investigation of the capabilities of Generative Pre-trained Transformers (GPTs) to auto-generate graphical process models from multi-modal (i.e., text- and image-based) inputs. More precisely, we first introduce a small dataset as well as a set of evaluation metrics that allow for a ground truth-based evaluation of multi-modal process model generation capabilities. We then conduct an initial evaluation of commercial GPT capabilities using zero-, one-, and few-shot prompting strategies. Our results indicate that GPTs can be useful tools for semi-automated process modeling based on multi-modal inputs. More importantly, the dataset and evaluation metrics as well as the open-source evaluation code provide a structured framework for continued systematic evaluations moving forward.
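To make the idea of a ground truth-based evaluation concrete, the following is a minimal sketch of one possible comparison: it scores the activity labels of a generated process model against those of a reference model using precision, recall, and F1. This is an illustrative assumption, not the metric set defined in the paper; the names `normalize`, `label_overlap`, and `Overlap` are hypothetical, and a real evaluation would likely also compare control flow and other model elements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Overlap:
    """Set-overlap scores between generated and ground-truth labels."""
    precision: float
    recall: float
    f1: float


def normalize(label: str) -> str:
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(label.lower().split())


def label_overlap(generated: list[str], ground_truth: list[str]) -> Overlap:
    """Compare activity labels as sets (hypothetical metric, exact match only)."""
    gen = {normalize(label) for label in generated}
    ref = {normalize(label) for label in ground_truth}
    hits = len(gen & ref)

    # Guard against empty inputs to avoid division by zero.
    precision = hits / len(gen) if gen else 0.0
    recall = hits / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return Overlap(precision, recall, f1)


if __name__ == "__main__":
    # Made-up labels for illustration only; not taken from the paper's dataset.
    generated = ["Receive order", "check  stock", "Ship goods", "Send invoice"]
    ground_truth = ["Receive Order", "Check stock", "Ship goods"]
    print(label_overlap(generated, ground_truth))
```

Exact matching after normalization is deliberately strict: a generated label such as "Verify stock" would count as a miss against "Check stock", so a semantic similarity measure may be a better fit when labels are paraphrased.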

Code & Models

Repositories

SAP-samples/multimodal-generative-ai-for-bpm (Official)

Taxonomy

Topics: Business Process Modeling and Analysis · Semantic Web and Ontologies · Service-Oriented Architecture and Web Services