Structured Extraction from Business Process Diagrams Using Vision-Language Models

Pritam Deka; Barry Devereux

arXiv:2511.22448·cs.AI·December 1, 2025

TL;DR

This paper introduces a pipeline that combines vision-language models (VLMs) with OCR to extract structured JSON representations of BPMN diagrams directly from images, so that diagrams can be analyzed when the source model files are unavailable.

Contribution

It presents a method that uses VLMs, with OCR for textual enrichment, to extract structured BPMN element lists from images, removing the dependence on source XML files. The extracted elements are evaluated against ground truth derived from those XML files.

Findings

01 OCR text enrichment improves component extraction in several of the benchmarked VLMs

02 Benchmarking shows that effectiveness varies across VLMs

03 Statistical analysis clarifies the impact of OCR

Abstract

Business Process Model and Notation (BPMN) is a widely adopted standard for representing complex business workflows. While BPMN diagrams are often exchanged as visual images, existing methods primarily rely on XML representations for computational analysis. In this work, we present a pipeline that leverages Vision-Language Models (VLMs) to extract structured JSON representations of BPMN diagrams directly from images, without requiring source model files or textual annotations. We also incorporate optical character recognition (OCR) for textual enrichment and evaluate the generated element lists against ground truth data derived from the source XML files. Our approach enables robust component extraction in scenarios where original source files are unavailable. We benchmark multiple VLMs and observe performance improvements in several models when OCR is used for text enrichment. In…
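
The evaluation step described in the abstract (comparing a generated element list against ground truth derived from the source XML) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the JSON shape (`{"elements": [{"type": ..., "name": ...}]}`), the subset of element types in `ELEMENT_TYPES`, the case-insensitive label matching, and the use of precision, recall, and F1 as the score. Only the BPMN 2.0 model namespace is standard.

```python
import json
import xml.etree.ElementTree as ET
from collections import Counter

# Standard BPMN 2.0 model namespace.
BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"

# Hypothetical subset of element types to score (an assumption, not the paper's list).
ELEMENT_TYPES = {
    "task", "userTask", "serviceTask",
    "startEvent", "endEvent",
    "exclusiveGateway", "parallelGateway",
}


def normalize(label):
    """Lowercase a label and collapse whitespace so minor differences do not count as misses."""
    return " ".join((label or "").lower().split())


def elements_from_bpmn_xml(xml_text):
    """Build a ground-truth list of (type, name) pairs from BPMN XML."""
    root = ET.fromstring(xml_text)
    prefix = "{" + BPMN_NS + "}"
    items = []
    for node in root.iter():
        if not node.tag.startswith(prefix):
            continue
        local_name = node.tag[len(prefix):]
        if local_name in ELEMENT_TYPES:
            items.append((local_name, normalize(node.get("name", ""))))
    return items


def elements_from_json(json_text):
    """Read a predicted element list from the assumed JSON shape."""
    data = json.loads(json_text)
    return [(e["type"], normalize(e.get("name", ""))) for e in data["elements"]]


def score(predicted, truth):
    """Return (precision, recall, f1) using multiset matching on (type, name) pairs."""
    matched = sum((Counter(predicted) & Counter(truth)).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Multiset matching via `Counter` intersection keeps duplicate elements (for example, two unnamed gateways) from being counted more than once. A stricter or fuzzier label comparison could be substituted in `normalize` without changing the rest of the sketch.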

Code & Models

Datasets

pritamdeka/BPMN-VLM · dataset · 56 downloads

Taxonomy

Topics: Business Process Modeling and Analysis · Data Visualization and Analytics · Robotic Process Automation Applications