WONDERBREAD: A Benchmark for Evaluating Multimodal Foundation Models on Business Process Management Tasks
Michael Wornow, Avanika Narayan, Ben Viggiano, Ishan S. Khare, Tathagat Verma, Tibor Thompson, Miguel Angel Fuentes Hernandez, Sudharsan Sundar, Chloe Trujillo, Krrish Chawla, Rongfei Lu, Justin Shen, Divya Nagaraj, Joshua Martinez, Vardhan Agrawal, Althea Hudson

TL;DR
WONDERBREAD introduces a comprehensive benchmark with a dataset and tasks for evaluating multimodal foundation models on diverse business process management activities beyond automation, highlighting current model strengths and weaknesses.
Contribution
WONDERBREAD provides the first dataset, tasks, and evaluation tools for assessing multimodal models on BPM tasks beyond automation, addressing a significant research gap.
Findings
State-of-the-art models recall 88% of workflow steps in videos
Models struggle with fine-grained validation (F1 < 0.3)
Benchmark encourages development of human-centered enterprise AI tools
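The two headline numbers above are a recall figure and an F1 score. As a minimal sketch of how such metrics are commonly computed (this is not the benchmark's own evaluation code, and the function names `step_recall` and `binary_f1`, the exact-match comparison of step strings, and the toy inputs are all assumptions made for illustration):

```python
def step_recall(reference_steps, predicted_steps):
    """Fraction of reference workflow steps that also appear in the prediction.

    Assumption: steps are compared by exact string match, which is a
    simplification chosen only to keep the example short.
    """
    if not reference_steps:
        return 0.0
    predicted = set(predicted_steps)
    hits = sum(1 for step in reference_steps if step in predicted)
    return hits / len(reference_steps)


def binary_f1(true_labels, predicted_labels):
    """F1 score for binary labels (truthy = positive class)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical toy data, not drawn from the dataset.
    reference = ["open page", "click login", "enter email", "submit"]
    predicted = ["open page", "enter email", "close tab"]
    print(step_recall(reference, predicted))
    print(binary_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

Recall only penalizes missed steps, while F1 also penalizes false positives, which is one reason a model can score well on the first and poorly on the second.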
Abstract
Existing ML benchmarks lack the depth and diversity of annotations needed for evaluating models on business process management (BPM) tasks. BPM is the practice of documenting, measuring, improving, and automating enterprise workflows. However, research has focused almost exclusively on one task - full end-to-end automation using agents based on multimodal foundation models (FMs) like GPT-4. This focus on automation ignores the reality of how most BPM tools are applied today - simply documenting the relevant workflow takes 60% of the time of the typical process optimization project. To address this gap we present WONDERBREAD, the first benchmark for evaluating multimodal FMs on BPM tasks beyond automation. Our contributions are: (1) a dataset containing 2928 documented workflow demonstrations; (2) 6 novel BPM tasks sourced from real-world applications ranging from workflow documentation…
Taxonomy
Topics: Business Process Modeling and Analysis · Service-Oriented Architecture and Web Services · Collaboration in agile enterprises
Methods: Attention Is All You Need · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Position-Wise Feed-Forward Layer · Dropout · Adam · Linear Layer · Absolute Position Encodings
