Towards a Benchmark for Causal Business Process Reasoning with LLMs
Fabiana Fournier, Lior Limonad, Inna Skarbovsky

TL;DR
This paper introduces a benchmark for evaluating and improving large language models' ability to reason about causal and process-oriented aspects of business operations, facilitating better decision-making and process management.
Contribution
It proposes a novel benchmark dataset for assessing LLMs' reasoning in business process contexts, including a systematic approach to generate questions and answers based on causal business scenarios.
Findings
Benchmark enables evaluation of LLMs' reasoning skills in business processes.
Provides a scalable dataset for training LLMs to improve causal reasoning.
Accessible at https://huggingface.co/datasets/ibm/BPC for testing and training.
Abstract
Large Language Models (LLMs) are increasingly used to boost organizational efficiency and automate tasks. Although LLMs were not originally designed for complex cognitive processes, recent efforts have extended their use to activities such as reasoning, planning, and decision-making. In business processes, such abilities could be invaluable for leveraging the massive corpora LLMs have been trained on to gain a deep understanding of such processes. In this work, we plant the seeds for the development of a benchmark to assess the ability of LLMs to reason about causal and process perspectives of business operations. We refer to this view as Causally-augmented Business Processes (BP^C). The core of the benchmark comprises a set of BP^C related situations, a set of questions about these situations, and a set of deductive rules employed to systematically resolve the ground truth…
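To make the three ingredients named in the abstract (situations, questions, and deductive rules that resolve the ground truth) more concrete, here is a minimal sketch in Python. It is purely illustrative: the excerpt does not give the benchmark's actual schema or rules, so the Situation class, the activity names, and the single rule used here ("A causally affects B if a directed causal path leads from A to B") are all assumptions, not the authors' method.

```python
from dataclasses import dataclass, field


@dataclass
class Situation:
    """Toy situation: activities linked by directed 'causes' edges (hypothetical schema)."""
    causes: dict[str, set[str]] = field(default_factory=dict)

    def add_cause(self, source: str, target: str) -> None:
        # Record that `source` directly causes `target`.
        self.causes.setdefault(source, set()).add(target)


def causally_affects(situation: Situation, source: str, target: str) -> bool:
    """Assumed deductive rule: true if a directed causal path runs from source to target."""
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        for successor in situation.causes.get(node, ()):
            if successor == target:
                return True
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return False


def resolve_ground_truth(situation: Situation, question: tuple[str, str]) -> str:
    """Answer a 'does X causally affect Y?' question by applying the rule above."""
    source, target = question
    return "yes" if causally_affects(situation, source, target) else "no"


# Hypothetical order-handling situation.
order_process = Situation()
order_process.add_cause("approve_order", "ship_order")
order_process.add_cause("ship_order", "send_invoice")
```

Because the answer is derived mechanically from the situation, the same rule can label any number of generated questions, for example resolve_ground_truth(order_process, ("approve_order", "send_invoice")).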
Taxonomy
Topics: Business Process Modeling and Analysis · Semantic Web and Ontologies · Multi-Agent Systems and Negotiation
Methods: Sparse Evolutionary Training
