Towards a Benchmark for Causal Business Process Reasoning with LLMs

Fabiana Fournier; Lior Limonad; Inna Skarbovsky

arXiv:2406.05506·cs.AI·August 13, 2024

TL;DR

This paper introduces a benchmark for evaluating and improving large language models' ability to reason about causal and process-oriented aspects of business operations, facilitating better decision-making and process management.

Contribution

It proposes a novel benchmark dataset for assessing LLMs' reasoning in business process contexts, including a systematic approach to generate questions and answers based on causal business scenarios.

Findings

01. The benchmark enables evaluation of LLMs' reasoning about the causal and process perspectives of business processes.

02. It provides a scalable dataset for training LLMs to improve their causal reasoning.

03. It is accessible at https://huggingface.co/datasets/ibm/BPC for testing and training.

Abstract

Large Language Models (LLMs) are increasingly used for boosting organizational efficiency and automating tasks. While not originally designed for complex cognitive processes, recent efforts have further extended to employ LLMs in activities such as reasoning, planning, and decision-making. In business processes, such abilities could be invaluable for leveraging on the massive corpora LLMs have been trained on for gaining deep understanding of such processes. In this work, we plant the seeds for the development of a benchmark to assess the ability of LLMs to reason about causal and process perspectives of business operations. We refer to this view as Causally-augmented Business Processes (BP^C). The core of the benchmark comprises a set of BP^C related situations, a set of questions about these situations, and a set of deductive rules employed to systematically resolve the ground truth…
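
The three components named in the abstract (situations, questions, and deductive rules that resolve the ground truth) can be pictured with a minimal sketch. Everything in it is an illustrative assumption rather than the benchmark's actual schema or rule set: the `Situation` and `Question` classes, the activity names, and the single transitivity rule are hypothetical and do not come from the paper or the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    """Hypothetical situation: direct (cause, effect) pairs between activities."""
    causes: frozenset


@dataclass(frozen=True)
class Question:
    """Hypothetical yes/no question: does `source` causally influence `target`?"""
    source: str
    target: str


def resolve(situation: Situation, question: Question) -> bool:
    """Derive the ground truth with one assumed rule: causal influence is transitive."""
    frontier = [question.source]
    seen = set()
    while frontier:
        node = frontier.pop()
        for cause, effect in situation.causes:
            if cause == node and effect not in seen:
                if effect == question.target:
                    return True
                seen.add(effect)
                frontier.append(effect)
    return False


def accuracy(situation: Situation, questions: list, answers: list) -> float:
    """Fraction of model answers that match the rule-derived ground truth."""
    correct = sum(resolve(situation, q) == a for q, a in zip(questions, answers))
    return correct / len(questions)


# Toy situation with made-up activity names (not taken from the dataset).
situation = Situation(causes=frozenset({
    ("receive order", "check credit"),
    ("check credit", "approve order"),
}))
questions = [
    Question("receive order", "approve order"),
    Question("approve order", "receive order"),
]
# Stand-in for answers returned by an LLM under evaluation.
model_answers = [True, True]
score = accuracy(situation, questions, model_answers)
```

Because the ground truth is computed by a rule rather than annotated by hand, a setup of this shape can score any number of generated questions without additional labeling, which is one plausible reading of the "systematic" resolution the abstract describes.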

Code & Models

Repositories

IBM/SAX/tree/main/NLP4BPM2024 (official · PyTorch)

Datasets

ibm-research/BPC
dataset · 16 dl

Taxonomy

Topics: Business Process Modeling and Analysis · Semantic Web and Ontologies · Multi-Agent Systems and Negotiation

Methods: Sparse Evolutionary Training