CELLO: Causal Evaluation of Large Vision-Language Models

Meiqi Chen; Bo Peng; Yan Zhang; Chaochao Lu

arXiv:2406.19131 · cs.CV · June 28, 2024

TL;DR

This paper introduces CELLO, a dataset and evaluation framework for assessing and improving the causal reasoning abilities of large vision-language models (LVLMs). It shows that current LVLMs struggle with causal reasoning and that the CELLO-CoT prompting strategy improves their performance.

Contribution

The paper presents a new dataset, CELLO, with explicit causal graphs and a novel prompting strategy, CELLO-CoT, to evaluate and enhance causal reasoning in LVLMs.

Findings

1. LVLMs struggle with causal reasoning tasks.
2. CELLO-CoT improves model performance on causal questions.
3. Explicit causal graphs aid in understanding model reasoning.

Abstract

Causal reasoning is fundamental to human intelligence and crucial for effective decision-making in real-world environments. Despite recent advancements in large vision-language models (LVLMs), their ability to comprehend causality remains unclear. Previous work typically focuses on commonsense causality between events and/or actions, which is insufficient for applications like embodied agents and lacks the explicitly defined causal graphs required for formal causal reasoning. To overcome these limitations, we introduce a fine-grained and unified definition of causality involving interactions between humans and/or objects. Building on the definition, we construct a novel dataset, CELLO, consisting of 14,094 causal questions across all four levels of causality: discovery, association, intervention, and counterfactual. This dataset surpasses traditional commonsense causality by including…

Code & Models

Repositories

opencausalab/cello (PyTorch, official)

Videos

CELLO: Causal Evaluation of Large Vision-Language Models

Taxonomy

Topics: Multimodal Machine Learning Applications