CauSight: Learning to Supersense for Visual Causal Discovery

Yize Zhang; Meiqi Chen; Sirui Chen; Bo Peng; Yanxi Zhang; Tianyu Li; Chaochao Lu

arXiv:2512.01827·cs.CV·December 2, 2025

Open Access · 1 model · 1 dataset

TL;DR

CauSight is a vision-language model for visual causal discovery. Trained on the VCG-32K dataset of annotated causal graphs, with Tree-of-Causal-Thought (ToCT) reasoning trajectories and reinforcement learning, it infers cause-effect relations among visual entities and outperforms GPT-4.1 on this task.

Contribution

The paper introduces the task of visual causal discovery, constructs the VCG-32K dataset of over 32,000 images annotated with entity-level causal graphs, and develops CauSight, a vision-language model trained to perform the task.

Findings

01. CauSight outperforms GPT-4.1 on visual causal discovery.

02. The improvement over GPT-4.1 is an absolute gain of more than 21%.

03. The paper introduces the VCG-32K dataset and the Tree-of-Causal-Thought (ToCT) framework for synthesizing reasoning trajectories.

Abstract

Causal thinking enables humans to understand not just what is seen, but why it happens. To replicate this capability in modern AI systems, we introduce the task of visual causal discovery. It requires models to infer cause-and-effect relations among visual entities across diverse scenarios instead of merely perceiving their presence. To this end, we first construct the Visual Causal Graph dataset (VCG-32K), a large-scale collection of over 32,000 images annotated with entity-level causal graphs, and further develop CauSight, a novel vision-language model to perform visual causal discovery through causally aware reasoning. Our training recipe integrates three components: (1) training data curation from VCG-32K, (2) Tree-of-Causal-Thought (ToCT) for synthesizing reasoning trajectories, and (3) reinforcement learning with a designed causal reward to refine the reasoning policy. Experiments…
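To make the idea of an entity-level causal graph more concrete, here is a minimal Python sketch. It represents a graph as a set of directed cause-to-effect edges between named entities and scores a predicted graph against a reference graph with edge-level F1. Everything in it is an assumption for illustration: the `CausalEdge` type, the `edge_f1` function, and the example entities are hypothetical, and edge-level F1 is a generic stand-in, not the VCG-32K annotation format or the causal reward described in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CausalEdge:
    """A directed edge: `cause` is claimed to bring about `effect` (hypothetical type)."""
    cause: str
    effect: str


def edge_f1(predicted: set, reference: set) -> float:
    """Edge-level F1 between two causal graphs given as sets of CausalEdge.

    Assumption: direction matters, so a reversed edge counts as wrong.
    This is a generic graph-matching score, not the paper's reward.
    """
    if not predicted and not reference:
        return 1.0  # two empty graphs agree trivially
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical scene: rain makes the road wet, and the wet road makes a car skid.
reference = {
    CausalEdge("rain", "wet road"),
    CausalEdge("wet road", "skidding car"),
}
# A prediction that gets one edge right and reverses the other.
predicted = {
    CausalEdge("rain", "wet road"),
    CausalEdge("skidding car", "wet road"),
}

score = edge_f1(predicted, reference)
print(f"edge F1 = {score:.2f}")
```

Treating the graph as a set of directed pairs makes the difference between perceiving entities and discovering causal structure explicit: both graphs above contain the same three entities, yet the reversed edge is penalized.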

Code & Models

Models

🤗 OpenCausaLab/CauSight
model · 11 downloads · 5 likes

Datasets

OpenCausaLab/VCG-32K
dataset · 35 downloads

Taxonomy

Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference