# ChainReaction: Causal Chain-Guided Reasoning for Modular and Explainable Causal-Why Video Question Answering

**Authors:** Paritosh Parmar, Eric Peh, Basura Fernando

arXiv: 2508.21010 · 2025-12-25

## TL;DR

ChainReaction introduces a modular causal reasoning framework for VideoQA that uses interpretable causal chains, improving explainability, accuracy, and generalization over existing black-box models.

## Contribution

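The paper proposes a two-stage architecture: a Causal Chain Extractor (CCE) that generates natural-language causal chains from video-question pairs, and a Causal Chain-Driven Answerer (CCDA) that derives answers grounded in those chains. It also introduces a scalable method for generating causal chain annotations from existing datasets (human-verified for 46K samples) and CauCo, a new evaluation metric for causality-oriented captioning.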

## Key findings

- Outperforms state-of-the-art models on three large-scale benchmarks.
- Yields substantial gains in explainability and user trust for Causal-Why VideoQA.
- Improves generalization, positioning the CCE as a reusable causal reasoning engine across diverse domains.

## Abstract

Existing Causal-Why Video Question Answering (VideoQA) models often struggle with higher-order reasoning, relying on opaque, monolithic pipelines that entangle video understanding, causal inference, and answer generation. These black-box approaches offer limited interpretability and tend to depend on shallow heuristics. We propose a novel, modular paradigm that explicitly decouples causal reasoning from answer generation, introducing natural language causal chains as interpretable intermediate representations. Inspired by human cognitive models, these structured cause-effect sequences bridge low-level video content with high-level causal reasoning, enabling transparent and logically coherent inference. Our two-stage architecture comprises a Causal Chain Extractor (CCE) that generates causal chains from video-question pairs, and a Causal Chain-Driven Answerer (CCDA) that derives answers grounded in these chains. To address the lack of annotated reasoning traces, we introduce a scalable method for generating accurate causal chains from existing datasets. We construct human-verified causal chains for 46K samples. We also propose CauCo, a new evaluation metric for causality-oriented captioning. Experiments on three large-scale benchmarks demonstrate that our approach not only outperforms state-of-the-art models, but also yields substantial gains in explainability, user trust, and generalization, positioning the CCE as a reusable causal reasoning engine across diverse domains. Project page: https://paritoshparmar.github.io/chainreaction/
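
The decoupling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `CausalChain`, `answer_why`, `toy_extractor`, and `toy_answerer` are hypothetical, the video is reduced to a string reference, and both stages are trivial stand-ins for learned models. The sketch also assumes the answerer sees only the question, the candidate answers, and the chain, which the abstract does not state explicitly.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class CausalChain:
    """Ordered natural-language events, earliest cause first (hypothetical type)."""
    events: Tuple[str, ...]

    def as_text(self) -> str:
        return " -> ".join(self.events)


# Hypothetical stage signatures, chosen for illustration only.
Extractor = Callable[[str, str], CausalChain]                 # (video_ref, question) -> chain
Answerer = Callable[[str, Sequence[str], CausalChain], int]   # -> index of chosen option


def answer_why(video_ref: str, question: str, options: Sequence[str],
               extractor: Extractor, answerer: Answerer) -> Tuple[CausalChain, str]:
    """Stage 1 produces an inspectable chain; stage 2 answers from that chain."""
    chain = extractor(video_ref, question)
    index = answerer(question, options, chain)
    return chain, options[index]


def toy_extractor(video_ref: str, question: str) -> CausalChain:
    # Stand-in for a learned extractor: returns a fixed, hand-written chain.
    return CausalChain((
        "the boy drops the ball",
        "the ball rolls away",
        "the dog chases the ball",
    ))


def toy_answerer(question: str, options: Sequence[str], chain: CausalChain) -> int:
    # Stand-in for a learned answerer: pick the option sharing the most words
    # with the chain text. It never touches the video.
    chain_words = set(chain.as_text().lower().split())
    overlaps = [len(chain_words & set(option.lower().split())) for option in options]
    return overlaps.index(max(overlaps))


OPTIONS = [
    "it was hungry",
    "it chases the ball that rolls away",
    "the boy called it",
]

chain, answer = answer_why("clip_001.mp4", "Why did the dog run?",
                           OPTIONS, toy_extractor, toy_answerer)
print(chain.as_text())
print(answer)
```

Because the chain is returned alongside the answer, a reader can inspect the intermediate reasoning, and either stage can be swapped out independently, which is the property the abstract attributes to the modular design.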

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.21010/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/2508.21010/full.md

---
Source: https://tomesphere.com/paper/2508.21010