CaFlow: Enhancing Long-Term Action Quality Assessment with Causal Counterfactual Flow

Ruisheng Han; Kanglei Zhou; Shuang Chen; Amir Atapour-Abarghouei; Hubert P. H. Shum

arXiv:2511.21653 · cs.CV · November 27, 2025

TL;DR

CaFlow introduces a novel framework combining causal de-confounding and bidirectional flow modeling to improve long-term action quality assessment, achieving state-of-the-art results in challenging video analysis tasks.

Contribution

The paper presents CaFlow, a unified approach that integrates counterfactual de-confounding with bidirectional flow to enhance robustness and accuracy in long-term action quality assessment.

Findings

01. Achieves state-of-the-art performance on multiple benchmarks.

02. Effectively disentangles causal and confounding features.

03. Produces smoother, more coherent representations of long-term actions.

Abstract

Action Quality Assessment (AQA) predicts fine-grained execution scores from action videos and is widely applied in sports, rehabilitation, and skill evaluation. Long-term AQA, as in figure skating or rhythmic gymnastics, is especially challenging since it requires modeling extended temporal dynamics while remaining robust to contextual confounders. Existing approaches either depend on costly annotations or rely on unidirectional temporal modeling, making them vulnerable to spurious correlations and unstable long-term representations. To this end, we propose CaFlow, a unified framework that integrates counterfactual de-confounding with bidirectional time-conditioned flow. The Causal Counterfactual Regularization (CCR) module disentangles causal and confounding features in a self-supervised manner and enforces causal robustness through counterfactual interventions, while the BiT-Flow…
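
The idea of a counterfactual intervention mentioned in the abstract can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the paper's CCR module, whose details the abstract does not give. It assumes each sample's features are already split into a causal part and a confounding part, uses a hypothetical linear scorer, and builds the counterfactual by swapping in another sample's confounding part. The names `score` and `counterfactual_penalty` are illustrative only.

```python
from typing import Sequence


def score(causal: Sequence[float], confounder: Sequence[float],
          w_causal: Sequence[float], w_conf: Sequence[float]) -> float:
    """Hypothetical linear scorer over the two feature parts (an assumption)."""
    return (sum(w * x for w, x in zip(w_causal, causal))
            + sum(w * x for w, x in zip(w_conf, confounder)))


def counterfactual_penalty(causal_feats: Sequence[Sequence[float]],
                           conf_feats: Sequence[Sequence[float]],
                           w_causal: Sequence[float],
                           w_conf: Sequence[float]) -> float:
    """Mean squared change in score when each sample keeps its causal part
    but receives the next sample's confounding part (cyclic shift)."""
    n = len(causal_feats)
    total = 0.0
    for i in range(n):
        j = (i + 1) % n  # index of the sample that donates its confounder
        factual = score(causal_feats[i], conf_feats[i], w_causal, w_conf)
        counterfactual = score(causal_feats[i], conf_feats[j], w_causal, w_conf)
        total += (factual - counterfactual) ** 2
    return total / n


# Two samples with identical causal features but different confounders.
causal = [[1.0], [1.0]]
conf = [[0.0], [2.0]]

# A scorer that ignores the confounder is unaffected by the swap.
robust = counterfactual_penalty(causal, conf, [1.0], [0.0])
# A scorer that relies on the confounder changes its prediction.
fragile = counterfactual_penalty(causal, conf, [1.0], [1.0])
```

In this toy setting the penalty is zero only when the prediction does not depend on the confounding part, which is the sense in which such a term could encourage robustness to contextual confounders.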

Taxonomy

Topics: Human Pose and Action Recognition · Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis