CaFlow: Enhancing Long-Term Action Quality Assessment with Causal Counterfactual Flow
Ruisheng Han, Kanglei Zhou, Shuang Chen, Amir Atapour-Abarghouei, Hubert P. H. Shum

TL;DR
CaFlow combines causal de-confounding with bidirectional flow modeling to improve long-term action quality assessment, and reports state-of-the-art results on multiple benchmarks.
Contribution
The paper presents CaFlow, a unified approach that integrates counterfactual de-confounding with bidirectional flow to enhance robustness and accuracy in long-term action quality assessment.
Findings
Achieves state-of-the-art performance on multiple benchmarks.
Effectively disentangles causal and confounding features.
Produces smoother, more coherent representations of long-term actions.
Abstract
Action Quality Assessment (AQA) predicts fine-grained execution scores from action videos and is widely applied in sports, rehabilitation, and skill evaluation. Long-term AQA, as in figure skating or rhythmic gymnastics, is especially challenging since it requires modeling extended temporal dynamics while remaining robust to contextual confounders. Existing approaches either depend on costly annotations or rely on unidirectional temporal modeling, making them vulnerable to spurious correlations and unstable long-term representations. To this end, we propose CaFlow, a unified framework that integrates counterfactual de-confounding with bidirectional time-conditioned flow. The Causal Counterfactual Regularization (CCR) module disentangles causal and confounding features in a self-supervised manner and enforces causal robustness through counterfactual interventions, while the BiT-Flow…
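As a rough illustration of the counterfactual-intervention idea the abstract describes, the sketch below keeps each sample's "causal" feature part, swaps in another sample's "confounding" part, and penalizes any change in the predicted score. This is not the authors' CCR module: the fixed index split, the linear score head, the roll-by-one donor choice, and every function name here are assumptions made for illustration only. The abstract says only that the disentanglement is self-supervised and does not specify the losses.

```python
def split_features(x, n_causal):
    """Hypothetical fixed split: the first n_causal dims are treated as
    'causal', the rest as 'confounding' (the paper learns this split)."""
    return x[:n_causal], x[n_causal:]


def score(causal, confounder, w_causal, w_conf):
    """Toy linear score head over both feature parts (an assumption)."""
    return (sum(w * c for w, c in zip(w_causal, causal))
            + sum(w * c for w, c in zip(w_conf, confounder)))


def counterfactual_penalty(batch, n_causal, w_causal, w_conf):
    """Mean squared change in score when each sample's confounding part
    is replaced by the next sample's (a simple counterfactual swap)."""
    parts = [split_features(x, n_causal) for x in batch]
    n = len(parts)
    total = 0.0
    for i, (causal, conf) in enumerate(parts):
        donor_conf = parts[(i + 1) % n][1]  # intervention: borrow a confounder
        factual = score(causal, conf, w_causal, w_conf)
        counterfactual = score(causal, donor_conf, w_causal, w_conf)
        total += (factual - counterfactual) ** 2
    return total / n
```

In this toy setting, a score head that ignores the confounding part gives a penalty of zero, while one that leans on it is penalized. Minimizing such a term would push the predictor toward relying on the causal part only.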
Taxonomy
Topics: Human Pose and Action Recognition · Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis
