Actionable Interpretability via Causal Hypergraphs: Unravelling Batch Size Effects in Deep Learning

Zhongtian Sun; Anoushka Harit; Pietro Lio

arXiv:2506.17826 · cs.LG · June 24, 2025

TL;DR

This paper introduces HGCNet, a hypergraph-based causal framework that uncovers how batch size affects deep learning generalisation through higher-order interactions, providing actionable insights for training strategies.

Contribution

HGCNet is the first framework to use hypergraphs and deep structural causal models to analyse batch size effects in deep learning, capturing higher-order interactions in training dynamics.
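To make the hypergraph idea concrete, here is a minimal sketch of an incidence matrix in which a single hyperedge can join more than two training variables at once. The node names and hyperedge memberships are hypothetical, picked for illustration from the quantities named in the abstract; they are not HGCNet's actual graph construction.

```python
import numpy as np

# Hypothetical training variables used as hypergraph nodes.
NODES = ["batch_size", "gradient_noise", "sharpness", "complexity", "generalisation"]

# Hypothetical hyperedges: each one groups several variables that interact jointly.
HYPEREDGES = {
    "e0": {"batch_size", "gradient_noise", "sharpness"},
    "e1": {"gradient_noise", "sharpness", "generalisation"},
    "e2": {"batch_size", "complexity", "generalisation"},
}


def incidence_matrix(nodes, hyperedges):
    """Return H with H[i, j] = 1 when node i belongs to hyperedge j."""
    index = {name: i for i, name in enumerate(nodes)}
    H = np.zeros((len(nodes), len(hyperedges)), dtype=int)
    for j, members in enumerate(hyperedges.values()):
        for name in members:
            H[index[name], j] = 1
    return H


H = incidence_matrix(NODES, HYPEREDGES)
node_degree = H.sum(axis=1)  # number of hyperedges each node belongs to
edge_degree = H.sum(axis=0)  # number of nodes inside each hyperedge
```

An ordinary graph would force every edge degree to be exactly 2. Here each hyperedge contains three nodes, which is what allows a single relation to express a joint, higher-order interaction.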

Findings

01. Smaller batch sizes causally improve generalisation.

02. Increased stochasticity leads to flatter minima.

03. HGCNet outperforms baselines including GCN, GAT, PI-GNN, BERT, and RoBERTa on citation networks, biomedical text, and e-commerce reviews.

Abstract

While the impact of batch size on generalisation is well studied in vision tasks, its causal mechanisms remain underexplored in graph and text domains. We introduce a hypergraph-based causal framework, HGCNet, that leverages deep structural causal models (DSCMs) to uncover how batch size influences generalisation via gradient noise, minima sharpness, and model complexity. Unlike prior approaches based on static pairwise dependencies, HGCNet employs hypergraphs to capture higher-order interactions across training dynamics. Using do-calculus, we quantify direct and mediated effects of batch size interventions, providing interpretable, causally grounded insights into optimisation. Experiments on citation networks, biomedical text, and e-commerce reviews show that HGCNet outperforms strong baselines including GCN, GAT, PI-GNN, BERT, and RoBERTa. Our analysis reveals that smaller batch sizes…
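The distinction between direct and mediated effects of a batch size intervention can be illustrated with a toy linear structural causal model. The sketch below is not the paper's DSCM: the causal chain (batch size to gradient noise to sharpness to generalisation gap), the function `mean_gap`, and every coefficient are assumptions chosen for illustration only.

```python
import numpy as np

# Hypothetical linear coefficients, chosen for illustration only
# (not estimated from the paper's experiments).
A_NOISE = -0.8   # batch size -> gradient noise (larger batches, less noise)
B_SHARP = -0.5   # gradient noise -> minima sharpness (more noise, flatter minima)
C_GAP = 0.6      # sharpness -> generalisation gap
D_DIRECT = 0.1   # batch size -> generalisation gap, bypassing the mediators


def mean_gap(batch_direct, batch_mediated, n=10_000, seed=0):
    """Mean generalisation gap when the direct path sees `batch_direct`
    and the noise -> sharpness path sees `batch_mediated`."""
    rng = np.random.default_rng(seed)  # same seed, so same exogenous noise per call
    e_noise, e_sharp, e_gap = rng.normal(size=(3, n))
    noise = A_NOISE * batch_mediated + e_noise
    sharp = B_SHARP * noise + e_sharp
    gap = C_GAP * sharp + D_DIRECT * batch_direct + e_gap
    return float(gap.mean())


# Compare do(batch = 1) with do(batch = 0) on a standardised batch-size scale.
baseline = mean_gap(0.0, 0.0)
total_effect = mean_gap(1.0, 1.0) - baseline
direct_effect = mean_gap(1.0, 0.0) - baseline          # mediators held at baseline
mediated_effect = mean_gap(1.0, 1.0) - mean_gap(1.0, 0.0)
```

In this linear toy model the total effect splits exactly into the direct coefficient `D_DIRECT` plus the product `A_NOISE * B_SHARP * C_GAP` along the mediated path. A deep structural causal model would replace the linear equations with learned functions, so the decomposition would have to be estimated rather than read off the coefficients.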
