Actionable Interpretability via Causal Hypergraphs: Unravelling Batch Size Effects in Deep Learning
Zhongtian Sun, Anoushka Harit, Pietro Lio

TL;DR
This paper introduces HGCNet, a hypergraph-based causal framework that uncovers how batch size affects deep learning generalisation through higher-order interactions, providing actionable insights for training strategies.
Contribution
According to the authors, HGCNet is the first framework to combine hypergraphs with deep structural causal models (DSCMs) to analyse batch size effects in deep learning, capturing higher-order interactions across training dynamics.
Findings
Smaller batch sizes causally improve generalisation.
Increased stochasticity leads to flatter minima.
HGCNet outperforms baselines including GCN, GAT, PI-GNN, BERT, and RoBERTa on citation networks, biomedical text, and e-commerce reviews.
Abstract
While the impact of batch size on generalisation is well studied in vision tasks, its causal mechanisms remain underexplored in graph and text domains. We introduce a hypergraph-based causal framework, HGCNet, that leverages deep structural causal models (DSCMs) to uncover how batch size influences generalisation via gradient noise, minima sharpness, and model complexity. Unlike prior approaches based on static pairwise dependencies, HGCNet employs hypergraphs to capture higher-order interactions across training dynamics. Using do-calculus, we quantify direct and mediated effects of batch size interventions, providing interpretable, causally grounded insights into optimisation. Experiments on citation networks, biomedical text, and e-commerce reviews show that HGCNet outperforms strong baselines including GCN, GAT, PI-GNN, BERT, and RoBERTa. Our analysis reveals that smaller batch sizes…
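To make the idea of "direct and mediated effects of batch size interventions" concrete, here is a minimal sketch of a mediation analysis on a toy linear structural causal model. It is not HGCNet and does not use hypergraphs or learned DSCMs. The class `ToySCM`, the function `effects`, the causal chain (batch size to gradient noise to sharpness to generalisation gap), and every coefficient are hypothetical, chosen only for illustration. Variables are treated as centred, unitless quantities, with batch size on a log2 scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySCM:
    # Hypothetical linear coefficients; none of these come from the paper.
    noise_from_batch: float = -0.5  # larger batch -> less gradient noise
    sharp_from_noise: float = -0.8  # more noise -> flatter (less sharp) minima
    gap_from_sharp: float = 0.6     # sharper minima -> larger generalisation gap
    gap_from_batch: float = 0.1     # direct path from batch size to the gap

    def noise(self, batch: float) -> float:
        return self.noise_from_batch * batch

    def sharpness(self, noise: float) -> float:
        return self.sharp_from_noise * noise

    def gap(self, batch: float, sharpness: float) -> float:
        return self.gap_from_sharp * sharpness + self.gap_from_batch * batch


def effects(scm: ToySCM, b0: float, b1: float) -> tuple[float, float, float]:
    """Effects on the gap of switching the intervention do(B=b0) to do(B=b1)."""
    s0 = scm.sharpness(scm.noise(b0))  # mediator value under do(B=b0)
    s1 = scm.sharpness(scm.noise(b1))  # mediator value under do(B=b1)
    total = scm.gap(b1, s1) - scm.gap(b0, s0)
    # Direct effect: change B while holding the mediator at its b0 value.
    direct = scm.gap(b1, s0) - scm.gap(b0, s0)
    # Mediated effect: hold B at b1 and let only the mediator change.
    mediated = scm.gap(b1, s1) - scm.gap(b1, s0)
    return total, direct, mediated


if __name__ == "__main__":
    # Compare batch size 32 (log2 = 5) with batch size 256 (log2 = 8).
    total, direct, mediated = effects(ToySCM(), 5.0, 8.0)
    print(f"total={total:.2f} direct={direct:.2f} mediated={mediated:.2f}")
```

With these made-up coefficients the larger batch widens the generalisation gap, and most of that change flows through the noise and sharpness path rather than the direct path. In a linear model without interactions, the direct and mediated parts sum exactly to the total effect. That decomposition need not hold in the nonlinear, higher-order setting the paper targets.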
