Do Deep Ensembles Actually Capture Uncertainty in Graph Neural Networks?
Pedro C. Vieira, Pedro Ribeiro, Viacheslav Borovitskiy

TL;DR
This study critically evaluates deep ensembles for graph neural networks and finds that they improve uncertainty quantification only marginally, because independently trained members collapse to near-identical predictions.
Contribution
The paper demonstrates that deep ensembles in graph neural networks suffer from epistemic collapse, reducing their effectiveness for uncertainty estimation.
Findings
Ensembles provide minimal gains over single models in GNNs.
Disagreement among ensemble members is significantly reduced, indicating collapse.
The collapse is driven by convexity in function space rather than in weight space.
Abstract
While deep ensembles are widely considered to be the default method for uncertainty quantification in deep learning, their effectiveness for graph-structured data is often simply assumed based on successes in domains like computer vision. We investigate standard deep ensembles specifically for message-passing graph neural networks. Benchmarking across seven datasets representing varied tasks and complexities, we reveal that ensembles provide surprisingly little improvement over a single model. Instead, the observed marginal gains stem primarily from stabilizing optimization noise in point predictions rather than yielding meaningfully better uncertainty estimates. Through an aleatoric-epistemic decomposition, we identify epistemic collapse: independently trained networks consistently converge to overly similar predictions. Because disagreement is the fundamental mechanism through which…
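To make the aleatoric-epistemic decomposition mentioned in the abstract concrete, here is a minimal sketch of one common entropy-based variant for classification. It is an assumption that this variant resembles what the paper uses; the abstract does not specify the exact formulation. The function names and the toy probability vectors are hypothetical. Total uncertainty is taken as the entropy of the averaged prediction, aleatoric uncertainty as the average entropy of the individual members, and epistemic uncertainty as their difference, which is zero when all members agree.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution given as a list."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def decompose_uncertainty(member_probs):
    """Entropy-based decomposition for a single node (or graph).

    member_probs: list of M class-probability lists, one per ensemble member.
    Returns (total, aleatoric, epistemic), all in nats.
    NOTE: this is an assumed, commonly used formulation, not necessarily
    the one used in the paper.
    """
    m = len(member_probs)
    k = len(member_probs[0])
    # Ensemble prediction: average the members' class probabilities.
    mean_probs = [sum(p[c] for p in member_probs) / m for c in range(k)]
    total = entropy(mean_probs)                              # entropy of the mean
    aleatoric = sum(entropy(p) for p in member_probs) / m    # mean of the entropies
    epistemic = total - aleatoric                            # disagreement term
    return total, aleatoric, epistemic


# Hypothetical toy predictions from three ensemble members.
collapsed = [[0.6, 0.4], [0.6, 0.4], [0.6, 0.4]]  # members agree
diverse = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]    # members disagree

_, _, epistemic_collapsed = decompose_uncertainty(collapsed)
_, _, epistemic_diverse = decompose_uncertainty(diverse)
```

In this toy setting the collapsed ensemble yields an epistemic term of essentially zero, while the diverse ensemble yields a strictly positive one. This illustrates why near-identical member predictions leave an ensemble with little epistemic signal to report.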
