TL;DR
This paper investigates first-mover bias in gradient boosting explanations, showing that it contributes to SHAP attribution instability under multicollinearity and that DASH and seed-averaging restore stability.
Contribution
It identifies first-mover bias as a mechanistic contributor to attribution instability, introduces DASH to mitigate it, and provides diagnostic tools for detecting the bias.
Findings
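Scaling up a single model worsens attribution reproducibility: a Large Single Model (LSM) matching the proposed method's total tree count was the least reproducible approach tested.
At rho=0.9, DASH and seed-averaging (Stochastic Retrain) both reach stability of about 0.977, versus 0.958 for Single Best and 0.938 for LSM.
On Breast Cancer, DASH raises stability from 0.376 to 0.925 and outperforms Stochastic Retrain by 0.063; on simulated data at rho=0.9 the two are about equal.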
Abstract
We identify first-mover bias -- path-dependent concentration of SHAP feature importance from sequential residual fitting in gradient boosting -- as a mechanistic contributor to attribution instability under multicollinearity. Scaling up a single model amplifies this effect: a Large Single Model matching our method's total tree count produces the poorest attribution reproducibility of any approach tested. We show that model independence largely neutralizes first-mover bias. Both DASH (Diversified Aggregation of SHAP) and simple seed-averaging (Stochastic Retrain) restore stability by breaking the sequential dependency chain. At rho=0.9, both achieve stability ~0.977, while Single Best degrades to 0.958 and LSM to 0.938. On Breast Cancer, DASH improves stability from 0.376 to 0.925 (+0.549), outperforming Stochastic Retrain by +0.063. Under nonlinear DGPs, the advantage emerges at…
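The idea that averaging over independently trained models restores stability can be illustrated with a minimal sketch. It makes several assumptions that are not from the paper: the importance vectors are invented toy values (features 0 and 1 stand in for a highly correlated pair), stability is taken to be the mean pairwise Pearson correlation between importance vectors, and the helper names are hypothetical. It shows plain seed-averaging only, not DASH.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean


def pearson(a, b):
    """Pearson correlation between two equal-length vectors."""
    mean_a, mean_b = fmean(a), fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def mean_pairwise_stability(vectors):
    """Assumed stability score: average Pearson correlation over all pairs."""
    return fmean([pearson(a, b) for a, b in combinations(vectors, 2)])


def average_attributions(vectors):
    """Element-wise mean of importance vectors from independent models."""
    return [fmean(column) for column in zip(*vectors)]


# Invented toy importances (NOT results from the paper). Each row is one
# independently seeded model; credit for the correlated pair (features 0
# and 1) lands mostly on whichever feature that model happened to favor.
single_models = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.7, 0.2, 0.1],
    [0.3, 0.6, 0.1],
]

# Two seed-averaged explanations, each built from two independent models.
ensembles = [
    average_attributions(single_models[:2]),
    average_attributions(single_models[2:]),
]

single_stability = mean_pairwise_stability(single_models)
ensemble_stability = mean_pairwise_stability(ensembles)

print(f"single-model stability:  {single_stability:.3f}")
print(f"seed-averaged stability: {ensemble_stability:.3f}")
```

In this toy setting the averaged vectors agree with each other far more closely than the individual models do, because the arbitrary choice of which correlated feature receives credit cancels out across independent models.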
