First-Mover Bias in Gradient Boosting Explanations: Mechanism, Detection, and Resolution

Drake Caraker; Bryan Arnold; David Rhoads

arXiv:2603.22346·cs.LG·April 8, 2026

TL;DR

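This paper investigates first-mover bias in gradient boosting explanations: the path-dependent concentration of SHAP feature importance that makes attributions unstable under multicollinearity. It shows that averaging over independently trained models, via DASH (Diversified Aggregation of SHAP) or simple seed-averaging, restores stability.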

Contribution

The paper identifies first-mover bias as a mechanistic contributor to attribution instability, introduces DASH to mitigate it, and provides diagnostic tools for detecting it.

Findings

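01 Scaling up a single model worsens attribution reproducibility: a Large Single Model with the same total tree count as DASH is the least reproducible approach tested.

02 At rho = 0.9, DASH and seed-averaging both reach stability of about 0.977, versus 0.958 for Single Best and 0.938 for the Large Single Model.

03 DASH outperforms Stochastic Retrain on real data (Breast Cancer, +0.063 stability) and on simulated data under nonlinear DGPs.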

Abstract

We identify first-mover bias (path-dependent concentration of SHAP feature importance arising from sequential residual fitting in gradient boosting) as a mechanistic contributor to attribution instability under multicollinearity. Scaling up a single model amplifies this effect: a Large Single Model (LSM) matching our method's total tree count produces the poorest attribution reproducibility of any approach tested. We show that model independence largely neutralizes first-mover bias. Both DASH (Diversified Aggregation of SHAP) and simple seed-averaging (Stochastic Retrain) restore stability by breaking the sequential dependency chain. At rho = 0.9, both achieve stability of about 0.977, while Single Best degrades to 0.958 and LSM to 0.938. On Breast Cancer, DASH improves stability from 0.376 to 0.925 (+0.549), outperforming Stochastic Retrain by +0.063. Under nonlinear data-generating processes (DGPs), the advantage emerges at…
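The seed-averaging idea in the abstract can be illustrated with a small toy sketch. Everything here is an assumption made for illustration and is not taken from the paper or its repository: the boosting loop is a hand-rolled stump booster in pure Python, importance is measured as each feature's share of total split gain (a stand-in for SHAP importance), and the names make_data, best_stump, fit_importance, and seed_average are hypothetical. It is not an implementation of DASH.

```python
import random
from statistics import mean, pstdev

def make_data(n, rho, rng):
    """Two features with correlation ~rho; the target depends on both equally."""
    X, y = [], []
    for _ in range(n):
        z, e = rng.gauss(0, 1), rng.gauss(0, 1)
        x1 = z
        x2 = rho * z + (1 - rho ** 2) ** 0.5 * e
        X.append((x1, x2))
        y.append(x1 + x2 + rng.gauss(0, 0.1))
    return X, y

def best_stump(X, r, rows):
    """Best single split on residuals r: (gain, feature, threshold, left_mean, right_mean)."""
    best = None
    n = len(rows)
    total = sum(r[i] for i in rows)
    for j in range(2):
        order = sorted(rows, key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += r[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Reduction in squared error from splitting after position k.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best

def fit_importance(X, y, seed, n_trees=30, lr=0.3, subsample=0.5):
    """Fit one boosted model sequentially; return each feature's share of total gain."""
    rng = random.Random(seed)
    n = len(X)
    pred = [0.0] * n
    gain_by_feature = [0.0, 0.0]
    for _ in range(n_trees):
        rows = rng.sample(range(n), int(subsample * n))
        resid = [y[i] - pred[i] for i in range(n)]  # each tree fits what earlier trees left
        gain, j, thr, left_val, right_val = best_stump(X, resid, rows)
        gain_by_feature[j] += gain
        for i in range(n):
            pred[i] += lr * (left_val if X[i][j] <= thr else right_val)
    total_gain = sum(gain_by_feature)
    return [g / total_gain for g in gain_by_feature]

def seed_average(X, y, seeds):
    """Average importance shares over independently seeded models."""
    runs = [fit_importance(X, y, seed=s) for s in seeds]
    return [mean(col) for col in zip(*runs)]

X, y = make_data(200, 0.9, random.Random(0))
singles = [fit_importance(X, y, seed=s)[0] for s in range(10)]
averaged = [seed_average(X, y, seeds=range(10 * g, 10 * g + 5))[0] for g in range(1, 6)]
print("spread of feature-0 share, single models:", pstdev(singles))
print("spread of feature-0 share, 5-seed averages:", pstdev(averaged))
```

Comparing the two printed spreads gives a rough sense of how much run-to-run variation in the importance share remains after averaging independent models. The toy says nothing about the stability figures reported in the abstract, which come from the authors' own experiments.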

Code & Models

Repositories

DrakeCaraker/dash-shap (GitHub)
