Deciphering Shortcut Learning from an Evolutionary Game Theory Perspective
Xiayang Li, Kuo Gai, Shihua Zhang

TL;DR
This paper uses evolutionary game theory to analyze how shortcut bias forms during neural network training, revealing the roles of the optimizer (GD versus SGD) and of noise in this process.
Contribution
It provides a formal definition of core and shortcut features and models their competition using evolutionary game theory, offering new insights into the origins of shortcut bias.
Findings
GD and SGD lead to two distinct stochastically stable states: GD primarily optimizes the shortcut subnetwork, while SGD primarily optimizes the core subnetwork
Data noise and optimization noise influence the formation of shortcut bias
Theoretical framework characterizes shortcut bias dynamics and mitigation strategies
Abstract
Shortcut learning causes deep learning models to rely on non-essential features within the data. However, its formation in deep neural network training still lacks theoretical understanding. In this paper, we provide a formal definition of core and shortcut features and employ evolutionary game theory to analyze the origins of shortcut bias by modeling data samples as players and their corresponding neural tangent features as strategies, assuming the existence of core and shortcut subnetworks. We find that gradient descent (GD) and stochastic gradient descent (SGD) lead to two distinct stochastically stable states, each corresponding to a different strategy. The former primarily optimizes the shortcut subnetwork, while the latter primarily optimizes the core subnetwork. We investigate the influence of these strategies on shortcut bias through a continuous stochastic differential…
