Suboptimal and trait-like reinforcement learning strategies correlate with midbrain encoding of prediction errors
Liran Szlak, Kristoffer Aberg, Rony Paz

TL;DR
This study reveals that probability-matching in reinforcement learning is linked to midbrain encoding of negative prediction errors and is a stable trait within individuals, challenging the notion that it is purely sub-optimal behavior.
Contribution
The paper introduces a computational model explaining the full spectrum of reinforcement learning strategies and links probability-matching to specific neural encoding patterns.
Findings
Probability-matching correlates with increased integration of negative outcomes.
Midbrain BOLD signal couples more strongly with negative prediction errors during probability-matching.
Individual probability-matching tendencies are consistent across multiple conditions.
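The first finding can be illustrated with a small sketch. It is not the authors' model, whose exact form is not given in this summary. It assumes a generic Rescorla-Wagner-style value update with separate learning rates for positive and negative prediction errors. The names `update_value`, `run`, `alpha_pos`, `alpha_neg`, and the outcome sequence are all hypothetical.

```python
def update_value(q, reward, alpha_pos, alpha_neg):
    """One value update with asymmetric learning rates (illustrative assumption)."""
    delta = reward - q  # prediction error: outcome minus expectation
    # Hypothetical split: negative errors are weighted by alpha_neg, others by alpha_pos.
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return q + alpha * delta


def run(outcomes, alpha_pos, alpha_neg, q0=0.5):
    """Apply the update over a sequence of binary outcomes, starting from q0."""
    q = q0
    for r in outcomes:
        q = update_value(q, r, alpha_pos, alpha_neg)
    return q


# Made-up outcome sequence for a mostly rewarded option (1 = reward, 0 = no reward).
outcomes = [1, 1, 0, 1, 0, 1, 1]

low_neg = run(outcomes, alpha_pos=0.2, alpha_neg=0.05)   # weak integration of negative outcomes
high_neg = run(outcomes, alpha_pos=0.2, alpha_neg=0.4)   # strong integration of negative outcomes

print(f"value with low alpha_neg:  {low_neg:.3f}")
print(f"value with high alpha_neg: {high_neg:.3f}")
```

Under these assumptions, a learner that weights negative prediction errors more heavily ends with a lower value estimate for the better option, which would leave more room for choosing the alternative.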
Abstract
During probabilistic learning, organisms often apply a sub-optimal "probability-matching" strategy, where selection rates match reward probabilities, rather than engaging in the optimal "maximization" strategy, where the option with the highest reward probability is always selected. Despite decades of research, the mechanisms contributing to probability-matching are still under debate, and particularly noteworthy is that no differences between probability-matching and maximization strategies have been reported at the level of the brain. Here, we provide theoretical proof for a computational model that explains the complete range of behaviors between pure maximization and pure probability-matching. Fitting this model to the behavior of 60 participants performing a probabilistic reinforcement learning task during fMRI scanning confirmed the model-derived prediction that probability-matching…
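Why probability-matching counts as sub-optimal can be shown with a short calculation. The sketch below assumes a two-option task with reward probabilities of 0.7 and 0.3. These values are hypothetical and are not taken from the paper. The helper `expected_reward_rate` is likewise an illustrative name.

```python
def expected_reward_rate(p_reward, p_choose_a):
    """Expected reward per trial in a two-option task.

    p_reward:   (p_a, p_b), reward probabilities of options A and B.
    p_choose_a: probability of selecting option A on a trial.
    """
    p_a, p_b = p_reward
    return p_choose_a * p_a + (1 - p_choose_a) * p_b


# Hypothetical reward probabilities (assumption, not from the paper).
p_reward = (0.7, 0.3)

# Maximization: always select the option with the higher reward probability.
maximizing = expected_reward_rate(p_reward, p_choose_a=1.0)

# Probability-matching: select each option at a rate equal to its reward probability.
matching = expected_reward_rate(p_reward, p_choose_a=0.7)

print(f"maximization: {maximizing:.2f}")
print(f"matching:     {matching:.2f}")
```

With these assumed probabilities, maximization yields an expected reward rate of 0.70 per trial, while matching yields 0.7 × 0.7 + 0.3 × 0.3 = 0.58.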
Taxonomy
Topics: Neural dynamics and brain function · Functional Brain Connectivity Studies · Animal Behavior and Reproduction
