# Pupil size response within direct and random exploration and exploitation behaviors selectively reflects value of control

**Authors:** Gili Barkay, Shai Gabay, Uri Hertz

PMC · DOI: 10.3389/fpsyg.2026.1752586 · Frontiers in Psychology · 2026-03-03

## TL;DR

This study shows that changes in pupil size during decision-making reflect the value of control in exploration and exploitation behaviors.

## Contribution

The study links pupil-linked arousal to strategic control in exploration and exploitation decisions.

## Key findings

- Participants exploited more when value gaps were larger and explored more in long-horizon conditions.
- Pupil size increased during exploitative choices with short horizons or small value differences, indicating higher arousal.
- Pupillary responses showed sustained pre-decision modulation rather than discrete peaks.

## Abstract

Balancing exploration and exploitation is central to adaptive decision-making and is thought to depend on interactions between arousal-related neuromodulation and strategic control. The present study examined how pupil-indexed arousal corresponds to different aspects of exploration and exploitation decisions.

We used the Horizon Task, which independently manipulated value of control through value uncertainty, information asymmetries, and choice horizon. Thirty-five participants completed 320 mini-games while pupil diameter was continuously recorded, with analyses focused on the first free-choice trial.

Behaviorally, participants exploited more when value gaps were larger, preferentially sampled the option with fewer prior observations, and explored more in long-horizon conditions, where additional choices enabled the use of newly acquired information. These results replicate established patterns of directed and random exploration. Pupillary responses, however, showed a selective profile. For exploitative choices, though not for exploratory choices, pupil size increased when horizons were short and when value differences were small. This pattern indicates greater arousal during decisions with higher immediate importance or increased discrimination demands, reflecting an increased value of control. Trial-by-trial analyses revealed sustained pre-decision modulation rather than discrete phasic peaks.

Together, these findings allow integration of the value-of-control approach with exploitative and exploratory control modes, highlighting how strategic demands within each mode shape pupil-linked arousal.
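
The behavioral patterns described in the abstract (more exploitation with larger value gaps, a preference for the less-sampled option, and more exploration in long horizons) are often summarized in the Horizon Task literature with a logistic choice rule. The sketch below is a minimal illustration of that kind of rule. It is not necessarily the model fitted in this paper, and the function name `p_choose_a`, the `PARAMS` table, and every numeric value in it are hypothetical, chosen only to make the qualitative patterns visible.

```python
import math

# Hypothetical, illustrative parameters; NOT estimates from the paper.
# info_bonus stands in for directed exploration (value added to the
# less-sampled option); noise stands in for random exploration.
PARAMS = {
    "short": {"info_bonus": 0.0, "noise": 4.0},
    "long": {"info_bonus": 6.0, "noise": 8.0},
}


def p_choose_a(mean_a, mean_b, n_a, n_b, horizon):
    """Probability of choosing option A on the first free-choice trial.

    mean_a, mean_b: observed mean rewards from the forced-choice trials.
    n_a, n_b: number of prior observations of each option.
    horizon: "short" or "long" (assumed labels for the horizon condition).
    """
    params = PARAMS[horizon]
    delta_r = mean_a - mean_b  # value gap in favor of A
    # +1 if A was sampled less than B, -1 if more, 0 if equal
    delta_i = (n_b > n_a) - (n_a > n_b)
    z = (delta_r + params["info_bonus"] * delta_i) / params["noise"]
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Larger value gap -> higher probability of exploiting the better option.
    print(p_choose_a(60, 40, 2, 2, "short"), p_choose_a(50, 40, 2, 2, "short"))
    # Equal means, A sampled less, long horizon -> A is preferred.
    print(p_choose_a(50, 50, 1, 3, "long"))
```

Under these assumed parameters, raising `noise` in the long horizon flattens the choice curve (more choices of the lower-valued option), while `info_bonus` shifts choices toward the option with fewer prior observations.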

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13040367/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13040367/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/PMC13040367/full.md

---
Source: https://tomesphere.com/paper/PMC13040367