# Fluctuations in Sequential Many-Alternative Decisions Reveal Strategies Beyond Immediate Reward Maximisation

**Authors:** Alice Vidal, Francesco Damiani, Alireza Valyan, Salvador Soto-Faraco, Rubén Moreno-Bote

PMC · DOI: 10.5334/joc.467 · Journal of Cognition · 2025-11-18

## TL;DR

People's resource allocation in sequential decisions fluctuates over time, and these fluctuations may help them adapt to changing environments even when they do not maximise immediate reward.

## Contribution

The study uses the breadth-depth dilemma, a novel experimental protocol, to show that decision fluctuations reflect adaptive strategies beyond reward maximisation.

## Key findings

- Participants' resource allocation fluctuates more than is optimal and adapts to the available capacity and the environmental context.
- Sequential strategies such as 'save-for-later' and reward-history-dependent choice contribute to decision variability.
- Excess fluctuations can improve future decisions by reducing uncertainty and increasing strategy diversity.

## Abstract

Humans are strategic animals. We constantly make prospective choices, allocating limited resources in situations with uncertain future outcomes. The management of our finite monthly budget, financial investments, or the allocation of time to the different questions in an exam are just a few examples. In these scenarios, both decision-making and resource allocation tend to fluctuate over time even under an invariable set of constraints. However, it is unclear whether these fluctuations affect performance and whether they underlie additional objectives beyond pure reward maximisation. We address these questions using the breadth-depth dilemma, a novel ecological protocol where participants engage in sequential multiple-choice scenarios characterised by limited capacity. We designed two experimental environments. In one environment, optimal performance, formalised with an ideal allocator model, is associated with homogeneous resource allocation across consecutive choices. In contrast, in the other environment, fluctuating resource allocation leads to greater expected rewards. Our study evaluates participants’ adherence to these scenarios and measures fluctuations as deviation from homogeneous allocations. The results revealed that participants’ behaviour fluctuates more than is optimal, but critically, behavioural fluctuations adapt to the available capacity and the environmental context. Moreover, our findings unveil pronounced sequential strategies, such as save-for-later and reward-history-dependent choice, further implying that these strategies contribute to decision variability. An extension of the optimal allocator model demonstrates that the characteristic excess fluctuations facilitate better-informed future choices (information gain), reduce uncertainty (risk avoidance), and generate diverse potential strategies (entropy seeking). Although they have a modest impact on performance, these strategies may reflect advantageous behaviours in the long run under ever-changing real-world environments.
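
To make "fluctuations as deviation from homogeneous allocations" concrete, the following is a minimal sketch of one possible measure, not the paper's actual metric. It assumes that an allocation sequence is a list of resource amounts spent on consecutive choices and that fluctuation is scored as the root-mean-square deviation from the even split; the function names `homogeneous_allocation` and `fluctuation` are hypothetical.

```python
from math import sqrt


def homogeneous_allocation(capacity, n_choices):
    """Even split of a fixed capacity across n consecutive choices."""
    if n_choices <= 0:
        raise ValueError("n_choices must be positive")
    return [capacity / n_choices] * n_choices


def fluctuation(allocations):
    """Root-mean-square deviation of a sequence from its even split.

    Assumption: the homogeneous reference is the mean of the sequence,
    i.e. the total amount spent divided by the number of choices.
    Returns 0.0 for a perfectly homogeneous sequence.
    """
    n = len(allocations)
    if n == 0:
        raise ValueError("allocations must not be empty")
    even = sum(allocations) / n
    return sqrt(sum((a - even) ** 2 for a in allocations) / n)


if __name__ == "__main__":
    # Same total capacity (12 units over 3 choices), different spread.
    print(fluctuation(homogeneous_allocation(12, 3)))  # even split
    print(fluctuation([2, 4, 6]))                      # uneven split
```

Under this assumed measure, a participant who spends the same amount on every choice scores zero, and the score grows as spending is concentrated on fewer choices.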

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12636281/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12636281/full.md

## References

84 references — full list in the complete paper: https://tomesphere.com/paper/PMC12636281/full.md

---
Source: https://tomesphere.com/paper/PMC12636281