Multiregional representations of intertemporal decision making in human single neurons

Jay L. Gill; Mahmoud Omidbeigi; Jihye Ryu; Nanthia Suthana; Jonathan C. Kao; Ausaf Bari

PMC · DOI:10.1038/s41598-025-00012-7 · July 8, 2025

Open Access

TL;DR

This study explores how single neurons in three brain regions help humans make decisions about immediate versus delayed rewards, shedding light on impulsive behaviors.

Contribution

The study reveals distinct neural correlates of decision-making in the orbitofrontal cortex, hippocampus, and amygdala during intertemporal choices.

Findings

01. Single neurons in the orbitofrontal cortex preferentially encode choice preferences.

02. Hippocampal activity reflects individual differences in discounting rates.

03. Decision difficulty is represented across all three brain regions.

Abstract

Understanding the neural mechanisms underlying delay discounting—the tendency to prefer smaller, immediate rewards over larger, delayed rewards—is critical for elucidating the etiology of impulsive decision-making, a hallmark of several psychiatric conditions including substance use and impulse control disorders. Here, we investigate single-neuron activity in the orbitofrontal cortex (OFC), hippocampus, and amygdala of nine human participants performing a delay discounting task. Intracranial recordings yielded a total of 193 single units (50 OFC, 68 amygdala, and 75 hippocampus) and reveal distinct neural correlates of decision-making, including representations of choice preferences and decision difficulty across all three regions. Analyses demonstrate preferential encoding of choice in the OFC. Additionally, we report that hippocampal activity reflects interindividual differences in…

Linked entities

Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.

Species (1): Homo sapiens

Diseases (4): substance use; impulsive behaviors; psychiatric; impulse control disorders

Funding (3)

- National Institute of Mental Health (http://dx.doi.org/10.13039/100000025)
- National Institute of Neurological Disorders and Stroke (http://dx.doi.org/10.13039/100000065)
- NIMH

Keywords

Cognitive neuroscience · Addiction

Taxonomy

Topics: Neural and Behavioral Psychology Studies · Decision-Making and Behavioral Economics · Mental Health Research Topics

Full text

Introduction

Though patience is a virtue, humans generally prefer smaller, immediate rewards over larger, delayed rewards. A strong bias towards instant gratification results in heightened impulsivity—a core feature of addiction, suicide, and attention and impulse control related disorders^1^. Understanding the human intracranial neural activity that supports option appraisal, and how it is altered during impulsive decisions, is essential to the development of targeted therapeutics for individuals with pathologically risky behaviors.

Human neuroimaging and animal findings suggest that option contemplation and selection are highly distributed processes^2–4^. The orbitofrontal cortex (OFC) encodes subjective value and influences choice, the amygdala facilitates acquisition of stimulus-affect relationships^5,6^, and the hippocampus uses past experiences to qualify prospective outcomes that are distant in time^7^. Intertemporal decision-making and the valuation of future rewards have been extensively studied in non-human primates using single-unit electrophysiology. Previous research has demonstrated that single neurons in the OFC, amygdala, and striatum encode subjective values and choice preferences for delayed rewards. Specifically, Kobayashi and Schultz^8^ identified neurons in the primate OFC that represent the discounted value of future rewards, demonstrating that single neurons are capable of tracking subjective value. Additionally, Cai et al.^9^ reported that single-neuron activity in the lateral prefrontal cortex predicted choices for delayed versus immediate rewards, indicating a key role for value-based deliberation during intertemporal choice. However, the human single-unit correlates of these functions remain largely uncharacterized, presenting a challenge in translating basic scientific findings to human decision behaviors that are often compromised in the setting of disease.

We present single unit data collected from the human OFC, amygdala, and hippocampus during performance of a delay discounting task where participants chose between a smaller, immediate and a larger, delayed reward. Using a logistic regression approach, we find that choice predictive neural activity is present throughout all three regions and is most strongly represented in the OFC. We also observe neural activity indicative of decision conflict between similarly valued options. Notably, units isolated in the OFC and hippocampus, but not the amygdala, were more predictive of choice compared to decision difficulty. Given these findings, our study extends prior work by providing direct evidence of single-neuron representations of inter-temporal choice in humans. Our findings in the human OFC, hippocampus, and amygdala reveal overlapping neural mechanisms with primate models, while also highlighting potential species-specific differences in the complexity of decision-making strategies. Lastly, our findings demonstrate that hippocampal representations of choice and decision conflict are stronger in participants with slower temporal discounting of delayed rewards. These results are therefore an initial characterization of multiregional single unit activity related to intertemporal choice and individual preferences for immediate vs. delayed rewards that may underlie impulsive decision making in humans.

Nine human participants with indwelling microelectrodes^10^ for epilepsy treatment performed a forced-choice delay discounting task (Fig. 1a,b; Supplementary Table 1; Methods). Each participant provided informed consent, according to an Institutional Review Board (IRB) protocol approved by the University of California, Los Angeles (IRB #17-001433). During each trial, participants were asked if they preferred $10 at a delay or an amount less than $10 immediately. Decisions across trials were used to quantify the unique rate at which each participant subjectively devalued delayed rewards, k (Methods; Fig. 1c; Supplementary Fig. 1). Larger discounting rates indicate a stronger preference for immediate over delayed rewards and have previously been related to increased impulsivity^1^. We then used the individual discounting rate to approximate the subjective value of the delayed offer and decision difficulty (i.e. the difference between the immediate offer value and the subjective value of the delayed reward) for each trial (Methods).

Fig. 1. Individual Preferences and Proximity in Subjective Value of Offered Rewards Affect Intertemporal Choice. (a) Sagittal (left [L]; right [R]) and axial views of microelectrodes from all participants localized in Montreal Neurological Institute (MNI) space. OFC: orbitofrontal cortex, AMY: amygdala, HPC: hippocampus. 3D visualization generated using BrainNet Viewer, Version 1.7 (Release 20191031; the software is publicly available at: https://www.nitrc.org/projects/bnv/)^11^ (Methods). (b) Schematic of delay discounting task. There was a 1000 millisecond (ms) fixation cross followed by a question prompt. During each trial, participants selected one of two options: (1) an immediate reward of some amount smaller than $10 (“A”) or (2) $10 delivered at some fixed delay (“B”; “Delayed Reward”). (c) Discounting curves for each participant (Supplementary Fig. 1). A steeper slope indicates the individual more rapidly discounted delayed rewards and preferred smaller, immediate offers. (d) Increased reaction time observed as participants decided between two options of similar subjective value. Individual points represent the difference in subjective value between offered options and the corresponding reaction time. The red line represents a Gaussian function fit to data using nonlinear regression^12^. (e) Model parameters identified by the Gaussian function indicated that the rise in the curve (i.e. increase in reaction time) occurred at a subjective value difference near |$1|, which was therefore used as the threshold for determining decision difficulty (hard vs. easy choice). Reaction time was significantly greater during hard compared to easy choices. * = p < 0.05 using a linear mixed effects model with a fixed effect of “reaction time x subjective value” and random effect of “participant”. Average reaction times for each subject during hard and easy choices are displayed using the same color code specified in (c).

We first asked if decision difficulty was related to behavior across participants. Previous studies report longer reaction times (RTs) during decisions involving options with small (difficult) compared to large (easy) differences in subjective value (DSV). To quantify the relationship between DSV and RT, we fit a Gaussian function^12^ using nonlinear regression (Methods) to identify the point at which DSV led to meaningful changes in reaction time. This approach identified a rise in RT at $1.078 (Fig. 1d,e), suggesting that reaction time was longer when subjects evaluated options that were subjectively within $1 of each other. Notably, RT was also significantly related to DSV in a linear model (Supplementary Fig. 2). To verify that DSV < |$1| specifically led to increased reaction times, as suggested by the Gaussian fit, we compared RTs on DSV < |$1| trials with RTs on DSV ≥ |$1| trials and found significantly longer reaction times during DSV < |$1| trials. Thus, in subsequent analyses we labeled trials as hard (DSV < |$1|) or easy (DSV ≥ |$1|) (Fig. 1e).
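The hard/easy labeling described above can be illustrated with a minimal sketch. It assumes the hyperbolic subjective-value model given in Methods and the |$1| threshold; the function names, the example discount rate (k = 0.01 per day), and the example offers are hypothetical and are not taken from the study data.

```python
def subjective_value(amount, delay_days, k):
    """Hyperbolic discounting: SV = A / (1 + k * D)."""
    return amount / (1.0 + k * delay_days)


def label_trial(immediate_amount, delay_days, k, delayed_amount=10.0, threshold=1.0):
    """Return (dsv, label), where label is 'hard' when |DSV| < threshold.

    DSV is the immediate offer minus the subjective value of the delayed reward.
    """
    dsv = immediate_amount - subjective_value(delayed_amount, delay_days, k)
    label = "hard" if abs(dsv) < threshold else "easy"
    return dsv, label


# Hypothetical participant (k = 0.01 per day) offered $10 in 60 days (SV = $6.25)
for offer in (2.0, 6.0, 7.0, 9.5):
    dsv, label = label_trial(offer, delay_days=60, k=0.01)
    print(f"${offer:.2f} now vs $10 in 60 days: DSV = {dsv:+.2f} -> {label}")
```

Under these assumptions, offers within about a dollar of the discounted value of the delayed reward are labeled hard, and all others easy.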

A total of 193 single units were isolated across participants (50 OFC, 68 amygdala, 75 hippocampus; Fig. 1a; Supplementary Fig. 3). We first asked if neural activity differed when a participant was about to select an immediate vs. delayed option. To enhance our ability to compare activity across trials with uneven time lengths due to the self-paced task design, analyses focused on activity during the peri-decision period (Methods; Fig. 2a,b). We used a logistic regression decoding approach (Methods; Fig. 2c,d) and found that 48 (96%), 51 (75%), and 51 (68%) units within the OFC, amygdala, and hippocampus, respectively, predicted choice prior to decision onset. We next asked if decision difficulty (hard [DSV < |$1|] vs. easy [DSV ≥ |$1|]) could similarly be predicted during the pre-decision period. We used the previously described decoding approach to predict whether a decision was easy or hard (Fig. 2e). We identified 44 (88%), 57 (84%), and 64 (85%) units within the OFC, amygdala, and hippocampus, respectively, that exhibited significant decision difficulty encoding. Of the total number of choice and/or difficulty encoding neurons, 84%, 71%, and 60% of units in the OFC, amygdala, and hippocampus, respectively, exhibited encoding of both variables (Supplementary Fig. 4–5). Though simultaneous peak encoding of both variables was infrequent, no significant temporal preferences (i.e., peak choice decoding preceding peak difficulty decoding) were observed across “both” encoding units (Supplementary Fig. 5).

Fig. 2. Intertemporal Choice is Reflected in Human Single Neurons in the Orbitofrontal Cortex, Amygdala, and Hippocampus. (a) Spike density (average ± standard error of mean [s.e.m.]) and raster plot (b) for an example OFC unit with differential responses during immediate (red) and delayed (blue) choices. Average ± standard error of all spike waveforms (a) left. (c) Schematic of single neuron decoding approach. During decoding, the firing rate of a neuron on a single trial was binned into 100 ms time bins, and the firing rate was averaged within each bin. We then selected 5 bins (features) as the input to a logistic regression model. The model therefore saw 500 ms of activity divided into five 100 ms time bins. The model utilized pre-decision neural features to predict the choice made at the end of the corresponding trial (selection of an immediate or delayed reward). We then plotted decoding accuracy of the model as the center of the 500 ms window was shifted across time, closer to the decision (right). Blue line is the average decoding accuracy (across cross validations) from a model trained with labels matching the actual, observed outcome ± s.e.m. Grey line is the decoding accuracy from a model trained with shuffled labels (i.e. trial labels did not correspond to neural activity) to simulate chance decoding. The time on the x-axis corresponds to the absolute time of the midpoint of the 500 ms input. A unit was called “choice encoding” if feeding the decoder activity at any point prior to the choice led to an accuracy that was both above chance and greater than the decoding accuracy of a chance decoder trained on the same activity but with shuffled labels (grey line; * = p < 0.05; Bonferroni corrected). (d) The average decoding accuracy ± s.e.m. of choice selective units from the OFC (left), amygdala (AMY; middle) and hippocampus (HPC; right). (e) Same as (d) but for difficulty encoding units that were isolated using the same approach depicted in (c), but with a decoder predicting whether a choice being made was hard or easy.
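The feature construction described in Fig. 2c (100 ms bins, five bins per 500 ms input, window shifted toward the decision) can be sketched as follows. This is an illustrative sketch only: the function names, the example spike times, the 2 s pre-decision span, and the one-bin (100 ms) step between successive windows are assumptions, not details taken from the study.

```python
def binned_rates(spike_times_ms, t_start_ms, t_end_ms, bin_ms=100):
    """Firing rate (Hz) in consecutive bins covering [t_start_ms, t_end_ms).

    Times are integers in ms relative to decision onset (negative = before).
    """
    n_bins = (t_end_ms - t_start_ms) // bin_ms
    counts = [0] * n_bins
    for t in spike_times_ms:
        idx = (t - t_start_ms) // bin_ms
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return [c * 1000.0 / bin_ms for c in counts]


def sliding_windows(rates, t_start_ms, n_features=5, bin_ms=100):
    """Return (window_center_ms, feature_vector) for each run of n_features bins.

    Assumption: successive windows are shifted by one bin.
    """
    windows = []
    for i in range(len(rates) - n_features + 1):
        center = t_start_ms + i * bin_ms + (n_features * bin_ms) // 2
        windows.append((center, rates[i:i + n_features]))
    return windows


# Hypothetical single trial: spike times (ms) in the 2 s before the decision
spikes = [-1950, -1420, -1380, -900, -450, -430, -410, -50]
rates = binned_rates(spikes, t_start_ms=-2000, t_end_ms=0)
windows = sliding_windows(rates, t_start_ms=-2000)
for center, features in windows[-3:]:
    print(center, features)
```

Each feature vector would then be passed to a logistic regression decoder, once with the observed labels and once with shuffled labels, to obtain the accuracy-versus-time curves shown in the figure.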

Our findings reveal notable similarities and differences between human and non-human primate electrophysiology in the context of inter-temporal decision-making. Consistent with studies in macaques^8,9^, we find that neurons in the human OFC encode subjective value and predict upcoming choices, supporting the idea that value representation is a conserved feature of inter-temporal decision-making across species. However, an important distinction emerges in the role of the hippocampus. While non-human primate studies have primarily focused on prefrontal and striatal mechanisms of temporal discounting, our results suggest that the human hippocampus plays a significant role in encoding both decision difficulty and choice representations. This aligns with neuroimaging evidence indicating that episodic future thinking—supported by hippocampal networks—modulates delay discounting behavior^2,7^.

We next sought to characterize population-level encoding of choice and difficulty. A choice population decoder was built for each region using activity from units that exhibited significant choice encoding. Each model feature, N₁, N₂, …, Nₙ (where n = number of choice selective units within a region across all participants; Methods), was the average firing rate of a cell in a 500 ms window centered around the time the unit exhibited its highest choice decoding accuracy. Models were trained on 50% of the available data using logistic regression and tested on the remaining 50% (Methods). This approach yielded population accuracies of 78%, 72.6%, and 73.8% across the OFC, amygdala, and hippocampus, respectively (Fig. 3; Supplementary Fig. 6–8). When this approach was used to decode difficulty, we found accuracies of 67% (OFC), 74% (amygdala), and 66% (hippocampus). Comparisons of average peak single cell choice vs. difficulty accuracies exhibited similar results, where the OFC and hippocampus, but not the amygdala, exhibited significantly greater average peak single cell accuracies when decoding choice compared to difficulty (Fig. 3).

Fig. 3. Peak Population and Single Neuron Differences in Choice and Decision Difficulty Encoding. (Right column) Histogram of decoding performance across cross validations from choice (darker shade of all colors, solid line) and difficulty (lighter shade of all colors, dashed line) decoders built using activity from units (features) that significantly encoded choice and difficulty, respectively, in the (a) OFC (blue; choice mean ± standard deviation [s.d.]: 78.00% ± 12.72 s.d., difficulty mean ± s.d.: 66.97% ± 9.79; p < .001), (c) amygdala (red; choice mean ± s.d.: 72.65% ± 12.93 s.d., difficulty mean ± s.d.: 74.07% ± 9.09; p = .219), and (e) HPC (green; choice mean ± s.d.: 73.85% ± 12.86 s.d., difficulty mean ± s.d.: 66.32% ± 8.73; p < .001). * = p < 0.05, ** = p < 0.01 using permutation testing. (Left column) Average peak single unit activity during decoding of choice (darker shade of all colors) and difficulty (lighter shade of all colors) in the (b) OFC (blue; Choice: 48 units, mean = 59.76% ± .61 s.e.m.; Difficulty: 44 units, mean = 56.83% ± .47 s.e.m.; p < .001), (d) amygdala (red; Choice: 51 units, mean = 57.79% ± .48 s.e.m.; Difficulty: 57 units, mean = 57.67% ± .41 s.e.m.; p = .687) and (f) HPC (green; Choice: 51 units, mean = 58.56% ± .56 s.e.m.; Difficulty: 64 units, mean = 56.77% ± .33 s.e.m.; p = .004). * = p < .05, ** = p < .01 using linear mixed effect models.
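A population decoder of the kind described above (one feature per unit, logistic regression, 50% train and 50% test, comparison with shuffled labels) can be sketched in plain Python. Everything specific here is an assumption for illustration: the data are synthetic z-scored firing rates, the gradient-descent fit stands in for whatever solver the authors used, and the unit count, trial count, learning rate, and number of repeated splits are arbitrary.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=200):
    """Fit logistic regression weights by batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def accuracy(w, b, X, y):
    preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0 for xi in X]
    return sum(p == yi for p, yi in zip(preds, y)) / len(y)


def split_half_accuracy(X, y, rng):
    """Train on a random 50% of trials and test on the remaining 50%."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    half = len(idx) // 2
    train, test = idx[:half], idx[half:]
    w, b = train_logistic([X[i] for i in train], [y[i] for i in train])
    return accuracy(w, b, [X[i] for i in test], [y[i] for i in test])


def make_trials(n_trials, n_units, rng, effect=0.8):
    """Synthetic z-scored rates: each unit fires more on label-1 trials."""
    X, y = [], []
    for _ in range(n_trials):
        label = rng.randint(0, 1)
        mean = effect if label == 1 else -effect
        X.append([rng.gauss(mean, 1.0) for _ in range(n_units)])
        y.append(label)
    return X, y


rng = random.Random(0)
X, y = make_trials(n_trials=200, n_units=6, rng=rng)
true_acc = sum(split_half_accuracy(X, y, rng) for _ in range(5)) / 5

chance_scores = []
for _ in range(5):
    y_shuffled = y[:]
    rng.shuffle(y_shuffled)  # break the link between activity and labels
    chance_scores.append(split_half_accuracy(X, y_shuffled, rng))
chance_acc = sum(chance_scores) / 5

print(f"observed labels: {true_acc:.2f}, shuffled labels: {chance_acc:.2f}")
```

Repeating the split many times yields a distribution of accuracies, which is the quantity summarized by the histograms in Fig. 3.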

To understand whether decoding dynamics were influenced by participant behavior, particularly their preference for immediate rewards, we constructed decoders using varying proportions of units from fast discounters (present favoring) and slow discounters (future favoring). Initially, we compared decoders using units exclusively from fast discounters with an equal number of units from all participants (to serve as a size-matched approximation of the initial behavior-agnostic population decoders; Methods). We observed differences between the “fast-only” and “all (fixed)” models in the hippocampus and amygdala, but not in the OFC (Supplementary Fig. 9). Representation of choice was increased in the “fast-only” compared to the “all (fixed)” model in the amygdala. Conversely, choice and difficulty representations were decreased in “fast-only” hippocampal models (Supplementary Fig. 9). Further comparisons were made by building decoders exclusively from units of either fast or slow discounters, matched in number. We again observed reduced choice and difficulty representations in the hippocampus, while the amygdala showed increased choice representations in the “fast-only” models compared to the “slow-only” models (Supplementary Fig. 10). Analysis of data at the single unit level did not reveal significant “slow-only” vs. “fast-only” hippocampal average peak accuracy differences (Supplementary Fig. 11); however, amygdala “fast-only” isolated units exhibited a later decoding of choice and earlier decoding of difficulty compared to “slow-only” models (Supplementary Fig. 12).

Decades of non-invasive human neuroimaging studies demonstrate that activity related to delay discounting behavior (subjective valuation of offered options and choice preferences) is reflected throughout a broad neural network that includes the OFC, amygdala, and hippocampus^3,4^. Though findings from animal models suggest that economic decisions are driven by single neuron representations of subjective value^13,14^, characterization of neural activity that supports the evaluation of distal, abstract (vs. appetitive) prospective outcomes in humans has been largely out of reach^15–17^. We begin to address this knowledge gap in the present study and demonstrate that (1) option appraisal (derived from approximations of participant specific subjective value of delayed rewards and behaviorally validated) and (2) choice are reflected in single neuron activity across the human amygdala, OFC, and hippocampus. Notably, the present investigation characterizes representations of processed, contextualized prompt information (i.e. prompts with similarly valued options = “hard”) due to the relatively few instances of repetition of specific cues (i.e. specific delays and specific immediate offers; Supplementary Fig. 13; Supplementary Table 3). How the characterized abstract, contextual representations emerge from encoding of the variables that compose them (i.e. the raw value or the raw delay, rather than the subjective value which is a function of them) will be critical to understand in future studies.

The importance of understanding the neural origins of delay discounting is underscored by its alteration across a myriad of disorders related to heightened impulsivity. Neuroimaging studies suggest that the rate at which individuals discount future rewards (i.e. the amount of preference given to the present vs. the future) is related to interactions between frontal and medial temporal regions driven by episodic thinking^2^. Notably, rats with hippocampal dysfunction demonstrate increased delay discounting^18,19^ and rats with amygdala lesions fail to learn cue-outcome associations^5^, highlighting the importance of MTL structures in acquisition of subjective value representations. Congruent findings are reported in humans where patients with amygdala damage fail to learn from aversive monetary outcomes^20^ and patients with MTL damage do not exhibit episodic thinking related reductions in temporal discounting^7^. Using models built with units from participants with fast vs. slow discounting behaviors, we demonstrate that the degree of single unit representation of task variables in the amygdala and hippocampus may underlie an individual’s preference for smaller, immediate vs. larger, delayed rewards.

Results of the present study provide a rare opportunity to gain insight into the single unit computations of contemplation and selection of options that span both the present and the future across multiple brain regions in humans. We use a decoding approach to characterize neural activity underlying discounting behaviors. This allowed us to isolate common neural dynamics across broad trial categories that were not detected using traditional firing rate approaches. Though we comment on representations of delay discounting related variables and behavior, how regions interact with and influence one another remains relatively unexplored in humans. Such findings would build on the present initial investigation and further guide identification of intra- and inter-regional dynamics that lead to pathological decision making and could serve as therapeutic targets for impulsivity-related disorders.

Given the self-paced design, we focus analyses on peri-decision activity to characterize activity at identical periods across trials of varying lengths. This approach, which allowed us to effectively characterize the relationship between decision difficulty and reaction time, limits the ability to comment on state transitions leading up to decision making. Research in non-human primates demonstrates that value encoding neurons in the OFC contribute to vacillating neural states during option contemplation that ultimately predict choice^21^. Though our study also identifies value encoding neurons in the OFC (as well as in the amygdala and hippocampus), the distinct contributions of these ensembles to decision making and how they evolve over time require further investigation. In supplementary analyses (Supplementary Figs. 14–17), we were able to isolate cue-initiated choice and difficulty related activity throughout all regions. Single unit encoding of task variables was, on average, higher during the peri-decision period than during the peristimulus period. Notably, this analysis revealed that difficulty decoding first emerges in the amygdala, further suggesting that this region may prioritize decision conflict information. Though these preliminary analyses provide a cursory examination of the onset of option contemplation related activity, further characterization of how these initial signals evolve as states shift from contemplation to decision is better suited to paradigms with fixed trials that include distinct evaluative and choice periods.

Methods

Study participants

Nine participants with drug-resistant epilepsy (4 female, 5 male) undergoing depth electrode placement for localization of seizure foci at UCLA were included in the study. All participants provided written, informed consent to participate in research according to a protocol approved by the UCLA Medical Institutional Review Board (IRB #17-001433). Participants were told they were free to withdraw consent and discontinue participation at any time. Clinical consideration for surgery was made by a multidisciplinary team of neurosurgeons, neurologists, and neuropsychologists and was independent of the research study. Pre-determined clinical criteria guided placement of Behnke–Fried electrodes (Adtech Medical, Racine WI) in each participant. Electrodes were implanted stereotactically with the aid of digital subtraction computed tomography (CT) angiography and magnetic resonance imaging (MRI). Each Behnke–Fried macro–micro depth electrode contained at least seven macroelectrode contacts (1.5 mm in diameter) spaced 1.5–3.5 mm apart along the shaft, and a Behnke–Fried inner platinum-iridium microwire bundle at the distal end of the electrode (California Fine Wire, Grover Beach, CA)^10^. All reported study procedures and recruitment comply with the approved IRB protocol and with the tenets of the Declaration of Helsinki.

Delay discounting task

Participants were presented with choices between $10 available after a specified delay (0, 2, 60, 180, or 365 days) and a smaller amount (<$10) available immediately. For example, the participant could be presented with the choice: “would you rather have $10 in 30 days or $2 now?” The task ran on a laptop using Matlab with Psychtoolbox extensions. Trials started with a fixation cross for 1000 ms after which participants indicated their preference by pressing a button on a Cedrus Response Pad (Response Pad RB-844, Cedrus Corporation, San Pedro, CA 90734, USA). After the participant selected their preferred option, there was a 1000 ms inter-trial interval before the fixation cross for the next trial was displayed on the screen. The position of the options on the screen (i.e., the immediate and delayed reward) was counterbalanced across trials. The task utilized an adjusting amount procedure (adjusting the immediate amount in increments of ±$0.50) to derive indifference points between the delayed-standard and immediate-adjusting options for each of the five delays assessed. An indifference point reflected the smallest amount of money an individual chose to receive immediately instead of the delayed standard amount ($10) at the specific delay. Trials continued until indifference points were found for all delays and a discounting quotient, k, was calculated. The number of selections that a participant made for immediate vs. delayed rewards and the participant-specific discounting k are in Supplementary Table 2.
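As a rough illustration of an adjusting-amount procedure with $0.50 steps, the sketch below runs a simple staircase against a simulated chooser. The starting amount, the stopping rule (stop when an amount would be offered a second time), the bounds, and the deterministic simulated participant are all assumptions made for this example; the text above does not specify these details of the task.

```python
def find_indifference_point(prefers_immediate, start=5.0, step=0.5, lo=0.5, hi=9.5):
    """Staircase over the immediate amount in fixed steps.

    prefers_immediate(amount) -> True if the immediate amount is chosen.
    Returns the smallest immediate amount that was accepted, or None if the
    immediate option was never chosen within [lo, hi].
    """
    amount = start
    smallest_accepted = None
    offered = set()
    while lo <= amount <= hi and amount not in offered:
        offered.add(amount)
        if prefers_immediate(amount):
            if smallest_accepted is None or amount < smallest_accepted:
                smallest_accepted = amount
            amount -= step  # make the immediate option less attractive
        else:
            amount += step  # make the immediate option more attractive
    return smallest_accepted


# Hypothetical chooser: takes the immediate amount whenever it is at least the
# hyperbolically discounted value of $10 in 60 days with k = 0.01 (SV = $6.25)
def simulated_participant(amount, k=0.01, delay_days=60, delayed_amount=10.0):
    return amount >= delayed_amount / (1.0 + k * delay_days)


print(find_indifference_point(simulated_participant))
```

With this simulated chooser the staircase climbs from $5.00, first sees an immediate choice at $6.50, and stops after stepping back to an amount it has already offered.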

Calculation of participant-specific discounting rate and subjective value

The reduction in subjective value (SV) of temporally delayed rewards was modeled by the hyperbolic function.

$$SV = \frac{A}{1 + kD}$$

where A is the objective amount of the reward (fixed at $10 in our experiment), D is the variable delay at which A will be administered (2, 60, 180, or 365 days) and k is the subject-specific discount rate^2^. To calculate k, the hyperbolic model was fit to indifference points using non-linear least squares via lsqcurvefit.m in Matlab. Smaller values of k indicate greater tolerance for extended delays, while larger values of k indicate impatience and a refusal to wait for larger rewards to be disbursed in the future.
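The least-squares fit of k can be sketched without Matlab. The example below is not the authors' lsqcurvefit call: it swaps in a one-dimensional search (a log-spaced grid followed by golden-section refinement) that minimizes the same sum of squared errors between the hyperbolic model and the indifference points. The search bounds, grid size, and the indifference points used in the example are hypothetical.

```python
import math


def hyperbolic_sv(amount, delay_days, k):
    """SV = A / (1 + k * D)."""
    return amount / (1.0 + k * delay_days)


def sse(k, delays, indifference_points, amount=10.0):
    """Sum of squared errors between model SV and observed indifference points."""
    return sum((hyperbolic_sv(amount, d, k) - ip) ** 2
               for d, ip in zip(delays, indifference_points))


def fit_k(delays, indifference_points, amount=10.0,
          k_min=1e-5, k_max=10.0, n_grid=601):
    """Least-squares estimate of k: coarse log-spaced grid, then golden section."""
    log_lo, log_hi = math.log(k_min), math.log(k_max)
    step = (log_hi - log_lo) / (n_grid - 1)
    grid = [log_lo + i * step for i in range(n_grid)]

    def loss(log_k):
        return sse(math.exp(log_k), delays, indifference_points, amount)

    best = min(grid, key=loss)
    a, b = best - step, best + step  # bracket around the best grid point
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(60):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if loss(c) < loss(d):
            b = d
        else:
            a = c
    return math.exp((a + b) / 2.0)


# Hypothetical indifference points ($) at the delays listed above
delays = [2, 60, 180, 365]
indifference_points = [9.5, 6.0, 3.5, 2.0]
k_hat = fit_k(delays, indifference_points)
print(f"estimated k = {k_hat:.4f} per day")
```

Searching over log k keeps resolution even across the several orders of magnitude that discount rates can span.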

Electrode localization

The anatomical location of each electrode was determined by co-registering a high-resolution post-operative computed tomography (CT) scan to a pre-operative whole brain magnetic resonance imaging (T1-weighted sequences). The electrodes were localized by thresholding the raw CT image and calculating the unweighted mass center of each electrode and microelectrode bundle. The preimplantation three-dimensional T1 MR scan was processed using FreeSurfer to segment the white matter, deep gray matter structures and cortex. It was also processed to parcellate the neocortex according to gyral anatomy. Macro- and micro-electrode contacts were then attributed to a cortical region according to automated parcellation in FreeSurfer. Recording contacts were in a variety of regions. Recording channels inside the amygdala, hippocampus and orbitofrontal cortex were included for further analyses. We warped the aligned electrodes onto a standard brain template (using a Montreal Neurological Institute (MNI) template) to facilitate group-level visualization. The MNI reconstruction was performed for visualization purposes only, and electrode localizations were always determined in each patient’s native MRI space. 3D image in Fig. 1a was generated using BrainNet Viewer, Version 1.7 (Release 20191031)^11^. The software is publicly available at: https://www.nitrc.org/projects/bnv/.

Electrophysiology data acquisition

Each depth electrode terminated in eight 40-µm platinum-iridium microwires from which extracellular signals were continuously recorded (referenced locally to a ninth low-impedance microwire). Data were recorded using either Neuralynx (40 kHz sampling rate) or Blackrock (30 kHz sampling rate) data acquisition systems.

Offline spike sorting and preprocessing

Neuronal clusters were identified using the ‘Wave Clus’ software package. As described previously, extracellular recordings were high-pass filtered above 300 Hz and a threshold of 5 s.d. above the median noise level was computed^22,23^. Detected events were clustered (or categorized as noise) using automatic superparamagnetic clustering of wavelet coefficients, followed by manual refinement based on the consistency of spike waveforms and inter-spike interval (ISI) distributions. In total, 193 neural clusters (9 patients) were identified by ‘Wave Clus’, which uses amplitude thresholding and the wavelet transform to implement superparamagnetic clustering. Furthermore, multiunits and single units were classified based on (1) spike shape and variance; (2) presence of a refractory period (less than 1% of spikes with an ISI of less than 3 ms); (3) the ratio between the spike peak value and the noise level; and (4) the ISI distribution of each cluster. There were few multiunits (n = 9), and restricting the results to single units did not make a significant difference. The presented results therefore incorporate data from both multiunits and single units^24^. Signal-to-noise ratio (SNR) measurements were also used to confirm that our recordings adhere to standard physiological expectations. SNR was defined as the maximum unit amplitude of the average waveform for each sorted unit divided by three times the average standard deviation of background noise. The background noise was estimated from the raw signal by applying a bandpass filter (300–3000 Hz) and computing the standard deviation of the filtered signal across multiple segments. This approach ensures a robust and physiologically meaningful quantification of unit isolation. To allow for consistency across trials, spike trains were aligned to decision onset and downsampled to 1 kHz. A continuous spike-density function was then calculated using a Gaussian smoothing kernel with a 100 ms standard deviation to estimate firing rate. We used these firing rates to assess task-evoked events.
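The spike-density step can be sketched as follows. This is a minimal, dependency-free illustration that assumes spike times are already aligned and expressed as integer millisecond indices (1 kHz). The kernel truncation at three standard deviations, the function names, and the example spike times are assumptions made for the sketch and do not come from the study.

```python
import math

def gaussian_kernel(sigma_ms, truncate=3.0):
    """Discrete Gaussian kernel (1 ms steps) normalized to sum to 1."""
    half_width = int(truncate * sigma_ms)
    weights = [math.exp(-0.5 * (t / sigma_ms) ** 2)
               for t in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def spike_density(spike_times_ms, duration_ms, sigma_ms=100.0):
    """Smoothed firing-rate estimate in spikes/s, sampled at 1 kHz."""
    kernel = gaussian_kernel(sigma_ms)
    half_width = len(kernel) // 2
    rate = [0.0] * duration_ms
    for spike in spike_times_ms:
        # Spread each spike over neighbouring samples using the kernel.
        for offset, weight in enumerate(kernel):
            t = spike + offset - half_width
            if 0 <= t < duration_ms:
                rate[t] += weight * 1000.0  # per-ms weight -> spikes per second
    return rate

# Hypothetical spike times (ms) within a 2.5 s decision-aligned trial.
spikes = [500, 1000, 1020, 1500]
rate = spike_density(spikes, duration_ms=2500)
```

Because the kernel is normalized, the area under the estimated rate (in seconds) equals the spike count whenever no kernel mass falls outside the trial, which is a convenient sanity check on the smoothing.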

Relationship between decision difficulty and reaction time

Trial difficulty was approximated via the difference in the value of the immediate offer (some amount less than $10) and the subjective value of the delayed offer. Subjective value was calculated using the participant’s individual discounting quotient (*k*), the offer amount ($10), and the delay (2, 60, 180, or 365 days) for which the offer would be withheld (see Calculation of Participant-Specific Discounting Rate). The trial-by-trial difficulty and the corresponding reaction times across participants were fit to a Gaussian function of the form,

$$RT = A \cdot e^{-0.5\left(\frac{d-\mu}{S}\right)^{2}} + C$$

where RT is the reaction time, A is the peak height of the function above baseline, *d* is the trial difficulty (i.e. difference between the immediate offer amount and the subjective value of the delayed offer), μ is the location of the peak, S reflects the width of the function and C is a constant^12^. In this way, we determined whether a function previously used to model the relationship between reaction time and decision difficulty would also fit our data. Data were fit to the function using fit.m in Matlab.
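For concreteness, the function can be written out directly. The sketch below assumes the form RT = A · exp(−0.5 · ((d − μ)/S)²) + C described by the symbol definitions above. The parameter values are hypothetical and chosen only to show the shape of the curve; they are not fitted values from the study.

```python
import math

def predicted_rt(d, A, mu, S, C):
    """Gaussian-shaped reaction time as a function of trial difficulty d."""
    return A * math.exp(-0.5 * ((d - mu) / S) ** 2) + C

# Hypothetical parameters: peak height, peak location, width, baseline.
A, mu, S, C = 2.0, 0.0, 1.5, 3.0

rt_at_peak = predicted_rt(mu, A, mu, S, C)         # A + C at the peak
rt_far_from_peak = predicted_rt(10.0, A, mu, S, C)  # decays toward C
```

Under this form, reaction time is longest when the value difference d sits at μ and falls back toward the baseline C as the two offers become easier to tell apart.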

Identification of task variable encoding units

For each unit, the decision-aligned spike density function for each trial was further downsampled by taking the average activity across non-overlapping 100 ms bins from −2 to 0.5 s relative to decision onset. Predictor variables (features) for a model were then built by concatenating 5 adjacent bins (average spike density in 100 ms) beginning at −2 s relative to decision onset. A model was then trained to predict a trial variable (for decision classifiers: immediate vs. delayed choice; for difficulty classifiers: easy vs. hard trial) using 70% of a subject’s trials and then tested on the remaining unseen 30%. Unique training and test sets at identical timepoints were generated across 200 cross validations. Chance distributions were generated by shuffling trial labels. To characterize model accuracy across time, the 5 adjacent bins were shifted every 100 ms until 0.5 s after decision onset. Models were implemented using LogisticRegressionCV of the scikit-learn toolbox in Python with L2 regularization^25^.
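The binning and sliding-window feature construction can be sketched as below. The example assumes one trial's spike density sampled at 1 kHz over the 2.5 s from −2 to 0.5 s, and it omits the classifier itself (LogisticRegressionCV in the text). The helper names and the synthetic trial are hypothetical.

```python
def bin_means(rate_1khz, bin_ms=100):
    """Average a 1 kHz signal within non-overlapping bins of `bin_ms` samples."""
    n_bins = len(rate_1khz) // bin_ms
    return [sum(rate_1khz[i * bin_ms:(i + 1) * bin_ms]) / bin_ms
            for i in range(n_bins)]

def sliding_windows(binned, n_adjacent=5):
    """Feature vectors of `n_adjacent` consecutive bins, shifted one bin at a time."""
    return [binned[i:i + n_adjacent]
            for i in range(len(binned) - n_adjacent + 1)]

# Synthetic trial: the rate steps up by 1 spike/s every 100 ms (hypothetical).
trial_rate = [float(t // 100) for t in range(2500)]

binned = bin_means(trial_rate)      # 25 bins of 100 ms
windows = sliding_windows(binned)   # one 5-bin (500 ms) feature vector per window

# Start time of each window relative to decision onset, in seconds.
window_starts = [round(-2.0 + 0.1 * i, 1) for i in range(len(windows))]
```

In this layout each window supplies the five features for one trial at one timepoint, so stacking the same window across trials gives the design matrix for the decoder at that timepoint.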

Permutation testing was used to assess whether the experimental decoder performed significantly better than the chance decoder at each 500 ms window by shuffling chance and experimental decoder performances 1000 times to generate a p value. A unit was included in subsequent single unit and population decoding analyses if decoding was significant under a Bonferroni-adjusted p-value threshold (adjusted with respect to the number of temporally distinct permutation tests performed prior to decision onset; p = 0.05/20 for 20 total pre-decision windows). The average accuracies of all variable-decoding units in the OFC, amygdala and hippocampus are included in Fig. 2.
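One way to implement such a test is sketched below. It assumes a one-sided comparison of mean accuracy between the experimental and chance decoders, with the pooled accuracies reshuffled 1000 times. The add-one correction to the p value, the function name, and the synthetic accuracies are assumptions for illustration and may differ from the procedure used in the study.

```python
import random
from statistics import fmean

def permutation_p_value(experimental, chance, n_permutations=1000, seed=0):
    """One-sided p value for mean(experimental) > mean(chance)."""
    rng = random.Random(seed)
    observed = fmean(experimental) - fmean(chance)
    pooled = list(experimental) + list(chance)
    n_exp = len(experimental)
    n_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign accuracies to the two decoders at random
        shuffled_diff = fmean(pooled[:n_exp]) - fmean(pooled[n_exp:])
        if shuffled_diff >= observed:
            n_extreme += 1
    # Add-one correction keeps the estimate above zero (an assumption here).
    return (n_extreme + 1) / (n_permutations + 1)

# Bonferroni-adjusted threshold for 20 pre-decision windows.
alpha = 0.05 / 20

# Synthetic accuracies over 200 cross validations (hypothetical values).
experimental_acc = [0.70 + 0.001 * i for i in range(200)]
chance_acc = [0.45 + 0.0005 * i for i in range(200)]

p_signal = permutation_p_value(experimental_acc, chance_acc)
p_null = permutation_p_value(chance_acc, list(reversed(chance_acc)))
```

With 1000 permutations the smallest attainable p value under this convention is 1/1001, which is still below the adjusted threshold of 0.0025, so the permutation count is sufficient to resolve significance at that level.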

Comparison of single unit decoding dynamics

The maximum decoding accuracy, and the time at which it occurred, was variable across units within a region and varied based on the variable being decoded (Fig. 3; Supplementary Fig. 2). A linear mixed effects model was used to determine if there were significant differences in average peak decoding performance and timing within a region based on the variable being decoded. Within the LMM, participant and unit identity were treated as random effects. The restricted maximum likelihood method was used to estimate an effect of task variable (difficulty decoding: easy vs. hard; choice decoding: immediate vs. delayed). In this way, the model determined whether the estimated difference in average peak accuracy or average peak accuracy time for choice and difficulty decoding within a region was significantly different from zero.

Population decoding analyses

The collective decoding capacity of units was assessed by decoding from the population of recorded units. Here, model features consisted of the average firing rate of each unit in a 500 ms window. All units (separately for each region) that exhibited significant choice/difficulty decoding were used as model features. The model was trained on 50% of the data and was tested on the remaining 50%.

Training and test set construction began by splitting each cell’s activity (at a specific point in time) during each condition (i.e., immediate vs. delayed when decoding choice) in half (one half used for training, and the other for testing). Because participants had a unique preference for immediate vs. delayed rewards, they had a different number of exposures to each condition with respect to choice (immediate vs. delayed) and difficulty (easy vs. hard) (Supplementary Table 2). Thus, the training and test set size was bounded by the minimum number of trials seen for each condition across subjects. For example, the minimum of all delayed and immediate choice trials across participants was 10 and, thus, 5 immediate and 5 delayed trials (10 total) were used for testing. To maximize the amount of data that the model was exposed to prior to testing, model features for each trial in the training data were constructed by randomly selecting a cell’s activity (feature) during a trial (that was not used in testing) during a specific condition. Specifically, a model saw features N_1_, N_2_, N_3_, …, N_n_ where n = number of regional (i.e. OFC, HPC, amygdala), choice, or difficulty selective cells. For a training example for a condition of interest (i.e. immediate or delayed when decoding choice), N_1_ consisted of cell 1’s firing rate randomly selected from a pool containing that cell’s activity during unique instances of that condition. Features N_1_ to N_n_ consisted of unique cells during identical conditions across training examples. As in single unit decoding analyses, unique training and test sets at identical timepoints were generated across 200 cross validations. Model efficacy was assessed by comparing performance of the experimental decoder to chance decoders that were generated by training a decoder on the same training data, but with shuffled trial labels. All population models were implemented using LogisticRegressionCV of the scikit-learn toolbox in Python with L2 regularization^25^.
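The training and test set construction described above can be sketched as follows. The example holds out a fixed number of trials per cell and condition for testing and then builds training examples by drawing, for each cell independently, a random firing rate from that cell's remaining trials of the same condition. The function name, the data layout, the number of training examples, and the firing rates are all hypothetical, and the classifier is omitted.

```python
import random

def build_pseudo_population(cell_trials, conditions, n_test, n_train, seed=0):
    """Return (train_X, train_y, test_X, test_y) for a population decoder.

    cell_trials maps cell id -> condition -> list of firing rates (one per trial).
    Each feature column corresponds to one cell; labels index `conditions`.
    """
    rng = random.Random(seed)
    cells = sorted(cell_trials)
    held_out, train_pool = {}, {}
    for cell in cells:
        for cond in conditions:
            rates = list(cell_trials[cell][cond])
            rng.shuffle(rates)
            held_out[cell, cond] = rates[:n_test]     # reserved for testing
            train_pool[cell, cond] = rates[n_test:]   # never seen at test time

    train_X, train_y, test_X, test_y = [], [], [], []
    for label, cond in enumerate(conditions):
        for i in range(n_test):
            test_X.append([held_out[cell, cond][i] for cell in cells])
            test_y.append(label)
        for _ in range(n_train):
            # Each cell contributes a randomly drawn trial of this condition.
            train_X.append([rng.choice(train_pool[cell, cond]) for cell in cells])
            train_y.append(label)
    return train_X, train_y, test_X, test_y

# Hypothetical firing rates; cells differ in how many trials they have.
cell_trials = {
    "cell_1": {"immediate": [float(v) for v in range(0, 12)],
               "delayed": [float(v) for v in range(100, 110)]},
    "cell_2": {"immediate": [float(v) for v in range(200, 230)],
               "delayed": [float(v) for v in range(300, 314)]},
}
conditions = ("immediate", "delayed")
train_X, train_y, test_X, test_y = build_pseudo_population(
    cell_trials, conditions, n_test=5, n_train=50)
```

Drawing each cell's value independently is what lets cells recorded in different participants, with different trial counts, be combined into one feature vector, at the cost of discarding any trial-by-trial correlations between cells.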

We used two approaches to assess population encoding of a variable of interest. In the first approach, a fixed, identical time window was used across all units. Given that units varied in the timing of when they were most predictive of decision/decision difficulty, we also built models using activity centered at the point at which an individual unit was most predictive of the variable of interest. Across all regions, utilization of activity centered at each unit’s most predictive temporal window led to a higher mean accuracy than the greatest mean accuracy observed in the fixed temporal window approach (Supplementary Fig. 6).

Fast versus slow discounting models

We divided participants into two groups based on their discounting behaviors. Slow discounters were willing to accept temporal delays in exchange for greater monetary rewards and had an average k value of 0.0015 +/− 0.0005 (s.e.m.) and an average reaction time of 3.99 +/− 0.10 (s.e.m.) seconds. Fast discounters typically chose the immediately available option and had an average k value of 0.1739 +/− 0.084 and an average reaction time of 5.42 +/− 0.22 (s.e.m.) seconds. We used two approaches to approximate differences in trial variable representations between participants that preferred immediate rewards (fast discounters) vs. those that were relatively agnostic to temporal delays (slow discounters). In both approaches, we used the peak-population decoding strategy discussed in Population Decoding Analyses. In the first approach, we compared regional population decoders built from units from only fast discounters (“Fast Only”) to regional decoders built from an equal number of units randomly sampled from all participants (“All (fixed)”). Both “Fast Only” and “All (fixed)” contained N_1_ to N_Fs_ features where Fs = number of unique units isolated across participants with a tendency to rapidly discount temporally delayed rewards (Supplementary Table 2). The “All (fixed)” model represented an approximation of the discounting behavior agnostic model that included all participants described in Population Decoding Analyses, but with a number of features identical to that used in the “Fast Only” model. A total of Fs units/features were randomly selected across all participants to construct the All (fixed) model since decoder performance may increase with more units. The expanded training set approach described in Population Decoding Analyses was also used in both models here. In the second approach, we compared regional population decoders built from units from only slow discounters to regional population decoders built from an equal number of units from fast discounters.

Fast versus slow single unit characteristics

To determine if participant behavior is related to the level of variable encoding at the single unit level, we compared the average single unit encoding of choice/difficulty (see Identification of Task Variable Encoding Units; Supplementary Fig. 8) in fast vs. slow discounters using a linear mixed effects model. As in previous single unit analyses (see Comparison of Single Unit Decoding Dynamics), participant and unit identity were treated as random effects. The restricted maximum likelihood method was used to estimate an effect of trial variable. To determine if the time at which single unit activity was most predictive of choice/difficulty was different in fast vs. slow discounters, we used the same linear mixed effects model approach using peak decoding accuracy time as the predictor variable (Supplementary Fig. 9).

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1