Summary of human intracranial studies examining reward-related neural activity, including the number of participants, brain regions recorded, electrophysiological features, behavioral tasks, and analytical approaches.
| First author, year | Number of participants | Regions of interest | Electrophysiologic features | Task | Analysis |
|---|---|---|---|---|---|
| Cohen et al., 2009 [16] | N = 5, DBS for treatment-resistant depression | Nucleus accumbens | Alpha (8–12 Hz), gamma (40–80 Hz), gamma-alpha synchronization | Rewarding shape learning task, select shapes on left versus right, given text feedback of money won/lost with sound (tada!/buzzer); not instructed; win (75%) or lose (25%) €0.12, 152 trials | Stimulus-locked changes in frequency over time; examined whether higher frequencies were coupled with activity in lower-frequency bands for each condition; examined possible differences between gamma-alpha couplings across different conditions. |
| Lega et al., 2011 [17] | N = 1, DBS for treatment-resistant depression | Bilateral ventral striatum | Firing rate; local field potential power in theta (4–8 Hz), alpha (10–14 Hz), beta (16–24 Hz), and low/high gamma bands | Video game reward task, 104 trials total | Firing rate and local field potential power for each of the three conditions: positive feedback, negative feedback, and reward-neutral feedback. |
| Kishida et al., 2016 [18] | N = 17, DBS for PD | Caudate (N = 14), putamen (N = 3) | Estimated dopamine concentration, measured by fast-scan cyclic voltammetry using linear regression models trained on in vitro data and the elastic net (EN) algorithm | Sequential investment game; 120 investment decisions (20 choices × 6 markets); adjust and submit investment, 0 to 100% of portfolio in 10% increments, experience gain or loss | Reward prediction error (RPE), computed on each trial as the difference between the outcome and the expected value (average return up to that trial); counterfactual prediction error (CPE) was also computed. |
| Li et al., 2016 [19] | N = 8, sEEG for treatment-resistant epilepsy | Medial and lateral OFC | Local field potential amplitude; 86 contacts (35% coding reward probability) | Probabilistic reward learning task; five slot machines with five reward probabilities (0/0.25/0.5/0.75/1) | “Expected value”: reward probability estimate since the first trial. “Risk”: emerges during reward anticipation or can also be found at the time of reward outcome. “Experienced value”: response to reward or no reward. Amplitude (µV) and reaction times were compared across reward probabilities. |
| Saez et al., 2018 [13] | N = 10, ECoG (grids/strips for treatment-resistant epilepsy) | OFC (n = 192) | High frequency activity (70–200 Hz) | A neuroeconomic task that captures the tradeoff between risk and reward | Defined two sets of reward-related signals associated with both choice and outcome evaluation processes |
| Lopez-Persem et al., 2020 [20] | N = 36, iEEG for treatment-resistant epilepsy | Electrodes in vmPFC (n = 146); HC (n = 280); lOFC (n = 304); PHC (n = 122); total n = 4,273 electrodes | High gamma (50–150 Hz) | Judgment tasks. Phase 1: ‘age rating and confidence task’ (120 trials, two blocks, faces and paintings; age 21-step scale, 100-step confidence). Phase 2: ‘likeability rating and confidence task’ (180 trials, three blocks, same faces and paintings, added food items, likeability 21-step scale, −10 to 10). Phase 3: forced binary choice task among prior stimuli (food, face, or painting). | Value representation for different categories of items (food, faces, paintings), value-based and non-value-based, first-order judgments (food/nonfood likeability and age ratings), and second-order judgments (confidence ratings). “Subjective value”: regression estimates of OFC (pooling vmPFC and lOFC) and (P)HC (pooling hippocampus and PHC) activity against food likeability rating. |
| Gueguen et al., 2021 [21] | N = 20, iEEG for treatment-resistant epilepsy | Electrodes in aINS (n = 83); dlPFC (n = 74); vmPFC (n = 54); lOFC (n = 70) | Broadband gamma activity (BGA, 50–150 Hz), beta band (13–33 Hz), and theta/alpha (4–8 and 8–13 Hz) | Instrumental learning task. Choose between two cues to either maximize monetary gains (for reward cues) or minimize monetary losses (for punishment cues). The pairs of cues associated with reward and punishment learning were intermingled within three to six sessions of 96 trials. In each pair, the two cues were associated with the two possible outcomes (0/1€ in the reward condition and 0/−1€ in the punishment condition) with reciprocal probabilities (0.75/0.25 and 0.25/0.75). | A Q-learning (QL) model was fitted to behavioral data to generate trial-wise expected values and prediction errors. |
| Jamali et al., 2021 [22] | N = 11, single-neuron recordings during DBS surgery for essential tremor (N = 7), PD (N = 3), and dystonia (N = 1) | Single cells in the human dmPFC (n = 212) | Firing rate | Verbal version of the false-belief task: other-belief versus physical trials; true- versus false-belief trials; true-physical versus false-physical trials | A Fisher discriminant was used to evaluate whether and to what degree the activity of each neuron during questioning could be used to predict specific trial conditions on a trial-by-trial basis. Model-switch decoding: a drop in decoding accuracy on model-switching would suggest that neuronal responses were selective for another’s beliefs. |
| Manssuer et al., 2022 [23] | N = 16, sEEG for treatment-resistant epilepsy | LFPs from amygdala (n = 16); OFC (n = 9); hippocampus (n = 9) | High frequency gamma 60–250 Hz; theta 4–8 Hz synchrony between amygdala and OFC | Monetary incentive delay task | Monetary reward processing |
| Aquino et al., 2023 [24] | N = 20, Behnke-Fried depth electrodes (macro- and micro-recordings) for treatment-resistant epilepsy | Pre-SMA (n = 137), vmPFC (n = 191) and dACC (n = 108) neurons (436 total) | Firing rate | Two-armed bandit task; contained 20 blocks of 15 trials, a total of 300 trials; split into two recording sessions with ten blocks each. Each trial began with a baseline period (sampled randomly from a uniform distribution of 0.75–1.25 s), followed by a choice screen showing the two available slot machines presented on the left or on the right of the screen; after button press, chosen slot machine was shown for a period of 1–2 s (sampled randomly from a uniform distribution), followed by outcome screen shown for 2 s. The outcome screen showed either a golden coin to represent winning a reward or a crossed-out coin to represent not winning. | Logistic regression model to describe how the past history of rewards, sampling history, stimulus exposure history and their interactions with trial number correlated with decisions; defined Q values (denoted as Qs) as the mean of a β distribution that estimates the probability of receiving a reward from a bandit, as determined by the history of wins and losses after sampling a stimulus s, as well as δQ = Qleft – Qright, the difference between left and right Q values; defined an uncertainty value U as the variance of the same β distribution, as well as its corresponding differential δU = Uleft – Uright; defined novelty (N) as the variance of a β distribution in which β = 1 and the α parameter is the number of times patients were exposed to a stimulus s in the entire session, as well as its corresponding differential δN = Nleft – Nright. |
| Collomb-Clerc et al., 2023 [25] | N = 8, DBS for treatment-resistant epilepsy | Anterior thalamus (ATN) and dorsomedial thalamus (DMTN) | Low frequency oscillations (4–12 Hz) | Probabilistic instrumental reinforcement learning task. Cues were abstract visual stimuli taken from the Agathodaimon alphabet. The four cue pairs were divided into two conditions (2 pairs of reward and 2 pairs of punishment cues), associated with different pairs of outcomes (winning 1€ versus nothing or losing 1€ versus nothing). | A standard QL algorithm was used to model choice behavior. |
| Hoy et al., 2023 [26] | N = 10, sEEG and ECoG for treatment-resistant epilepsy | Insula (n = 9) and dmPFC (n = 10) (primarily mid-cingulate cortex) | High frequency activity (70–150 Hz) | Interval timing task, consisting of four blocks (two easy, two hard), 75 trials each, with initial instruction indicating difficulty level. | HFA power was used as a marker of local population dynamics; the performance of three different linear mixed models was compared in explaining single-trial dmPFC and INS responses to positive, negative, and neutral feedback during the task. |
| Marciano et al., 2023 [27] | N = 10, sEEG for treatment-resistant epilepsy | OFC (n = 144 electrodes) | High frequency activity (70–150 Hz) | Repeated trial dictator game; a single player (the “dictator”) decides how to split different pots of money between themselves and a social counterpart to study social decision making. In each trial, patients chose between two allocations of money for themselves and another anonymous social counterpart. | OFC ECoG/single-unit activity analysis with time-resolved regression models, using predictors (reward magnitude, advantageous inequity, and disadvantageous inequity) to characterize modulation of neural firing rates; trial-wise encoding analyses to identify neurons and time windows where activity significantly tracked inequity-dependent value signals. |
| Batten et al., 2024 [28] | N = 4, DBS for PD, two intraoperative recordings spaced 14–28 days apart | SNr | Human electrochemistry, carbon-fibre electrodes | Ultimatum game, a two-person ‘take-it-or-leave-it’ game probing social fairness norms | Electrochemical estimates of dopamine and serotonin in the SNr during the ultimatum game |
| Man et al., 2024 [29] | N = 10, iEEG for treatment-resistant epilepsy (16 recording sessions) | Frontal pole, vmPFC, OFC, supramarginal gyrus, angular gyrus, putamen, hippocampus, amygdala, anterior and posterior insula; 428 contacts | Average contact activity (1–250 Hz) | Presented with two cards drawn sequentially and without replacement on each trial; cards were shuffled from a deck of 10 cards (ace to ten), excluding face cards. Instructed to treat an ace card as denoting “1”; cards were reshuffled after every trial; prior to drawing a card, participants predicted whether the second card would have a higher or lower numerical value than the first. | Understand how the brain temporally organizes reward and risk representations |
N: number of patients; n: number of electrodes; aINS: anterior insula; dACC: dorsal anterior cingulate cortex; DBS: deep brain stimulation; dlPFC: dorsolateral prefrontal cortex; dmPFC: dorsomedial prefrontal cortex; ECoG: electrocorticography; HC: hippocampus; iEEG: intracranial electroencephalography; LFP: local field potential; lOFC: lateral orbitofrontal cortex; OFC: orbitofrontal cortex; PD: Parkinson’s disease; PHC: parahippocampal cortex; pre-SMA: pre-supplementary motor area; sEEG: stereo-electroencephalography; SNr: substantia nigra reticulata; vmPFC: ventromedial prefrontal cortex.
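
The value, uncertainty, and novelty terms described in the Aquino et al., 2023 [24] row can be illustrated with a short sketch. This is not the authors' code. The names `BanditHistory`, `value_terms`, and `differentials` are hypothetical, and the uniform Beta(1, 1) prior (α = wins + 1, β = losses + 1) is an assumption, since the table does not state how wins and losses map onto the distribution parameters. Novelty follows the table's wording (β = 1, α = number of exposures) and assumes at least one exposure.

```python
from dataclasses import dataclass


def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def beta_variance(alpha: float, beta: float) -> float:
    """Variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return (alpha * beta) / (total ** 2 * (total + 1))


@dataclass
class BanditHistory:
    """Hypothetical per-stimulus history for one slot machine."""
    wins: int       # rewarded outcomes after choosing this stimulus
    losses: int     # unrewarded outcomes after choosing this stimulus
    exposures: int  # times the stimulus was shown in the session


def value_terms(h: BanditHistory) -> tuple[float, float, float]:
    """Return (Q, U, N) for one stimulus."""
    if h.exposures < 1:
        raise ValueError("exposures must be >= 1 (assumption of this sketch)")
    # Assumption: uniform Beta(1, 1) prior updated by wins and losses.
    alpha, beta = h.wins + 1, h.losses + 1
    q = beta_mean(alpha, beta)       # Q: estimated reward probability
    u = beta_variance(alpha, beta)   # U: uncertainty about that estimate
    n = beta_variance(h.exposures, 1)  # N: novelty, shrinks with exposure
    return q, u, n


def differentials(left: BanditHistory, right: BanditHistory) -> dict[str, float]:
    """Left-minus-right differentials: δQ, δU, δN."""
    ql, ul, nl = value_terms(left)
    qr, ur, nr = value_terms(right)
    return {"dQ": ql - qr, "dU": ul - ur, "dN": nl - nr}


if __name__ == "__main__":
    left = BanditHistory(wins=3, losses=1, exposures=4)
    right = BanditHistory(wins=1, losses=3, exposures=8)
    print(differentials(left, right))
```

In this example the left machine has the higher estimated reward probability and the higher novelty, while the two uncertainty values are equal because the win/loss counts are mirror images of each other.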