From:  Emerging insights in human brain and behavior from intracranial recordings

 Summary of human intracranial studies examining reward-related neural activity, including the number of participants, brain regions recorded, electrophysiological features, behavioral tasks, and analytical approaches.

| First author, year | Number of participants | Regions of interest | Electrophysiologic features | Task | Analysis |
| --- | --- | --- | --- | --- | --- |
| Cohen et al., 2009 [16] | N = 5, DBS for treatment-resistant depression | Nucleus accumbens | Alpha (8–12 Hz), gamma (40–80 Hz), gamma-alpha synchronization | Rewarding shape learning task: select shapes on left versus right; text feedback of money won/lost with sound (tada!/buzzer); not instructed; win (75%) or lose (25%) €0.12; 152 trials | Stimulus-locked changes in frequency over time; examined whether higher frequencies were coupled with activity in lower-frequency bands for each condition; examined possible differences in gamma-alpha coupling across conditions. |
| Lega et al., 2011 [17] | N = 1, DBS for treatment-resistant depression | Bilateral ventral striatum | Firing rate; theta (4–8 Hz), alpha (10–14 Hz), beta (16–24 Hz), and low/high gamma | Video game reward task, 104 trials total | Firing rate and local field potential power for each of the three conditions: positive feedback, negative feedback, and reward-neutral feedback. |
| Kishida et al., 2016 [18] | N = 17, DBS for PD | Caudate (N = 14), putamen (N = 3) | Estimated dopamine concentration, measured by fast-scan cyclic voltammetry using linear regression models trained on in vitro data and the EN algorithm | Sequential investment game; 120 investment decisions (20 choices × 6 markets); adjust and submit investment, 0 to 100% of portfolio in 10% increments; experience gain or loss | Reward prediction error (RPE), computed on each trial as the difference between the outcome and the expected value (average return up to that trial), and counterfactual prediction error (CPE). |
| Li et al., 2016 [19] | N = 8, sEEG for treatment-resistant epilepsy | Medial and lateral OFC | Local field potential amplitude; 86 contacts (35% coding reward probability) | Probabilistic reward learning task; five slot machines with five reward probabilities (0/0.25/0.5/0.75/1) | “Expected value”: reward probability estimate since the first trial. “Risk”: emerges during reward anticipation or can also be found at the time of reward outcome. “Experienced value”: response to reward or no reward. Compared amplitude (µV) and reaction times across reward probabilities. |
| Saez et al., 2018 [13] | N = 10, ECoG (grids/strips) for treatment-resistant epilepsy | OFC (n = 192) | High frequency activity (70–200 Hz) | A neuroeconomic task that captures the tradeoff between risk and reward | Defined two sets of reward-related signals associated with both choice and outcome evaluation processes. |
| Lopez-Persem et al., 2020 [20] | N = 36, iEEG for treatment-resistant epilepsy | Electrodes in vmPFC (n = 146); HC (n = 280); lOFC (n = 304); PHC (n = 122); total n = 4,273 electrodes | High gamma (50–150 Hz) | Judgment tasks. Phase 1: ‘age rating and confidence task’ (120 trials, two blocks, faces and paintings; age 21-step scale, 100-step confidence). Phase 2: ‘likeability rating and confidence task’ (180 trials, three blocks, same faces and paintings, added food items; likeability 21-step scale, −10 to 10). Phase 3: forced binary choice task among prior stimuli (food, face, or painting). | Value representation for different categories of items (food, faces, paintings), value-based and non-value-based, first-order judgments (food/nonfood likeability and age ratings), and second-order judgments (confidence ratings). “Subjective value”: regression estimates of OFC (pooling vmPFC and lOFC) and (P)HC (pooling hippocampus and PHC) activity against food likeability rating. |
| Gueguen et al., 2021 [21] | N = 20, iEEG for treatment-resistant epilepsy | Electrodes in aINS (n = 83), dlPFC (n = 74), vmPFC (n = 54), lOFC (n = 70) | Broadband gamma activity (BGA, 50–150 Hz), beta band (13–33 Hz), and theta/alpha (4–8 and 8–13 Hz) | Instrumental learning task. Choose between two cues to either maximize monetary gains (for reward cues) or minimize monetary losses (for punishment cues). The pairs of cues associated with reward and punishment learning were intermingled within three to six sessions of 96 trials. In each pair, the two cues were associated with the two possible outcomes (0/1€ in the reward condition and 0/–1€ in the punishment condition) with reciprocal probabilities (0.75/0.25 and 0.25/0.75). | To generate trial-wise expected values and prediction errors, a Q-learning (QL) model was fitted to behavioral data. |
| Jamali et al., 2021 [22] | N = 11, single-neuron recordings; DBS for essential tremor (N = 7), PD (N = 3), and dystonia (N = 1) | Single cells in the human dmPFC (n = 212) | Firing rate | Verbal version of the false-belief task: other-belief versus physical trials; true- versus false-belief trials; true-physical versus false-physical trials | A Fisher discriminant was used to evaluate whether and to what degree the activity of each neuron during questioning could be used to predict specific trial conditions on a trial-by-trial basis. Model-switch decoding: a drop in decoding accuracy on model-switching would suggest that neuronal responses were selective for another’s beliefs. |
| Manssuer et al., 2022 [23] | N = 16, sEEG for treatment-resistant epilepsy | LFPs from amygdala (n = 16); OFC (n = 9); hippocampus (n = 9) | High frequency gamma (60–250 Hz); theta (4–8 Hz) synchrony between amygdala and OFC | Monetary incentive delay task | Monetary reward processing |
| Aquino et al., 2023 [24] | N = 20, Behnke-Fried depth electrodes (macro- and micro-recordings) for treatment-resistant epilepsy | Pre-SMA (n = 137), vmPFC (n = 191), and dACC (n = 108) neurons (436 total) | Firing rate | Two-armed bandit task; 20 blocks of 15 trials, 300 trials in total, split into two recording sessions of ten blocks each. Each trial began with a baseline period (sampled randomly from a uniform distribution of 0.75–1.25 s), followed by a choice screen showing the two available slot machines on the left and right of the screen; after a button press, the chosen slot machine was shown for 1–2 s (sampled randomly from a uniform distribution), followed by an outcome screen shown for 2 s. The outcome screen showed either a golden coin to represent winning a reward or a crossed-out coin to represent not winning. | Logistic regression model to describe how the past history of rewards, sampling history, stimulus exposure history, and their interactions with trial number correlated with decisions. Q values (denoted Q_s) were defined as the mean of a beta distribution that estimates the probability of receiving a reward from a bandit, as determined by the history of wins and losses after sampling a stimulus s, with δQ = Q_left − Q_right the difference between left and right Q values. Uncertainty (U) was defined as the variance of the same beta distribution, with differential δU = U_left − U_right. Novelty (N) was defined as the variance of a beta distribution in which β = 1 and the α parameter is the number of times patients were exposed to a stimulus s in the entire session, with differential δN = N_left − N_right. |
| Collomb-Clerc et al., 2023 [25] | N = 8, DBS for treatment-resistant epilepsy | Anterior thalamus (ATN) and dorsomedial thalamus (DMTN) | Low frequency oscillations (4–12 Hz) | Probabilistic instrumental reinforcement learning task. Cues were abstract visual stimuli taken from the Agathodaimon alphabet. The four cue pairs were divided into two conditions (two pairs of reward cues and two pairs of punishment cues), associated with different pairs of outcomes (winning 1€ versus nothing or losing 1€ versus nothing). | A standard QL algorithm was used to model choice behavior. |
| Hoy et al., 2023 [26] | N = 10, sEEG and ECoG for treatment-resistant epilepsy | Insula (n = 9) and dmPFC (n = 10) (primarily mid-cingulate cortex) | High frequency activity (70–150 Hz) | Interval timing task consisting of four blocks (two easy, two hard) of 75 trials each, with an initial instruction indicating difficulty level | HFA power as a marker of local population dynamics; compared the performance of three different linear mixed models in explaining single-trial dmPFC and insula responses to positive, negative, and neutral feedback during the task. |
| Marciano et al., 2023 [27] | N = 10, sEEG for treatment-resistant epilepsy | OFC (n = 144 electrodes) | High frequency activity (70–150 Hz) | Repeated-trial dictator game, used to study social decision making: a single player (the “dictator”) decides how to split different pots of money between themselves and a social counterpart. In each trial, patients chose between two allocations of money for themselves and another anonymous social counterpart. | OFC ECoG/single-unit activity analysis with time-resolved regression models, using predictors (reward magnitude, advantageous inequity, and disadvantageous inequity) to characterize modulation of neural firing rates; trial-wise encoding analyses to identify neurons and time windows where activity significantly tracked inequity-dependent value signals. |
| Batten et al., 2024 [28] | N = 4, DBS for PD; two intra-op recordings, spaced 14–28 days apart | SNr | Human electrochemistry, carbon-fibre electrodes | Ultimatum game, a two-person ‘take-it-or-leave-it’ game probing social fairness norms | Electrochemical estimates of dopamine and serotonin in the SNr during the ultimatum game |
| Man et al., 2024 [29] | N = 10, iEEG for treatment-resistant epilepsy (16 recording sessions) | Frontal pole, vmPFC, OFC, supramarginal gyrus, angular gyrus, putamen, hippocampus, amygdala, anterior and posterior insula; 428 contacts | Average contact activity (1–250 Hz) | Two cards were drawn sequentially and without replacement on each trial from a shuffled deck of 10 cards (ace to ten), excluding face cards. Participants were instructed to treat an ace as denoting “1”; cards were reshuffled after every trial; prior to drawing a card, participants predicted whether the second card would have a higher or lower numerical value than the first. | Understand how the brain temporally organizes reward and risk representations |

N: number of patients; n: number of electrodes; aINS: anterior insula; DBS: deep brain stimulation; dlPFC: dorsolateral prefrontal cortex; dmPFC: dorsomedial prefrontal cortex; lOFC: lateral orbitofrontal cortex; OFC: orbitofrontal cortex; PD: Parkinson’s disease; pre-SMA: pre-supplementary motor area; sEEG: stereo-electroencephalography; SNr: substantia nigra reticulata; vmPFC: ventromedial prefrontal cortex.
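
The value, uncertainty, and novelty terms described in the Analysis column for Aquino et al., 2023 [24] can be illustrated with a minimal Python sketch. This is not the authors' code. The class `BanditHistory`, the helper names, the uniform Beta(1, 1) prior on reward probability, and the +1 offset that keeps the novelty α parameter positive for a never-seen stimulus are all assumptions made for illustration; the table specifies only that Q is the mean and U the variance of a beta distribution built from wins and losses, and that N is the variance of a beta distribution with β = 1 and α given by the exposure count.

```python
from dataclasses import dataclass


def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def beta_variance(alpha: float, beta: float) -> float:
    """Variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return (alpha * beta) / (total**2 * (total + 1.0))


@dataclass
class BanditHistory:
    """Hypothetical per-stimulus history for one slot machine."""
    wins: int = 0       # rewarded outcomes after choosing this stimulus
    losses: int = 0     # unrewarded outcomes after choosing this stimulus
    exposures: int = 0  # times the stimulus was shown in the session

    def q_value(self) -> float:
        # Assumption: Beta(1, 1) prior, so alpha = wins + 1, beta = losses + 1.
        return beta_mean(self.wins + 1, self.losses + 1)

    def uncertainty(self) -> float:
        # Variance of the same beta distribution used for the Q value.
        return beta_variance(self.wins + 1, self.losses + 1)

    def novelty(self) -> float:
        # beta = 1 as stated in the table; alpha is the exposure count.
        # Assumption: +1 keeps alpha positive when exposures == 0.
        return beta_variance(self.exposures + 1, 1)


def differentials(left: BanditHistory, right: BanditHistory) -> tuple:
    """Return (dQ, dU, dN), each computed as left minus right."""
    d_q = left.q_value() - right.q_value()
    d_u = left.uncertainty() - right.uncertainty()
    d_n = left.novelty() - right.novelty()
    return d_q, d_u, d_n
```

Under these assumptions, a left stimulus with 3 wins and 1 loss and a right stimulus with 1 win and 3 losses give δQ = 1/3 and δU = 0, because the two posteriors are mirror images with equal variance. The novelty term shrinks as exposures accumulate, so the less frequently seen option yields the larger N.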