The role of dopamine-sensitive motor cortical circuits in the development and execution of skilled forelimb movements

Martyna Gorkowska-Nosal; Gniewosz Drwiega; Lukasz Szumiec; Jan Rodriguez Parkitna; Przemyslaw E. Cieslak

PMC · DOI: 10.1016/j.isci.2026.114983 · February 10, 2026

Open Access

TL;DR

This study explores how dopamine and brain circuits in the motor cortex help mice learn and perform skilled movements to get rewards.

Contribution

The study reveals how dopamine dynamics and neuronal activity in the motor cortex are linked to skilled movement execution and reward processing.

Findings

01

Dopamine activity in the motor cortex is temporally linked to movement and reward consumption.

02

Neuronal activity in the motor cortex reflects the vigor of forelimb movements.

03

D1+ and D2+ neurons are differently distributed in the layers of the motor cortex.

Abstract

Dopamine (DA) signaling in the primary motor cortex (M1) is crucial for motor skill learning. However, the DA dynamics in the M1 during the formation and execution of skilled behavior have not yet been investigated. We trained head-fixed D1Cre and D2Cre mice to perform skilled forelimb movements with a joystick to collect water rewards and used fiber photometry to simultaneously monitor DA dynamics and population-level calcium (Ca^2+^) activity from D1+ and D2+ neurons in the M1 forelimb area. We found that the activity of DA and neuronal populations in M1 is temporally linked to joystick movements and reward consumption, tracks actual reward availability, and reflects the vigor of forelimb movement. Our findings show how DA dynamics and activity of local dopaminoceptive circuits in the M1 are shaped during motor learning and execution of skilled behavior.

Keywords

biological sciences · natural sciences · neuroscience · systems neuroscience

Taxonomy

Topics: Motor Control and Adaptation · Transcranial Magnetic Stimulation Studies · Action Observation and Synchronization

Full text

Introduction

The acquisition of new motor skills and the adaptation of motor behavior to changing environmental demands are essential for survival. A distributed neural network mediates the learning and performance of motor skills, with the primary motor cortex (M1) acting as a central hub that transmits output cortical motor commands to downstream motor centers, including the basal ganglia, motor thalamus, brainstem, and spinal cord.^1,2,3,4^ The M1 is innervated by dopaminergic fibers, with preferential innervation in the M1 forelimb area.^5,6,7,8,9,10^ There is growing evidence that dopamine (DA) transmission is crucial for motor skill learning, mediates structural and cellular synaptic plasticity in the M1, and modulates reach kinematics.^6,11,12,13,14,15^ Nevertheless, the DA dynamics in the M1 during the formation and execution of skilled forelimb movements have not yet been investigated.

DA action in M1 is mediated by DA receptors, D1 and D2, encoded by the Drd1a and Drd2 genes, respectively. Application of specific D1 and D2 antagonists into the M1 forelimb area suppresses synaptic plasticity and motor skill acquisition.^11,12,16^ Moreover, activation of D1 and D2 receptors has been shown to modulate the excitability of M1 pyramidal and GABAergic neurons.^10,17,18^ Nevertheless, little is still known about M1 populations of dopaminoceptive neurons. Recently, a few studies have employed transgenic mouse strains, with Drd1a and Drd2 gene promoters driving the expression of Cre recombinase and fluorescent proteins, to selectively target populations of D1 receptor-positive (D1+) and D2 receptor-positive (D2+) cells in the mouse M1.^17,18,19^ These studies showed that D1+ and D2+ populations are non-overlapping (expressing either D1 or D2 receptors), diverse (containing pyramidal and GABAergic neurons), and have a preferential laminar distribution (D1+ neurons are mainly found in the deep layers, whereas D2+ neurons are primarily found in the superficial layers). However, the activity patterns of these populations during movement execution and their functional contribution to motor performance remain to be determined.

We have developed a forelimb-specific task in head-fixed mice, in which animals make skilled forelimb movements using a joystick to acquire a water reward.^20,21,22,23^ We used fiber photometry to track DA dynamics and population-level calcium (Ca^2+^) activity in the M1 forelimb area of D1Cre and D2Cre mice during the development and execution of skilled behavior, and following subsequent changes to the reward threshold or availability. Furthermore, we used retrograde tracing to determine the long-range connections of M1 projection neurons in D1-tdTomato and D2Cre::Ai14 (tdTomato) mice. We provide the first comprehensive assessment of the role that DA-sensitive motor cortical circuits play in the formation and execution of skilled forelimb movements.

Results

Simultaneous recording of DA dynamics and population-level Ca^2+^ activity in the M1 forelimb area

We recently identified two populations of DA receptor-expressing neurons with a layer-specific distribution in the forelimb area of the mouse M1.^19^ We showed that D1+ neurons are primarily found in the deep layers, whereas D2+ cells are distributed in the superficial layers. Here, we aimed to determine DA release dynamics and record cell type-specific Ca^2+^ activity of these populations in awake, behaving animals. We injected an adeno-associated virus (AAV) expressing a high-affinity green fluorescent DA sensor (GRAB_DA2h)^24^ and a Cre-dependent red fluorescent Ca^2+^ indicator (jRGECO1a)^25^ into the M1 forelimb area of D1Cre and D2Cre transgenic mice,^26,27^ which express Cre recombinase in dopaminoceptive neurons (Cre+). The DA sensor was also expressed in wild-type littermates (Cre-), and data from Cre+ and Cre- genotypes were combined within each strain. AAV injections were made in layer 5 of D1Cre mice and layer 2/3 of D2Cre animals, with optical fibers positioned above the injection site in the right hemisphere (Figures 1A, S1A, and S1B).

Using a CMOS camera-based fiber photometry system,^28^ we simultaneously monitored GRAB_DA2h and jRGECO1a signals in head-fixed mice trained to make skilled, bimanual forelimb movements with a joystick to acquire a delayed water reward (Figure 1B). During recordings, we observed Ca^2+^ transients in D1+ and D2+ neuronal populations, along with concomitant DA release events. The observed signal changes were related to the motions of the forelimbs, which in the following sessions were oriented toward moving the joystick (Figures S1C and S1D).

Figure 1. Head-fixed mice learn to perform skilled forelimb movements to obtain a reward

(A) Representation of injecting AAV9-hSyn-GRAB_DA2h and AAV1-Syn-Flex-NES-jRGECO1a into M1. Coronal sections obtained from D1Cre and D2Cre mice showing expression of GRAB_DA2h (green) and jRGECO1a (red) in the M1 forelimb area (brain slices are 0.5 mm anterior to the bregma). Scale bars, 500 μm. Magnification: 20×. See also Figure S1 for the histological map of GRAB_DA2h and jRGECO1a expression and fiber tip placements. (B) Representation of the task in which head-fixed mice perform skilled forelimb movements with a joystick to obtain a delayed reward, and a fiber photometry setup for simultaneous recording of DA dynamics and population-level Ca^2+^ activity in the M1 forelimb area. See also Figure S1 for representative traces of neuronal Ca^2+^ and DA signals acquired during training. (C) Increase in the number of rewarded joystick movements in D1Cre mice during training. (D) Movement kinematic parameters obtained from the early and late sessions in D1Cre mice showing that the average movement amplitude and velocity increased between training phases. (Left to right) Distribution of movement amplitudes, movement amplitude, and joystick velocity at the time of threshold crossing. (E) Individual movement trajectories of an example D1Cre mouse aligned to the initial joystick position (black dot), and preferred movement direction in the group of D1Cre animals (based on all extracted trajectories). (F) Change in movement similarity in D1Cre mice (based on the average distance between pairwise trajectories). (Left to right) Similarity between the early and late sessions; similarity within the late session. (G–J) Same as (C)–(F) for D2Cre mice. In (C)–(J), data were obtained from 14 D1Cre (10 Cre+ and 4 Cre-) and 13 D2Cre (10 Cre+ and 3 Cre-) animals. Data are represented as mean ± SEM. Paired t test; ∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05.

Overall, we confirmed that DA dynamics and population-level Ca^2+^ activity in the M1 forelimb area can be recorded with fiber photometry.

Head-fixed mice learn to perform skilled forelimb movements to obtain a reward

As training progressed, mice of both genotypes demonstrated a significant increase in the number of rewarded joystick movements (Figures 1C and 1G). To further investigate the development of this skilled behavior, we compared the movement kinematic parameters between the early and late learning sessions. For the early session, we selected a day on which the animal made at least 30 rewarded joystick movements (usually around day 4; in D1Cre median = 4.5; in D2Cre median = 4), and data from the last 2 days were used for the late session comparison. We found that the mice gradually produced more vigorous movements, as the amplitude and velocity increased significantly across the learning days (Figures 1D and 1H).

Animals primarily used pushing and pulling motions along their body axis (with relatively little lateral movement) to control the joystick, but since reward was elicited by joystick movements in any direction, we saw a considerable degree of trial-to-trial variability in movement trajectories (Figures 1E and 1I). We employed the dynamic time warping (DTW) algorithm^29^ to determine the Euclidean distance between pairwise trajectories and compare their similarity. In contrast to the early session, we noticed less similarity between trajectories in the late session, when animals performed a substantial number of movements, but there were no noticeable deviations in the similarity of movements throughout the session (Figures 1F and 1J). The degree of trajectory variability was comparable to what has been previously observed in similar operant joystick tasks.^20,21,22,23^
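As an illustration of the trajectory comparison described above, the following is a minimal sketch of DTW with a Euclidean point-to-point cost, followed by an average over all trajectory pairs. It is not the authors' implementation: the function names, the representation of a trajectory as a list of (x, y) joystick positions, and the absence of any path-length normalization are assumptions made for the example.

```python
import math
from itertools import combinations

def dtw_distance(traj_a, traj_b):
    """Accumulated DTW cost between two trajectories of (x, y) points.

    The local cost is the Euclidean distance between matched points
    (an assumption; the source only says Euclidean distance was used).
    """
    n, m = len(traj_a), len(traj_b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning the first i and j points
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(traj_a[i - 1], traj_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat point of traj_b
                                 cost[i][j - 1],      # repeat point of traj_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def mean_pairwise_dtw(trajectories):
    """Average DTW cost over all unordered pairs (lower = more similar)."""
    pairs = list(combinations(trajectories, 2))
    return sum(dtw_distance(a, b) for a, b in pairs) / len(pairs)
```

With this measure, a larger mean pairwise distance corresponds to less similar trajectories, which is the direction of the change reported for the late session.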

Overall, we demonstrated that head-fixed animals developed goal-directed behavior and used skilled forelimb movements to manipulate the joystick and obtain delayed rewards.

DA dynamics in the M1 encode execution of skilled forelimb movement and reward consumption

Next, we monitored DA and Ca^2+^ activity in the M1 and licking patterns following joystick movement and reward delivery, throughout the same early and late sessions. The data were aligned to the threshold crossing (T: 0 s), with the preceding 2 s serving as the baseline (BL: −2 to 0 s), and analyzed at 1-s intervals, with the movement phase occurring between 0 and 1 s and reward consumption between 1 and 2 s. We compared the subsequent time periods with the BL within each session, as well as movement- and reward-related periods between sessions (Figures 2A–2F) and between mouse strains (Figures S2A and S2B). In both D1Cre and D2Cre mice, DA in the M1 was released in response to joystick movement and reward delivery (Figures 2A and 2D). The D1+ and D2+ populations were likewise simultaneously recruited during joystick movement, and the reward-related response was also present in the Ca^2+^ signals, particularly during the early session (Figures 2B and 2E), suggesting that at the population level, D1+ and D2+ cells in the M1 may also be activated during reward collection. Nevertheless, motor-related activity was more prominent than reward-related activity in the Ca^2+^ data, indicating that the D1+ and D2+ populations were mainly involved in the generation of forelimb movements. Since there was a simultaneous increase in the DA and Ca^2+^ signals, we performed cross-correlation analysis to further investigate this relationship and determine the similarity and temporal alignment between these signals. In both groups of animals, DA and Ca^2+^ signals showed a large positive correlation, with a very short (∼50 ms) negative lag (Figures S2C and S2D). This suggests that the increase in DA levels and the recruitment of DA-sensitive neuron populations in M1 are coordinated processes that co-occur in time.

Figure 2. DA dynamics in the M1 encode execution of skilled forelimb movement and reward consumption

(A) Peri-event plots of the average normalized DA signal in D1Cre mice aligned to threshold cross (top) and binned data analyzed at 1-s intervals (bottom). (Left to right) Data are presented as within-session (early or late) comparisons with respect to the baseline (BL) and between-session comparisons of movement- and reward-related periods (early vs. late). (B) Peri-event plots of the average normalized Ca^2+^ signal in D1Cre mice aligned to threshold cross (top) and binned data analyzed at 1-s intervals (bottom). (Left to right) Data are presented as within-session (early or late) comparisons with respect to the BL and between-session comparisons of movement- and reward-related periods (early vs. late). (C) Peri-event plots of the average lick rate (licks/s) in D1Cre mice aligned to threshold cross (top) and binned data analyzed at 1-s intervals (bottom). (Left to right) Data are presented as within-session (early or late) comparisons with respect to the BL and between-session comparisons of movement- and reward-related periods (early vs. late). (D–F) Same as (A)–(C) for D2Cre mice. In (A)–(F), data were obtained from 14 D1Cre (10 Cre+ and 4 Cre-) and 13 D2Cre (10 Cre+ and 3 Cre-) animals. Data are represented as mean ± SEM. Repeated-measure one-way ANOVA; multiple comparisons: BL mean vs. every other mean, movement (0–1 s) mean vs. reward (1–2 s) mean; Bonferroni’s post hoc test; ∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05; or paired t test; ∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05. Dashed lines indicate joystick movement (threshold crossing) and reward delivery. See also Figure S2 for comparisons between D1Cre and D2Cre mice and cross-correlations between DA and Ca^2+^ signals.
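The peri-event binning used throughout this section (alignment to threshold crossing, a 2-s baseline, and 1-s movement and reward windows) can be sketched as follows. This is an illustrative example, not the authors' analysis code: the `WINDOWS` dictionary, the function names, and the input layout (a list of sample timestamps in seconds and an already-normalized signal of the same length) are assumptions.

```python
from statistics import fmean

# Windows relative to threshold crossing (s), as described in the text.
WINDOWS = {
    "baseline": (-2.0, 0.0),
    "movement": (0.0, 1.0),
    "reward": (1.0, 2.0),
}

def window_mean(times, signal, start, stop):
    """Mean of the samples whose timestamps fall in [start, stop)."""
    return fmean(s for t, s in zip(times, signal) if start <= t < stop)

def peri_event_summary(times, signal, event_times):
    """Average each window across trials (one trial per threshold crossing)."""
    per_trial = [
        {name: window_mean(times, signal, t0 + a, t0 + b)
         for name, (a, b) in WINDOWS.items()}
        for t0 in event_times
    ]
    return {name: fmean(trial[name] for trial in per_trial)
            for name in WINDOWS}
```

The resulting per-window means are the kind of values that would then be compared against the baseline or between sessions; a window containing no samples raises `statistics.StatisticsError` rather than silently returning a value.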

As the animals progressed through training, reward-related DA and Ca^2+^ responses in D1Cre mice decreased slightly in the late session (Figures 2A and 2B), but we did not observe this change in D2Cre animals (Figures 2D and 2E). The behavioral data revealed that mice from both groups licked more vigorously (with higher frequency) at the time of reward delivery (1–2 s) (Figures 2C and 2F) and showed the emergence of anticipatory licking that preceded reward delivery (0–1 s) in the late session. The licking frequency prior to the reward delivery in the late session was slightly higher in D1Cre mice than in D2Cre animals, but no other differences between genotypes were observed (Figures S2A and S2B). This may imply that the reduction in reward-related DA and Ca^2+^ responses observed in D1Cre animals between the early and late sessions could be related to a change in reward expectation.

While rewarded joystick movements were our main focus, we also observed increases in Ca^2+^ and DA concentrations after low-amplitude (subthreshold) joystick movements and unrewarded movements during the inter-trial interval (ITI). We manually marked the onset of these joystick movements in a sample of mice and examined the DA and Ca^2+^ responses that matched the movement onset (Figures S2E and S2F). In subthreshold trials, the movement-related signal amplitude was reduced, and in D2Cre mice, it was hardly distinguishable from the BL. In both strains, the reward-related component of the signal was absent during unrewarded movement, and once the movement was finished, the signal reverted to its BL. This suggests that the magnitude of DA and Ca^2+^ responses in M1 may be determined by movement vigor and reward availability.

Overall, we found that animal behavior was goal-directed and oriented toward attaining a delayed reward, and that the generation of skilled forelimb movements and reward collection are encoded by the DA dynamics and population activity of DA-recipient neurons in the M1.
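The cross-correlation analysis mentioned in this section, which looks for the lag at which the DA and Ca^2+^ traces are most similar, can be illustrated with a small sketch. The source does not state its lag sign convention or implementation, so the convention here (a positive lag means the second signal follows the first), the use of a Pearson coefficient at each lag, and the function names are all assumptions.

```python
from statistics import fmean

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den  # raises ZeroDivisionError if either segment is constant

def best_lag(x, y, max_lag):
    """Return (lag, r) maximizing the correlation of x[t] with y[t + lag].

    Lags are in samples; a positive lag means y follows x
    (assumed convention for this sketch).
    """
    best = None
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            xs, ys = x[:len(x) - k], y[k:]
        else:
            xs, ys = x[-k:], y[:len(y) + k]
        r = pearson(xs, ys)
        if best is None or r > best[1]:
            best = (k, r)
    return best
```

A lag in samples converts to milliseconds by multiplying by 1000 and dividing by the sampling rate in Hz.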

The neural response in the M1 reflects the movement vigor but not the exerted effort

To test if DA-dependent signaling in the M1 correlates with movement vigor, we further trained mice with a regular amplitude threshold (equivalent to a late session) and sorted the rewarded joystick movements produced during this session by amplitude to extract the low-amplitude (25th percentile) and high-amplitude (75th percentile) motions. We found that the high-amplitude movements were generated at higher velocities (performed with greater vigor) (Figures 3A and 3B). When we compared DA and Ca^2+^ responses associated with the low-amplitude/low-velocity and high-amplitude/high-velocity movements, we found that the increased movement vigor translated into increased DA signaling (Figures 3C and 3F) and greater recruitment of D1+ and D2+ populations in the M1 (Figures 3D and 3G). The licking frequency following reward delivery in high-amplitude/high-velocity trials was slightly increased in D2Cre, but not D1Cre mice (Figures 3E and 3H).

Figure 3. The neural response in the M1 reflects the movement vigor

(A and B) Movement kinematic parameters obtained from the low- and high-amplitude trials during the regular threshold session in D1Cre and D2Cre mice showing the difference in response vigor. (Left to right) Distribution of movement amplitudes, movement amplitude, and joystick velocity at the time of threshold crossing. (C) Peri-event plots of the average normalized DA signal in D1Cre mice aligned to threshold cross (top) and comparisons of movement- and reward-related periods (analyzed at 1-s intervals) between low- and high-amplitude trials (bottom). (D) Peri-event plots of the average normalized Ca^2+^ signal in D1Cre mice aligned to threshold cross (top) and comparisons of movement- and reward-related periods (analyzed at 1-s intervals) between low- and high-amplitude trials (bottom). (E) Peri-event plots of the average lick rate (licks/s) in D1Cre mice aligned to threshold cross (top) and comparisons of movement- and reward-related periods (analyzed at 1-s intervals) between low- and high-amplitude trials (bottom). (F–H) Same as (C)–(E) for D2Cre mice. In (A)–(H), data were obtained from 14 D1Cre (10 Cre+ and 4 Cre-) and 13 D2Cre (10 Cre+ and 3 Cre-) animals. Data are represented as mean ± SEM. Paired t test; ∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05. Dashed lines indicate joystick movement (threshold crossing) and reward delivery. See also Figure S4 for comparisons between D1Cre and D2Cre mice.
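The amplitude-based trial sorting described above can be sketched as a quartile split. This example assumes that "low-amplitude" means trials at or below the 25th percentile and "high-amplitude" means trials at or above the 75th percentile of the rewarded-movement amplitudes; the source does not spell out the exact cut-off rule or interpolation method, and the function name is hypothetical.

```python
from statistics import quantiles

def split_by_amplitude(amplitudes):
    """Return (low_idx, high_idx): trial indices in the bottom and top quartiles.

    Uses inclusive quartile cut points; the interpolation method is an
    assumption made for this sketch.
    """
    q1, _, q3 = quantiles(amplitudes, n=4, method="inclusive")
    low_idx = [i for i, a in enumerate(amplitudes) if a <= q1]
    high_idx = [i for i, a in enumerate(amplitudes) if a >= q3]
    return low_idx, high_idx
```

The returned indices can then be used to select the matching photometry, velocity, and licking data for each trial group before comparing the movement- and reward-related windows.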

We then examined the potential effects of increasing task difficulty on animal behavior and neural responses in the M1. In the following session, the amplitude threshold required to obtain a reward was increased by 3 mm, forcing the animal to exert more effort. We compared all rewarded trials in the increased threshold session with all rewarded trials in the preceding session with a regular threshold. As increasing the threshold demanded more effort to obtain the reward, we observed a reduction in the number of rewarded trials, which could suggest a performance issue or a decrease in motivation (Figures S3A and S3E). Nevertheless, on successful trials, mice from both groups were able to adjust the reach amplitude to the changed threshold requirements, but no noticeable change was observed in the peak velocity measured at the time of threshold crossing (Figures S3A and S3E). This would imply that, despite exerting more effort in response to increasing threshold requirements, average movement vigor remained unchanged. Accordingly, when we compared the mean fluorescence recorded during joystick movement (0–1 s) and reward delivery (1–2 s), we found no noticeable difference in DA release between sessions with regular and increased thresholds (Figures S3B and S3F). While the D1+ population’s movement-related activity somewhat decreased during the increased threshold session, the response of the D2+ population remained unchanged between these two conditions (Figures S3C and S3G). However, when we sorted the rewarded joystick movements made during the increased threshold session by amplitude, we discovered a similar pattern to the sorted data from the regular threshold session. The high-amplitude movements occurred at higher velocities (Figures 4A and 4B), and the increase in movement vigor was associated with increased DA signaling (Figures 4C and 4F) and greater recruitment of D1+ and D2+ populations in the M1 (Figures 4D and 4G). The increased threshold session had no effect on the frequency of licking (Figures 4E, 4H, S3D, and S3H). When we compared data across genotypes, we found that the D1Cre mice generally exhibited more anticipatory licking than D2Cre animals (Figures S4A–S4C), while D2Cre mice had greater neuronal population activation during the increased threshold session than D1Cre mice (Figure S4B). No other differences between genotypes were observed.

Figure 4. The neural response in the M1 reflects the movement vigor during the increased threshold session

(A and B) Movement kinematic parameters obtained from the low- and high-amplitude trials during the increased threshold session in D1Cre and D2Cre mice. (Left to right) Distribution of movement amplitudes, movement amplitude, and joystick velocity at the time of threshold crossing. (C) Peri-event plots of the average normalized DA signal in D1Cre mice aligned to threshold cross (top) and comparisons of movement- and reward-related periods (analyzed at 1-s intervals) between low- and high-amplitude trials (bottom). (D) Peri-event plots of the average normalized Ca^2+^ signal in D1Cre mice aligned to threshold cross (top) and comparisons of movement- and reward-related periods (analyzed at 1-s intervals) between low- and high-amplitude trials (bottom). (E) Peri-event plots of the average lick rate (licks/s) in D1Cre mice aligned to threshold cross (top) and comparisons of movement- and reward-related periods (analyzed at 1-s intervals) between low- and high-amplitude trials (bottom). (F–H) Same as (C)–(E) for D2Cre mice. In (A)–(H), data were obtained from 14 D1Cre (10 Cre+ and 4 Cre-) and 12 D2Cre (10 Cre+ and 2 Cre-) animals. Data are represented as mean ± SEM. Paired t test; ∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05. Dashed lines indicate joystick movement (threshold crossing) and reward delivery. See also Figure S4 for comparisons between D1Cre and D2Cre mice.

Overall, we found that the DA dynamics and population activity of DA-receptive neurons in the M1 reflect movement vigor (represented by movement velocity) rather than exerted effort (overcoming increased amplitude threshold).

DA in the M1 signals actual reward availability during skilled performance

We were able to temporally isolate the DA release and neural activity related to reward collection from the modulation of activity during forelimb movement. To further determine the role of neural activity in the M1 in reward processing, we modified the task by introducing a longer reward delay (3 s instead of 1 s). We found that the DA response in M1 decayed at the time of the expected reward (1–2 s) and ramped up again following the actual reward (3–4 s) (Figures 5A and 5D). Although primarily recruited during joystick movement, the D1+ and D2+ populations also showed a residual response to delayed reward (Figures 5B and 5E), further indicating their role in reward processing. Mice from both groups increased their anticipatory licking during the expected reward period (1–2 s) and then further elevated their licking frequency at the time of actual reward delivery (3–4 s) (Figures 5C and 5F).

Figure 5. DA in the M1 signals actual reward availability during skilled performance

(A) Peri-event plots of the average normalized DA signal in D1Cre mice aligned to threshold cross (top) and binned data analyzed at 1-s intervals (bottom). (Left to right) Data are presented as within-session (delay or omission) comparisons with respect to the baseline (BL). (B) Peri-event plots of the average normalized Ca^2+^ signal in D1Cre mice aligned to threshold cross (top) and binned data analyzed at 1-s intervals (bottom). (Left to right) Data are presented as within-session (delay or omission) comparisons with respect to the BL. (C) Peri-event plots of the average lick rate (licks/s) in D1Cre mice aligned to threshold cross (top) and binned data analyzed at 1-s intervals (bottom). (Left to right) Data are presented as within-session (delay or omission) comparisons with respect to the BL. (D–F) Same as (A)–(C) for D2Cre mice. In (A)–(F), data for “delayed reward” analysis were obtained from 14 D1Cre (10 Cre+ and 4 Cre-) and 13 D2Cre (10 Cre+ and 3 Cre-) animals and for “omission” analysis from 9 D1Cre (7 Cre+ and 2 Cre-) and 8 D2Cre (8 Cre+) animals. Data are represented as mean ± SEM. Repeated-measure one-way ANOVA; multiple comparisons: BL mean vs. every other mean, movement (0–1 s) mean vs. expected reward (1–2 s) mean, expected reward (1–2 s) mean vs. actual reward (3–4 s) mean; Bonferroni’s post hoc test; ∗∗∗p < 0.001, ∗∗p < 0.01, ∗p < 0.05. See also Figure S5 for comparisons between D1Cre and D2Cre mice.

A subset of mice was also trained on a reinforcement schedule in which the expected reward (previously delivered at 1-s delay) was completely omitted. The reward-related component of the signal in both DA (Figures 5A and 5D) and Ca^2+^ (Figures 5B and 5E) data was absent, and DA levels and population activity of D1+ and D2+ cells elevated by the joystick movement slowly returned to BL. Although mice showed a significant increase in licking during anticipation (0–1 s) and following expected reward (1–2 s), the average licking frequency in the absence of actual reward was relatively low (Figures 5C and 5F). Again, D1Cre mice exhibited more anticipatory licking than D2Cre animals in the delayed reward experiment, but no other differences between genotypes were observed (Figures S5A and S5B).

Overall, this showed that animal behavior was under goal-directed control and that the DA release and population activity of DA-recipient neurons in the M1 tracked the actual reward availability during skilled performance.

A subset of D1+ and D2+ neurons of the M1 contacts long-range targets

We previously traced the axonal projections of D1+ and D2+ cells in the M1 using an anterograde tracing approach and found labeled axons in various cortical and subcortical brain regions.19 However, it was not established whether these were axons of passage or functional connections. Therefore, we wanted to confirm long-range connections between D1+ or D2+ populations in M1 and the expected target regions. We used D1-tdTomato30 and D2Cre::Ai14 (tdTomato)27^,^31 mice and injected rAAV2-retro-hSyn-EYFP (rAAV2-retro)32 or Fluoro-Green (FGr)33 retrograde tracers (which labeled cells in green) at four specific target locations, including: the M1, dorsolateral striatum (DLS), thalamus (Thal), and pontine nucleus (PN) (Figures 6A and S6A). We obtained slices containing the M1 forelimb regions contralateral and ipsilateral to an injection site and counted D1+ or D2+ (tdTomato-positive) and EYFP+ (EYFP-positive) or FGr+ (FGr-positive) cells (Figures 6B and 6C). We plotted the laminar distribution of D1+ and D2+ cells together with the projection-specific distributions of retrogradely labeled EYFP+ and FGr+ neurons (Figures S6B and S6C). Consistent with our earlier findings,19 D1+ neurons were most concentrated in deep layers (relative to the pia), while their abundance in the superficial layers was lower (Figures 6D and S6B). In contrast, D2+ cells were predominantly concentrated in the upper layers, with a much smaller number in the lower half of M1 (Figures 6G and S6C).

Figure 6. A subset of D1+ and D2+ neurons of the M1 contacts long-range targets
(A) Representation of the four major areas of rAAV2-Retro-EYFP and Fluoro-Green injections: primary motor cortex (M1), dorsolateral striatum (DLS), thalamus (Thal), and pontine nucleus (PN). See also Figure S6 for representative histology of the injection sites.
(B and C) Coronal sections obtained from D1-tdTomato and D2Cre::Ai14 (tdTomato) mice showing laminar distribution of tdTomato (red) expressing D1+ or D2+ neurons and EYFP or FGr (green) expressing retrogradely labeled projection neurons in the M1 forelimb area (brain slices are 0.5 mm anterior to the bregma). Scale bars, 50 μm. See also Figure S6 for analysis of laminar distributions.
(D) Cell density of D1+ neurons plotted as a function of distance from pia (where pia = 0 and white matter = 1; therefore, <0.5 indicates cells clustered in the upper layers, while >0.5 indicates cells clustered in the deep layers).
(E) Colocalization of D1+ neurons with different populations of projection neurons. Percentage of D1+ neurons indicates colabeled cells across all layers that are also EYFP+ (% [D1+ EYFP+] / D1+) or FGr+ (% [D1+ FGr+] / D1+). Percentage of EYFP+ neurons indicates cells that are also D1+ (% [D1+ EYFP+] / EYFP+), and percentage of FGr+ neurons indicates cells that are D1+ (% [D1+ FGr+] / FGr+).
(F) Comparison of labeling specificity of rAAV2-retro and FGr tracers measured by the number of cells labeled by each tracer.
(G–I) Same as (D)–(F) for D2Cre mice.
In (D)–(I), data were obtained from 21 D1-tdTomato and 19 D2Cre::Ai14 animals (2–3 mice per injection site, 4 slices per animal, slices were collected from −0.10 mm posterior to 1.00 mm anterior to the bregma). Data are represented as mean ± SEM. Paired t test; ∗∗∗p < 0.001. Two-way ANOVA; multiple comparisons: EYFP mean vs. FGr mean; Bonferroni’s post hoc test; ∗∗∗p < 0.001, ∗∗p < 0.01.
(B–I) cM1 (primary motor cortex injection, M1 forelimb area contralateral to the injection site); cDLS (dorsolateral striatum injection, M1 forelimb area contralateral to the injection site); iDLS (dorsolateral striatum injection, M1 forelimb area ipsilateral to the injection site); iThal (thalamus injection, M1 forelimb area ipsilateral to the injection site); iPN (pontine nucleus injection, M1 forelimb area ipsilateral to the injection site).

Next, we examined the colocalization of D1+ or D2+ neurons with different populations of EYFP+ and FGr+ projection neurons. We quantified the percentage of D1+ neurons that are also EYFP+ (% [D1+ EYFP+] / D1+), and D1+ neurons that are also FGr+ (% [D1+ FGr+] / D1+), and did the same with D2+ neurons (Figures 6E and 6H). In each case, we found relatively strong colabeling for cells projecting to contralateral M1 (∼20% in D1+ and ∼10% in D2+ population), with much less colabeling for cells projecting to other regions (less than 5%). Because the ratios of tdTomato+ to retrogradely labeled cells were not equal, this analysis might have underestimated the number of colabeled cells. Therefore, we also quantified the percentage of EYFP+ neurons that are D1+ (% [D1+ EYFP+] / EYFP+), and FGr+ neurons that are D1+ (% [D1+ FGr+] / FGr+), and did the same with D2+ neurons (Figures 6E and 6H). In this case, the overlap was slightly higher, but the percentage of colabeled neurons projecting to subcortical targets was still relatively low, suggesting that only a small fraction of neurons from D1+ and D2+ populations contact long-range targets.
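The two colabeling ratios described above share a numerator but differ in the denominator, which is why they can diverge when the tdTomato+ and tracer-labeled populations differ in size. A minimal Python sketch of that arithmetic follows; the function name and the cell counts are hypothetical and are not taken from the study's data.

```python
def colabel_percentages(n_receptor, n_tracer, n_double):
    """Return two colabeling percentages for one slice or pooled counts.

    n_receptor: cells positive for the receptor reporter (e.g. D1+ tdTomato)
    n_tracer:   cells positive for the retrograde tracer (EYFP+ or FGr+)
    n_double:   cells positive for both markers
    """
    if n_double > min(n_receptor, n_tracer):
        raise ValueError("double-labeled count cannot exceed either population")
    # % [D1+ EYFP+] / D1+ : share of the receptor population that projects
    pct_of_receptor = 100.0 * n_double / n_receptor if n_receptor else float("nan")
    # % [D1+ EYFP+] / EYFP+ : share of the projection population that is receptor+
    pct_of_tracer = 100.0 * n_double / n_tracer if n_tracer else float("nan")
    return pct_of_receptor, pct_of_tracer


# Hypothetical counts, for illustration only.
pct_d1, pct_eyfp = colabel_percentages(n_receptor=400, n_tracer=100, n_double=20)
```

With these illustrative counts the same 20 double-labeled cells amount to 5% of the receptor-positive population but 20% of the tracer-labeled population.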

Although the laminar distributions of retrogradely labeled cells were comparable between the two tracers, there was a visible difference in the laminar distribution and proportion of labeled cells projecting to the Thal (Figures S6B and S6C). We speculated that the rAAV2-retro and FGr had different specificity for labeling particular cells or projections, which could account for the observed effect. To further test this, we compared the total number of neurons labeled using both methods. The analysis revealed that mice from both strains injected with FGr had a greater number of labeled neurons projecting to the Thal than animals injected with rAAV2-retro (Figures 6F and 6I). A considerably greater number of M1-projecting neurons in FGr-injected animals was also confirmed in the D1-tdTomato strain (Figure 6F).

Overall, we confirmed that D1+ and D2+ neurons have a discrete distribution in the layers of the M1. We also showed that only a small fraction of D1+ and D2+ projection neurons of the M1 contact long-range targets. However, the reported effects may have been affected by the labeling efficacy of the retrograde tracers used in the study.

Discussion

Motor learning is a process by which animals acquire skilled movements and associate new sensory information with actions.34^,^35 In principle, this learning process is driven and motivated by rewards that inform about the consequences of actions.36^,^37^,^38 A hypothesized neural substrate of reward signals in the M1 is dopaminergic projections from the ventral tegmental area (VTA), and previous studies have shown that these dopaminergic inputs from the VTA to M1 are essential for reach-to-grasp skill learning, modulation of synaptic plasticity, and reorganization of the M1 network for storing new motor skills.6^,^11^,^39 However, those studies focused on the long-term consequences of DA depletion or VTA inhibition. Here, we provide evidence that DA is released both during the initiation of skilled forelimb movement and following reward consumption. Moreover, the DA reward response tracks the delay in reward delivery and is absent following unrewarded joystick movement. This indicates that DA in M1 signals reward in real time and thus may directly influence motor cortical learning processes by informing about the consequences of actions and facilitating motor adaptation.

It has been shown that M1 undergoes structural and functional neuroplasticity changes during motor learning, and that DA is essential for these processes.11^,^12^,^13^,^16^,^39^,^40^,^41 Here, we found that populations of DA-sensitive D1+ and D2+ neurons were concurrently recruited during the onset of movement, but also responded to reward. During learning, there was no discernible change in population activity related to movement. This could be consistent with the earlier studies showing that the average fraction of neurons in M1 activated in each individual forelimb movement remains stable throughout learning, and the overall mean activity did not change.39^,^41^,^42 However, these studies used 2-photon microscopy, which made it possible to show that learning was associated with reorganization of the neuronal activity dynamics and the network connectivity, processes that cannot be captured with fiber photometry. Furthermore, our findings did not confirm a causal relationship between DA release in M1 and D1+ or D2+ recruitment, which may also be due to limitations of fiber photometry. Still, the reward signals encoded by these populations are consistent with recent reports showing that the M1 neurons may represent the outcome of movement performance and that DA input to M1 is crucial for the outcome encoding,39^,^43 further supporting the notion that DA reward signals in M1 may shape the motor output.

The ability to execute motions over a range of amplitudes and speeds, known as movement vigor, is another aspect of motor skill learning, and it is generally accepted that DA invigorates and motivates movement.44^,^45 In our study, we found that the DA dynamics and population activity of DA-receptive neurons in the M1 reflect movement vigor (represented by movement velocity), but not necessarily the exerted effort (overcoming an increased amplitude threshold). This was evident when individual movements were divided into low- and high-amplitude ones, but not in the average responses calculated from the entire session. The observed increase in DA levels and greater recruitment of D1+ and D2+ populations in M1 could have reflected preparatory neural activity related to the decision to engage in vigorous action, which is consistent with the role of M1 as a processing site for voluntary motor commands. However, further research will be needed to fully understand how M1 dopaminoceptive circuits and their connections to downstream motor centers regulate vigor and reach kinematics.44^,^46^,^47

Axonal projections from the mouse M1 have been extensively studied, revealing a very complex network.3^,^48^,^49^,^50^,^51 We have previously provided details of the laminar distribution of D1+ and D2+ somas in the M1 forelimb area and traced their axonal projections.19 Here, we injected retrograde tracers into previously identified regions of axonal projections to determine the long-range connections of D1+ and D2+ projection neurons in more detail. Among the identified long-range connections, the majority were made with the contralateral cortex, and to a lesser extent with the striatum, thalamus, and pons. Nevertheless, even though both tracers were effective in retrograde labeling of projection neurons, we found a relatively low degree of colocalization between retrogradely labeled cells and tdTomato-expressing D1+ and D2+ neurons. We also found that the FGr tracer was more effective in labeling cortico-thalamic projections than rAAV2-retro. This would suggest that either a small percentage of D1+ and D2+ cells are in fact projection neurons or that their number was underestimated. However, even though rAAV2-retro was demonstrated to be an effective retrograde viral tracer, it was suggested that its labeling specificity for cortico-thalamic projections is relatively weak, while that for cortico-striatal projections is moderate.32 Furthermore, previous studies utilized rabies or classic retrograde tracers (fluorogold and cholera toxin B subunit) to define the projections from the M1 forelimb area.3^,^50^,^51 Therefore, we cannot exclude the possibility that the use of other tracers would provide slightly different results.

Overall, our findings provide new insights into the role of DA and local dopaminoceptive circuits in the M1 in motor learning and execution of skilled behavior. We demonstrate for the first time that DA-dependent neuronal activity in M1 is temporally linked to skilled forelimb movements, provides information about the outcome of the movement, and may reflect the decision to perform vigorous action. This suggests that reinforcement motor learning of skilled behavior may be supported by the phasic DA signals in M1. Understanding reward signals in M1 may open new avenues for improving motor rehabilitation.36

Limitations of the study

Here, we used fiber photometry to monitor the population-level Ca^2+^ activity and DA release in the M1 forelimb area. Although fiber photometry provides satisfactory temporal resolution, it lacks cellular resolution, which may be crucial for capturing changes in the M1 network during motor learning.39^,^41 Furthermore, the heterogeneity of D1+ and D2+ populations,19 as well as the lower than predicted connectivity with downstream motor centers, may hinder our comprehensive understanding of their role in motor learning and movement execution. Further optogenetic investigation is required to determine the impact of the D1+ and D2+ populations in M1 on movement vigor and reach kinematics. Future research could also involve measuring the activity of D1+/D2+ neurons while modulating DA release, in order to clarify the relationship between DA action and neuronal activity in the M1.

Resource availability

Lead contact

Requests for further information and resources should be directed to and will be fulfilled by the lead contact, Przemyslaw E. Cieslak ([email protected]).

Materials availability

This study did not generate new, unique reagents.

Data and code availability

•Raw data have been deposited at RODBUK Cracow Open Research Data Repository: https://doi.org/10.57903/UJ/VE9HAJ.
•Python code used for data analysis has been deposited at Gniewosz Drwiega GitHub: https://github.com/gniewko-d/Joystick_task/.
•Joystick task is described by Belsey et al.20 Full build instructions and Arduino code can be found at the Yttri Lab GitHub: https://github.com/YttriLab/Joystick.
•Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Acknowledgments

This research was funded in whole by the National Science Centre, Poland [grant SONATA 2020/39/D/NZ4/00503 (to P.E.C.)]. For the purpose of open access, the authors have applied a CC-BY public copyright license to any author accepted manuscript (AAM) version arising from this submission. M.G.-N. was supported by the National Science Centre, Poland [grant PRELUDIUM 2024/53/N/NZ4/04146]. G.D. was supported by the National Science Centre, Poland [grant PRELUDIUM 2022/45/N/NZ4/03171]. We would like to thank Alex Kwan (Yale University) for sharing the design for the head-fixation system.

Author contributions

P.E.C. designed the study and wrote the manuscript. M.G.-N. and P.E.C. performed the experiments and analyzed the data. G.D. wrote code for data analysis. J.R.P. provided transgenic mice. L.S. genotyped animals and managed the colony. M.G.-N., G.D., and J.R.P. revised and edited the manuscript.

Declaration of interests

The authors declare no competing interests.

STAR★Methods

Key resources table

| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
| --- | --- | --- |
| Antibodies | | |
| Rabbit Anti-RFP | Rockland | RRID: AB_2209751 |
| Rabbit Anti-GFP | Abcam | RRID: AB_305564 |
| Chicken Anti-GFP | Abcam | RRID: AB_300798 |
| Donkey Anti-Rabbit Cy3 | Jackson ImmunoResearch | RRID: AB_2313568 |
| Donkey Anti-Rabbit Alexa 488 | Jackson ImmunoResearch | RRID: AB_2340619 |
| Donkey Anti-Chicken Alexa 488 | Jackson ImmunoResearch | RRID: AB_2340376 |
| Bacterial and virus strains | | |
| AAV9-hSyn-GRAB_DA2h_ | Addgene | RRID: Addgene_140554 |
| AAV1-Syn-Flex-NES-jRGECO1a | Addgene | RRID: Addgene_100853 |
| rAAV2-Retro-hSyn-EYFP | UNC Vector Core | RRID: Addgene_81070 |
| Chemicals, peptides, and recombinant proteins | | |
| Normal Donkey Serum | Jackson ImmunoResearch | RRID: AB_2337258 |
| Triton X-100 | Sigma-Aldrich | Cat#: T8787 |
| Fluoroshield with DAPI | Sigma-Aldrich | Cat#: F6057 |
| Paraffin Oil | Sigma-Aldrich | Cat#: 76235 |
| Saccharin | Sigma-Aldrich | Cat#: 109185 |
| PBS 7.4 | Roth | Cat#: 1108.1 |
| Formaldehyde | POCH | Cat#: 432173111 |
| Tolfedine 4% | Vetoquinol | N/A |
| Ketamine | Biowet | N/A |
| Sedazine | Biowet | N/A |
| Morbital | Biowet | N/A |
| Deposited data | | |
| Raw data | This paper | https://doi.org/10.57903/UJ/VE9HAJ |
| Analysis scripts | This paper | https://github.com/gniewko-d/Joystick_task/ |
| Experimental models: Organisms/strains | | |
| D2Cre: B6.FVB(Cg)-Tg(Drd2-cre)ER44Gsat/Mmucd | MMRRC: UCD | RRID: MMRRC_032108-UCD |
| D1Cre: B6.FVB(129S6)-Tg(Drd1a-cre)AGsc/KndlJ | The Jackson Laboratory | RRID: IMSR_JAX:030329 |
| D1-tdTomato: B6.Cg-Tg(Drd1a-tdTomato)6Calak/J | The Jackson Laboratory | RRID: IMSR_JAX:016204 |
| Ai14: B6.Cg-Gt(ROSA)26Sor^tm14(CAG-tdTomato)Hze^/J | The Jackson Laboratory | RRID: IMSR_JAX:007914 |
| Software and algorithms | | |
| RWD Fiber Photometry Analysis Software | RWD Life Science | https://www.rwdstco.com/product-item/r820-tricolor-multichannel-fiber-photometry-system/ |
| ImageJ | NIH | https://imagej.net/ij/ |
| Zen Imaging Software | Zeiss | https://www.zeiss.com/microscopy/en/products/software/zeiss-zen.html |
| GraphPad Prism 6 | GraphPad Software | https://www.graphpad.com/ |
| Python 3.9 | Anaconda Software Distribution | https://www.anaconda.com/ |
| MATLAB R2024a | MathWorks | https://www.mathworks.com/products/matlab.html |
| Other | | |
| Tricolor Multichannel Fiber Photometry System | RWD Life Science | Model: R820 |
| Automated Stereotaxic Instrument | RWD Life Science | Model: 71000 |
| Widefield Light Microscope | Zeiss | Model: Axio Imager M2 |
| Vibratome | Leica Biosystems | Model: VT1200S |
| Glass Microelectrode Puller | Narishige | Model: PE-21 |
| Glass Capillary Nanoinjector | Neurostar | Model: NanoW |
| Data Acquisition Board | Arduino | Model: Mega 2560 Rev3 |
| Capacity Touch Sensor | Adafruit | Model: MPR121 |
| USB 3.0 Camera | Basler | Model: acA1440-220uc |
| 5 MP Lens | Basler | Model: C125-0818-5M |
| USB 3.0 Camera | MindVision | Model: MV-SUA502C-T |
| 5 MP Lens | ZLKC | Model: VM0420MP5 |
| Super-Bond Universal Kit | Sun Medical | https://www.sunmedical.co.jp/english/product/super-bond/universal-kit/index.html |
| Borosilicate Glass Pipettes | Sutter Instrument | Cat#: B114-53-10NP |
| Ceramic Ferrule | Thorlabs | Cat#: CFLC230-10 |
| Miniature Hall Effect Joystick | Ruffy Controls | Cat#: TS1-1-R-R-1-BK |
| Laboratory Chow | Altromin | Cat#: VRF1 |
| Hylo Gel | Ursapharm | N/A |
| Fluoro-Green | Tombow Pencil | N/A |

Experimental model and study participant details

All animal procedures were approved by the 2nd Local Institutional Animal Care and Use Committee in Krakow (approval number 223/2022, issued on 04 August 2022) and conducted in accordance with the Directive 2010/63/EU of the European Parliament and of the Council of 22 September 2010 on the protection of animals used for scientific purposes. Mice were housed 2–5 per cage in an animal facility room with a controlled temperature (22 ± 2°C) and humidity (40–60% RH), under a 12 h light/dark cycle. Unless otherwise specified, mice had ad libitum access to water and laboratory chow (VRF1, Altromin). D1Cre mice26 were obtained from the German Cancer Research Center, Heidelberg, and D2Cre27 from the University of California, Davis (MMRRC_032108-UCD). The D1-tdTomato line 6 mice30 and Ai14 (tdTomato) Cre reporter line31 were purchased from The Jackson Laboratory (IMSR_JAX: 016204 and IMSR_JAX: 007914). For the purpose of the project, the D2Cre and Ai14 (tdTomato) strains were crossed to obtain D2Cre::Ai14 double-transgenic animals. All mice were congenic with the C57BL/6N background (>8 generations of backcrosses prior to initiation of the study). Genotyping was performed using a standard PCR assay according to previously described protocols and genotyping protocols available in the JAX database. Mice of both sexes, aged 8–12 weeks at surgery, were used in the experiments.

Method details

Surgery

Mice were anesthetized with a mixture of ketamine (100 mg/mL, Biowet) and xylazine (20 mg/mL, Biowet) and placed into an automated stereotaxic instrument (model 71000, RWD Life Science Co., Ltd.). During surgery, eyes were protected from drying by application of eye drops (Hylo Gel, Ursapharm), and body temperature was maintained at 37°C by an automatic heating pad. After cutting the skin and exposing the skull surface, a small craniotomy above the region of interest was made using a hand drill. Borosilicate glass pipettes (B114-53-10NP, Sutter Instrument) were pulled with a vertical glass microelectrode puller (model PE-21, Narishige) to obtain ∼30 μm tips, backfilled with paraffin oil (76235-500 ML, Sigma-Aldrich), and connected to a glass-capillary Nanoinjector (model NanoW, Neurostar GmbH). The adeno-associated virus (AAV) solution or fluorescent emulsion was front-loaded into the pipette and injected into the brain tissue at a rate of 1 nL/s. The pipette was kept in place for 5 min after injection before being slowly withdrawn.

The following viral vectors and fluorescent tracer were used in the experiments: AAV9-hSyn-GRAB_DA2h_24 and AAV1-Syn-Flex-NES-jRGECO1a25 for in vivo photometry; rAAV2-Retro-hSyn-EYFP32 and Fluoro-Green33 for retrograde tracing. Viral vectors were obtained from Addgene (140554-AAV9, 100853-AAV1) or UNC GTC Vector Core (rAAV2-retro); Fluoro-Green (FGr) was manufactured by Tombow Pencil Co., Ltd. AAVs were injected at a volume of 100–200 nL, and FGr at a volume of 20–60 nL per site, according to the following coordinates: M1 forelimb area (M1: A/P 0.5 mm; M/L 1.5 mm from the bregma; D/V −0.9 mm [D1Cre] or −0.4 mm [D2Cre] from pia); dorsolateral striatum (DLS: A/P 0.5 mm; M/L 2.5 mm; D/V −3.3 to −3.0 mm from the bregma); thalamus (Thal: A/P −1.4 mm; M/L 1.5 mm; D/V −3.8 to −3.3 mm from the bregma) or pontine nucleus (PN: A/P −4.2 to −4.0 mm; M/L 0.5 mm; D/V −5.6 to −5.4 mm from the bregma).

Fiber-optic cannulas (optical fiber: Ø200 μm, 0.39 NA; ceramic ferrule: Ø230 μm, 1.25 mm/6.4 mm long; Thorlabs) were made in-house and implanted during the same surgery. In the photometry experiment, the tip of the optical fiber was inserted into the right M1 and positioned 150 to 200 μm above the injection site. Following the placement of the optical fibers, two anchor screws and the custom-made stainless steel or 3D-printed headplate were attached to the exposed surface of the skull. Dental adhesive resin cement (Super-Bond Universal, Sun Medical) was used to secure everything in place. After surgery, animals were given subcutaneous injections of the analgesic, anti-inflammatory drug Tolfedine 4% (40 mg/mL, Vetoquinol) and 5% glucose solution.

Joystick task

The joystick task was modeled after the open-source joystick manipulandum for mouse research described by Belsey et al. (2020),20 with modifications to the Arduino code and setup design. The behavioral training setup consisted of a custom-made head-fixation platform made of solid aluminum optical breadboards and optical post assemblies (Thorlabs); a control unit and data acquisition board (Arduino Mega 2560 Rev3, Arduino); an SD card reader module; a water delivery circuit controlled by a mini-solenoid valve (12 V, 0.04 MPa); and peripherals for interacting with the animal: a spring-loaded miniature Hall effect joystick (TS1, Ruffy Controls) and a lick spout (blunt tip needle) coupled to a capacitive touch sensor (MPR121, Adafruit) for lick detection. Mouse behavior was recorded using a USB 3.0 camera (model MV-SUA502C-T, MindVision) equipped with a 5 MP fixed 4 mm focal length C-Mount lens (model VM0420MP5, ZLKC) or a USB 3.0 camera (acA1440-220uc, Basler) equipped with a 5 MP 8 mm fixed focal length C-Mount lens (C125-0818-5M, Basler) placed to the left of the animal. The camera captured 90 frames per second at a resolution of 640 by 480 pixels.

Mice were given about a week to recover from surgery before starting water restriction. During water restriction, animals received 1 mL of water per day and were gradually acclimated to head restraint over another week. After two weeks, head-fixed mice were trained to make self-initiated (uncued) forelimb movements with a joystick to acquire a delayed water reward. When the joystick position surpassed the 3 mm amplitude threshold, a 3 μL drop of saccharin-flavored (0.01% w/v) water was delivered after a 1 s delay, followed by a 3 s inter-trial interval (ITI) in which no movement would be rewarded. The joystick position at the end of the ITI served as the starting point for the next trial (almost always close to the central default location). Head bar length and joystick distance from the platform were increased by 5 mm after the 3^rd^ session, in order to develop the strength and endurance necessary to complete the task. In the initial days, some mice needed assistance from the experimenter to grab and move the joystick; these trials were not included in the analysis. Once the initial connection between joystick movement and reward was established (usually by session 7), the reward threshold was raised from 3 mm to 6 mm and maintained as the default threshold for the rest of the experiment. After 14 training sessions, the animals went through additional probe sessions in which the reward threshold was suddenly increased (to 9 mm), or the reward was either delayed (by 3 s) or not given at all (omission). Regular sessions with a 6 mm threshold and a 1 s reward delay were interleaved between these probe sessions. Each animal completed up to 25 sessions.
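The trial structure described above (threshold crossing, delayed reward, ITI lockout, re-zeroing of the start position) can be sketched offline in Python. This is an illustrative reconstruction, not the Arduino code used in the study: the function name is hypothetical, displacement is treated as one-dimensional, and the ITI is assumed to begin at reward delivery.

```python
def detect_rewarded_crossings(times_s, positions_mm, threshold_mm=6.0,
                              reward_delay_s=1.0, iti_s=3.0):
    """Return (crossing_time, reward_time) pairs for rewarded movements.

    Assumptions (not confirmed by the source): 1-D displacement from the
    trial's start position, and an ITI that starts when the reward is given.
    """
    trials = []
    if not positions_mm:
        return trials
    origin = positions_mm[0]   # starting point of the current trial
    locked_until = None        # end of reward delay + ITI while a trial runs
    for t, pos in zip(times_s, positions_mm):
        if locked_until is not None:
            if t < locked_until:
                continue       # movements during delay/ITI are never rewarded
            origin = pos       # position at the end of the ITI starts the next trial
            locked_until = None
        if abs(pos - origin) > threshold_mm:
            reward_time = t + reward_delay_s
            trials.append((t, reward_time))
            locked_until = reward_time + iti_s
    return trials


# Synthetic 10 Hz trace: pushes at 1.0 s, 3.0 s (inside the lockout) and 6.0 s.
times = [i / 10 for i in range(100)]
trace = [0.0] * 100
for i in (10, 11, 12, 30, 31, 60, 61):
    trace[i] = 7.0
trials = detect_rewarded_crossings(times, trace)
```

In this synthetic trace the push at 3.0 s falls inside the delay-plus-ITI window of the first trial and is therefore not rewarded.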

Each time a suprathreshold joystick movement was detected, a TTL signal was sent from the Arduino-controlled behavioral setup to the fiber photometry recording device. The behavioral events (joystick movements and licks) were recorded and stored on an SD card as a CSV file. A custom-made Python script was used to further analyze the training session data offline. Mice were trained once a day, 6 days per week, and behavioral sessions lasted up to 20 min. At the end of the training session, the amount of water that each animal had consumed was calculated, and if required, mice were given extra water to supply 1 mL each day.
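The daily water top-up follows from the reward count and the 3 μL drop size. A minimal sketch, assuming that every delivered drop was consumed (the function name is hypothetical):

```python
def water_supplement_ml(n_rewards, drop_ul=3.0, daily_target_ml=1.0):
    """Extra water (mL) needed after a session to reach the daily target.

    Assumes each delivered drop was actually consumed by the animal.
    """
    consumed_ml = n_rewards * drop_ul / 1000.0  # convert microliters to mL
    return max(0.0, daily_target_ml - consumed_ml)


# Example: 150 rewarded trials deliver 0.45 mL during the session.
extra_ml = water_supplement_ml(150)
```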

Fiber photometry recording

Fiber photometry acquisition was performed with a commercial Tricolor Multichannel Fiber Photometry System (model R820, RWD Life Science) equipped with three LED excitation light sources – 560-nm for the red calcium indicator (jRGECO1a), 470-nm for the green dopamine probe (GRAB_DA2h_), and 410-nm for the reference signal. A multi-branched fiber-optic patch cable (2 m long, Ø200 μm, 0.37 NA, RWD Life Science) with two isolated fibers, permitting signal recording in two animals at the same time, was attached to the implanted fiber-optic cannulas with a zirconia sleeve. LEDs were set to provide ∼30 μW of light intensity at the fiber tip; light intensity was kept constant across sessions. Temporally intermingled excitation pulses of alternating wavelengths were delivered to avoid crosstalk between green and red fluorescence, and emissions were collected by two independent CMOS camera detectors at 20 Hz. To minimize the possibility of fluorescence signal bleaching, in vivo recording data were collected every other day within 20 min.

A dedicated fiber photometry system software was used for signal acquisition and initial processing (RWD Life Science). Pre-processing of the recorded signals included: (1) baseline-correction, which used an iterative weighted least square algorithm to correct the trend of decreasing fluorescence; and (2) motion-correction, which used the robust linear fit to fit the 410-nm signal to 470-nm or 560-nm signals. The motion-corrected signal was then obtained by subtracting the fitted 410-nm signal from the signal of interest. The change in fluorescence relative to the baseline fluorescence (ΔF/F) was further calculated as ΔF/F = (F-F1)/F0, where F represents the target fluorescence, F1 is the fitted 410-nm data, and F0 is the median of the raw fluorescence of the whole session; in the case of the peri-event analysis, F0 was the median of the raw fluorescence from the time window −2 to 0 s. Signals were further normalized as Z score = (x-mean)/std, where x = ΔF/F.
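The ΔF/F and Z-score formulas above can be sketched in plain Python. This is not the vendor's implementation: an ordinary least-squares fit stands in for the robust linear fit, the baseline-correction step is omitted, and the population standard deviation is assumed for the Z score. All function names are hypothetical.

```python
from statistics import mean, median, pstdev


def fit_reference(reference, signal):
    """Scale the 410-nm reference onto the signal (OLS stand-in for a robust fit)."""
    mx, my = mean(reference), mean(signal)
    sxx = sum((x - mx) ** 2 for x in reference)
    sxy = sum((x - mx) * (y - my) for x, y in zip(reference, signal))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [slope * x + intercept for x in reference]


def delta_f_over_f(signal, reference, f0=None):
    """Compute dF/F = (F - F1) / F0, with F1 the fitted reference.

    f0 defaults to the median of the raw signal over the whole trace; for
    peri-event analysis, pass the median of the -2 to 0 s window instead.
    """
    fitted = fit_reference(reference, signal)
    if f0 is None:
        f0 = median(signal)
    return [(f - f1) / f0 for f, f1 in zip(signal, fitted)]


def zscore(values):
    """Z score = (x - mean) / std, using the population standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


# Synthetic example: a shared slow drift plus one transient in the signal channel.
ref_410 = [10.0, 9.8, 9.6, 9.4, 9.2, 9.0]
sig_470 = [20.0, 19.6, 19.2, 20.8, 18.4, 18.0]
dff = delta_f_over_f(sig_470, ref_410)
z = zscore(dff)
```

Because the fitted reference absorbs the shared drift, the transient at the fourth sample is the largest value in `dff`.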

A built-in peak statistics algorithm was used to detect individual transients, defined as events with amplitudes greater than two times the median absolute deviation (MAD) above the median of the time window from 100 to 1300 s. These criteria generally selected peaks that aligned with a human observer’s judgment for detecting transients and behavioral outcomes (individual forelimb movements). For the purpose of analysis of event-related activity, the fluorescence signals aligned with threshold crossing (T = 0 s) were collected at 50 ms intervals, within the time window of −2 to 6 s. Events were saved to CSV files. A custom-written Python script was used for further processing, including trial sorting and data averaging. For the averaged peri-event traces, we compared the signal activity of one signal (e.g., dopamine release) against another signal (e.g., D1+ or D2+ neuronal signal) using the MATLAB crosscorr function. We computed the cross-correlation coefficient across various time lags, ranging from −1 s to +1 s. The cross-correlation coefficients were plotted against the lag values, with positive lags indicating that the first signal leads the second, and negative lags indicating that the second signal leads the first.
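Two of the steps above lend themselves to a short sketch: the MAD-based amplitude threshold and the lagged cross-correlation. The code below is an illustration in plain Python, not the vendor's peak algorithm or MATLAB's `crosscorr` itself. The local-maximum rule for peaks is an assumption, and the correlation follows the usual sample cross-correlation definition, so that a positive lag means the first signal leads. Function names are hypothetical.

```python
from statistics import mean, median


def mad_threshold(values, k=2.0):
    """Median + k * MAD, computed over the supplied baseline samples."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med + k * mad


def detect_transients(values, k=2.0):
    """Indices of local maxima exceeding the MAD threshold (assumed peak rule)."""
    thr = mad_threshold(values, k)
    return [i for i in range(1, len(values) - 1)
            if values[i] > thr
            and values[i] > values[i - 1]
            and values[i] >= values[i + 1]]


def cross_correlation(first, second, max_lag):
    """Normalized cross-correlation for lags -max_lag..max_lag (in samples).

    At a positive lag k, first[t] is paired with second[t + k], so a peak at
    k > 0 means the first signal leads the second.
    """
    n = len(first)
    m1, m2 = mean(first), mean(second)
    d1 = [v - m1 for v in first]
    d2 = [v - m2 for v in second]
    norm = (sum(a * a for a in d1) * sum(b * b for b in d2)) ** 0.5
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            s = sum(d1[t] * d2[t + lag] for t in range(n - lag))
        else:
            s = sum(d1[t - lag] * d2[t] for t in range(n + lag))
        result[lag] = s / norm
    return result


# Synthetic traces sampled every 50 ms: the second pulse trails the first by 3 samples.
first = [0.0] * 20
second = [0.0] * 20
first[5] = 1.0
second[8] = 1.0
xc = cross_correlation(first, second, max_lag=10)
best_lag = max(xc, key=xc.get)
```

At the 50 ms sampling interval used for peri-event traces, the ±1 s range in the text corresponds to `max_lag=20`; the example uses a shorter trace and range for brevity.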

Histology

Mice were euthanized via intraperitoneal injection of Morbital (Biowet) (sodium pentobarbital 133.3 mg/mL + pentobarbital 26.7 mg/mL) and perfused with ice-cold phosphate-buffered saline (PBS, pH 7.4, Roth) followed by 4% formaldehyde (pH 7.4, POCH). Dissected brains were fixed in 4% formaldehyde for 24 h at 4°C. Next, brains were transferred to a 30% sucrose solution for at least 48 h at 4°C. Coronal slices (50 or 100 μm) containing the motor cortex and striatum, thalamus, or pons were cut on an automatic vibrating blade microtome (VT1000S, Leica Biosystems) or a manual cryostat microtome.

Slices were blocked in PBS-T containing 10% Normal Donkey Serum (017-000-121, Jackson ImmunoResearch) and 0.6% Triton X-100 (Sigma-Aldrich) for 1 h at room temperature, followed by incubation with primary antibody diluted 1:1000 in antibody dilution solution (PBS-T with 2% NDS and 0.3% Triton X-100) overnight at 4°C. The following primary antibodies were used: rabbit anti-RFP (600-401-379, Rockland); rabbit anti-GFP (ab6556, Abcam); chicken anti-GFP (ab13970, Abcam). After primary antibody incubation, the tissue was washed in PBS-T and incubated with secondary antibody diluted 1:400 in antibody dilution solution overnight at 4°C. The following secondary antibodies were used: Cy3 AffiniPure F(ab')2 Fragment Donkey Anti-Rabbit IgG (H + L) (711-166-152, Jackson ImmunoResearch); Alexa Fluor 488 AffiniPure F(ab')2 Fragment Donkey Anti-Rabbit IgG (H + L) (711-546-152, Jackson ImmunoResearch); Alexa Fluor 488 AffiniPure F(ab')2 Fragment Donkey Anti-Chicken IgY (IgG) (H + L) (703-546-155, Jackson ImmunoResearch). The tissue was then washed with PBS and mounted on glass slides using Fluoroshield with DAPI (F6057, Sigma-Aldrich). Brain sections obtained from D2Cre::Ai14 mice injected with FGr were mounted on glass slides without antibody incubation.

Images were acquired using a Zeiss Axio Imager M2 microscope equipped with an Axiocam 503 mono camera. Whole brain images were acquired with a 5× objective and z stack images of regions of interest were acquired with a 20× objective. ZEN Microscopy Software (Zeiss) was used for image acquisition and initial preprocessing (stitching and background subtraction). ImageJ (NIH) image analysis software was further used for image processing (cropping, brightness, and color adjustment). Cell counting was performed in a ∼850 μm wide region of interest across the motor cortex, using the Cell Counter plugin for ImageJ.

Quantification and statistical analysis

Results are presented as mean ± SEM. Statistical analysis was based on the assumption that the samples follow a Gaussian distribution. Student’s t test was used for comparisons between two groups. ANOVA followed by Bonferroni’s multiple comparison test was used for comparisons among multiple groups, with repeated measures incorporated when appropriate. p < 0.05 was considered statistically significant. All statistical analyses were conducted using GraphPad Prism 6 software (GraphPad Software, Inc.). Detailed results of the statistical analyses are presented in Table S1.
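For readers reproducing the summary statistics outside GraphPad Prism, the mean ± SEM and a Bonferroni adjustment can be sketched as follows. This is a generic illustration and not the Prism procedure itself: the adjustment shown is the textbook form (each p value multiplied by the number of comparisons and capped at 1), and the function names and input values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, SEM), with SEM = sample SD / sqrt(n)."""
    return mean(values), stdev(values) / sqrt(len(values))


def bonferroni(p_values):
    """Textbook Bonferroni adjustment: p * number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical values, for illustration only.
avg, sem = mean_sem([1.0, 2.0, 3.0, 4.0])
adjusted = bonferroni([0.01, 0.04, 0.5])
significant = [p < 0.05 for p in adjusted]
```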

Bibliography

1. Guo J.-Z., Graves A.R., Guo W.W., Zheng J., Lee A., Rodríguez-González J., Li N., Macklin J.J., Phillips J.W., Mensh B.D. Cortex commands the performance of skilled movement. eLife 4, 2015, e10774. doi:10.7554/eLife.10774. PMID: 26633811. PMCID: PMC4749564.
2. Arber S., Costa R.M. Connecting neuronal circuits for movement. Science 360, 2018, 1403–1404. doi:10.1126/science.aat5994. PMID: 29954969.
3. Muñoz-Castañeda R., Zingg B., Matho K.S., Chen X., Wang Q., Foster N.N., Li A., Narasimhan A., Hirokawa K.E., Huo B. Cellular anatomy of the mouse primary motor cortex. Nature 598, 2021, 159–166. doi:10.1038/s41586-021-03970-w. PMID: 34616071. PMCID: PMC8494646.
4. Roth R.H., Ding J.B. Cortico-basal ganglia plasticity in motor learning. Neuron 112, 2024, 2486–2502. doi:10.1016/j.neuron.2024.06.014. PMID: 39002543. PMCID: PMC11309896.
5. Hosp J.A., Molina-Luna K., Hertler B., Atiemo C.O., Luft A.R. Dopaminergic Modulation of Motor Maps in Rat Motor Cortex: An In Vivo Study. Neuroscience 159, 2009, 692–700. doi:10.1016/j.neuroscience.2008.12.056. PMID: 19162136.
6. Hosp J.A., Pekanovic A., Rioult-Pedotti M.S., Luft A.R. Dopaminergic Projections from Midbrain to Primary Motor Cortex Mediate Motor Skill Learning. J. Neurosci. 31, 2011, 2481–2487. doi:10.1523/JNEUROSCI.5411-10.2011. PMID: 21325515. PMCID: PMC6623715.
7. Hosp J.A., Nolan H.E., Luft A.R. Topography and collateralization of dopaminergic projections to primary motor cortex in rats. Exp. Brain Res. 233, 2015, 1365–1375. doi:10.1007/s00221-015-4211-2. PMID: 25633321.
8. Tennant K.A., Adkins D.L., Donlan N.A., Asay A.L., Thomas N., Kleim J.A., Jones T.A. The Organization of the Forelimb Representation of the C57BL/6 Mouse Motor Cortex as Defined by Intracortical Microstimulation and Cytoarchitecture. Cereb. Cortex 21, 2011, 865–876. doi:10.1093/cercor/bhq159. PMID: 20739477. PMCID: PMC3059888.