Neonatal social communication and single genes predict the variability of post-pubertal social behavior in a mouse model of paternal 15q11-13 duplication
Noboru Hiroi, Takahira Yamauchi, Kota Tamada, Takeshi Takano, Mitsuteru Nakamura, Mariel Barbachan e Silva, Kenny Ye, Hitoshi Inada, Takaki Tanifuji, Takeshi Hiramoto, Lucas Stevens, Gina Kang, Marisa Esparza, Takefumi Kikusui, Noriko Osumi, Pilib Ó Broin, Toru Takumi

TL;DR
The study shows that early mouse vocalizations and gene expression can predict later social behavior in a model of a genetic disorder linked to neurodevelopmental issues.
Contribution
A novel computational approach linking neonatal social communication and gene expression to predict post-pubertal social behavior in a CNV mouse model.
Findings
Neonatal call sequences in mice with 15q11–13 duplication lack incentive value for social communication.
Expression levels of Magel2, Herc2, and Ndn in the prefrontal cortex predict post-pubertal social interaction variability.
Variability in neonatal social communication predicts later social behavior in the CNV mouse model.
Abstract
Mental illnesses associated with high-risk copy number variations (CNVs) are characterized by incomplete penetrance and variable severity, with their underlying mechanisms remaining inadequately understood. We hypothesized that such phenotypic variability is evident from the neonatal stage and is, at least in part, attributable to individual differences in the expression levels of CNV-encoded genes in the brain. We conducted an analysis of the quantitative and functional structure of neonatal social communication, assessed post-pubertal social interaction, and evaluated the brain expression levels of genes within the same cohort of a mouse model of paternal human 15q11–13 duplication, a high-risk factor variably associated with neurodevelopmental disorders. Subsequently, computational methods were utilized to identify predictive variables for the variability of post-pubertal social…
Introduction
Copy number variations (CNVs) are chromosomal deletions and duplications of several million base pairs and are associated with significantly elevated risks of developing mental illnesses, including schizophrenia, autism spectrum disorder (ASD), intellectual disability, bipolar disorder, mood disorders, attention-deficit/hyperactivity disorder, and various other psychiatric disorders [1]. These genetic variations serve as a reliable entry point for investigating the mechanistic underpinnings of psychiatric disorders [2–4]. However, a significant challenge in formulating therapeutic strategies based on CNVs is that each CNV encompasses multiple genes, and it remains unclear whether all encoded genes and their molecular pathways are functionally relevant to a specific phenotype and how each gene contributes to a wide range of phenotypes [3, 5, 6].
One potential strategy to address this challenge is to identify variants among CNV-encoded genes in individuals without CNVs. Recent large-scale analyses have detected ultra-rare variants of CNV-encoded single genes in idiopathic cases of schizophrenia and ASD [7–12]. Although these studies have provided valuable insights, considerable interpretative challenges persist. Firstly, the limited number of identified variants for each gene does not allow for statistically robust associations between variants and psychiatric disorders. Secondly, the absence of ultra-rare variants of a CNV-encoded gene does not constitute evidence of its lack of contribution to psychiatric disorders. Indeed, studies with larger sample sizes tend to uncover more ultra-rare single-gene variants of CNVs compared to those with smaller sample sizes. Due to the technical challenges of identifying CNV-encoded driver genes in humans, it is often postulated, without empirical validation, that all genes within each CNV functionally contribute to mental illness or specific phenotypes.
Phenotypic variability is another aspect of human CNVs. Not all individuals harboring CNVs are diagnosed with a psychiatric disorder (i.e., incomplete penetrance), and the same CNVs may contribute to diverse disorders in different individuals (i.e., pleiotropy). Furthermore, the severity of each disorder varies among individual CNV carriers (i.e., variable expressivity) [3, 5]. Environmental, stochastic, prenatal, and postnatal factors play a role in these variabilities. While individual variation in CNV-encoded gene expression in the brain is another potential contributor, quantifying CNV-encoded gene expression levels and their variability in the human brain during and around symptomatic phases is not feasible.
The developmental origins of phenotypic variability constitute another fundamental question that remains unresolved. While the presence of severe core symptoms is a prerequisite for clinical diagnosis, various social, cognitive, and motor measures deviate from normative standards among infants who are later diagnosed with psychiatric disorders [13–15]. Atypical preverbal vocalizations are one such measure [16–18] and have been proposed as a prerequisite for the development of subsequent social behavior impairments [16, 19]. However, establishing a prospective relationship between neonatal indicators and later phenotypes in the same individuals requires years, or even decades, of observation in humans.
Duplications at human chromosome 15q11–13 raise the risk of developing intellectual disability, ASD, and schizophrenia [1]. Maternally derived 15q11–13 duplications are more frequently observed in individuals with various neurodevelopmental phenotypes than paternally derived cases [20–32]. However, paternally derived duplications and triplications are variably linked to an elevated risk for ASD, intellectual disability, developmental delays, epilepsy/seizures, behavioral problems, or a combination of these diagnoses [22, 26–28, 32–40], and the source of this phenotypic variability remains unclear. Ultra-rare protein-truncating variants of several 15q11–13 genes, such as GABRA5 and GABRB3 in schizophrenia [10], MAGEL2 in ASD [8], and GABRA5 and ATP10A in bipolar disorder [41], have been identified with high odds ratios. One patient diagnosed with ASD has been reported to carry a paternally inherited duplication limited to a small region including TUBGCP5, CYFIP1, NIPA2, NIPA1, MKRN3, MAGEL2, and NDN [27]. While these associations with single genes are suggestive, they have not consistently reached statistical significance due to their rarity, and the contributions of each encoded gene to social dimensions remain unclear.
The technical limitations inherent in human studies pose challenges to mechanism-based predictions and therapeutic advancements. A genetic mouse model provides a complementary approach to address these methodological challenges[3–5, 42]. Although mouse models do not fully replicate the entire symptomatology of each clinically defined psychiatric disorder, they offer a means to model certain aspects of mental illness [2, 3, 5]. Deficits in social dimensions are observed in both idiopathic and CNV-linked schizophrenia and ASD[15, 43–47]. CNV-encoded single genes in mouse models are implicated in cognitive and social dimensions[3–5, 48–50] that are adversely affected in humans with CNVs [51–55] and in idiopathic schizophrenia and ASD [3].
To evaluate the hypothesis that the variability of post-pubertal social behavior has a neonatal origin and is influenced by expression levels of CNV-encoded genes, we utilized a mouse model of paternally-inherited 15q11–13 duplication [56]. We sequentially assessed neonatal vocalizations, post-pubertal social interactions, and gene expression in the post-pubertal brains of the same cohort in the paternal duplication model of 15q11–13. A machine learning algorithm was employed to identify predictive variables for post-pubertal social interaction based on a pooled set of neonatal social communication variables and post-pubertal gene expression data. Our results identified specific predictive neonatal and gene variables associated with post-pubertal social behavior. This computational approach to a preclinical model offers a powerful and complementary means to enhance our understanding of the developmental and genetic origins of phenotypic variability linked to dimensions of CNV-associated mental illness.
Methods
Ethics Approval
The use of animal subjects complied with protocols approved by the Animal Care and Use Committee of the Albert Einstein College of Medicine (Animal Welfare Assurance A3312–01), the University of Texas Health Science Center at San Antonio (Animal Welfare Assurance 3345–01, 20190084AR), the RIKEN Center for Brain Science (2018–056), and Kobe University School of Medicine (P200104-R12), in accordance with guidelines established by the National Institutes of Health.
Mice
We utilized a congenic mouse model with a duplication of the 15q11–13 region; the original mouse lineage [56] was backcrossed to C57BL/6J mice for over 10 generations to reduce the unequal genetic backgrounds between wild-type and mutant littermates[4]. Each breeding pair consisted of one male paternal 15q11–13 duplication (Dup/+) model mouse (aged 10–20 weeks) [56] and one female C57BL/6J mouse (aged 10 to 30 weeks) sourced from Japan SLC, Inc., Shizuoka, Japan. Offspring were housed in groups of two to four male wild-type and Dup/+ littermates per cage. The minimum sample size was determined through power analyses based on prior studies. All male mice in each litter were used. The litter sizes were indistinguishable between +/+ and Dup/+ mice (+/+, Average=4.6, SEM=0.327; Dup/+, Average=4.7, SEM=0.334; Mann-Whitney U test, U=350.5, p=0.850).
Male mice were randomly assigned to experimental groups, and experimenters were blinded to genotypes during group allocation and testing (see Supplementary Table S2 for sample sizes). We did not use female mice for three primary reasons. First, initiating testing of post-pubertal social interactions at the same estrous stage across all female mice at 10 weeks of age was technically difficult. Second, conducting social interaction assessments and subsequent sacrifices at the same estrous stage posed technical challenges. Third, brain gene expression fluctuates throughout the estrous cycle in mice[57].
Genotyping was performed using forward (5′-ATATGTACTTTTGCATATAGTATAC-3′) and reverse (5′-AGAGGAGGGCCTTACTAATTACTTA-3′) primers.
Behavioral Assays
Ultrasonic vocalizations were assessed during maternal separation at postnatal days 8 (P8) and 12 (P12), and post-pubertal social behavior was evaluated at 10 weeks of age. Neonatal vocalizations were analyzed both qualitatively and quantitatively, while post-pubertal social behavior was recorded and manually rated following established methodologies [50, 58–64].
To evaluate the incentive value of neonatal vocalizations, we measured the approach behavior of C57BL/6J mothers that had not previously been exposed to the vocalizations of either +/+ or Dup/+ pups from the 15q11–13 duplication model, following our procedures [61] (see Supplementary Materials).
Quantitative Reverse Transcription Polymerase Chain Reaction
The mice were euthanized after the completion of behavioral testing. The prefrontal cortex (the entire cortex anterior to Bregma +1.98) and the remaining forebrain (excluding the olfactory bulb) were dissected, frozen, and utilized for quantitative reverse transcription polymerase chain reaction (qRT-PCR) analyses (see Supplementary Materials and Supplementary Table S1). The cerebellum and brainstem were excluded from this analysis.
Computational analyses
We performed Shannon entropy analysis, Markov modeling, UMAP (uniform manifold approximation and projection), and Lasso (least absolute shrinkage and selection operator) regression in accordance with our previously published methodologies [59, 61, 65, 66] (see Methods and Supplementary Materials).
Statistical Analysis
We used SPSS (v29.0.2.0 (20), IBM Corporation). Group means were compared using analysis of variance. In instances where interaction effects were statistically significant, Student’s t-tests were employed to elucidate the nature of the interaction; two-sided t-tests were applied for comparisons between two groups. A p-value of less than 0.05 was considered statistically significant. The significance level for multiple tests was adjusted using the Benjamini–Hochberg correction at false discovery rates (FDR) of 5%, 10%, and 25%. Data that violated the assumptions of homogeneity of variance or normality were analyzed using linear mixed-effects models with the individual animal as the random effect or nonparametric tests. All statistical values are detailed in Supplementary Table S1.
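As an illustration of the multiple-testing adjustment named above, the following is a minimal sketch of the standard Benjamini–Hochberg step-up rule in plain Python. It is not the SPSS procedure used in the study; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per test: True if it is significant under the
    Benjamini-Hochberg step-up rule at the given false discovery rate."""
    m = len(p_values)
    # Rank the tests by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    # Every test ranked at or below max_rank is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Hypothetical p-values, evaluated at the three FDR levels named in the text.
p_example = [0.001, 0.012, 0.035, 0.045, 0.20]
for q in (0.05, 0.10, 0.25):
    print(q, benjamini_hochberg(p_example, fdr=q))
```

With these made-up values, two tests pass at an FDR of 5%, four at 10%, and all five at 25%, which shows how the more permissive thresholds admit progressively more findings.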
Results
Characterization of Wave Shapes and Patterns of Neonatal Vocalizations
Our previous work indicated that paternal Dup/+ pups exhibited a significantly higher frequency of ultrasonic vocalizations compared to their +/+ littermates on postnatal days 7 (P7) and 14 (P14). However, no significant differences were observed between the two groups before or after this time frame [56]. Consequently, in the current study, we selected postnatal days 8 (P8) and 12 (P12) as the periods of interest for evaluating ultrasonic vocalizations.
Consistent with our previous observation, the total number of ultrasonic vocalizations at P12 was greater in Dup/+ pups than in +/+ littermates, although no statistically significant difference was noted between the two genotypes at P8 (Fig. 1A, inset).
Neonatal mouse vocalizations are characterized by a diverse array of wave shapes and patterns [67], with these parameters being differentially influenced by specific genes or sets of genes associated with neurodevelopmental disorders [59, 61, 63, 65, 68, 69]. To further investigate the impact of the Dup/+ genotype on various call types, we conducted an analysis utilizing our previously published methodology for call classification [65] (Figure S1; Supplementary Materials, Call-type classification). At P12, Dup/+ pups emitted a greater number of harmonic, step-up, two-step, step-down, multiple-step, downward, chevron, and ambiguous calls compared to +/+ littermates. Conversely, Dup/+ pups exhibited lower frequencies of upward calls at P8 and reverse chevron calls at P12 in comparison to +/+ pups (Fig. 1A).
To determine whether the observed differences in call counts could be attributed to an overall increase in vocalizations by Dup/+ pups, we analyzed the proportions of each call type within each genotype. Dup/+ pups displayed significantly higher proportions of harmonic, step-up, two-step, multiple-step, and ambiguous call types relative to +/+ littermates. In contrast, Dup/+ pups proportionally emitted fewer short, upward, and reverse chevron calls than their +/+ counterparts (Figure S2).
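The distinction between call counts and call proportions can be sketched as follows. The counts are invented for illustration only, and `call_type_proportions` is a hypothetical helper, not part of the authors' pipeline.

```python
def call_type_proportions(counts):
    """Convert per-call-type counts into proportions of all emitted calls."""
    total = sum(counts.values())
    if total == 0:
        return {call_type: 0.0 for call_type in counts}
    return {call_type: n / total for call_type, n in counts.items()}


# Invented counts: both pups emit the same number of short calls,
# but the second pup emits many more harmonic calls overall.
wt_counts = {"short": 40, "harmonic": 10}
dup_counts = {"short": 40, "harmonic": 60}

wt_props = call_type_proportions(wt_counts)
dup_props = call_type_proportions(dup_counts)
print(wt_props)
print(dup_props)
```

In this toy case the short-call counts are identical, yet the proportion of short calls is lower in the second pup because its total output is larger. A proportion can therefore differ between groups even when the count does not.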
In summary, the Dup/+ genotype significantly influences the emission of harmonic, step-up, two-step, multiple-step, short, upward, reverse chevron, and ambiguous calls beyond what would be expected based solely on their absolute numbers. The genotype-dependent variations in the number of step-down, downward, and chevron calls can be attributed to their higher frequencies observed in Dup/+ pups. Although the counts of the short call type did not show significant differences between the two genotypes, Dup/+ pups exhibited a proportionally lower emission of these calls compared to +/+ pups.
Characterization of Acoustic Features of Neonatal Vocalizations
In addition to categorizing neonatal calls by wave shapes and patterns, each call exhibits a distinct set of quantitative acoustic properties. We employed the Uniform Manifold Approximation and Projection (UMAP) method [70] to elucidate the dimensionality of the quantitative features of calls. UMAP is a dimensionality reduction technique based on Riemannian geometry and algebraic topology, which effectively clusters data exhibiting similar quantitative characteristics.
The quantitative acoustic parameters of the calls, analyzed using VocalMat [71], include: a) bandwidth measured in Hz; b) maximum, mean, and minimum frequencies in Hz of a call or its principal components; and c) maximum, mean, and minimum intensities of each call. The UMAP analysis resulted in the formation of four principal spatial clusters based on these quantitative properties. Subsequently, we incorporated the categorical classification of calls into the UMAP framework (Fig. 1B).
Various call types were represented within or across these clusters (Fig. 1B). Each cluster displayed a unique set of call-type representations (Figure S3–1 to S3–4), but the shapes and positions of the four clusters did not seem to differ between the Dup/+ and +/+ genotypes, suggesting that the Dup/+ genotype did not substantially influence the overall quantitative features of the calls. Conversely, the density of certain calls exhibited genotype-dependent variations. The genotype-dependent differences in call numbers, indicated by the dots in Fig. 1B, were more pronounced in the data from postnatal day 12 (P12). Specifically, Dup/+ pups exhibited a greater number of calls in Clusters 1, 2, and 4 compared to +/+ pups at P12, whereas an opposite trend was observed in Cluster 3. These patterns align with the categorization analysis of call types (see Fig. 1A; S3–1 to S3–4). Notably, the increased density of calls observed in Cluster 1 of P12 Dup/+ mice, compared to P8 Dup/+ (Fig. 1B), is attributed to higher levels of harmonic and step-up call types (see Figure S3–1). The increased calls in Cluster 1 from P8 to P12 in Dup/+ pups reflect more step-up, two-step, complex, multiple-step, and downward call types emitted (see Figure S3).
Determination of Call Sequences
The vocal call sequences of pups have significant biological implications. When presented with the complete call sequence of a pup, mothers exhibit a strong maternal response; however, if the sequence is randomized while retaining all call types and amplitudes, its ability to elicit a maternal approach is diminished[61]. Furthermore, the call sequences of pups with a gene-dose alteration associated with neurodevelopmental disorders (e.g., Tbx1) lack incentive value for maternal engagement[61].
Calls are emitted with varying inter-call intervals. While it remains unclear whether a functional unit exists within a series of mouse pup calls, we operationally defined a call sequence as comprising inter-call intervals shorter than those theoretically expected from a given number of calls emitted within a specified testing period (i.e., Poisson distribution)[59, 61, 65]. This methodology was applied to the calls of Dup/+ and +/+ littermates (Figure S4). The distribution curves of observed inter-call intervals intersected with those of the theoretical inter-call intervals at 362.41 ms and 408.66 ms for +/+ and Dup/+ pups, respectively, at P8; the intersection points were 366.55 ms and 314.99 ms for +/+ and Dup/+ pups, respectively, at P12. When an inter-call interval exceeded the intersection point values, calls preceding and succeeding the interval were classified as the last and first calls of two consecutive, distinct sequences, respectively; calls with inter-call intervals shorter than the intersection point values were considered components of a sequence.
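The thresholding rule described above can be sketched as a simple segmentation step. This sketch assumes that the inter-call intervals have already been measured and that the intersection point is supplied as a number. The function name, the example calls, and the handling of an interval exactly equal to the threshold (kept within the sequence) are assumptions, not details taken from the source.

```python
def split_into_sequences(call_types, inter_call_intervals_ms, threshold_ms):
    """Group consecutive calls into sequences.

    inter_call_intervals_ms[i] is the interval between call i and call i + 1.
    An interval longer than threshold_ms ends one sequence and starts the next.
    """
    if len(inter_call_intervals_ms) != max(len(call_types) - 1, 0):
        raise ValueError("expected one interval per pair of consecutive calls")
    if not call_types:
        return []
    sequences = [[call_types[0]]]
    for call, gap in zip(call_types[1:], inter_call_intervals_ms):
        if gap > threshold_ms:
            sequences.append([call])    # long gap: start a new sequence
        else:
            sequences[-1].append(call)  # short gap: same sequence continues
    return sequences


# Invented calls and gaps; 366.55 ms is the +/+ intersection point at P12
# reported in the text.
calls = ["harmonic", "harmonic", "short", "flat", "chevron"]
gaps_ms = [120.0, 500.0, 90.0, 366.6]
print(split_into_sequences(calls, gaps_ms, threshold_ms=366.55))
```

Because the intersection point differs by genotype and age, a real analysis would pass the threshold that matches each group rather than a single global value.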
Degree of Unpredictability of Call Selection and Sequencing
Following the definition of the call sequence, we examined whether calls are emitted unpredictably as single instances or in sequences. Shannon entropy scores were utilized to assess the degree of unpredictability in the selection of call types (H0), the distribution of distinct calls within the selected call types (H1), and the distributions of distinct call types in two-call sequences (H2), three-call sequences (H3), and four-call sequences (H4).
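One common way to compute such a hierarchy of entropy scores is sketched below. It rests on assumptions the text does not state: H0 is taken as the log2 of the number of available call types, H1 as the Shannon entropy of the observed call-type distribution, and Hn (n ≥ 2) as the conditional entropy of the n-th call given the preceding n − 1 calls. The authors' exact definitions are in their cited methods [59, 61, 65] and may differ. All names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def ngrams(sequence, n):
    """All runs of n consecutive calls within one sequence."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]


def entropy_levels(sequences, n_call_types, max_order=4):
    """Entropy scores H0..H<max_order> under the assumptions stated above."""
    levels = {"H0": math.log2(n_call_types)}
    levels["H1"] = shannon_entropy([c for seq in sequences for c in seq])
    for n in range(2, max_order + 1):
        grams = [g for seq in sequences for g in ngrams(seq, n)]
        if not grams:
            break  # no sequence is long enough for this order
        contexts = [g[:-1] for g in grams]
        # Conditional entropy: H(n-gram) - H(preceding n - 1 calls).
        levels[f"H{n}"] = shannon_entropy(grams) - shannon_entropy(contexts)
    return levels


# Toy example: two call types that strictly alternate.
levels = entropy_levels([["A", "B", "A", "B", "A", "B"]], n_call_types=2)
print(levels)
```

In the toy sequence the two call types are used equally often, so H1 equals H0, while H2 and above drop to zero because each call is fully determined by the one before it. A decline across H levels is the signature of predictable structure.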
The degree of unpredictability decreased at each H level, indicating the presence of predictable elements as pups select call types and establish connections within sequences. Entropy scores consistently declined at each H level for both Dup/+ and +/+ pups at P8 and P12. The choice of call types and their connections had equal levels of unpredictability between +/+ and Dup/+ pups at each level at both P8 and P12 (Figure S5AB), except for sequences at H3 and H4 at P8 (Figure S5A). These findings suggest that Dup/+ pups and +/+ pups generated similarly predictable call sequences, despite the observation that Dup/+ pups emitted a greater number of calls than their +/+ counterparts at P12. At P8, Dup/+ pups emitted three- and four-call sequences more unpredictably than +/+ pups.
Dup/+ and +/+ pups did not differ in the number of calls per sequence (Figure S6A). By contrast, Dup/+ pups emitted longer sequences than +/+ pups at P8, but not at P12 (Figure S6B). Dup/+ pups emitted more sequences, compared to +/+ pups, at P12, but not at P8 (Figure S6C).
Proportions of Two-Call Sequences
The entropy analysis does not consider the variability of call types among individual pups. For example, if call type A is followed by call type B in one pup while call type C is followed by call type D in another, both scenarios yield an identical entropy score. Consequently, we additionally examined the specific call types in every pair of consecutive calls within the call sequences. We analyzed the proportions of each two-call connection (see Figure S7). This analysis revealed that both Dup/+ and +/+ pups emit a greater proportion of specific two-call connections (e.g., harmonic followed by harmonic at P8) and exhibit differentially expressed connections between the genotypes (e.g., at P8, chevron followed by chevron; downward followed by step-up; at P12, harmonic followed by harmonic; short followed by flat). At P12, the most proportionally frequent call connections were short followed by short in +/+ and harmonic followed by harmonic in Dup/+ pups. +/+ pups tended to make connections among simple call types (i.e., downward, short, flat, and upward; blue letter calls), while Dup/+ pups connected more diverse call types, including both simple waves and multiple waves (see more scattered call connections). These findings also indicate that the proportions of various two-call connections within sequences differ between the two genotypes. Indeed, the trajectories of call sequences appear to diverge in the three-dimensional UMAP space (see Figure S8). More diverse call transitions between cluster 3 and the other clusters appear in Dup/+ pups and transitions tend to occur more frequently within cluster 3 in +/+ pups, where simple call types aggregate (see Figures S3 and S7).
Finite State Rule in Two-Call Sequences
Pups display distinct sequences of calls, and temporal call transitions appear to differ between the genotypes (see Figure S7, S8). However, in this qualitative analysis, sequences that begin with less frequently emitted call types are underrepresented, while those starting with more frequently emitted call types are overrepresented. For instance, even if a particular call consistently precedes another call type, such a connection may not be well represented in terms of proportions if the first call’s frequency is low relative to the total call count. The Markov property addresses this technical limitation by positing that the future state depends solely on the current state. Consequently, the probability of the second call type in each two-call connection is calculated exclusively based on the first call type. In this framework, the probabilities of transitions from one call to the next are unaffected by the absolute number or proportion of the first call among all emitted calls. Our previous work indicated that mouse pups with Tbx1 heterozygosity, 16p11.2 hemizygosity, and Fmr1 deletion—risk gene variants linked to increased susceptibility to neurodevelopmental disorders—exhibit alterations in the finite state rule governing two-call connections within sequences[59, 61, 65].
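The difference between the proportion of a two-call connection and its first-order Markov transition probability can be sketched as follows. The sequences are invented, and `transition_probabilities` is a hypothetical helper rather than the authors' implementation.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate P(second call type | first call type) from two-call
    connections, counted only within sequences."""
    pair_counts = defaultdict(Counter)
    for seq in sequences:
        for first, second in zip(seq, seq[1:]):
            pair_counts[first][second] += 1
    probs = {}
    for first, followers in pair_counts.items():
        total = sum(followers.values())
        # Normalize by the number of connections that start with `first`.
        probs[first] = {second: n / total for second, n in followers.items()}
    return probs


# Invented sequences: "chevron" is rare but is always followed by "step-up".
toy_sequences = [
    ["short", "short", "short", "short", "chevron", "step-up"],
    ["short", "short", "chevron", "step-up"],
]
probs = transition_probabilities(toy_sequences)
print(probs["chevron"])  # conditional on the first call being chevron
print(probs["short"])
```

Here the chevron to step-up connection makes up only 2 of the 8 two-call connections, yet its transition probability is 1.0, because the estimate is conditioned on the first call type rather than on the total call count.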
At P8, both Dup/+ and +/+ pups utilized harmonic, step-up, downward, chevron, and short call types in repeats (e.g., harmonic followed by harmonic). Compared to +/+ pups, Dup/+ pups showed more diverse call connections between different call types (Fig. 1C, P8).
At P12, +/+ pups exhibited self-repeat connections within harmonic, downward, short, flat, and upward calls, while Dup/+ pups did so within step-up, harmonic, downward, chevron, and short call types (see Fig. 1C, P12). Both +/+ and Dup/+ pups made fewer, but non-identical connections among different call types.
Incentive Values of Neonatal Call Sequences for Maternal Approach
We previously demonstrated that the vocalizations of Tbx1 heterozygous pups did not elicit maternal approach behavior as efficiently as those of wild-type littermates, indicating that the incentive values of call sequences of heterozygous pups were diminished compared to those of wild-type pups [61]. To evaluate how the 15q11–13 duplication alters the incentive values of neonatal vocalizations, we exposed lactating C57BL/6J mothers, 12 days postpartum, to the P12 call sequences of +/+ and Dup/+ pups using our choice tube apparatus (Fig. 2A). The emitter utilized in this study consisted of a surface-heating thin-film electrode, a nanocrystalline silicon (ns-Si) layer, and a single-crystalline silicon wafer [72]; it efficiently generates ultrasound waves through heat transfer at the surface into the air, rather than mechanical vibration. This feature allows for the maintenance of a constant sound amplitude of up to 160 kHz [72, 73], which is crucial for replaying ultrasound vocal waves exceeding 100 kHz.
We selected call sequences from the P12 time point, as +/+ and Dup/+ pups exhibited the most pronounced differences in the number and proportions of emitted call types at this age (see Fig. 1A, S2, and S7). The representative +/+ and Dup/+ pups were chosen based on their proximity to the median numbers, median proportions, median two-call sequence numbers, and median two-call proportions (see Supplementary Figure S9–1-S9–4). For this selection process, we excluded Markov probabilities, as they identify two-call sequences based on finite states; while useful for uncovering hidden rules of call connections, they are not optimal for selecting the most frequently occurring sequences or calls.
While C57BL/6J mothers spent comparable amounts of time peeking at the entrances of both the sound and no-sound tubes (Fig. 2B), they spent significantly more time exploring the end of the sound tube compared to the no-sound tube when the representative +/+ vocalizations were presented (Fig. 2C, +/+ calls). In contrast, the mothers exhibited indistinguishable amounts of time at the ends of both sound and no-sound tubes in response to the representative Dup/+ vocalizations (Fig. 2C, Dup/+ calls). Additionally, the latencies with which mothers approached the entrances of the sound and no-sound tubes were indistinguishable (Fig. 2D).
To assess the significance of sound presentations and potential baseline side preferences, we measured these three parameters without call playback; otherwise, the experimental procedure remained identical to that involving sound presentations. The mothers did not display a preference between the two tubes regarding peeking time, exploration time at the ends, or latencies (Supplementary Figure S10A, B, and C).
In sum, +/+ call sequences, but not Dup/+ call sequences, elicited longer exploration at the end of the sound tube than at the end of the no-sound tube, indicating that +/+ calls possess greater incentive value than Dup/+ calls.
Post-Pubertal Social Behavior
The Dup/+ and +/+ pups tested at P8 and P12 were subsequently evaluated for social behavior upon reaching 10 weeks of age. Molecular mechanisms underlying physical social interaction cannot be effectively recapitulated in a testing environment where a barrier prevents direct reciprocal social interaction between the experimental and stimulus mice; data obtained under such restricted conditions are not reproducible [74, 75]. Indeed, we previously demonstrated that Dup/+ mice engage in less active direct social contact with another mouse compared to +/+ mice in a test environment that allowed direct physical interaction [56]. Consequently, we employed our standard naturalistic home-cage test apparatus, facilitating direct interactions among mice, as previously described[50, 56, 59, 60, 62–65]. Male Dup/+ mice exhibited reduced levels of active affiliative social interaction compared to their +/+ littermates (Fig. 3A). It is noteworthy that there was considerable score overlap between Dup/+ and +/+ mice, accompanied by significant variance within each genotype.
Expression of Mouse Ortholog Genes Encoded in 15q11–13 Copy Number Variation in the Brain
The observed expression levels of paternally and maternally inherited genes within the 15q11–13 region do not necessarily correspond to the expected levels in the brain tissues of humans [76] and mice [56]. However, the gene expression in the brain and behavioral variability have not been correlated in the same individuals or mice. We thus assessed the expression levels of ten protein-coding genes corresponding to the murine ortholog of the 15q11–13 region following an evaluation of social behavior at ten weeks of age in the same mice. The prefrontal cortex and the remaining forebrain (excluding the olfactory bulb) were utilized to quantify gene expression levels via quantitative reverse transcription polymerase chain reaction (qRT-PCR). Our primary analysis focused on the prefrontal cortex, given its well-established functional significance in social behavior in mice [77–81].
Expression levels of all genes, except for Ube3a and Cyfip1, were significantly elevated in the prefrontal cortex of Dup/+ mice compared to their +/+ littermates (see Fig. 3B). The absence of elevated Ube3a expression in Dup/+ mice is expected, as Ube3a is a maternally inherited gene whose paternal copy is silenced by genomic imprinting. Similarly, Cyfip1 is located outside the duplicated segment, rendering its overexpression unexpected. Non-imprinted genes (Herc2, Gabrg3, Gabra5, Gabrb3, and Atp10a) are expressed from both paternal and maternal copies in +/+ mice, leading to expression levels of approximately 1.5 relative quantification (RQ) from three copies in paternal Dup/+ mice. Although Atp10a was previously presumed to be maternally inherited due to paternal imprinting, evidence indicates that it is expressed from both paternal and maternal copies in the embryonic, neonatal, and adult brains of mice and humans [82–84]. The paternally inherited genes (Snrpn, Ndn, Magel2, and Mkrn3) are expressed as a single copy in +/+ mice, as the maternal copy is silenced by genomic imprinting. Consequently, their expression levels reached approximately 2.0 RQ in paternal Dup/+ mice. Notably, the expression levels of Mkrn3 were significantly higher than anticipated based solely on duplication. Furthermore, the expression levels of the paternally inherited genes exhibited greater variability in Dup/+ mice than in +/+ mice (Fig. 3B).
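The expected RQ values quoted above follow from simple copy-number arithmetic, sketched below under the assumption that expression scales linearly with the number of transcriptionally active copies. The function `expected_rq` and its argument names are hypothetical.

```python
def expected_rq(paternal_copies, maternal_copies, expressed_from):
    """Expected expression relative to a +/+ mouse (one copy per parent),
    assuming expression is proportional to the number of active copies.

    expressed_from: "both" (non-imprinted), "paternal" (maternal copy
    silenced), or "maternal" (paternal copy silenced).
    """
    def active_copies(paternal, maternal):
        if expressed_from == "both":
            return paternal + maternal
        if expressed_from == "paternal":
            return paternal
        if expressed_from == "maternal":
            return maternal
        raise ValueError(f"unknown imprinting status: {expressed_from!r}")

    return active_copies(paternal_copies, maternal_copies) / active_copies(1, 1)


# Paternal duplication: two paternal copies and one maternal copy.
print(expected_rq(2, 1, "both"))      # 1.5, e.g. non-imprinted Herc2
print(expected_rq(2, 1, "paternal"))  # 2.0, e.g. paternally expressed Ndn
print(expected_rq(2, 1, "maternal"))  # 1.0, e.g. maternally expressed Ube3a
```

Under this linear assumption, any gene whose measured RQ clearly exceeds the expected value, as reported for Mkrn3, points to regulation beyond copy number alone.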
In the remaining forebrain, all genes, except for Cyfip1 and Ube3a, demonstrated increased expression levels in Dup/+ mice relative to +/+ littermates (see Figure S11). Similar to the findings in the prefrontal cortex, Mkrn3 expression levels were markedly higher than expected based solely on duplication (i.e., RQ = 2.0). In comparison to +/+ mice, individual Dup/+ mice exhibited significantly greater variance in the expression of paternally inherited genes in the remaining forebrain (Figure S11).
Predicting Post-Pubertal Social Interaction Scores through Neonatal Social Communication Variables and Post-Pubertal Brain Gene Expression
We previously identified predictive neonatal vocalization parameters for post-pubertal social behaviors in a mouse model of 16p11.2 hemizygous deletion [59]. Given the expression variability of genes located in the 15q11–13 ortholog, alongside variable neonatal social communication, we aimed to investigate the predictive capacity of these two sets of parameters regarding post-pubertal social behavior. If neonatal social communication serves as a prerequisite and brain gene expression acts as a determinant for the development of post-pubertal social behavior, then variability in these neonatal and brain gene expression parameters would predict variability in post-pubertal social behavior.
We aggregated the frequency of each call type (Fig. 1A), the proportions of each call type (Figure S2), the frequencies and proportions of two-call connections (Figure S7), the Markov probabilities of two-call connections (Fig. 1C) at postnatal day 12, and expression levels of genes encoded within the 15q11–13 region in the prefrontal cortex at 10 weeks of age (Fig. 3B). Subsequently, we employed Lasso regression models to identify predictors of individual variability in post-pubertal social interaction scores. To ascertain the parameters necessary for an optimal model fit that balances maximum likelihood with the risk of overfitting, we utilized the Akaike Information Criterion (Figure S12). Following the establishment of cutoff points yielding the best model fit, Lasso regression models elucidated predictors among gene expression and neonatal call metrics (Fig. 4).
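To illustrate the general idea of Lasso selection with an AIC-chosen penalty, the following is a minimal sketch on synthetic data. It is not the analysis pipeline used in this study: the coordinate-descent solver, the penalty grid, the AIC form (Gaussian, with the number of nonzero coefficients taken as degrees of freedom), and all data are assumptions made for illustration.

```python
import math
import random


def standardize(column):
    """Z-score one feature column (population standard deviation)."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / sd for v in column]


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; the core Lasso operation."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(columns, y, lam, n_sweeps=200):
    """Minimize (1/2n)*RSS + lam*sum(|beta|) for z-scored columns and centered y."""
    n = len(y)
    beta = [0.0] * len(columns)
    residual = list(y)
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            # Covariance of feature j with the residual that ignores feature j.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, residual)) / n
            scale = sum(c * c for c in col) / n  # equals 1 after z-scoring
            updated = soft_threshold(rho, lam) / scale
            shift = updated - beta[j]
            if shift != 0.0:
                residual = [r - c * shift for c, r in zip(col, residual)]
                beta[j] = updated
    rss = sum(r * r for r in residual)
    return beta, rss


def aic(rss, n, n_nonzero):
    """Gaussian AIC up to a constant (assumed form, see lead-in)."""
    return n * math.log(rss / n) + 2 * n_nonzero


# Synthetic data: 24 hypothetical mice, 6 hypothetical predictors.
rng = random.Random(1)
n_mice, n_features = 24, 6
raw = [[rng.gauss(0, 1) for _ in range(n_mice)] for _ in range(n_features)]
# Only predictor 0 truly drives the score, with a negative effect.
scores = [-1.5 * raw[0][i] + rng.gauss(0, 0.3) for i in range(n_mice)]

columns = [standardize(col) for col in raw]
mean_score = sum(scores) / n_mice
centered = [s - mean_score for s in scores]

best = None
for lam in (1.0, 0.5, 0.25, 0.1, 0.05, 0.01):  # hypothetical penalty grid
    beta, rss = lasso_coordinate_descent(columns, centered, lam)
    n_nonzero = sum(1 for b in beta if b != 0.0)
    candidate = (aic(rss, n_mice, n_nonzero), lam, beta)
    if best is None or candidate[0] < best[0]:
        best = candidate

best_aic, best_lam, best_beta = best
selected = [j for j, b in enumerate(best_beta) if b != 0.0]
print("lambda:", best_lam, "selected predictors:", selected)
```

The penalty shrinks weak coefficients exactly to zero, so the predictors that survive at the AIC-preferred penalty form the candidate set. Whether noise predictors also survive depends on the data and the grid.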
The probabilities of transitioning from flat call types to chevron call types (F_Ch(P)) and from upward transitions to ambiguous call types (U_Amb(P)) emerged as predictors for the social interaction scores of +/+ mice during the first social interaction test session (Fig. 4A). The frequencies of transitions from harmonic to downward calls (Har_D(N)) and from flat to step-down call types (F_Sd(N)) served as predictors for the social interaction scores of Dup/+ mice at session 1 (Fig. 4B).
More predictors were identified for social interaction scores during session 2. Notably, various probabilities, frequencies, and Markov probabilities of two-call transitions significantly predicted the social interaction scores of +/+ mice (Fig. 4C). Interestingly, the social scores of Dup/+ mice were predicted by the expression levels of Magel2 and Herc2 in the prefrontal cortex, in addition to the frequencies of transitions from downward to step-down call types (D_Sd(N)) and from step-down to step-down call types (Sd_Sd(N)), as well as the Markov probability of multiple steps to two steps (Ms_Ts(MP)) (Fig. 4D).
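The predictor names above carry the suffixes (N), (P), and (MP) for the frequency, proportion, and Markov probability of a two-call transition. The sketch below shows one plausible way to compute the three metrics from a single ordered call sequence. The definitions are assumptions, since this section does not spell them out: (N) is taken as the count of a given pair, (P) as that count divided by all pairs, and (MP) as that count divided by all transitions leaving the first call type. The toy sequence is invented.

```python
from collections import Counter


def two_call_metrics(calls):
    """Return (N, P, MP) dictionaries keyed by (first_call, second_call)."""
    pairs = list(zip(calls, calls[1:]))  # consecutive two-call connections
    counts = Counter(pairs)              # (N): raw frequency of each pair
    total = len(pairs)
    proportions = {pair: c / total for pair, c in counts.items()}  # (P)
    outgoing = Counter(first for first, _ in pairs)
    # (MP): probability of the second call given the first call
    markov = {pair: c / outgoing[pair[0]] for pair, c in counts.items()}
    return counts, proportions, markov


# Invented sequence using the abbreviations F (flat), Ch (chevron), Sd (step-down).
sequence = ["F", "Ch", "F", "Sd", "Sd", "F", "Ch"]
n, p, mp = two_call_metrics(sequence)

print("F_Ch(N)  =", n[("F", "Ch")])
print("F_Ch(P)  =", round(p[("F", "Ch")], 3))
print("F_Ch(MP) =", round(mp[("F", "Ch")], 3))
```

The distinction matters because (P) depends on how common the first call type is, whereas (MP) conditions on it, so the same transition can rank differently under the two metrics.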
Given the sample sizes, a simple cross validation for models is not suitable [85]. A more appropriate approach to confirm the validity of the identified predictive parameters is to eliminate a subset of data from the original data sets and determine how many times the identified predictors appear. We thus used a 5-fold cross-validation procedure, creating five smaller data sets by randomly excluding 5–6 mice from the original data set five times and running Lasso analyses on each. Among gene expression predictors, Magel2 and Herc2 appeared three times out of five repeats at session 2 in Dup/+ mice (Figure S13D, Fold 1–Fold 5). Vocalization parameters that appeared three times or more were F_Ch(P) and U_Amb(P) in +/+ and Har_D(N) and F_Sd(N) in Dup/+ at session 1, and Su_Su(P), Amb_Ts(MP), U_D(P), Sh_F(P), and Sd_Ts(MP) in +/+ at session 2 (Figure S13A, B, C, Fold 1–Fold 5). Interestingly, the social behavior of Dup/+ mice at session 2 was better predicted by gene expression than by neonatal vocalizations, whereas the opposite was the case for the other three conditions: +/+ at sessions 1 and 2 and Dup/+ at session 1.
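The repeat-and-count logic described above can be sketched as follows. This is an illustration on synthetic data, not the study's code. The selector used here is a deliberately simple stand-in (the two features with the largest absolute Pearson correlation) rather than Lasso, and the feature names, sample size, and the threshold of three appearances out of five are either hypothetical or taken only loosely from the text.

```python
import math
import random
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def top_correlates(features, scores, k=2):
    """Stand-in selector (NOT Lasso): the k features with the largest |r|."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], scores)),
        reverse=True,
    )
    return ranked[:k]


def selection_counts(features, scores, selector, n_repeats=5, n_excluded=5, seed=0):
    """Count how often each feature is selected when random mice are excluded."""
    rng = random.Random(seed)
    n = len(scores)
    counts = Counter()
    for _ in range(n_repeats):
        excluded = set(rng.sample(range(n), n_excluded))
        kept = [i for i in range(n) if i not in excluded]
        subset = {name: [values[i] for i in kept] for name, values in features.items()}
        counts.update(selector(subset, [scores[i] for i in kept]))
    return counts


# Synthetic data: 26 hypothetical mice, four hypothetical predictors.
rng = random.Random(7)
n_mice = 26
features = {
    name: [rng.gauss(0, 1) for _ in range(n_mice)]
    for name in ("gene_A", "call_X", "call_Y", "call_Z")
}
# Only gene_A truly drives the score, with a negative effect.
scores = [-2.0 * g + rng.gauss(0, 0.3) for g in features["gene_A"]]

counts = selection_counts(features, scores, top_correlates)
stable = [name for name, c in counts.items() if c >= 3]
print("selection counts:", dict(counts))
print("appeared in at least 3 of 5 repeats:", stable)
```

A predictor that reappears across most subsets is less likely to reflect a few influential animals, which is the rationale for reporting appearance counts rather than a single fit.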
We further validated gene expression predictors by analyzing correlation coefficients (Figure S14). The expression levels of Magel2, Herc2, and Ndn in the prefrontal cortex of Dup/+ mice exhibited the highest correlation coefficients with session 2 scores, with higher expression of these genes being associated with lower social interaction scores among individual Dup/+ mice.
Interestingly, the predictors differed between sessions in both +/+ and Dup/+ mice. This result is not unexpected, as social interaction predominantly encompasses aggression, stress, and anxiety during the initial encounter, while memory-based familiarity and recognition, along with established hierarchy, come into play during Session 2 [86].
Discussion
Our computational approach demonstrates that individual variability in post-pubertal social behavior arises from call sequences during the neonatal period and correlates with varying expression levels of 15q11–13-encoded genes. This methodology provides a technical framework for predicting the developmental trajectories of social behaviors from the neonatal to post-pubertal stages in both individual mice and humans affected by this and other CNVs. Furthermore, while assessing the variability of CNV-encoded gene expression in the human brain presents significant challenges, our complementary approach in a mouse model facilitates the identification of CNV-encoded driver genes and their associated molecular pathways.
Mouse pups typically exhibit more than ten distinct call types (Figure S1), frequencies of which are variably impacted by gene dose alterations associated with neurodevelopmental disorders [59, 61, 65, 68, 87, 88]. The current study extends our previous finding that 15q11–13 Dup/+ pups produce a greater number of calls than +/+ mice by demonstrating that this phenotype is a net result of increased numbers or proportions of certain call types (harmonic, step-up, two-steps, step-down, multiple-steps, downward, chevron, and ambiguous) and decreased numbers or proportions of others (short, upward, and reverse chevron).
Although Dup/+ and +/+ pups maintained a consistent selection of call types within each genotype, as indicated by indistinguishable entropy scores between the two groups at P12 (see Figure S5), the genotypes differed significantly in specific call types emitted (see Fig. 1A, 1B; Figure S2), call sequences (Figure S7), and finite state-based rules of call sequences (Fig. 1C). Both the proportions of call connections (Figure S7) and Markov-type call connections (Fig. 1C) differed between genotypes.
The classification of calls based on their acoustic properties identified four major distinct groups (see Fig. 1B; Figures S3–1 to S3–4). These clusters only partially accounted for the qualitative classification of call types, as many distinct call types shared similar quantitative properties within each cluster, while others exhibited distinct quantitative characteristics. Moreover, despite qualitative similarities, the quantitative properties of the same call types could differ across different clusters (see Fig. 1B; Figure S3). Nonetheless, these quantitative differences did not appear to correlate with genotype (Fig. 1B; Figures S3–1 to S3–4), suggesting that quantitative, acoustic properties of call sounds were less effective than qualitative properties in detecting genotype-dependent phenotypes.
We evaluated the biological significance of these genotype-dependent properties of neonatal vocalizations. The representative call types and sequences of the Dup/+ genotype were ineffective in eliciting maternal search behavior toward the sound source, as evidenced by the time spent at the end of the sound tube (Fig. 2C). Previous work with Tbx1 heterozygous pups indicated that the time spent peeking at the entrance of the sound tube was influenced more by call types than by call sequences; randomized wild-type calls elicited similar maternal responses to the original calls. However, the randomized wild-type calls failed to sustain maternal search behavior (i.e., time spent at the end of the sound tube) or to prompt a rapid approach to the tube entrance (i.e., latency to approach the sound tube entrance)[61]. These data suggest that call types dictate the initial selection of the tube from which pup calls are played, while call sequences influence the persistence of search behavior in the sound tube and the speed of approach to the source of pup calls. Given the differential reliance of various aspects of maternal approach behavior on components of vocalization, it is significant that the calls of Dup/+ pups were ineffective in eliciting sustained search behavior for the call source. In contrast, neither +/+ nor Dup/+ calls resulted in statistically significant peeking at the entrance or rapid approaches to the sound tube entrance (i.e., latency), although there were non-significant trends towards the +/+ sound tube (see Fig. 2B and D). The incentive values of call sequences of Dup/+ pups seem to be diminished.
The expression levels of duplicated mouse genes orthologous to the human 15q11–13 duplication exhibited variability among individual mice (see variance in Fig. 3B). Our Lasso regression analysis identified Magel2 and, to a lesser extent, Herc2 as potential predictors of individual post-pubertal social interaction scores in Dup/+ mice (Fig. 4D). These parameters repeatedly appeared in the 5-fold validation despite a potential bias due to the small sample size (Figure S13, Fold 1–5). Both Magel2 and Herc2 expression levels were negatively correlated with post-pubertal social interaction (see Figure S14). Although Ndn, another paternally inherited gene, was not identified by the Lasso regression analysis, its individual expression levels similarly exhibited a high negative correlation coefficient with social interaction scores in Dup/+ mice (Figure S14), consistent with our previous findings demonstrating that normalization of Ndn restores certain parameters of social behavior in a mouse model of 15q11–13 duplication [89]. Our preclinical data are consistent with human findings that protein-truncating variants of MAGEL2 and HERC2 are present in non-CNV carriers with idiopathic cases of autism spectrum disorder (ASD) [8] and schizophrenia [10], respectively, although these findings have not reached statistical significance because such variants are rare.
Magel2 in the prefrontal cortex emerged as the most robust predictor of social interaction, with its degree of overexpression negatively correlating with social interaction levels in session 2 when Dup/+ mice exhibited lower levels of social interaction compared to +/+ mice. A mouse model with overexpressed Magel2 but normalized Ndn demonstrated social interaction levels indistinguishable from wild-type mice, yet remained highly reactive to a stranger mouse [89]. Moreover, isolated normalization of Ndn alone did not fully restore the higher number of neonatal vocalizations in our paternal 15q11–13 duplication model [89]. Thus, the role of Magel2 as an additional potential driver gene for specific parameters of social behavior cannot be discounted. Indeed, the critical role of Magel2 in social behavior is further substantiated by its deletion, as MAGEL2 deletion is a recognized risk factor for ASD, intellectual disability, and epilepsy in humans [90]. Paternally inherited Magel2 deletion in mice adversely affects neonatal vocalizations [91], various aspects of post-pubertal social behavior [92–94], and parenting behavior [95].
Conversely, overexpression of a 1.5 Mb segment encompassing Ube3a and Snrpn does not influence social interaction in mice [89]. Our data align with this observation, as neither gene was identified as a predictor (see Fig. 4D). A mouse model with an additional copy of Ube3a does not exhibit any phenotypic abnormalities in post-pubertal social behavior; however, a model with two extra copies of Ube3a does exhibit such abnormalities, despite neonatal vocalizations remaining unaltered in both models [96, 97]. Another model with four extra copies of Ube3a showed no deficits in social behavior [98]. This apparent lack of a dose-dependent phenotype may, in part, be attributable to genetic background, as the former [96] and latter [98] models were generated from FVB and C57BL/6J mice, respectively. Different inbred mouse models exhibit varying susceptibility to gene-dose alterations [3, 4]. Alternatively, the absence of phenotype in the three-chamber apparatus in the four-extra-copy model may result from inherent limitations of this task, which does not directly measure social interaction [75, 99].
We acknowledge the potential for false-negative cases due to conceptual and technical limitations of our approach. Our strategy capitalized on the variance of gene expression levels, and genes with larger variance are more likely to be selected as predictors. However, while Mkrn3 expression demonstrated significantly higher variance than Herc2 in the prefrontal cortex of Dup/+ mice, only Herc2 was selected in our predictive models. Hence, the magnitude of variance alone does not fully account for the outcomes of our predictive models. Although Herc2 is a non-imprinted gene, its overexpression level might determine the level of social interaction in Dup/+ mice, together with Magel2.
It should be noted that technical limitations constrain our interpretation of the qRT-PCR-based approach. Gene expression variability may be assessed less reliably for genes with low basal expression than for those with high basal expression. Despite this interpretative limitation, our computational approach is still applicable to other CNVs for identifying predictors among abundantly expressed genes and neonatal parameters related to social and other behaviors. Some genes expressed in a mouse model of 16p11.2 hemizygous deletion exhibit greater inter-individual variability compared to other encoded genes [59]. Similarly, a mouse model of 22q11.2 hemizygous deletion displays individual variability in gene expression [100].
Our analysis focused on the prefrontal cortex due to its established functional relevance to social behavior [101], but the approach can be applied to other brain regions implicated in other dimensions of mental illness, such as motor and cognitive capacities. Regional and single-cell gene expression within various brain regions is expected to elucidate the circuits, networks, and specific cell types through which driver genes influence social and other behaviors in this and other genetic mouse models of CNVs. Moreover, our approach can be effectively applied to single-gene cases of psychiatric disorders, as gene expression variance may account for incomplete penetrance and variable expressivity of behavioral dimensions in such instances.
Although neonatal social communication precedes post-pubertal social behavior, the question of whether neonatal social communication is a prerequisite for the development of post-pubertal social behavior has not yet been experimentally evaluated. In this study, a gene-dose alteration of 15q11–13 affected both the types and sequences of neonatal vocalizations as well as post-pubertal social interactions at the individual mouse level. The phenotypic correlations observed across these two developmental time points are consistent with the hypothesis that neonatal social communication is essential for the later development of social behavior.
On the other hand, predictive parameters in gene expression varied between genotypes. Levels of Magel2, Herc2, and Ndn were predictive of individual post-pubertal social interaction levels in Dup/+ mice, whereas none of the 15q11–13-encoded genes was correlated with the variability of post-pubertal social interaction in +/+ littermates (see Figure S14). This finding raises the intriguing possibility that the variabilities in social interaction among controls and mutants depend on distinct sets of genes. A corollary of this observation is that, although social interaction scores partially overlap between genotypes, the underlying genes responsible for such an overlap may differ between genotypes. As the functional roles of a gene in each phenotype might differ in wild-type and mutant mice, caution is needed in extrapolating gene functions between mice with and without risk gene variants.
Paternally inherited 15q11–13 duplication does not confer complete penetrance. Although it is challenging to estimate the precise percentages of currently identified cases, approximately half of the carriers exhibit a composite phenotype that variably includes intellectual disability, developmental delay, ASD, speech delay, and behavioral problems, while one-third of carriers do not exhibit any of these diagnoses [33]. Severity also varies among individual carriers of paternal 15q11–13 duplication within each diagnosis. Such phenotypic variability cannot be accounted for solely by the chromosomal breakpoints of individual carriers [102]. Our computational approach with a preclinical mouse model is a complementary method to identify the sources of phenotypic variability, thereby contributing to a more comprehensive mechanistic understanding of psychiatric disorders and the development of gene therapy strategies targeting dimensions of mental illness.
References
- 1. Malhotra D. and Sebat J., CNVs: harbingers of a rare variant revolution in psychiatric genetics. Cell, 2012. 148(6): p. 1223–1241. PMID: 22424231. doi: 10.1016/j.cell.2012.02.039. PMCID: PMC3351385.
- 2. Zinkstok J., et al., The 22q11.2 deletion syndrome from a neurobiological perspective. Lancet Psychiatry, 2019. 6(11): p. 951–960. PMID: 31395526. doi: 10.1016/S2215-0366(19)30076-8. PMCID: PMC7008533.
- 3. Hiroi N. and Yamauchi T., Modeling and Predicting Developmental Trajectories of Neuropsychiatric Dimensions Associated With Copy Number Variations. Int J Neuropsychopharmacol, 2019. 22(8): p. 488–500. PMID: 31135887. doi: 10.1093/ijnp/pyz026. PMCID: PMC6672556.
- 4. Hiroi N., Critical Reappraisal of Mechanistic Links of Copy Number Variants to Dimensional Constructs of Neuropsychiatric Disorders in Mouse Models. Psychiatry and Clinical Neurosciences, 2018. 72(5): p. 301–321. PMID: 29369447. doi: 10.1111/pcn.12641. PMCID: PMC5935536.
- 5. Hiroi N., et al., Copy Number Variation at 22q11.2: from rare variants to common mechanisms of developmental neuropsychiatric disorders. Mol. Psychiatry, 2013. 18: p. 1153–1165. PMID: 23917946. doi: 10.1038/mp.2013.92. PMCID: PMC3852900.
- 6. Hiroi N., et al., Mouse models of 22q11.2-associated autism spectrum disorder. Autism, 2012. S1(001): p. 1–9.
- 7. Satterstrom F.K., et al., Large-Scale Exome Sequencing Study Implicates Both Developmental and Functional Changes in the Neurobiology of Autism. Cell, 2020. 180(3): p. 568–584.e23. PMID: 31981491. doi: 10.1016/j.cell.2019.12.036. PMCID: PMC7250485.
- 8. Fu J.M., et al., Rare coding variation provides insight into the genetic architecture and phenotypic context of autism. Nat Genet, 2022. 54(9): p. 1320–1331. PMID: 35982160. doi: 10.1038/s41588-022-01104-0. PMCID: PMC9653013.
