A maternal-to-zygotic-transition gene block on the zebrafish sex chromosome
Catherine A Wilson, John H Postlethwait

TL;DR
The study identifies a unique region on zebrafish sex chromosomes that plays a role in the maternal-to-zygotic transition during early embryonic development.
Contribution
The discovery of a gene block on zebrafish sex chromosomes that is silenced in ovaries but expressed in testes and embryos during zygotic genome activation.
Findings
Region-2 on chromosome 4R has testis-biased gene expression and is silent in ovaries.
Region-2 contains maternal-specific genes and miR-430, which targets maternal RNA for degradation.
The region is associated with the maternal-to-zygotic transition in zebrafish embryos.
Abstract
Wild zebrafish (Danio rerio) have a ZZ/ZW chromosomal sex-determination system with the major sex locus on the right arm of chromosome-4 (Chr4R) near the largest heterochromatic block in the genome, suggesting that Chr4R transcriptomics might differ from the rest of the genome. To test this hypothesis, we conducted an RNA-seq analysis of adult ZW ovaries and ZZ testes in the Nadia strain and identified 4 regions of Chr4 with different gene expression profiles. Unique in the genome, protein-coding genes in a 41.7 Mb section (Region-2) were expressed in testis but silent in ovary. The AB lab strain, which lacks sex chromosomes, verified this result, showing that testis-biased gene expression in Region-2 depends on gonad biology, not on sex-determining mechanism. RNA-seq analyses in female and male brains and livers validated reduced transcripts from Region-2 in somatic cells, but without…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6| A. Properties | B. χ2 test | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Chr4-R1 | Chr4-R2 | Chr4-R3 | Chr4-R4 | Genome | R1 | R2 | R3 | R4 | |
| # differentially expressed genes | 429 | 308 | 114 | 44 | 15,984 | R1 | 2.53E−33 | 2.79E−05 | 0.57 |
| # DE genes upregulated in testis vs ovary | 290 | 307 | 98 | 28 | 9,650 | R2 | 5E−146 | 4.93E−26 | |
| # DE genes upregulated in ovary vs testis | 139 | 1 | 16 | 16 | 6,334 | R3 | 2.01E−05 | ||
| Percent DE genes upregulated in testis vs ovary | 67.6% | 99.7% | 86.0% | 63.6% | 60.4% | R4 | |||
| Percent DE genes upregulated in ovary vs testis | 32.4% | 0.3% | 14.0% | 36.4% | 39.6% | ||||
| Average log2-fold change up for testis-overexpressed genes | 3.169 | 3.972 | 3.796 | 3.427 | 3.161 | ||||
| Average log2-fold change up for ovary-overexpressed genes | 1.826 | n/a | 0.925 | 6.241 | 1.930 | ||||
| Average TPM in testis | 37.95 | 1.22 | 7.75 | 15.22 | 38.61 | ||||
| Average TPM in ovary | 28.82 | 0.06 | 4.88 | 317.10 | 40.13 | ||||
- —NIH10.13039/100000002
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
Topicsnanoparticles nucleation surface interactions · Inorganic Fluorides and Related Compounds
Introduction
Within a species, selection can act differently on females and males due to differences in physiological traits that affect gamete formation, mating, fertilization, and parental care (Darwin 1871). In humans, these sex-related physiological features can lead to health-related sex-biased discrepancies in functioning of the immune, cardiovascular, and skeletal systems, and in metabolism, responses to drugs and toxins, and the type and incidence of cancers (Wizemann and Pardue 2001). These facts led some health research funding agencies to prioritize research that balances considerations of sex as a biological variable (Clayton and Collins 2014). Sex differences in development and physiology can arise from genes that are expressed at a greater level in one sex compared to the other (sex-biased genes) (Dutoit et al. 2018).
Genes with sex-biased expression might not be distributed randomly across the genome. In species that have chromosomally XX females and XY males, X chromosomes are often enriched with female-biased genes and conversely, in species that have ZW females and ZZ males, Z chromosomes often have an excess of male-biased genes (Mank 2009). These distributions likely reflect the fact that the phenotypes of traits that maximize male fitness might sometimes conflict with those that maximize female fitness (Lande 1980).
Fish make superior models for understanding sex genetics, gonad development, and the mechanisms of human diseases (Ye and Chen 2020; Aharon and Marlow 2021; Baldridge et al. 2021; Nagahama et al. 2021; Pan et al. 2021; Henke et al. 2023; MacRae and Peterson 2023) and sex-biased expression is common in fish for genes encoding proteins, microRNAs (miRNAs), and long noncoding RNAs (lncRNAs) (Desvignes et al. 2019, 2021; Dechaud et al. 2021; Guo et al. 2021; Lichilin et al. 2021; Kaitetzidou et al. 2022). In zebrafish, a premier model for human disease (Baldridge et al. 2021), sex-biased transcription has been studied for liver, brain, and gonads, with the greatest sex differences in gene expression found in gonads (Robison et al. 2008; Sreenivasan et al. 2008; Small et al. 2009; Ung et al. 2013; Zheng et al. 2013; Wong et al. 2014; Yang et al. 2016). Furthermore, in zebrafish and other teleosts, female-biased and male-biased genes both evolve more rapidly than genes that do not show sex-biased expression (Yang et al. 2016; Lichilin et al. 2021).
Despite the importance of sex-biased gene expression to understand sexual selection, to discover the origin of sex-related phenotypes, and to evaluate the accumulation of sex-related genes on sex chromosomes, little information exists concerning the genomic distribution of sex-biased genes along fish chromosomes (Wong et al. 2014; Kaitetzidou et al. 2022). Here, we describe the discovery of strong testis-biased gene expression in hundreds of protein-coding genes in a heterochromatic region of the long arm of the homomorphic sex chromosome Chr4 near the sex-determining locus in a zebrafish strain that retains the wild ZW female/ZZ male sex-determining system (Wilson et al. 2014) and in which ZZ fish develop directly into males, avoiding the juvenile ovary stage observed in laboratory strains (Wilson et al. 2024). Results showed that this region (Chr4 Region-2) coincides with genes encoding maternally supplied ribosomal and spliceosomal components for the early embryo (Locati, Pagano, Ensink et al. 2017; Pagano, Dekker et al. 2020) as well as genes that are among the first to be transcribed after zygotic gene activation (ZGA; White et al. 2017; Wells et al. 2023). Furthermore, this region is adjacent in the genome to genes that help to remove maternal transcripts (Giraldez et al. 2005, 2006; Lund et al. 2009; Takacs and Giraldez 2016), making Chr4 Region-2 a maternal-to-zygotic-transition gene block.
Methods
Female and male zebrafish of the Nadia strain (NA, ZDB-GENO-030115-2) provided gonads for these experiments. All animal work was approved by the University of Oregon Institutional Animal Care and Use Committee (# AUP-21-06 v.1). Adult fish were euthanized at 3-month post-fertilization and genotyped by PCR for sex chromosomes as previously described (Wilson et al. 2014). From 5 adult ZW females and 5 adult ZZ males, we dissected pairs of ovaries and testes, respectively, stored them in RNAlater (Invitrogen), and extracted total RNA using the RiboPure RNA Purification Kit (Ambion). We assessed total RNA quality on the Agilent Fragment Analyzer and enriched mRNA from samples with an RQN ≥ 7.9 using the Dynabeads mRNA Purification Kit (Ambion). Strand-specific sequencing libraries were generated with the NextFlex Rapid Directional qRNA-Seq Library Prep Kit (BIOO Scientific) and were sequenced on an Illumina NextSeq 500 to generate paired-end 75-nt reads. Illumina reads were preprocessed with Dupligänger (Sydes et al. 2020) to annotate barcodes, to act as a wrapper for Cutadapt (Martin 2011), which removes adapters, and Trimmomatic (Bolger et al. 2014), which trims reads for quality. Reads were then aligned to the zebrafish reference genome (version GRCz11) (Howe et al. 2013) with GSNAP (Wu and Nacu 2010), allowing up to 10% mismatches and using transcript splice sites identified in the Ensembl Release 92 annotation (Ens92) (Zerbino et al. 2018; Martin et al. 2023). These aligned reads were then processed with Dupligänger to remove PCR duplicates (Sydes et al. 2020). Reads that mapped to annotated features were counted with HTSeq-Count in “strict” mode (Anders et al. 2015). Differential expression in protein-coding genes was calculated with DESeq2 (version1.20.0) (Love et al. 2014). We performed functional enrichment analysis of differentially expressed genes using PANTHER (Mi et al. 2017).
RNA-seq data for wild-type ovary and testis from fish of zebrafish strain AB (ZDB-GENO-960809-7) were obtained from SRA accession PRJNA512103 and processed as described (Yan et al. 2019). AB control liver and brain RNA-seq data were obtained from SRA accession PRJNA600413 (Banu et al. 2020); because these somatic organ libraries did not have molecular barcodes, we removed adapters with Cutadapt (Martin 2011), quality trimmed with Trimmomatic (Bolger et al. 2014), and then aligned sequences to GRCz11 as above. We used GTFtools (Li et al. 2022) to analyze Ens92 gene models and used the length of merged exons of isoforms of each gene to calculate TPM (transcripts per kilobase of gene length per million total reads) for all protein-coding genes in each library.
The GRCz11 repeat annotation was obtained from UCSC (Navarro Gonzalez et al. 2021). We excluded any repeat with a “Family” or “Class” that contained a “?”, or was labeled “Unknown”. We also excluded any repeat with the following “Family” or “Class”: “Simple_repeat”, “Low_complexity”, “tRNA”, “rRNA”, and “snRNA” as well as repeats with fewer than 50 copies in the genome. We generated bed files for each repeat name and used BEDTools (Quinlan and Hall 2010) to calculate the overlap with selected genomic regions, then calculated the percentage of each region covered by each repeat type. Hierarchical clustering of genomic regions was performed using the R function “hclust” and the “average” method. Principal component analysis based on the correlation matrix was performed using the R function “prcomp”. Transposable element transcription quantification was calculated using Telescope (Bendall et al. 2019). Reads from Nadia gonads were realigned to the GRCz11 zebrafish assembly using STAR (Dobin et al. 2013) with parameters –winAnchorMultimapNmax 100 –outFilterMultimapNmax 100. Alignments were then reassigned with Telescope, using the UCSC GRCz11 repeat annotation as input. TE counts generated by Telescope were analyzed using DESeq2 (Love et al. 2014).The zebrafish maternal 5S rDNA consensus sequence 5Sseq1 (Locati, Pagano, Ensink et al. 2017) was aligned to GRCz11 with BLASTN (Camacho et al. 2009) using parameters -perc_identity 100 -qcov_hsp_perc 100. Maternal 5S gene positions were plotted on Chromosome-4 with ChromoMap (Anand and Rodriguez Lopez 2022). tRNA annotations were obtained from GtRNAdb (Chan and Lowe 2016). Maternal U1 and U6 sequences were obtained from a curated annotation by Pagano, Dekker et al. 2020. Annotations were visualized in IGV (Thorvaldsdottir et al. 2013). Orthologs for Chr4 genes were obtained from ZFIN (Bradford et al. 2022).
Results
Differential gene expression in Nadia gonads
Using RNA-seq, we compared gene expression patterns of testes from chromosomally ZZ adult males to those of ovaries from chromosomally ZW adult females in 3-month post-fertilization young adults of the Nadia strain, pooling both gonads from each of 5 replicate individuals of each sex genotype for 10 independent samples. Sequencing resulted in an average of 10.45 million reads per library over the 10 libraries. Analysis of testis vs ovary with DESeq2 (Love et al. 2014) identified 17,052 differentially expressed (DE) genes (q-value cutoff of 0.1 and P-value cutoff of 0.05) from the 25,432 annotated protein-coding genes in the GRCz11 zebrafish genome assembly. A total of 6,557 genes showed ovary-biased expression and substantially more (10,495 genes) showed testis-biased expression (Fig. 1a, Supplementary Table 1).
Testis vs ovary differential gene expression. a) Testis vs ovary differential expression plotted against genomic position across the 25 zebrafish chromosomes. The plot shows unique and specific upregulation of testis genes vs ovary genes on Chr4, the sex chromosome. b) Testis vs ovary differential expression across Chr4. The plot identified 4 distinct regions: R1, with values similar to the bulk of the genome; R2, with testis expression on average much higher than ovary expression; R3, again with values similar to most of the genome; and R4, with some strongly ovary-biased genes and coincident with sar4 containing the major sex-determining locus. The horizontal black bar represents Chr4, with the location of the centromere (horizontal red line) located by alignment of the centromeric markers BX537156 (Freeman et al. 2007) and Z20450 (Mohideen et al. 2000) to GRCz11; the positions of maternal 5S ribosomal genes (Locati, Pagano, Ensink et al. 2017) are indicated by green vertical lines, and mir430 genes are marked by white vertical lines. c) Replication banding of Chr4 and Chr3 (Amores and Postlethwait 1999; Phillips et al. 2006). Karyotypes showed dark staining (early replicating DNA) on the short (left) arm of Chr4 and at the distal tip of the long (right) arm of Chr4, in contrast to pale staining (late-replicating DNA) in the position expected for R2 on Chr4R, while only thin light bands appeared on all other chromosomes, represented here by Chr3. d) The absolute level of gene expression in transcripts per kilobase of gene length per million total reads (TPM) for all protein-coding genes within Region-2 of Chr4, showing that transcripts were detected from most genes in testis (blue) but transcript count was near 0 from most genes in ovary (red).
Functional enrichment analysis
To analyze functions of the most strongly differentially expressed genes, we set criteria of greater than 5-fold change and a q-value of less than 0.0001, thereby identifying a subgroup of 6,305 strongly DE genes. Of these strongly DE genes, 1,521 (24.1%) showed ovary-biased expression but considerably more, 4,784 (75.9%), showed testis-biased expression (Fig. 1a, Supplementary Table 1). GO term enrichment analysis for these strongly DE genes identified the top categories for Biological Process classification as “sperm-egg recognition” (GO:0035036); “cell-cell recognition” (GO:0009988); “fertilization” (GO:0009566); “binding of sperm to zona pellucida” (GO:0007339); “oogenesis” (GO:0048477), “axonemal dynein complex assembly” (GO:0035082), “cilium movement” (GO:0003341); and “germ cell development” (GO:0007281) (Supplementary Table 2). For the Molecular Function classification, top categories were “protein serine/threonine kinase activity” (GO:0004674) and “protein kinase activity” (GO:0004672) (Supplementary Table 3). These categories are expected for germ cell development and cell signaling during gametogenesis.
Genomic distribution of differentially expressed genes
To determine whether sex bias in differential gene expression was evenly distributed across the genome, we plotted the log2-fold expression change of all 17,052 protein-coding genes that were differentially expressed between adult ZW ovaries and ZZ testes (Fig. 1a). Results showed that for most of the genome, genes with testis-biased expression tended on average to be differentially expressed to a greater fold than genes with ovary-biased expression, confirming earlier studies for zebrafish and other animals, including mammals (Parisi et al. 2003; Ranz et al. 2003; Malone et al. 2006; Yang et al. 2006). On average, testis-overexpressed genes were overexpressed 9.2-fold, but ovary-overexpressed genes were overexpressed only 3.7-fold. Differentially expressed genes were approximately evenly distributed across the genome, with 1 exception (Fig. 1a). The sex chromosome, chromosome-4 (Chr4) (Anderson et al. 2012; Wilson et al. 2014) stood out due to a large block of genes lacking substantial upregulation of genes in ovary relative to testis (Fig. 1a).
To examine Chr4 in more detail, we first identified the position of its centromere by aligning the centromeric markers BX537156 and Z20450 (Mohideen et al. 2000; Freeman et al. 2007) to GRCz11 (Fig. 1b). A focus on Chr4 showed that its right arm (Chr4R) in Ensembl (https://www.ensembl.org/Danio_rerio/Location/Chromosome?r=4), which is the cytogenetically long arm, Chr4q (Postlethwait et al. 1994; Daga et al. 1996; Phillips et al. 2006) had: (1) few genes with ovary-biased expression; (2) an average expression bias for testis-biased genes stronger than the genome-wide average; and (3) less variation in dynamic expression range compared to the left arm (Chr4L, the cytogenetic short arm Chr4p), which was similar to the rest of the genome (Fig. 1, a and b). These results show that Chr4 is not only cytogenetically distinct (Pijnacker and Ferwerda 1995; Daga et al. 1996; Gornung et al. 1997; Amores and Postlethwait 1999; Sola and Gornung 2001; Traut and Winking 2001; Phillips et al. 2006), but at least for Nadia strain gonads, was also transcriptionally distinct.
Differential gene expression for testis vs ovary divided Chr4 into 4 distinct regions (Fig. 1b). Region-1 extended from the telomere of Chr4L, through the centromere, and ended at position 29.1 Mb, just to the right of the mir430 gene cluster on Chr4R. Differential gene expression in Region-1 exhibited a pattern similar to the rest of the genome, but with a somewhat higher percent of testis-biased genes (Table 1A). In Region-1, 67.6% of differentially expressed genes were upregulated in the testis but only 32.4% of DE genes were upregulated in the ovary (Fig. 1b, Table 1A). These values are only somewhat more testis-biased than the rest of the genome, which had 60.4% of DE genes upregulated in testis and 39.6% in the ovary (Table 1A). Testis-overexpressed genes in Region-1 had an average log2-fold change relative to expression of those genes in the ovary of 3.2, while ovary-overexpressed genes had an average log2-fold change relative the expression of those genes in the testis of just 1.8, similar to the rest of the genome (Table 1A).
Region-2 began shortly to the right of the mir430 gene cluster at 29.1 Mb on Chr4R and extended to 70.8 M (Fig. 1b). In Region-2, 307 of 308 differentially expressed genes annotated as protein coding were overexpressed in testis compared to ovary (Fig. 1b, Table 1A). These protein-coding genes upregulated in testis were interspersed with the positions of maternal 5S ribosomal genes (Locati, Pagano, Ensink et al. 2017) (Fig. 1b), and the only gene in Region-2 that was overexpressed in ovary compared to testis (ENSDARG00000115819) was an 18S rRNA gene incorrectly annotated as protein coding. Region-2 of Chr4 appears to be cytogenetically distinct: Chr4R is largely heterochromatic and most of it is late replicating, in contrast to the right telomeric region and Chr4L, which matched most of the rest of the genome as being early replicating (Amores and Postlethwait 1999) (Fig. 1c).
Because heterochromatin is associated with transcriptional silencing (Richards and Elgin 2002), we wanted to see how Region-2 compared to the rest of the genome. We examined the absolute levels of gene expression across the genome by calculating the TPM for all protein-coding genes in the dataset. We found that for Nadia gonads, the TPM for most genes in Region-2 was 0 or nearly so in the ovary (Fig. 1d). The average expression for all 651 annotated protein-coding genes in Region-2 was 0.06 TPM per gene in ovaries regardless of whether genes were differentially expressed in testis vs ovary; in contrast, the TPM average genome-wide was 39.3, over 600 fold higher. About half of these 651 Region-2 genes (352 genes, 54.1%) lacked detectable expression in the ovary. In contrast, the average expression level for these 651 genes in the testis was 1.22 TPM, with only 79 (12.1%) lacking detectable expression. A total of 74 Region-2 genes (11.4%) lacked detectable expression in both gonads. We conclude that genes in Region-2 of Chr4 in Nadia gonads showed greatly reduced transcript numbers compared to the rest of the genome, consistent with its heterochromatic nature, and that this region was transcriptionally silent for protein-coding genes in ovary, but not in testis.
Region-3 stretched from 70.8 Mb to 76.9 Mb. Ovarian gene expression was not silenced in Region-3, but this region nevertheless exhibited testis-biased gene expression. Of the 114 differentially expressed genes in Region-3, 86% were overexpressed in testis relative to ovary, while only 14.0% were overexpressed in ovaries relative to testis (Table 1A). A chi-square test showed that this result for Region-3 was significantly different from Region-1 (P-value = 2.79E^−05^) and Region-2 (P-value = 5E^−146^) (Fig. 1b, Table 1B). The magnitude of ovary expression was reduced in Region-3 compared to the rest of the genome, with an average log2-fold change for ovary-overexpressed genes of 0.93 compared to the value for the rest of the genome of 1.9 (Table 1A).
Region-4 corresponded to the sex-determining region, sar4 (Wilson et al. 2014). Region-4 began at 76.9 Mb at the right of the ms4a17a gene cluster and extended to the telomere. Region-4 stood out not only because it contains previously identified strongly sex-linked loci (Wilson et al. 2014) but also because it had some of the most intensely ovary-biased genes across the entire genome (Fig. 1, a and b).
Region-4 in GRCz11 contained 58 annotated protein-coding genes, 16 of which were upregulated in ovary relative to testis in the Nadia analysis, and 28 of which were upregulated in testis vs ovary (Table 1A, Supplementary Table 1). Of the 16 ovary-biased genes in Region-4, 9 were overexpressed more than 150-fold (log2-fold change = 7.23) relative to testis and 4 were expressed more than 1,000-fold (log2-fold change = 9.97, Supplementary Fig. 1). These genes included the mucin gene CU467646.3 (ENSDARG00000099076), which belongs to a tandemly expanded mucin-2-like, cyprinid-specific gene family (gene tree ENSGT00630000090221) that encodes chorion proteins (Hau et al. 2020). Three of the 16 genes highly upregulated in ovary (ENSDARG00000112275, ENSDARG00000113671, ENSDARG00000115978) are members of a cluster of maternal 45S ribosomal genes that were misannotated as protein-coding genes in Ens92 (Ortega-Recalde et al. 2019; Locati, Pagano, Girard et al. 2017). None of these ovary-biased genes in the sex-determining region are related to any known sex-determination genes in other species or are predicted to play any regulatory role in gonad development, although the maternal 45S genes have been proposed to be intrinsic to sex determination in zebrafish (Ortega-Recalde et al. 2019).
Although some highly ovary-specific genes appear to cluster in Region-4 (Fig. 1b), in total, more differentially expressed protein-coding genes in Region-4 showed testis-biased expression than ovary-biased expression. The percentage of testis and ovary expressed genes in Region-4 was not significantly different from Region-1 (chi-square test, P-value = 0.57), but was significantly different from Region-2 and Region-3 (Table 1B). The highest testis-biased Region-4 gene, however, was only 39-fold overexpressed in testis vs ovary, compared to hundreds of fold overexpression for some of the ovary-biased genes.
Gonadal gene expression in a domesticated laboratory strain
In contrast to NA, the laboratory strain AB lacks strongly sex-linked markers and appears to lack a Z chromosome (Wilson et al. 2014), likely because it was derived by several generations of gynogenesis (Walker-Durchanek 1980); thus, AB is likely chromosomally WW, a genotype that usually becomes females but can also develop into neomales (Wilson et al. 2014, 2024; Valdivieso et al. 2022). We wondered if the testis- vs ovary-biased gene expression pattern in NA Chr4 was a peculiarity of the chromosomal sex-determination system in NA strain fish, or if it also held for the laboratory AB strain. To answer this question, we analyzed RNA-seq data from ovaries and testes of 8-month-old adult AB wild-types (SRA accession PRJNA512103 (Yan et al. 2019)). Analysis of transcription levels from ovaries of 4 AB females and testes of 3 AB males revealed that AB and NA share the same pattern of gonad-specific gene expression levels in all 4 regions of Chr4 (Fig. 2). Region-2 had a mean expression level for all protein-coding genes in AB ovaries of 0.17 TPM, with 370 of 651 genes (56.8%) lacking detectable expression; the mean expression level for AB testis was 4 times higher than AB ovaries (0.68 TPM), with only 20.4% of genes (133 of 651) showing no expression (Supplementary Table 4). The overall pattern for AB gonads was like that obtained for NA gonads. This finding shows that the Nadia chromosomal sex-determination mechanism is not necessary for the ovary-specific gene silencing observed for Region-2 of Chr4, but instead, this silencing must be related to gonad development and/or function.
Gene expression in gonads and in somatic organs measured by log10 TPM. a) A heatmap showing TPM levels for annotated protein-coding genes across the genome in male and female gonads of strain NA and AB and in male and female somatic organs (brain and liver) for strain AB plotted against position. Color scale below at right. b) A heatmap showing TPM along the length of Chr4 for the same samples as in a). Plots revealed low transcript levels in Region-2, especially in ovaries of both Nadia and AB, and that, although gene expression was also low in Region-2 in strain AB brain and liver, it was not sexually dimorphic in these somatic organs.
Expression of Chr4 Region-2 genes in somatic organs
Having shown that ovary-specific silencing of Region-2 depends on testis-vs-ovary function and not on sex chromosome genotype, we wondered if the sex-specific expression differences we detected were displayed only by gonads or also occurred in somatic organs. To answer this question, we accessed published RNA-seq data for 2 additional organs: brain and liver. Data were retrieved from brains of 2 AB males and 3 AB females and from livers of 2 AB males and 2 AB females, with data from all 4 samples derived in the same study (PRJNA600413) (Banu et al. 2020). We calculated TPM for all annotated protein-coding genes across the genome and plotted results as in the gonad experiments. Results revealed reduced expression of Chr4 Region-2 genes relative to other portions of the genome for both brain and liver (Fig. 2; Supplementary Table 4), paralleling results for gonads (Fig. 1). This finding showed that reduced expression of loci in Region-2 was not specific to gonads but was shared by at least 2 other somatic organs. A close look at transcript counts revealed low but detectable levels of transcripts of Region-2 genes in both brain and liver, unlike ovary, where hardly any transcripts were detected for nearly all genes in Region-2. In male livers, the average TPM for genes in Region-2 was 0.27 with 123 of 651 genes (18.9%) lacking detectable expression, and in females, the average TPM for genes in Region-2 was 0.25 with 141 genes (21.7%) lacking detectable expression. In male brains, the average TPM for genes in Region-2 was 0.49, and in females, it was 0.57, with 74 (11.4%) and 76 (11.7%) genes lacking detectable expression, respectively (Supplementary Table 4). This suppression of gene expression is consistent with the enrichment of transcription-inhibiting H3K9me2 and H3K9me3 histone modifications observed along Chr4R (Yang et al. 2020; Padeken et al. 2022). While repression of transcription for Region-2 genes occurred in both somatic organs we examined, it was not sex-specific in brain or liver, and neither somatic organ showed the apparent silencing of transcription observed for the ovary. We conclude first that gene expression for Chr4 Region-2 was reduced both for adult gonads and adult somatic organs, and second that expression was uniquely silent in adult ovaries, and merely suppressed in testes and in somatic organs of both sexes.
Gene content of Region-2
Chr4 in zebrafish has long been known to be unique in the genome due to its heterochromatic long arm, Chr4R (Pijnacker and Ferwerda 1995; Daga et al. 1996; Gornung et al. 1997; Amores and Postlethwait 1999; Phillips et al. 2006). The genic content of Chr4R was also shown to be distinct in terms of 5S rRNA genes, mir430 genes, zinc finger protein genes, snRNA genes, tRNA genes, Kolobok-1 DNA transposons, and pseudogenes (Anderson et al. 2012). The Chr4R-linked 5S rRNA genes are a maternal-specific isoform (Locati, Pagano, Ensink et al. 2017). The satellite repeats MOSAT-2 and SAT-2 have a complementary distribution, with MOSAT-2 shown to be almost exclusively on Chr4R but SAT-2 being absent from Chr4R although present broadly otherwise throughout the genome (Howe et al. 2013). Chr4R also contains unique gene families that encode mainly either NOD-like receptor (NLR) proteins, or zinc finger proteins (Znf) (Howe et al. 2013, 2016). While these unusual features had been ascribed to the entire long arm of Chr4 in earlier versions of the genome assembly, we wanted to understand the relationship of these repetitive genes and satellite sequences to the 4 specific regions on Chr4 defined by testis-vs-ovary gene expression, and so plotted the positions of these features on the GRCz11 assembly (Fig. 3).
Distinctive features of Chr4. a) Distribution of sequences across the 25 zebrafish chromosomes for satellite sequences MOSAT-2 and SAT2, genes encoding the protein-coding domains IPR007111 NACHT, genes encoding IPR013087 C2H2-type zinc fingers, and certain nonprotein-coding genes including maternal 5S rRNA genes (somatic 5S rRNA genes on Chr18 not shown), snRNA genes, and tRNA genes. b) Positions of these sequences along Chr4 are specifically enriched or excluded from Region-2 (Anderson et al. 2012; Howe et al. 2013; Howe et al. 2016). MOSAT-2 is largely restricted to Region-2, while reciprocally, SAT-2 is absent from Region-2. Maternal 5S genes, snRNAs, and tRNAs are largely restricted to Region-2, while NACHT domain containing NLR genes and zinc finger genes are distributed along the length of Chr4R. Although several of these factors had been reported to be specific for Chr4R, we show here that, except for the NACHT and ZNF families, they are unique to Region-2.
Results showed that the unique features of Chr4 are not evenly distributed along the chromosome. Strikingly, MOSAT-2 distribution was strongly located within Region-2 but was greatly reduced in Region-1 and absent from Regions-3 and -4 and most of the rest of the genome (Fig. 3). Reciprocally, SAT-2 was entirely absent from Region-2, but was again detected in Regions-1, -3, and -4 and across most of the rest of the genome (Fig. 3). Genes encoding NLR proteins (InterPro domain IPR007111) and C2H2 zinc finger proteins (InterPro domain IPR013087) extended beyond Region-2 all the way to the end of the chromosome (Fig. 3). 5S ribosomal genes are found in only 2 places in the genome: 5S rRNA genes in Chr4 Region-2 (Fig. 3) encode maternal-specific ribosomal components in oocytes that are replaced during development with somatic 5S sequences encoded by 12 somatic 5S rRNA genes on Chr18 (Locati, Pagano, Ensink et al. 2017). Chr4 Region-2 contains large numbers of snRNA (Anderson et al. 2012; Howe et al. 2013) genes that are maternal-specific U1 and U6 spliceosome components (Fig. 3), while the two U4 sequences located in Region-1 are somatic (Pagano, Dekker et al. 2020). It is not yet known whether the tRNAs on Chr4 (Anderson et al. 2012) that are accumulated in Region-2, are maternal or somatic, but it is tempting to speculate that they, too, may be components of maternal-specific translation machinery different from somatic tRNA genes. The presence of maternal 5S rRNA and snRNA genes in Region-2 is interesting because these genes are highly expressed in developing oocytes but their positions are interspersed with protein-coding genes, which conversely, are silenced or nearly so. These observations suggest a reciprocal relationship in Region-2 in ovaries between strong expression of these RNA genes and the silencing of expression for neighboring protein-coding genes. The 5S rRNA genes are transcribed by RNA polymerase III, as is U6 (Weinmann et al. 1974; Kunkel et al. 1986), but U1 and protein-coding genes are both transcribed by RNA polymerase II (Weil et al. 1979; Murphy et al. 1982). The expression of maternal U1 sequences from Region-2 suggests that genes in this section of the genome are not globally inaccessible to RNA polymerase II in oocytes and gene silencing in this region must occur through a more nuanced, but as yet unknown, mechanism.
About 80% of the genes on Chr4R lack a human ortholog (Howe et al. 2013). Analysis of orthologs annotated in ZFIN (Bradford et al. 2022) show that this phenomenon is more pronounced in Region-2 than in Regions-3 and -4 (Supplementary Table 5). In the euchromatic portion of Chr4R (Region-3 and Region-4 combined), 17 of 221 protein-coding genes (7.7%) had human orthologs, compared to 450 of 603 (74.6%) of genes with human orthologs in Region-1. None of the 651 protein-coding genes in Region-2 had a unique identified human ortholog.
Within Region-2, 177 of 651 genes (27.2%) contained the IPR007111 NACHT_NTPase domain found in NLR genes (Howe et al. 2016) (Fig. 3). An additional 24 genes lacked the IPR007111 NACHT_NTPase domain but contained other domains also found in NLR genes, including the IPR003877 SPRY_dom domain and the IPR032675 LRR_dom_sf domain (Howe et al. 2016) (Supplementary Table 6), so these 24 genes are also likely NLR genes that may not be fully annotated. Published analyses showed that the NLR genes on Chr4R belong to 3 distinct fish-specific NLR families (group 1, group 2a, and group 3a) that are not found elsewhere in the genome (Howe et al. 2016). The functions of this gene family have not been extensively studied in zebrafish, but NLR proteins generally play a role in innate immunity (Laing et al. 2008; Li et al. 2017; Forn-Cuni et al. 2019). Three genes within Region-2 contain the interleukin receptor domain IPR032356 IL17R_fnIII_D1. Together with the NLR genes, the clustering of these genes in Region-2 suggests a possible role in innate immunity.
Nearly half of Region-2 genes (321/651) were annotated as possessing the IPR013087 Znf_C2H2_type zinc finger domain (Fig. 3). Importantly, these Region-2 zinc finger genes are among the first genes to be transcribed during the onset of zygotic genome activation (White et al. 2017). These ovary-silenced Region-2 zinc finger genes belong to an evolutionarily young gene family under positive selection, with the youngest being among the earliest genes expressed during ZGA, while those ovary-silenced Region-2 znf genes expressed later during ZGA are evolutionarily somewhat older (Wells et al. 2023). In contrast, the znf genes with maternally deposited transcripts located in the euchromatic regions of Chr4R, Regions-3 and -4, are evolutionarily older than the ovary-silenced Region-2 znf genes expressed early (Wells et al. 2023).
A number of Region-2 genes have protein domains implicated in chromatin modification. Four genes (ENSDARG00000104681, ENSDARG00000076160, ENSDARG00000115416, ENSDARG00000103283) contain the SET domain IPR001214 and are predicted to have histone methyltransferase activity (Jenuwein et al. 1998). Some of these histone methyltransferase genes share expression profiles with Chr4R zinc finger genes (White et al. 2017). These genes are members of gene tree ENSGT00540000072423, which also contains several other genes in Region-2 that are not annotated with the SET domain. Two genes, ENSDARG00000077266 and ENSDARG00000104852, contain the IPR000953 Chromo/chromo shadow domain and are therefore predicted to play a role in chromatin modification (Koonin et al. 1995). Four zebrafish-specific genes in Region-2 (ENSDARG00000096491, ENSDARG00000101046, ENSDARG00000102659, ENSDARG00000101912) contain domain IPR032071, which corresponds to the “domain of unknown function” DUF4806, which is a subtype of the BEN DNA-binding domain (Pan et al. 2023). BEN domains are predicted to recruit chromatin-modifying factors (Abhiman et al. 2008), and it is possible that proteins encoded by these Chr4 Region-2 genes may do so as well. Expression of these chromatin modification factor genes at the mid-blastula transition might help in the remodeling of chromatin during ZGA.
In addition to the gene families mentioned above, Region-2 contains 3 genes (ENSDARG00000090355, ENSDARG00000087265, ENSDARG00000101392) with the IPR017348 PIM1/2/3 domain of protein serine/threonine kinases, and 3 genes (ENSDARG00000094500, ENSDARG00000113519, ENSDARG00000099337) with the IPR001810 F-box domain, a protein-protein interaction motif.
Region-2 has a unique repertoire of repetitive elements
The right arm of Chr4 has been reported to be rich in repetitive sequences and young transposable elements compared to the rest of the genome (Anderson et al. 2012; Howe et al. 2013; Chang et al. 2022). We wondered if this was a property of the whole of Chr4R or was specific to Region-2 like MOSAT-2 (Fig. 3b). To find out, we obtained the UCSC genome browser repeat annotation library (Navarro Gonzalez et al. 2021) and compared the repeat content of 5 different portions of the genome: each of the 4 Regions of Chr4, and the rest of the genome (Chr1–Chr3 plus Chr5–Chr25). We calculated the percent of each region covered by each repeat element. Hierarchical clustering showed that Region-2 was a clear outlier from the rest of the genome in terms of repetitive element content (Fig. 4a). In contrast, Region-1 was similar in repeat composition to the other 24 chromosomes in the genome, while Region-3 and Region-4 were similar to each other but distinct from the bulk of the genome and quite different from Region-2 (Fig. 4a).
Distribution of repetitive elements. a) Hierarchical clustering analysis of repetitive element composition in the 4 regions of Chr4 and in the rest of the zebrafish genome (Chr1–Chr3 plus Chr5–Chr25). b) PCA analysis of repetitive element composition shows that Region-2 of Chr4 is distinct from the rest of the genome. c) Genomic distribution of several repeat elements contributing to PC1 shows them to be clustered on Chr4R. d) Distribution of these repeats along Chr4 shows that they tend to concentrate in Region-2.
To further explore the distribution of repetitive elements, we conducted a principal component analysis (PCA) of repeat composition for the same 5 genomic regions. Results showed that PC1 and PC2 explained most of the variance (35.5 and 28.8%, respectively). Region-2 was clearly separated from the rest of the genome along the PC1 axis (Fig. 4b). Examination of the top 25 repeat elements populating PC1 (Supplementary Fig. 1) identified several repeat elements highly specific to Region-2 besides MOSAT-2, including DNA-2-8_DR, the LINE elements L1-2_DR, L1-3_DR, L1-4_DR, L1-5_DR, and L1-12 _DR, Copia4-I_DR, Gypsy164-I_DR, Gypsy16-LTR_DR, and Gypsy77-I_DR (Fig. 4, c and d). These results show that the uniqueness of Chr4R in terms of its repetitive element content is centered on Region-2.
We wanted to understand if Region-2 of Chr4 exhibited sex-specific transposon expression. To find out, we aligned the Nadia gonad RNA-seq dataset to the zebrafish genome and used Telescope (Bendall et al. 2019) to quantify transcript counts for each transposon locus. Results identified 20,509 differentially expressed TE insertions across the genome. Of these, 19,004 loci (92.7%) were upregulated in testis compared to ovary, while only 1,505 loci (7.3%) were upregulated in ovary relative to testis (Supplementary Table 7). These results are consistent with published results in medaka, which also found more TE expression in testis (Dechaud et al. 2021). Plotting differentially expressed TEs across the genome showed that they did not exhibit the ovary-specific downregulation in Region-2 that we observed for protein-coding genes (Supplementary Fig. 2).
Chr4 NLR and zinc finger genes exhibit different expression patterns
NLR and zinc finger genes occupy Chr4R, and we wondered whether expression of family members differed according to position along the chromosome. Using the Nadia dataset, we calculated the mean TPM for each NLR-family gene (IPR007111 NACHT domain) or zinc finger-family gene (IPR013087 C2H2 znf domain). A single Region-2 gene that was annotated with both domains (ENSDARG00000090160) was excluded. Comparing TPM values of zinc finger genes located in Region-2 vs Region-3 + Region-4 revealed significantly higher expression levels of znf genes located in Region-3 + Region-4 than those in Region-2 for both ovary (P = 2.20E^−30^, Wilcoxon rank sums test) and testis (P = 4.36E^−25^) (Fig. 5, a and b). This finding suggests that zinc finger gene family members in Region-3 + Region-4 have a regulatory mechanism distinct from those in Region-2, and thus, these genes might have a different function.
*Comparison of expression levels of zinc finger and NLR genes in Region-2 vs Region-3 + Region-4. a) In the ovary, znf genes in Region-2 were not expressed but some znf genes in Region-3 + Region-4 were expressed. b) In the testis, some znf genes in Region-2 were expressed weakly and many znf genes in Region-3 + Region-4 were expressed more strongly. c) In the ovary, NLR genes were nearly silent in both Region-2 and Region-3 + Region-4. d) In the testis, NLR genes in Region-2 were expressed about as much as NLR genes in Region-3 + Region-4. Significance levels: NS, not significant (P > 0.05); ***, significant at P < 0.0001.
Comparing TPM values for NLR genes in the heterochromatic Region-2 to NLR genes in the euchromatic portion of Chr4R (Region-3 + Region-4) identified no significant difference between these 2 regions in either ovary (P = 0.597) or testis (P = 0.078, Wilcoxon rank sums test), although expression of NLR genes in Region-2 and Region-3 + Region-4 was somewhat higher in testis than ovary (Fig. 5, a and b). These data further suggest that the Chr4R NLR genes may be regulated by mechanisms different from those regulating expression of their zinc finger gene neighbors. NLR gene silencing along the full-length of Chr4R in ovary suggests that the encoded NLR proteins may be deleterious to the function of either the ovary or the zygote if they were provided as maternal transcripts. In the 2 somatic organs we examined (liver and brain), the zinc finger genes and the NLR genes on Chr4R were significantly more highly expressed in Region-3 + Region-4 than in Region-2 (Supplementary Fig. 3), as expected from the distribution of heterochromatic and euchromatic portions on the chromosome.
Discussion
A chromosome domain with protein-coding genes silenced specifically in zebrafish ovaries
Well-differentiated sex chromosomes can have properties that are strikingly different from the rest of the genome (Schartl et al. 2016). Sex chromosomes can accumulate genes with sex-specific functions in addition to the sex locus, as in mammals and drosophila (Lahn and Page 1997; Zhang et al. 2020), although apparently not in birds (Xu and Zhou 2020). To explore the properties of sex chromosomes in zebrafish, we conducted RNA-seq experiments in the Nadia strain, which was not manipulated for utility in mutagenesis protocols, in contrast to the 2 commonly used “wild-type” strains AB and TU (Walker-Durchanek 1980; Streisinger et al. 1981; Mullins et al. 1994). NA and several other unmanipulated zebrafish strains have a ZZ/ZW sex chromosome system with Chr4R as the homomorphic sex chromosome (Anderson et al. 2006; Wilson et al. 2014). In the NA strain, all ZZ fish develop directly into males and ZW fish usually become females but some ZW fish develop as neomales, showing that the W is necessary, but not sufficient, for female development (Wilson et al. 2014, 2024; Valdivieso et al. 2022).
The RNA-seq experiments reported here defined several regions of Chr4 with respect to the relative expression of protein-coding genes in ZZ testis vs ZW ovary. The left arm Chr4L and the centromeric portion of the right arm Chr4R up to just right of the mir430 gene cluster (Region-1) were similar to the bulk of the genome, with somewhat testis-biased expression of protein-coding genes on average and a greater dynamic range in testis than ovary (Fig. 1). Chr4 Region-1 is euchromatic in cytogenetics (Pijnacker and Ferwerda 1995; Daga et al. 1996; Amores and Postlethwait 1999; Phillips et al. 2006) (Fig. 1c). Differential gene expression in the middle portion of the right arm (Region-2), however, was unique across the entire genome: the ratio of protein-coding gene expression in testis vs ovary was greatly biased toward testis with a reduced dynamic range compared to the rest of the genome (Fig. 1a). Region-2 also corresponded to the heterochromatic block on Chr4R (Fig. 1, b and c). To the right of Region-2, in Regions-3 and -4, relative expression ratios for testis vs ovary were again similar to those of autosomes and corresponded in cytogenetic position to a euchromatic region near the right telomere (Fig. 1, b and c). The euchromatic portion of Chr4 adjacent to the right telomere including Region-4 is also distinctive because many of the most strongly overexpressed genes in ovary vs testis are located there (Fig. 1b), and it also contains the major sex-determining locus in natural zebrafish strains (Wilson et al. 2014).
Chr4R contains the largest late-replicating and heterochromatic segment in the zebrafish genome detected in primary cultures of somatic cells (Pijnacker and Ferwerda 1995; Daga et al. 1996; Amores and Postlethwait 1999; Phillips et al. 2006), suggesting reduced transcriptional activity. Maintenance of heterochromatin relies on the histone demethylase Kdm2a (Frescas et al. 2008), and zebrafish larvae lacking kdm2aa activity have upregulated expression of protein-coding genes on Chr4R (Scahill et al. 2017), indicating a role of heterochromatin in inhibiting gene expression in this region. Gene silencing by heterochromatin in sex chromosomes can be a means of dosage compensation, as for Barr bodies in mammals (Barr and Bertram 1949; Ohno et al. 1959; Lyon 1961). The Chr4R heterochromatin block, however, is unlikely to act in dosage compensation for Region-2 genes because Region-2 is present on both the Z and the W in zebrafish and the genes in Region-2 were silenced in ovaries, are expressed in testes, and were downregulated relative to other parts of the genome in somatic organs in both sexes to about the same degree (Fig. 2).
Region-2 of Chr4 contains hundreds of copies of 2 protein-coding gene families with virtually no transcripts detected in ovaries but with positive, although low, transcript levels in testes, and in somatic organs (livers and brains), but without sex specificity in these somatic organs (Figs. 1 and 2). We assume that the lack of transcripts from these genes is due to the silencing of transcription but cannot formally rule out degradation of these specific transcripts exclusively in zebrafish ovaries. Many of these genes exhibit a burst of transcriptional activity in embryos at the maternal-to-zygotic transition (MZT; White et al. 2017). Although Region-2 is silenced for protein-coding genes, it actively expresses oocyte-specific genes for transcript splicing (U1 and U6 snRNAs) and maternal transcripts of RNAs necessary for translation of transcripts into proteins (5S rRNAs) (Locati, Pagano, Ensink et al. 2017), and it also possesses most of the tRNA genes in the genome (Anderson et al. 2012; Chan and Lowe 2016) (Fig. 3a).
To see if the anomalous transcription pattern of Chr4 Region-2 depends on the ZW/ZZ chromosomal sex mechanism, we tested ovaries and testes of AB-strain fish. AB is a domesticated lab line that is likely chromosomally WW due to its origin by gynogenesis (Walker-Durchanek 1980; Walker and Streisinger 1983) (see also https://zfin.org/action/genotype/view/ZDB-GENO-960809-7). Results verified testis-biased gene expression in gonads from domesticated AB zebrafish. This result from AB fish showed that testis-biased gene expression of Region-2 does not depend on the ZW/ZZ sex-determining mechanism, but rather on biological differences between ovary and testis.
To find whether male bias in Region-2 transcription is specific to the gonad or if it also appears in somatic organs, we tested male vs female brains and livers. Results showed that, relative to most of the genome, expression of Region-2 protein-coding genes was generally depressed in somatic organs like it was in the gonads, but that Region-2 expression depression was not sex-specific in somatic organs, in contrast to gonads (Fig. 2). We conclude that male vs female expression bias in Region-2 is specific to gonad development or function. It is possible that Region-2 genes are important for the functioning of testis but not ovary, or that expression of these genes in ovaries would inhibit oocyte differentiation, or that if these transcripts were maternally supplied in the egg, they would hinder zygote development.
Chr4R protein-coding genes
Protein-coding genes located on Chr4R are distinct from the rest of the genome. Chr4R in Ensembl Release 92 has 396 zinc finger genes (see also White et al. 2017) and 234 NLR genes (see also Stein et al. 2007, Howe et al. 2016, and White et al. 2017) that are distributed across Regions-2, -3, and -4 (Fig. 3). Transcripts of znf genes in Region-2 were reduced in our ovary samples due either to lack of transcription or, less likely, targeted transcript degradation. In contrast, transcripts of similar znf genes in Regions-3 and -4 did appear in ovaries (Fig. 5a). These results show first that it is not the whole of Chr4R that is anomalous with regard to transcriptional regulation, but just Region-2, and second that control mechanisms in the heterochromatic (Region-2) and euchromatic (Region-3 + Region-4) portions of Chr4R are likely different. In contrast to ovaries, transcripts from znf genes in Region-2 did appear in testes, but even in testes, more transcripts were found from znf genes located in Region-3 + Region-4 than in Region-2 (Fig. 5b). This result showed that transcripts for znf genes located in the heterochromatic portion of Chr4R were less abundant than transcripts for znf genes found in the euchromatic portion of Chr4R in both ovary and testis.
The znf genes in Region-2 are interesting not only because of their ovary-specific silencing but also because of their pattern of expression in zebrafish embryos. These Chr4R znf genes initiate expression in concert at the beginning of zygotic transcription, increasing from the 1,000-cell stage [3.0 hours post-fertilization (hpf)] to dome stage (4.3 hpf) and then decreasing by the 75% epiboly stage (8.0 hpf) (Kimmel et al. 1995; White et al. 2017). Thus, the Region-2 znf genes (1) are specifically silenced in ovaries; (2) are expressed at low levels in testes, brains, and livers; (3) are transcribed early in the MZT, and then (4) their transcripts are rapidly degraded. Furthermore, zebrafish oocytes lacking function of kdm2aa, which is necessary for normal heterochromatin function, overexpress Chr4R znf genes, and when mated to wild-type sperm, produce defective eggs and many inviable embryos (Scahill et al. 2017). These data suggest the hypothesis that the silencing of Chr4R znf genes in oocytes occurs either because these Znf proteins are harmful to oocyte development (most kdm2aa mutants develop as males, a phenotype consistent with oocyte death in the juvenile ovary (Rodriguez-Mari et al. 2010)) or detrimental to embryos if these proteins were present before the normal schedule of zygotic gene expression (Scahill et al. 2017). The difference in expression of the Region-2 znf genes and the more telomeric Region-3 and Region-4 znf genes suggests that the telomeric zinc finger proteins may have a different regulatory mechanism and function.
The NLR genes in Region-2, like the znf genes, were silenced in ovaries, but unlike znf genes, NLR genes in Regions-3 and -4 were also silent in ovaries, and even for testes, the difference in expression levels of NLR genes in Region-2 compared to Regions-3 and -4 was not significant (Fig. 5, c and d). This result suggests that NLR proteins are unnecessary or deleterious in ovaries. In brain and liver, expression of NLR genes in Region-2 was significantly less than in Regions-3 and -4, consistent with the heterochromatic inhibition of gene expression in Region-2 in somatic cells (Pijnacker and Ferwerda 1995; Daga et al. 1996; Gornung et al. 1997; Amores and Postlethwait 1999; Sola and Gornung 2001; Traut and Winking 2001; Phillips et al. 2006). Expression of at least 148 of the NLR genes on Chr4R was detected during embryogenesis, but they did not exhibit the same expression pattern as the znf genes (White et al. 2017). Thus, although the NLR genes of Chr4R are downregulated in ovary vs testis, in general, they do not show the burst of transcription around the MZT that their znf neighbors show; therefore, despite their interspersed location along Chr4R (Fig. 3b), expression of these znf and NLR genes must depend on different regulatory mechanisms.
Mucin genes in Region-4 included some of the most strongly overexpressed genes in adult zebrafish ovaries vs testes in the entire genome (Fig. 1b). The most strongly overexpressed ovary-vs-testis genes included CU467646.3, a member of a tandemly expanded mucin-2-like cyprinid-specific gene family (see gene tree ENSGT00630000090221) that encodes oocyte-expressed chorion proteins (Hau et al. 2020). None of the ovary-biased protein-coding genes in Region-4 are related to any known genes in the sex-determination pathway in other species. We doubt that eggshell proteins are primary sex-determining factors because these genes are not expressed in ZW gonads until oocytes develop to Selman Stage IB (Selman et al. 1993) and gonads in ZZ individuals develop directly into testes without even forming Stage IA oocytes (Wilson et al. 2024). We conclude that the egg chorion genes located in Region-4 likely represent another case of sex-specific genes captured on a sex chromosome.
Region-2 and mir430
A cluster of more than 50 mir430 genes lies adjacent to the left border of Region-2 (Nudelman et al. 2018; Desvignes et al. 2022). MicroRNAs can modulate gene expression by binding to messenger RNAs to regulate their translation or degradation (Fabian et al. 2010; Huntzinger and Izaurralde 2011). Mature products of mir430 genes regulate the clearance of maternal mRNAs from zebrafish embryos by causing deadenylation, and hence transcript decay, but they are not supplied to the egg during oogenesis (Giraldez et al. 2005, 2006; Lund et al. 2009; Takacs and Giraldez 2016). Instead, mir430 gene expression in zebrafish begins after the 64-cell stage (2 hpf) concomitant with the recruitment of the cohesin subunit Rad21 (Meier et al. 2018); mir430 expression peaks at 4 hpf, and then decreases after 24 hpf (Chen et al. 2005; Heyn et al. 2014). The bulk of zygotic gene transcription begins mostly after the 1,000-cell stage at 3 hpf as zygotic transcripts replace maternal transcripts (Kane and Kimmel 1993; Lee et al. 2014; White et al. 2017; Wells et al. 2023). Within the nuclei of early zebrafish embryos, mir430 genes form a transcription compartment with their neighboring znf genes (Hadzhiev et al. 2019). The regulatory relationships of mir430 and znf loci in Region-2 that silence them in ovaries and then activate them around the MZT, however, are not yet fully known.
Chr4R Region-2: a maternal source for the zygote's translation machinery
Maternal ncRNA genes are clustered on Chr4R. Vertebrate eggs generally contain large numbers of ribosomes packed into oocytes that aid in translating maternal transcripts before zygotic genes become transcriptionally active (Brown and Gurdon 1964). Zebrafish have distinct maternal and somatic RNA genes (Locati, Pagano, Ensink et al. 2017; Locati, Pagano, Girard et al. 2017; Locati et al. 2018; Ortega-Recalde et al. 2019; Pagano, Dekker et al. 2020; Pagano, Locati et al. 2020). On zebrafish Chr4R, Region-2 has nearly all of the genome's maternal 5S rRNA genes (Fig. 3b), which are expressed in oocytes but not in somatic cells of adult fish, while a locus on Chr18 contains twelve 5S rRNA genes that, reciprocally, are expressed in somatic cells of embryonic and adult fish but not in oocytes (Gornung et al. 2000; Locati, Pagano, Ensink et al. 2017).
While each of the many 5S rRNA genes in eukaryotic genomes is transcribed into a single 5S rRNA molecule, each of the several 45S rRNA genes is transcribed as a single transcript that is processed to form the 18S, 5.8S, and 28S ribosome components (Long and Dawid 1980). rRNA gene expression was not specifically assayed in our analysis due to selection for polyadenylated transcripts and alignment to annotated protein-coding genes, but our data showed a single gene in Region-2 that was upregulated in ovary vs testis (ENSDARG00000115819, Fig. 1), and sequence analysis showed that ENSDARG00000115819 is an 18S ribosomal gene that was misannotated as a protein-coding gene in Ensembl. This result shows that Region-2 has at least one 18S gene that is expressed maternally. In addition, three 45S genes misannotated as protein-coding genes (ENSDARG00000112275, ENSDARG00000113671, ENSDARG00000115978) in Chr4 Region-4 were overexpressed in ovaries vs testes in our dataset. These 45S genes are expressed exclusively in oocytes and encode rRNAs for maternally supplied ribosomes (Locati, Pagano, Girard et al. 2017; Ortega-Recalde et al. 2019; Tao et al. 2020). These maternal ribosomes also have a specific pattern of 2′-O-methylation different from somatic ribosomes (Ramachandran et al. 2020). Thus, Region-2 contains at least 1 maternal 18S gene and Region-4 contains at least 3 other maternal 18S RNA-encoding 45S rRNA genes, verifying the roles of Region-2 and other parts of Chr4R in supplying maternal ribosome components, while genes on other chromosomes provide somatic/zygotic ribosomal RNAs (Locati, Pagano, Girard et al. 2017; Tao et al. 2020).
Maternal spliceosome components also occupy Region-2 (Anderson et al. 2012). Genes encoding U1 and U6 spliceosome components that are specifically transcribed in oocytes are in Region-2, making these snRNAs available for splicing either maternal messages in the oocyte or the earliest transcripts in the embryo (Pagano, Dekker et al. 2020). The location of maternal 5S and spliceosome component genes in Region-2 on the zebrafish sex chromosome represent another example of sex-specific functions encoded on the sex chromosome.
Transfer RNA genes are also mostly concentrated in Region-2 of Chr4 (Anderson et al. 2012; Chan and Lowe 2016) (Fig. 3), although tRNAs have not yet been studied with respect to maternal vs zygotic/somatic cell expression. tRNA genes are also found elsewhere in the genome, so our hypothesis is that the Region-2 tRNA genes are maternal-specific and thus contribute to specialized maternally derived translation machinery for the embryo along with Region-2 ribosomal and spliceosomal genes.
Ch4R transposable elements
Chr4R carries a distinctive array of repetitive elements (Anderson et al. 2012; Howe et al. 2013; Chang et al. 2022), but we found that the overall pattern of differential expression of transposable elements in testis vs ovary was not different in Region-2 from the rest of the genome. Thus, for the gonads at least, the heterochromatic nature of Region-2 does not block the expression of several types of genes, including transposable element genes, maternal 5S rRNA genes, and maternal U1 and U6 snRNA genes.
A maternal-to-zygotic-transition gene block
In zebrafish, chromosome-4 plays an important role in gonad development and early embryogenesis (Fig. 6). Region-4 near the right telomere of Chr4 contains 3 groups of genetic elements important for egg development: (1) a factor that is necessary but not sufficient for the bipotential gonad to develop into an ovary in natural zebrafish strains (Wilson et al. 2014, 2024); (2) mucin genes encoding egg chorion proteins (Fig. 1); and (3) several 45S ribosomal RNA genes that are expressed specifically in oocytes but not in somatic cells, which express a different set of 45S rRNA genes (Locati, Pagano, Girard et al. 2017; Ortega-Recalde et al. 2019; Tao et al. 2020). Chr4 Region-2 has genes involved in various ways in the production of the maternally inherited gene products that drive the progression of embryonic cell cleavage and the embryo's transition from maternally derived components to zygotically derived machinery. Maternally expressed 5S rRNA genes and spliceosome factors (U1, U6) lie in Region-2 (Fig. 6a), while somatic alternatives for these RNAs are located elsewhere in the genome (Anderson et al. 2012; Locati, Pagano, Ensink et al. 2017; Pagano, Dekker et al. 2020). Ovaries also express at least one 18S rRNA gene, which are usually a part of a 45S rRNA gene (Fig. 1), although most maternal 45S rRNA genes are in Region-4 (Ortega-Recalde et al. 2019). While ovaries express these Region-2 translation machinery genes (Locati, Pagano, Ensink et al. 2017; Pagano, Dekker et al. 2020), they do not express the interspersed Region-2 protein-coding genes (Figs. 1, 5, and 6a).
A maternal-to-zygotic-transition gene block. a) In the ovary, oocytes express maternal 5S rRNA (green vertical bars on the chromosome sketch) and spliceosome genes (U1, U6; blue and mauve) in Region-2 but not znf genes (yellow) in Region-2 or the adjacent mir430 genes (white) (Locati, Pagano, Ensink et al. 2017; Locati, Pagano, Girard et al. 2017; Locati et al. 2018; Pagano, Dekker et al. 2020; Pagano, Locati et al. 2020). The 5S ribosomal and the spliceosomal maternal transcripts help make the machinery to translate proteins essential for embryogenesis from maternal mRNAs before expression of the zygotic genome. Maternal 45S genes are in Region-4 (not indicated in the sketch) (Ortega-Recalde et al. 2019). b) Reciprocally, early embryos at the beginning of the maternal-to-zygotic transition do not express maternal 5S rRNA or maternal spliceosome (U1, U6) genes in Region-2, but do express Region-2 znf genes and the nearby mir430 genes. Znf proteins may help to stabilize the mobility of transposable elements (White et al. 2017; Hadzhiev et al. 2019; Wells et al. 2023) and the mir430 mature products promote clearance of maternal messenger RNAs (Giraldez et al. 2005; Takacs and Giraldez 2016; Giraldez et al. 2006; Lund et al. 2009). R1, R2, R3, and R4 represent Regions-1 to -4.
Expression patterns of Region-2 genes are reversed in ovaries vs embryos (Fig. 6b). In late cleavage at about the 1,000-cell stage 3.0 hpf, the rate of cleavage slows, the cell cycle lengthens, and cell divisions become asynchronous in the mid-blastula transition (Kane and Kimmel 1993). About a half hour earlier, however, several znf genes in Region-2 begin to be expressed and many burst into transcription from dome stage to 75% epiboly (4.3–8 hpf) (White et al. 2017; Wells et al. 2023) (Fig. 6b). At least some proteins encoded by these activated Region-2 znf genes appear to repress the expression of retroelements during ZGA (White et al. 2017; Wells et al. 2023), and without that suppression, early development goes awry. Autosomal copies of translation machinery genes, including snRNA genes, ribosomal RNA genes, and we predict tRNA genes, become active (Locati, Pagano, Ensink et al. 2017; Pagano, Dekker et al. 2020), replace the maternal translation machinery, and translate zygotic mRNAs as the embryo transitions to using its own genome. These changes are associated with the redistribution of the cohesin nuclear architecture subunit Rad21 specifically from Region-2 to euchromatic regions of the genome (Meier et al. 2018). At this stage, mir430 genes near Region-2 become active (Fig. 6b) and mir430 gene products mediate the decay of maternal mRNA transcripts (Giraldez et al. 2005, 2006; Bazzini et al. 2012). The mechanism for the switch in gene expression is unknown, but transposable elements, including L1 elements, some of which are enriched in Region-2, can make strong contributions to developmental cell-type gene expression (Lu et al. 2020) (Fig. 4, c and d). One hypothesis is that Region-2's unique repeat composition contributes to the reciprocal regulation of this region's protein-coding vs RNA genes in ovaries and early embryos.
In conclusion, the dramatic lack of ovarian transcripts for Chr4 Region-2 protein-coding genes discovered here coupled with the oocyte's expression of Region-2 maternal translation machinery genes and the subsequent reversal of these patterns at the time of zygotic gene activation shows that Chr4 Region-2 is a maternal-to-zygotic-transition gene block.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Abhiman S , Iyer LM, Aravind L. 2008. BEN: a novel domain in chromatin factors and DNA viral proteins. Bioinformatics. 24(4):458–461. doi:10.1093/bioinformatics/btn 007.18203771 PMC 2477736 · doi ↗ · pubmed ↗
- 2Aharon D , Marlow FL. 2021. Sexual determination in zebrafish. Cell Mol Life Sci. 79(1):8. doi:10.1007/s 00018-021-04066-4.34936027 PMC 11072476 · doi ↗ · pubmed ↗
- 3Amores A , Postlethwait JH. 1999. Banded chromosomes and the zebrafish karyotype. Methods Cell Biol. 60:323–338. doi:10.1016/S 0091-679X(08)61908-1.9891345 · doi ↗ · pubmed ↗
- 4Anand L , Rodriguez Lopez CM. 2022. Chromo Map: an R package for interactive visualization of multi-omics data and annotation of chromosomes. BMC Bioinformatics. 23(1):33. doi:10.1186/s 12859-021-04556-z.35016614 PMC 8753883 · doi ↗ · pubmed ↗
- 5Anders S , Pyl PT, Huber W. 2015. HT Seq—a Python framework to work with high-throughput sequencing data. Bioinformatics. 31(2):166–169. doi:10.1093/bioinformatics/btu 638.25260700 PMC 4287950 · doi ↗ · pubmed ↗
- 6Anderson DM , Arredondo J, Hahn K, Valente G, Martin JF, Wilson-Rawls J, Rawls A. 2006. Mohawk is a novel homeobox gene expressed in the developing mouse embryo. Dev Dyn. 235(3):792–801. doi:10.1002/dvdy.20671.16408284 · doi ↗ · pubmed ↗
- 7Anderson JL , Rodriguez Mari A, Braasch I, Amores A, Hohenlohe P, Batzel P, Postlethwait JH. 2012. Multiple sex-associated regions and a putative sex chromosome in zebrafish revealed by RAD mapping and population genomics. Plos One. 7(7):e 40701. doi:10.1371/journal.pone.0040701.22792396 PMC 3392230 · doi ↗ · pubmed ↗
- 8Baldridge D , Wangler MF, Bowman AN, Yamamoto S, Schedl T, Pak SC, Postlethwait JH, Shin J, Solnica-Krezel L, Bellen HJ, et al 2021. Model organisms contribute to diagnosis and discovery in the undiagnosed diseases network: current state and a future vision. Orphanet J Rare Dis. 16(1):206. doi:10.1186/s 13023-021-01839-9.33962631 PMC 8103593 · doi ↗ · pubmed ↗
