Unique territorial and compartmental organization of chromosomes in the holocentric silkworm
José Gil, Emily Navarrete, Clio Hockens, Neil Chowdhury, Sameer Abraham, Gaétan Cornilleau, Elissa P Lei, Julien Mozziconacci, Edward J Banigan, Leah F Rosin, Leonid A Mirny, Héloïse Muller, Ines Anna Drinnenberg

TL;DR
The silkworm's genome has a unique 3D structure with a secluded compartment S, which is distinct from typical genome compartments and may involve new mechanisms of chromatin folding.
Contribution
Discovery of a novel chromatin compartment S in the silkworm genome, which is secluded and exhibits unique loop extrusion and epigenetic features.
Findings
The silkworm genome has a three-compartment structure including a secluded compartment S.
Compartment S is isolated from other genomic regions and shows high loop extrusion activity.
S domains are developmentally dynamic and correlate with gene expression changes.
Abstract
Hallmarks of multicellular eukaryotic genome organization are chromosome territories, compartments, and loop-extrusion-mediated structures, including TADs. However, these have mainly been observed in model organisms, and most eukaryotes remain unexplored. Using Hi-C in the silkworm Bombyx mori we discover a novel chromatin folding structure, compartment S, which is “secluded” from the rest of the chromosome. This compartment exhibits loop extrusion features and a unique genetic and epigenetic landscape, and it localizes towards the periphery of chromosome territories. While euchromatin and heterochromatin display preferential compartmental contacts, S domains are remarkably devoid of contacts with other regions, including with other S domains. In polymer simulations, this contact pattern can only be explained by high loop extrusion activity within compartment S, combined with low…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 10
Figure 11
Figure 12
Figure 13
Figure 14
Figure 15
Figure 16
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 17
Figure 18- —http://dx.doi.org/10.13039/100023581NSF | National Science Foundation Graduate Research Fellowship Program (GRFP)
- —http://dx.doi.org/10.13039/100000002HHS | National Institutes of Health (NIH)
- —http://dx.doi.org/10.13039/501100001665Agence Nationale de la Recherche (ANR)
- —http://dx.doi.org/10.13039/501100004794Centre National de la Recherche Scientifique (CNRS)
- —http://dx.doi.org/10.13039/501100002915Fondation pour la Recherche Médicale (FRM)
- —http://dx.doi.org/10.13039/501100010463Institut Curie (institut_curie)
- —http://dx.doi.org/10.13039/501100000781EC | European Research Council (ERC)
- —http://dx.doi.org/10.13039/100000062HHS | NIH | National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
- —http://dx.doi.org/10.13039/100009633HHS | NIH | Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD)
- —http://dx.doi.org/10.13039/501100017210CNRS | Institut des sciences biologiques (INSB)
- —http://dx.doi.org/10.13039/501100001677Institut National de la Santé et de la Recherche Médicale (Inserm)
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsChromosomal and Genetic Variations · Genomics and Chromatin Dynamics · Silk-based biomaterials and applications
Introduction
Three-dimensional (3D) genome organization regulates cellular physiology by influencing gene expression (Seitan et al, 2013; Hnisz et al, 2016; Lupiáñez et al, 2015; Brückner et al, 2023; Ogiyama et al, 2018; Kiefer et al, 2023, 2024; Misteli, 2020), DNA replication (Jackson and Pombo, 1998; Nakatani et al, 2024), and DNA repair (Piazza et al, 2021; Arnould et al, 2021), among other processes (Misteli, 2020; Dekker and Mirny, 2024; Peters, 2021; Batty and Gerlich, 2019; Ghosh and Meyer, 2021). This rich, multiscale spatial organization includes chromosome territories (CTs) (Lichter et al, 1988; Cremer and Cremer, 2001; Hildebrand et al, 2024), chromatin compartments (Lieberman-Aiden et al, 2009; Wang et al, 2016), and extrusion-mediated loops (Banigan and Mirny, 2020; Yatskevich et al, 2019; Fudenberg et al, 2016; Sanborn et al, 2015; Rao et al, 2017; Schwarzer et al, 2017; Gibcus et al, 2018; Goloborodko et al, 2016), each of which arises through a distinct physical mechanism (Dekker and Mirny, 2024; Ghosh and Meyer, 2021; Mirny et al, 2019). These mechanisms simultaneously act on chromatin, raising the possibility that they may interact with each other. Here, we study the genome folding of the silkworm Bombyx mori and uncover a novel interplay between two major mechanisms: compartmentalization and loop extrusion.
Chromatin compartmentalization is thought to arise from affinity-mediated phase separation of different types of chromatin. By this mechanism, similar types of chromatin (broadly, euchromatin and heterochromatin) may segregate into spatially distinct compartments (Dekker and Mirny, 2024; Mirny et al, 2019; Jost et al, 2014; Nuebler et al, 2018; Falk et al, 2019; Solovei et al, 2016). In Hi-C and other chromatin contact maps, this effect manifests as a distinctive checkerboarding pattern reflecting preferential interactions between segments of like chromatin state. Studies suggest that compartmentalization maintains epigenetic state, through maintaining its long-term memory (Murphy and Boettiger, 2024; Owen et al, 2023) and the co-regulation of epigenetic states between distal loci (Kraft et al, 2022).
Loop extrusion is driven by molecular motors known as the structural maintenance of chromosome (SMC) complexes, including cohesin and condensin. These motors bind to chromatin and extrude it into progressively enlarging loops. Cohesin-mediated extrusion is stalled by barriers, such as CTCF in vertebrates. This process forms TADs in Hi-C maps, which are insulated domains of enriched contacts demarcated by dots and stripes (Fudenberg et al, 2016; Sanborn et al, 2015; Rao et al, 2017; Schwarzer et al, 2017; Nora et al, 2012; Dixon et al, 2012; Rao et al, 2014; Vian et al, 2018). The interphase genome of vertebrates is typically fully partitioned into TADs, reflecting a relatively uniform distribution of cohesin across the genome (Yatskevich et al, 2019; Fudenberg et al, 2016; Sanborn et al, 2015; Rao et al, 2017; Schwarzer et al, 2017; Wutz et al, 2017; Nora et al, 2017). However, in certain developmental contexts, cohesin can be recruited to specific loci, forming a distinct Hi-C contact pattern, termed fountains/jets/plumes (Guo et al, 2022; preprint: Galitsyna et al, 2023; Kim et al, 2025; Shao et al, 2024; Wike et al, 2021; preprint: Isiaka et al, 2023). Thus, loop extrusion activity may generate different structures depending on cohesin loading patterns and the presence of extrusion barriers in the surrounding chromatin.
Studies of extrusion in vertebrate systems have often found an antagonistic relationship between loop extrusion and compartmentalization (Rao et al, 2017; Schwarzer et al, 2017; Gassler et al, 2017; Wutz et al, 2017; Haarhuis et al, 2017). Suppressing cohesin activity removes chromatin loops and thereby weakens TADs, while also strengthening compartments (Rao et al, 2017; Schwarzer et al, 2017; Gassler et al, 2017; Wutz et al, 2017). Increasing cohesin’s residence time or extrusion speed on chromatin lengthens loops, strengthens TADs, and weakens compartments (Wutz et al, 2017; Gassler et al, 2017; Haarhuis et al, 2017; preprint: Wutz et al, 2025; preprint: Shah et al, 2025). Polymer modeling reveals this antagonism may be caused by cohesin mixing different types of compartments and disrupting interactions between compartments of the same type (Nuebler et al, 2018; preprint: Wutz et al, 2025). However, recent studies have demonstrated that loop extrusion is not solely antagonistic to compartmentalization, as stabilized extrusion enhances microcompartments in simulations of mitotic chromosomes (Goel et al, 2025). Thus, the effect of loop extrusion on compartmentalization may depend on the precise activity of loop extrusion in a given context.
While many studies focus on loop extrusion in vertebrates, loop extrusion is well conserved across the kingdoms of life. Indeed, many important insights into vertebrate loop extrusion activity were first established in other organisms. For example, observations of SMC complexes in yeast (Lengronne et al, 2004; Gullerova and Proudfoot, 2008; Glynn et al, 2004; Bausch et al, 2007) and bacteria (Tran et al, 2017; Brandão et al, 2019) were foundational for subsequent studies explaining the nature of SMC-RNA Polymerase interactions in mammals (Busslinger et al, 2017; Banigan et al, 2023). Similarly, targeted loading of SMC complexes to specific loci in bacteria (Le et al, 2013; Marbouty et al, 2015; Wang et al, 2015, 2017) preceded the discovery of fountains/jets/plumes in vertebrate cells (Guo et al, 2022; preprint: Galitsyna et al, 2023; Wike et al, 2021). Therefore, studying diverse organisms beyond vertebrates can reveal new principles of loop extrusion and genome organization.
While a recent study of 3D genomics compared a diverse set of organisms, it excluded species with radically different linear chromosome organization, such as species with holocentric chromosomes (Hoencamp et al, 2021). These have centromeric function distributed along the entire length of each chromosome, unlike most eukaryotes that have a single centromeric region per chromosome. Few studies have investigated the 3D genome organization of holocentric organisms, with the notable exception of C. elegans (Crane et al, 2015; Gabdank et al, 2016; Rowley et al, 2020; Das et al, 2024). Thus, studies of such organisms could expand our understanding of the principles of genome architecture.
To this end, we characterize the genome architecture of the holocentric silkworm B. mori, which evolved holocentricity convergently with C. elegans. By combining Hi-C, epigenetic and transcriptomic profiling, polymer modeling, and Oligopaint FISH, we find a unique genome organization in B. mori. This organism exhibits a high degree of chromosome territoriality, three epigenetically distinct compartments, and a nonuniform pattern of loop extrusion activity. Our findings indicate that the B. mori genome is an extreme instance of loop extrusion interfering with compartmentalization. We uncover novel, secluded domains that are strikingly depleted of chromosome-wide compartment contacts, which our analysis and modeling suggest are established by targeted loop extrusion to these loci. This unique chromatin structure appears to rely on the interplay of extrusion and compartmentalization and does not arise by either mechanism alone in our simulations. Our study thus demonstrates how novel chromatin structures can emerge from new combinations of conserved genome folding mechanisms.
Results
Bombyx mori linear genome organization
The B. mori genome encompasses 27 autosomes, the Z sex chromosome (Chr01), and the W sex chromosome, which vary in length from approximately 8 to 22 Mb. Genome-wide analysis of GC content revealed a tripartite organization of B. mori chromosomes, with a central GC-poor region and two large telomere-proximal GC-rich regions (Fig. 1A). This linear organization is reminiscent of that of the nematode C. elegans (Fig. 1B), another holocentric organism whose chromosomes are divided into centers and arms (C. elegans Sequencing Consortium, 1998). We thus used our GC content track along B. mori chromosomes to define arms and center regions for each chromosome (Table EV1). In C. elegans, chromosomal arms are enriched in repetitive DNA, while centers are enriched in genes (C. elegans Sequencing Consortium, 1998) (Fig. 1B). In B. mori, we found a similar organization with significant, but less distinct, patterns of transposable elements (TEs) and gene coverage (Fig. 1B). The similarity in linear genome organization between these two species might reflect convergent evolution between the two holocentric lineages that evolved independently from different monocentric ancestors (Melters et al, 2012; Senaratne et al, 2022).Figure 1. Hi-C of Bombyx mori embryos reveals highly distinct chromosome territories.(A) Schematics of B. mori chromosomes drawn to approximate scale indicated in Mb below. For each chromosome, total GC content in percentage is noted, and local GC content per 100 kb window is indicated as a color scale from blue (< 20%) to red (> 60%). A suffix “c” added to a chromosome’s name indicates assembly corrections were made (Fig. EV9). (B) Box plots showing the distribution of GC content, TE coverage, and gene density per 100 kb window in center and arm regions of autosomes in C. elegans and B. mori. Statistical significance was tested using a Mann–Whitney U-test, **P value < 0.05, ***P value < 0.005. P values for GC content in C. elegans: 1.08 × 10^−8^; in B. mori: 0. P values for TE coverage in C. elegans: 0; in B. mori: 0. P values for gene coverage in C. elegans: 1.63 × 10^−10^; in B. mori: 4.92 × 10^−2^. Boxed regions correspond to data between the first and third quartiles. Lines indicate medians of respective distributions, while crosses correspond to their means. Whiskers extend to the lowest and highest data points, excluding outliers, which are shown as dots. (C) Sample generation for Hi-C and ChIP-Seq. (D) Contact map at 80 kb bin resolution of five selected B. mori chromosomes (on top). (E) Average inter-chromosomal (trans) ^observed^/expected matrix for all scaled B. mori chromosomes, computed at 40 kb bin resolution. (F) Heatmap illustrating ^observed^/expected inter-chromosomal contact frequencies as a divergent color scale from blue to red. Chromosomes have been clustered and ordered to reflect similar contact patterns. Source data are available online for this figure.
Chromosomes in B. mori embryos form highly segregated territories
We generated Hi-C datasets from whole B. mori embryos at three different embryonic post-diapause timepoints (2-, 24-, and 48-h after diapause release) and one adult stage from the p50 reference strain. For most of our analysis, we focused on the 24-h post-diapause timepoint (PD-D2) (Fig. 1C), for which we confirmed that a proportion of cells have re-entered the cell cycle and are thus no longer arrested in G2 (Fig. EV1) (Nakagaki et al, 1991).
The PD-D2 Hi-C contact pattern across all 28 chromosomes revealed very sparse inter-chromosomal contacts (Fig. 1D). This is consistent with recent DNA FISH data that revealed that B. mori chromosomes are tightly folded and occupy distinct CTs (Rosin et al, 2022). Despite the similarity in linear genome organization between B. mori and C. elegans, clustering between centers and arms of different chromosomes, which is easily seen on C. elegans contact maps (Crane et al, 2015; Gabdank et al, 2016), is not evident in B. mori. Nevertheless, the average trans contact matrix reveals enrichments between large sub-telomeric regions, indicating some degree of telomere or arm clustering between chromosomes in B. mori (Fig. 1E).
Despite the sparsity of inter-chromosomal contacts, we tested whether known patterns of sub-nuclear positioning of chromosomes seen in other organisms are present in B. mori. In humans, small, gene-rich chromosomes have been shown to preferentially interact with each other and localize more centrally within the nucleus (Lieberman-Aiden et al, 2009; Boyle et al, 2001; Croft et al, 1999; Tanabe et al, 2002). A length-dependent contact preference can also be identified among B. mori chromosomes, with a group of small chromosomes (Chr02, 28, 26, 20, and 16) having the highest average inter-chromosomal contact frequency (Figs. 1F and EV2A). However, the correlation between inter-chromosomal contacts and gene density is mild (Fig. EV2B). This might be due to the lower variation in length and gene content among B. mori chromosomes compared to human chromosomes. We also noticed a positive correlation between inter-chromosomal contacts and GC content (Fig. EV2C,D), but whether GC content contributes to observed contact preferences is unclear.
Our data reveal conserved principles in chromosome organization in B. mori, such as the formation of CTs and mild length-dependent inter-chromosomal contact preferences. In contrast to other organisms, however, inter-chromosomal contacts are not correlated with gene density, and B. mori chromosomes make very limited inter-chromosomal contacts overall, highlighting the remarkably strong CTs in this organism.
B. mori chromosomes are organized in three chromatin compartments
We next investigated compartmentalization of B. mori chromosomes. Many regions of the Hi-C maps displayed a characteristic checkerboard pattern of alternating regions of two types, displaying enriched homotypic (self-to-self) and depleted heterotypic (self-to-other) contacts (Fig. 2A, left and middle panels). This is indicative of compartmentalization as seen in all other studied animals (Misteli, 2020; Mirny et al, 2019; Pontvianne and Grob, 2020; Mirny and Dekker, 2022; Oudelaar and Higgs, 2021; Rowley et al, 2019; Rowley and Corces, 2018). However, unlike Hi-C maps from any other studied animal, we unexpectedly found large chromatin segments that do not display checkerboard patterning. Instead, these regions engage in a high degree of short-range contacts but few long-range contacts (Fig. 2A, left and right panels). Therefore, these regions are atypical compared to previously described compartments that are composed of checkerboarding domains that preferentially interact with one another chromosome-wide (Lieberman-Aiden et al, 2009; Dekker and Mirny, 2024; Mirny et al, 2019; Wang et al, 2016). In addition, these regions also appear to be different from close to diagonal, locally defined structures like TADs, which are described in vertebrates and, unlike compartments, do not show checkerboarding.Figure 2. Hi-C compartment analysis reveals a novel and secluded chromatin folding structure.(A) Left: heatmap above the diagonal represents the iterative-corrected Hi-C contact map of Chr21 at 40 kb bin resolution, and heatmap below the diagonal is the corresponding Pearson correlation matrix. Below the matrix are the gene track (black), colored boxes indicating the locations of domains of compartment A (orange), B (turquoise), and S (purple) and tracks of the first three principal components (gray) computed from the correlation matrix. Right: close-up on regions highlighted on the left, I: checkerboarding region, II: non-checkerboarding region. Below matrices are tracks for compartments and genes as in the left as well as corresponding RNA-seq at 1 kb resolution and ChIP-Seq tracks for various histone marks at 25 bp resolution. (B) Example of clustering along PC1 and PC2 of the Pearson correlation matrix for Chr19c. Clusters called using the k-means function of scikit-learn are shown in different colors. For each cluster, the centroid and the corresponding assignment to A, B, and S compartments is indicated. (C) Heatmap showing genome-wide enrichment of assayed histone marks across compartments A, B, and S. Enrichment values correspond to median of IP/input ratios at 40 kb resolution normalized to the genome-wide median for each mark. (D) Box plots showing the size distribution of A, B, and S domains. Boxed regions correspond to data between the first and third quartile. Lines indicate the medians of respective distributions, while crosses correspond to their means. Whiskers extend to the lowest and highest data points, excluding outliers, shown by dots. (E) Bar graph showing relative genomic coverage of A, B, and S. (F) Box plots showing distribution of gene coverage per 40 kb window of the whole genome (n = 11,143) or within A (n = 3294), B (n = 5772), and S (n = 1670). Line indicates median of distribution while cross corresponds to mean. Whiskers extend to lowest and highest data points, excluding outliers, which are shown as dots. All distributions in each compartment for all features are significantly different from each other and from the genome-wide distribution by the Mann–Whitney (95%, two-tailed) U-test. (G) Box plots showing range of gene expression (in TPM) whole genome (n = 13,869) or in A (n = 7404), B (n = 4500), and S (n = 608). Boxed regions correspond to data between the first and third quartile. Lines indicate medians of respective distributions, while crosses correspond to their means. Whiskers extend to the lowest and highest data points, excluding outliers, which have been removed. (H) Average cis off-diagonal (inter-domain) ^observed^/expected contact enrichment plots within and between all rescaled A, B, and S compartments. (I) Rescaled average cis on-diagonal (intra-domain) ^observed^/expected contact enrichments for the three compartment types. (J) Top: Contact probability P(s) as a function of genomic distance, s, for the PD-D2 Hi-C data. Bottom: Log-derivative of contact probability as a function of genomic distance. Curves represent the average P(s) of either chromosome-wide (gray) or for contiguous segments of a given compartment type (colored). Estimates of the average loop size in S (ℓS) and in B (ℓB), corresponding to the peaks in the log-derivative curves, are labeled on the x-axis. (K) Regions of the PD-D2 Hi-C maps displaying characteristic features of loop extrusion at 5 kb resolution, with compartment and gene tracks below. Source data are available online for this figure.
To annotate chromosomal loci engaged in this novel chromosome-wide contact pattern as well as those displaying checkerboard patterning, we clustered genomic regions along B. mori chromosomes according to their intra-chromosomal contact patterns. Following a previously developed approach to identify compartments based on their chromosome-wide contact patterns (Spracklin et al, 2023), we k-means clustered the leading principal components of the Pearson-correlated contact matrices, revealing groups of loci with similar chromosome-wide contact profiles (Fig. 2A,B; Table EV2). This strategy enabled us to define three main clusters along each chromosome, which we then unified across the genome based on their epigenetic composition (Methods).
We profiled active (H3K36me3 and H3K4me3) and silent (H3K27me3 and H3K9me3) histone marks by ChIP-seq in the PD-D2 embryonic stage of B. mori. We also included H2A.Z, a histone variant associated with transcriptional control (Giaimo et al, 2019), which is enriched in a sub-compartment that shows an attenuated checkerboarding pattern in the cancer cell line HCT116 (Spracklin et al, 2023). Additionally, we included H4K20me1, a mark associated with centromeric nucleosomes in vertebrates (Hori et al, 2014), as well as a variety of processes including transcriptional regulation, chromosome replication and segregation, DNA damage response, and chromosome compaction (Beck et al, 2012; Meyer, 2022). Consistent with the compartmentalization of euchromatin and heterochromatin observed in other animals, we found that the two checkerboarding clusters were enriched either in active or inactive histone marks (Figs. 2A,C and EV3A–C) and thus were named compartment A or B accordingly. The third cluster captured the novel, non-checkerboarding regions. We termed this cluster “compartment S,” reflecting its secluded, spatially segregated behavior observed in the Hi-C maps. Compartment S displayed an epigenetic signature distinct from both A and B (Figs. 2C and EV3A,B). Similar to B but unlike A, S is enriched in the repressive mark H3K27me3 and depleted of active histone marks (H3K4me3 and H3K36me3) and H2AZ. Conversely, similar to A but unlike B, S is, on average, enriched in H4K20me1 although this enrichment does not extend across the entire S chromatin (Fig. EV3A). Thus, S forms a distinct cluster along the H3K36me3 and H4K20me1 axes (Fig. EV3B) underscoring its unique chromatin state.
Domains, which we define as contiguous segments of the same compartment type, of each compartment are heterogeneous in length and cover different fractions of the genome (Fig. 2D,E). Overall, A and S domains are smaller, with a median size of 80 kb, while B domains are generally larger, with a median size of 120 kb. A and B compartments cover ~50 and 30% of the genome, respectively, while S covers 15%, with 5% of the genome not assigned to any of the three clusters. Consistent with our description of B. mori linear genome organization, we found domains of compartment A are also enriched along chromosome centers, while those of compartment B are enriched along the arms (Fig. EV3E). S domains are distributed throughout the genome, lacking preferential positioning at either chromosome centers or arms (Fig. EV3D,E).
We next investigated whether compartment S is distinct from compartments A and B with respect to genetic features. We found that compartment S, like compartment A, has a lower GC and transposable element content compared to the whole genome (Fig. EV3F). Despite these similarities to A, compartment S corresponds to gene-poor regions and is even more depleted in genes than compartment B (Fig. 2F). However, unlike B, S does not appear to be a repressive compartment; the expression levels of the few genes that are located within S are in the range of the genome-wide distribution. Out of the 608 genes within S, about half (300) are expressed at the PD-D2 stage (TPM > 10) (Fig. 2G; Dataset EV1). Gene ontology analyses revealed genes within S are enriched in DNA binding and transcription regulation processes (Table 1). Given that one of the largest S domains in the genome corresponds to the Hox cluster, which is expressed in the PD-D2 stage (Fig. EV4), we tested for an enrichment of homeotic and Polycomb group (PcG) response genes. Using previously published datasets (Chai et al, 2008; Li et al, 2012), we generated a list of 399 homeotic and PcG response genes in B. mori and assigned them to A, B, or S (Dataset EV2, Methods). We did not find an enrichment of homeotic or PcG response genes in compartment S but rather in A (Dataset EV2).Table 1. ShinyGO molecular function enrichment of B. mori genes in S.PathwayGO IDEnrichment FDRNumber of genes in categoryTotal number of pathway genesFold enrichmentDNA bindingGO:00036776.10 × 10^−23^665924.45Nucleic acid bindingGO:00036767.43 × 10^−14^8914612.43DNA binding transcription factor activityGO:00037003.18 × 10^−13^312205.63Transcription regulator activityGO:01401105.13 × 10^−12^322624.88Sequence-specific DNA bindingGO:00435657.10 × 10^−08^191395.46DNA binding transcription factor activity, RNA polymerase II-specificGO:00009812.811 × 10^−06^13806.49Nuclear receptor activityGO:00048794.05 × 10^-03^41213.32Ligand-activated transcription factor activityGO:00985314.05 × 10^−03^41213.32Steroid hormone receptor activityGO:00037071.22 × 10^−02^4169.99Signaling receptor bindingGO:00051022.77 × 10^−02^7674.17
We conclude that B. mori chromosomes segregate into three chromatin compartments. In addition to having the well-conserved compartments A and B*, B. mori* chromosomes have a third compartment type, compartment S. Compartment S exhibits a previously undescribed contact pattern, which, unlike A and B, does not display checkerboard patterning, i.e., preferential contacts with other regions of the same type. Despite this difference, S constitutes a bona fide compartment as it (1) shows a unique pattern of long-range contacts and (2) was identified through the same procedures—clustering of whole-chromosome contact patterns—used to define new compartment types in other studies (Lieberman-Aiden et al, 2009; Rao et al, 2014; Spracklin et al, 2023; Xiong and Ma, 2019; Bonev et al, 2017). In contrast to local structures, such as TADs, dots and jets/fountains, S is distinct in that it exhibits chromosome-wide contact patterns that differ from those of A and B compartments. S exhibits a unique epigenetic landscape and covers about one-sixth of the B. mori genome at the PD-D2 timepoint. While S largely encompasses gene-poor regions, many genes located within S are transcriptionally active.
Compartment S exhibits novel contact patterning and key evidence of loop extrusion activity
To quantify the contact patterns for compartments A, B, and S, we calculated the average compartment contact strengths for all domains on the same chromosome. For A and B, we found enriched homotypic (A-A and B-B) and depleted heterotypic (A-B) contacts, confirming that A and B are conventionally compartmentalized (Fig. 2H). Further, we found that domains of S were depleted in contacts with all other compartments. These formed smooth lanes of contact depletion, such that contacts between A-S, B-S, and S-S_inter_ (contacts between different S domains) were all depleted at similar strengths (Fig. 2H). In stark contrast to this finding, we found contacts for pairs of loci within the same, contiguous S domain (on-diagonal, S_intra_) were strongly enriched (Fig. 2I). To our knowledge, this unexpected discordance between on- and off- diagonal Hi-C contacts has not been observed in other organisms, and it raises the possibility that this pattern arises from a previously undescribed mechanism.
In vertebrates, cohesin extrudes loops resulting in domains of local contact enrichment known as TADs and decreased compartmentalization (Schwarzer et al, 2017; Nuebler et al, 2018; Wutz et al, 2017; Haarhuis et al, 2017). Given both the high contact enrichment within S domains (Fig. 2I) and their lack of checkerboard patterning (Fig. 2A,H), we hypothesized that S domains might represent regions of high loop extrusion activity. We therefore searched for evidence of loop extrusion activity in the PD-D2 Hi-C data. Key established indicators of this activity are (1) the shape of the contact frequency P(s) as a function of genomic separation s (Hildebrand et al, 2024; Schwarzer et al, 2017; Gassler et al, 2017; Samejima et al, 2025; Polovnikov et al, 2023) and (2) the presence of specific Hi-C patterns, such as insulated domains, dots, and stripes, which also require extrusion barriers (Fudenberg et al, 2016; Rao et al, 2017; Schwarzer et al, 2017; Vian et al, 2018; Nora et al, 2017; Fudenberg et al, 2017).
In the chromosome-wide P(s) curve (Fig. 2J), we observe a characteristic “shoulder” of increased contacts and a corresponding peak in the log-derivative of P(s), which are indicative of loop extrusion (Hildebrand et al, 2024; Schwarzer et al, 2017; Gassler et al, 2017; Samejima et al, 2025; Polovnikov et al, 2023). The peak in the log-derivative plot represents the average loop size (Gassler et al, 2017; Polovnikov et al, 2023), which in this case is ~40–60 kb. We next computed these contact frequency curves separately for continuous segments of each compartment type (Fig. 2J). Consistent with our hypothesis, the P(s) shoulder is most prominent for compartment S and indicates an average loop size of 40–60 kb. The P(s) curve in compartment B also suggests some loop extrusion activity, though extrusion in B is likely sparser with larger loops, as suggested by the smaller height and right-shift of the peak. Analysis of P(s) therefore indicates that loop extrusion is active in B. mori and enriched in S domains.
In further support of loop extrusion in B. mori, we found many genomic regions with features of barrier-restricted extrusion (dots and stripes). By visual inspection, we often found such features nested within each other inside and at the edges of S domains (Fig. 2K).
The Hi-C data strongly suggests that loop extrusion occurs on B. mori interphase chromosomes and that it is much more prominent in S domains. This preferential enrichment of extrusion differs from previous observations in vertebrates, which raises the mechanistic question of whether it is responsible for the nontrivial patterns of S contacts in B. mori.
Targeted loading of loop extruders to S domains can reproduce S compartment Hi-C features
Given the novel and stark contact pattern of compartment S (Fig. 3A), we employed polymer modeling to systematically study which mechanisms might underlie its formation. We sought to reproduce the following three Hi-C features that together make compartment S unique (Fig. 3A): (i) contacts are enriched within contiguous, on-diagonal S domains (locally enriched, S_intra_); (ii) each S domain is depleted in off-diagonal, compartment contacts with other S domains (distally depleted, S-S_inter_); (iii) the depletion of contacts between S domains and the rest of the chromosome is homogenous (smooth lanes of contact depletion, S-S_inter_ ≈ S-A ≈ S-B). These features are the hallmarks of compartment S and have either not been previously observed together or at all.Figure 3. Mechanistic polymer modeling reveals S chromatin organization can arise from targeted loop extrusion in S chromatin but not from other known genome folding mechanisms.(A) Hi-C map and sub-compartment annotations highlighting the three distinct features of S compartments: (i) on-diagonal contact enrichment within S domains, (ii) off-diagonal contact depletion between different S domains, and (iii) smooth depletion of contacts between S and the rest of the chromosome. Hi-C data corresponds to chromosome 22: 11,625,000–18,435,000 from PD-D2 samples mapped with distiller (see Methods). (B) Illustration of the approach to testing different mechanistic models of S compartment formation. Polymer segments are assigned to compartments A, B, or S, which specifies their pairwise affinities. One model without loop extrusion and two models with loop extrusion were considered. Simulations were evaluated based on whether their contact maps display features i, ii, and iii. Compartment strength was also measured. (C) Three-species phase separation modeling results. Left: example 3D polymer conformations and corresponding Hi-C maps of chromosome models with increasing phase separation driven by increasing both A-to-A and B-to-B affinities, ϵ_A-A_ and ϵ_B-B_ (from left to right). Right: scatter plot of ^observed^/expected contact enrichments of S features i and ii for all simulated A-to-A and B-to-B affinities. The purple boxed region reflects the broadest possible value range for S-like features. Data points corresponding to chromosome models shown on the left are circled in the scatter plot. (D) Phase separation and uniform loop extrusion with barriers modeling results. Left: Hi-C maps of models with increasing chromosome-wide separation, d, of loop extruders. Phase separation was induced in all maps on the left by setting ϵ_A-A_ = ϵ_B-B_ = 0.05 kT. Right: scatter plot of ^observed^/expected contact enrichments of S features i and ii for models varying in extruder separations, A-A, and B-B affinities, with points colored by compartment strength as in (C). For all models shown, processivity was fixed to 100 kb and S-to-S affinity, ϵ_S-S_, set to 0.0 kT. (E) Phase separation and loop extrusion with targeted loading to S modeling results. Left: Hi-C maps of models with increasing separation of loop extruders in A and B (d_A&B_), while extruder separation in S (d_S_) was fixed at 50 kb. Phase separation was induced in all maps on the left by setting ϵ_A-A_ = ϵ_B-B_ = 0.05 kT. Right: scatter plot of ^observed^/expected contact enrichments of S features i and ii for a range of A-to-A and B-to-B affinities and separations in A and B (d_A&B_), with d_S_ fixed to 50 kb. Data points sharing d_A&B_ (i.e., varying ϵ_A-A_ and ϵ_B-B_) were connected by line, with the value of d_A&B_ reflected in line thickness. The line corresponding to d_A&B_ = 50 kb reflects the uniform loading scenario. (F) Impact of S-targeted extruder loading on the smoothness of S compartment contact depletion (feature iii). S compartment smoothness was computed as the standard deviation of mean ^observed^/expected S-A, S-B, and S-S_inter_ contacts, with smaller y values corresponding to smoother compartment patterning. Data points with comparable compartment strengths (≈ 0.4) are circled and corresponding Hi-C maps are shown in the insets. Separation in S (d_S_) was fixed at 50 kb.
In addition to its unique loop extrusion activity, we found S exhibits a unique epigenetic composition (Fig. 2C), suggesting it may phase separate from compartments A and B. Because loop extrusion and phase separation independently drive chromatin folding, we asked which of these processes, either individually or in combination, are responsible for generating S features. We therefore tested three major classes of mechanistic models (Fig. 3B). We first tested models in which chromatin organization was solely driven by phase separation. We then tested models with uniform loop extrusion and extrusion barriers at the edges of S domains, akin to loop extrusion behavior that generates TADs in vertebrates. These first two classes represent mechanisms of genome folding described in other animals. Finally, we tested models in which loop-extruding complexes preferentially accumulate in S over A and B.
We performed polymer simulations for each class of model, using simplified (chromatin) polymer chains with eight equally sized segments of A, B, and S monomers. Monomer identity was used to define compartment-specific affinity (ϵ) and loop extrusion properties (Fig. 3B). For each class of model, we tested a wide range of A/B compartment strengths to recapitulate the A/B compartments observed in the experimental data.
Phase separation without loop extrusion
We investigated whether affinity-mediated phase separation, which drives the formation of chromatin compartments, could recapitulate the features of S. Experimental Hi-C maps revealed clear compartmentalization of A and B regions (i.e., A/B checkerboarding), whereas S lacked such compartment patterning. This led us to first ask whether S could emerge if A and B chromatin, but not S, exhibited self-affinity.
To test this in our polymer simulations, we systematically varied per-monomer homotypic affinities of A-to-A and B-to-B monomers to obtain a broad range of compartment strengths (Fig. 3C). As expected, applying A-to-A and B-to-B affinities produced the desired checkerboard patterning of A and B compartments, consistent with their phase separation. However, S compartmentsalso checkerboarded, meaning none of the simulated models were able to replicate the distal features of S (features ii and iii). While we observed a subset of models with on-diagonal enrichment of S_intra_ contacts (feature i), it was always accompanied by off-diagonal enrichment of S-S_inter_ contacts—contrary to the depletion required for feature ii.
To further understand this relationship, we quantified contact enrichment and depletion from ^observed^/expected Hi-C maps for each model (Methods). We assessed whether the Hi-C maps were consistent with the first two S features (S_intra_ and S-S_inter_) using the most lenient criteria consistent with features i and ii: enrichment was defined as an average ^observed^/expected contact strength greater than one, while depletion was defined as ^observed^/expected contact strength less than one. This analysis (Fig. 3C, right) confirmed that none of the phase separation models accurately captured the experimentally observed features of S. Enrichment of S_intra_ (feature i) and depletion of S-S_inter_ (feature ii) did not co-occur because contact enrichment within contiguous domains (S_intra_, feature i) was correlated with contact enrichment between S domains (S-S_inter_, opposite of feature ii). By quantifying compartment strength from each Hi-C map (color scale in Fig. 3C, right), we found that both S_intra_ and S-S_inter_ enrichment increased as chromatin compartmentalization strengthened. As evident in the simulated Hi-C maps, models with sufficiently high A-to-A and B-to-B affinities to produce A/B checkerboarding always resulted in S-S_inter_ enrichment and S checkerboarding (Figs. 3C and EV5A). Together with polymer conformations and the Hi-C maps (Fig. 3C, left), these results indicate that phase separation of A and B chromatin induces phase separation of S chromatin, even in the absence of S self-attraction.
To confirm that phase separation of S leads to these undesired, conventional compartment patterns in S (enriched S-S_inter_ and S checkerboarding), we introduced self-attraction to S monomers and repeated the A-to-A and B-to-B affinity sweep. As expected, S-to-S self-attraction resulted in increased S_intra_, S-S_inter_, and S checkerboarding (Fig. EV5B). We therefore propose that phase separation does not drive S-like organization.
Uniform loop extrusion with A and B phase separation
Informed by evidence of loop extrusion activity in the PD-D2 Hi-C maps (Fig. 2J,K), we tested whether loop extrusion could play a role in S compartment formation. Given that TAD formation via the loop extrusion process results in domain enrichment and compartment suppression (Schwarzer et al, 2017; Nuebler et al, 2018; Wutz et al, 2017; Haarhuis et al, 2017), we hypothesized that loop extrusion might promote S_intra_ contacts (feature i, domain enrichment) while suppressing both S-S_inter_ contacts and the checkering of S (features ii and iii, compartment suppression). We therefore tested whether S domains are akin to TADs in vertebrates, formed by uniformly loaded, genome-wide loop extruders that are stalled by barriers (Yatskevich et al, 2019; Fudenberg et al, 2016; Sanborn et al, 2015; Rao et al, 2017; Schwarzer et al, 2017; Wutz et al, 2017; Nora et al, 2017).
In these models, loop extruders randomly bind to the chromatin fiber and progressively extrude the chromatin polymer into loops. Extrusion proceeds until the extruder unbinds or stalls. Given the evidence for extrusion barriers in and at the edges of S domains (Fig. 2K), we placed loop-extrusion-stalling sites at the ends of S domains. We then varied the relevant simulation parameters: A and B homotypic affinities, mean separation between extruders (d), and extruder processivity (λ, the average loop size extruded by an unobstructed loop-extruder). We assumed S monomers lacked preferential attractions to other S monomers.
As evident in the resulting contact maps, these models cannot generate S compartments (Fig. 3D, left). We observed on-diagonal extrusion domains in S, with barrier features visually similar to those in the experimental Hi-C data (feature i). However, we did not observe off-diagonal contact depletion (feature ii). Like the phase-separation-only models, this class of models failed to achieve the desired compartment-level organization. Though loop extrusion counteracted compartmentalization, this antagonistic effect was not restricted to S; when compartment features (checkering off-diagonal contacts) were suppressed in S, they were likewise suppressed in A and B. Conversely, when compartment checkerboarding was present in A and B, it was also present in S.
Similar to the phase-separation-only models, upon quantifying Hi-C contact enrichments, we found that S_intra_ and S-S_inter_ contacts are highly correlated and are largely reflective of compartment strength (Fig. 3D, right). Consistent with the antagonism between loop extrusion and compartmentalization, stronger extrusion (smaller separations and/or larger processivities) demands higher affinities to achieve the same level of compartmentalization (Fig. EV5C). While two models narrowly satisfied the quantitative criteria for features i and ii, we find that their Hi-C maps visually lacked compartmentalization of any type (S-S_inter_ ≈ A-A ≈ B-B contacts, all low), as expected for the high density of extruders (d = 50 kb) (Fig. 3D, left). We conclude that the uniform loading model is unable to reliably generate S compartments in our simulations. Therefore, the presence of S features in experimental maps indicates that some aspect of the loop extrusion process in B. mori differs from that of vertebrates.
Loop extrusion with targeted loading in S and A/B phase separation
To preferentially suppress compartmentalization of S (off-diagonal S features ii and iii) while preserving compartmentalization of A and B, we hypothesized that loop extrusion may be targeted to S. We therefore tested a new class of models in which loop extruders were more likely to load in S than in A and B, resulting in lower mean separation of extruders in S (d_S_) than in A and B (d_A&B_). Our hypothesis of more extruders in S would lead to smaller loops in S, consistent with our observations from the Hi-C data (Fig. 2J).
Models with targeted loading to S reliably produced S-like features in their Hi-C maps, unlike the previous two classes of models. Weak extrusion in A and B allowed A/B compartments to form, while strong extrusion in S both compacted S domains (S_intra_ enriched, feature i) and inhibited S-S_inter_ compartment contacts (feature ii) (Fig. 3E, left). Additionally, Hi-C maps with weaker extrusion in A and B had smoother lanes of S depletion (feature iii).
Analyzing the Hi-C maps quantitatively showed that targeting loading of extruders to S alters the balance between S_intra_ and S-S_inter_ contacts (Fig. 3E, right). As extrusion in A and B weakened (d_A&B_ increased, increased relative targeting), both features i and ii began to look more S-like: S_intra_ enrichment strengthened (gain of feature i) and distal S-S_inter_ enrichment was lost (gain of feature ii). Therefore, unlike the phase-separation-only and uniform loop extrusion models, targeting extrusion to S both compacts S domains and secludes them from contacting each other.
Additionally, models with stronger targeted loading produced S features more robustly. First, decreasing extrusion in A and B (increasing d_A&B_) allowed features i and ii to co-exist across a broader range of compartment strengths (Fig. 3F). Furthermore, S features persist upon the introduction of S-to-S affinity comparable to those of A-to-A and B-to-B (Fig. EV5D).
We next measured feature iii as the standard deviation of average ^observed^/expected S-A, S-B, and S-S_inter_ contact frequencies. Smoother lanes of depletion (off-diagonal S contacts) result in smaller standard deviations. In the uniform loading model, once compartment features began to emerge (compartment strength > 0.1), the standard deviation increased (feature iii disappeared), indicating a less smooth pattern of inter-compartmental contacts. However, as extrusion in A and B decreased (more targeted loading, i.e., d_A&B_ increased), higher compartment scores could be tolerated before S smoothness was lost. We found that the gain in smoothness (small standard deviation) due to targeted loading is largely driven by the suppression of S-S_inter_ (feature ii) as well as additional contributions from A/B compartmentalization (Fig. EV5E).
Thus, simulations with preferential loading of loop extruders to S in combination with self-affinities of A and B reliably produce the patterns of B. mori Hi-C maps. The latter compartmentalize the genome, while loop extrusion targeted to S generates the unique features of S compartments.
The different features of S rely on distinct aspects of targeted extrusion and compartmentalization
To understand how these principles may generate the specific folding patterns of B. mori chromosomes, we developed a quantitative chromosome-scale model employing the principles from the above minimal model study. Focusing on a 6.5 Mb segment of chromosome 15 (Fig. 4A), we assigned compartment identities to the simulated chromosome based on those from the PD-D2 Hi-C maps. To compare models to experimental Hi-C, we computed ^observed^/expected contact enrichments between all types of compartments, partitioning S-S contacts into S-S_inter_ and S_intra_. Models varied in five parameters: A-to-A affinity, B-to-B affinity, extruder processivity (λ), the average separation between extruders inside S (d_S_), and the average separation between extruders outside of S (inside A and B regions, d_A&B_).Figure 4. Chromosome-scale polymer modeling reveals how different features of S arise.(A) Approach to generating chromosom-scale polymer models of B. mori. Locus corresponds to chromosome 15: 0–6,500,000 mapped with distiller (see Methods). Polymer segments are assigned to compartments A, B, or S based on sub-compartment assignments of the region. Simulations are performed for three different phase separation and/or extrusion models. ^Observed^/expected contact enrichments are computed for different intra- and inter-compartment interactions for comparison to the experimental Hi-C data. (B) Scatter plot of ^observed^/expected contact enrichments of S features i and ii from chromosome-scale polymer modeling of three-species phase separation and no loop extrusion over a range of different A-to-A and B-to-B affinities. The corresponding S_intra_ and S-S_inter_ values from chromosome 15: 0-6,500,000 of the PD-D2 Hi-C data are shown as a purple x. (C) Scatter plot of ^observed^/expected contact enrichments of S features i and ii from chromosome-scale polymer modeling of three-species phase separation and loop extrusion with uniform loading for two different separations, d = 19 or 190 kb, and two processivities, λ = 19 and 190 kb, over a range of A-to-A and B-to-B affinities and extrusion processivities. (D) Scatter plot of ^observed^/expected contact enrichments of S features i and ii from chromosome-scale polymer modeling of three-species phase separation and loop extrusion with targeted loading to S for two different d_S_ and d_S_/d_A&B_ ratios over a range of different affinities. (E) ^Observed^/expected contact enrichments of S_intra_ plotted as a function of relative abundance of loop extruders (d_A&B_/d_S_) for a series of chromosome models with λ, d_S_, and d_A&B_ varied. The experimental value of S_intra_ is shown as a purple line. A-to-A and B-to-B affinities were fixed at 0.12 and 0.04 kT, respectively. (F) ^Observed^/expected values of S-S_inter_ plotted as a function of separation between loop extruders in S (d_S_) for a series of chromosome models. Each curve represents a series of models, which share loop extruder processivity. All models lacked extrusion in A and B chromatin and shared homotypic attractions of ϵ_A-A_ = 0.12, ϵ_B-B_ = 0.04, and ϵ_S-S_ = 0.00 kT, to yield an A/B compartment strength similar to that observed in the experimentally generated PD-D2 Hi-C maps. The experimentally measured value of S-S_inter_ is shown as a purple line. (G) S compartment smoothness plotted as a function of the chromosome-wide compartment strength. Simulations varied in ϵ_A-A_, ϵ_B-B_, λ, d_S_, and d_A&B_. Smoothness was computed as the standard deviation of ^observed^/expected S-A, S-B, and S-S_inter_ contacts. The experimentally measured value of S smoothness from the PD-D2 maps is shown as a purple line. (H) Comparison of the experimental data for chromosome 15: 0-6,500,000 (mapped with distiller) versus the best polymer model (ϵ_A-A_ = 0.16 kT, ϵ_B-B_ = 0.08 kT, ϵ_S-S_ = 0.00 kT, λ = 55 kb, d_S_ = 19 kb, d_A&B_ = 190 kb). Left: Experimentally generated Hi-C (top, right half of the map) versus the in silico generated Hi-C map from the best model (bottom, left half of the map). Right: Summary statistics (^observed^/expected off-diagonal compartment and on-diagonal S_intra_ enrichments) for the experimentally generated data (left) and best model (right). (I) Parameter values for extruder properties (left) and homotypic A-to-A and B-to-B affinities (right) of the successful models, sorted by rank of their mean square error of their summary statistics compared to those of the experimental data. All models lacked S-to-S affinities.
We first reaffirmed our key findings from the minimal model that neither phase separation alone (Fig. 4B) nor with uniform extrusion (TAD-like) (Fig. 4C) can generate secluded S domains in our simulations. For phase separation-only models, we varied A and B homotypic affinities in the absence of loop extrusion, but this did not produce the desired combination of S_intra_ (feature i) and S-S_inter_ values (feature ii) (Fig. 4C). With uniform extrusion, we similarly could not reproduce features i and ii quantitatively, regardless of processivity (λ) and extruder density (d_S_) (Fig. 4C). For the targeted loading models, some, but not all, models quantitatively reproduced experimental values for features i and ii. We noticed that different aspects of extrusion dynamics influenced either feature i or feature ii (Fig. 4D). Specifically, the relative abundance of extruders in S versus A and B (d_A&B_/d_S_) appeared to have a strong effect on feature i. Likewise, the overall abundance of extruders in S (separation in S, d_S_) influenced feature ii (Fig. 4D, arrows). These observations prompted us to more deeply investigate how different parameters of our model control the different features of S.
We first asked how such strong domains of S_intra_ contacts (feature i) arise from targeted loading. We systematically varied processivity (λ), mean extruder separation in S (d_S_), and mean separation in A and B (d_A&B_) independently. Consistent with Fig. 4D, our simulations revealed that feature i is driven by the relative abundance of loop extruders in S (d_A&B_/d_S_) and is insensitive to processivity (Fig. 4E). For a given set of compartment affinities (A-to-A affinity = 0.12 kT, B-to-B affinity = 0.04 kT), achieving the desired level of relative compaction (S_intra_ = 1.78, feature i) required extruders to be approximately ten-fold more abundant in S compared to A and B (i.e., d_A&B_/d_S_ ≈ 10–15, Fig. 4E). Increasing A/B compartment strength allowed the relative abundance in S to decrease slightly to achieve the desired S_intra_ enrichment (d_A&B_/d_S_ ≈ 5–10, Fig. EV6A). This suggests that S compaction results from substantial recruitment of extruders to S, comparable to targeted cohesin loading in other contexts (preprint: Galitsyna et al, 2023).
We next uncovered that S-to-S compartment attenuation (feature ii) is predominantly controlled by the overall abundance of extruders in S (d_S_) (Fig. 4F). This contrasts with vertebrates, where extrusion modulates compartment strength by both extruder processivity (λ) and separation (d). Because S domains are small (median length of 75 kb for our simulated region), even modest extruder processivities (≥ 56 kb) generate high loop coverage in S (Fig. EV6B). Therefore, extruder abundance governs S compaction (Fig. EV6C) and in turn S-S_inter_ seclusion (Fig. 4F). To generate distal S-S_inter_ depletions similar to those measured in the Hi-C, the abundance of extruders in S must be high (small d_S_). Based on the modeled region, where the experimentally measured ^observed^/expected S-S_inter_ depletion (feature ii) is 0.7, we estimate extruder separation (d_S_) is 20–40 kb (Fig. 4F). A precise estimate depends on the strength of A/B compartmentalization, as stronger compartmentalization can promote S-S_inter_ contacts (i.e., loss of feature ii, Fig. EV6D).
We next sought to explain the most notable feature in B. mori Hi-C maps: S smoothness (feature iii). For smooth lanes of contact depletion, S must be similarly depleted in contacts with A, B, and other S domains. While S-S_inter_ depletion (Fig. 4F) is primarily controlled by extrusion in S (d_S_), loop extrusion in S has only a minor effect on S-A and S-B contacts (Fig. EV6E). Rather, S-A and S-B contacts are predominantly controlled by A and B phase separation: loss of S-A and S-B contacts reflect the demixing (phase separation) of A and B, respectively (Fig. EV6E). Therefore, to achieve similar levels of S-A and S-B contacts, the level of phase separation of A and B must be approximately equal (i.e., ^observed^/expected A-A contacts ≈ ^observed^/expected B-B contacts). Additionally, when A/B compartmentalization strengthens, smoothness is also lost because S-S_inter_ increases. Therefore, A/B compartmentalization must be low to maintain smoothness (Figs. 4G and EV6E,F).
In our simulations, we find B. mori chromatin organization arises from a unique interplay of multiple biophysical parameters, allowing us to estimate those parameters (d_S_, d_A&B_/d_S_, A-to-A, and B-to-B affinities) from the different Hi-C features of S (Fig. 4E–G). To develop a quantitatively accurate B. mori chromosomal model, we integrated our estimates and insights from Fig. 4E–G, sweeping over estimated extrusion and compartment parameters. Models with high levels of targeted extrusion (such as d_S_ = 19 kb, d_A&B_ = 190 kb; d_A&B_/d_S_ = 10) and equally modest A and B compartmentalization reproduced Hi-C maps qualitatively and ^observed^/expected enrichments quantitatively (Fig. 4H,I). Other successful models also had these characteristics (Fig. 4I). As predicted, any parameter outside of our estimated ranges (increasing d_S_, reducing d_A&B_, high or unequal A-A and B-B contact strength) systematically caused models to fail.
Our models reveal a previously uncharacterized interplay of loop extrusion and chromatin compartmentalization that can shape the 3D genome. The modeling, combined with our experimental Hi-C analysis, support the conclusion that loop extrusion in B. mori is targeted to S domains, which leads to higher S compaction and suppression of (off-diagonal) compartment contacts between S. Together, these mechanisms may generate the striking chromatin contact patterns that dictate 3D genome architecture. We next used our B. mori chromosomal model to make predictions on the spatial localization of A, B, and S compartments.
S domains are preferentially found on the surfaces of chromosome territories (CTs)
The spatial partitioning of active and inactive chromatin within the nucleus has previously been attributed to conventional compartmentalization (Wang et al, 2016; Falk et al, 2019; Solovei et al, 2016). We asked whether the seclusion of compartment S confers distinct spatial positioning using our best-performing model of chromosome 15 (Fig. 4H). By analyzing the radial positioning of A, B, and S, we found a strong preference for S domains to localize to the periphery of CTs (Figs. 5A,B and EV7A). Compartments A and B both showed preference for localizing to the center of the territory, with A being more centrally biased. When compared to an analogous model without extrusion, S was more interspersed throughout the territory (EV7A). Models with higher targeting of extrusion to S (no loading in A and B) showed even stronger radial preference of S. Therefore, our models predict that extrusion in S modulates its spatial localization, biasing it to the periphery of CTs.Figure 5. Compartment S is peripherally located within chromosome territories, with modeling indicating a role for loop extrusion in this positioning.(A) Example conformations for three models of chromosome 15, with the top row’s renderings reflecting the entire spherically confined simulation and the bottom row’s renderings representing cross sections through the center of the sphere. Each column represents a model introducing a new source of organization to the chromosome; (left) a rendering of a phase-separation-only model (ϵ_A-to-A_ = 0.16 kT, ϵ_B-to-B_ = 0.08 kT, ϵ_S-to-S_ = 0.00 kT); (middle) a rendering of the same compartment model with loop extrusion within S only (λ = 55 kb, d_S_ = 19 kb); (right) a rendering of the same compartment model as the far left but with different loop extruder separations in S versus A and B (λ = 55 kb, d_S_ = 19 kb, d_AB_ = 190 kb; the best model from the previous section). (B) Relative monomer densities for A, B, and S monomers for the three models detailed in (A). n = 150 independent CT conformations. The error bars represent standard deviations. (C) Four-color Oligopaint FISH labeling single A (red), B (green), and S (purple) domains as well as the corresponding CT (white). First column: Oligopaint labeling of domains, with white dashed lines indicating CT edges. Second column: Oligopaint labeling of domains merged with Oligopaint labeling of CTs. Third column: zoom-in views corresponding to boxes traced in column two. Fourth column: 3D rendering of zoomed CT from TANGO (Ollion et al, 2013). Microscope images are Z-projections of 10 Z stacks. The background in the CT channel acts as a proxy for the nuclear edge. (D) Shell analysis measuring compartment foci positions within their CTs for Chr04, 17, and 23. The location of Oligopaint FISH probes within the chromosome is shown above each plot. Dots indicate means of three biological replicates (different embryos, n > 250 nuclei). Error bars show standard error of the mean. P values were generated from unpaired t-tests (Kolmogorov–Smirnov tests) between distributions and are indicated at the bottom left of each graph. (E) Average ^observed^/expected trans (inter-chromosomal) contacts within and between all A, B, and S compartments. Source data are available online for this figure.
To test these findings, we used Oligopaint FISH to label portions of single A, B, or S domains as well as the whole CTs for Chr04, Chr17, and Chr23 in embryonic nuclei (Figs. 5C and EV7B,C). Shell analysis (Methods) revealed that S domains are more likely to occupy peripheral CT shells compared to A and B domains (Fig. 5D). In addition, measuring the distance from the domain center to the CT edge showed that S domains are closer to the CT edge than A or B domains for all chromosomes and loci analyzed (Fig. EV7D). These observations were confirmed by additional experiments with super-resolution microscopy (Fig. EV7E,F). Taking advantage of the fact that on Chr04 we labeled two S, two A, and two B domains located at comparable genomic distances from one another, we next tested whether S domains are positioned further apart compared to pairs of A and B domains. We found that the distance between the two S domains was significantly larger than between the two A or B domains (Fig. EV7G–J), as expected given the peripheral positioning of S domains on CTs and the depletion of Hi-C contacts between different S domains.
Although inter-chromosomal contacts were sparse in the Hi-C data, we asked whether this peripheral localization could influence the average contact frequency among S domains in trans. Therefore, we computed the average trans ^observed^/expected contacts between each compartment type (Fig. 5E). We found that the average value of S-S_inter_ trans contacts is higher compared to any other combination in trans, and, in particular, A-A and B-B contacts. This is consistent with the preferential positioning of S towards the periphery of CTs, a favored location for trans contacts.
Our Oligopaint FISH experiments reveal that S domains are preferentially located at CT peripheries, which is further supported by the Hi-C. Our modeling indicates that this may be caused by a previously unknown effect of loop extrusion: its ability to influence the spatial positioning of chromatin in the context of its chromosome territory.
Compartment S changes during development
Taking advantage of our Hi-C datasets from different developmental stages, including three embryonic stages and one adult stage, we next explored the developmental dynamics of compartment S. Based on initial visual inspections, we could identify domains switching to or from S on multiple chromosomes between timepoints (see examples in Fig. EV8A). To test the dynamics of S compartment switching more systematically, we repeated the compartment calling protocol for the adult stage (Adult Heads, AH). We found that a comparable fraction of the AH genome (12%) folds into compartment S, compared to the PD-D2 embryonic stage (Fig. EV8B). Nevertheless, several embryonic S domains visually show checkerboard patterning in the AH Hi-C maps, including three of the largest S domains on Chr06 and Chr23 (Figs. 6A and EV8C). To compare the two datasets further, we restricted our analyses to large (> 200 kb) domains to allow for visual confirmation of compartment assignment based on the Hi-C contact pattern. We found that about 45% of S domains defined in the PD-D2 embryonic stage are maintained in the AH (referred to as S ➔ S) (Fig. 6B). This fraction is much lower compared to that of the A or B domains. Large S domains that change compartment assignment in the AH exclusively turn into B domains (referred to as S ➔ B). Notably, this change coincides with a significant reduction in gene expression levels (Figs. 6C and EV8D). The Hi-C contact maps at S domains that turn into B show a loss of features associated with loop extrusion, including insulation points and off-diagonal dots (Figs. 6A and EV8C). Weakening of insulation is also evident when comparing pileups of contacts of boundaries called within large S domains that are maintained (S ➔ S) or lost (S ➔ B) in the AH data (Fig. 6D). The loss of extrusion features and intra-domain compaction, together with the increased checkerboard patterning of S domains that turn into B, are consistent with our model that targeted loop extrusion underlies the spatial segregation of compartment S by counteracting compartmentalization.Figure 6. Compartment S changes during development.(A) Hi-C contact maps of the full Chr23 at 40 kb resolution and Chr23:3,000,000-80,000,00 region at 10 kb resolution in PD-D2 embryos on the left and in adult head (AH) on the right. Below or above each matrix are gene locations, RNA-seq expression aggregated at 1 kb resolution and compartments A, B, and S assignments, for the corresponding stage and region. (B) Compartment assignment in AH datasets of large domains (> 200 kb) assigned to compartments A, B, or S in PD-D2, in % of the compartments assigned in PD-D2. (C) Box plots of distributions of the log_2_ ratio of gene expression (TPM) between PD-D2 embryos and AH. In purple is the distribution for genes that are in domains assigned to compartment S in the two stages (S➔S) (n = 119), and in green is the distribution for genes that are in regions assigned to compartment S in PD-D2 and that switched to B in AH (S➔B) (n = 120). The boxed region corresponds to the data between the first and third quartiles. The line indicates the median of the distribution, while the cross corresponds to the mean. Whiskers extend to the lowest and highest non-outlier data points, and dots outside correspond to outliers. Statistical significance was tested using a Kolmogorov–Smirnov test (P = 0.026). (D) Pileup plots centered on internal insulation points within all large S compartment domains (> 200 kb, excluding 40 kb on each side of domain boundaries) in AH or S➔S and S➔B categories, as described in (B), at 5 kb resolution and extending to 30 kb on each side. (E) Two-color STED labeling Chr04 CT and a single genomic domain that changes compartment identity from S in PD-D2 embryos (left) to B in adult heads (right). Chr04 is shown in blue, and the domain (Chr04: 12.1–12.3 Mb) is shown in purple. The dashed line indicates the nuclear edge as determined by confocal microscopy. (F) Shell analysis measuring S domain positions within Chr04 in embryos (blue) and adult heads (tan). Mid line = average of 2–3 biological replicates (n > 30 nuclei each). Error bars (light shading) show the standard error of the mean. P values were generated from unpaired t-tests (Kolmogorov–Smirnov tests) between distributions and are indicated at the bottom right of the graph. Source data are available online for this figure.
To test whether the change in compartment identity also translates to a change in domain position within CTs, we took advantage of the fact that one of the three S domains that we chose for Oligopaint FISH analyses turned into B in the AH Hi-C dataset. Consistent with the change in compartment identity, super-resolution microscopy revealed that the same region on Chr04 localizes more internally in its CT in nuclei of the adult B. mori head compared to embryos (Fig. 6E,F). In contrast, a B domain used as a control remained unchanged between the two stages (Fig. EV8E,F). These results were further validated using widefield microscopy, labeling A, B, and S domains as well as the entire CT on Chr04 in AH. Notably, in AH cells, the S domain that transitioned into B no longer exhibited a tendency to be positioned closer to the CT periphery (Fig. EV8G,H). The developmental dynamics of several S domains argue against a strict genetic specification of compartment S but rather support the presence of epigenetic features involved in its formation.
Discussion
Our investigation of B. mori’s genome organization reveals both conserved principles and novel genome-folding behaviors. As seen in other eukaryotes, we observe the formation of CTs and spatial segregation of chromatin into active A and inactive B compartments. However, unlike other eukaryotes, we have observed much stronger chromosome territoriality and a novel type of compartment, S, which lacks the characteristic checkerboarding of A/B compartments and appears to be rich in loop extrusion activity. Our modeling indicates that a unique interplay of chromatin compartmentalization and loop extrusion may drive the formation of these spatially secluded domains that appear to change throughout development.
The remarkably strong CTs and the low frequency of inter-chromosomal contacts are consistent with a recent whole-chromosome Oligopaint study of six B. mori chromosomes, which likewise revealed highly spatially distinct CTs (Rosin et al, 2022). Folding, volume, and intermixing of CTs have been associated with condensin II (Hoencamp et al, 2021; Rosin et al, 2018). Condensin II subunits are present in the B. mori genome (Xiang et al, 2023; King et al, 2019), and it would be interesting to evaluate whether the substantial degree of territoriality is caused by uniquely high activity of condensin II in B. mori. CT strength is also intriguing from an evolutionary point of view. Previous studies in D. melanogaster cell lines (Rosin et al, 2019) and across human cancers (Engreitz et al, 2012) have described an inverse relationship between the frequency of inter-chromosomal contacts and the incidence of genomic translocations. Interestingly, karyotypes and synteny are highly conserved across Lepidoptera, including B. mori. We therefore hypothesize that the pronounced CTs may contribute to low structural variation and high karyotype conservation in these organisms by preventing structural rearrangements during mitotic divisions of cells destined to form the future germline.
The presence of compartment S, with domains that strongly self-interact but segregate from the rest of their chromosome, is remarkable. To our knowledge, there is no precedent for a compartment with contact or epigenetic profiles similar to compartment S. While other compartment types beyond A and B have been detected (Rao et al, 2014; Spracklin et al, 2023; Xiong and Ma, 2019), they typically represent sub-types of A and B and show preferential homotypic contacts (Spracklin et al, 2023). Although the previously described “intermediate” compartment I (Schwarzer et al, 2017; Johnstone et al, 2020; Vilarrasa-Blasi et al, 2021) shares some characteristics with compartment S, such as H3K27me3 enrichment and developmental plasticity, they differ in their most prominent features. Compartment I is enriched in contacts with A, B, and I. A recently identified sub-compartment in HCT116 cells, termed B_0_ (Spracklin et al, 2023), likewise displays relatively low levels of compartment contrast (i.e., its compartments are smoother and they checkerboard less than other compartment types), though it does preferentially interact with other B_0_ domains. Domains of compartment S, in contrast to I or B_0_, are homogenously depleted in contacts with any other domain. This pattern, plus its gene composition and distinct epigenetic makeup, make S unique compared to any previously identified chromatin structure.
Based on our modeling, the formation of compartment S appears to require an interplay of mechanisms, distinct from any other described organism. While conventional compartmentalization is believed to rely on affinity between regions that share epigenetic composition (Jost et al, 2014; Solovei et al, 2016; Hildebrand and Dekker, 2020), we find that S-to-S affinity-mediated phase separation does not lead to the formation of compartment S. Instead, we propose that compartment S is primarily formed by loop extrusion activity targeted to S domains. Targeting extrusion to S domains selectively compacts them and suppresses their compartmentalization. This loop extrusion behavior is different than in vertebrates, where loop-extruding cohesin (during interphase) and condensin (in metaphase) are believed to load mostly uniformly across the genome, without preference for a specific compartment type (Gibcus et al, 2018; Spracklin et al, 2023). Such localization of extruders to many broad epigenetic domains (tens to hundreds of kb each) has not been identified in other systems. Future experimental analyses will be important to test our prediction of S-localized loop extrusion and to further assess its contribution to S compartment formation.
Targeted SMC loading to certain genetic elements is well-described across various biological systems. In the bacterium B. subtilis, condensins are loaded at ParS sites by the ParB DNA-binding protein (Gruber and Errington, 2009; Sullivan et al, 2009). In yeast, it has been suggested that sequence context antagonizes SMCs from centromere binding (Meneu et al, 2025). In C. elegans, the condensin-containing dosage compensation complex is recruited by specific sites to the X chromosome (Crane et al, 2015; Kim et al, 2022). Most recently, targeted loading of SMCs to enhancers has been shown to form fountains/jets in Hi-C maps in C. elegans (Kim et al, 2025; preprint: Isiaka et al, 2023) and vertebrates (Guo et al, 2022; preprint: Galitsyna et al, 2023; Rinzema et al, 2022). Such targeted loading in these systems may be guided by differences in DNA accessibility, sequence-specific DNA binding proteins (Gruber and Errington, 2009; Sullivan et al, 2009), or via enhancer-specific histone marks (Guo et al, 2022; preprint: Galitsyna et al, 2023; Rinzema et al, 2022). It is possible that similar factors may recruit loop-extruding complexes to compartment S in B. mori.
In the context of targeted loading of SMC complexes, the enrichment of H4K20me1 in compartment S is interesting. In C. elegans, it has been shown that the SMC-based dosage compensation complex enriches H4K20me1 on the inactive X chromosome by means of a condensin-associated demethylase (Meyer, 2022). H4K20me1 is likewise deposited during the process of silencing and compacting the X chromosome in mice (Tjalsma et al, 2021). Compartment S shares additional similarity to inactivated sex chromosomes by virtue of being locally compacted and isolated from the rest of the genome, while it is distinct from inactivated sex chromosomes by virtue of its permissiveness to gene expression. Whether enrichment of H4K20me1 aids in chromatin compaction, and what its relationship is to loop extrusion, is unclear for B. mori.
In addition, our analyses also support a previously unknown consequence of loop extrusion. Not only may loop extrusion activity lead to the formation of dense and secluded S domains, but our model predicts that it may also drive their peripheral localization within CTs. This novel predicted effect of loop extrusion raises the possibility that loop extrusion may similarly affect large-scale spatial organization in other organisms or contexts. The underlying physics by which extrusion may achieve this architecture is yet to be understood.
The function of compartment S and the role of loop extrusion there remain intriguing questions. First, in view of the holocentric architecture of B. mori chromosomes, we consider the possibility that compartment S is involved in centromere specification to be unlikely. This conclusion is guided by our previous findings in B. mori cell lines showing that kinetochore assembly occurs in regions anticorrelated with transcriptionally active chromatin (Senaratne et al, 2021), an epigenetic landscape that is more consistent with compartment B than S. Instead, the isolated genomic environment of S domains might ensure the precise transcriptional regulation of the genes that they contain. Our finding that S-located genes are functionally enriched in transcription-related processes might suggest that S represents a developmental transition state to either A or B, as hypothesized for compartments I and B_0_ (Spracklin et al, 2023; Johnstone et al, 2020; Vilarrasa-Blasi et al, 2021). Such a model is supported by the observation that many S domains are variable among different developmental stages. Furthermore, S domains may represent development-control units, such as the Hox cluster, which comprises a large S domain in B. mori embryos (Fig. EV4). By analogy to the Hox cluster, where loop extrusion appears to be key to the precise sequence of gene activation (Rekaik and Duboule, 2024), compartment S may recruit a high density of extruders to achieve precisely timed activation of genes during development. Our observation of the developmental plasticity of S domains supports this hypothesis.
Broadly, our observation of localized loop extrusion in B. mori may also hold true for other insects. Conflicting evidence of the loop extrusion process exists in another insect, D. melanogaster, in which some signatures of extrusion are evident in Hi-C, but others are missing (Abed et al, 2019). It is possible that, akin to B. mori, D. melanogaster chromatin has loop extrusion activity localized to specific genomic regions.
In summary, our study describes the unique genome organization of B. mori. This non-model organism has an exceptional degree of chromosome territoriality and reveals a striking new genome folding structure. Our modeling indicates that this novel structure, compartment S, can be formed by loop extrusion targeted to a specific chromatin type. Genome organization in B. mori thus suggests that the interplay of two major genome-folding processes, loop extrusion and compartmentalization, can generate unique and unexpected patterns. These findings open a new line of questions for how evolutionarily conserved mechanisms may interact differently in different organisms, as well as how changes in the epigenetic landscape during development may guide loop extrusion to reshape 3D genome architecture.
Methods
Reagents and tools tableReagent/resourceReference or sourceIdentifier or catalog number Experimental models Bombyx mori p50Kyushu University facilityBombyx mori embryosCoastal SilkwormsBombyx mori female adultsFramsChams Panther Chameleons Recombinant DNA
Antibodies rabbit polyclonal anti-H3K36me3 antibodyAbcamCat#ab9050, RRID:AB_306966Rabbit polyclonal anti-H3K4me3 antibodyDiagenodeCat#C15410003, RRID:AB_2924768Rabbit monoclonal anti-H3K27me3 antibodyCell SignalingCAT#9733S, RRID:AB_2616029Rabbit polyclonal anti-H3K9me3 antibodyAbcamCat#ab8898, RRID:AB_306848Rabbit polyclonal anti-H4K20me1 antibodyAbcamCat#ab9051, RRID:AB_306967Rabbit monoclonal anti-H2A.Z antibodyAbcamCat#ab150402, RRID:AB_2891240 Oligonucleotides and other sequence-based reagents Oligopaint librariesThis study, Twist BiosciencesDataset EV13 Chemicals, enzymes and other reagents TRIzol™ ReagentThermo Fisher ScientificCat#15596026Formaldehyde solutionSigma-AldrichCat#F8775Arima-Hi-C+ kitArima GenomicsKAPA HyperPrep KitRocheCat#KK8504Illumina TruSeq sequencing adaptersIDTCat#20020590cOmplete™, EDTA-free Protease Inhibitor CocktailRocheCat#04693132001SPRIselect beadsBeckman CoulterCat#B23318HiScribe™ T7 High Yield RNA Synthesis Kit for probe synthesisNew England BioLabsCat#E2040Taq DNA Polymerase with Standard Taq BufferNew England BioLabsCat#M0273RNasin® Plus Ribonuclease InhibitorPromegaCat#N2615Maxima™ H Minus Reverse TranscriptaseThermo Scientific™Cat#EP0751, EP0752, or EP0753Corning 22×22 mm Square #1½ Cover GlassCorningCat#2850-22Premium Frosted Microscope SlidesFisherbrand™Cat#12-544-3ATTO565SigmaCat#75784ATTO647NSigmaCat#18373NucGreen Dead 488InvitrogenCat#R37109NucSpot 470 Nuclear StainBiotium IncCat#40083Prolong Diamond AntifadeInvitrogenCat#P36965 Software Python Language Reference, version 2.7 or 3.6 http://www.python.org R v4.2.3 https://www.R-project.org/ Jupyter-notebook v7.0.1Kluyver et al, 2016tRNAscan-SE 2.0 OnlineLowe and Chan, 2016; Chan and Lowe, 2019Matplotlib v3.5.2Hunter, 2007BEDTools suite v2.31.1Quinlan and Hall, 2010; Quinlan, 2014DeepTools suite v3.5.1Ramírez et al, 2016SAMtools suite v1.16.1Danecek et al, 2021pyGenomeTracks v3.8Lopez-Delisle et al, 2021, Ramírez et al, 2018RepeatMasker software v4.1.1Smit et al 2015Bowtie2 v2.2.9Langmead et al 2012“twoBitToFa” function of the UCSC packageShinyGo v0.741Ge et al, 2020STAR v.2.7.8aDobin et al, 2013Rnanorm v2.1.0 https://pypi.org/project/rnanorm/ sanbomics suite v0.0.6 https://github.com/mousepixels/sanbomics PyDESeq2 suite v0.3.3Muzellec et al, 2023HiCPro suite v2.11Servant et al, 2015Cooler suite v0.9.3Abdennur and Mirny, 2020cooltools v0.5.4 10.5281/ZENODO.5214125 Coolpup.py suite v1.1.0Flyamer et al, 2020Scikit-learn v1.3.2Buitinck et al, 2013, Pedregosa et al, 2011HiCExplorer utilities v3.7.2.Wolff et al, 2018, Wolff et al, 2020distiller-nf version 0.3.3 https://github.com/open2c/distiller-nf OligoMiner pipelineBeliveau et al 2018Leica Application Suite X https://www.leica-microsystems.com/products/microscope-software/p/leica-las-x-ls/ Huygens deconvolution software https://svi.nl/Huygens-Deconvolution ImageJ/TANGOOllion et al, 2013Prism 9 by GraphPad https://www.graphpad.com/features polychrom 10.5281/ZENODO.3579473 OpenMMEastman et al, 2017 Other 4200 TapeStation SystemAgilentRRID:SCR_018435NovaSeq XIlluminaRRID:SCR_024569Covaris E220Evolution ultrasonicatorCovarisN/ALeica DMI6000 B inverted microscopeLeicaRRID:SCR_018713Leica DMi8 microscopeLeicaRRID:SCR_026672Leica STELLARIS 8 STED microscopeLeicaRRID:SCR_024662
Bombyx mori strains
All experiments have been carried out using Bombyx mori embryos of the p50 reference strain from the Kyushu University facility (silkworm Genetic Resource Database https://shigen.nig.ac.jp/silkwormbase/topAction.do), except for Oligopaint FISH (see dedicated section). Upon arrival, diapaused eggs were stored in the cold (4 °C) for more than 70 days before being placed at room temperature (~25 °C) to break the diapause and restart embryonic development. Late embryos (1 to 48 h post-diapause release) are harvested to perform Hi-C and ChIP-seq experiments. For this study, we mostly focused our analyses on the 24 h post-diapaused timepoint (PD-D2) because at this stage, at least some cells have reentered the cell cycle (Nakagaki et al, 1991). Consistent with this, we observe about 1–2% of cells in metaphase, with the majority of cells in interphase (Fig. EV1). To include an adult tissue time-point, upon egg hatching (~10 days after placing eggs at 25 °C), we fed B. mori larvae with mulberry leaves until cocoon spinning. After adult hatching from the cocoon and mating, we dissected and fixed the insects’ heads to use for the Hi-C experiment.
Bombyx mori genome update
Based on visual inspection of Hi-C contact maps of the p50 reference strain mapped to the reference genome assembly published in 2019 (Kawamoto et al, 2019) (here referred to as Bmori_v3), we have corrected four scaffold inversions (Chr11, 19, 22 and 24 (Fig. EV9A)) and provide an updated genome assembly. We have built a new reference assembly for the 28 chromosome-length scaffolds (without the additional short unmapped scaffolds that are present in the Bmori_v3 assembly). Corrected chromosome assemblies are indicated with the addition of a “c” suffix to the chromosome names. The updated assembly is available on NCBI under BioPropject ID PRJNA1132488 (Bmori_v4_base.fasta) and the accompanied gff3 annotation file with the coordinates of inverted scaffold adjusted accordingly is provided as Dataset EV3.
Additionally, after defining transposable elements (TEs) positions on the genome (see section on repeat analysis), we also provide a simplified annotation file consisting only of the endogenous protein-coding gene features of the gff annotation file and cleared from all transposable element genes or pieces of those (often also annotated as “Gag-Pol”, “endonuclease”, “reverse transcriptase” or “transposase” proteins). For this, we first used “bedtools coverage” (Quinlan, 2014; Quinlan and Hall, 2010) to remove genes that are covered by TEs more than 80% of their length (2541 removed/16471 genes) and coding sequences (CDS) that are covered by TEs more than 90% of their length (5485 removed/95324 CDS). These thresholds have been determined empirically by visual inspection of features that were removed. In cases where all CDS of a gene have been removed, yet the full gene originally passed the threshold, the corresponding gene was removed as well (61 genes removed). We provide the list of remaining protein-coding genes in Dataset EV4 and show this as gene tracks for all plots, as well as for our analysis on gene coverage.
rDNA array
In the Bmori_v4 assembly, we realized that the rDNA locus was incompletely assembled and misplaced in the reference genome. Indeed, the Hi-C contact map of Chr11c still reveals a strong insulation point (Fig. EV9B) between the ~5 Mb left end and the rest of the chromosome. This location might correspond to the rDNA array, which is known to display strong insulation in other species (Lazar-Stefanita et al, 2017; Mercy et al, 2017). Although our search of rDNA sequences in the published genome assembly indicates a different position on the same chromosome, our Hi-C contact data support the location of the rDNA array at the insulation point (Fig. EV9B). We thus sought to verify the sequence of one full unit of the array. For this, we amplified ~8 kb of rDNA with the following primers rDNA_F: CGGTTTATGCGAAATCTCGG and rDNA_R: GTCGTTTCGGTAAGTCAGTC and re-sequenced the full fragment walking on the DNA. We provide the obtained rDNA unit on NCBI under BioProject ID SUB14588182.
tRNA annotation
tRNA genes are not annotated in the current B. mori genome assembly. We identified tRNAs in the B. mori genome using tRNAscan-SE 2.0 Online (Chan and Lowe, 2019; Lowe and Chan, 2016). We provide the summary of the program output listed all tRNA genes in Dataset EV5.
GC content analysis
The GC content for B. mori and C. elegans genomes was calculated using the BEDTools suite (Quinlan, 2014; Quinlan and Hall, 2010). We used the function “bedtools nuc” on 40, 100, or 250 kb non-overlapping windows of the genome to extract the GC% for each window. For windows including “N” nucleotides and windows at the end of chromosomes, the window size has been adjusted accordingly (considering only the size of “As”, “Ts”, “Cs”, and “Gs”) for accurate GC% calculation.
The Python library Matplotlib v3.5.2 (Hunter, 2007) has been used on 100 kb windows to plot heatmaps with a color code corresponding to the GC% of each window along all chromosomes.
GC content distribution genome-wide and inside each compartment or region are shown as box plots that correspond to the GC% per 40 or 100 kb (as indicated) genomic window for each category, using the “bedtools intersect” function of the BEDTools suite. Gene track for GC% is plotted per 40 kb window along chromosomes of interest using pyGenomeTracks (Lopez-Delisle et al, 2021; Ramírez et al, 2018).
Bombyx mori repeat analysis
To determine the positions of transposable elements (TEs) in the Bm_v4 assembly, we used the RepeatMasker software (Smit et al, 2015) on a custom B. mori transposon library kindly provided to us by Dr. Ramesh Pillai using the “-nolow” parameter. We then aggregated the output of defined TE coordinates along the genome using the “bedtools merge” function of the BEDTools suite (Quinlan, 2014; Quinlan and Hall, 2010) and provided the obtained list on Dataset EV6. This list allowed us to confirm the overall genome-wide coverage of TE of 46% in B. mori. The list has been further used to determine the TE coverage per 40 kb windows genome-wide and inside each compartment using “bedtools makewindows”, “bedtools coverage”, and the “bedtools intersect” function of the BEDTools suite.
We also analyzed simple repeats and low complexity DNA of the genome with the RepeatMasker software using the “-noint” parameter. With this analysis, we confirmed the absence of very large regions composed of satellite-like repeats in the genome apart from telomeric repeats (TAACC)n, with ~2% of all simple repeats being larger than 100 bp and only 32 larger than 1 kb (including 18 assembled telomeres). We provide the output file on Dataset EV7.
As a complementary approach, we evaluated the proportion and content of reads that are filtered out due to multimapping using Bowtie2 alignment on the Bmori_v4 genome assembly. In our sequencing pipeline, we are using PE100; thus, we started by generating all possible 100 nt DNA sequences from the genome by parsing the genome in sliding windows of 100 bp overlapping by 1 bp using the “bedtools makewindows” function of BEDTools and the “twoBitToFa” function of the UCSC package. We then aligned all these sequences to the Bmori_v4 genome using Bowtie2 with default parameters. From the aligned sequences, we then selected the multimapping reads (grep “XS:i”). Those were sorted and indexed using the SAMtools suite (Danecek et al, 2021) and consolidated using “bedtools merge” to build a file recapitulating the coordinates of all regions of the genome that are ambiguously mapped in our NGS sequencing and analysis. We provide this file on Dataset EV8. Overall, using 100 nt reads sequencing, 187, 369,332 bp of the B. mori genome are invisible out of 445,114,022 bp, i.e., 42%.
Using BEDTools' “intersect” on the two generated datasets, we could verify that the two approaches identified similar regions of the genome. 90% of all identified TE fall into unmappable regions, and 82% of the latter correspond to TEs.
Arms vs center analysis in B. mori and C. elegans
For C. elegans, the coordinates of arms and centers were according to Liu et al. (Liu et al, 2011). For B. mori, we used the GC content track generated in 250 kb windows and called arm borders corresponding to the coordinate where the GC content curve crosses the genome-wide mean and becomes lower for the first time, starting from each telomere.
Gene coverage analysis
The gene coverage across 40 kb genomic windows genome-wide or in each compartment has been computed using the “bedtools makewindows” and “bedtools coverage” function of the BEDTools suite (Quinlan, 2014; Quinlan and Hall, 2010) based on the gene list cleared from TEs previously generated.
Gene ontology analysis
To analyze gene ontology of genes within S domains, GO analyses were performed using the gene models and IDs from the previous genome annotation (International Silkworm Genome Consortium, 2008), as no online tool is yet available on the most recent gene models (Kawamoto et al, 2019). Corresponding IDs have been identified by reciprocal best hit analyses searching the two proteomes against each other and identification of gene IDs from the previous version overlapping the coordinates of genes of interest (Dataset EV1). GO Molecular Function analyzed using ShinyGo (Ge et al, 2020) sorted by fold enrichment are listed in Table 1.
Identification of homeobox or Polycomb response gene in compartments
To build a comprehensive list of homeobox and Polycomb response genes, we started from the list published in Chai et al. (Chai et al, 2008), comprising 102 homeobox genes and complemented it with the list of 331 deregulated genes that are common upon BmPHO and BmSCM (Polycomb proteins) knockdown experiments from Li et al. (Li et al, 2012). This list of genes corresponds to the previous annotation of the B. mori genome (International Silkworm Genome Consortium, 2008). Using the coordinates of these genes, we identify the corresponding genes in the most recent B. mori genome annotation (Kawamoto et al, 2019). Genes in the list that could not be attributed to a new gene model were not considered. To complement that list, we added any gene with a “homeobox” annotation in the current gff if it was not already included. Together, these are recapitulated in a list of 399 genes listed in Dataset EV2. We used the “bedtools intersect” function of the BEDTools suite (Quinlan, 2014; Quinlan and Hall, 2010) to identify genes from this list within domains of the A, B or S compartment. Genes overlapping borders have been discarded from the analysis.
RNA sequencing and read processing
Total RNA from p50 embryos 24 h after diapause release (PD-D2) and adult heads have been extracted using TRIzol™ Reagent (Thermo Fisher Scientific #15596026) following manufacturer instructions with minimal adaptation to account for the use of animal tissue instead of cells as experimental material. For embryos, two batches of ~100 eggs have been gathered in 1.5 Eppendorf tubes, and 400 µl of TRIzol™ has been added to each tube. Egg chorions were broken mechanically using piston pellets (Dutcher #45650) and embryos incubated for 5 min at RT. After a quick centrifugation step to pellet the broken chorion, the supernatants are transferred to clean tubes for RNA isolation. For adults, the head and thorax of live animals have been isolated from the rest of the body using a razor blade and placed in a 1.5 Eppendorf tube. The head and thorax of one animal have been used per replica by adding 400 µl of TRIzol™. Tissues were then broken mechanically using piston pellets (Dutcher #45650) and incubated for 5 min at RT before subsequent RNA isolation. RNA samples were then used to generate Illumina libraries using the stranded mRNA prep ligation protocol and sequenced as PE100 on NovaSeq at the CurieCoreTech high-throughput sequencing platform.
Reads have been analyzed using an in-house RNA-seq pipeline developed by the bioinformatics platform of the Institut Curie (P. La Rosa, N. Servant) using STAR v.2.7.8a aligner (Dobin et al, 2013) to obtain gene counts for each gene model. TPM have then been calculated using rnanorm (https://pypi.org/project/rnanorm/), providing the sum of exons length for each gene for normalization. Expression data are provided in the Dataset EV9.
Expression (TPM) distribution genome-wide and in each compartment or subgroup of compartment has been extracted using the “bedtools intersect” function of the BEDTools suite (Quinlan, 2014; Quinlan and Hall, 2010) and shown as box plots. Genes overlapping compartment boundaries have been discarded from the analysis.
Bigwig tracks for plotting purposes have been generated at 1 kb resolution from the .bam alignment output file from STAR and using BPM (Bins Per Million mapped reads) normalization of the “bamCoverage” function of the DeepTools suite (Ramírez et al, 2016).
Volcano plots showing differential gene expression between the two developmental stages (PD-D2 and AH) have been generated from raw stranded counts from the STAR output and the PyDESeq2 suite (Muzellec et al, 2023), and plotted using the Python sanbomics suite (https://github.com/mousepixels/sanbomics).
Hi-C experiment and read processing
Embryos of B. mori have been fixed by pocking chorions of ~30 eggs per experiment with a fine needle, directly followed by immersion in 3% final formaldehyde solution (Sigma-Aldrich F8775) in PBS for 30 min at RT. The reaction was then quenched by addition of glycine to a final concentration of 200 mM, 20 min at RT. Eggs were washed in PBS and dissected using fine forceps while immersed in PBS. Embryos were transferred to 1.5 ml Eppendorf tubes, followed by a short centrifugation step to pellet the embryos. Any residual PBS was removed, and embryos were frozen at −80 °C. To generate Hi-C libraries from adult moths, heads were dissected and immersed in 3% formaldehyde, quenched in 200 mM glycine, washed in PBS and frozen.
All samples have then been processed for Hi-C using the Arima-Hi-C+ kit from Arima Genomics following the manufacturer’s instructions. Illumina libraries have been prepared using the KAPA HyperPrep Kit from Roche (ref: KK8504) coupled to the Illumina TruSeq sequencing adapters (ref 20020590) following the modified version of the protocol by Arima.
Hi-C libraries have been sequenced as PE100 on NovaSeq at the CurieCoreTech high-throughput sequencing platform. Pair-end reads have been processed using the HiCPro suite (Servant et al, 2015). “ValidPairs” from HiCPro were used to build the Hi-C matrix file in cooler format using the cooler suite (Abdennur and Mirny, 2020). Matrices were then further normalized, manipulated, and visualized using cooler, cooltools (Venev et al, 2021), Matplotlib v3.5.2 (Hunter, 2007) or other in-house commands in Python (Python Software Foundation). Python Language Reference, version 2.7 or 3.6. Available at http://www.python.org) or R (R Core Team, 2021) (https://www.R-project.org/), as specified. All matrices were filtered for low coverage bins and normalized using “cooler balance” with the option “--mad-max 5”. Raw reads and processed matrices are available at GSE228401. Recapitulation of the number of reads processed per experiment can be found in Tables EV3 and EV4.
To improve the number of reads for all comparisons between the experimental Hi-C data of the first 6.5 Mb of Chr15 and our simulation results, the pair-end sequencing reads from the PD-D2 sample were mapped to Bmori_v4 using distiller-nf version 0.3.3 (https://github.com/open2c/distiller-nf). Distiller-nf was run with all default parameters, except for the “walks-policy” flag for pairtools, which was modified to “all,” allowing complete genome mappings for multiple ligation products to be stored—as opposed to the default parameter of “mask” which stores just the mappings of the two ends from such reads.
ChIP experiment and read processing
ChIP-seq experiments have been performed on p50 B. mori embryos 24 h after diapause release. Approximately 200 eggs were gathered in a 1.5 ml Eppendorf tube per ChIP experiment. About 200 µl of methanol-free formaldehyde (FA) (Thermo Fisher Scientific #28906) 1% in PBS was added to the samples, and egg chorions were mechanically broken using a piston pellet (Dutcher #45650). FA volume was then adjusted to 1 ml. Embryos were fixed for 10 min at RT on a rotating wheel, and the reaction was quenched by addition of glycine to a final concentration of 0.125 M for 5 min. Fixed embryonic cells in the supernatant were transferred to a clean 2 ml Eppendorf tube, and chorions were crushed with piston pellets and washed with 200 µl of PBS two to three times until only clear chorion pieces were visible and all embryonic cells were recovered in the 2 ml tube. Cells were then centrifuged for 3 min at 600 × g, 4 °C, washed with 1 ml PBS and resuspended in 130 µl Lysis buffer w/o SDS (EDTA 10 mM, Tris-HCl (pH 8.1) 50 mM + Protease Inhibitor (cOmplete™, EDTA-free Protease Inhibitor Cocktail, Roche #04693132001). Embryonic tissues were ruptured using Covaris sonication with the following parameters on an E220Evolution apparatus: 60 s, Duty 10%, Power 75 W, cycles/burst 1000, 7 °C. Cells were then lysed by addition of SDS to a final concentration of 1% and incubated for 30–60 min on ice. At this stage, an aliquot of 10 µl was taken as a control to check for the quality of the DNA. The remaining chromatin sample was sheared using Covaris sonication on an E220Evolution apparatus: 8 min (480 s), Duty 10%, Power 105 W, cycles/burst 200, 7 °C. To allow for immunoprecipitation (IP), ChIP buffer (Triton X-100 1%, NaCl 150 mM, EDTA 2 mM, Tris-HCl 20 mM, pH 8.1 + Protease Inhibitor) was added to a final volume of 1500 µl and the sample was centrifuged for 10 min at 16,000 × g, 4 °C. Cell debris were discarded and the supernatant containing soluble chromatin was transferred to a clean 2 ml Eppendorf tube. A sample of 50 µl of chromatin was kept as Input and the rest was added to the antibody (Ab) of choice (H3K36me3: Abcam ab9050, H3K4me3: Diagenode C15410003, H3K27me3: Cell Signaling 9733S, H3K9me3: Abcam ab8898, K4K20me1: Abcam ab9051, H2A.Z: Abcam ab150402) bound to Dynabeads protein A (Thermo Fisher Scientific #10002D) according to manufacturer’s instructions. The samples (Input and IP) were incubated on a rotating wheel at 4 °C overnight. IP samples containing the Ab bound to Protein A and magnetic beads were then washed once in cold TSEI (SDS 0.1%, Triton X-100 1%, NaCl 150 mM, EDTA 2 mM, Tris-HCl 20 mM, pH 8.1), four times in cold TSEII (SDS 0.1%, Triton X-100 1%, NaCl 500 mM, EDTA 2 mM, Tris-HCl 20 mM, pH 8.1) and three times in cold TE (Tris-HCl 10 mM, pH 8.1, EDTA 1 mM), incubated 2 min at RT on a rotating wheel between each wash and recovered on a magnetic rack. Finally, DNA from all samples (QC, Input and IP) was extracted by the addition of Extraction buffer (Tris-HCl 20 mM, pH 8.1, EDTA 10 mM, EGTA 5 mM, NaCl 300 mM, SDS 1%, RNase 0.1 mg/ml) to a final volume of 100 µl. After 15 min incubation at 37 °C to allow for RNase activity, 3 µl of Proteinase K (20 mg/ml, Qiagen #19131) was added to each sample, and incubation is continued at 65 °C for at least 4 h or overnight. DNA was purified using 1.2 x SPRIselect beads (Beckman Coulter #B23318) according to the manufacturer’s instructions, and the quality and quantity of samples were evaluated on a TapeStation instrument (Agilent 4200 TapeStation System).
Input and IP samples were processed to generate an Illumina library using the KAPA HyperPrep Kit from Roche (ref: KK8504) coupled to the Illumina TruSeq sequencing adapters (ref 20020590) and sequenced as PE100 on NovaSeq at the CurieCoreTech high-throughput sequencing platform. Recapitulation of the number of reads processed per experiment can be found in Table EV3. Reads have been analyzed using in-house pipeline written by the bioinformatics platform of the Institut Curie (Valentin Laroche, Nicolas Servant), aligning reads with Bowtie2 using default parameters and then filtered for a minimum mapping quality of 10 and non-ambiguously mapping reads (grep -v “XS:i”). Log2 ratios of IP over their corresponding input reads in various bin sizes were generated using the “bamCompare” function of the DeepTools suite (Ramírez et al, 2016) and plotted using pyGenomeTracks (Lopez-Delisle et al, 2021; Ramírez et al, 2018) or Matplotlib v3.5.2 (Hunter, 2007). All ChIP experiments have been done in duplicate except for H2A.Z, and Pearson’s correlations between samples have been assessed using “multiBamSummary bins”, “multiBigwigSummary bins” and “plotCorrelation” of the deepTools v3.5.0 suite. The correlations are shown in Fig. EV10. The first replicate for each mark has been used for all plots and data analyses. All raw reads and log2 ratio files are available at GSE228401.
Compartment calling: k-means clustering
The 40 kb bin resolution matrix for p50_D2-24h embryos have been processed using the HiCExplorer utilities (Ramírez et al, 2018; Wolff et al, 2020, 2018) to produce the corresponding Pearson correlation matrix. After normalization, principal component (PC) 1 to 3 have been extracted using Scikit-learn (preprint: Buitinck et al, 2013; Pedregosa et al, 2011). Although the first two PC explain most of the signal for all chromosomes, we kept the first three PC (explaining 80% of the matrix variance ratio) to apply the k-means clustering method of Scikit-learn on the Pearson correlation matrix for each chromosome of the genome. We initially called three clusters per chromosomes, to fit our visual observation of the matrix. We realized that this was overcalling the cluster corresponding to the S compartment on chromosomes where S is sparse. Thus, for Chr15, 17, 21, and 28, we decided to call 5 clusters instead, among which one corresponds to the sparse S compartment, and the four others were fused into two and correspond to compartments A and B. In addition, we also adjusted our analyses for Chr11c and Chr24c. Chr11c contains the rDNA locus and is thus split into two large non-interacting parts. We therefore treated the two parts (1:4479999 and 4480000:20440007) as if they were two independent chromosomes to be able to match the PCAs to the contact patterns. Chr24c is interrupted by a very large repeat-rich and unmappable region. Thus, we also treated the two large mappable regions (1:11599999 and 14200000:17359173) as individual chromosomes. For all chromosomes, clusters have been attributed to compartments A, B, or S according to the positions of their centroids. For most chromosomes, PC1 was mainly used to distinguish between A and B compartments, and PC2 corresponded to compartment S. For some chromosomes (Chr11c: 1-4479999, Chr24c: 14200000-17359173 and Chr27), PC1 and PC2 signals were inverted, so the compartment calling has been adjusted accordingly. All Python scripts have been written using Python 3 in Jupyter-notebook (Kluyver et al, 2016; Loizides and Schmidt, 2016). We provide the compartment coordinates for all chromosomes as files Datasets EV10, EV11, and EV12.
We emphasize that the contact profile of the S compartment is not due to assembly or mapping issues. Any type of large mis-assembly or recent translocation event would be readily visible as a strong intra- or inter-chromosomal signal far from the main diagonal, and this signature is not apparent in our matrices. In addition, a previous study that used both whole-chromosome and sub-chromosomal Oligopaints designed on B. mori chromosomes that contain large S domains showed a colinear organization in meiotic prophase, thus providing another line of evidence to argue against mis-assemblies (Rosin et al, 2021). Regarding mapping concerns, low-quality alignments often resulting from repeated regions would result in filtered bins (“white lines”) in Hi-C contact maps. This is not the case in S domains, which are also repeat poor compared to the full genome.
P(s) curves and derivative P(s) curves
Chromosome-wide and compartment-specific P(s) plots were generated with the expected_cis function of cooltools (Venev et al, 2021) with a smoothing sigma = 0.1. These contact decays were computed from the PD-D2 Hi-C data at 1 kb resolution. Derivative curves were generated via the numpy gradient function on the log-transformed expected contacts and distances. Distances with less than five unique domains (or chromosomes) were filtered out.
Inter-chromosomal contact analysis
All intra-chromosomal contact data were omitted from this analysis. The inter-chromosomal contact probability for each pairwise chromosome combination was calculated by dividing the observed sum of contacts between the two chromosomes by the expected sum. The observed value was calculated by taking the sum of all the contacts between each pairwise combination of chromosomes. The expected value was calculated by taking the sum of all the contacts between each pairwise combination of chromosomes from an “expected” matrix where each inter-chromosomal bin had a value equivalent to the mean of all inter-chromosomal values from the observed data. We determined the order of the chromosome pairs via hierarchical clustering of the observed/expected values.
ChIP-seq enrichment profiles in compartments
For each ChIP-seq dataset, we generated the IP/Input ratio using deepTools “bamCompare” (Ramírez et al, 2016) to show as tracks below the Hi-C matrices. Using BEDTools “intersect” (Quinlan, 2014; Quinlan and Hall, 2010), we extract the regions of the ChIP-Seq track pertaining to the loci of each compartment. We calculate the median ChIP-Seq enrichment score (ratio of read counts) within each compartment. We then divide this score by the median ChIP-Seq enrichment score genome-wide. ChIP enrichment profiles per compartments have been computed using the “computeMatrix scale-regions”, “plotProfile”, and “plotHeatmap” of the deepTools v3.5.0 suite on the log2 ratio files.
Average contact frequency
Average vs expected plots for rescaled full chromosomes in trans (inter-chromosomal contacts) or rescaled compartment domains in cis on-diagonal (within domains contacts), in cis off-diagonal (between domains contacts) or in trans, have been computed using the “pileup” function of the Coolpup.py suite (Flyamer et al, 2020). Cis on-diagonal contacts have been plotted with rescaled flanking regions of 50% of the size of the considered domains. Cis off-diagonal and trans average contacts are plotted without flanking regions.
Insulation points analyses
Looking at compact structures close to the diagonal in the Hi-C maps, we determined that A, B, and S compartment domains can include insulation points of various strengths. We used the “insulation” function of the cooltools suite (Venev et al, 2021) on the 5 kb resolution matrix of B. mori to determine insulation scores genome-wide and call significant (strong) boundaries in 25,000 kb windows using the Li threshold. Insulation region comprising filtered bins have been discarded to avoid a false positive boundary (min_frac_valid_pixels = 1.0). To avoid misassignment between compartment borders and smaller scale internal insulation points, and taking into account the resolution at which compartments have been described (40 kb), we used BEDTools “intersect” (Quinlan, 2014; Quinlan and Hall, 2010) to determine the coordinates of internal boundaries within large compartments (> 200 kb) and excluding 40 kb on each side of compartment borders to exclude them. Pileup plots of internal boundaries have been computed using the “pileup” function of the Coolpup.py suite (Flyamer et al, 2020).
Polymer simulations
General implementation
Configuration
We performed coarse-grained molecular dynamics simulations to model chromosomal segments. All simulations employed polychrom (Imakaev et al, 2019), a wrapper for OpenMM (Eastman et al, 2017). We modeled our polymers by integrating the key structural components of B. mori genome organization as follows: strong chromosome territoriality imposed by spherical confinement as a boundary condition; compartmentalization via non-specific affinities between monomers of like types, and loop extrusion via dynamic bonds joining monomers.
Chromosomes were modeled as linear chains of 1 kb monomers connected by harmonic bonds. To recapitulate the strong chromosome territoriality we observed in the Hi-C data, polymers were modeled in spherical confinement. We confirmed that the observed features of compartment S were robust to the choice of boundary conditions by repeating a subset of simulations under periodic boundary conditions (consistent with (Nuebler et al, 2018)). The confinement radius was generated such that the resulting monomer density was 0.2 monomers of unit-length per unit volume. Polymer stiffness was implemented for triplets of monomers with bond stiffness of k = 1.5 kT. Polymer conformations were initialized as random walks.
Polymer dynamics
Polymers were simulated according to a variable Langevin integrator with error tolerance of 0.01 and a collision rate 0.01 ps^−1^.
Modeling of compartmentalization
Compartments were modeled by applying non-specific attractions to like monomers. This was implemented via the polychrom force function heteropolymer_SSW. A basal attraction energy and all heterotypic (A-B, A-S, and B-S) attraction energies were set to 0 kT. The radius at which attraction could be experienced was 1.5 times the monomer length. Unless otherwise stated, attractions were applied to A and B monomers, while S-S attractions were set to 0 kT.
Modeling of loop extrusion
Consistent with previous studies from our group, extrusion dynamics involved: (1) binding of extruders to a pair of monomers, (2) progressive bidirectional translocation of the extruder along the chain, (3) stalling at a barrier or other loop extruder, (4) unbinding of the extruder from chromatin and release of the loop. Loop extrusion dynamics were simulated using extrusionlib (Samejima et al, 2025; Fudenberg et al, 2016).
Separations of extruders were implemented by defining the number of extruders in the system, and in the case of targeted loading, by imposing different probabilities of extruder binding to a monomer according to whether the monomer was assigned to S. The probability of a loop extruder becoming captured by a barrier (not extruding) was 100% and the release probability (loop extruder regaining extrusivity) was 0%, meaning that when a loop extruder approached a barrier, it would always be stalled at that barrier until it dissociated from the chromatin. These probabilities were in effect, regardless of the direction at which a loop extruder approached the boundary (whether the loop extruder approached the barrier from within S, A, or B chromatin). When two loop extruders collided with each other, they stalled until one of the extruders unbinds from the chromosome, allowing the other to resume extruding. Loop extruders were implemented in the polymer simulations as harmonic bonds that updated in position with each extrusion step. Loop extruder bonds were updated every 750 molecular dynamics timesteps. Assuming that the same motor extrudes loops across the genome, we used a single processivity for extruders across the full region, regardless of the underlying compartment type.
Data processing
Equilibrated polymer conformations were utilized for Hi-C map generation and subsequent analyses. Both the equilibration time and the number of captured conformations (overall simulation time) varied, depending on the model class (minimal model versus data-fitting model, see below for model-specific details). Equilibration was estimated as the timepoint after which P(s) became time invariant across all length scales.
Simplified model
Each chain consisted of 3600 monomers. Compartments were each 400 monomers with an arbitrary pattern of: B, A, S, A, B, A, S, A, B. Each simplified model was simulated three times in parallel, with each confined sphere containing ten copies of the 3600-monomer chain. 20,000 equilibrated conformations were recorded and used for analyses. Hi-C maps were normalized via iterative correction and visualized at bin sizes corresponding to five monomers per bin. We found that the high regularity of compartment sizes introduced artifacts to the scaling of contacts versus distance. Therefore, we performed simulations of the same chain with all attraction energies set to 0 kT, which we used as the reference for expected contacts for given distances for visualized observed-over-expected maps. Each expected map for a simulation with extrusion was likewise simulated without affinity.
Model of a B. mori chromosome
Each chromosome consisted of three connected copies of the first 6.5 Mb of Chr15, yielding a chain of 19,500 monomers within a sphere of confinement. For each model, five independent replicate spheres were simulated. Monomer affinities were assigned according to the compartment assignments generated from the PD-D2 Hi-C data. Across all simulations in this section, S-S attractions were set to 0 kT. Conformations for 8500 equilibrated timesteps were recorded and used for analyses.
Initial sweep of loop extrusion parameters
Extrusion models were simulated while varying the following three parameters:
- Processivity (λ): 19, 28, 56, 84, 112, 140, 280 kb
- Separation in S (d_S_): 19, 28, 56, 84, 98, 112, 140, 280, 560 kb
- Separation in A and B (d_A&B_): 112, 190, 280, 560 kb
These models were simulated with two different sets of homotypic attraction energies for A monomers and B monomers:
- A-A attraction energy = 0.12 kT and B-B attraction energy = 0.04 kT
- A-A attraction energy = 0.18 kT and B-B attraction energy = 0.08 kT
Initial sweep of compartment parameters
For models without loop extrusion, each chromosome territory consisted of three unconnected copies of 0–6.5 Mb segments of Chr15. Homotypic attraction energies for A monomers were swept between 0.00 and 0.28 kT in 0.02 kT intervals, and B monomers were swept between 0.00 and 0.18 kT in 0.02 kT intervals.
Narrowed sweep of both loop extrusion and compartment parameters
The final parameter sweep consisted of all permutations of the following parameters:
- Loop extruder processivity (λ): 55 and 110 kb
- Loop extruder separation in S (d_S_): 19, 37, 55, 110 kb
- Loop extruder separation in A and B (d_A&B_): no binding, 5d_S_, 10d_S_. Models with λ/d_A&B_ greater than 0.3 were excluded, given that models with λ/d_A&B_ below this value had minimal effects on P(s) curves computed for contiguous regions of A and B (not shown).
- Monomer affinities: Sets of attraction energies that yielded both A-A and B-B off-diagonal enrichments within 25% of the experimental values were selected for subsequent modeling. A total of 90 sets of A-A and B-B affinities satisfied this criterion.
Hi-C maps for the chromosome territories were generated using polychrom’s monomerResolutionContactMapSubchains function, which merged the various replicates. The maps were subsequently binned and normalized via mean centering, followed by iterative correction via the numutils.iterative_correction_symmetric function from cooltools (Imakaev et al, 2012; Open2C et al, 2024). Maps corresponding to Chr15: 0–6.5 Mb were subsequently generated by averaging the three copies of this segment chromosome territory that comprised the chromosome-length chain. P(s) was computed from monomer-resolution heatmaps with a capture radius of six times the monomer length, while all other Hi-C level comparisons were made at 10 kb resolution with a capture radius of 7.5 times the monomer length.
The numutils.observed_over_expected function was used to compute observed-over-expected maps from chromosome-length maps for both the PD-D2 experimental and model Hi-C maps. Compartment interactions (A-A, A-B, A-S, and so on) were computed from the observed-over-expected maps, filtering all contacts below the length of the longest compartment (first 60 diagonals were filtered). Contact enrichments within contiguous domains of S (S_intra_) were also computed from these observed-over-expected maps, without any diagonals filtered. The experimental Hi-C map used to compute the enrichments was balanced for Chr15 independently, to maximize similarity between how the experimental and simulated datasets were processed.
From these enrichments, we considered a model successful based on general agreement with the three S features outlined in the text and A/B compartment strength. As such, we evaluated model success as those satisfying the following criteria:
- model’s S_intra_ is within 20% of the experiment’s S_intra_;
- S_inter_ < 1;
- S-A, S-B, and S-S_inter_ are within 10% of each other;
- A/B compartment score within 20% of the experimental value. We additionally evaluated models based on the mean-squared error of these seven enrichment levels between the model and the experimental data.
Analyzing radial positioning from polymer models
Radial positioning analysis was performed for the following three models:A-A attraction energyB-B attraction energyS-S attraction energyλd_S_d_A&B_Model 1 – compartments only0.12 kT0.06 kT0.00N/AN/AN/AModel 2 – compartments and extrusion only in S0.12 kT0.06 kT0.0055 kb18 kbN/AModel 3 – compartments and extrusion0.12 kT0.06 kT0.0055 kb18 kb220 kb
Note, Model 3 was the best fitting model from the Hi-C data-fitting model, as defined by the lowest mean-squared error from experimental values, and also passing the semi-quantitative criteria.
About 120 conformations were analyzed per model. The area corresponding to the polymer confinement was sectioned into 25 shells of equal volume. Monomers’ positions within these spheres were recorded and are displayed in Fig. EV7A. Distributions for A, B, and S were subsequently normalized based on the total number of monomers of their respective type (A = 7200 monomers/sphere, B = 8625 monomers/sphere, S = 3675 monomers/sphere).
Code pertaining to polymer modeling and related analyses are stored at https://github.com/mirnylab/bombyx2024.
B. mori embryos, adults, and slide preparation for FISH experiments
Embryos were freshly laid in the lab from moths derived from embryos from Coastal Silkworms. Embryos were kept in diapause at 4 °C for at least 6 months. Adults (all females) were derived from larvae from FramsChams Panther Chameleons and reared at room temperature on artificial diet.
For experiments with embryos, embryos were incubated at 25 °C for 24 h before harvesting cells. For slide preparation, chorions were softened by soaking embryos in 50% bleach for 10 min at RT. Chorion and vitelline membranes were then manually removed with forceps in a glass dissecting dish containing RT Sf-900 media (Gibco). Cells were then harvested directly from the dissecting dish, centrifuged at 600 × g for 5 min at RT, and resuspended in 1X PBS. Cells in PBS were settled on poly-L-lysine-treated glass slides for 15 min before fixing with 4% PFA in PBS-T (0.1% Triton X-100 in PBS) before proceeding with FISH.
For cells from adult heads, the heads were cut off, and the head and antennae were removed. Heads were cut into small pieces using micro scissors and then washed twice with RT Sf-900 media. Head fragments were then crushed with a pestle and digested with collagenase and papain to a final concentration of 2 mg/mL and 2 mg/mL, respectively, in media. The tissue was incubated at 37 °C for 30 min at 650 rpm. About 1 mL of media was then added to stop enzymatic digestion. Cells were filtered through a 40-uM mesh, and the filtrate was centrifuged at 600 × g for 5 min before being resuspended in 1X PBS and settled for 15 min on a poly-L lysine-coated slides. Finally, cells on slides were fixed with 4% PFA in PBS-T for 15 min before proceeding with FISH.
Oligopaint design, synthesis, DNA FISH, and image analysis
Oligopaint libraries were designed using a modified version of the Oligominer pipeline (Beliveau et al, 2018) based on Bm_v4 genome assembly. CT paints were previously published (Rosin et al, 2022, 2021). Coordinates and all information for single-domain paints can be found in the following table. Oligopaints were synthesized as previously described by adding barcodes to each 70 bp oligo for PCR-based amplification (Shav-Tal, 2019) (Dataset EV13).DomainPaint startPaint stopSize (kb)Probes/kbTotal # of oligosChr04_set1_A880000090000002003.0603Chr04_set2_A13000000132000002002.9576Chr04_set1_B336000035600002004.4874Chr04_set2_B756000077600002004.0811Chr04_set1_S12100000123000002003.0611Chr04_set2_S16300000165000002003.3656Chr17_A504000053200002802.3643Chr17_B896000092400002802.4681Chr17_S12040000123200002802.4679Chr23_A15920000162400003202.4782Chr23_B910000094200003202.4765Chr23_S458000049000003202.4776
FISH with oligopaints was performed as previously described (Rosin et al, 2022). Briefly, after fixation, cells on slides were washed 3×5 min each in PBS-T at RT. Slides made from adult heads were permeabilized with 100% methanol at −20 °C, then washed once in PBS-T. All slides were permeabilized with PBS containing 0.5% Triton X-100 for 15 min at RT. Cell nuclei were predenatured using the following washes: 1 × 5 min in 2xSSCT (0.3 M NaCl, 0.03 M sodium citrate, 0.1% Tween-20) at RT, 1 × 5 min 2xSSCT/50% formamide at RT, 1 × 2.5 min in 2×SSCT/50% formamide at 92 °C, 1 × 20 min at 60 °C in 2xSSCT/50% formamide. Primary Oligopaint probes in hybridization buffer (10% dextran sulfate/2xSSCT/50% formamide/4% polyvinylsulfonic acid) were applied to slides and sealed under a 22 × 22 mm coverslip with rubber cement. Once rubber cement had dried, cells were then denatured at 92 °C for 2.5 min and placed in a humidified chamber at 37 °C overnight. The next day, slides were washed as follows: 2xSSCT at 60 °C for 15 min, 2xSSCT at RT for 15 min, 0.2xSSC at RT for 5 min. Secondary probes (10 pmol/25 µL) containing fluorophores were added to slides, resuspended in hybridization buffer as described above, covered with 22 × 22 mm coverslip and sealed with rubber cement. Slides were incubated at 37 °C in a humidified chamber for 2 h before repeating the above washes. Finally, some slides were stained with DNA stain* in PBS for 15 min, followed by 2 × 5 min washes in PBS-T. All slides were mounted in Prolong Diamond (Invitrogen) and left to cure at least 24 h at RT before sealing with nail polish.
For widefield imaging of embryonic cells, primary probes were used for either 3- or 4-color FISH. For 3-color FISH, 2 domains plus the CT were painted, and cells were stained with DAPI. For 4-color FISH, 3 domains plus the CT were painted. Data from these two experiment types were grouped for all analyses. A total of at least three biological replicates (10–20 pooled embryos each) were performed for each FISH experiment (three reps for widefield and three separate reps for STED), measuring domain position in the CT. For widefield imaging of adult head cells, primary probes were used for 5-color FISH (DAPI, CT, plus 3 domains). Two biological replicates (2 heads per rep) were performed for adult head experiments.
For all STED imaging, whole-chromosome paints for ch4 were combined with single-domain probes (set 1 S or B), labeled with ATTO565 (domain) and ATTO647N (CT), and DNA was stained with NucGreen Dead 488 (Invitrogen) or NucSpot 470 Nuclear Stain (Biotium Inc).
Widefield images of embryonic cells were acquired on a Leica DMi6000 widefield fluorescence microscope using an HCX PL APO 100x/1.40-0.70 Oil objective (Leica) and Leica DFC9000 sCMOS Monochrome Camera. DAPI, CY3, CY5, and FITC filter cubes were used for image acquisition for all experiments. Widefield images of adult cells were acquired on a Leica DMi8 widefield inverted fluorescence microscope using an HCX PL APO 100x/1.40-0.70 Oil objective (Leica), K8 CMOS Monochrome Camera, and LED8 light source. The DMi8 is equipped with DAPI, FITC, CY3, CY5, and CY7 filter cubes. All widefield images were post-processed using Huygens deconvolution software (SVI, Hilversum, Netherlands). All STED images were acquired using TauSTED with Xtend technology on a Leica Stellaris 8 with 3D STED module equipped with a 775-nm depletion laser, Power HyD X SP detector, SuperZ Galvo stage, and 100X/1.40 STED WHITE oil-immersion objective.
All images were processed using the LasX software and Huygens deconvolution software, and tiffs were created in ImageJ. For quantification, deconvolved images were segmented and measured using a customized pipeline in the TANGO 3D-segmentation plug-in for ImageJ (Ollion et al, 2013), using either the “Hysteresis” or “Spot Detector 3D” algorithms. Each channel is segmented independently using one dataset, and then the same segmentation (which includes thresholding) is applied to all channels and all datasets. For all chromosomes, the Oligopaints used for the small foci are ~4 probes per kb, and the CTs are ~1 probe per kb. The increased number of fluorophores on compartment probes leads to more out-of-focus light on a widefield microscope, creating a slight artifact of small objects appearing larger than they really are. Additionally, deconvolution, which redistributes out-of-focus light, does further contribute to this issue. To mitigate this potential artifact, all measurements were made from the center of the foci, not the edge. For shell analyses, chromosome territories are divided into 3D layers of equal volume (shells) that are indexed. By determining which of the 3D shells contains the center of the focus, the numerical index of the corresponding shell is assigned, indicating the focus’ position within the chromosome territory. The “normalization” factor accounts for variations in the size and shape of the nucleus within a cell population. In addition to the projections shown in Figs. 5 and EV7 and EV8, we also show a series of optical sections through the chromosome to show A, B and S positioning for Chr04 set 1 (Fig. EV7K). To show the effect of deconvolution, we also show raw images (no deconvolution) side-by-side with deconvoluted images in Fig. EV7L.
All measurements were performed in 3D, and cells were mounted in Prolong Diamond. While this does slightly flatten cells, curing mounting media such as Prolong Diamond are most ideal for reducing out-of-focus light by minimizing changes in light diffraction as the light passes through the coverslip and to the mounting media. Statistical analyses were performed using Prism 9 software by GraphPad. Figures were assembled in Adobe Illustrator.
Supplementary information
Table EV1 Table EV2 Table EV3 Table EV4 Peer Review File Data Set EV1 Data Set EV2 Data Set EV3 Data Set EV4 Data Set EV5 Data Set EV6 Data Set EV7 Data Set EV8 Data Set EV9 Data Set EV10 Data Set EV11 Data Set EV12 Data Set EV13 Source data Fig. 1 Source data Fig. 2 Source data Fig. 5 Source data Fig. 6 Expanded View Figures
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Buitinck L, Louppe G, Blondel M, Pedregosa F, Mueller A, Grisel O, Niculae V, Prettenhofer P, Gramfort A, Grobler J et al (2013) API design for machine learning software: experiences from the scikit-learn project. Preprint at ar Xiv 10.48550/ARXIV.1309.0238
- 2Das M, Semple JI, Haemmerli A, Volodkina V, Scotton J, Gitchev T, Annan A, Campos J, Statzer C, Dakhovnik A et al (2024) Condensin I folds the Caenorhabditis elegans genome. Nat Genet 56:1737–174910.1038/s 41588-024-01832-539039278 · doi ↗ · pubmed ↗
- 3Galitsyna A, Ulianov SV, Bykov NS, Veil M, Gao M, Perevoschikova K, Gelfand M, Razin SV, Mirny L, Onichtchouk D (2023) Extrusion fountains are hallmarks of chromosome organization emerging upon zygotic genome activation. Preprint at bio Rxiv 10.1101/2023.07.15.54912010.1038/s 41467-026-69105-9PMC 1301819141690937 · doi ↗ · pubmed ↗
- 4Isiaka BN, Semple JI, Haemmerli A, Thapliyal S, Stojanovski K, Das M, Gilbert N, Glauser DA, Towbin B, Jost D et al (2023) Cohesin forms fountains at active enhancers in C. elegans. Preprint at bio Rxiv 10.1101/2023.07.14.549011
- 5Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat. Methods 9:357–35910.1038/nmeth.1923 PMC 332238122388286 · doi ↗ · pubmed ↗
- 6Meneu L, Chapard C, Serizay J, Westbrook A, Routhier E, Ruault M, Perrot M, Minakakis A, Girard F, Bignaud A et al (2025) Sequence-dependent activity and compartmentalization of foreign DNA in a eukaryotic nucleus. Science 387:eadm 946610.1126/science.adm 946639913590 · doi ↗ · pubmed ↗
- 7Mercy G, Mozziconacci J, Scolari VF, Yang K, Zhao G, Thierry A, Luo Y, Mitchell LA, Shen M, Shen Y et al (2017) 3D organization of synthetic and scrambled chromosomes. Science 355:eaaf 459710.1126/science.aaf 4597 PMC 567908528280150 · doi ↗ · pubmed ↗
- 8Muzellec B, Teleńczuk M, Cabeli V Andreux M (2023) Py DE Seq 2: a python package for bulk RNA-seq diff erentialexpression analysis. Bioinformatics 39:btad 54710.1093/bioinformatics/btad 547PMC 1050223937669147 · doi ↗ · pubmed ↗
