Single-cell glycome and transcriptome profiling uncovers the glycan signature of each cell subpopulation of human iPSC-derived neurons
Haruki Odaka, Hiroaki Tateno

TL;DR
A new method reveals hidden cell types in stem cell-derived neurons and identifies unique sugar markers to detect non-neural cells.
Contribution
The study introduces glycan markers and detection probes for non-neuronal cells in iPSC-derived neurons using single-cell glycan-RNA sequencing.
Findings
Four subpopulations of iPSC-derived neurons were identified, including non-neural undiffNPCs and mesenchymal cells.
Glycan markers like α1,3-fucose and poly-LacNAc were found to distinguish undiffNPCs and mesenchymal cells.
Pseudotime analysis showed NPCs can transition into mesenchymal cells during differentiation.
Abstract
Human induced pluripotent stem cell (iPSC)-derived neurons are often heterogeneous, posing challenges for disease modeling and cell therapy. We previously developed single-cell glycan and RNA sequencing (scGR-seq) to analyze the glycome and transcriptome simultaneously. Here, we applied scGR-seq to examine heterogeneous populations of human iPSC-derived neurons. We identified four subpopulations: mature neurons, immature neurons, undifferentiated neural progenitor cells (undiffNPCs), and mesenchymal cells (MCs). Lectin-binding patterns indicated high α1,3-fucose expression in undiffNPCs. MCs exhibited strong binding of a poly-LacNAc-recognizing lectin (rLSLN) and high expression of B3GNT2, a poly-LacNAc synthetic enzyme. Pseudotime analysis revealed that a subpopulation of NPCs acquired mesenchymal features and differentiated into MCs. Immunocytochemistry confirmed the specific…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSingle-cell and spatial transcriptomics · Glycosylation and Glycoproteins Research · Pluripotent Stem Cells Research
Introduction
Induced pluripotent stem cells (iPSCs) are useful cell sources for disease modeling, drug screening, and regenerative medicine (Aboul-Soud et al., 2021) since they can be differentiated into any somatic cell. However, the regulation of the differentiation process of iPSCs is complex: various degrees of differentiation states and lineages of iPSCs are generated during the differentiation process. Contamination of undifferentiated cells or differentiated cells skewing from the desired lineage may confound data interpretation at disease modeling, undermine the reliability of drug screening systems, and increase the risk of tumorigenicity in cell therapy (Lee et al., 2013). Therefore, it is essential to understand the characteristics of undifferentiated and non-target cells that contaminate the target cell population and to ensure quality control by detecting and removing these cells using appropriate markers.
The surface of all living cells is decorated with glycans produced through the combined activity of hundreds of enzymes. Cell surface glycans vary depending on cell types and cellular states such as differentiation, tumorigenesis, metastatic transformation, aging, and inflammation (Fujitani et al., 2013). In particular, stem cells and tumor cells have been reported to express specific glycan epitopes that are minimally expressed in normal cells, making them promising candidate molecules for cell surface markers (Alghazali et al., 2024; Čaval et al., 2023). In fact, many stem cell surface markers such as SSEA3/4, Tra-1-60/81, and H type3 are glycans.
However, conventional glycome analytics such as mass spectrometry, high-performance liquid chromatography, and lectin microarrays require at least thousands of cells for the analysis and could not analyze the glycan expression of each cell. Therefore, it was not possible to understand the heterogeneity of complex cell populations. Recently, we have developed single-cell glycan sequencing using DNA-barcoded lectins and next-generation sequencing (scGlycan-seq) and then combined it with single-cell RNA-seq (scRNA-seq) to realize the integrated analysis of glycan and RNA in single cells (scGR-seq) (Keisham et al., 2024; Minoshima et al., 2021). The single-cell transcriptomic data allow us to analyze the gene expression of glycosylation enzymes in single cells; however, predicting cellular glycome only from gene expression data is difficult due to the complex glycosylation mechanism (Keisham et al., 2024). Together with transcriptome data, scGR-seq can also directly analyze the glycan profile of each cell from the binding signal of a panel of 39 DNA-barcoded lectins, which cover various glycan epitopes including sialylation, fucosylation, mannosylation, and galactosylation. Therefore, scGR-seq can reveal each cell’s cell surface glycan expression, constituting a complex cell population, such as a tissue, and search for specific detection probes for target cells.
Human iPSC-derived neurons are expected to be sources of cell-based therapies for brain diseases such as Parkinson's disease, spinal cord injury, and stroke (Nagoshi et al., 2020; Palma-Tortosa et al., 2021; Takahashi, 2020). Despite these advantages, in vivo transplantation of neural progenitor cells (NPCs) from certain iPSC lines can lead to graft overgrowth and tumor formation due to the appearance of differentiation-resistant NPCs (de Luzy et al., 2021; Nori et al., 2015). Additionally, neural crest-like cell (NCC) contamination has been reported during differentiation into NPCs that generate undesired grafts after transplantation (Colleoni et al., 2010; Curchoe et al., 2010; Isoda et al., 2023). However, the glycan diversity in these undifferentiated or contaminated cells remains unknown.
In this study, we performed single-cell glycomic profiling of human iPSC-derived neurons by scGR-seq. Simultaneous analysis of transcriptome and glycan profiles has enhanced the precision of cell clustering and enabled accurate correlation of each cell type with its corresponding cell surface glycans. Two non-neural populations, undifferentiated NPCs and mesenchymal cells, were identified in iPSC-derived neurons. scGR-seq analysis has characterized the glycan profiles of these non-neuronal cells and successfully identified novel cell surface glycan markers. In conclusion, scGR-seq not only reveals the cell surface glycans of each cell comprising a complex cell population, such as iPSC-derived cells, but also allows for the discovery of detection probes that target cell surface glycans.
Results
Generation of iPSC-derived neural cells
First, we differentiated iPSCs using a dual Smad inhibition strategy to generate NPCs (Figure 1A). Immunocytochemistry (ICC) showed that almost all cells were positive with pan-NPC markers, NESTIN, and PAX6 (Figure S1A). Increased expression of NPC marker genes (NESTIN, PAX6, and SOX1) and decreased expression of a pluripotency marker gene, OCT4, during neural differentiation were also confirmed by quantitative polymerase chain reaction (qPCR) (Figure S1D). NPCs were further differentiated into neurons for 21–28 days. qPCR analysis showed a robust increase of pan-neuronal (TUBB3 and MAP2), telencephalon (FOXG1), excitatory cortical deep layer neuron (TBR1 and CTIP2), and inhibitory GABAergic neuron (GAD67) markers by differentiation into neurons. We also detected a slight increase in cortical upper layer neuron (CUX1), dopaminergic neuron (TH), and peripheral neuron (PRPH) markers, but not peripheral motor neuron (HB9) and astroglial (GFAP) markers (Figures S1E and S1F). ICC confirmed the high purity of TUJ1 (TUBB3-encoding neural protein)-positive neurons, and subsets of these neurons expressed FOXG1, TBR1, CTIP2, and GAD67 (Figures S1B and S1C). These results demonstrated the successful differentiation of iPSC-derived NPCs and neurons, including cortical excitatory and inhibitory neurons.Figure 1scGR-seq of iPSC-derived neural cells(A) Schematic illustration of neural differentiation of iPSCs.(B) UMAP plots using only RNA data, and glycan data, and both RNA and glycan data of iPSCs (n = 81 cells, red), NPCs (n = 76 cells, green), and neurons (n = 49 cells, blue).(C) Dot plot of differentially binding lectins in each cell group.(D) Violin plots of iPSC-binding lectin (rBC2LCN), NPC-binding lectin (rSRL), and neuron-binding lectin (rPhoSL) using scGR-seq data and fluorescent staining image with the same lectins. Insets show high magnification of selected fields. Scale bars, 100 μm.
scGR-seq of iPSC-derived neural cells revealed both the glycan and transcriptome signatures
To characterize the glycan expression of each cell type derived from iPSCs, we analyzed iPSCs (96 cells), NPCs (96 cells), and neurons (72 cells) by scGR-seq analysis (Odaka et al., 2022). After quality control of the single-cell data as indicated in the methods section, we obtained the scGR-seq datasets of iPSCs (81 cells), NPCs (76 cells), and neurons (49 cells), which satisfy quality criteria for downstream analysis. A median of 9,058,151 mapped reads per cell and a median of 16,273 genes per cell were detected in scRNA-seq data (Figures S1G and S1H). In scGlycan-seq, a median of 1,960 reads per cell and 30 lectins per cell were detected (Figures S1I and S1J). These RNA and glycan expression data were integrated by weighted-nearest neighbor (WNN) analysis with Seurat v4, and cells were plotted on Uniform Manifold Approximation and Projection (UMAP) based on RNA expression data, glycan expression data, or both modalities (Figure 1B) (Hao et al., 2021). In either RNA-data- or glycan-data-based UMAP plots, a small portion of cells was inadequately clustered, whereas the UMAP plot using the integrated data of two omics layers showed a clear separation of the three cell types (Figure 1B).
Next, we identified differentially expressed genes (DEGs) in each cell cluster. The highly expressed genes were as follows: iPSCs (184 genes), NPCs (97 genes), and neurons (183 genes) (Figure S1K; Table S1). These DEGs include conventional marker genes of each cell type, such as POU5F1, NANOG, and GDF3 for iPSCs; NES, PAX6, and SOX1 for NPCs; and DCX, STMN2, and GAP43 for neurons (Figure S1L). To further characterize DEGs in each cell cluster, Gene Ontology (GO) enrichment analysis was performed (Table S2). DEGs in iPSCs showed enrichment of GO terms such as “amino acid transporter” and “somatic stem cell population maintenance,” suggesting its high consumption of amino acid and stemness. In NPCs, GO terms “extracellular matrix organization,” “cell migration,” and “axon guidance” were enriched. Neuron-enriched GO terms include “nervous system development” and “chemical synaptic transmission,” consistent with its neural identity.
We also identified lectins with statistically significant differential binding in each cell group from scGR-seq data (Figure 1C). The highly binding lectins were as follows: iPSCs (10 lectins), NPCs (13 lectins), and neurons (14 lectins) (Table S3). Consistent with our previous reports, the H-type3-binder (rBC2LCN) showed higher binding to iPSCs than NPCs and neurons, while the Man-binder (rBanana) and the Galβ1-3GalNAc-binders (rABA and rSRL) showed higher intensity to NPCs than the other two cell types (Minoshima et al., 2021). The core-fucose-binder (rPhoSL) showed higher intensity in neurons than other cells, and core-fucosylated glycans have been previously reported as brain-enriched glycan structures (Williams et al., 2022). Fluorescence staining corroborated the scGR-seq data, with high binding of rBC2LCN in iPSCs, rSRL in NPCs, and rPhoSL in neurons being confirmed by lectin staining (Figure 1D). These results showed that scGR-seq data properly illustrate the characteristic glycan and RNA expression profiles of iPSC-derived NPCs and neural cells.
Sub-clustering analysis of iPSC-derived neurons identified a unique glycan epitope in non-neural contaminants
To better understand the heterogeneity in iPSC-derived neurons, we focused on scGR-seq data originating from neuron culture samples (Figure 2A). Clustering data identified two neuron clusters, such as mature neurons (mNeuron, 15 cells) and immature neurons (imNeuron, 14 cells), and two non-neuron clusters, such as undifferentiated NPCs (undiffNPC, 14 cells) and mesenchymal-like cells (MC, 6 cells), which were named based on their DEGs. Both mNeuron and imNeuron clusters expressed pan-neural gene markers. The expression of synaptic genes was higher in mNeuron than in imNeuron, suggesting synaptic maturation (Figure 2B; Table S4). GO enrichment analysis also showed significant enrichment of synapse-related GO terms such as “chemical synaptic transmission” in mNeurons (Table S5). On the other hand, undiffNPCs expressed pan-NPC, stemness, and proliferation markers (Figure 2B). MCs expressed pan-NPC and mesenchymal cell markers (Figure 2B). Extracellular matrix (ECM)-related GO terms, including “extracellular matrix organization” and “collagen catabolic process,” were enriched in MC, suggesting a higher production of ECM. Lectins showing statistically significant differences in binding among the four sub-populations are shown in Figure 2C and Table S6. Higher binding lectins were found in undiffNPCs and MCs but not in mNeuron and imNeuron clusters. Increased binding signals were observed in the undiffNPC population for rAAL, a lectin that recognizes α1,2-, α1,3-, and α1,6-fucosylated glycans (Figure 2D). In contrast, lectins specific to α1,2-fucose (TJAII, UEA-I, and rBC2LCN) and α1,6-fucose (rPhoSL) showed no significant differences in binding among cell types. Based on this pattern, α1,3-fucosylated glycans, such as Lewis X, were considered the most likely contributors to the increased rAAL signal in undiffNPCs. Among fucosyltransferase genes, FUT10 exhibited higher expression in the undiffNPC cluster (Figures S2A and S2B). While FUT10 was previously reported as an α1,3-fucosyltransferase involved in Lewis X synthesis in the brain (Kumar et al., 2013), recent studies have redefined it as a protein O-fucosyltransferase (Hao et al., 2025). Thus, its contribution to the increase in α1,3-fucosylation in this context remains unclear. MCs also showed higher binding of several lectins, and the most significant differentially binding lectin was a polyLacNAc-binder (rLSLN) (Figures 2C and 2D; Table S6). Consistently, MCs showed higher expression of B4GALT1 and B3GNT2 genes, which are involved in the poly-LacNAc elongation, suggesting that poly-LacNAc might be upregulated in this cell type (Figure 2E). To further support this observation, scatterplots illustrating the relationship between rLSLN binding and the expression levels of B4GALT1 and B3GNT2 are shown in Figure S2C. These plots demonstrate relatively strong correlations, with Spearman’s rank correlation coefficients of 0.58 and 0.40, respectively.Figure 2. Sub-population analysis in iPSC-derived neurons(A) UMAP plot using RNA and glycan data of neurons (n = 49). Four sub-populations are identified by clustering analysis. mNeuron (n = 15): mature neuron, red; imNeuron (n = 14): immature neuron, green; undiffNPC (n = 14): undifferentiated NPC, blue; MC (n = 9): mesenchymal-like cell, violet.(B) Heatmap of canonical marker gene expressions in each cell cluster.(C) Dot plot of differentially binding lectins in each cell cluster.(D) Violin plots of undiffNPC-binding lectin (rAAL) and MC-binding lectin (rLSLN) in scGlycan-seq data.(E) Violin plots of the mRNA expression of B4GALT1 and B3GNT2.
iPSC-derived NPCs are composed of two subclusters, characterized by neurogenic and mesenchymal features
To further investigate the origin of sub-populations in neuron cultures, we performed a sub-population analysis in NPC cultures (76 cells). Cluster analysis identified two cell clusters in NPC cultures, neuronal NPCs (nNPCs, 45 cells), and mesenchymal-like NPCs (mNPCs, 31 cells) (Figure 3A). Identification of the DEGs between the two clusters showed that nNPCs were characterized by higher expression of early neural genes such as DCX, MAP1B, and HES6, while mNPCs highly expressed mesenchymal genes such as ACTA2, PDGFRB, and MCAM (Figure 3B; Table S7). Note that both nNPCs and mNPCs expressed a comparable level of pan-NPC, stemness, and proliferation marker genes, suggesting that these clusters cannot be distinguished by canonical NPC markers (Figure S3). GO enrichment analysis for DEGs showed significant enrichment of “nervous system development” in nNPCs, supporting its commitment to differentiation into neurons (Table S8). On the other hand, mNPCs showed significant enrichment of ECM-related GO terms, including “extracellular matrix organization,” which were also enriched in the MC cluster. rPhoSL, which showed higher binding to neurons than iPSCs and NPCs (Figure 1C), also showed higher binding to nNPCs than mNPCs, as expected (Figures 3C and 3D; Table S9). Consistently, mNPCs showed a higher signal in rLSLN and higher expressions in B4GALT1 and B3GNT2 genes than nNPCs, similar to MCs (Figures 3C–3E).Figure 3. Sub-population analysis in iPSC-derived NPCs(A) UMAP plot using RNA and glycan data of NPCs (n = 76). Two sub-populations are identified by clustering analysis. nNPC (n = 45): neuronal NPC, red; mNPC (n = 31): mesenchymal NPC, green.(B) Violin plots of neuronal and mesenchymal gene expressions in each cell cluster.(C) Dot plot of differentially binding lectins in each cell cluster.(D) Violin plots of mNPC-binding lectin (rLSLN) and nNPC-binding lectin (rPhoSL).(E) Violin plots of the mRNA expression of B4GALT1 and B3GNT.
In previous studies, it was found that NCCs in iPSC-derived NPCs could become contaminated and potentially differentiate into mesenchymal cells (Colleoni et al., 2010; Curchoe et al., 2010; Isoda et al., 2023). We examined the expression of NCC marker genes (SOX9, SOX10, PAX3, PAX7, and NGFR) in mNPCs. Unexpectedly, mNPCs showed lower expression of these NCC marker genes than nNPCs (Figure S7). PAX3 and PAX7 were not detectable in mNPCs and nNPCs. Furthermore, we observed a strong expression of PAX6, a negative marker of NCC, in mNPCs. This led us to conclude that mNPCs are not NCCs, but NPCs that have acquired mesenchymal features.
Integrated analysis of RNA and glycans with pseudotime revealed the dynamic changes in glycan and transcriptome signatures along the cell trajectory
Next, we performed pseudotime analysis to clarify the relationship between neural sub-populations. The cells from NPCs and neuron cultures were plotted on a UMAP plot based on both glycan and RNA modalities and then their trajectory was determined by Monocle 3 (Figures 4A and 4B). Pseudotime analysis identified two distinct lineage trajectories: the neuron lineage trajectory and the MC lineage trajectory. The neuron lineage trajectory consists of the sequential transitions from nNPCs and undiffNPCs to imNeuon/mNeurons, illustrating differentiation from NPCs to neurons. The MC lineage trajectory, on the other hand, consisted of transitions from nNPCs and mNPCs to MCs. These results supported the idea that nNPCs and mNPCs are the progenitors of neurons and MCs, respectively.Figure 4. Pseudotime trajectory analysis with scGR-seq(A) UMAP plot using RNA and glycan data of NPCs (n = 76) and neurons (n = 49). The graph shows the identified sub-populations in Figures 2 and 3.(B) UMAP plot with pseudotime in each cell with a branched trajectory graph inferred by Monocle 3. The neuron lineage trajectory consists of sequential transitions from nNPCs and undiffNPCs to imNeuon/mNeurons. The MC lineage trajectory consisted of transitions from nNPCs and mNPCs to MCs.(C) Heatmap plots of co-regulated gene modules in each cell cluster.(D) Heatmap plot of the correlation between co-regulated gene modules and the binding profile of lectins.(E and F) Dynamically regulated gene modules (E) and lectins (F) along the pseudotime of neuron-linage trajectory. The x axis represents the pseudotime of the neuron-linage trajectory.(G and H) Dynamically regulated gene modules (G) and lectins (H) along the pseudotime of MC-linage trajectory. The axis represents the pseudotime of the MC-linage trajectory.
Simultaneous measurements of glycan and RNA in single cells provide a means to investigate the relationship between gene expression profile and glycan profile. To summarize the gene expression profile in NPC and neuron culture datasets, co-regulated genes are grouped into gene modules by Monocle 3 (Cao et al., 2019). Twelve gene modules were obtained and examined expression in each cell cluster and enrichment of GO terms to characterize gene modules (Figures 4C and S4A; Table S10). Modules 1, 2, 4, 5, and 7 were highly expressed in the imNeuron/mNeuron cluster and showed enrichment of neuron-related GO terms, including “nervous system development” and “chemical synaptic transmission.” Module 10 is also expressed in a portion of the imNeuron/mNeuron cluster and shows GO term enrichment “response to bacterium,” consisting of C3 and DHX58 genes (Figure S4A; Table S10). Module 11 comprises cell cycling genes annotated as “cell division” and is highly expressed in proliferating cells such as mNPCs/nNPCs. Modules 3 and 6 showed enrichment of ECM-related GO terms such as “cell adhesion” and were enriched in MC and mNPC clusters, respectively. Module 9 is highly specific in the undiffNPC cluster and includes several cancer-associated long non-coding (lnc) RNA genes (CASC15, NFIA-AS2, LINC01748, LINC00689, DNAJC27-AS1, and FEZF1-AS1) that are expressed in neuroblastoma, glioma, and other cancers (Liu et al., 2019; Russell et al., 2015; Xin et al., 2020; Zhi et al., 2015). The expression database of human lncRNAs (LncExpDB) showed that these lncRNA genes are also highly expressed in the embryonic brain but merely detected in the postnatal brain (Figure S4B) (Li et al., 2021). Modules 8 and 12 were highly expressed in nNPC and showed enrichment of neurogenesis-related and signaling-cascade-related terms.
We next examined the correlation between gene modules and the binding profile of lectins (Figure 4D). Based on the correlation pattern with lectins, gene modules were mainly separated into two clusters, one consisting of modules 1, 2, 4, 5, 7, 9, and 10, and another consisting of modules 3, 6, 8, 11, and 12. Modules 1, 2, 4, 5, 7, 9, and 10 were enriched in mNeuron, imNeuron, and undiffNPC, while modules 3, 6, 8, 11, and 12 were enriched in nNPCs, mNPCs, and MCs, suggesting the similarity of glycan profile within each cluster. Both modules 3 and 6, which were enriched in mNPCs and MCs, showed a high correlation with the poly-LacNAc-binder rLSLN, consistent with its high binding with mesenchymal populations (Figure S4C). Although neuron/imNeuon-enriched module 7 and undiffNPC-enriched module 9 showed similar lectin correlation patterns, the Fuc-binder rAAL stands out with a distinct correlation compared to the other lectins, supporting the potential usefulness of glycan ligands of rAAL as a marker for undiffNPC (Figures S4D and S4E).
Pseudotime analysis with scGR-seq also highlighted unique genes and lectins that showed differential expression along the cell trajectory. In the neuron lineage trajectory, gene module 11 was gradually decreased while gene module 2 was gradually increased (Figure 4E). Gene module 9 was spiked in the middle, then decreased after neuralization (Figure 4E). In addition, the core-Fuc binder rPhoSL was increased in the late stage, while the Fuc-binder rAAL showed a high peak at the middle stage, followed by a sharp decrease in the late stage (Figure 4F). In the MC lineage trajectory, the gene module 11 was high in the middle stage (mNPC) while the gene module 3 was increased in the late stage (MC) (Figure 4G). As expected, the mNPC/MC marker lectin rLSLN was also increased in the middle to late stage (Figure 4H). Taken together, the integrated analysis of glycan and RNA revealed its lineage identity and improved understanding of correlations and expression dynamics of the two modalities.
Identification of sub-populations in iPSC-derived NPCs and neurons through the expression of cell surface markers
Finally, we examined whether identified markers can detect sub-populations found in scGR-seq. Due to the relatively low fold change of differentially binding lectin between nNPCs and mNPCs (Figure 3), we searched the membrane-protein-encoding genes that can distinguish two populations in NPCs. Receiver operating characteristic (ROC) curve analysis identified that the PDGFRB gene showed a high performance in distinguishing mNPCs and nNPCs (AUC = 0.882) (Table S7). Flow cytometry analysis with PDGFRB antibody showed that a subpopulation of iPSC-derived NPCs highly expressed PDGFRB (Figure 5A). These PDGFRB^high^ NPCs also showed higher binding with rLSLN than PDGFRB^low^ NPCs, consistent with the glycan profile of mNPCs (Figure 5A). We then isolated PDGFRB^low^ and PDGFRB^high^ NPCs with a cell sorter (Figures S5A and S5B). PDGFRB^low^ NPCs consisted of colony-formed epithelial-like cells and polarized cells, which were reminiscent of neuroepithelial cells and neuroblasts, respectively (Figure 5B). On the other hand, PDGFRB^high^ NPCs were composed of flat cells, which were reminiscent of mesenchymal cells (Figure 5B). qPCR analysis confirmed higher expression of mNPC marker genes (PDGFRB, ACTA2) and lower expression of nNPC marker genes (DCX, MAP1B) without alternation in pan-NPC marker (NES) in PDGFRB^high^ NPCs (Figure 5C). In addition, NCC marker genes (NGFR and SOX9) were not enriched in PDGFRB^high^ NPCs, and SOX10 expression was undetectable in both PDGFRB^low^ and PDGFRB^high^ NPCs, consistent with the results of scGR-seq analysis (Figure S5C). These results guarantee the existence of mNPC and nNPC sub-populations in NPC cultures.Figure 5. Fluorescence staining of sub-population in iPSC-derived NPCs and neurons(A) Flow cytometric analysis of PDGFRB expression and binding to rLSLN in iPSC-derived NPCs. The left panel shows the fluorescence intensity of APC-conjugated PDGFRB antibody and the gating of PDGFRB^low^ and PDGFRB^high^ subpopulations. The right panel shows the fluorescence intensity of PE-conjugated rLSLN in PDGFRB^low^ and PDGFRB^high^ subpopulations.(B) Phase-contrast image of PDGFRB^low^ and PDGFRB^high^ NPCs after isolation with FACS. Arrows in PDGFRB^low^ cells show colony-formed epithelial-like cells, while arrowheads show polarized cells. Arrows in PDGFRB^high^ cells show flat cells. Scale bars, 100 μm.(C) qPCR analysis of mesenchymal and neural marker gene expressions in PDGFRB^low^ and PDGFRB^high^ subpopulations (N = 3, each from an independent sorting experiment). Relative mRNA expression of mesenchymal markers (PDGFRB and ACTA2), neural markers (DCX and MAP1B), and NPC marker (NES) were shown. Expression levels of each marker were normalized with GAPDH expression. Mean ± SD. ∗p < 0.05, ∗∗p < 0.01. N.S.: not significant.(D) Expression of Lewis X and NESTIN in a subpopulation of iPSC-derived neuron culture. Scale bars, 100 μm.(E) rLSLN-positive non-neural cells in iPSC-derived neuron culture. The arrow indicates rLSLN-positive large flat cells. Scale bars, 100 μm.
Next, we examined whether the identified glycan markers can detect non-neural populations (undiffNPC and MC) in iPSC-derived neurons. Increased binding signals were observed in the undifferentiated NPC (undiffNPC) population for rAAL, a lectin known to recognize α1,2-, α1,3-, and α1,6-fucosylated structures, whereas no significant changes were detected for lectins specific to α1,2-fucose (TJAII, UEA-I, and rBC2LCN) or α1,6-fucose (rPhoSL). Based on this, we tested Lewis X, an α1,3-fucosylated glycan, as the most plausible candidate. ICC showed the presence of NESTIN and Lewis X double-positive undiffNPCs in iPSC-derived neuron cultures (Figure 5D). To further quantify their spatial association, we performed colocalization analysis using the Coloc 2 plugin. Eight microscopic fields consistently showed statistically significant colocalization between NESTIN and Lewis X signals, with a mean Pearson’s correlation coefficient of 0.414 ± 0.119 and a Costes p value of 1.00 ± 0.00 (N = 8), indicating non-random co-localization (threshold for significance: Costes p value >0.95). On the other hand, MCs showed a high signal of the poly-LacNAc-binder (rLSLN) and high expression of B4GALT1 and B3GNT2 genes, which mediate poly-LacNAc elongation (Figure 3D). Lectin staining determined rLSLN-positive large flat cells, which are similar to mNPC morphology (Figure 5E). Furthermore, we validated rLSLN binding in MCs by flow cytometry, confirming that PDGFRB^high^ MC subpopulations exhibit elevated rLSLN signals (ΔMFI of rLSLN: PDGFRB^low^: 1843.3, PDGFRB^high^: 3599.3) (Figures S5D and S5E). These results demonstrated the presence of two types of non-neuronal cells, undiffNPCs and MCs, detectable by cell surface glycans, in iPSC-derived neural cell cultures.
Discussion
Cell surface glycans such as SSEA3/4, Tra-1-60/81, and H type3 are widely used as pluripotent stem cell markers (Alghazali et al., 2024; Hirabayashi et al., 2015). While conventional glycan analysis provides only for bulk-level samples, the recently developed scGR-seq enables simultaneous glycan and RNA profiling at the single-cell level (Minoshima et al., 2021). This allows identification of glycan profiles in heterogeneous populations and may facilitate the development of cell surface markers for purifying therapeutic cells derived from iPSCs.
To test this, we applied scGR-seq to iPSC-derived neurons, potential sources for treating brain diseases. Integrated glycan and RNA analysis in single cells distinguished iPSCs, NPCs, and neurons into clusters consistent with sample origins. Marker genes (e.g., POU5F1, NES, and DCX) and lectins (e.g., rBC2LCN, rSRL, and rPhoSL) matched previously reported markers. Sub-clustering revealed heterogeneity in NPCs and neurons. We identified poly-LacNAc expression in mesenchymal cells (detected by rLSLN) and proliferating undifferentiated NPCs marked by Lewis X glycan and stained with anti-Lewis X antibody. These findings demonstrate that single-cell glycan and RNA profiling enables unbiased identification of subpopulations and their surface glycan markers.
Differentiation-resistant NPCs may cause uncontrolled NPC expansion, reducing neuronal yield and increasing tumorigenicity risk in transplantation (Nori et al., 2015). Here, we found the differentiation-resistant NPC clusters in iPSC-derived neurons, which retained robust expression of stemness and proliferation marker genes and cancer/embryo-associated lncRNA. While their glycan profiles resembled those of imNeuron/mNeuron clusters, undiffNPCs showed notably high Lewis X expression. This epitope is also known to be enriched in neural stem/progenitor cells in the embryonic mouse brain ventricular zone (Kumar et al., 2013) and in human ESC-/iPSC-derived neural stem cells (Yuan et al., 2011). These findings support Lewis X as a potential glycan marker of NPC stemness.
Glycosyltransferase genes are often expressed at low levels and thus may not be consistently captured in scRNA-seq datasets with lower sequencing depth. In this study, we took advantage of the high sequencing depth afforded by plate-based scRNA-seq—compared to droplet-based methods—which enabled more reliable detection of glycosyltransferases to allow for a direct comparison between glycosyltransferase mRNA expression and lectin-binding signals at the single-cell level. For example, poly-LacNAc—the glycan epitope recognized by rLSLN and enriched in the MC cluster—showed a strong positive correlation with the transcript levels of its biosynthetic enzymes B4GALT1 and B3GNT2. This finding suggests that, at least for certain glycan structures, glycosyltransferase mRNA expression can be a major determinant of glycan abundance. In contrast, this relationship was not observed for the Lewis X epitope, which was enriched in undiffNPCs. Canonical α1,3-fucosyltransferases—enzymes typically responsible for Lewis X biosynthesis—did not show increased expression in undiffNPCs. Among the fucosyltransferase genes, FUT10 was the only enzyme that exhibited elevated gene expression. While it had previously been proposed as a Lewis X-synthesizing enzyme in the brain (Kumar et al., 2013), recent studies have reclassified FUT10 (along with FUT11) as protein O-fucosyltransferases rather than α1,3-fucosyltransferases (H. Hao et al., 2025). This discrepancy highlights that glycosyltransferase transcript levels do not always predict the levels of their associated glycan products. Indeed, cell-surface glycan expression is governed by a complex, multilayered regulatory network, including substrate availability, precursor supply, correct intracellular localization of enzymes, and the abundance of carrier glycoproteins at the plasma membrane. As a result, glycosyltransferase mRNA levels do not always correlate with the abundance of their associated glycan products. Therefore, while glycosyltransferase mRNA expression can contribute to glycan expression in certain cases, it is not sufficient to fully predict glycosylation profiles. In this regard, our lectin-based profiling approach offers a valuable and complementary strategy for characterizing glycosylation at the product level. By quantifying glycan epitopes directly through lectin-binding intensity, this method provides important insights into the glycome that cannot be inferred from transcriptomics alone.
A previous study reported that transplantation of iPSCs or fetal-derived NPCs into the brain or spinal cord causes histologically distinct tissues, differentiating neural tissue (DNT), undifferentiated neural tissue (UDNT), blastemal tissue (BLT), and benign mesenchymal tissue (Sugai et al., 2016). Interestingly, major component cells of these tissues seem to correspond with sub-populations in our neuron culture, such as mNeurons/imNeurons for DNT, undiffNPC for UDNT/BLT, and MC for benign mesenchymal tissue, suggesting in vitro differentiation mimics the differentiation after in vivo transplantation. Considering that immature undifferentiated tissues such as UDNT/BLT could contribute to the overgrowth of transplants and the transformation into embryonal tumors, evaluation of the emergence of undiffNPCs in vitro culture may reflect the risk of tumorigenicity of iPSC-derived NPCs. Although the detailed mechanism of the differentiation resistance of NPCs was still unclear, it has been reported that the overgrowth of NPC transplants is correlated with an increased number of copy-number variants and abnormal karyotypes of iPSCs (Sugai et al., 2016). Such genetic instability has occurred during the reprogramming process and long-term culture of iPSCs. In this study, iPSCs with passages 40–60 were used, which may have induced genetic instability and resulted in the appearance of undiffNPCs in neural cultures.
The appearance of mesenchymal cells has been reported in a variety of iPSC-derived neural culture systems such as 2D culture, transplanted NPCs, and cerebral organoids (Curchoe et al., 2010; Fair et al., 2020; Sugai et al., 2016). It has been believed that the mesenchymal cells originated from the contaminated NCCs (Colleoni et al., 2010; Curchoe et al., 2010; Sugai et al., 2016). Here, we found the contamination of mNPCs that retained both NPCs and mesenchymal features in iPSC-derived NPCs and neurons. Pseudotime analysis showed a continuous transition trajectory from nNPCs and mNPCs toward MCs, suggesting that mesenchymal characteristics may emerge progressively during differentiation. scGR-seq analysis revealed high expression of poly-LacNAc as a novel marker for mesenchymal subpopulations within the NPC/Neuron cultures. An increase in poly-LacNAc has been previously reported during the epithelial-mesenchymal transition (EMT) in cancer cells, suggesting that this glycan change is similarly associated with the EMT from neuroepithelial cells to mNPC/MC populations (Lucena et al., 2016). Consistent with our findings, Isoda et al. Recently reported the existence of an NCC-like NPC subpopulation in iPSC-derived NPCs, expressing both NPC markers (PSA-NCAM) and mesenchymal markers (CD73 and CD105) (Isoda et al., 2023). These NCC-like NPCs differ from our mNPCs by exclusively expressing the NCC marker SOX9 but share several key characteristics, including the expression of PDGFR and mesenchymal markers. Interestingly, it has been reported that PDGFD-PDGFRB is highly expressed in the germinal regions of the human dorsal neocortex but not in mice (Lui et al., 2014). Human-specific PDGFRB signaling contributes to the proliferation of neocortical radial glia (RGs) and may have played a role in the evolutionary expansion of the human neocortex. Although it is unclear whether these PDGFRB-positive RGs express other mesenchymal markers, mNPCs may reflect these human-specific RG subtypes. It will be necessary to investigate whether the acquisition of mesenchymal features of NPCs also occurred in CNS development or is artificial for in vitro differentiation of iPSC-derived NPCs.
This study has certain limitations. First, although lectin-based glycan profiling methods enable the characterization of glycan epitopes, they do not facilitate the identification of full glycan structures. Second, we utilized plate-based scGR-seq, which limits the number of analyzed cells to a few hundred, potentially leading to the omission of extremely rare cell populations due to the small sample size. However, given the inherent trade-off between sample size and sequencing depth, our analysis achieved an exceptionally high sequencing depth, with a median of 9,058,151 transcript reads per cell. Notably, we successfully detected glycosyltransferases, such as fucosyltransferases, which are typically challenging to identify in single-cell analysis due to their low expression levels. To compensate for the smaller number of cells and to enhance the robustness of our findings, we additionally validated key observations using complementary methods, including flow cytometry, fluorescence staining, and qPCR. These independent validation experiments confirmed the trends observed in the single-cell dataset, thereby strengthening the reliability of our conclusions. Recently, we developed a droplet-based scGR-seq platform capable of processing 10,000 cells simultaneously (Keisham et al., 2024). This method could be applied to investigate the rare cell populations present in iPSC-derived neuron cultures in future studies.
scRNA-seq enables the identification of cell types and their RNA markers in a non-invasive manner and has been widely used to identify markers in various cell subpopulations. While RNA analysis of membrane-protein-encoding genes has been employed to search for cell surface markers for detection and cell sorting, these markers cannot always be reliably identified through single-cell RNA analysis due to its generally low sequencing depth. Our study demonstrated that scGR-seq can be used to analyze cellular heterogenicity and determine the cell marker genes as well as the cell surface glycans. The integration of multiple modalities, glycans and RNA, can further enhance the precision of cell clustering. By conducting integrated analyses of glycan and RNA modules, the biological functions of RNA and glycans within each cell population can be inferred. Although several single-cell glycan analysis methods have been reported, scGR-seq is the only method capable of profiling diverse glycan antigens alongside RNA information (Dworkin et al., 2022; Kearney et al., 2021; Ma et al., 2022). scGR-seq can apply to any type of cells including iPSC-derived cells, organoids, tissue, and tumors. Identified cell surface glycan markers can be targeted by lectin or antibody to detect, isolate, or remove target cells. Therefore, the scGR-seq analysis provides the means to manipulate sub-populations in heterogenic cells, which will contribute to the control of the quality and safety of iPSC-derived cell products.
Materials and methods
scGR-seq of iPSCs, NPCs, and neurons
scGR-seq analysis was performed as described previously (Odaka et al., 2022). Details of the differentiation protocol from iPSCs to neurons are provided in the supplemental information. Briefly, iPSCs and NPCs were detached and dissociated for single cells with Accutase. For dissociation of neurons, Accutase supplemented with papain (50 units/mL, Worthington) and L-cysteine (0.25 mg/mL, Sigma-Aldrich) was applied; 1×10^5^ cells were incubated with 1% BSA-PBS containing DNA-oligonucleotide-conjugated lectin library (0.5 μg/mL) at 4°C for 1 h. The cells were washed with 1% BSA-PBS three times, and single cells were manually picked into a PCR tube with 10 μL distilled water using the TOPick I Live Cell Pick system (Yodaka Giken). The cells were irradiated at 365 nm, 15 W for 15 min using UVP Blak-Ray XX-15L UV Bench Lamp (Analytik Jena) and centrifuged at top speed for 30 s. Supernatants containing DNA barcodes were transferred to new PCR tube, and cells were lysed with lysis buffer of GenNextRamDA-seq Single Cell Kit (TOYOBO). DNA barcode samples were amplified by PCR with i5/i7 index primer and NEBNext Ultra II Q5 Master Mix (New England Biolabs) for 20 cycles. All PCR products were combined into one tube and purified by Agencourt AMPure XP (Beckman Coulter.). The quality of PCR products was confirmed by MultiNA (Shimadzu) and sequenced by the MiSeq sequencer (26 bp, paired-end) (Illumina). RNA library for scRNA-seq was prepared by GenNextRamDA-seq Single Cell Kit according to the manufacturer’s protocol. The quality of the RNA library was analyzed by MultiNA and sequenced by the Nova-Seq 6000 (151 bp, paired-end, NPCs, and iPSCs) and HiSeqX (151 bp, paired-end, neurons). Details of scGR-seq data processing are provided in the supplemental information.
Fluorescence staining
The cells were fixed with 4% paraformaldehyde at room temperature for 20 min. For intracellular antigen detection, the cells were incubated with 1% BSA-PBS containing 0.2% Triton X-100, whereas for extracellular antigen detection, the cells were incubated with 1% BSA-PBS at room temperature for 30 min. The cells were then stained with primary antibody or lectin at 4°C for overnight. After washing the cells three times with PBS, secondary antibodies were added as needed. Following another three washes with PBS, Hoechst33342 (1 μg/mL, 346–07951, Fujifilm Wako) was added and incubated at room temperature for an hour. After washing once with PBS, images were taken under an inverted fluorescence microscope (IX51, Olympus). The information on the antibodies used is provided in Table S11. Details of the quantitative analysis of fluorescence signals are provided in the Supporting Information.
Quantitative RT-PCR
Total RNA was extracted from the samples using the Rneasy Mini Kit (QIAGEN), following the manufacturer’s protocol. The extracted RNA was then converted into cDNA using the QuantiTect Reverse Transcription Kit (QIAGEN). qRT-PCR was performed using PowerUp SYBR Green Master Mix (Thermo Fisher) and CFX Connect (Bio-Rad). The mRNA expressions of the specified genes were normalized to the expression of the GAPDH gene. The primer sequence for qRT-PCR was described in Table S12.
Fluorescence-activated cell sorting
The NPCs were detached and dissociated into a single cell with Accutase. After washing the cells with 1% BSA/PBS, PE-labeled rLSLN (1/100) and APC-labeled anti-PDGFRB antibody (1/20, BioLegend, 323608) were added and incubated on ice for an hour. The cells were washed twice with 1% BSA/PBS and were analyzed using CytoFLEX (Beckman Coulter) for analysis or BD FACSAria III (BD biosciences) for cell sorting.
Statistics
Statistical analysis was performed using Seurat v4 for scGR-seq data and EZR on R commander version 1.41 for other analyses (Kanda, 2013). Unless otherwise specified, non-parametric Wilcoxon rank-sum tests and Spearman’s rank correlation tests were used. For multiple comparisons, p values were adjusted using the Benjamini-Hochberg correction. A p value threshold of <0.05 was considered statistically significant.
Resource availability
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Hiroaki Tateno ([email protected]).
Materials availability
This study did not generate new unique reagents.
Data and code availability
The scRNA-seq data have been deposited in Gene Expression Omnibus under accession number [GSE304173](GSE304173) and are publicly available as of the date of publication. Raw data and images are available upon request to the corresponding author.
Acknowledgments
We thank Ms. Keiko Hiemori and Jinko Murakami at National Institute of Advanced Industrial Science and Technology (AIST) for their technical assistance in preparing DNA-barcoded lectins. This work was supported by 10.13039/501100001691JSPS KAKEN(22K15660) awarded for HO, 10.13039/501100001691JSPS KAKEN(23K26872, 23H04796), and JST A-step (JPMJTR23U6) awarded for HT.
Author contributions
H.O. designed the study, collected and assembled the data, performed the data analysis and interpretation, and wrote the manuscript. H.T. conceived and designed the study, assembled the data, carried out the data analysis and interpretation, and wrote the manuscript.
Declaration of interests
This paper was applied for a patent in Japan. The application number is “2024–015818.”
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Aboul-Soud M.A.M.Alzahrani A.J.Mahmoud A.Induced Pluripotent Stem Cells (i PS Cs)-Roles in Regenerative Therapies, Disease Modelling and Drug Screening Cells 102021231910.3390/cells 10092319 · doi ↗
- 2Alghazali R.Nugud A.El-Serafi A.Glycan Modifications as Regulators of Stem Cell Fate Biology 1320247610.3390/biology 13020076 · doi ↗
- 3Cao J.Spielmann M.Qiu X.Huang X.Ibrahim D.M.Hill A.J.Zhang F.Mundlos S.Christiansen L.Steemers F.J.The single-cell transcriptional landscape of mammalian organogenesis Nature 566201949650210.1038/s 41586-019-0969-x 30787437 PMC 6434952 · doi ↗ · pubmed ↗
- 4Čaval T.Alisson-Silva F.Schwarz F.Roles of glycosylation at the cancer cell surface: opportunities for large scale glycoproteomics Theranostics 1320232605261510.7150/thno.8176037215580 PMC 10196828 · doi ↗ · pubmed ↗
- 5Colleoni S.Galli C.Giannelli S.G.Armentero M.T.Blandini F.Broccoli V.Lazzari G.Long-term culture and differentiation of CNS precursors derived from anterior human neural rosettes following exposure to ventralizing factors Exp. Cell Res.31620101148115810.1016/j.yexcr.2010.02.01320171210 · doi ↗ · pubmed ↗
- 6Curchoe C.L.Maurer J.Mc Keown S.J.Cattarossi G.Cimadamore F.Nilbratt M.Snyder E.Y.Bronner-Fraser M.Terskikh A.V.Terskikh A.V.Early acquisition of neural crest competence during h ES Cs neuralization P Lo S One 52010 e 1389010.1371/journal.pone.0013890 · doi ↗
- 7de Luzy I.R.Law K.C.L.Moriarty N.Hunt C.P.J.Durnall J.C.Thompson L.H.Nagy A.Parish C.L.Parish C.L.Human stem cells harboring a suicide gene improve the safety and standardisation of neural transplants in Parkinsonian rats Nat. Commun.122021327510.1038/s 41467-021-23125-934045451 PMC 8160354 · doi ↗ · pubmed ↗
- 8Dworkin L.A.Clausen H.Joshi H.J.Applying transcriptomics to studyglycosylation at the cell type leveli Science 25202210441910.1016/j.isci.2022.104419 · doi ↗
