Computational Drug Repurposing for Alzheimer’s Disease via Sheaf Theoretic Population-Scale Analysis of snRNA-Seq Data
Sean Cottrell, Seungmin Yoon, Xiaoqi Wei, Alex Dickson, Guo-Wei Wei

TL;DR
This paper uses advanced computational methods to find new drugs for Alzheimer’s by analyzing large-scale brain cell data.
Contribution
A novel sheaf-theoretic approach is introduced for population-scale analysis of snRNA-seq data to repurpose drugs for Alzheimer’s.
Findings
Persistent Sheaf Laplacians were used to analyze PPI networks from AD-related gene expression data.
Machine learning models identified potential drug candidates with favorable CNS-related ADMET properties.
The study generated a list of molecular targets and pharmaceutical leads for Alzheimer’s treatment.
Abstract
Single-cell and single-nucleus RNA sequencing are used to reveal heterogeneity in cells, showing a growing potential for precision and personalized medicine. Nevertheless, sustainable drug discovery must be based on a population-level understanding of molecular mechanisms, which calls for a population-scale analysis of this data. This work introduces a sequential target-drug selection model for drug repurposing against Alzheimer’s Disease (AD) targets inferred from snRNA-seq data of AD progression- involving hundreds of thousands of nuclei from multipatient and multiregional studies. We utilize Persistent Sheaf Laplacians (PSL) to facilitate a Protein–Protein Interaction (PPI) analysis inferred from disease related differential gene expression (DEG). We then use an ensemble of machine learning models to predict repurpose-able compounds. We screen the efficacy of different small…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
1
2
3
4
5| gene | cell type | log2FC | p-value |
|---|---|---|---|
| ANK2 | B | 0.79, 0.81 | 2.9 × 10–4, 6.4 × 10–18 |
| ANK3 | B | 1.41, 1.61 | 1.5 × 10–5, 1.7 × 10–8 |
| ANKS1B | B | 0.79, 0.97 | 3.7 × 10–2, 3.7 × 10–4 |
| APOE | P | 0.72 | 1.5 × 10–3 |
| CADM2 | B | 1.57, 1.90 | 1.1 × 10–37, 1.8 × 10101101 |
| CDH2 | E | 0.55 | 4.4 × 10–20 |
| CNTN1 | P | 1.18 | 9.9 × 10–57 |
| CNTN6 | P | 1.32 | 1.1 × 10–20 |
| CNTNAP4 | E | 0.83 | 2.2 × 10–9 |
| CTNNA3 | B | 1.28, 1.40 | 2.3 × 10–18, 3.5 × 10–55 |
| CTNND2 | B | 0.74, 0.92 | 6.1 × 10–13, 1.4 × 10–17 |
| DLG2 | B | 0.39, 0.61 | 4.9 × 10–11, 2.6 × 10–19 |
| ERBB4 | B | 0.54, 0.66 | 2.6 × 10–7, 2.9 × 10–7 |
| FHIT | P | 0.36 | 1.3 × 10–4 |
| GPM6A | B | 0.85, 0.97 | 1.2 × 10–14, 5.2 × 10–20 |
| GRID2 | P | 1.78 | 1.1 × 10–192 |
| LRP1B | E | 0.43 | 2.7 × 10–7 |
| LSAMP | B | 1.22, 1.38 | 1.9 × 10–8, 6.3 × 10–43 |
| NKAIN2 | B | 1.09, 1.17 | 2.9 × 10–21, 8.2 × 10–34 |
| NRXN3 | B | 1.00, 1.04 | 4.6 × 10–7, 9.6 × 10–7 |
| OPCML | P | 1.25 | 2.0 × 10–15 |
| PCDH9 | B | 1.44, 1.25 | 5.0 × 10–82, 8.6 × 10–118 |
| PDE4B | B | 1.13, 0.41 | 9.5 × 10–15, 8.3 × 10–8 |
| PRKACB | P | 0.31 | 7.9 × 10–3 |
| PTPRD | B | 1.28, 1.09 | 1.6 × 10–9, 3.4 × 10–15 |
| RBFOX1 | E | 0.80 | 6.4 × 10–2 |
| SLC24A2 | B | 1.68, 1.63 | 3.8 × 10–9, 4.8 × 10–12 |
| SLIT2 | P | 0.50 | 4.2 × 10–4 |
| SPOCK3 | B | 0.97, 1.13 | 3.3 × 10–65, 3.4 × 10–6 |
| WWOX | B | 0.33, 0.58 | 5.4e-10,6.8 × 10–52 |
| gene | log2FC |
|
|---|---|---|
| APOE | 1.08 | 8.1 × 10–165 |
| MERTK | 0.93 | 2.9 × 10–138 |
| FOXO3 | 0.96 | 2.3 × 10–81 |
| CEBPD | 1.15 | 5.4 × 10–78 |
| RGS1 | 1.23 | 4.4 × 10–59 |
| TGFBI | 3.83 | 1.8 × 10–53 |
| IL15 | 1.38 | 6.6 × 10–42 |
| HLA-DRA | 0.84 | 1.8 × 10–40 |
| B2M | 0.48 | 2.3 × 10–24 |
| PTK2B | 0.73 | 5.4 × 10–16 |
| CXCR4 | 4.18 | 1.1 × 10–13 |
| DUSP1 | 0.87 | 1.5 × 10–11 |
| GPNMB | 3.77 | 2.4 × 10–11 |
| HIF1A | 0.23 | 1.8 × 10–7 |
| PPARG | 0.56 | 2.2 × 10–7 |
| SGK1 | 0.49 | 1.9 × 10–6 |
| TBL1X | 0.68 | 7.9 × 10–6 |
| IGF1R | 0.63 | 1.4 × 10–5 |
| HLA-DRB1 | 0.84 | 2.7 × 10–5 |
| CXCL12 | 1.96 | 6.0 × 10–5 |
| DrugBank ID | drug name | Pred. Δ | included in training set |
|---|---|---|---|
| DB09053 | Ibrutinib | –13.00 | Yes |
| DB08916 | Afatinib | –12.55 | Yes |
| DB15035 | Zanubrutinib | –12.41 | Yes |
| DB05944 | Varlitinib | –11.77 | Yes |
| DB12558 | AEE-788 | –11.74 | Yes |
| DB08564 | –11.48 | No | |
| DB13002 | HKI-357 | –11.39 | No |
| DB05424 | Canertinib | –11.37 | Yes |
| DB07662 | PD-168393 | –11.28 | Yes |
| DB11703 | Acalabrutinib | –10.98 | Yes |
| DrugBank ID | drug name | Pred. Δ | BBB | pgpsub | pgpinh | included in training set |
|---|---|---|---|---|---|---|
| DB12016 | Ponesimod | –9.62 | 0.95 | 0.00 | 0.14 | No |
| DB13643 | Loprazolam | –9.61 | 1.00 | 0.00 | 0.07 | No |
| DB11805 | Saracatinib | –9.49 | 0.96 | 0.00 | 0.45 | No |
| DB01222 | Budesonide | –9.49 | 1.00 | 0.02 | 0.05 | No |
| DB15489 | Mexazolam | –9.34 | 1.00 | 0.04 | 0.45 | No |
| DB05410 | NCX 1022 | –9.33 | 0.81 | 0.00 | 0.04 | No |
| DB07117 | –9.31 | 1.00 | 0.01 | 0.00 | No | |
| DB07249 | –9.31 | 0.96 | 0.00 | 0.04 | No | |
| DB00248 | Cabergoline | –9.29 | 1.00 | 0.00 | 0.01 | No |
| DB09295 | Talniflumate | –9.29 | 0.96 | 0.00 | 0.00 | No |
| DB09383 | Meprednisone | –9.28 | 0.82 | 0.00 | 0.46 | No |
| DB13208 | Prednylidene | –9.25 | 0.79 | 0.01 | 0.05 | No |
| DB12549 | Pyrazoloacridine | –9.21 | 0.99 | 0.07 | 0.15 | No |
| DrugBank ID | drug name | Pred. Δ | included in training set |
|---|---|---|---|
| DB11894 | Efatutazone | –13.72 | Yes |
| DB15242 | MK-0533 | –12.17 | Yes |
| DB09198 | Lobeglitazone | –11.91 | Yes |
| DB07863 | –11.77 | Yes | |
| DB12511 | Imiglitazar | –11.72 | Yes |
| DB09200 | Rivoglitazone | –10.71 | Yes |
| DB08915 | Aleglitazar | –10.47 | Yes |
| DB06519 | Edaglitazone | –10.40 | No |
| DB06510 | Muraglitazar | –10.28 | Yes |
| DB01941 | LG-100268 | –10.25 | Yes |
| DrugBank ID | drug name | Pred. Δ | BBB | pgpsub | pgpinh | included in training set |
|---|---|---|---|---|---|---|
| DB07242 | –9.17 | 1.00 | 0.02 | 0.39 | No | |
| DB00248 | Cabergoline | –9.17 | 1.00 | 0.00 | 0.01 | No |
| DB01218 | Halofantrine | –9.09 | 0.85 | 0.04 | 0.15 | No |
| DB02132 | Zenarestat | –9.07 | 0.72 | 0.00 | 0.00 | No |
| DB15367 | LY-2623091 | –8.99 | 1.00 | 0.01 | 0.21 | No |
| DB07270 | –8.95 | 0.71 | 0.00 | 0.16 | No | |
| DB11794 | Berzosertib | –8.95 | 0.86 | 0.00 | 0.04 | No |
| DB13937 | LGD-3303 | –8.91 | 0.90 | 0.00 | 0.02 | No |
| DB07218 | –8.88 | 0.94 | 0.01 | 0.28 | No | |
| DB01611 | Hydroxychloroquine | –8.87 | 0.87 | 0.18 | 0.31 | No |
| DB14970 | Alobresib | –8.82 | 0.99 | 0.00 | 0.00 | No |
| DrugBank ID | drug name | Pred. Δ | included in training set |
|---|---|---|---|
| DB05501 | Mavorixafor | –10.63 | Yes |
| DB12715 | MSX-122 | –10.61 | No |
| DB07809 | –10.29 | No | |
| DB16652 | Avotaciclib | –9.84 | No |
| DB11970 | Burixafor | –9.74 | No |
| DB12076 | Surotomycin | –9.72 | No |
| DB08441 | –9.71 | No | |
| DB02022 | –9.70 | No | |
| DB11830 | Mocetinostat | –9.70 | No |
| DB12027 | Serdemetan | –9.68 | No |
| DrugBank ID | drug name | Pred. Δ | BBB | pgpsub | pgpinh | included in training set |
|---|---|---|---|---|---|---|
| DB02022 | Toxopyrimidine | –9.7 | 0.92 | 0.19 | 0.00 | No |
| DB13414 | Fenyramidol | –9.61 | 0.83 | 0.07 | 0.01 | No |
| DB12740 | CC-115 | –9.35 | 0.98 | 0.02 | 0.01 | No |
| DB12485 | Pimonidazole | –9.34 | 0.78 | 0.02 | 0.00 | No |
| DB13069 | Nimustine | –9.3 | 0.96 | 0.02 | 0.01 | No |
| DB12522 | Toreforant | –9.28 | 0.98 | 0.00 | 0.10 | No |
| DB07244 | –9.28 | 0.94 | 0.08 | 0.10 | No | |
| DB12177 | Eplivanserin | –9.26 | 0.76 | 0.00 | 0.10 | No |
| DB08707 | –9.24 | 1.00 | 0.00 | 0.13 | No | |
| DB16954 | Ezeprogind | –9.24 | 1.00 | 0.20 | 0.18 | No |
| DB15091 | Upadacitinib | –9.23 | 0.79 | 0.14 | 0.18 | No |
| DrugBank ID | drug name | Pred. Δ | included in training set |
|---|---|---|---|
| DB02505 | –10.13 | No | |
| DB16135 | Desidustat | –10.09 | No |
| DB15642 | Molidustat | –10.04 | Yes |
| DB15039 | ITI-214 | –9.63 | No |
| DB06497 | Aplaviroc | –9.59 | No |
| DB11682 | Daprodustat | –10.02 | Yes |
| DB15670 | Gusacitinib | –9.54 | No |
| DB17456 | Dalpiciclib | –9.47 | No |
| DB12690 | LY-2584702 | –9.46 | No |
| DB16080 | Acolbifene | –9.37 | No |
| DrugBank ID | drug name | Pred. Δ | BBB | pgpsub | pgpinh | included in training set |
|---|---|---|---|---|---|---|
| DB08149 | –8.81 | 1.00 | 0.11 | 0.45 | No | |
| DB07606 | –8.73 | 0.89 | 0.02 | 0.03 | No | |
| DB07227 | L-778123 | –8.72 | 0.98 | 0.03 | 0.34 | No |
- —National Science Foundation10.13039/100000001
- —National Science Foundation10.13039/100000001
- —National Institutes of Health10.13039/100000002
- —National Institutes of Health10.13039/100000002
- —Bristol-Myers Squibb10.13039/100002491
- —Michigan State University Research Foundation10.13039/100016254
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSingle-cell and spatial transcriptomics · Bioinformatics and Genomic Networks · Alzheimer's disease research and treatments
Introduction
1
As of 2024, an estimated 6.9 million Americans over the age of 65 are living with Alzheimer’s Disease (AD), and it consistently ranks among the top five most common causes of death for elderly individuals in the US. As life expectancy continues to rise, the incidence of age-related diseases also increases, and AD is projected to constitute a serious global health crisis by midcentury. Much like cancer development, Alzheimer’s disease can be defined as a continuum of clinical and pathological events from normal aging to dementia- hitting primarily on neuro-inflammation, blood brain barrier disruption, and the accumulation of plaques. Detecting the pre-dementia stages of this continuum are notoriously difficult, especially so given the heterogeneity of AD- and so new diagnostic/treatment frameworks relying on the identification of biomarkers have been introduced and are gaining increasing attention in the medical community.?
Single Cell and Single Nucleus RNA Sequencing have emerged in recent years as part of a new era in understanding disease pathology, drug discovery, and drug development. Single Cell RNA-seq is widely used to reveal heterogeneity in cells, which has given us insights into cell–cell communication, cell differentiation, and differential gene expression. Many tissues (e.g., the adult brain) are challenging to dissociate without altering transcriptional programs or losing fragile cell types. Single-nucleus RNA-seq (snRNA-seq) complements single-cell RNA-seq to this end by profiling RNA from isolated nuclei rather than whole, live cells. Differential gene expression (DEG) analysis is a crucial tool in molecular biology and genomics. In particular, the analysis of interactomic networks constructed from these differentially expressed genes has had tremendous success in providing insights into a variety of biological and medicinal problems. Liang et al. applied Weighted Gene Coexpression Network Analysis (WGCNA) to the identification of key genes regulating Alzheimer’s.? While notable, such methods are traditionally limited in handling only the low-dimensional, or pairwise, relations in the gene coexpression networks, which may result in an incomplete understanding of the complex, high-dimensional, nonlinear nature of gene–gene interactions.
Du et al. proposed a novel topological differentiation technique to identify key genes from a protein–protein interaction network derived from DEGs relating to opioid and cocaine addiction.? This topological differentiation technique serves as an example of the valuable perspectives offered by topological data analysis (TDA) to the study of biological networks through the application of persistent spectral theory, i.e., persistent topological Laplacians.? The first persistent topological Laplacian was introduced as a new generation of TDA models in 2019.? It has stimulated a variety of theoretical developments and led to remarkable applications as shown in a recent survey.? The power of persistent topological Laplacians is augmented by topological deep learning (TDL), a new learning paradigm introduced for the first time in 2017.? The reader is referred to reviews on recent advances in TDA? and TDL.? More recently, Wei and Wei have introduced the Persistent Sheaf Laplacian (PSL) for cellular sheaves, and described how to construct sheaves for a point cloud where each point is associated with a quantity that can be devised to embed some physical property, or similarly, to emphasize a node in its network.? While the traditional Persistent Laplacian provides only a global topological description, the PSL is a local method, making it better suited to topological perturbations by analyzing the topology of the gene network with respect to a specific gene. In our context, we note that raw differential expression magnitudes may emphasize genes that exhibit high fold-changes but do not significantly interact with other genes or connect important modules. Meanwhile, traditional topological Laplacians may emphasize genes which are structurally significant but not significantly dysregulated, and therefore are simply noise or post-transcriptionally regulated. We might assume that a hub which barely changes at the RNA level likely does not drive pathology. Sheaf Laplacians address this shortcoming through their ability to incorporate nongeometric information via cellular sheaves on the simplicial complex. Specifically, by labeling each node in the PPI complex by the magnitude of its dysregulation, we seamlessly unite both perspectives: scale of dysregulation and PPI complex centrality.
Additionally, previous studies utilizing the persistent Laplacian are limited by their analysis of only single sample data rather than full populational or atlas studies. Although scRNA-seq and omics data offer a revolutionary opportunity for patient-specific and personalized medicine, they do not translate directly to personalized drug development, as this is not economically sustainable on an industrial scale. Personalized medicine can be achieved with a specific selection or combination of drugs based on an individual’s genetic biomarkers, but efficient drug development must be based on verifiable population-level molecular mechanisms. Therefore, it is necessary to identify commonly occurring targets on the population-level of scRNA-seq and/or spatial transcriptomic data for drug discovery.
Our goal is then to carry out a computational drug repurposing for AD targets identified via a population-scale snRNA-seq data analysis using the PSL. Our work proceeds as follows. We obtained the GSE163577 data set from the Gene Expression Omnibus database, providing single-cell profiling of the hippocampus and superior frontal cortex from 25 hippocampus and cortex samples across 17 control and 8 Alzheimer’s disease patients. These data were used to profile the major vascular and perivascular cell types of the human brain through 143,793 single-nucleus transcriptomes, enabling us to study cell-type-specific differential expression in the brain vasculature- a site of critical importance for the pathogenesis of neurodegeneration.? In addition, we obtained 194,000 single nucleus microglial transcriptomes from CellBrowser. Each sample was segmented into various transcriptional states during AD progression from over 400 human subjects.? This enables us to study the genetic heterogeneity in microglia cells during the temporal progression of neurodegeneration which drives neuro-inflammation trends as well as plaque clearance. In this study, we extracted a total of 50 potential gene targets. These topologically significant genes are then analyzed in the wider context of their surrounding PPI modules, providing coherent and targeted insight into pathology inducing pathways and machinery using gene set enrichment analysis. A small subset of the genes which corresponded to convergent disease inducing pathways across multiple lenses, and that have sufficient publicly available binding affinity data, are then targeted for computational drug repurposing efforts using DrugBank small compounds. The resulting compounds are then further screened with Central Nervous System (CNS) related ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) analysis.
Results
2
In this section we walk through the results of the Persisent Sheaf Laplacian analysis of DEG derived PPI networks for cells taken either from a Human Brain Vascular Atlas or a Dynamic State Evolution Study of Microglia cells. The analysis was carried out by first obtaining a list of the top 200 differentially up-regulated genes within a set of cells of interest referenced against either control/homeostatic cells or cells sequenced earlier in the progression of neuroedegeneration. These DEGs were then used to construct cell type/state specific PPI networks and corresponding clique complexes, on which we could identify topologically significant genes with respect to each genes’ respective magnitude of dysregulation via the cellular sheaf framework (Figure). In this way, we theorize that pathology driving genes unite both topological significance with the nongeometric scale of their up-regulation. The top genes were chosen by intersecting the top 25 topological scores over each scale of filtration. Filtration steps were induced by varying the confidence of the PPI relations with thresholds of [250,400,550,700], which ranges from a very loose certainty/strength of interaction to a strong, experimentally validated certainty.
Schematic overview of our PSL analysis of PPI networks. Gene expression measurements are taken from various healthy and AD diseased cells and differentially expressed genes are extracted for different groups. These DEGs are used to construct PPI networks from the STRING database. The PPI networks undergo a clique expansion to construct a higher dimensional simplicial complex and each gene (0-simplex) is assigned a labeling reflecting its relative dysregulation magnitude in the group of interest. Gene specific topological perturbations and a filtration are then performed and the spectra of Persistent Sheaf Laplacians are analyzed to rank the genes by their significance. These significant genes are then targeted for further analysis via gene set enrichment and finally targeted drug repurposing.
Disease Associated Expression Heterogeneity
in Brain Vasculature
2.1
Dysfunction of the brain vasculature is a known upstream driver, amplifier, and monitorable marker of Alzheimer’s pathology? (Figurea). Alzheimer’s disease pathology generally begins years or decades before the onset of noticeable clinical symptoms, and one of its earliest hallmarks is dysfunction of the blood–brain barrier characterized by increased permeability or leakage. Previous studies have indicated the involvement of several key signaling pathways, including inflammatory cytokine cascades, oxidative stress in endothelial cells, and pericyte dysfunction. These processes not only disrupt BBB integrity but also promote neuronal injury, driving neurodegeneration. Yang et al. developed a molecular map of the human brain vasculature by profiling the major vascular and perivascular cell types of the human brain through 143,793 single-nucleus transcriptomes taken from both diseased and control subjects? (Figureb). Specifically, their study examined 17 post-mortem individuals (9 affected by Alzheimer’s disease, 8 age-matched subjects with no cognitive impairment). They looked at 25 fresh-frozen tissue blocks: 17 hippocampus and 8 superior frontal cortex samples. They yieled 143,793 high-quality nuclei captured with VINE-seq after QC in total. By including both AD and healthy controls the atlas captures early and late vascular transcriptional changes linked to disease state, enabling a comprehensive DEG analysis in highly relevant cell types- endothelial capillary and Pericyte cells. Persistent Sheaf Laplacian analysis demonstrates an impressive ability to derive meaningful biological insights from this interpatient variation, which contrasts with previous Persistent Laplacian studies that were limited to single sample and more highly focused studies.?
(a) Depiction of the human brain vasculature structure and the place of capillary endothelial cells and perictye cells in that structure. These cells play a primary role in persistent BBB breakdown in AD which is a significant pathological driver. (b) UMAP plots of the human brain vasculature atlas. On the left cells are colored by their disease status and on the right cells are colored by their cell type. (c) UMAP plots of diseased versus healthy endothelial and pericyte cells, as well as the respective PPI complexes of their up-regulated DEGs with the top genes in terms of log-fold changes labeled. (d) Focused analysis of PPI structures surrounding our top scoring topologically significant genes from the PSL analysis. Cliques and connected components containing topological genes from endothelial and pericyte cells are extracted and their structure reveals a potential cross-compartment signaling cascade between protective adhesion modules and MMP inducing CA2+/cAMP/PKA machinery. (e) The topological genes and their surrounding PPI structures were then used to conduct a pathway analysis via the GO databases in GSEApy, revealing coherent and coordinated pathways driving different aspects of BBB leakage pathology. A schematic diagram is also presented illustrating the proposed ERBB4-DLG2 mechanistic bridge.
Endothelial cells line the brain’s capillaries and normally maintain barrier integrity. Alzheimer’s-disease (AD) affected brains display altered gene-expression signatures in these cells, mirroring early blood–brain-barrier (BBB) leakage and neuro-inflammatory feed-back loops? (Figurea). Early endothelial injury allows plasma proteins to enter the brain, accelerates microglial activation, and propagates synaptic loss and further damage to the BBB. Pericyte cells, meanwhile, wrap the brain capillaries and instruct endothelial cells to adopt BBB programs. Our Persistent Sheaf Laplacian enabled differentiation tool allowed us to extract 20 topologically significant genes (Table) from the differentially expressed gene (DEG) set of both endothelial and pericyte cells for focused analysis. These genes, together with their positions in the protein-interaction network and enriched pathways, are visualized in Figurec.
1: Topologically Significant Genes in AD-Affected Vascular Cells
In AD-affected brains, the endothelial pathology is heterogeneous, so the DEG list is broad. Examining how our significant genes assemble into subgraphs and cliques allows us to yield more coherent pathway themes for analysis. For example, the functional relation between CDH2, CTNNA3, CTNND2, WWOX, DLG2, ERBB4, and NRXN3 is evident from their forming the largest connected component of significant genes in the PPI network (Figured). Pathway enrichment analysis via the GO Cellular Component and Molecular Function databases points to coordinated dysfunction of junctional integrity and signaling at the BBB, consistent with compensatory up-regulation during early damage (Figure). Several genes relate to cell–cell adhesion pathways: CDH2 and its catenin partners CTNNA3 and CTNND2 mediate tight contacts between endothelial cells and pericytes, preserving barrier integrity. ?,? Loss of this adhesion likely weakens the BBB as pericytes detach. DLG2 carries a guanylate-kinase–like domain that provides a docking platform; MAGUK scaffolds organize adherens junctions and connect them to kinases such as ERBB4. Because both DLG2 and ERBB4 are annotated to the basolateral plasma membrane and MAGUKs recruit polarity complexes, this interaction places ERBB4 on the abluminal face that meets the basement membrane and pericytes. ?,?,? The relation therefore merits functional testing, as it touches both cadherin belts and basolateral polarity, which are essential in BBB integrity (Figuree).
We next examined the pericytes which envelop the basement membrane. We again identified 20 significant genes, nine of which overlap with the endothelial list, possibly suggesting a shared molecular backbone across the two vascular compartments (Table). It is interesting to consider the context of these genes in the wider set of Pericyte DEGs. Extracting the PPI connected component which contains the greatest number of our significant genes yields the third subgraph in Figure. Here, the ERBB4/DLG2 pair again appears, now linking a Ca^2+^/cAMP-rich module (PDE4B, SLC8A1, AKAP6, FHIT, RUNX2, WWOX, PRKACB) with neuronal-adhesion genes (NRXN3, CNTNAP2, NLGN1, SLIT2/ROBO1, etc.). This may suggest an abluminal handshake in which the same scaffold couples adhesion–polarity machinery to leakage-promoting signaling on both sides of the basement membrane. Together, the two cell-type significant gene subgraphs in Figured along with the corresponding enrichment analysis of these subgraphs in Figuree seemingly converge on a model where an ERBB4-anchored MAGUK node links cadherin-belt repair in endothelial cells to chronic matrix-remodelling signals in pericyte cells via a PKA/Ca^2+^/cAMP to MMP pipeline, potentially turning an initially protective instruction into a self-perpetuating leakage loop (Figuree). ?,? While some prior studies have reported barrier-protective effects of ERBB4 activation, such structure raises the alternative hypothesis that maladaptive, context-specific ERBB4 signaling may instead exacerbate chronic BBB permeability in AD. As of yet, such a hypothesis remains untested and unvalidated. Further experimental work would be needed to better understand and validate or invalidate these claims.
The formation of cliques among the topologically significant genes in endothelial cells grants additional insight. The earliest four-node clique to appear during simultaneous topological score and PPI edge filtration comprises NRXN3, CADM2, GPM6A, and RBFOX1- a module of neuronal-adhesion genes. Brain capillary endothelial cells can express synaptic adhesion molecules. ?,? It has previously been established that deletion of the Pcdh family member PCDH9 in microvascular endothelium increases paracellular permeability.? CNTNAP4, another synaptic adhesion molecule detected here, shows measurable endothelial expression.? Such maladaptive expression may be the result of ambient noise, or rather, may indicate an aberrant attempt at barrier remodelling during leakage. Most notably, in the pericyte analysis this neuronal-adhesion program may couple to the Ca^2+^/PKA/MMP axis through the DLG2/ERBB4 bridge, potentially worsening leakage while repair genes remain chronically elevated.
Several significant genes also relate to cell stress and amyloid-beta handling. LRP1B is a lipoprotein-receptor relative of LRP1, which ferries amyloid-beta out of the brain; its dysregulation may impair clearance.? PTPRD and WWOX are neurodegeneration-linked genes associated with stress signaling. ?,? LSAMP has appeared in AD coexpression modules, though its mechanistic role is not fully understood. Taken together, these findings underscore the popular notion of BBB disruption as a central pathological driver in AD and highlight the ERBB4/DLG2 node as a potentially promising and under-explored hypothesis for BBB leakage pathology, and thus warrants additional focus.
Temporally Driven Expression Heterogeneity
in Disease Associated Microglia
2.2
The dynamic evolution of microglial states is a major driver of neuroinflammation and subsequent neurodegeneration as well as plaque clearance in AD. Sun et al. presented 194,000 single-nucleus microglial transcriptomes across 443 human subjects and diverse AD pathological phenotypes.? They annotated a series of temporally developing cell states during AD progression, enabling researchers to study differential gene expression trends among different stages of neurodegeneration. The data also exhibits technical heterogeneity with libraries spanning 10× v2 and v3 chemistries, and has been batch corrected via Harmony. The ability for our PSL analysis to extract insights in spite of this technical heterogeneity is also in contrast to previous Persistent Laplacian based studies.? This data represents one of the largest, modality-matched censuses of human microglia cells to date, with enough subjects to dissect inter-individual and stage-specific expression programs.
As AD progresses, microglia become increasingly triggered by signals from plaque buildup and enter a phagocytic stage. At this point the hallmark genetic profiling of microglia cells begins to reflect significant upregulation of pathways for debris clearance, antigen presentation, and immune recruitment. In Table we list 20 genes identified as being significant in our Persistent Sheaf Laplacian analysis of the PPI complex comprised of DEGs relative to an initial inflammatory state. These genes, along with their positions in the overall PPI network and topological significance scores, are shown in Figure.
2: Significant Genes in Early-Inflammatory vs Phagocytic Microglia
(a) UMAP plot of the microglial state transitionary cells taken from Sun et al. Cells are colored by their activation status. (b) Schematic overview of the interactions between disease associated microglia cells and the BBB in AD. (c) UMAP plots of the early Inflammatory and Phagocytic microglia cells, with up-regulated DEGs in the phagocytic state being used to construct a PPI complex. Top dysregulated genes in terms of logFC are labeled. (d) Focused analysis of PPI structures surrounding our top scoring topologically significant genes from the PSL analysis. The largest connected component made up of topological genes is extracted and its structure reveals four primary axes joined by a central HIF1A/PPARG/APOE backbone. (e) The topological genes and their surrounding PPI structures were then used to conduct a pathway analysis via the GO databases in GSEApy, revealing coherent and coordinated pathways driving different aspects of phagocytosis as well as hypoxia, antigen presentation, and outside immune recruitment.
We are interested in the positioning of these genes in the overall context of other up-regulated genes. If we extract the connected component containing the largest number of our significant genes, we obtain the network configuration in Figure. This resolves into four functional gene modules that converge on an HIF1A–PPARG–APOE backbone. All three genes are significantly up-regulated and achieve high mean topological scores across PSL filtrations (Table). The red module (MSR1, CD163, LYVE1, etc.) GO pathway analysis demonstrates enrichment of amyloid-β binding and endocytic-vesicle membrane, indicating receptor-mediated tethering and phagocytosis of plaque material.? Plaque-associated hypoxia stabilizes the HIF1A gene; its interaction with the p300/CBP complex and with transcriptional Pol-II promoters is reflected in the orange enrichment terms histone acetyl-lysine binding and RNA-polymerase II cis-regulatory DNA binding. ?,? In this way, the genome becomes more receptive to transcription via histone modifications and increased polymerase binding affinity. Overall, the orange module, composed chiefly of PPARG, FOXO3, SGK1, ZBTB16, HIF1A and NR3C2, encodes the metabolic and transcriptional circuitry that equips microglia cells to increase transcription rates and to digest the plaque cargo internalized by the red cluster. Its leading pathway, arachidonic-acid binding, singles out PPARG as a long-chain fatty-acid sensor that switches on lipid-oxidation and cholesterol-efflux programs.? Chloride-channel regulatory activity is driven by SGK1, an Akt-responsive kinase that preserves ionic balance and prevents lysosomal rupture during intense lipid catabolism.? The recurring chromatin terms, histone acetyl-lysine binding (HIF1A), transcription-core binding (ZBTB16) and multiple DNA-binding categories listing PPARG, FOXO3, ZBTB16, capture a coordinated transcriptional wave in which FOXO3 supplies antioxidant and autophagy genes, ZBTB16 remodels lipid-responsive promoters, and PPARG transactivates APOE. ?,?,? By linking hypoxia-stabilized HIF1A to PPARG-driven lipid detoxification, and by generating APOE particles that dock directly onto the red module’s plaque-handling receptors, the orange module serves as the metabolic license that allows microglia to clear Aβ while attempting to avoid self-inflicted lipid and oxidative stress. Critically, the HIF1A–PPARG backbone is at the core of this interacting machinery.
The blue module reflects the interplay between HLA-DRA, HLA-DRB1, and B2M. Their top enrichment terms, MHC-protein complex and clathrin-mediated endocytosis, mark the assembly and surface trafficking of both MHC-II and MHC-I peptide complexes. This configuration signals that microglia have progressed beyond mere phagocytosis and are now poised to additionally activate T cells.? IL-15, while not included in this module, was identified as topologically significant and likely sits at the fringe of this module as it is a cytokine that supports survival of T cells. Microglia producing IL-15 could attract peripheral immune cells to enter the brain in AD, or activate resident T cells, linking innate and adaptive immunity in neurodegeneration.? Similarly, CEBPD is a transcription factor that is upregulated by inflammatory stimuli and drives expression of cytokines and complement components. Its presence indicates a feed-forward loop reinforcing the microglial inflammatory phenotype and outside immune recruitment.? RGS1, meanwhile, is known to modulate chemokine receptor signaling in immune cells, likely tuning the intensity of this antigen response.? The core of the purple gene module revolves around the chemokine pair CXCL12–CXCR4 and is highlighted in the pathway analysis for “C-X-C chemokine-receptor activity” and “VEGF signalling”. CXCL12 gradients, up-regulated in hypoxic plaque niches, guide CXCR4-expressing leukocytes, including T cells primed by IL-15, toward these lesions.? Together, the two modules form a coordinated gateway: the purple arm orchestrates chemotactic recruitment, and the blue arm presents plaque-derived peptides to arriving T cells, linking innate microglial activation with an adaptive immune response that can propagate neuroinflammation in Alzheimer’s disease.
Molecular Targeting Assessment
2.3
Having studied the topologically significant genes as well as their positioning in the larger PPI complex and its corresponding enriched pathways for each of our cell groups, we can arrive at some convergent conclusions with regards to meaningful hypotheses and therapeutic targets. In both sets of vascular cells we found genes involved in the breakdown of the Blood Brain Barrier, particularly due to loss of cell adhesion and endothelial-pericytes contact causing increased barrier permeability. Interestingly, examining the positional context of the topologically significant genes in both cells revealed the placement of ERBB4 at a putative cross-compartment hub. Its C-terminal PDZ-binding motif can dock onto MAGUK scaffolds such as DLG2, which in turn link adherens–junction complexes to intracellular kinases. We therefore hypothesize that ERBB4-DLG2 assemblies couple barrier-repair signaling in endothelial cells to Ca^2+^/cAMP–PKA-driven MMP activity in pericytes, potentially converting an initially protective barrier maintenance response into machinery that drives chronic leakage. Targeted (either inhibitory or activating) small-molecule modulation of ERBB4 becomes an attractive line of investigation, especially if future experiments were to establish a truly causal role for this axis in BBB permeability. Importantly, this validates the potential for a PSL based analysis of biological networks to present novel and testable biological hypotheses.
From our analysis of inflammatory to phagocytic microglia transition, several more genes and pathways emerge as promising targets for drug repurposing. PPARG was identified alongside HIF1A as two of the central players uniting multiple pathology related machineries involving inflammation, plaque clearance, and hypoxia response. PPARG agonists have been tested in AD with some evidence that they can suppress microglial activation and also enhance the efficiency of Aβ clearance.? In one of the PPI modules, the CXCR4-CXCL12 axis was positioned centrally, and pathway enrichment analysis revealed it as a primary driver of the inflammatory state caused by additional outside immune recruitment. CXCR4 antagonists might tamp down the deleterious pathogenic immune cell recruitment and microglial positioning. Finally, the central role of HIF1A in multiple modules highlights the importance of plaque induced hypoxia in AD affected microenvironments. It has been proposed that inhibition of Prolyl-hydroxylase (EGLN) is a potentially effective approach to enhancing the protective activity of HIF1A.? Together, these targets represent a multipronged approach to reinforcing the BBB, quelling microglial and immune cell overactivation, boosting clearance of toxic proteins, and improving the brain’s hypoxic response. Furthermore, this demonstrates the capacity for PSL based analysis of biological networks to fully recover the hallmark axes of disease associated microglia function from otherwise noisy differential expression data.
Drug Repurposing Targets
2.4
In this section, we pivot to computationally evaluating drug repurposing candidates for our selected targets. Having assessed the therapeutic potential of our topologically significant genes, as well as the availability of public binding affinity data, we settled on focusing the remainder of our analysis on the following genes/pathways (see Figure).
(a) (Top) Distributions for ground truth binding affinity scores for each target taken from ChEMBL. (Bottom) Distributions for predicted binding affinity scores for DrugBank small compounds. Predictions were made from a consensus of two Gradient Boosted Decision Tree Regressors trained on a Transformer based and an ECFP based molecular representation of the ChEMBL binders. Colors are split by an effective binder threshold of −9.2 ΔG. (b) Plots demonstrating the performance of the prediction models on ChEMBL testing data for each target.
Inhibiting ERBB4
2.4.1
In Table, we list the top 10 predicted binders for ERBB4. Ibrutinib, developed primarily as a BTK inhibitor, exhibits nanomolar affinity for ERBB4 and has been shown to suppress cell growth and MEK and ERK signaling in cancer cell lines.? Afatinib irreversibly inhibits EGFR, HER2, and ERBB4, thus blocking ErbB receptor phosphorylation and downstream signaling.? Similarly, zanubrutinib and acalabrutinib have demonstrated off–target activity against the ErbB family and may modulate ERBB4 signaling.? Varlitinib is a reversible pan-HER inhibitor with nanomolar potency against ERBB4 and has shown efficacy in HER overexpressing biliary tract cancers.? Canertinib is an irreversible pan-ErbB tyrosine kinase inhibitor with confirmed activity against ERBB4.? AEE-788 is a dual EGFR/HER2 and VEGFR inhibitor developed to suppress tumor growth and angiogenesis; it also inhibits ERBB4 (HER4) with an IC_50_ of 0.16 μM, suggesting potential modulation of ERBB4 signaling.? PD-168393 is a covalent pan-ErbB inhibitor that potently targets ERBB4 and induces autophagy without apoptosis in MPNST cells, but enhances cytotoxicity when combined with lysosomal stress.? HKI-357 has shown potent inhibition of HER2 in vitro and in vivo, with sustained suppression of receptor phosphorylation.?
3: Top-Ranking DrugBank Compounds and Their Predicted Binding Affinities for ERBB4
In addition to the highest-scoring binders based solely on affinity prediction, we further identified 13 ERBB4-targeting compounds that also satisfied CNS-relevant criteria, including strong predicted binding affinity (ΔG pred < −9.2 kcal/mol) and high BBB permeability and probability of P-gp inhibition and substrate interaction. Representative compounds are shown below; the full set is provided in Table. DB12016 (Ponesimod), an antagonist of sphingosine-1-phosphate receptor 1 (S1PR1), has shown potential in AD models by modulating microglial activity, alleviating neuroinflammation, and enhancing amyloidβ clearance.? DB11805 (Saracatinib), is a Src/Abl kinase inhibitor for cancer therapy, has shown promise in AD by targeting Fyn kinase and improving cognitive outcomes at lower doses than used in cancer therapy.? DB01222 (Budesonide) is a synthetic, inhaled glucocorticoid that mitigates asthma-induced neuroinflammation, which may help reduce neuronal loss and suggests a neuroprotective potential.? DB15489 (Mexazolam) is an anxiolytic oxazolo-benzodiazepine whose active metabolite, chlornordiazepam, exhibits a pharmacodynamic profile that supports strong antianxiety effects with minimal sedation compared to other benzodiazepines.? DB05410 (NCX 1022) is a nitric oxide-releasing derivative of hydrocortisone that modulates early skin inflammation by regulating leukocyte recruitment, resulting in faster and more effective anti-inflammatory responses than hydrocortisone.? DB07249 is an anilinoquinazoline inhibitor that targets c-Src tyrosine kinase and exhibits strong antitumor activity in a rat xenograft model based on 3T3 cells transformed with Src.? DB00248 (Cabergoline), a dopamine receptor agonist, improved memory and reduced Aβ 42 and p-tau levels in an AD model. It exerted neuroprotective and autophagy-enhancing effects via AKT/mTOR, GLT-1/P38-MAPK, and ERK1/2 pathways.? DB09295 (Talniflumate) is an anti-inflammatory drug that exerts neuroprotective effects by inhibiting astrocytic ASCT2–NLRP3 interaction, thereby reducing neuroinflammation and preserving dopaminergic neurons in Parkinson’s disease models.?
4: Repurposing Candidates Targeting ERBB4 with Favorable CNS Properties and Strong Binding Affinities
PPARG Agonists
2.4.2
In Table we list the top 10 predicted binders for PPARG. The drug Efatutazone is a potent PPARG agonist of the thiazolidinedione class. While it has been explored in clinical trials for anticancer therapies, this drug has not been applied to PPARG agonism in the context of Alzheimers Disease, representing a missed opportunity.? MK-0533 is a selective partial PPARG modulator, which are being explored in clinical trials as safer options for neuro-degeneration as they preserve anti-inflammatory gene programs with fewer metabolic liabilities.? Lobeglitazone is a dual PPARA and PPARG agonist. In previous studies on AD affected mice, it lowered cortical Aβ plaques, shifted microglia to an anti-inflammatory phenotype, and improved motor function.? Like Efatutazone- Rivoglitazone and Edaglitazone are also TZD class PPARG agonists. They have been proposed for AD trials by researchers and experiments with Rivoglitazone have shown promising results regarding its anti-inflammatory signature in vitro. For Edaglitazone, however, clinical data is still pending. ?,? Aleglitazar, Muraglitazar, Imiglitazar are also dual PPARA/PPARG inhibitors. Several mouse studies have demonstrated their promise in promoting an anti-inflammatory shift. ?,?
5: Top-Ranking DrugBank Compounds and Their Predicted Binding Affinities for PPARG
Beyond the top 10 PPARG binders, we select in Table an additional 11 compounds based on moderate binding affinities (ΔG pred < −8.5 kcal/mol) and CNS-relevant ADMET filetering. A summary of these candidates is presented below. DB07242 is a β-carboline derivative; related analogs have demonstrated inhibitory activity against kinases such as PIM1, CLK, DAPK3, and BIKE.? DB00248 (Cabergoline) is a dopamine D2 receptor agonist approved for the treatment of hyperprolactinemia, and has been reported to normalize prolactin levels in most patients.? Preclinical studies suggest that Cabergoline, either in its original form or as active metabolites, can penetrate the BBB and reach central targets, making it a promising candidate for CNS drug repurposing.? DB01218 (halofantrine) is a lipophilic antimalarial agent effective against chloroquine resistant strains.? While some reports have suggested a high volume of distribution indicative of potential CNS penetration, inconsistent pharmacokinetic findings leave its BBB permeability uncertain. DB15367 (LY-2623091) is a nonsteroidal mineralocorticoid receptor antagonist with stable pharmacokinetics observed in phase 1 and 2 trials, showing consistent exposure regardless of disease status.? DB11794 (Berzosertib) is a selective ATR kinase inhibitor. In a phase I clinical trial, it demonstrated tolerable safety and preliminary efficacy when combined with gemcitabine in patients with refractory solid tumors.? DB01611 (Hydroxychloroquine) is an aminoquinoline derivative originally developed as an antimalarial and later repurposed for anti inflammatory purposes.? It has been evaluated in an AD clinical trial, suggesting potential relevance to neuroinflammation and CNS accessibility.
6: Repurposing Candidates Targeting PPARG with Favorable CNS Properties and Strong Binding Affinities
Inhibiting CXCR4
2.4.3
In Table we list the top 10 predicted binders for CXCR4. The top predicted drug is Mavorixafor, which is a known oral CXCR4 antagonist used to treat WHIM syndrome. It has recently been explored for controlling chronic inflammation in neurodegeneration.? MSX-122 is a partial CXCR4 antagonist which has been tested in preclinical stroke models and demonstrated reduced neutrophil infiltration.? Burixaofr is also known to block CXCR4 and lower CXCL12 driven trafficking, with clinical trials underway for modulating inflammation in WHIM syndrome.? It is applications to neurodegeneration have not yet been explored. Avotaciclib and Mocetinostat are CDK and HDAC inhibitors with no known interactions with CXCR4, but which are known to attenuate pro-inflammatory signaling cascades that drive glial activation and secondary neuronal injury.
7: Top-Ranking DrugBank Compounds and Their Predicted Binding for CXCR4
Additionally, our CNS-relevant ADMET screening for CXCR4 resulted in 11 compounds satisfying the stringent binding affinity criterion (ΔG pred < −9.2 kcal/mol). Five illustrative candidates are highlighted below; the complete results are available in Table. DB02022 (Toxopyrimidine) is a pyrimidine-based pyridoxamine antagonist that induces seizures via GAD inhibition and GABA depletion.? DB12740 (CC-115) is a dual mTOR/DNA-PK inhibitor under clinical investigation for cancer therapy.? DB13069 (Nimustine) is a nitrosourea alkylating agent clinically evaluated via convection-enhanced delivery in children with diffuse intrinsic pontine glioma (DIPG) in a multicenter phase II trial, demonstrating therapeutic potential.? DB12177 (Eplivanserin) is a highly selective 5-HT_2_ receptor antagonist developed for the treatment of insomnia.? DB16954 (Ezeprogind) is a small molecule drug developed to treat Progressive Supranuclear Palsy (PSP), a tauopathy-related neurodegenerative disorder, by targeting the Progranulin–Prosaposin (PGRN–PSAP) axis and enhancing lysosomal function.?
8: Repurposing Candidates Targeting CXCR4 with Favorable CNS Properties and Strong Binding Affinities
CNS-relevant ADMET properties of top-ranking DrugBank compounds predicted to bind CXCR4. All selected compounds exhibited strong binding affinity thresholds (ΔG pred < −9.2 kcal/mol), with high predicted BBB permeability (>70%) and low probabilities of acting as Pgp substrates or inhibitors (<50%).
Inhibiting EGLN
2.4.4
In Table we list the top 10 predicted binders for EGLN. Several of those listed are known to exhibit HIF-stabilizing effects that can lessen hypoxic damage particularly in renal-anemia in CKD. Desidustat and Molidustat are both EGLN inhibitors that have demonstrated HIF stabilization along with elevation of brain VEGF and tight junciton proteins in mouse models.? Daprodustat is an FDA approved EGLN oral inhibitor. ITI-214, Aplaviroc, Gusacitinib and Dalpiciclib are not known to interact with EGLN, but they all modulate pathways that either improve cerebral blood-flow, curb leukocyte recruitment or reduce microglial cytokine production.? For instance, Aplaviroc is a known CCR5 chemokine-receptor agonist. CCR5 blockade dampens leukocyte trafficking into the CNS and has been neuro-protective in stroke and Alzheimer models.? N-(R-Carboxy-Ethyl)-α-(S)-(2-Phenylethyl) exhibited the highest predicted binding affinity, but has not yet been investigated to our knowledge for its impact on hypoxia related neuroprotection, warranting additional research.
9: Top-Ranking DrugBank Compounds and Their Predicted Binding Affinities for EGLN
Complementing our top binders for EGLN, we identified three structurally diverse compounds fulfilling both the binding affinity requirement (ΔG pred < −8.5 kcal/mol) and CNS-related pharmacokinetic benchmarks (Table). DB08149, a pyrrolo[2,3-d]pyrimidine-based small molecule, is originally developed as an ATP-competitive kinase inhibitor with demonstrated PKBβ and PKA inhibition as demonstrated in enzymatic essays.? DB07606, a pyrazolo[3,4-d]pyrimidinone compound, was initially designed to target CDK pathways and has shown strong enzymatic and cellular activity in preclinical studies.? DB07227 (L-778,123) is a dual farnesyltransferase and geranylgeranyltransferase-I inhibitor that has been evaluated in phase I clinical trials for oncology indications, including combination therapy with radiotherapy in pancreatic cancer.?
10: Repurposing Candidates Targeting EGLN with Favorable CNS Properties and Strong Binding Affinities
Discussion
3
Motivation of the Sheaf Laplacian for Network
Analysis
3.1
It should be noted that successful application of topological methods such as persistent homology and persistent Laplacians hinges on the integration of important nongeometric biological information. One can try to build a filtration from selected atoms or use generalized distance to encode biochemical interactions. These tricks are included in the approach called element-specific persistent homology (ESPH).? In the cellular (co)sheaf framework, restriction or extension maps can be understood as interactions between cliques, and a cleverly designed sheaf can integrate our prior knowledge about the network into the persistent (co)sheaf (co)homology and sheaf Laplacians. For example, by labeling each gene by their respective up or down regulation according to Log2FC. The restriction map may then encode some local measure of a gene module’s dysregulation by averaging the logFC scores in an edge or clique relation scaled by the PPI strength of the genes’ interactions. On the other hand, sheaves can also be constructed in a naive and formal manner when the question is more about distinguishing different networks where the underlying topological information is the same. An elementary situation is when we have two types of nodes A and B in the network. Different distributions of A and B in the network might lead to different network behaviors or functions. One can construct multiple sheaves by assigning two numerical quantities to A and B and the spectra of Laplacians will be different for different distributions. Similarly, by removing a node from either A or B, the resulting distribution will again change. A more highly PPI connected or more significantly dysregulated gene’s removal will then likely affect the distribution to a greater degree. This would be reflected in a more substantial difference in the spectra of the perturbed Laplacian. In this manner we may rank the significance of our differentially expressed genes. The cellular (co)sheaf framework can be applied jointly with previous approaches such as element-specific persistent homology to integrate more nonspatial biological information and achieve better results. There are alternative topological Laplacians which can also be incorporated in such an analysis. For example, Mayer topology has been proposed, where the boundary condition of the chain complex maps has been altered to higher degrees than 2.? Generally, this provides an additional dimension of analysis by simultaneously considering multiple possible boundary conditions. The resulting set of spectral statistics is then considerably richer and generally more informative. Such a framework could provide a more nuanced analysis of biological networks.
Review of Potential Repurposing Candidates
3.2
Across our three targets, ERBB4, CXCR4, and EGLN, we collected a total of 707, 1469, and 1299 inhibitors labeled by IC50 values, respectively, from ChEMBL and PubChem to train our Machine Learning based binding affinity prediction models. Additionally, we collected 4933 agonists labeled by EC50 for PPARG. After deploying each model on 8865 Drugbank small compounds, we identified 276, 104, 15, and 38 potentially suitable binders for each respective target. In Tables, ?, ? and ?, we list each top 10 prediction and note that among these drugs several are already known as effective binders and in trial to determine their potential neuroprotective properties. Notably, some of the top-ranked predictions were not part of the original training set but are already known as agonists or antagonists for the respective targets. For example, Desidustat (ZYAN1), an EGLN inhibitor that has completed Phase 2 clinical trials for the treatment of anemia in CKD.? Similar target-specific compounds, including known binders to ERBB4 and PPARG, also appeared among the top predictions. This observation suggests that our molecular fingerprint–based binding affinity prediction models effectively identify compounds likely to bind each target, even when those compounds were not explicitly seen during training.
Following this, to screen compounds with potential as AD therapeutics while maintaining target specificity, we applied a secondary filtering step that incorporated both predicted binding affinity and predicted CNS-relevant ADMET properties, BBB permeability and P-glycoprotein (Pgp) inhibition/substrate probabilities, critical determinants of CNS drug efficacy and retention. This step enabled us to find candidates with both strong target engagement and favorable CNS drug-like profiles. Other ADMET parameters such as CYP inhibition or volume of distribution were excluded to avoid excessive filtering and to preserve chemical diversity- not only to identify viable repurposing candidates, but also to leave open the possibility of using these for future lead optimization. We note that our target set is comprised of DrugBank compounds which have generally undergone some level of validation. Many are approved, investigational, or experimental drugs. Thus, extensive ADMET filtering may prove superfluous anyway. Among the shortlisted candidates, several compounds have previously been tested in experimental models of AD, Parkinson’s disease, or neuroinflammation. For example, Ponesimod (DB12016), an antagonist of S1PR1, has shown significant promise in preclinical AD models.? In 5XFAD transgenic mice, it decreased Aβ-induced activation of microglia and astrocytes, reduced pro-inflammatory cytokines such as TNF-α and CXCL10, and enhanced Aβ clearance through improved phagocytosis. Although S1PR1 was not among our initially selected targets, Ponesimod was identified by our pipeline based on its strong predicted binding affinity and desirable CNS relevant ADMET profile. Its inclusion highlights the ability of our method to uncover therapeutically relevant compounds that may act through alternative or convergent mechanisms relevant to AD pathology. Similarly, Hydroxychloroquine was previously evaluated in a randomized, double-blind, placebo-controlled clinical trial in early AD for 18 months.? Although the trial did not demonstrate significant cognitive benefit, the fact that hydroxychloroquine progressed to human testing demonstrates its prior consideration as a viable AD therapeutic, reinforcing the ability of our pipeline to recover compounds with real potential to become actual treatments. A third example is Eplivanserin (DB12177), also known as SR 46349B, a selective 5-HT_2_ receptor antagonist.? In a study by John P. Dougherty and Jeff Oristaglio, chronic administration of this compound enhanced memory recall in mice.? Although it has not been directly evaluated in AD models, its cognitive effects suggest potential relevance for treating memory-related symptoms in AD disease and other neurodegenerative disorders. The presence of such compounds among our screening results suggests that our screening approach is capable of surfacing drug candidates that may impact Alzheimer’s-related mechanisms. Although these compounds have not yet been experimentally validated for direct interaction with the specific targets proposed in this study, future AD-related investigations may reveal their therapeutic potential. If confirmed, these findings could lead to the identification of novel mechanisms of action or support the repositioning of these compounds for new indications in AD. However, most of these studies were not designed to investigate direct interactions with the specific molecular targets highlighted in our study (ERBB4, CXCR4, PPARG, and EGLN). This presents an opportunity: compounds previously shown to affect Alzheimer’s disease–relevant processessuch as neuroinflammation, memory impairment, or synaptic dysfunctionmay also exert their effects through pathways involving our selected targets. Experimental validation of such interactions could uncover new therapeutic pathways and expand the scope of drug repurposing. Notably, our prediction pipeline successfully retrieved compounds that have undergone prior evaluation in AD models or related neurodegenerative conditions, even without explicit target alignment. This reinforces the biological relevance of our computational approach and suggests that compounds without prior experimental investigation may similarly harbor therapeutic potential. All shortlisted candidatesregardless of their current validation statuscould serve as valuable starting points for future development, whether through direct biochemical assays or structure-guided lead optimization.
In summary, the drug selection strategy presented in this study highlights the potential to uncover novel chemical entities with therapeutic relevance to AD. These findings strongly support the potential for topological approaches to analyzing transcriptomics data with the goal of identifying molecular therapeutic targets, as well as Machine Learning methods for drug repurposing. These candidates warrant further investigation through in vitro and in vivo experiments to evaluate their mechanisms of action, including direct target binding. Collectively, the identified compounds offer a strong foundation for the development of next-generation AD therapeutics.
Methods
4
Sheaf Persistent Spectral Theory
4.1
In Figure we outline the proposed methodology from PPI network and clique complex construction to topological perturbation and repurposing models. We consider the PPI network as a graph with nodes representing our genes and edges indicating interactions. Constructing a simplicial complex allows for representing high-order interactions among the genes, capturing a broader range of shapes and high-dimensional relationships. Persistent Spectral Theory then reveals multiscale structures by constructing a filtration, or a nested sequence of simplicial complexes induced by a parameter representing a thresholding
(a) Definition of a simplex. A 0-simplex is a node, a 1-simplex is an edge connecting two nodes, and so on. (b) Illustration of Vietoris-Rips filtration scheme by varying a threshold. (c) Illustration of the Cellular Sheaf Framework. Genes in the network are labeled by the magnitude of their dysregulation, embedding non structural biological information for a richer analysis. (c) Topological perturbations are performed for each gene by removing that gene from the network and analyzing the impact. (d) Analysis pipeline. Topological perturbations are performed for each gene on the network. A filtration is induced via the STRING confidence scores of each P–P interaction. The spectra of the original and perturbed complexes are featurized via summary statistics, and the Wasserstein distance between the feature vectors gives the topological significance score of that gene. (e) Machine Learning drug repurposing pipeline. Small molecule drugs are evaluated on each identified molecular target. Features are generated via sequence aware and 2D structure aware models, which are utilized in gradient boosted decision trees to provide binding affinity predictions. Subsequent ADMET analysis provides list of repurposeable candidates.
Here, K is the largest simplicial complex in the filtration, with each K _ t _ being a sub simplicial complex at the filtration level t. Accompanying this filtration is a corresponding sequence of chain complexes and boundary operators at each scale
In which C _ q _ ^ t ^ = C _ q _(K _ t ) is its chain group, ∂ q _ ^ t ^:C _ q (K _ t ) → C _ q‑1(K _ t ) is its q-th boundary operator, ∂ q _ ^*t ^is the adjoint operator of the boundary operator ∂ q _ ^ t ^, and it is relative to an inner product defined on a chain group. For each K _ t _, every q-simplex is oriented, and the boundary operator is applied as follows
Here, σ_ q _ = [v 0, ..., v _ q ] is an oriented q-simplex within K _ t _ and represents the oriented (q – 1)-simplex obtained by removing the i’th vertex from σ q . Consider now to be the subset of C _ q _ ^ t ^ consisting of chains whose boundaries are in C _ q‑1 ^ t ^. The p-persistent q’th boundary operator ∂_ q _ ^ t+p ^ is then defined as a mapping from . The p-persistent q-combinatorial Laplacian operator Δ_ q _ ^ t+p ^ along the filtration is then given as
If orthonormal bases are chosen for all linear spaces, then the matrix representations of ∂_ q+1_ ^ t+p ^ and ∂_ q _ ^ t ^ can be denoted as and respectively and obtain a matrix representation of Δ, denoted , of the form
This matrix is symmetric and positive semidefinite, ensuring that all eigenvalues are real and non-negative. The p-persistent q’th Betti numbers, representing the number of q-cycles persisting in K _ t _ after p filtration, correspond to the nullity of the spectra of .
Now, a cellular sheaf on a simplicial complex K consists of the following:
- 1.A simplicial complex K, where the face relation that σ_ p _ is a face of σ_ q _ is denoted by σ_ p _ ≤ σ_ q _
- 2.An assignment to each simplex σ of K a finite dimensional vector space and to each face relation σ_ p _ ≤ σ_ q _ a linear morphism of vector spaces denoted satisfying and is the identity map. The vector space is the stalk of over σ and the linear morphism is the restriction map of the face relation. A global section s of is an assignment to each simplex σ an element such that for any face relation σ_ p _ ≤ σ_ q _.
Suppose then that we have a graph, or a one-dimensional simplicial complex K where each vertex v _ i _ is associated with a quantity . Denote the edge connecting v _ i _ and v _ j _ as e _ ij _. We can define a sheaf on K such that each stalk is , and for the face relation v _ i _ ≤ e _ ij _, the morphism is the multiplication by q _ j _/r _ ij _ where r _ ij _ is the length of e _ ij _. The assignment q _ i _ → v _ i _ and q _ i _ q _ j _/r _ ij _ → e _ ij _ is a global section, since .
This can then be generalized to higher order simplicial complexes. Suppose we also have a nowhere zero function . We can define a sheaf where each stalk is and for the face relation [v 0, v 1, ..., v _ n _] ≤ [v 0, v 1, .., v _ n , v _ n+1, ..., v _ m _], the linear morphism is the scalar multiplication by
The Sheaf Laplacian is then a combinatorial Laplacian constructed from the sheaf cochain complexes. The nullity of the q-th persistent sheaf Laplacian is then equal to the q-th p-persistent sheaf Betti number. The matrix representation of the persistent sheaf Laplacian is constructed similarly to the combinatorial Laplacian, with a full treatment available in the Supporting Information.
Biomarker Identification and Drug Repurposing
4.2
Given our PPI simplicial complex, in which the genes make up the set of 0-simplexes, the STRING PPI confidence scores facilitate the construction of an inverse rips filtration.? We wish to now identify the topologically significant genes via topological perturbations over each scale of the filtration. We hypothesize that the genes which are most topologically significant with respect to a dysregulation scaled sheaf would correspond well to significant biomarkers of disease progression. Topological significance is calculated according to the disparity between feature vectors obtained from the original complex and a perturbed complex where the respective gene has been removed.
Specifically, for a clique complex K and a perturbed clique complex K̂, we obtain the sets of subcomplexes induced by filtration: {K 0, K 1, ..., K _ p _} and . For each step of the filtration we compute the L 0 and L 1 Persistent Sheaf Laplacians. The spectra of these Laplacians encode the topological and geometric structure of our network with respect to a logFC labeling. We remove a specific gene from the network to construct the perturbed complex. For convenience, we calculate summary statistics of the spectra: {Min, Mean, Max, Standard Deviation, Sum, Number of Zeros}, giving us a feature vector for each subcomplex. Having obtained a feature vector for K _ i _-call it f _ i _, and - call it , we calculate the Wasserstein distance between the two: . The larger the distance, the more topologically significant we consider that gene to be at that scale. For robustness, we then consider the set of genes that rank among the top 25 most significant out of 200 over all scales of the filtration, giving us our final set of inferred biomarkers.
Given a list of biomarkers, we move to our machine learning enabled drug repurposing. We first obtain binding affinity training data for these targets from ChEMBL. These data sets are comprised of SMILES strings for the molecular compounds, each paired with some bioactivity label, specifically IC_50_ for the inhibitors and EC_50_ for the agonists. In addition, we retrieved small molecule drugs, categorized under either approved, investigational, or experimental status from the DrugBank database (version 5.1.12). To ensure consistency, all SMILES strings were canonicalized by the RDKit toolkit.
In our binding affinity analysis, we employed two fingerprinting methodologies to delineate molecular structures in a format suitable for machine learning input. Specifically, we utilized a pretrained Bidirectional Transformer, which was pretained in a self-supervised manner using the ChEMBL27 data set by Chen and colleagues.? We also utilized a two-dimensional structure based molecular fingerprinting technique called Extended-connectivity fingerprints (ECFPs).?
We then trained two Gradient Boosting Decision Tree regressors (one for each drug representation-structural or NLP) on the bioactivity labeled ChEMBL data, and then tested them on thousands of small compounds obtained from DrugBank. The models were finetuned using 10-fold cross validation on a 70/30 train-test split of the ChEMBL data, and after parameter tuning were refit to the entire data set. We averaged the predicted binding affinities between the two models to obtain our final prediction. Ultimately, drugs with a predicted equilibrium dissociation/inhibitor constant (K d/K i) of less than 180 nm (i.e., less than −9.2 kcal/mol) were deemed to be effective binders to their respective molecular target.
To align our predictions with the specific requirements of drug repurposing for AD, we applied an additional layer of filtering to the initially screened DrugBank small molecules, focusing on key CNS related physiological properties predicted by ADMETlab 3.0.? As a preliminary step, we excluded mixtures or ionic compounds, identified by the presence of multiple unconnected molecular entities in their SMILES representation. We then screened for compounds with a predicted BBB penetration probability greater than 70%, while excluding those with a Pgp inhibitor or substrate probability exceeding 50%. Given the CNS involvement in Alzheimer’s pathology, BBB permeability is a critical requirement for therapeutic efficacy. At the same time, compounds predicted to be either P-glycoprotein substrates or inhibitors were excluded as substrates are actively transported out of the brain, reducing central availability, and inhibitors may interfere with endogenous efflux regulation, potentially leading to safety issues or altered pharmacokinetics of co-administered drugs. We retained only those molecules that satisfied both the CNS-oriented ADMET criteria and our binding affinity thresholds: −9.2 kcal/mol for CXCR4 and ERBB4, and a slightly relaxed −8.5 kcal/mol for EGLN and PPARG due to the lack of strong candidates under the stricter threshold.
Conclusion
5
In this study, we performed a population-level snRNA-seq analysis to identify gene targets that commonly occur in AD patients. We have demonstrated the ability of our PSL network analysis method to present novel and testable biological hypotheses as well as to confirm known biological programmes for different aspects of AD pathology. Specifically, when applied to DEG based PPI complexes from the brain vasculature, the tool identifies genes whose interaction structure suggests a previously unexplored mechanism by which protective signaling cascades in endothelial cells connect to MMP inducing cAMP/PKA machinery in pericytes, which actually drives further leakage. Additionally, when applied to the study of PPI networks constructed from DEGs in a phagocytic transition state for microglia cells, the tool identifies genes which combine to form four axes which each comprise the hallmark phenotype of disease associated microglia, confirming the reliability of our tool in uniting gene structural significance with dysregulation magnitude to recover meaningful biological programmes in differential expression analysis. In both cases the method highlighted genes which can serve as suitable therapeutic targets, hitting multiple pathology inducing pathways simultaneously. Naturally, this analysis is not limited to the study of AD, and could extend to a variety of biological networks.
Our subsequent computational drug repurposing pipeline integrated predicted binding affinity with CNS-relevant ADMET properties to identify candidate compounds targeting ERBB4, CXCR4, PPARG, and EGLN. Several shortlisted compounds have been previously tested in models of AD or other neurological disorders, providing indirect evidence supporting the validity of our pipeline. Others, though not yet explored experimentally in AD, showed favorable profiles that warrant further investigation. Together, these results showcase the potential of our approach to surface both biologically meaningful and novel repurposing candidates that could serve as entry points for experimental validation and lead optimization. Additionally, we identified 50 potential gene targets in total. While we focused our repurposing analysis on only a small subset of these genes due to their clear involvement in pathology driving pathways and availabiltity of binding affinity information, a further investigation of these other targets is certainly worthwhile.
Future directions of research in this area include namely the experimental validation of our hypothesized ERBB4-DLG2 signaling cascade in endothelial and pericyte cells as well as an additional application of PSL analysis to coexpression or GRN based network configurations to overcome the incompleteness of PPI relations. Additional experimental exploration of the proposed HIF1A-PPARG-APOE backbone in phagocytic microglia is also an important step. We would suppose that with high quality data this approach could produce additional novel insights and testable hypotheses in other cell types and regions of the brain, potentially furthering our understanding of the molecular mechanics behind Alzheimer’s Disease. Also, the efficacy of drugs cannot be determined entirely in silica, and so experimental work to decide the validity of these drug repurposing recommendations would also be necessary. Lastly, it is interesting to explore the utility of alternative topological Laplacians in biological network analysis, such as Mayer topology.
Supplementary Material
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1ArnòB.Grassivaro F.Rossi C.Bergamaschi A.Castiglioni V.Furlan R.Greter M.Favaro R.Comi G.Becher B.Martino G.Muzio L.Neural progenitor cells orchestrate microglia migration and positioning into the developing cortex Nat. Commun.20145561110.1038/ncomms 661125425146 · doi ↗ · pubmed ↗
- 2Arnold S. E.Hyman B. T.Betensky R. A.Dodge H. H.Pathways to personalized medicineembracing heterogeneity for progress in clinical therapeutics research in alzheimer’s disease Alzheimer’s Dement.202420107384739410.1002/alz.1406339240044 PMC 11485305 · doi ↗ · pubmed ↗
- 3Ayati A.Moghimi S.Salarinejad S.Safavi M.Pouramiri B.Foroumadi A.A review on progression of epidermal growth factor receptor (egfr) inhibitors as an efficient approach in cancer targeted therapy Bioorg. Chem.20209910381110.1016/j.bioorg.2020.10381132278207 · doi ↗ · pubmed ↗
- 4Badolato R.Alsina L.Azar A.Bertrand Y.Bolyard A. A.Dale D.Deyà-MartínezÀ.Dickerson K. E.Ezra N.Hasle H.Kang H. J.Kiani-Alikhan S.Kuijpers T. W.Kulagin A.Langguth D.Levin C.Neth O.Olbrich P.Peake J.Rodina Y.Rutten C. E.Shcherbina A.Tarrant T. K.Vossen M. G.Wysocki C. A.Belschner A.Bridger G. J.Chen K.Dubuc S.Hu Y.Jiang H.Li S.Mac Leod R.Stewart M.Taveras A. G.Yan T.Donadieu J.A phase 3 randomized trial of mavorixafor, a cxcr 4 antagonist, for whim syndrome Blood 20241441354510.1182/blood.202302265838643510 PMC 11251404 · doi ↗ · pubmed ↗
- 5Benedetti M. S.Dostert P.Barone D.Efthymiopoulos C.Peretti G.Roncucci R.In vivo interaction of cabergoline with rat brain dopamine receptors labelled with [3h]n-n-propylnorapomorphine Eur. J. Pharmacol.1990187339940810.1016/0014-2999(90)90367-f 1981553 · doi ↗ · pubmed ↗
- 6Bottos A.Rissone A.Bussolino F.Arese M.Neurexins and neuroligins: synapses look out of the nervous system Cell. Mol. Life Sci.201168162655266610.1007/s 00018-011-0664-z 21394644 PMC 11115133 · doi ↗ · pubmed ↗
- 7Boujon V.´.Uhlemann R.Wegner S.Wright M. B.Ulrich L.Endres M.Kronenberg G.Gertz K.Dual PPARα/γ agonist aleglitazar confers stroke protection in a model of mild focal brain ischemia in mice J. Mol. Med.20199781127113810.1007/s 00109-019-01801-031147725 PMC 6647083 · doi ↗ · pubmed ↗
- 8Bowman E. P.Campbell J. J.Druey K. M.Scheschonka A.Kehrl J. H.Butcher E. C.Regulation of chemotactic and proadhesive responses to chemoattractant receptors by RGS (regulator of G-protein signaling) family members J. Biol. Chem.199827343280402804810.1074/jbc.273.43.280409774420 · doi ↗ · pubmed ↗
