Wiring Between Close Nodes in Molecular Networks Evolves More Quickly Than Between Distant Nodes

Alejandro Gil-Gomez; Joshua S Rest

PMC · DOI:10.1093/molbev/msae098·May 20, 2024

Wiring Between Close Nodes in Molecular Networks Evolves More Quickly Than Between Distant Nodes

Alejandro Gil-Gomez, Joshua S Rest

PDF

Open Access

TL;DR

This paper shows that connections between closely related nodes in molecular networks evolve faster than those between distant nodes, using drug interactions as a proxy.

Contribution

The study introduces drug–drug interactions as a novel proxy for measuring molecular network evolution across species.

Findings

01

DDIs diverge rapidly over short evolutionary time periods but show saturation over longer periods.

02

Targets of synergistic DDIs are closer in PPI and cofunctional networks and evolve faster.

03

Drug combination data can provide insights into network evolution at larger scales.

Abstract

As species diverge, a wide range of evolutionary processes lead to changes in protein–protein interaction (PPI) networks and metabolic networks. The rate at which molecular networks evolve is an important question in evolutionary biology. Previous empirical work has focused on interactomes from model organisms to calculate rewiring rates, but this is limited by the relatively small number of species and sparse nature of network data across species. We present a proxy for variation in network topology: variation in drug–drug interactions (DDIs), obtained by studying drug combinations (DCs) across taxa. Here, we propose the rate at which DDIs change across species as an estimate of the rate at which the underlying molecular network changes as species diverge. We computed the evolutionary rates of DDIs using previously published data from a high-throughput study in gram-negative bacteria.…

Linked entities

Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.

Diseases1

DC

Figures4

Click any figure to enlarge with its caption.

Fig. 1 — DDI score distance diverges nonlinearly over time. Species and strain abbreviations are shown in the top left box. a) Bayesian phylogeny and divergence time estimates based on an alignment of 27 highly conserved protein sequences. b) Hierarchical clustering of strains based on average Euclidean distances across DDIs (i.e. “DDI score distance”); drug combination (DC) data are from Brochado et al. (2018). c) Heatmap of DDI scores across strains. Along top, hierarchical clustering of DCs is shown based on Euclidean distances across strains, but these clusters were constrained by t-SNE cluster membership (see text). The heatmap displays synergistic interactions (close to −1), antagonistic interactions (close to 1), and additivities (close to 0; measured in Bliss score units). The bars below indicate whether the two drugs involved in the interaction are the same or different in terms of: belonging to the same drug category, targeting the same cellular process, or having the same use. d) Evolutionary rate of DDI score change, calculated for each cluster. e) Pairwise DDI score distances between strains as a function of divergence time between strains. Intraspecific comparisons, comparisons between Salmonella and Escherichia, and comparisons with Pseudomonas are each labeled. f) First two axes of a phylomorphospace-PCA for the DDI score data. The percent of variance explained for principal components 1 and 2 are shown.

Fig. 2 — a) Graphical representations of the cofunctional gene pair network of E. coli (EcoliNet: EcoCyc/GO-BP). This network contains 1,835 nodes with an average path length of 4.8 and contains 20 proteins targeted by 36 drugs in our analysis. b) Graphical representations of the PPI network of E. coli, as determined by small and medium-scale experiments (EcoliNet: LC. Small/medium-scale PPI). This network contains 764 nodes with an average path length of 4.9, with 18 proteins targeted by 27 drugs in our analysis. In a and b, each node represents a unique protein (KEGG ID) in E. coli; the red nodes are target proteins identified as participating in DCs in our analysis.

Fig. 3 — Pairs of protein targets with synergistic DDIs have lower path length and higher connectivity and centrality measures in E. coli cofunctional (a, c, e, g, i) and PPI (b, d, f, h, j) networks. DDI types examined are synergies, additivities, and antagonisms (“nontargets” are a background sample of nondrug target proteins in the network). The number of interactions in each category is in parentheses. Network metrics are: a and b: path length between two targets, c and d: K-edge connectivity between two targets, e and f: mean node degree of the two targets, g and h: mean betweenness centrality of the two targets, i and j: mean eigenvector centrality of the two targets. In all plots, significance of Wilcoxon test P-values are given for differences in the mean between all pairwise comparisons: P-value < *0.05, 0.0001, *0.00001. For all plots, Kruskal−Wallis test for difference among groups is P < 1 × 10−7.

Fig. 4 — Evolutionary rates of DDI scores as a function of DDI type and network connectivity of targets. a) DCs resulting in only synergy or synergy↔antagonism DDIs across strains have faster evolutionary rates. b) Aggregated interaction types reveal that more synergistic DCs have higher evolutionary rates. The x axis value is the sum, per DC, across strains, where DDIs are scored as −1 for synergies, 0 for additive, and +1 for antagonistic interactions. c) Rate of DDI evolution as a function of DDI type, for DCs with protein targets in the cofunctional and small/medium-scale PPI networks. Synergistic DCs have higher evolutionary rates than additive and antagonistic DCs (combination types and networks from E. coli). d) Rate of DDI evolution as a function of the minimum distance between DC target proteins reveals that wiring between close nodes in molecular networks evolves more quickly than between distant nodes. See supplementary table S2, Supplementary Material online for statistical tests for differences among groups shown here. For all plots, the error bars represent the standard error of the mean.

Keywords

gram-negative bacteriaphylogenetic comparative methodsbiological network evolutiondrug–drug interactions

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsComputational Drug Discovery Methods · Gene Regulatory Network Analysis · CO2 Reduction Techniques and Catalysts

Full text

Introduction

Molecular networks are models of real molecular interactions in the cell. These networks are built by collecting biochemical and genetic interaction data, an approach that has improved in the last decades with the advent of modern high-throughput methods. However, there are still many limitations to the gathering and analysis of molecular networks, specifically for studying network variation across and within species. This is because many molecular networks have poor quality or are sparse and incomplete and because the number of organisms for which network data are available is still very limited (Cusick et al. 2005; Jin et al. 2013; Ghadie et al. 2018). The major types of molecular networks that have been built represent protein interactions, metabolic processes, signaling, and gene regulation.

Molecular networks evolve as nodes (e.g. proteins) and edges (e.g. molecular interactions) are added or lost. This can be the result of different processes, including those affecting genes and the interaction of their products, such as gene or motif duplication, loss, horizontal transfer, or whole genome duplication (Wagner 2003; Cork and Purugganan 2004; Bernhardsson et al. 2011; Koch et al. 2017), or a result of processes affecting quantitative properties, including nonsynonymous substitutions in the protein nodes affecting their function (Jensen 1976; Ghadie et al. 2018). These mutations may affect the binding of ligands, protein domains, or DNA motifs (Koch et al. 2017), which in turn result in changes in their associated metabolic, signaling, or gene expression networks. The type of network also may affect the rewiring rates of nodes. For example, gene regulatory networks tend to rewire at faster rates than metabolic networks (Shou et al. 2011), suggesting that some network types are less constrained than others.

The genetic and evolutionary events that remove old connections and generate new ones may be random with neutral consequences (Bernhardsson et al. 2011), may be genetically constrained (Wollenberg Valero 2020), or may affect fitness. We know that not all network motifs (i.e. recurrent patterns of connections with potential functional properties) are equally abundant in molecular networks, suggesting the action of natural selection (Milo et al. 2002; Picard et al. 2008). Natural selection may remove deleterious connections (Jordan et al. 2008) and favor advantageous ones (Laarits et al. 2016; Mehta et al. 2021), including balancing environmental robustness with network functions (Han et al. 2013; Chen and Ho 2014). While higher fitness solutions may include those with lower levels of modularity (Kashtan and Alon 2005), modularity itself may be driven by selection to reduce connection costs (Clune et al. 2013). Higher levels of protein connectivity have been correlated with lower substitution rates (Fraser et al. 2002), potentially due to strong purifying selection acting on the interfacial sites of interacting proteins (Zotenko et al. 2008). Each of the network types, discussed in aggregate above, may be subject to different evolutionary forces and patterns.

Part of the key to understanding the forces acting on networks is a characterization of evolutionary rates of network change. Despite advances in studying the connections of individual nodes and network rewiring, the rate and patterns by which quantitative differences between networks accumulate as a function of species divergence remain relatively unexplored. Beltrao and Serrano (2007) estimated rewiring rates among eukaryotic proteins at ∼10^−5^ interactions/protein pair/million years, found that network rewiring proceeds faster than sequence evolution, and that differences in rewiring rates between functional categories suggest the action of natural selection. One interesting question is whether network rewiring among a given pair of nodes is correlated with metrics of connectivity between those nodes, including minimum path length (the number of steps along the shortest path between nodes), K-edge connectivity (minimum number of edges that can be removed to disconnect two nodes), degree (number of edges connected to the node), and centrality. This is important information for understanding what aspects of network structure that we can quantitate are subject to evolutionary forces acting on variation.

In this study, we use drug–drug interaction (DDI) scores as a quantitative proxy measurement of inter-node connectivity, since it has been shown that DDIs are partially dependent on the underlying network topology between targets (Lehár et al. 2007; Yeh et al. 2009). Among drug combinations (DCs), DDIs occur when the effect of two or more drugs is significantly stronger or weaker than the additive expectation, respectively named synergies and antagonisms (Greco et al. 1995). DDIs are used in the development of novel pharmacological treatments with higher efficiencies at lower doses and to reduce the evolution of drug resistance (Cowen and Steinbach 2008). The reasons why a DDI occurs are varied, influenced by the functional relationships between the drug targets; this is an effect of factors such as the topology of the underlying network between drug targets and the essentiality of the metabolites affected by the drugs (Yeh et al. 2009). These relationships can either be direct (between the target genes themselves) or indirect (mediated through interactions with other genes). Synergies often occur when drugs act on parallel or redundant pathways that contribute to the same essential biological function or end product. This redundancy in pathways is susceptible to DCs which prevent any single pathway from fully compensating for the inhibition of others, thereby enhancing the overall effect (Lehár et al. 2007). In contrast, antagonistic interactions can arise (i) when drugs acting on the same pathway at different sites interfere with each other's actions, (ii) when drugs acting on the same pathway or process exert opposite effects, thus diminishing the overall effect, (iii) when two drugs each result in complete loss of function of the same nonessential pathway or (iv) if two drugs each effect different functions, but where the combined effect is equal to the most limiting of the two (Yeh et al. 2009). While both DDI types can be a result of either direct or indirect effects, experimental evidence suggests that, in general, most synergistic interactions are the result of drugs targeting the same cellular process, while antagonistic interactions are typically the result of drugs targeting different processes (Brochado et al. 2018).

Only a few studies have explored interspecific variation of DDI scores (Spitzer et al. 2011; Robbins et al. 2015; Brochado et al. 2018; Davis et al. 2024), but they have demonstrated that DC experiments (used to obtain DDI scores) could be scalable in the number of species and strains. A high-throughput study of DDIs in gram-negative bacteria has shown that synergies are more conserved across species than antagonisms and additive combinations (Brochado et al. 2018).

How well does the evolution of DDI scores transmit the evolutionary patterns of their underlying molecular networks? This is an important question; assuming there is a reasonable correspondence, then our questions about DDI evolution mirror those about molecular networks: Does DDI score divergence accumulate linearly with divergence time, or follow some other function? Do DDIs diverge consistently with neutral processes acting on the underlying network structure? Or is there heterogeneity across the network, for example with differences accumulating slowly in constrained local neighborhoods, and faster between more distant connections? In other words, does the evolutionary rate of DDIs depend on the connectivity of the drug targets?

Another way to parse this latter question is to divide up DCs by type. Do synergistic interactions evolve at a slower rate than other types of drug interactions, and do antagonistic interactions evolve at a faster rate? These differences in rates would be a result of synergistic interactions taking place in local neighborhoods of nodes, while antagonistic interactions act across distant network neighborhoods, which may evolve more slowly due to greater network redundancy between distant nodes. Indeed, if we were to map DCs to protein targets (when known), are the drug targets of synergistic drug interactions closer in the network than additive and antagonistic interactions? And, more generally, are more closely connected nodes subject to higher rates of evolution than more distantly related nodes? These questions can be addressed by comparing rates of DDI score evolution with measures of inter-node connectivity.

To address these questions, we used the most complete available dataset of DCs measured across species and strains (Brochado et al. 2018). We modeled these DCs under a phylogenetic comparative framework and applied a multivariate Brownian motion model to estimate the evolutionary rate of interaction scores for different clusters of DCs in six strains (three species) of gram-negative bacteria. We also mapped DCs to their putative protein targets to evaluate them in known molecular networks. We show that DDI scores can be used as an effective proxy to evaluate macroevolutionary patterns of network evolution.

Results

DDI Scores Diverge nonlinearly

We obtained DDI scores for 2,655 pairwise combinations of 79 different compounds affecting six strains, two from each of the three gammaproteobacteria species: Escherichia coli, Salmonella enterica, and Pseudomonas aeruginosa. The DDI scores were measured and calculated by Brochado et al. (2018) using the Bliss independence model, which calculates the expected combined effect of two drugs as the product of the probabilities that each drug individually fails to produce its effect (Tallarida 2011; Foucquier and Guedj 2015). This expected outcome provides a baseline for detecting synergistic or antagonistic interactions, where the actual combined effect surpasses or falls short of the predicted effect, respectively. We performed hierarchical clustering (UPGMA) of the Euclidean distances among DDI scores from the six strains, revealing a “DDI score distance” tree (Fig. 1b). The DDI score distance is half of the cophenetic distance for each strain pair; i.e. the distance from either tip of the pair to the point in the tree where they first come together.

DDI score distance diverges nonlinearly over time. Species and strain abbreviations are shown in the top left box. a) Bayesian phylogeny and divergence time estimates based on an alignment of 27 highly conserved protein sequences. b) Hierarchical clustering of strains based on average Euclidean distances across DDIs (i.e. “DDI score distance”); drug combination (DC) data are from Brochado et al. (2018). c) Heatmap of DDI scores across strains. Along top, hierarchical clustering of DCs is shown based on Euclidean distances across strains, but these clusters were constrained by t-SNE cluster membership (see text). The heatmap displays synergistic interactions (close to −1), antagonistic interactions (close to 1), and additivities (close to 0; measured in Bliss score units). The bars below indicate whether the two drugs involved in the interaction are the same or different in terms of: belonging to the same drug category, targeting the same cellular process, or having the same use. d) Evolutionary rate of DDI score change, calculated for each cluster. e) Pairwise DDI score distances between strains as a function of divergence time between strains. Intraspecific comparisons, comparisons between Salmonella and Escherichia, and comparisons with Pseudomonas are each labeled. f) First two axes of a phylomorphospace-PCA for the DDI score data. The percent of variance explained for principal components 1 and 2 are shown.

We also estimated phylogenetic divergence times between the strains and species using calibrated phylogenetic analysis of highly conserved proteins (Fig. 1a). The 95% highest probability density of the most recent common ancestor (MRCA) of these three species is 1359 to 1527 million years ago (MYA), or 100 to 151 MYA for the two Enterobacteriaceae (E. coli and S. enterica).

To test how DDI scores diverge as a function of species divergence, we regressed the DDI score distances between strains against their divergence times (time to MRCA in MYA). There is a positive correlation between DDI score distance and divergence time (Fig. 1e). This is a log-linear relationship (R^2^ = 0.93, P-value = 6.8 × 10^−9^; supplementary fig. S1, Supplementary Material online). Specifically, the initial stages of DDI score divergence is characterized by rapid changes that accumulate between closely related strains, the rate of divergence slows down when comparing more distant species, and it saturates among the most distantly related species. This pattern may be consistent with both neutral and selective forces acting on the molecular networks underlying DDIs (see Discussion). Indeed, a phylomorphospace plot of the first two axes of a principal components analysis of the DC data (Fig. 1f) does not reflect the branch lengths of the DDI distance tree (Fig. 1b), suggesting that DDI score variance is explained by more factors than just time. Here, the largest variance component seems to distinguish all three species from each other (PC1, 47%), while the second component (PC2, 21.5%) seems to distinguish between Salmonella and the other species (Fig. 1f). This highlights that while the divergence time between Salmonella and Escherichia is small (Fig. 1a), the DDI score distance between these species is comparatively much larger, suggesting that after the divergence from their common ancestor, each lineage rapidly accumulated cellular and biochemical differences that are reflected in their differing DDI scores.

In light of these observations and to further explore these possibilities, we sought to estimate the evolutionary rate of DDI score change for particular DCs. However, there is low power with only six tips to estimate rate shifts for any individual DC. To address this limitation, we used t-SNE to detect clusters of DCs that behave similarly across species, allowing for increased power within each cluster. In order to find DC clusters that are the most biologically relevant from among a large parameter space (after filtering solutions), we first filtered for the top five solutions with the highest degree of phylogenetic modular signal (Adams et al. 2013). From among those five, we chose the solution with the biggest differences in evolutionary rate among clusters, when fitted with a multivariate Brownian motion model. A detailed description of the filters and tests used to select the clusters is in the Materials and Methods.

The resulting solution, with a perplexity of 245 and 16 clusters (Fig. 1c), was used for the remainder of the analysis. Cluster 14 had the highest evolutionary rate measured using the Brownian motion standard deviation parameter σ^2^, with a value of 0.00347 bliss score^2^ per million years, while all the other clusters had rates ranging from 0.00021 to 0.00169 (Fig. 1d). Cluster 15 had the second highest rate with a value of 0.00169 bliss score^2^ per million years, and cluster 1 had the third highest rate with a value of 0.00161 bliss score^2^ per million years.

Synergistic DDIs Have the Shortest Distance Between Network Targets, Followed by Additive DCs and Antagonistic DDIs

We observed that highly synergistic DDIs across all species tend to occur when both drugs belong to the same chemical category and target the same cellular process, as can be seen from the accumulation of green bars for category and process across cluster 14 (Fig. 1c; in agreement with (Brochado et al. 2018). This motivated us to formally examine how DCs map onto molecular networks. We therefore leave discussion of DDI evolution for now, to first describe our mapping of the DC targets to known molecular networks, and how connectivity on these networks relates to combination type (antagonism, additivity, synergy); in the next section we will describe how rates of DDI evolution map onto combination types and network connectivity. We were able to identify protein targets for 39 of the 79 drugs tested by Brochado et al. (2018) (see Materials and Methods). These drugs had a total of 27 target proteins as identified by their unique protein IDs in E. coli. The most common target category was bacterial penicillin-binding protein, a group involved in the biosynthesis of bacterial cell walls. Other target categories mapped include ribosomal RNA, DNA polymerase, topoisomerase, thymidylate synthase, and mitochondrial glycerol-3-phosphate (supplementary table S1, Supplementary Material online).

We then examined the E. coli DDI scores as a function of the connectivity of their targets. We used two E. coli networks: (i) a gold-standard cofunctional gene pair network (Fig. 2a), and (ii) a protein–protein interaction (PPI) network derived from small/medium-scale studies (Fig. 2b), both hosted on EcoliNet (Kim et al. 2015). The cofunctional network integrates probabilistic functional links, including shared GO and EcoCyc annotations, which signify molecular, metabolic, and biological process linkages or coregulation among genes and their products. Each edge of the network represents a shared functionality between gene pairs, reflecting their involvement in similar molecular or biological processes. The PPI network, compiled from curated databases, features high-confidence interactions between pairs of E. coli proteins.

a) Graphical representations of the cofunctional gene pair network of E. coli (EcoliNet: EcoCyc/GO-BP). This network contains 1,835 nodes with an average path length of 4.8 and contains 20 proteins targeted by 36 drugs in our analysis. b) Graphical representations of the PPI network of E. coli, as determined by small and medium-scale experiments (EcoliNet: LC. Small/medium-scale PPI). This network contains 764 nodes with an average path length of 4.9, with 18 proteins targeted by 27 drugs in our analysis. In a and b, each node represents a unique protein (KEGG ID) in E. coli; the red nodes are target proteins identified as participating in DCs in our analysis.

We measured the length of minimum distance paths between all drug targets in the networks. Targets of synergistic DCs have a significantly shorter minimum path length than antagonistic and additive DCs in both networks (Fig. 3a and b). This result is consistent with our expectations, given that synergistic DDIs tend to be more common between drugs that target the same cellular process and thus should be closer to each other in molecular networks (Brochado et al. 2018). We next examined K-edge connectivity, where two nodes are K-edge-connected if after removing k edges or less the nodes remain connected. For the cofunctional network, K-edge connectivity is higher for targets of synergistic DCs than additive DCs (Wilcoxon P-value = 0.0032), and higher for targets of additive DCs than antagonistic DCs (Wilcoxon P-value = 0.014) (Fig. 3c), as expected if targets of synergistic DCs are closer together. There was no significant difference among combination types for the PPI network, although targets of synergistic and additive DCs do have significantly higher connectivity than a background set of nontargets (Fig. 3d). Overall, these results suggest that, the closer together and better connected two nodes are to each other, the more likely they are associated with a synergistic DDI.

*Pairs of protein targets with synergistic DDIs have lower path length and higher connectivity and centrality measures in E. coli cofunctional (a, c, e, g, i) and PPI (b, d, f, h, j) networks. DDI types examined are synergies, additivities, and antagonisms (“nontargets” are a background sample of nondrug target proteins in the network). The number of interactions in each category is in parentheses. Network metrics are: a and b: path length between two targets, c and d: K-edge connectivity between two targets, e and f: mean node degree of the two targets, g and h: mean betweenness centrality of the two targets, i and j: mean eigenvector centrality of the two targets. In all plots, significance of Wilcoxon test P-values are given for differences in the mean between all pairwise comparisons: P-value < *0.05, **0.0001, **0.00001. For all plots, Kruskal−Wallis test for difference among groups is P < 1 × 10−7.

We also found that proteins in the cofunctional network that are associated with synergistic DDIs are more central and better connected than nodes that are associated with other types of interactions. In the PPI network, nodes that are associated with any type of DC are more central and better connected than nodes that are not targeted in our DC set, but don't differ by combination type. We arrived at these conclusions by examining three metrics (node degree, betweenness centrality, eigenvector centrality) that are characteristics of individual nodes (we averaged between the two target nodes in the DC). Average node degree (the number of connected edges) is significantly higher for targets of synergistic DCs than additive (Wilcoxon P-value = 0.0024) or antagonistic DCs (Wilcoxon P-value = 0.00018) in the cofunctional network (Fig. 3e), but not in the PPI network (Fig. 3f). Average betweenness centrality (how much each node lies on the shortest paths between pairs of other nodes in the network) is higher for targets associated with synergistic DCs than with additive (Wilcoxon P-value = 0.0015) or antagonistic DCs (Wilcoxon P-value = 0.013) in the cofunctional network (Fig. 3g), but not in the PPI network (Fig. 3h). Average eigenvector centrality (how much each node is connected to other important nodes in the network) was also higher for targets associated with synergistic DCs than with additive (Wilcoxon P-value = 0.0002) or antagonistic DCs (Wilcoxon P-value = 0.0005) in the cofunctional network (Fig. 3i), but these differences were not significant in the PPI network (Fig. 3j). Together, these results suggest that network connectivity of a protein affects the likelihood that it will be associated with a DDI, as well as the type of DDI.

Synergistic DDIs Evolve Faster than Additive and Antagonistic DDIs

We return now to describing the evolutionary rates of DDI scores (Fig. 1d), as determined on a per cluster basis (via t-SNE, described above). Cluster 14, which had the highest evolutionary rate (σ^2^ = 0.00347), appears to be rich in synergies in E. coli and Salmonella, while Pseudomonas has more additive DCs in this cluster (Fig. 1b). In contrast, cluster 15, which was the second highest rate cluster (σ^2^ = 0.00169), appears to be rich in synergies in Pseudomonas while the DCs in E. coli and Salmonella are primarily additive. The cluster with the third highest rate, cluster 1 (σ^2^ = 0.00161), contains highly antagonistic DDIs in all the species. These observations suggest that the evolutionary rate of DDI scores may vary as a function of the combination (DDI) type and network connectivity of their targets. To more formally investigate this, we examined how the rates vary based on the targets’ connectivity (in E. coli) and type of interaction. None of the DCs are exclusively antagonistic across all strains and species, although we found all other combinations (i.e. additivity, synergy, additivity↔antagonism, additivity↔synergy, antagonism↔synergy and additivity↔antagonism↔synergy, where the arrow indicates that some strains or species have one DDI type and other strains or species have another DDI type).

Overall, synergistic DDIs and additive↔synergistic DDIs have faster evolutionary rates than any other class (Fig. 4a; all possible group comparisons were significantly different from each other except additivity↔antagonism↔synergy vs. additivity↔antagonism, and excluding antagonism↔synergy, with only a single observation; see supplementary table S2, Supplementary Material online for all statistical tests in Fig. 4; significant differences mentioned have P-value < 0.001 unless indicated). We also aggregated each target pair's DDI type as a simple sum of its DDI types across strains and species (+1 for synergies, 0 for additivities, −1 for antagonisms), and found again that more synergistic DDIs (sum ≤ −3) have faster evolutionary rates than additivities (3 > sum > −3) and antagonisms (sum ≥ 3) (Fig. 4b). This result of higher evolutionary rates for synergistic DDIs is also true for the targets that are in both the PPI and cofunctional networks described earlier (Fig. 4c; all categories are significantly different, except for additivities vs. antagonisms in the PPI network).

Evolutionary rates of DDI scores as a function of DDI type and network connectivity of targets. a) DCs resulting in only synergy or synergy↔antagonism DDIs across strains have faster evolutionary rates. b) Aggregated interaction types reveal that more synergistic DCs have higher evolutionary rates. The x axis value is the sum, per DC, across strains, where DDIs are scored as −1 for synergies, 0 for additive, and +1 for antagonistic interactions. c) Rate of DDI evolution as a function of DDI type, for DCs with protein targets in the cofunctional and small/medium-scale PPI networks. Synergistic DCs have higher evolutionary rates than additive and antagonistic DCs (combination types and networks from E. coli). d) Rate of DDI evolution as a function of the minimum distance between DC target proteins reveals that wiring between close nodes in molecular networks evolves more quickly than between distant nodes. See supplementary table S2, Supplementary Material online for statistical tests for differences among groups shown here. For all plots, the error bars represent the standard error of the mean.

Furthermore, DCs whose targets are nearby in the network have the highest rate of DDI score evolution (Fig. 4d). This is consistent with the fact that cluster 14 has the highest rate (Fig. 1d), and contains a majority of synergies (Fig. 1c). This result of nearby targets having the highest DDI evolution rates is significant for both the PPI network (P-value = 0.007) and the cofunctional network (P-value = 0.019). Our observation of higher rates of DDI evolution among the close nodes is consistent when using reference networks from other species: we repeated this analysis by mapping E. coli target proteins to their orthologs in other species, and found a similar result among networks from four corresponding strains (including E. coli, Pseudomonas and Salmonella) in comparative networks from the STRING database (supplementary fig. S2, Supplementary Material online). The evolutionary rate is also significantly faster when simply comparing DCs that target adjacent versus non-adjacent targets in the E. coli PPI network (P-value = 0.0049; supplementary fig. S3B, Supplementary Material online), and for the EcoCyc portion of the E. coli cofunctional network (supplementary fig. S3A, Supplementary Material online), and when comparing connected versus disconnected targets for this latter network (supplementary fig. S3C, Supplementary Material online).

Protein pairs with the highest K-edge connectivity (>20) are targets of DCs with higher DDI score evolutionary rates in the cofunctional network (but not the PPI network, where all targets have connectivities < 5; supplementary fig. S3K, Supplementary Material online). Interestingly, DCs that have evolved between additive and synergistic DDI types tend to have targets with greater connectivity, degree, and betweenness centrality values in the E. coli networks (supplementary fig. S3, E to G, Supplementary Material online). Together, these results indicate that, as the distance and connectivity between two targets increase (as measured, for example, by their minimum connecting path length), the average evolutionary rate of the DDI scores decreases. We interpret this to mean that wiring between close nodes in molecular networks evolves faster than between distant nodes.

Discussion

Overview

We studied the evolution of DDI scores as a proxy for studying the evolution of network topology. We found that the evolutionary rates of DDIs among gram-negative bacteria are initially high, lower at longer evolutionary distances, and plateau at the largest distances. This suggests that chemogenomic variation rapidly accumulates. This helps to explain the observation that therapeutic combination therapies are typically taxon specific; for example, in fungi very few combinations of antifungals act across species (Brown et al. 2014).

We then mapped drugs with known targets to different molecular networks in E. coli K-12 and found that the targets of synergistic DDIs are closer in molecular networks than other types of DDIs, and that synergistic interactions have a higher evolutionary rate, suggesting that connectivity between nodes that are closer in molecular networks evolves at a faster rate than between more distant nodes.

DDI Evolution

Examining pairwise distances of DDI scores between species, we observed a rapid accumulation of differences and overall dissimilarity in DDI responses among species. Indeed the DDI score distances curve appears saturated among distantly related species. The high rate of DDI evolution we observed is consistent with the observations from the source paper that most (∼70%) drug interactions are species-specific, and about one fifth are strain-specific (Brochado et al. 2018). Indeed, it is now generally appreciated that some DDIs show substantial genetic variation (Roemhild et al. 2022). For instance, the interaction between the aminoglycoside tobramycin and different β-lactams can be either synergistic or additive across isolates of multidrug-resistant Enterobacteriaceae (Fass 1982). Some interactions between drugs used for treating urinary tract infections are additive across E. coli isolates (50%, e.g. mecillinam and ciprofloxacin) or antagonistic across isolates (10%, mecillinam and nitrofurantoin), while two different combination types occur across isolates for other combinations (40%; e.g. mecillinam and trimethoprim) (Fatsis-Kavalopoulos et al. 2020).

We also clustered DDIs by their similarity across strains and species and then evaluated the rate of evolution of these clusters under a Brownian motion model. This approach is analogous to methods used in geometric morphometrics, in which specimen landmarks are grouped to evaluate different features, and modularity tests are used to assess whether there are differences in the evolution rate of different regions of that organ (Smaers and Vanier 2019). Modularity among traits mirrors shared functional, genetic, and developmental pathways (Cheverud 1996). Here, the signal of high modularity among DDI clusters is a result of variance between clusters being greater than the variance within clusters. We observed that the distribution of synergies, additivities, and antagonisms among clusters partially explains different rate estimates. Clusters with low rates are composed of mostly additive DDIs; however, some of these low-rate estimates are caused by resistance to one or both antibiotics, and a low phylogenetic signal is expected when multiple strains are resistant (Michel et al. 2008). Importantly, low evolutionary rates can reflect two fundamentally different scenarios. The first is a scenario where a functional relationship between two drug targets is conserved across species and strains, likely under purifying selection. For instance, cluster 4 contains many DCs that result in synergistic or slightly synergistic DDIs and are conserved across all strains. A second scenario involves targets that do not interact functionally, resulting in consistently additive interactions across different biological contexts. These interactions represent “evolutionary indifference” where a lack of connectivity between nodes in the network, for example, may contribute to the apparent conservation. Our use of t-SNE clustering categorized DDIs into clusters based on both their interaction patterns and evolutionary rates, so that these two low-rate scenarios were separated into distinct clusters. In contrast to these low-rate clusters, high rate clusters such as 14, 15, and 1 contain DCs with a combination of synergistic and additive DCs across strains, or else antagonistic and additive DCs across strains. This variability may reflect a complex interplay of evolutionary forces acting on the underlying network.

Biological Network Inferences

With these rate estimates and interaction patterns, we asked whether network-based metrics are correlated with DDI divergence, by linking a subset of DCs with their respective protein targets in E. coli cofunctional (EcoCyc/GO-BP) and PPI networks. We hypothesize that the rate of evolution of a DDI reflects the evolution of underlying connections between the protein targets of the DC.

We first made several observations about the E. coli DDI types and network connectivity: synergistic DDIs occur when the targets are closer in the network, while additive and antagonistic DDIs target, on average, more distant proteins (Fig. 3a and b). This result is not surprising given the fact that synergies are expected to be more prevalent in DCs that target the same pathway or that belong to the same chemical class (Cowen and Steinbach 2008; Yeh et al. 2009), however, we think this is the first time this pattern has been shown using network data. We also found that synergistic interactions are more prevalent among targets that are part of the same cellular process or functional category, while antagonisms can take place between targets that are in different parts of the cell (Yeh et al. 2006; Wang et al. 2012; Brochado et al. 2018). We further found that targets associated with synergistic DDIs have a higher node degree (for the cofunctional network) and K-edge connectivity than targets of DCs with additive or antagonistic DDIs (Fig. 3c to f). This result agrees with a previous study that reports on the positive contribution of node degree in synergy prediction (Liu et al. 2022). Average node betweenness and eigenvector centralities were also found to be more significant among targets of DCs with synergistic DDIs compared to additivities and antagonisms in the cofunctional network (Fig. 3g to j). This result suggests that targets of DCs with synergistic DDIs tend to be “hubs” in the network (Li et al. 2011; Tilli et al. 2016), or that they are more central to different pathways than targets of DCs with antagonistic or additive DDIs.

At large evolutionary scales, network rewiring rates appear to slow down (Shou et al. 2011); our DDI-based estimates also suggest a pattern of saturation of network divergence at the largest evolutionary distances (Fig. 1e). Beltrao and Serrano (2007) estimated a single rewiring rate among proteins that is faster than sequence evolution; here, we demonstrated that there is a distribution of evolutionary rates. We examined the rate of evolution of DDIs whose targets are closer in the PPI and cofunctional networks and found that their rate of evolution is higher (Fig. 4d). This is correlated with our observation that synergies tend to occur when DC targets are closer in the PPI and cofunctional networks (Fig. 3a and b) and have a high evolutionary rate (Fig. 4a to c). Accordingly, as the path length between targets increases (and for more antagonistic DDIs), the evolutionary rate decreases (Fig. 4c and d). Our results indicate that the rewiring rates of DDI scores are higher for drug targets that are nearby within biological networks, contrary to the initial expectation that more distant nodes would have greater flexibility for rewiring. We hypothesize several mechanisms to explain these findings, including roles for epistasis, pathway topology, and pharmacodynamics. Proximity in biological networks often implies direct epistatic interactions (where the effect of one gene is directly influenced by another), which can be rapidly rewired through single mutational events (Phillips 2008). Conversely, targets that are functionally distant typically exhibit indirect epistasis, with multiple intervening genes that can buffer changes, thus requiring multiple mutations for effective rewiring (Wagner 2013). This buffering capacity reduces the likelihood of rapid change among distant targets in our DDI analysis.

We found differences in rates are also associated with differences in interaction type. Indeed, interaction signs (synergistic, antagonistic) are influenced in complex ways by the local topology of the pathway, including whether the drug targets act in series or parallel. Modifications in drug effects (e.g. partial vs. total inhibition) or pathway essentiality may swiftly convert an interaction from synergistic to antagonistic or vice versa. Such transformations are more feasible within tightly connected networks where targets share direct functional relationships (Yeh et al. 2006). Another mechanism by which network proximity may determine rates of connectivity evolution is through the action of gene duplication, or segmental duplication affecting multiple genes in an operon. Such duplications may be particularly relevant in closely connected network nodes, where duplicated genes are likely to interact and undergo selective pressures together. For example, if gene duplication results in parallel pathways and selective drug resistance develops differently in these pathways, it can transform drug interaction dynamics from antagonism to synergy, as one pathway compensates for the inhibited function of the other (Díaz-Mejía et al. 2007; Hegreness et al. 2008). This capability for rapid adaptation is less probable in distant nodes, which are typically part of separate functional systems, requiring more extensive genetic and regulatory changes to evolve new interactions and less likely to affect localized clusters of duplicated genes (Schmidt et al. 2003; Conant and Wolfe 2008). The functional and spatial proximity of these nodes within both the network and the genome promotes faster evolutionary rewiring, supporting observed patterns of network dynamics where close nodes exhibit more rapid evolutionary changes compared to more distantly connected nodes. Because the method that we are using relies on druggability of respective targets, pharmacokinetic effects of the network are also detected, and may vary as a function of distance between targets. For distant targets, interactions often manifest through altered pharmacokinetics (e.g. absorption, metabolism) rather than direct pathway interactions. Changes in one drug's metabolism can inadvertently affect the bioavailability of another, leading to modifications in DDI type (Roemhild et al. 2022).

These mechanisms suggest why closer nodes might exhibit higher rates of evolutionary rewiring compared to more distant ones. Our findings are in line with the notion that synergistic interactions, which are predominantly found among closely linked targets, tend to accelerate the evolution of resistance, potentially driving faster network changes (Hegreness et al. 2008). While our use of DDIs to investigate network change opens up more questions than answers, a key attractive feature is that they highlight specific routes for future investigation of the mechanisms behind the putative network changes.

An important extension of the work presented here involves leveraging available cross-species network data (e.g. STRING; Szklarczyk et al. 2023). Such analysis will facilitate direct comparison of network connectivity evolution from network data and drug combination data. Comparative network analyses are also important for evaluating the sensitivity of our observed relationships between DDI evolutionary rates and network connectivity to the species from which the reference network is obtained. In a preliminary analysis, we examined STRING's PPI and comprehensive networks for four strains: E. coli strains K-12 (ebw) and IAI1 (ecr), Salmonella strain LT2 (stm), and Pseudomonas strain PAO1 (pae). All of these networks support our primary conclusion that DDI evolutionary rates decrease with increasing network distance (supplementary fig. S2, Supplementary Material online). Further work with comparative network analysis (e.g. De Domenico et al. 2015) will be crucial for advancing our understanding of how network topology evolution corresponds to evolution of DDIs.

Our observation of differential evolutionary rates among drug interaction types suggests that different molecular pathways have different rewiring rates. Different processes (besides the drugs themselves) may be driving this pattern, for instance, positive selection on mutations that cause coexpression or colocalization of components from different networks may result in novel connections between the previously disconnected pathways. Purifying selection against loss of function may also constrain the rewiring of the network.

Beltrao and Serrano (2007) suggested that differences in rewiring rates between functional categories may be attributed to natural selection. In our DDI analyses, we found that there is variation in functional process annotations among the subset of known target proteins of drugs in clusters 2, 3, 4, 6, and 16 (supplementary fig. S4, Supplementary Material online). These clusters all showed enrichment for anion binding (GO:0043168) and small molecule binding (GO:0036094), while clusters 3, 6, and 16 are also enriched for ion binding (GO:0043167). These results suggest that differences in rates of DDI evolution may be partially explained by differences in natural selection acting on different underlying network components. However, next, we highlight a different type of bias in DDI data that may contribute to our observed signal and cause difficulty in the interpretation of DDI data.

Network vs Resistance Evolution

Another important explanation for our observed signal (high evolutionary rate of synergistic, well-connected targets) lies in the fact that some strains are resistant to particular antibiotics. We suspect that some changes in DDIs that we correlated with changes in network structure are actually caused by changes in resistance to one or both drugs, rather than changes in the drug interaction term itself. Changes in resistance are likely based on structural changes in either target proteins, or other resistance mechanisms. These other resistance mechanisms include transporter proteins or cell membrane permeability characteristics, which affect cross-resistance and collateral sensitivity, and do not represent the types of molecular network changes that we hoped to measure with the DDI. Indeed, there is genetic variation in how drug interactions are affected by the order in which DCs are administered, suggesting hysteresis (Roemhild et al. 2022). Note that Brochado et al. (2018) reported any strain with fitness values (e.g. ratio of drug/nondrug treated growth) of >0.7 as being resistant to that drug. However, some drugs that are known to be clinically bacteriostatic (inhibit growth) fall in this range, and strains are reported to be resistant; e.g. see the case of sulfamonomethoxine resistance in all species and strains, below. Indeed, Bushby (1973) reported “resistance to one of the drugs as measured by conventional tests may not abolish synergy.” To consider the effect of resistance differences on our results, we repeated the entire analysis filtering for DDIs that are “susceptible” (fitness < 0.7) in at least one of the species for both drugs, assuming that this decreases the proportion of true resistance cases in the dataset. This set included 1127/2655 DCs (42.2% of the total DCs). Our results for this smaller set are consistent with the full dataset (supplementary figs. S5 to S7, Supplementary Material online). A minor difference in the subset is that antagonisms have a slightly lower (instead of equivalent) rate than additivities. Although this consistency between the full and reduced dataset is desirable, it does not remove the possibility that differences in resistance among strains are a substantial part of the evolutionary signal.

We used DDIs to learn about the evolution of networks, so discussion of some of the individual DDIs illustrates strengths and limitations of the approach. The evolutionary and network parameters also provide some insight into how and why individual DDIs with clinical relevance may vary across species. Next, we discuss these aspects of the interaction between A22 and novobiocin, and sulfamonomethoxine and trimethoprim or erythromycin.

A22 and Novobiocin

The DDI between A22 and novobiocin is a case example (supplementary table S3, Supplementary Material online). Briefly, A22 binds the ATP-binding domain of MreB (eco:b3251) (Bean et al. 2009), the actin homolog in prokaryotes, while novobiocin binds gyrase subunit B (gyrB, eco:b3699) and topoisomerase IV (parC, eco:b3019), which are two type II topoisomerases involved in DNA unwinding and DNA duplication (Gellert et al. 1976; Hardy and Cozzarelli 2003). MreB and ParC are known to physically interact (Madabhushi and Marians 2009) and are required for chromosome segregation (Huang et al. 1998). A22 and novobiocin have previously been reported to be synergistic in E. coli and P. aeruginosa (Taylor 2011; Taylor et al. 2012); however, in the dataset from Brochado et al. (2018), the combination type is variable within species. Both E. coli strains are susceptible to both drugs, but they have different interaction terms: one is additive, the other synergistic. Salmonella also shows intraspecific differences, where one of the strains is synergistic and the other additive, although in this case, the synergistic strain is resistant to novobiocin, with a fitness of 0.85. Both Pseudomonas strains are resistant to novobiocin, with fitnesses greater than 0.86, and they have additive DDIs. This high variability of DDI type within species resulted in an evolutionary rate of 0.008 bliss^2^/MYA for this DC, within the cluster with the highest evolutionary rate. The distance between these targets in the cofunctional (EcoCyc/GO-BP) network was 2, and in the DDI network the distance between targets was 4. Why are the rates so high among strains, for this DDI? Strain-specific network predictions at STRING (Szklarczyk et al. 2023) suggest strain-specific differences in local network content. Of the genes all connected to each other in the local network (mreB, gyrA, parC, gyrB), parC is missing in three strains, while a fourth strain has reduced connectivity for MreB (only connected to gyrB). This type of gene content variation suggests the type of intraspecific network variation that may lead to high observed rates of DDI evolution.

Sulfamonomethoxine and Trimethoprim or Erythromycin

Another illustrative example of a known DDI used in the clinic and found in our data is the combination of sulfamonomethoxine and trimethoprim (supplementary table S3, Supplementary Material online) (Bushby 1973). These drugs inhibit successive steps in the synthesis of tetrahydrofolic acid synthesis, necessary for the biosynthesis of amino acids, purines, and thymidine. Sulfamonomethoxine competes against the enzymatic substrate of dihydropteroate synthase (folP, eco:b3177); this blocks the production of dihydrofolic acid, which in turn is a substrate for dihydrofolate reductase. Trimethoprim inhibits dihydrofolate reductase (folA, eco:b0048) by competing against the substrate for the binding site of the enzyme. Even on its own, trimethoprim can contribute to “thymineless death” (Then and Angehrn 1973); but while the effect of each drug on its own is bacteriostatic (prevents growth; Ocampo et al. 2014), when used in combination, the combined effect is bactericidal (kills bacteria). This is the cause of the strong synergistic effect between the two drugs (Bushby 1973; Ocampo et al. 2014).

We measured an evolutionary rate of 0.008 bliss^2^/MYA for this DDI, as it belongs to the cluster with the highest evolutionary rate. This rate can be explained by the difference in DDI type between Pseudomonas and non-Pseudomonas strains. In the data used in our study, the interaction was synergistic for E. coli and S. enterica, but additive in Pseudomonas. The DDI may be additive in Pseudomonas because it is resistant to both drugs, due to differences in its permeability to the drugs and the presence of efflux pumps that remove the drugs (Eliopoulos and Huovinen 2001). In contrast, E. coli and S. enterica were only resistant to a single drug, sulfamonomethoxine.

Our goal in using the sulfamonomethoxine and trimethoprim DDIs to study the network, in this case, was to capture the rate of evolution of the folate biosynthesis pathway across species. The fact that the two targets act as successive steps in all studied species suggests several possible causes for this observed rate. There are some differences in the presence of pathways components that are peripheral to these steps (Pribat et al. 2010), and it is possible that such differences in the network beyond the direct connection contribute to DDI differences. Indeed, beyond their direct link in the cofunctional (EcoCyc/GO-BP) network, the targets are highly connected to each other, with a K-edge connectivity of 21 (68% quantile), meaning that 21 edges have to be removed to disconnect the two targets. It is also possible that there are differences in the direct target enzymes and their connectivity (e.g. concentration, kmax). An alternative explanation is resistance: resistance to sulfamonomethoxine and trimethoprim was detected soon after they were introduced in clinical treatments. Several point mutations have been described in the target genes across different bacteria, and the most common mechanism of resistance is due to the bacteria having an extra copy of the target gene (Estrada et al. 2016). All of this suggests that, for sulfamonomethoxine and trimethoprim, evolution of the DDI is the result of many types of biological differences, not just specific differences in network interactions between the drug targets. This is an important limitation to the DDI approach to studying network evolution.

Trimethoprim also interacts with erythromycin, a macrolide that inhibits protein synthesis and has a bacteriostatic effect. This interaction is additive in E. coli, but when sulfate is added to the media the combination type becomes suppressive (a type of antagonistic interaction) (Qi et al. 2021). Trimethoprim may cause sulfur limitation resulting in changes in expression of sulfur reduction genes. Interestingly, this suppressive effect is dependent on the gene crl, and a crl knockout strain eliminates the suppressive response entirely. This is therefore a clearly demonstrated case of genetic variation affecting a DDI. Because Pseudomonas does not have crl (Cavaliere et al. 2015) we might have expected to see a high rate of DDI evolution; however, Brochado et al. (2018) did not add sulfate to the growth media, and thus the interaction between trimethoprim and erythromycin is additive across all species and in a cluster with one of the lowest evolutionary rates (0.0003 bliss^2^/MYA). We predict that, in the presence of sulfate, the rate of this DDI would be much higher (suppressive in non-Pseudomonas strains, and additive in Pseudomonas), illustrating a case of DDI evolution as a result of network evolution, as well as a strong dependence on environmental conditions.

Conclusions

Here, we introduce a novel framework to compare evolutionary rates across entire networks. We examined changes in DC effects among strains and species, and compared these rates of change with the DDI types and, when possible, network topology of the drugs’ targets. This approach has some advantages over direct network analyses, such as increasing the number of species under study, without the need of obtaining the underlying network in each species. DDIs can be quantified using high-throughput experiments across different species and strains, and are phenotypic quantitative traits that can be modeled in a phylogenetic comparative framework, allowing for an independent measurement of evolutionary rates between nodes in the network. An important limitation of our approach is that evolution of resistance to a drug may obscure underlying network-based effects. Still, our approach suggests a general picture of network evolution, where close nodes in the network (which tend to respond synergistically in response to targeting both at once) evolve at faster rates than more distant nodes.

Materials and Methods

DDI Scores, Strain Susceptibility, and Combination Types

We obtained DDI scores (Bliss scores) from a previous study (Brochado et al. 2018) that assessed 2,655 combinations (after removing 228 DDIs with missing data) of 79 different compounds on six strains of three species of gram-negative bacteria: Escherichia coli K-12 BW2952 (ebw), E. coli O8 IAI1 (ecr), Salmonella enterica subsp. enterica serovar Typhimurium 14028S (seo), S. enterica subsp. enterica serovar Typhimurium LT2 (stm), Pseudomonas aeruginosa PAO1 (pae), P. aeruginosa UCBPP-PA14 (pau). We used the following datasets from Brochado et al. (2018): interaction scores from table ED09C as input for our phylogenetic comparative analysis, strain susceptibility to antibiotics from supplementary table S1, Supplementary Material online (used for our supplementary figs. S5 to S7, Supplementary Material online), and the categories that DDIs belong to from supplementary table S2, Supplementary Material online.

Divergence Time Estimates

We obtained concatenated alignments of 27 highly conserved protein sequences for the six strains using PhySpeTree (Fang et al. 2019). We simultaneously estimated the phylogeny and divergence times in BEAST2 (Bouckaert et al. 2014) using the following approach: We employed one partition for each protein, with linked trees/clocks. A Yule tree model was used in conjunction with an optimized relaxed molecular clock. The following priors were used for calibrating node time estimates: For the divergence time between E. coli and S. enterica, 100 to 160 MYA (Vernikos et al. 2007; Meysman et al. 2013; Knöppel et al. 2018). The divergence time between these species and P. aeruginosa occurred within the range of 1350.0 to 1527.7 MYA (Battistuzzi et al. 2004; Blair Hedges and Kumar 2009; Marin et al. 2017). The Markov chain Monte Carlo analysis was run for 20 million iterations.

Clustering DDIs Using t-SNE

To reduce data dimensionality of the DDI trait space, we classified DDIs into clusters using t-SNE in the R packages bigMap (Garriga and Bartumeus 2018) and bigmemory (Kane et al. 2013) with parameters: 80 threads, 80 layers, and 9 rounds. DDIs that were similar to each other across strains were clustered together. The t-SNE analyses were performed for a range of 251 perplexity values between 5 and 2,505 (supplementary Doc. S1, Supplementary Material online). The clustering output was evaluated based on the stability and plateauing of cost and effect size, and the variance between threads. We obtained stable solutions between perplexity values 175 and 1455. The pakde algorithm was applied with a perplexity of 1/3 the respective t-SNE perplexity. To find which of the clustering solutions was the most modular, we tested for modularity in the data using the function phylo.modularity within the R package geomorph (Adams et al. 2013). The five most modular clustering patterns were selected with the most negative ZCR coefficients. These clustering patterns had the following perplexity values: 355, 245, 275, 215, and 345; and the following number of clusters: 9, 16, 14, 18, and 12. All five clustering patterns had a strong modular signal, with multivariate effect sizes under −26.6, P-value = 0.001 and covariance ratios below 0.91. Lastly, we fitted a multivariate Brownian motion model to each of these five clustering patterns and calculated the evolutionary rates per cluster using compare.multi.evol.rates, also using the R package geomorph (Adams et al. 2013). Out of the five most modular clustering models, the model with a perplexity of 245 was the one with a higher Z effect in the test, showing the biggest differences between groups. Thus, this clustering pattern was used in the following steps. In addition, the sigma rates calculated for each of the clusters were used as approximations for DDI evolutionary rates that are part of that cluster.

Identification of Drug Targets and their Molecular Networks

Each drug was identified with unique Pubchem and CHEMBL IDs using webchem (Szöcs et al. 2020), which were then used to retrieve their mode of action from IUPHAR (Armstrong et al. 2020). We also compared our targets to a previously published dataset on drugs and drug targets (Santos et al. 2017), identified unique Uniprot IDs and KO IDs for each target protein, and converted these IDs into E. coli Uniprot IDs using the KEGG Orthology (Kanehisa and Goto 2000). We did not include drugs whose mode of action was unknown, or that had nonprotein molecules as targets, such as small molecules, RNA, or DNA.

Two molecular networks of E. coli were downloaded from EcoliNet (https://www.inetbio.org/ecolinet/downloadnetwork.php) (Kim et al. 2015), small/medium-scale PPIs (LC; 764 genes, 1,073 links) and the gold-standard cofunctional gene pair network of E. coli derived from EcoCyc and GO-BP (1,835 genes, 10,804 links). Comparative networks were also downloaded from the STRING database (Szklarczyk et al. 2023) for Escherichia coli K-12 BW2952 (ebw), E. coli O8 IAI1 (ecr), S. enterica subsp. enterica serovar Typhimurium LT2 (stm), Pseudomonas aeruginosa PAO1 (pae), filtered for interactions with a 70% score or higher, and mapped to our DDI data via KEGG ortholog IDs.

Inter-node Network Metrics

For each of the networks, we calculated the average path length and node degree distributions. In addition, the minimum distance between each of the nodes in the network was calculated, as well as the node degree (number of connections per node), the K-edge connectivity between each pair of nodes (i.e. the minimum number of edges that can be removed to disconnect the nodes), betweenness centrality (i.e. a measure of centrality in the network based on shortest paths) and eigenvector centrality (i.e. a measure of the influence of the node in the network). We used the R package igraph (Csardi and Nepusz 2006) to calculate these values in each one of the molecular networks and for each node or pair of nodes. We also generated an adjacency matrix, which contains information on whether two nodes are connected directly by an edge or not. In addition, we used K-edge connectivity as a proxy for connectedness between proteins (i.e. a pair with K-edge connectivity equal to zero is disconnected, and proteins with K-edge connectivity different than zero are connected).

Enrichment Analysis per Cluster

Differential gene set enrichment analysis was performed across the subset of DDIs with known target proteins for each cluster using the R package clusterProfiler ver 4.10.0 (Yu et al. 2012) and org.EcK12.eg.db. The background set included only the known targets. In many cases, drugs from different clusters target the same protein.

Supplementary Material

msae098_Supplementary_Data

Bibliography85

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Adams DC, Otárola‐Castillo E, Paradis E. geomorph: an R package for the collection and analysis of geometric morphometric shape data. Methods in Ecology and Evolution. 2013:4(4):393–399. 10.1111/2041-210X.12035. · doi ↗
2Armstrong JF, Faccenda E, Harding SD, Pawson AJ, Southan C, Sharman JL, Campo B, Cavanagh DR, Alexander SPH, Davenport AP, et al The IUPHAR/BPS guide to PHARMACOLOGY in 2020: extending immunopharmacology content and introducing the IUPHAR/MMV guide to MALARIA PHARMACOLOGY. Nucleic Acids Res. 2020:48(D 1):D 1006–D 1021. 10.1093/nar/gkz 951.31691834 PMC 7145572 · doi ↗ · pubmed ↗
3Battistuzzi FU, Feijao A, Hedges SB. A genomic timescale of prokaryote evolution: insights into the origin of methanogenesis, phototrophy, and the colonization of land. BMC Evol Biol. 2004:4:44. 10.1186/1471-2148-4-44.15535883 PMC 533871 · doi ↗ · pubmed ↗
4Bean GJ, Flickinger ST, Westler WM, Mc Cully ME, Sept D, Weibel DB, Amann KJ. A 22 disrupts the bacterial actin cytoskeleton by directly binding and inducing a low-affinity state in Mre B. Biochemistry. 2009:48(22):4852–4857. 10.1021/bi 900014 d.19382805 PMC 3951351 · doi ↗ · pubmed ↗
5Beltrao P, Serrano L. Specificity and evolvability in eukaryotic protein interaction networks. P Lo S Comput Biol. 2007:3(2):e 25. 10.1371/journal.pcbi.0030025.17305419 PMC 1797819 · doi ↗ · pubmed ↗
6Bernhardsson S, Gerlee P, Lizana L. Structural correlations in bacterial metabolic networks. BMC Evol Biol. 2011:11:20. 10.1186/1471-2148-11-20.21251250 PMC 3033826 · doi ↗ · pubmed ↗
7Blair Hedges S, Kumar S. The timetree of life. Oxford: Oxford University Press; 2009.
8Bouckaert R, Heled J, Kühnert D, Vaughan T, Wu C-H, Xie D, Suchard M, Rambaut A, Drummond A, Prlic A. BEAST 2: A Software Platform for Bayesian Evolutionary Analysis. P Lo S Computational Biology. 2014:10(4):e 1003537. 10.1371/journal.pcbi.1003537.24722319 PMC 3985171 · doi ↗ · pubmed ↗