Drug-drug interaction prediction based on co-medication patterns and graph matching
Wen-Hao Chiang, Li Shen, Lang Li, Xia Ning

TL;DR
This paper introduces novel kernel methods utilizing graph matching and co-medication patterns within support vector machines to accurately predict adverse drug reactions from complex drug combinations.
Contribution
It presents new kernels based on graph matching and co-medication data for predicting drug interactions of arbitrary orders, advancing the accuracy of adverse drug reaction prediction.
Findings
Achieved an AUC of 0.912 on real-world data
Utilized co-medication patterns to measure drug similarities
Developed kernels effective for complex drug combination prediction
Abstract
Background: The problem of predicting whether a drug combination of arbitrary orders is likely to induce adverse drug reactions is considered in this manuscript. Methods: Novel kernels over drug combinations of arbitrary orders are developed within support vector machines for the prediction. Graph matching methods are used in the novel kernels to measure the similarities among drug combinations, in which drug co-medication patterns are leveraged to measure single drug similarities. Results: The experimental results on a real-world dataset demonstrated that the new kernels achieve an area under the curve (AUC) value 0.912 for the prediction problem. Conclusions: The new methods with drug co-medication based single drug similarities can accurately predict whether a drug combination is likely to induce adverse drug reactions of interest. Keywords: drug-drug interaction prediction; drug…
| Notation | Description |
|---|---|
| $\mathtt{d}$ | Drug |
| $D$ | Drug combination |
| $\mathcal{G}$ | Complete graph for a drug combination |
| $\text{SDS}_{2d}$ | Single drug similarity from drug 2D structures |
| $\text{SDS}_{cm}$ | Single drug similarity based on co-medications |
| $\mathcal{K}_{gm}$ | Kernel based on graph matching algorithm |
| $\mathcal{K}_{cd}$ | Kernel from common drugs |
| $\mathcal{K}_{ds}$ | Kernel from drug similarities |
| $\mathcal{K}_{pb}$ | Probabilistic drug combination kernel |
| no | total | ||
|---|---|---|---|
| total |
| dataset | stats | $\mathcal{N}^{-}$ | $\mathcal{N}^{0}$ | $\mathcal{M}^{0}$ | $\mathcal{M}^{+}$ |
|---|---|---|---|---|---|
| full | # combinations | 621,449 | 1,264 | 8,986 | 27,387 |
| full | # drugs | 1,209 | 417 | 881 | 1,201 |
| full | avgOrd | 6.100 | 2.351 | 3.588 | 7.096 |
| full | avgFrq | 1.761 | 225.317 | 13.730 | 1.402 |
| full | avg $\mathtt{OR}$ | - | 0.546 | 16.343 | - |
| pruned | # combinations | 2,200 | 1,264 | 2,464 | 1,000 |
| pruned | # drugs | 562 | 417 | 692 | 679 |
| pruned | avgOrd | 2.678 | 2.351 | 3.809 | 7.615 |
| pruned | avgFrq | 42.082 | 225.317 | 20.565 | 5.520 |
| pruned | avg $\mathtt{OR}$ | - | 0.546 | 31.998 | - |
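
| metric | $\mathcal{K}_{gm}$ with $\text{SDS}_{2d}$ | $\mathcal{K}_{gm}$ with $\text{SDS}_{cm}$ | $\mathcal{K}_{cd}$ order-1 | $\mathcal{K}_{cd}$ order-2 | $\mathcal{K}_{ds}$ with $\text{SDS}_{2d}$ | $\mathcal{K}_{ds}$ with $\text{SDS}_{cm}$ | $\mathcal{K}_{pb}$ with $\text{SDS}_{2d}$ | $\mathcal{K}_{pb}$ with $\text{SDS}_{cm}$ |
|---|---|---|---|---|---|---|---|---|
| acc | 0.829 | 0.836 | 0.817 | 0.827 | 0.827 | 0.825 | 0.763 | 0.765 |
| pre | 0.889 | 0.892 | 0.879 | 0.878 | 0.893 | 0.865 | 0.810 | 0.770 |
| rec | 0.752 | 0.765 | 0.735 | 0.759 | 0.744 | 0.770 | 0.689 | 0.756 |
| F1 | 0.815 | 0.823 | 0.801 | 0.814 | 0.812 | 0.815 | 0.744 | 0.763 |
| AUC | 0.898 | 0.912 | 0.907 | 0.909 | 0.900 | 0.900 | 0.843 | 0.853 |
| | 13.3 | 16.6 | 24.3 | 89.8 | 30.7 | 18.6 | 15.6 | 0.10 |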
| N | prd | frq | $\mathtt{OR}$ | combinations |
|---|---|---|---|---|
| 1 | 2.696 | 26 | - | atorvastatin fenofibrate rosiglitazone simvastatin |
| 2 | 2.507 | 26 | - | allopurinol amlodipine atorvastatin levothyroxine naproxen omeprazole simvastatin |
| 3 | 1.878 | 22 | - | acetylsalicylicacid atorvastatin bisoprolol clopidogrel ramipril simvastatin |
| 4 | 1.855 | 27 | - | acetylsalicylicacid atenolol atorvastatin furosemide lansoprazole lisinopril nitroglycerin |
| 5 | 1.785 | 21 | - | citalopram clozapine isosorbidemononitrate prochlorperazine simvastatin zopiclone |
| 6 | 1.750 | - | 0.842 | amlodipine bisoprolol pravastatin ramipril simvastatin spironolactone warfarin |
| 7 | 1.696 | 22 | - | amlodipine clopidogrel ibuprofen omeprazole ramipril simvastatin |
| 8 | 1.669 | 29 | - | bisoprolol flecainide ramipril simvastatin |
| 9 | 1.613 | 35 | - | aripiprazole atorvastatin bendroflumethiazide clozapine diazepam folicacid furosemide iron lactulose lansoprazole perindopril ramipril trimethoprim zopiclone |
| 10 | 1.549 | - | 0.875 | lansoprazole omeprazole pantoprazole rabeprazole |
| N | prd | frq | $\mathtt{OR}$ | combinations |
|---|---|---|---|---|
| 1 | 4.167 | 3 | - | atorvastatin lansoprazole pravastatin rosuvastatin simvastatin |
| 2 | 4.009 | - | 11.372 | atorvastatin pravastatin rosuvastatin simvastatin |
| 3 | 3.776 | - | 50.043 | atorvastatin fenofibrate metformin pravastatin rosuvastatin simvastatin |
| 4 | 3.734 | - | 68.232 | atorvastatin metformin pravastatin rosuvastatin simvastatin |
| 5 | 3.676 | - | 45.487 | atorvastatin lovastatin rosuvastatin simvastatin |
| 6 | 3.618 | - | 136.470 | atorvastatin pravastatin rosuvastatin simvastatin tadalafil |
| 7 | 3.573 | 9 | - | atorvastatin fenofibrate pravastatin simvastatin |
| 8 | 3.552 | 10 | - | atorvastatin ezetimibe fenofibrate rosuvastatin |
| 9 | 3.519 | - | 22.746 | atorvastatin ezetimibe rosuvastatin simvastatin |
| 10 | 3.461 | 11 | - | atorvastatin lansoprazole pravastatin simvastatin |
| N | prd | frq | $\mathtt{OR}$ | combinations |
|---|---|---|---|---|
| 1 | 2.083 | 4 | - | calcium clonazepam colestipol prednisone teriparatide |
| 2 | 1.992 | - | 45.487 | alendronate anastrozole desloratadine hydrochlorothiazide lisinopril triamterene valdecoxib vitaminc |
| 3 | 1.968 | - | 17.058 | alendronate raloxifene risedronate teriparatide |
| 4 | 1.960 | - | 90.978 | alendronate amlodipine atenolol clonazepam raloxifene teriparatide |
| 5 | 1.901 | - | 45.489 | alendronate fexofenadine hydrochlorothiazide omeprazole prednisone risedronate triamterene |
| 6 | 1.850 | 5 | - | alendronate fexofenadine levothyroxine nabumetone oxybutynin |
| 7 | 1.849 | - | 113.720 | alendronate calcium esomeprazole ibandronate levothyroxine rabeprazole |
| 8 | 1.843 | - | 7.581 | alendronate calciumgluconate teriparatide |
| 9 | 1.838 | - | 22.744 | alendronate calcium levothyroxine raloxifene teriparatide |
| 10 | 1.834 | 4 | - | calcium escitalopram iron ketorolac raloxifene teriparatide |
Department of Computer & Information Science, Indiana University - Purdue University Indianapolis, Indianapolis, 46202, USA.
Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, 19104, USA.
Department of Biomedical Informatics, Ohio State University, Columbus, 43210, USA.
Department of Computer & Information Science, Indiana University - Purdue University Indianapolis, Indianapolis, 46202, USA.
Abstract
Background: The problem of predicting whether a drug combination of arbitrary orders is likely to induce adverse drug reactions is considered in this manuscript. Methods: Novel kernels over drug combinations of arbitrary orders are developed within support vector machines for the prediction. Graph matching methods are used in the novel kernels to measure the similarities among drug combinations, in which drug co-medication patterns are leveraged to measure single drug similarities. Results: The experimental results on a real-world dataset demonstrated that the new kernels achieve an area under the curve (AUC) value 0.912 for the prediction problem. Conclusions: The new methods with drug co-medication based single drug similarities can accurately predict whether a drug combination is likely to induce adverse drug reactions of interest.
Keywords: drug-drug interaction prediction, drug combination similarity, co-medication, graph matching
Introduction
Drug-Drug Interactions (DDIs) and the associated Adverse Drug Reactions (ADRs) represent a consistent detriment to public health in the United States. DDIs have accounted for approximately 26% of the ADRs, occurred among 50% of the hospitalized patients [1], and caused nearly 74,000 emergency room visits and 195,000 hospitalizations annually in the US [2]. In addition, because co-medication is common among elderly Americans, particularly co-medication of more than two drugs, high-order drug-drug interactions and their associated ADRs pose significant scientific and public health challenges. The National Health and Nutrition Examination Survey [3] reports that more than 76% of elderly Americans take two or more drugs every day. Another study [4] estimates that about 29.4% of elderly American patients take six or more drugs every day. However, the mechanisms of most such high-order DDIs are unknown.
In this manuscript, novel approaches to predicting whether high-order drug combinations are likely to induce ADRs are presented. The prediction problem is formulated as a binary classification problem, and support vector machines (SVMs) are used for the prediction. Novel kernels over drug combinations of arbitrary orders are developed within the framework of SVMs. These kernels are constructed using drug co-medication information to measure single drug similarities and graph matching on drug combination graphs to measure drug combination similarities. A comparison of the new kernels with other convolutional kernels and probabilistic kernels on drug combinations is also conducted. The experimental results demonstrate that the new kernels outperform the others and can accurately predict whether a drug combination is likely to induce ADRs of interest, with an AUC value of 0.912. To the best of our knowledge, this manuscript represents the first effort in predicting DDIs for drug combinations of arbitrary orders.
Background
Drug-drug interactions
Significant research efforts have been dedicated to detecting pairwise drug-drug interactions (DDIs) [5, 6] in recent years. Existing methods either extract DDI pairs mentioned in medical literature or Electronic Health Records (EHRs) [4], or predict/score DDI pairs from various drug/target information [7]. While most of the existing studies are focused on interactions between a pair of drugs (i.e., order-2 DDIs), understanding high-order DDIs and their associated ADRs has attracted increasing attention recently [2, 8]. These emerging methods for high-order DDI studies are largely focused on how to discover high-order DDIs efficiently through mining frequent itemsets (i.e., drug combinations) from EHRs. Most recent work also includes pattern discovery from directional high-order DDIs [9] and directional high-order DDI prediction [10].
Graph matching
Graph matching aims to find the optimal vertex correspondence between two graphs [11, 12]. Graph matching problems can be broadly classified into two categories. The first category is exact graph matching, which finds graph and subgraph isomorphisms so that the mapping of vertices between two graphs is bijective and edge-preserving (i.e., vertices connected by an edge in one graph are mapped to vertices in the other graph that are also connected by an edge). The second category is inexact graph matching, which allows errors (e.g., different types of matched vertices in attributed graphs) during matching, and thus seeks the matching that minimizes the total error. Typical algorithms for graph matching include spectral methods [13], probabilistic methods [14], tree search [15], etc.
Definitions and notations
We use $\mathtt{d}$ to represent a drug, and $D^{k}$ to represent a combination of drugs, where $k$ is the number of unique drugs in $D^{k}$ (i.e., $k = |D^{k}|$) and thus the order of $D^{k}$. A drug combination $D^{k}$ is defined when the drugs and only the drugs in $D^{k}$ are taken simultaneously. There are no orderings among the drugs in a drug combination. When no ambiguity is raised, we drop the superscript $k$ in $D^{k}$ and represent a drug combination as $D$. An event is referred to as a patient taking a drug combination. In addition, in this manuscript, all vectors (e.g., $\boldsymbol{c}$) are represented by bold lower-case letters and all matrices (e.g., $C$) are represented by upper-case letters. Row vectors are represented by having the transpose superscript T, otherwise by default they are column vectors. Table 1 summarizes the important notations in the manuscript.
Methods
We formulate the problem of predicting whether high-order drug combinations induce a particular ADR as a binary classification problem, and solve it within the framework of kernel methods and support vector machines (SVMs). In this manuscript, we consider myopathy as the ADR in particular. The central concept of SVM-based classification methods is that “similar” instances are likely to share similar labels, and thus the key is to capture and measure the “similarities” among instances (i.e., drug combinations in our prediction problem) via kernels. In the case of drug combinations, we hypothesize that if two drug combinations share similar pharmaceutical, pharmacokinetic and/or pharmacodynamic properties, they may induce similar ADRs. Therefore, the question boils down to effectively representing and measuring the similarities in terms of such properties. To this end, we develop various kernels over drug combinations. A key property of such kernels, as will be discussed later, is that they are able to deal with drug combinations of arbitrary orders. These kernels are constructed using single drug similarities, which incorporate various drug information that could relate to ADRs. We organize the discussion of these kernels into three parts: 1) single drug similarities (SDS) in Section Single drug similarities, 2) our new kernel based on matching similar drugs in drug combination graphs in Section Drug combination kernels from graph matching, and 3) other convolutional kernels [16] in Section Convolutional drug-combination kernels. Given these kernels, we employ the freely available SVM-Light software [17] to build the binary classifiers and conduct our experiments.
Single drug similarities
We use two different approaches to measuring single drug similarities (SDS). The first approach measures single drug similarities based on their intrinsic properties that can be represented by their 2D structures [18]. The second approach measures the similarities in a more data-driven fashion based on the co-occurrence patterns among drugs.
SDS from drug 2D structures
A straightforward way to measure the SDS between two drugs is to look at their structures, which ultimately determine their physicochemical properties. We use Extended Connectivity Fingerprints (ECFP) [19] of length 2,048 to represent drug 2D structures. Each of the fingerprint dimensions corresponds to a substructure among the drugs of interest. The binary values in the fingerprints represent whether a drug has the corresponding substructure or not. We use a vector $\boldsymbol{f}_{i}$ to represent the fingerprint for drug $\mathtt{d}_{i}$. The SDS between two drugs from their 2D structures, denoted as $\text{SDS}_{2d}$, is calculated as the Tanimoto coefficient between their ECFP fingerprints [20]. The Tanimoto coefficient between two sets $A$ and $B$ is defined as follows,
$$\text{Tanimoto}(A, B) = \frac{|A \cap B|}{|A \cup B|}, \tag{1}$$
where $|A|$ is the cardinality of set $A$. Thus, $\text{SDS}_{2d}$ is defined as
$$\text{SDS}_{2d}(\mathtt{d}_{i}, \mathtt{d}_{j}) = \text{Tanimoto}(\mathcal{F}_{i}, \mathcal{F}_{j}), \tag{2}$$
where $\mathcal{F}_{i}$ represents the set of substructures that $\mathtt{d}_{i}$ has in its fingerprint $\boldsymbol{f}_{i}$.
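The Tanimoto-based similarity above can be illustrated with a minimal Python sketch. The 8-bit fingerprints are made-up toy values (the manuscript uses 2,048-bit ECFP vectors), the helper names are hypothetical, and returning 0 for two empty sets is a convention chosen here, not something the manuscript specifies.

```python
def tanimoto(a, b):
    """Tanimoto coefficient |A & B| / |A | B| between two sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty sets
    return len(a & b) / len(union)


def fingerprint_to_set(bits):
    """Indices of the substructures that are present (bit == 1)."""
    return {i for i, bit in enumerate(bits) if bit}


# Toy binary fingerprints for two hypothetical drugs.
fp_i = [1, 1, 0, 0, 1, 0, 1, 0]
fp_j = [1, 0, 0, 0, 1, 1, 1, 0]

# Three shared substructures out of five distinct ones.
sds_2d = tanimoto(fingerprint_to_set(fp_i), fingerprint_to_set(fp_j))
```

Converting the bit vector to a set of indices keeps the code aligned with the set-based definition of the coefficient.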
SDS based on co-medications
We develop a new approach to measuring the SDS between two drugs by looking at whether they are often involved in co-medications with similar other drugs, respectively. The hypothesis is that drugs that are respectively taken together with other similar drugs may share similar therapeutic purposes and target similar therapeutic targets, and thus behave similarly in inducing ADRs. Such data-driven co-medication based SDS have a potential advantage over $\text{SDS}_{2d}$ in that they directly leverage signals from ADR information that may not be captured or explained by drug 2D structures or other features on individual drugs. Such co-medication based SDS is denoted as $\text{SDS}_{cm}$.
We use two vectors $\boldsymbol{c}^{+}_{i} \in \mathbb{R}^{n}$ and $\boldsymbol{c}^{-}_{i} \in \mathbb{R}^{n}$ ($n$ is the total number of drugs) to represent the co-medication information for drug $\mathtt{d}_{i}$. The $j$-th dimension ($j = 1, \cdots, n$) in $\boldsymbol{c}^{+}_{i}$/$\boldsymbol{c}^{-}_{i}$ corresponds to drug $\mathtt{d}_{j}$, and the value on the $j$-th dimension in $\boldsymbol{c}^{+}_{i}$/$\boldsymbol{c}^{-}_{i}$ is the co-medication frequency of $\mathtt{d}_{i}$ and $\mathtt{d}_{j}$ in all the events with/without the ADR. Both $\boldsymbol{c}^{+}_{i}$ and $\boldsymbol{c}^{-}_{i}$ values are then normalized into probabilities. The normalized $\boldsymbol{c}^{+}_{i}$ and $\boldsymbol{c}^{-}_{i}$ are further concatenated into one vector $\boldsymbol{c}_{i}$, that is, $\boldsymbol{c}_{i} = [\boldsymbol{c}^{+}_{i}; \boldsymbol{c}^{-}_{i}]$, for $\mathtt{d}_{i}$. The $\text{SDS}_{cm}$ between drug $\mathtt{d}_{i}$ and $\mathtt{d}_{j}$ is calculated as the cosine similarity between $\boldsymbol{c}_{i}$ and $\boldsymbol{c}_{j}$. The reason why we use $\boldsymbol{c}^{+}_{i}$ and $\boldsymbol{c}^{-}_{i}$ to construct $\boldsymbol{c}_{i}$ instead of co-medication frequencies from all events with and without the ADR together is that the co-medication patterns from the two types of events can be very different, and thus one unified co-medication vector for both of them would not necessarily capture discriminative information among drugs.
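A small Python sketch of the co-medication profile and its cosine similarity follows. It assumes that "normalized into probabilities" means dividing each frequency vector by its sum, which the text does not spell out; the event list, drug names, and function names are hypothetical toy values.

```python
import math


def comed_vector(drug, events, drugs):
    """Concatenated, normalized co-medication profile [c+; c-] of one drug.

    events: iterable of (set_of_drugs, has_adr) pairs.
    drugs:  ordered list of all drugs, fixing the vector dimensions.
    """
    index = {d: j for j, d in enumerate(drugs)}
    c_pos = [0.0] * len(drugs)  # co-medication counts in events with the ADR
    c_neg = [0.0] * len(drugs)  # co-medication counts in events without it
    for combo, has_adr in events:
        if drug not in combo:
            continue
        target = c_pos if has_adr else c_neg
        for other in combo:
            if other != drug:
                target[index[other]] += 1.0

    def normalize(v):
        total = sum(v)
        return [x / total for x in v] if total > 0 else v

    return normalize(c_pos) + normalize(c_neg)


def cosine(u, v):
    """Cosine similarity; 0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


drugs = ["a", "b", "c", "d"]
events = [
    ({"a", "c"}, True),
    ({"b", "c"}, True),
    ({"a", "d"}, False),
    ({"b", "d"}, False),
    ({"a", "b"}, False),
]
sds_cm_ab = cosine(comed_vector("a", events, drugs),
                   comed_vector("b", events, drugs))
```

In this toy data, drugs "a" and "b" never share an ADR event but are both co-prescribed with "c" in ADR events and with "d" in non-ADR events, so their profiles are similar even though they rarely co-occur.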
Drug combination kernels from graph matching
We formulate the problem of comparing drug combination similarities through matching drug combination graphs, and develop a graph-matching based kernel for drug combination similarities. Specifically, for a drug combination $D_{p} = \{\mathtt{d}_{p1}, \mathtt{d}_{p2}, \cdots, \mathtt{d}_{pk_{p}}\}$, we construct a complete graph $\mathcal{G}_{p}$ of $k_{p}$ nodes, in which each node represents a drug in $D_{p}$, and all the nodes are connected to one another. Thus, the similarity between drug combination $D_{p}$ and $D_{q}$ can be measured based on how $\mathcal{G}_{p}$ and $\mathcal{G}_{q}$ match to each other. In matching such graphs, we consider SDS so that drugs that are similar to each other should be matched, and the graph matching procedure should maximize the overall SDS from matched drugs. The underlying assumption is that if two drug combinations share similar drugs, they could have similar ADRs. Figure 1 illustrates the idea of complete graph matching for two drug combinations, in which the drugs connected by dashed lines are matched between $\mathcal{G}_{p}$ and $\mathcal{G}_{q}$. The similarity calculated from graph matching over two drug combinations, denoted as $S_{gm}$, will be the sum of SDS from matched drugs. $S_{gm}$ will be further converted to a valid kernel, denoted as $\mathcal{K}_{gm}$.
Graph matching algorithm for $\mathcal{K}_{gm}$
The drug combination graph matching problem can be solved as a well-known linear sum assignment problem (LSAP) [21]. The objective is to minimize the total cost of matching vertices in two graphs, and thus to find the graph matching with minimal total cost. In the case of high-order drug combinations, we define the cost of matching two drugs $\mathtt{d}_{i}$ and $\mathtt{d}_{j}$ as the dis-similarity between the drugs, that is,
$$cost(\mathtt{d}_{i}, \mathtt{d}_{j}) = 1 - \text{SDS}(\mathtt{d}_{i}, \mathtt{d}_{j}), \tag{3}$$
where $cost(\mathtt{d}_{i}, \mathtt{d}_{j})$ is the cost between $\mathtt{d}_{i}$ and $\mathtt{d}_{j}$, and SDS can be either $\text{SDS}_{2d}$ or $\text{SDS}_{cm}$. Thus, if two drugs are very similar (i.e., large SDS), the cost of matching them will be small and therefore they are more likely to be matched.
Therefore, the graph matching can be solved by solving the following LSAP problem:
$$
\begin{aligned}
\min_{X}\quad & \operatorname{tr}\left(C(\mathcal{G}_{p}, \mathcal{G}_{q})^{\mathsf{T}} X\right) \\
\text{subject to}\quad & X \in \{0, 1\}^{k_{p} \times k_{q}}, \\
& X\mathbf{1} \le \mathbf{1}, \quad X^{\mathsf{T}}\mathbf{1} \le \mathbf{1}, \\
& \mathbf{1}^{\mathsf{T}} X \mathbf{1} = \min(k_{p}, k_{q}),
\end{aligned}
\tag{4}
$$
where $\operatorname{tr}(\cdot)$ is the trace of a matrix; $k_{p}$ and $k_{q}$ are the number of vertices in $\mathcal{G}_{p}$ and $\mathcal{G}_{q}$ (and thus the order of $D_{p}$ and $D_{q}$), respectively; and $C(\mathcal{G}_{p}, \mathcal{G}_{q}) \in \mathbb{R}^{k_{p} \times k_{q}}$ is the pairwise drug-matching cost matrix for two drug combinations $D_{p}$ and $D_{q}$ ($C(i, j) = cost(\mathtt{d}_{pi}, \mathtt{d}_{qj})$, $\mathtt{d}_{pi} \in D_{p}$, $\mathtt{d}_{qj} \in D_{q}$). In Problem 4, $X$ is the assignment matrix to match $\mathcal{G}_{p}$ and $\mathcal{G}_{q}$ (i.e., to assign a vertex in $\mathcal{G}_{p}$ to a vertex in $\mathcal{G}_{q}$), in which all the values are either 0 or 1, both the row sums and the column sums are either 0 or 1 (i.e., a vertex is either matched or not; if it is matched, it is matched to only one vertex in the other graph), and the sum of all the values is exactly the minimum of $k_{p}$ and $k_{q}$ (i.e., the vertices in the smaller graph have to be all matched). Essentially, $X$ assigns each of the vertices in the smaller graph of $\mathcal{G}_{p}$ and $\mathcal{G}_{q}$ to exactly one vertex in the larger graph. The optimization problem in 4 can be solved by the Hungarian algorithm [22]. The drug-combination similarity $S_{gm}$ is then calculated as
$$S_{gm}(D_{p}, D_{q}) = \operatorname{tr}\left(\left(E - C(\mathcal{G}_{p}, \mathcal{G}_{q})\right)^{\mathsf{T}} X^{*}\right), \tag{5}$$
where $X^{*}$ is the optimal assignment from Problem 4 and $E \in \mathbb{R}^{k_{p} \times k_{q}}$ is a matrix of all 1’s.
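To make the matching step concrete, here is a hedged Python sketch that solves the assignment problem by brute-force enumeration rather than by the Hungarian algorithm the manuscript uses. Enumeration is only practical for toy sizes; at scale one would likely substitute a Hungarian-style solver such as SciPy's `scipy.optimize.linear_sum_assignment`. The similarity values and function names are hypothetical.

```python
from itertools import permutations


def graph_matching_similarity(combo_p, combo_q, sds):
    """Return (similarity, matching) for two drug combinations.

    Each drug of the smaller combination is assigned to a distinct drug of
    the larger one so that the total cost (cost = 1 - SDS) is minimal. The
    similarity is the sum of SDS over the matched pairs.
    """
    small, large = sorted((sorted(combo_p), sorted(combo_q)), key=len)
    best_cost, best_match = float("inf"), []
    # Brute force over all injective assignments small -> large.
    for chosen in permutations(large, len(small)):
        cost = sum(1.0 - sds(s, l) for s, l in zip(small, chosen))
        if cost < best_cost:
            best_cost, best_match = cost, list(zip(small, chosen))
    similarity = sum(sds(s, l) for s, l in best_match)
    return similarity, best_match


# Hypothetical single drug similarities (not taken from the manuscript).
TOY_SDS = {
    frozenset(("atorvastatin", "simvastatin")): 0.9,
    frozenset(("atorvastatin", "warfarin")): 0.1,
    frozenset(("simvastatin", "warfarin")): 0.2,
}


def toy_sds(a, b):
    """Symmetric lookup; identical drugs have similarity 1, unknown pairs 0."""
    return 1.0 if a == b else TOY_SDS.get(frozenset((a, b)), 0.0)


p = {"atorvastatin", "warfarin"}
q = {"omeprazole", "simvastatin", "warfarin"}
similarity, matching = graph_matching_similarity(p, q, toy_sds)
```

In this example the two statins are matched to each other and warfarin is matched to itself, while omeprazole in the larger combination stays unmatched, as the constraints of the assignment problem allow.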
The drug-combination similarity matrix $S_{gm}$ is always symmetric but not necessarily positive semi-definite, and thus not always a valid kernel. To convert $S_{gm}$ to a valid kernel $\mathcal{K}_{gm}$, we follow the approach in Saigo et al. [23]. Specifically, we first conduct an eigenvalue decomposition on $S_{gm}$, subtract from the diagonal of the eigenvalue matrix its smallest negative eigenvalue, and reconstruct the matrix from the altered decomposition. The resulting matrix is positive semi-definite and is used as $\mathcal{K}_{gm}$.
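The eigenvalue shift described above can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: it uses NumPy (not mentioned in the manuscript), the function name is hypothetical, and the 3-by-3 similarity matrix is a made-up indefinite example.

```python
import numpy as np


def shift_to_psd(similarity):
    """Make a symmetric similarity matrix positive semi-definite.

    The smallest eigenvalue, if negative, is subtracted from every
    eigenvalue before the matrix is rebuilt from its eigenvectors.
    """
    sim = np.asarray(similarity, dtype=float)
    sym = (sim + sim.T) / 2.0  # guard against tiny numerical asymmetries
    eigvals, eigvecs = np.linalg.eigh(sym)
    lam_min = eigvals.min()
    if lam_min < 0:
        eigvals = eigvals - lam_min  # smallest eigenvalue becomes 0
    # V diag(eigvals) V^T
    return (eigvecs * eigvals) @ eigvecs.T


# Toy symmetric matrix with one negative eigenvalue.
toy_similarity = np.array([
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.9],
    [0.1, 0.9, 1.0],
])
kernel = shift_to_psd(toy_similarity)
```

Because the same constant is subtracted from every eigenvalue, the result equals the original matrix plus a multiple of the identity, so only the diagonal changes and the pairwise similarities are preserved.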
Convolutional drug-combination kernels
Drug combination kernels from common drugs
We define a drug-combination kernel, denoted as $\mathcal{K}_{cd}$, based on common drugs among drug combinations. $\mathcal{K}_{cd}$ is calculated as the Tanimoto coefficient over the sets of drugs in the drug combinations, that is,
$$\mathcal{K}_{cd}(D_{p}, D_{q}) = \text{Tanimoto}(D_{p}, D_{q}), \tag{6}$$
where $\text{Tanimoto}(\cdot, \cdot)$ is defined as in Equation 1. It has been proved that the Tanimoto coefficient is a valid kernel function [24]. $\mathcal{K}_{cd}$ essentially measures the proportion of shared common drugs among two drug combinations. The underlying assumption is that if two drug combinations share many common drugs, they are likely to have similar properties.
To further enhance the similarity between two drug combinations from their common drugs, we also define an order-2 $\mathcal{K}_{cd}$ of drug combinations, denoted as $\mathcal{K}^{2}_{cd}$ ($\mathcal{K}_{cd}$ in Equation 6 is correspondingly referred to as order-1 $\mathcal{K}_{cd}$ and denoted as $\mathcal{K}^{1}_{cd}$). We first represent a drug combination $D = \{\mathtt{d}_{1}, \mathtt{d}_{2}, \cdots, \mathtt{d}_{k}\}$ by all its single drugs and drug pairs, denoted as $D^{(2)} = \{\mathtt{d}_{1}, \mathtt{d}_{2}, \cdots, \mathtt{d}_{k}, (\mathtt{d}_{1}, \mathtt{d}_{2}), (\mathtt{d}_{1}, \mathtt{d}_{3}), \cdots, (\mathtt{d}_{k-1}, \mathtt{d}_{k})\}$. Thus, $\mathcal{K}^{2}_{cd}$ on two drug combinations $D_{p}$ and $D_{q}$ can be calculated as the Tanimoto coefficient on $D_{p}^{(2)}$ and $D_{q}^{(2)}$, that is,
$$\mathcal{K}^{2}_{cd}(D_{p}, D_{q}) = \text{Tanimoto}(D_{p}^{(2)}, D_{q}^{(2)}). \tag{7}$$
Intuitively, $\mathcal{K}^{2}_{cd}$ better differentiates drug combinations with many shared drugs from those with fewer shared drugs than $\mathcal{K}^{1}_{cd}$. We only extend $\mathcal{K}_{cd}$ to order 2 since higher-order extension does not lead to better performance according to our experimental results. According to Equation 6, when the order $k$ becomes much higher, Tanimoto($D_{p}^{(k)}$, $D_{q}^{(k)}$) may become very small due to a rapid combinatorial growth in the denominator and the insufficient common drug $k$-tuples (i.e., the number in the numerator). Thus, $\mathcal{K}_{cd}$ with extension to much higher order may lose the ability to differentiate drug combinations that contain more common drugs.
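The order-1 and order-2 common-drug kernels can be compared on a toy example. The sketch below is illustrative only; the drug names and helper names are hypothetical, and drug pairs are stored as sorted tuples so that the same pair always compares equal.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient between two sets (0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def with_pairs(combo):
    """D^(2): all single drugs plus all unordered drug pairs."""
    drugs = sorted(combo)
    return set(drugs) | set(combinations(drugs, 2))


def k_cd(combo_p, combo_q, order=1):
    """Common-drug kernel of order 1 (drugs) or order 2 (drugs and pairs)."""
    if order == 1:
        return tanimoto(set(combo_p), set(combo_q))
    if order == 2:
        return tanimoto(with_pairs(combo_p), with_pairs(combo_q))
    raise ValueError("only orders 1 and 2 are sketched here")


p = {"a", "b", "c"}
q = {"a", "b", "d"}  # shares two drugs with p
r = {"a", "d", "e"}  # shares one drug with p

ratio_order1 = k_cd(p, q, 1) / k_cd(p, r, 1)
ratio_order2 = k_cd(p, q, 2) / k_cd(p, r, 2)
```

Here the order-1 kernel values are 1/2 and 1/5, while the order-2 values are 1/3 and 1/11, so the gap between the "two shared drugs" and "one shared drug" cases widens at order 2, which is the differentiation effect described above.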
Drug combination kernels from drug similarities
The drug combination similarities can also be measured by the average drug similarities. The hypothesis is that if two drug combinations have drugs that are similar on average, they may share similar properties. Therefore, we define an average-drug-similarity based kernel for drug combinations, denoted as $\mathcal{K}_{ds}$, as follows,
$$\mathcal{K}_{ds}(D_{p}, D_{q}) = \frac{1}{k_{p} k_{q}} \sum_{\mathtt{d}_{i} \in D_{p}} \sum_{\mathtt{d}_{j} \in D_{q}} \text{SDS}(\mathtt{d}_{i}, \mathtt{d}_{j}), \tag{8}$$
where $k_{p}$ and $k_{q}$ are the order of $D_{p}$ and $D_{q}$, respectively, and SDS can be $\text{SDS}_{2d}$ or $\text{SDS}_{cm}$. Intuitively, $\mathcal{K}_{ds}$ tends to capture averaged and smoothed drug combination similarities. It has been proved that as long as the involved SDS are valid kernels (i.e., positive semi-definite), $\mathcal{K}_{ds}$ will also be a valid kernel [16].
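A minimal Python sketch of the average-similarity kernel follows. The pairwise similarity values and the names used are hypothetical, and returning 0 for an empty combination is a convention chosen here.

```python
def k_ds(combo_p, combo_q, sds):
    """Average single drug similarity over all cross pairs of two combinations."""
    if not combo_p or not combo_q:
        return 0.0  # assumed convention for empty combinations
    total = sum(sds(a, b) for a in combo_p for b in combo_q)
    return total / (len(combo_p) * len(combo_q))


# Hypothetical single drug similarities (not taken from the manuscript).
TOY_SDS = {
    frozenset(("atorvastatin", "simvastatin")): 0.9,
    frozenset(("atorvastatin", "warfarin")): 0.1,
    frozenset(("simvastatin", "warfarin")): 0.2,
}


def toy_sds(a, b):
    """Symmetric lookup; identical drugs have similarity 1, unknown pairs 0."""
    return 1.0 if a == b else TOY_SDS.get(frozenset((a, b)), 0.0)


p = {"atorvastatin", "warfarin"}
q = {"simvastatin", "warfarin"}

# (0.9 + 0.1 + 0.2 + 1.0) / 4
value = k_ds(p, q, toy_sds)
```

Unlike the graph-matching similarity, every cross pair contributes here, so one highly similar pair is diluted by the dissimilar pairs around it.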
Probabilistic drug combination kernels from drug sets
We apply an ensemble kernel for drug combinations based on the idea in [25]. The key idea is to use a reproducing kernel to characterize sample similarities (i.e., SDS), and to use a probabilistic distance in the reproducing kernel Hilbert space (RKHS) to measure the ensemble similarity. The resulting ensemble similarity matrix is a valid kernel matrix, denoted as $\mathcal{K}_{pb}$. Computing this ensemble kernel involves an eigenvalue decomposition, during which some similarity matrices may become numerically degenerate and cause the calculation to fail. To deal with this issue, we increase the diagonals of the involved square matrices by a small value to guarantee that they are positive semi-definite.
Materials
Mining drug combinations
We extract high-order drug combinations from the FDA Adverse Event Reporting System (FAERS) [26]. We use myopathy as the ADR of particular interest, and extract 64,892 case (myopathy) events, in which patients report myopathy after taking multiple drugs, and 1,475,840 control (non-myopathy) events, in which patients do not report myopathy after taking drugs. Each of these events involves a combination of more than one drug.
Among all the involved drug combinations, 10,250 unique drug combinations appear in both case and control events. For those 10,250 drug combinations, we use the Odds Ratio ($\mathtt{OR}$) to quantify their ADR risks. The $\mathtt{OR}$ for a drug combination $D$ is defined based on the contingency table (Table 2), that is, it is the ratio of the following two values: 1) the odds that the ADR occurs when $D$ is taken; and 2) the odds that the ADR occurs when $D$ is not taken (see Table 2 for both). $\mathtt{OR} < 1$ indicates a decreased risk of the ADR after a patient takes the drug combination, $\mathtt{OR} = 1$ indicates no risk change, and $\mathtt{OR} > 1$ indicates an increased risk. In the 10,250 drug combinations, 8,986 combinations have $\mathtt{OR} > 1$ and 1,264 combinations have $\mathtt{OR} < 1$. These two sets of drug combinations are denoted as $\mathcal{M}^{0}$ and $\mathcal{N}^{0}$, respectively. In addition to these combinations, there are 27,387 unique drug combinations that only appear in case events and 621,449 unique drug combinations that only appear in control events. These two sets are denoted as $\mathcal{M}^{+}$ and $\mathcal{N}^{-}$, respectively. The set of drug combinations in case events is denoted as $\mathcal{M}$ (i.e., $\mathcal{M} = \mathcal{M}^{+} \cup \mathcal{M}^{0}$), and the set of drug combinations in control events is denoted as $\mathcal{N}$ (i.e., $\mathcal{N} = \mathcal{N}^{-} \cup \mathcal{N}^{0}$). All these four sets together define a high-order drug combination dataset from FAERS, referred to below as the full dataset. Table 3 presents the statistics of the full dataset.
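The odds ratio computation can be sketched in a few lines of Python. The cell names `a`, `b`, `c`, `d` are hypothetical labels for the four counts of a 2-by-2 contingency table (they are not the manuscript's notation), and the counts are toy numbers.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of an ADR for a drug combination D.

    a: events where D is taken and the ADR occurs
    b: events where D is taken and the ADR does not occur
    c: events where D is not taken and the ADR occurs
    d: events where D is not taken and the ADR does not occur
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four counts must be positive")
    odds_with = a / b     # odds of the ADR when D is taken
    odds_without = c / d  # odds of the ADR when D is not taken
    return odds_with / odds_without


# Toy counts: odds 10:5 with D versus 20:40 without D.
toy_or = odds_ratio(10, 5, 20, 40)
```

Requiring positive counts mirrors the restriction in the text that the odds ratio is only used for combinations appearing in both case and control events.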
Training data generation
As shown in Table 3, $\mathcal{M}^{+}$ and $\mathcal{M}^{0}$ of the full dataset have fewer drug combinations than $\mathcal{N}^{-}$ and $\mathcal{N}^{0}$, and the drug combinations in $\mathcal{M}^{+}$ are very infrequent (average frequency 1.402). To use more frequent and more confident drug combinations from case events, we further pruned drug combinations from $\mathcal{M}^{+}$ and $\mathcal{M}^{0}$ as follows. From $\mathcal{M}^{+}$, we retained the top 1,000 most frequent drug combinations. For $\mathcal{M}^{0}$, we applied right-tailed Fisher’s exact test on the drug combinations to test the significance of their $\mathtt{OR}$s at the 5% significance level. Then we retained drug combinations with statistically significant $\mathtt{OR}$s. Thus, the pruned $\mathcal{M}^{+}$ and $\mathcal{M}^{0}$ contain statistically confident drug combinations, which are very likely to induce myopathy, and therefore, these drug combinations are labeled as positive instances for classification model learning.
We retained all 1,264 drug combinations in $\mathcal{N}^{0}$ because this set is not large and contains informative drug combinations that may or may not induce myopathy. We further pruned $\mathcal{N}^{-}$ and retained its most frequent drug combinations. The drug combinations from $\mathcal{N}^{0}$ and the pruned $\mathcal{N}^{-}$ are labeled as negative instances. To balance the positive and negative training sets, we retained 2,200 drug combinations from $\mathcal{N}^{-}$. The resulting dataset is referred to below as the pruned dataset, and Table 3 presents its description. The pruned dataset is the set of labeled drug combinations that are used for model learning. In the pruned dataset, there are in total 1,210 drugs involved. 71 out of these 1,210 drugs induce myopathy on their own based on the Side Effect Resource (SIDER) [27]. These 71 drugs form the set of drugs known to induce myopathy individually.
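The right-tailed Fisher's exact test used for pruning can be sketched with the hypergeometric tail below. The cell names are hypothetical labels for a 2-by-2 contingency table, the counts are toy numbers, and exact binomial coefficients are only practical for small tables. For FAERS-scale counts a library routine such as `scipy.stats.fisher_exact(table, alternative="greater")` would be a more realistic choice; the manuscript does not say which implementation was used.

```python
from math import comb


def fisher_right_tail(a, b, c, d):
    """Right-tailed p-value of Fisher's exact test for a 2x2 table.

    a: D taken, ADR        b: D taken, no ADR
    c: D not taken, ADR    d: D not taken, no ADR
    Returns the probability of observing a or more ADR events among the
    events with D, given the fixed row and column totals.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    upper = min(row1, col1)
    tail = sum(comb(row1, x) * comb(row2, col1 - x)
               for x in range(a, upper + 1))
    return tail / total


p_weak = fisher_right_tail(3, 1, 1, 3)    # 17/70, not significant at 5%
p_strong = fisher_right_tail(4, 0, 0, 4)  # 1/70, significant at 5%
keep_combination = p_strong < 0.05
```

A combination would be retained only when its p-value falls below the 5% level, which is the pruning rule described for the combinations with odds ratio above 1.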
Evaluation protocol and metrics
The performance of the different methods is evaluated through five-fold cross validation. The dataset is randomly split into five folds of equal size (i.e., same number of drug combinations). Four folds are used for model training and the remaining fold is used for testing. This process is performed five times, with a different fold used for testing each time. The final result is the average over the five experiments.
We use accuracy, precision, recall, F1 and AUC to evaluate the performance of the methods. Accuracy is defined as the fraction of all correctly classified instances (i.e., true positives and true negatives) over all the instances in the testing set. Precision is defined as the fraction of correctly classified positive instances (i.e., true positives) over all instances that are classified as positive instances (i.e., true positives and false positives). Recall is the fraction of correctly classified positive instances (i.e., true positives) over all positive instances in the testing set (i.e., true positives and false negatives). F1 is the harmonic mean of precision and recall. AUC is the normalized area under the curve that plots the true positive rate against the false positive rate for different classification thresholds [28]. Larger accuracy, precision, recall, F1 and AUC values indicate better classification performance.
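The five metrics can be computed from labels and prediction scores as in the sketch below. It assumes labels of +1 and -1 and a decision threshold of 0, as is common for SVM outputs; the scores are toy values and the function names are hypothetical. AUC is computed through its pairwise-ranking interpretation, with ties counted as one half.

```python
def classification_metrics(y_true, scores, threshold=0.0):
    """Accuracy, precision, recall and F1 at a given score threshold."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s > threshold)
    fp = sum(1 for y, s in pairs if y != 1 and s > threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s <= threshold)
    tn = sum(1 for y, s in pairs if y != 1 and s <= threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y != 1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 1, -1, -1, -1]
scores = [2.0, 0.5, -0.3, 0.4, -1.0, -2.0]
metrics = classification_metrics(y_true, scores)
area = auc(y_true, scores)
```

With these toy scores, one positive and one negative instance fall on the wrong side of the threshold, so accuracy, precision, recall and F1 all equal 2/3, while 8 of the 9 positive-negative pairs are ranked correctly.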
Results
Overall performance
Table 4 presents the performance comparison among the four different kernels in combination with different single drug similarities on the pruned dataset. Kernel $\mathcal{K}_{gm}$ with $\text{SDS}_{cm}$ outperforms others in three (i.e., accuracy, F1 and AUC) out of five evaluation metrics. Specifically, in accuracy, $\mathcal{K}_{gm}$ with $\text{SDS}_{cm}$ outperforms the second best kernel $\mathcal{K}_{gm}$ with $\text{SDS}_{2d}$ by 0.84%. In F1, $\mathcal{K}_{gm}$ with $\text{SDS}_{cm}$ outperforms the second best kernels $\mathcal{K}_{gm}$ with $\text{SDS}_{2d}$ and $\mathcal{K}_{ds}$ with $\text{SDS}_{cm}$ by 0.98%. In AUC, $\mathcal{K}_{gm}$ with $\text{SDS}_{cm}$ outperforms the second best kernel, order-2 $\mathcal{K}_{cd}$, by 0.33%. In precision and recall, $\mathcal{K}_{gm}$ with $\text{SDS}_{cm}$ is the second best kernel, whereas $\mathcal{K}_{ds}$ with $\text{SDS}_{2d}$ and $\mathcal{K}_{ds}$ with $\text{SDS}_{cm}$, respectively, is the best one. Overall, $\mathcal{K}_{gm}$ with $\text{SDS}_{cm}$ has the best performance compared to other kernels. This indicates that it is effective to classify drug combinations by representing and comparing them as graphs (i.e., a set of drugs and their co-medication relation within the set), and measuring such graph similarities using their optimal matching (i.e., the optimal correspondence among drugs). In the following discussion, we use $\mathcal{K}_{gm}$ to represent $\mathcal{K}_{gm}$ with $\text{SDS}_{cm}$. More experimental results on other datasets are available in the supplementary materials (see Additional file 1).
SDS performance
Table 4 shows that $\text{SDS}_{cm}$ on average outperforms $\text{SDS}_{2d}$ across different kernels (with a few exceptions in precision for $\mathcal{K}_{ds}$ and $\mathcal{K}_{pb}$). $\text{SDS}_{2d}$ considers drug intrinsic 2D structures. However, drug efficacy and side effects are the results of many complicated interactions and processes among drugs and various bioentities, which may not be sufficiently explained only by drug 2D structures. Compared to $\text{SDS}_{2d}$, $\text{SDS}_{cm}$ measures drug similarity based on their co-medication patterns, which could be regarded as a high-level abstraction and representation of drug therapeutic properties that may or may not be explicitly explained by each drug and its intrinsic properties independently.
In $\mathcal{K}_{cd}$, the order-2 representation (i.e., $D^{(2)}$ in $\mathcal{K}^{2}_{cd}$) for drug combinations outperforms the order-1 representation (i.e., $D$ in $\mathcal{K}^{1}_{cd}$). In the order-2 representation, in addition to single drugs, drug pairs are also used as features for a drug combination, which stresses the signals in drug combinations. This also conforms to common observations in other applications [29], in which higher-order features improve classification performance.
Classification
Figures 2 and 3 present the prediction values with respect to drug combination orders. In Figure 2, drug combinations have higher orders (on average 7.615 as in Table 3), and higher and mostly positive prediction values, while drug combinations have lower orders (on average 2.678), and lower and mostly negative prediction values. Meanwhile, the mis-classification typically happens on drug combinations of higher orders, and on drug combinations of lower orders. Similar trends apply to and in Figure 3. This indicates that and together are able to learn and make predictions that correspond to drug combination orders. In addition, drug combination orders are correlated with their labels.
Figure 4 presents the prediction values with respect to drug combination frequencies for and . drug combinations have lower frequencies (on average 5.520 as in Table 3), and higher and mostly positive prediction values, while drug combinations have higher frequencies (on average 42.082), and lower and mostly negative prediction values. For , the mis-classification typically happens on lower-frequency drug combinations (the mis-classification for does not show strong patterns with respect to drug combination frequencies). As for and , drug combination frequencies are used to define labels. Figure 4 shows that together are able to learn and make predictions that correspond to drug combination frequencies and thus labels.
Figure 5 presents the prediction values with respect to values for and . drug combinations have higher values and also higher and mostly positive prediction values, while drug combinations have lower values and also lower and mostly negative prediction values. For , the mis-classification typically happens on drug combinations of higher values (close to 1 and thus leaning more toward ; the mis-classification for does not show strong patterns with respect to values). As we use to define labels on and , Figure 5 shows is able to make reasonably accurate predictions on the drug combinations.
drug enrichment
Table 5 presents the average percentage of drugs among all the drug combinations. For each drug combination, the percentage is calculated as the number of its drugs that can cause myopathy on their own (i.e., drugs in ) divided by the drug combination order. As Table 5 shows, top-10 mis-classified drug combinations (i.e., ) have almost twice as many drugs (30.7%) as those in (15.6%), and even more than those in drug combinations (24.3%). In addition, mis-classified drug combinations (i.e., ) also have significantly more drugs (18.6%) than those in (15.6%). Since matches similar drugs, high drug enrichment could be a primary reason for the mis-classification.
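The per-combination percentage described above is a simple ratio, and the reported figure is its average over a set of combinations. A minimal sketch follows; the function names and the toy inputs (including which drug is placed in the myopathy-inducing set) are hypothetical and only illustrate the arithmetic.

```python
def myopathy_drug_fraction(combo, myopathy_drugs):
    """Fraction of drugs in a combination that can cause myopathy on their own.

    The denominator is the drug combination order (number of distinct drugs).
    """
    drugs = set(combo)
    if not drugs:
        raise ValueError("a drug combination must contain at least one drug")
    return len(drugs & set(myopathy_drugs)) / len(drugs)


def average_percentage(combos, myopathy_drugs):
    """Average of the per-combination fractions, expressed as a percentage."""
    fractions = [myopathy_drug_fraction(c, myopathy_drugs) for c in combos]
    return 100.0 * sum(fractions) / len(fractions)


# Toy inputs for illustration only.
toy_myopathy_drugs = {"simvastatin"}
toy_combos = [
    ["simvastatin", "omeprazole"],   # 1 of 2 drugs is in the set
    ["omeprazole", "lansoprazole"],  # 0 of 2 drugs is in the set
]

print(average_percentage(toy_combos, toy_myopathy_drugs))
```

Note that this averages the per-combination ratios rather than pooling all drugs, which matches the description of computing the percentage for each drug combination first.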
Top predictions
Top mis-classification on
Table 6 lists the top-10 (in terms of prediction values) drug combinations in (i.e., without myopathy) that are mis-classified as positive (i.e., with myopathy) by . For those drug combinations that appear in , we present their values; otherwise, only their frequencies. Those top mis-classified drug combinations contain many single drugs that can induce myopathy on their own (i.e., in , bold in Table 6). As a matter of fact, the percentage of drugs in top mis-classified drug combinations is significantly higher than average. In Table 6, one special mis-classified drug combination is {lansoprazole, omeprazole, pantoprazole, rabeprazole}, which does not contain any drugs. This set of drugs is commonly used as proton pump inhibitors (PPIs) to decrease the amount of acid produced in the stomach. Some case studies show evidence of causality between the PPI drug class and myopathy [30, 31].
Top prediction on
Table 7 presents the top-10 correctly predicted drug combinations by . These drug combinations are significantly enriched with drugs (i.e., drugs that can induce myopathy on their own). As Table 5 shows, has the most drugs (89.8%) compared to all the other sets and significantly more than . In particular, all of these combinations contain statin drugs (e.g., atorvastatin, simvastatin and rosuvastatin). These statin drugs have been studied in the literature as a drug class with a high likelihood of inducing myopathy [32, 33]. In addition, in Table 7, 4 out of the 6 drug combinations among the top 10 (i.e., the drug combinations that have values) have values higher than the average in (31.998 as in Table 3), and 3 out of the 4 drug combinations among the top 10 (i.e., the drug combinations that do not have values) have frequencies higher than the average in (5.520 as in Table 3). Moreover, among the top-20 drug combinations predicted by , 7 out of 12 drug combinations have values higher than the average in , and 5 out of 8 drug combinations have frequencies higher than the average in . The average values of the top-10, top-20 and top-50 drug combinations from predicted by are 55.725, 42.956 and 42.114, respectively, and they are all higher than the average 31.998 for . The average frequencies of the top-10, top-20 and top-50 drug combinations from predicted by are 8.250, 6.875 and 6.524, respectively, and they are also all higher than the average 5.520 on . This indicates that does learn signals from and makes predictions accordingly.
Top non- prediction on
Table 8 presents the top-10 correctly predicted drug combinations by that do not contain any drugs from (i.e., do not contain drugs that can induce myopathy on their own). 4 out of 7 drug combinations from in this table have values higher than the average in (31.998 as in Table 3). The 3 drug combinations from in this table have frequencies lower than, but very close to, the average in (5.520 as in Table 3). In Table 8, 8 out of the top-10 drug combinations include alendronate. Case studies report several events of severe muscle pain, a common symptom of myopathy, after patients started therapy with alendronate [34], suggesting an association between alendronate treatment and myopathy.
Discussions
The experimental results show that the new methods with drug co-medication based single drug similarities outperform other kernels, such as convolutional kernels [16] and probabilistic kernels [25], and can accurately predict whether a drug combination is likely to induce ADRs of interest. The experimental results demonstrate the advantage of single drug similarities that leverage co-medication patterns among high-order drug-drug interactions. They also motivate further exploration of learning such similarities in a purely data-driven fashion without pre-defined kernels, for example, via manifold learning. Further research would also include learning drug representations in a data-driven fashion such that the representations better quantify drug similarities in terms of their co-medication patterns. Deep learning would be a promising option for such drug representation learning.
Conclusions
In this manuscript, SVM-based classification methods were developed to predict whether a drug combination of arbitrary order is likely to induce adverse drug reactions. Novel kernels over drug combinations of arbitrary orders were developed for such classification. These kernels were constructed from various single-drug information, including drug co-medication patterns, and compare drug combinations based on the single drugs they contain and the relations among those drugs. Specifically, a novel kernel over drug combinations of arbitrary orders was developed based on graph matching over drug combination graphs. A dataset from the FDA Adverse Event Reporting System (FAERS) was constructed to test the new methods. The experimental results demonstrated that the new methods with drug co-medication based single drug similarities and graph matching based kernels achieve the best AUC of 0.912. The predictions also revealed strong patterns among drug combinations (e.g., statin enriched) that may be highly correlated with their induced ADRs.
List of abbreviations
DDI: Drug-Drug Interaction; ADR: Adverse Drug Reaction; SDS: Single Drug Similarity; ECFP: Extended Connectivity Fingerprints; and OR: Odds Ratio.
Declarations
Ethics approval and consent to participate
Not applicable
Consent for publication
Not applicable
Availability of data and material
The data and materials will be made publicly available upon the acceptance of the manuscript.
Competing interests
The authors declare that they have no competing interests.
Funding
This material is based upon work supported by the National Science Foundation under Grant Numbers IIS-1566219 and IIS-1622526. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Authors’ contributions
Wen-Hao Chiang implemented the methods and conducted the experiments. Li Shen, Lang Li and Xia Ning developed the methods and designed the experiments. Xia Ning analyzed the experimental results. Wen-Hao Chiang and Xia Ning wrote the manuscript.
Additional Files
Additional file 1 — Drug-Drug Interaction Prediction based on Co-Medication Patterns and Graph Matching (Supplementary Materials)
The additional file, named “supp.pdf”, includes more experimental results on other datasets and is provided in PDF format. Any PDF reader can be used to view the file.
References
- [1] Ramirez, E., Carcas, A.J., Borobia, A.M., Lei, S.H., Piñana, E., Fudio, S., Frias, J.: A pharmacovigilance program from laboratory signals for the detection and reporting of serious adverse drug reactions in hospitalized patients. Clinical Pharmacology & Therapeutics 87(1), 74–86 (2010). doi: 10.1038/clpt.2009.185
- [2] Percha, B., Altman, R.B.: Informatics confronts drug-drug interactions. Trends in Pharmacological Sciences 34(3), 178–184 (2013). doi: 10.1016/j.tips.2013.01.006
- [3] The National Health and Nutrition Examination Survey. http://www.cdc.gov/NCHS/NHANES.htm
- [4] Iyer, S.V., Harpaz, R., Le Pendu, P., Bauer-Mehren, A., Shah, N.H.: Mining clinical text for signals of adverse drug-drug interactions. Journal of the American Medical Informatics Association 21(2), 353–362 (2014)
- [5] Vilar, S., Uriarte, E., Santana, L., Lorberbaum, T., Hripcsak, G., Friedman, C., Tatonetti, N.P.: Similarity-based modeling in large-scale prediction of drug-drug interactions. Nature Protocols 9(9), 2147–2163 (2014)
- [6] Hammann, F., Drewe, J.: Data mining for potential adverse drug–drug interactions. Expert Opinion on Drug Metabolism & Toxicology 10(5), 665–671 (2014). doi: 10.1517/17425255.2014.894507. PMID: 24588496. http://dx.doi.org/10.1517/17425255.2014.894507
- [7] Luo, H., Zhang, P., Huang, H., Huang, J., Kao, E., Shi, L., He, L., Yang, L.: DDI-CPI, a server that predicts drug–drug interactions through implementing the chemical–protein interactome. Nucleic Acids Research (2014). doi: 10.1093/nar/gku433
- [8] Harpaz, R., DuMouchel, W., Shah, N.H., Madigan, D., Ryan, P., Friedman, C.: Novel data-mining methodologies for adverse drug event discovery and analysis. Clinical Pharmacology & Therapeutics 91(6), 1010–1021 (2012). doi: 10.1038/clpt.2012.50
