Drug-drug interaction prediction based on co-medication patterns and graph matching

Wen-Hao Chiang; Li Shen; Lang Li; Xia Ning

arXiv:1902.08675·cs.LG·February 26, 2019

TL;DR

This paper introduces novel kernel methods utilizing graph matching and co-medication patterns within support vector machines to accurately predict adverse drug reactions from complex drug combinations.

Contribution

It presents new kernels based on graph matching and co-medication data for predicting drug interactions of arbitrary orders, advancing the accuracy of adverse drug reaction prediction.

Findings

01. Achieved an AUC of 0.912 on real-world data

02. Utilized co-medication patterns to measure drug similarities

03. Developed kernels effective for complex drug combination prediction

Abstract

Background: The problem of predicting whether a drug combination of arbitrary orders is likely to induce adverse drug reactions is considered in this manuscript. Methods: Novel kernels over drug combinations of arbitrary orders are developed within support vector machines for the prediction. Graph matching methods are used in the novel kernels to measure the similarities among drug combinations, in which drug co-medication patterns are leveraged to measure single drug similarities. Results: The experimental results on a real-world dataset demonstrated that the new kernels achieve an area under the curve (AUC) value 0.912 for the prediction problem. Conclusions: The new methods with drug co-medication based single drug similarities can accurately predict whether a drug combination is likely to induce adverse drug reactions of interest. Keywords: drug-drug interaction prediction; drug…

Tables (8)

Table 1: Table of Notations

Notation	Description
$𝚍$	Drug
$𝐷$	Drug combination
$𝒢$	Complete graph for a drug combination
${SDS}_{2d}$	Single drug similarity from drug 2d structures
${SDS}_{cm}$	Single drug similarity based on co-medications
$𝒦_{gm}$	Kernel based on graph matching algorithm
$𝒦_{cd}$	Kernel from common drugs
$𝒦_{ds}$	Kernel from drug similarities
$𝒦_{pb}$	Probabilistic drug combination kernel

Table 2: Contingency Table

$𝙾𝚁 = \frac{n_{1}}{m_{1}} / \frac{n_{2}}{m_{2}}$	$ADR$	no $ADR$	total
$𝐷$	$n_{1}$	$m_{1}$	$n_{1} + m_{1}$
$∖$ $𝐷$	$n_{2}$	$m_{2}$	$n_{2} + m_{2}$
total	$n_{1} + n_{2}$	$m_{1} + m_{2}$	$n_{1} + n_{2} + m_{1} + m_{2}$

Table 3: Data Statistics

dataset	stats	$𝒩^{-}$	$𝒩^{0}$	$ℳ^{0}$	$ℳ^{+}$
$𝒟_{FAERS}$	# $𝐷$	621,449	1,264	8,986	27,387
$𝒟_{FAERS}$	# $𝚍$	1,209	417	881	1,201
$𝒟_{FAERS}$	avgOrd	6.100	2.351	3.588	7.096
$𝒟_{FAERS}$	avgFrq	1.761	225.317	13.730	1.402
$𝒟_{FAERS}$	avg $𝙾𝚁$	-	0.546	16.343	-
$𝒟^{*}$	# $𝐷$	2,200	1,264	2,464	1,000
$𝒟^{*}$	# $𝚍$	562	417	692	679
$𝒟^{*}$	avgOrd	2.678	2.351	3.809	7.615
$𝒟^{*}$	avgFrq	42.082	225.317	20.565	5.520
$𝒟^{*}$	avg $𝙾𝚁$	-	0.546	31.998	-

Table 4: Overall Performance Comparison

$𝒦$	$𝒦_{gm}$	$𝒦_{gm}$	$𝒦_{cd}$	$𝒦_{cd}$	$𝒦_{ds}$	$𝒦_{ds}$	$𝒦_{pb}$	$𝒦_{pb}$
$SDS$	2d	cm	ord-1	ord-2	2d	cm	2d	cm
acc	0.829	0.836	0.817	0.827	0.827	0.825	0.763	0.765
pre	0.889	0.892	0.879	0.878	0.893	0.865	0.810	0.770
rec	0.752	0.765	0.735	0.759	0.744	0.770	0.689	0.756
F1	0.815	0.823	0.801	0.814	0.812	0.815	0.744	0.763
AUC	0.898	0.912	0.907	0.909	0.900	0.900	0.843	0.853

Table 5: Average Percentage (%) of $\mathcal{D}_{\text{Myo}}$ Drugs ($\mathcal{K}_{\text{gm}}^{\text{cm}}$)

${\tilde{ℳ}}^{10 -}$	${\tilde{ℳ}}^{-}$	$ℳ$	${\tilde{ℳ}}^{10 +}$	${\tilde{𝒩}}^{10 +}$	${\tilde{𝒩}}^{+}$	$𝒩$	${\tilde{𝒩}}^{10 -}$
13.3	16.6	24.3	89.8	30.7	18.6	15.6	0.10

Table 6: Top mis-Classified $\mathcal{N}$ Drug Combinations by $\mathcal{K}_{\text{gm}}^{\text{cm}}$

N	prd	frq	$𝙾𝚁$	combinations
1	2.696	26	-	atorvastatin fenofibrate rosiglitazone simvastatin
2	2.507	26	-	allopurinol amlodipine atorvastatin levothyroxine naproxen omeprazole simvastatin
3	1.878	22	-	acetylsalicylicacid atorvastatin bisoprolol clopidogrel ramipril simvastatin
4	1.855	27	-	acetylsalicylicacid atenolol atorvastatin furosemide lansoprazole lisinopril nitroglycerin
5	1.785	21	-	citalopram clozapine isosorbidemononitrate prochlorperazine simvastatin zopiclone
6	1.750	-	0.842	amlodipine bisoprolol pravastatin ramipril simvastatin spironolactone warfarin
7	1.696	22	-	amlodipine clopidogrel ibuprofen omeprazole ramipril simvastatin
8	1.669	29	-	bisoprolol flecainide ramipril simvastatin
9	1.613	35	-	aripiprazole atorvastatin bendroflumethiazide clozapine diazepam folicacid furosemide iron lactulose lansoprazole perindopril ramipril trimethoprim zopiclone
10	1.549	-	0.875	lansoprazole omeprazole pantoprazole rabeprazole

Table 7: Top Predictions on $\mathcal{M}$ by $\mathcal{K}_{\text{gm}}^{\text{cm}}$

N	prd	frq	$𝙾𝚁$	Combinations
1	4.167	3	-	atorvastatin lansoprazole pravastatin rosuvastatin simvastatin
2	4.009	-	11.372	atorvastatin pravastatin rosuvastatin simvastatin
3	3.776	-	50.043	atorvastatin fenofibrate metformin pravastatin rosuvastatin simvastatin
4	3.734	-	68.232	atorvastatin metformin pravastatin rosuvastatin simvastatin
5	3.676	-	45.487	atorvastatin lovastatin rosuvastatin simvastatin
6	3.618	-	136.470	atorvastatin pravastatin rosuvastatin simvastatin tadalafil
7	3.573	9	-	atorvastatin fenofibrate pravastatin simvastatin
8	3.552	10	-	atorvastatin ezetimibe fenofibrate rosuvastatin
9	3.519	-	22.746	atorvastatin ezetimibe rosuvastatin simvastatin
10	3.461	11	-	atorvastatin lansoprazole pravastatin simvastatin

Table 8: Top Predictions without $\mathcal{D}_{\text{Myo}}$ Drugs by $\mathcal{K}_{\text{gm}}^{\text{cm}}$

N	prd	frq	$𝙾𝚁$	Combinations
1	2.083	4	-	calcium clonazepam colestipol prednisone teriparatide
2	1.992	-	45.487	alendronate anastrozole desloratadine hydrochlorothiazide lisinopril triamterene valdecoxib vitaminc
3	1.968	-	17.058	alendronate raloxifene risedronate teriparatide
4	1.960	-	90.978	alendronate amlodipine atenolol clonazepam raloxifene teriparatide
5	1.901	-	45.489	alendronate fexofenadine hydrochlorothiazide omeprazole prednisone risedronate triamterene
6	1.850	5	-	alendronate fexofenadine levothyroxine nabumetone oxybutynin
7	1.849	-	113.720	alendronate calcium esomeprazole ibandronate levothyroxine rabeprazole
8	1.843	-	7.581	alendronate calciumgluconate teriparatide
9	1.838	-	22.744	alendronate calcium levothyroxine raloxifene teriparatide
10	1.834	4	-	calcium escitalopram iron ketorolac raloxifene teriparatide

Equations

$$\text{Tanimoto}(S_{1},S_{2})=\frac{|S_{1}\cap S_{2}|}{|S_{1}|+|S_{2}|-|S_{1}\cap S_{2}|}, \tag{1}$$

$$\text{SDS}_{\text{2d}}(\mathtt{d}_{i},\mathtt{d}_{j})=\text{Tanimoto}(\{\boldsymbol{x}_{i}\},\{\boldsymbol{x}_{j}\}), \tag{2}$$

$$\text{cost}(\mathtt{d}_{i},\mathtt{d}_{j})=1-\text{SDS}(\mathtt{d}_{i},\mathtt{d}_{j}), \tag{3}$$

$$\begin{aligned}
\min_{X}\quad & \text{trace}\big(C(\mathcal{G}_{p},\mathcal{G}_{q})X^{\mathsf{T}}\big)\\
\text{subject to}\quad & X\in\mathcal{P},\\
& \mathcal{P}\coloneqq\Big\{X \,\Big|\, X\in\mathbb{R}^{k_{p}\times k_{q}},\ X_{i,j}\in\{0,1\},\ \sum_{i=1}^{k_{p}}X_{i,j}\leq 1,\ \sum_{j=1}^{k_{q}}X_{i,j}\leq 1,\ \sum_{i=1}^{k_{p}}\sum_{j=1}^{k_{q}}X_{i,j}=\min(k_{p},k_{q})\Big\},
\end{aligned} \tag{4}$$

$$\mathcal{S}_{\text{gm}}(D_{p},D_{q})=\text{trace}\big((J-C(\mathcal{G}_{p},\mathcal{G}_{q}))X^{\mathsf{T}}\big), \tag{5}$$

$$\mathcal{K}_{\text{cd}}(D_{p},D_{q})=\text{Tanimoto}(D_{p},D_{q}), \tag{6}$$

$$\mathcal{K}^{(2)}_{\text{cd}}(D_{p},D_{q})=\text{Tanimoto}(D^{(2)}_{p},D^{(2)}_{q}). \tag{7}$$

$$\mathcal{K}_{\text{ds}}(D_{p},D_{q})=\frac{1}{k_{p}k_{q}}\sum_{\mathtt{d}_{i}\in D_{p}}\sum_{\mathtt{d}_{j}\in D_{q}}\text{SDS}(\mathtt{d}_{i},\mathtt{d}_{j}), \tag{8}$$

Taxonomy

Topics: Computational Drug Discovery Methods · Pharmacogenetics and Drug Metabolism · Chemical Synthesis and Analysis

Full text

Drug-drug interaction prediction based on co-medication patterns and graph matching

Wen-Hao Chiang

Li Shen

Lang Li

Xia Ning

Department of Computer & Information Science, Indiana University - Purdue University Indianapolis, 46202 Indianapolis, USA. Email: [email protected]

Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, 19104 Philadelphia, USA. Email: [email protected]

Department of Biomedical Informatics, Ohio State University, 43210 Columbus, USA. Email: [email protected]

Department of Computer & Information Science, Indiana University - Purdue University Indianapolis, 46202 Indianapolis, USA. Email: [email protected]

Abstract

Background: The problem of predicting whether a drug combination of arbitrary orders is likely to induce adverse drug reactions is considered in this manuscript. Methods: Novel kernels over drug combinations of arbitrary orders are developed within support vector machines for the prediction. Graph matching methods are used in the novel kernels to measure the similarities among drug combinations, in which drug co-medication patterns are leveraged to measure single drug similarities. Results: The experimental results on a real-world dataset demonstrated that the new kernels achieve an area under the curve (AUC) value of 0.912 for the prediction problem. Conclusions: The new methods with drug co-medication based single drug similarities can accurately predict whether a drug combination is likely to induce adverse drug reactions of interest.

Keywords: drug-drug interaction prediction, drug combination similarity, co-medication, graph matching

Introduction

Drug-Drug Interactions (DDIs) and the associated Adverse Drug Reactions (ADRs) represent a consistent detriment to public health in the United States. DDIs account for approximately 26% of ADRs, occur among 50% of hospitalized patients [1], and cause nearly 74,000 emergency room visits and 195,000 hospitalizations annually in the US [2]. In addition, because co-medication is common among elderly Americans, particularly co-medication of more than two drugs, high-order drug-drug interactions and their associated ADRs pose significant scientific and public health challenges. The National Health and Nutrition Examination Survey [3] reports that more than 76% of elderly Americans take two or more drugs every day. Another study [4] estimates that about 29.4% of elderly American patients take six or more drugs every day. However, the mechanisms of most such high-order DDIs are unknown.

In this manuscript, novel approaches to predicting whether high-order drug combinations are likely to induce ADRs are presented. The prediction problem is formulated as a binary classification problem, and support vector machines (SVMs) are used for the prediction. Novel kernels over drug combinations of arbitrary orders are developed within the framework of SVMs. These kernels are constructed using drug co-medication information to measure single drug similarities and graph matching on drug combination graphs to measure drug combination similarities. A comparison of the new kernels with other convolutional kernels and probabilistic kernels on drug combinations is also conducted. The experimental results demonstrate that the new kernels outperform the others and can accurately predict whether a drug combination is likely to induce ADRs of interest, with an AUC value of 0.912. To the best of our knowledge, this manuscript represents the first effort in predicting DDIs for drug combinations of arbitrary orders.

Background

Drug-drug interactions

Significant research efforts have been dedicated to detecting pairwise drug-drug interactions (DDIs) [5, 6] in recent years. Existing methods either extract DDI pairs mentioned in medical literature or Electronic Health Records (EHRs) [4], or predict/score DDI pairs from various drug/target information [7]. While most existing DDI studies focus on interactions between a pair of drugs (i.e., order-2 DDIs), understanding high-order DDIs and their associated ADRs has attracted increasing attention recently [2, 8]. These emerging methods for high-order DDI studies largely focus on how to discover high-order DDIs efficiently by mining frequent itemsets (i.e., drug combinations) from EHRs. The most recent work also includes pattern discovery from directional high-order DDIs [9] and directional high-order DDI prediction [10].

Graph matching

Graph matching aims to find the optimal vertex correspondence between two graphs [11, 12]. Graph matching problems can be broadly classified into two categories. The first category is exact graph matching, which seeks graph and subgraph isomorphisms so that the mapping of vertices between two graphs is bijective and edge-preserving (i.e., vertices connected by an edge in one graph are mapped to vertices in the other graph that are also connected by an edge). The second category is inexact graph matching, which allows errors (e.g., different types of matched vertices in attributed graphs) during matching and thus seeks the matching that minimizes the total error. Typical algorithms for graph matching include spectral methods [13], probabilistic methods [14], tree search [15], etc.

Definitions and notations

We use $d_{i}$ to represent a drug, and $D^{k}=\{d_{1},d_{2},\cdots,d_{k}\}$ to represent a combination of $k$ drugs, where $k$ is the number of unique drugs in $D^{k}$ (i.e., $k=|D^{k}|$ ) and thus the order of $D^{k}$ . A drug combination $D^{k}$ is defined when the drugs and only the drugs in $D^{k}$ are taken simultaneously. There are no orderings among the drugs in a drug combination. When no ambiguity is raised, we drop the superscript $k$ in $D^{k}$ and represent a drug combination as $D$ . An event is referred to as a patient taking a drug combination. In addition, in this manuscript, all vectors (e.g., $\mathbf{c}$ ) are represented by bold lower-case letters and all matrices (e.g., $X$ ) are represented by upper-case letters. Row vectors are represented by having the transpose superscript T, otherwise by default they are column vectors. Table 1 summarizes the important notations in the manuscript.

Methods

We formulate the problem of predicting whether high-order drug combinations induce a particular ADR as a binary classification problem, and solve the classification problem within the framework of kernel methods and support vector machines (SVMs). In this manuscript, we consider myopathy as the ADR in particular. The central concept of SVM-based classification methods is that “similar” instances are likely to share similar labels, and thus the key is to capture and measure the “similarities” among instances (i.e., drug combinations in our ADR prediction problem) via kernels. In the case of drug combinations, we hypothesize that if two drug combinations share similar pharmaceutical, pharmacokinetic and/or pharmacodynamic properties, they may induce similar ADRs. Therefore, the question boils down to effectively representing and measuring the similarities in terms of such properties. To this end, we develop various kernels over drug combinations. A key property of such kernels, as will be discussed later, is that they are able to deal with drug combinations of arbitrary orders. These kernels are constructed using single drug similarities, which incorporate various drug information that could relate to DDIs. Here we decompose the discussion of such kernels into three aspects: (1) single drug similarities (SDS) in Section Single drug similarities, (2) our new kernel based on matching similar drugs in drug combination graphs in Section Drug combination kernels from graph matching, and (3) other convolutional kernels [16] in Section Convolutional drug-combination kernels. Given these kernels, we further employ the freely available SVM-Light software [17] to build the binary classifiers and conduct our experiments based on such classifiers.

Single drug similarities

We use two different approaches to measuring single drug similarities ( $\mathop{\text{SDS}}\limits$ ). The first approach measures single drug similarities based on their intrinsic properties that can be represented by their 2D structures [18]. The second approach measures the similarities in a more data-driven fashion based on the co-occurrence patterns among drugs.

SDS from drug 2D structures

A straightforward way to measure the SDS between two drugs is to look at their structures, which ultimately determine their physicochemical properties. We use Extended Connectivity Fingerprints (ECFP) [19] of length 2,048 to represent drug 2D structures. Each of the fingerprint dimensions corresponds to a substructure among the drugs of interest. The binary values in the fingerprints represent whether a drug has the corresponding substructure or not. We use a vector $\boldsymbol{x}_{i}\in\mathbb{R}^{2048}$ to represent the fingerprint for drug $\mathtt{d}_{i}$. The SDS between two drugs from their 2D structures, denoted as $\text{SDS}_{\text{2d}}$, is calculated as the Tanimoto coefficient between their ECFP fingerprints [20]. The Tanimoto coefficient between two sets is defined as follows,

$$\text{Tanimoto}(S_{1},S_{2})=\frac{|S_{1}\cap S_{2}|}{|S_{1}|+|S_{2}|-|S_{1}\cap S_{2}|}, \tag{1}$$

where $|S|$ is the cardinality of set $S$. Thus, $\text{SDS}_{\text{2d}}$ is defined as

$$\text{SDS}_{\text{2d}}(\mathtt{d}_{i},\mathtt{d}_{j})=\text{Tanimoto}(\{\boldsymbol{x}_{i}\},\{\boldsymbol{x}_{j}\}), \tag{2}$$

where $\{\boldsymbol{x}_{i}\}$ represents the set of substructures that $\mathtt{d}_{i}$ has in its fingerprint $\boldsymbol{x}_{i}$.
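The Tanimoto-based similarity above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the 8-bit toy fingerprints, and the convention that two empty sets have similarity 0 are all assumptions made here.

```python
def tanimoto(s1, s2):
    """Tanimoto coefficient between two sets, following Equation 1."""
    s1, s2 = set(s1), set(s2)
    inter = len(s1 & s2)
    denom = len(s1) + len(s2) - inter
    # Assumed convention: two empty sets get similarity 0.
    return inter / denom if denom else 0.0


def on_bits(fingerprint):
    """Indices of the substructures that are present in a binary fingerprint."""
    return {idx for idx, bit in enumerate(fingerprint) if bit}


def sds_2d(x_i, x_j):
    """SDS_2d between two drugs, given their binary fingerprint vectors."""
    return tanimoto(on_bits(x_i), on_bits(x_j))


# Toy 8-bit fingerprints, made up for illustration (the paper uses 2,048 bits).
x_a = [1, 1, 0, 1, 0, 0, 1, 0]
x_b = [1, 0, 0, 1, 0, 1, 1, 0]
similarity = sds_2d(x_a, x_b)  # 3 shared bits out of 5 distinct on-bits
```

Here the two toy drugs share three substructures out of five distinct ones, so the similarity is 3/5.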

SDS based on co-medications

We develop a new approach to measuring the SDS between two drugs by looking at whether each of them is often involved in co-medications with similar other drugs. The hypothesis is that drugs that are taken together with similar other drugs may share similar therapeutic purposes and target similar therapeutic targets, and thus behave similarly in inducing ADRs. Such data-driven co-medication based SDSs have a potential advantage over $\text{SDS}_{\text{2d}}$ in that they directly leverage signals from ADR information that may not be captured or explained by drug 2D structures or other features of individual drugs. Such co-medication based SDS is denoted as $\text{SDS}_{\text{cm}}$.

We use two vectors $\boldsymbol{c}^{+}_{i}\in\mathbb{R}^{n}$ and $\boldsymbol{c}^{-}_{i}\in\mathbb{R}^{n}$ ($n$ is the total number of drugs) to represent the co-medication information for drug $\mathtt{d}_{i}$. The $j$-th dimension ($j=1,\cdots,n$) in $\boldsymbol{c}^{+}_{i}$ / $\boldsymbol{c}^{-}_{i}$ corresponds to drug $\mathtt{d}_{j}$, and the value on the $j$-th dimension in $\boldsymbol{c}^{+}_{i}$ / $\boldsymbol{c}^{-}_{i}$ is the co-medication frequency of $\mathtt{d}_{i}$ and $\mathtt{d}_{j}$ in all the events with/without ADRs. Both $\boldsymbol{c}^{+}_{i}$ and $\boldsymbol{c}^{-}_{i}$ values are then normalized into probabilities. The normalized $\boldsymbol{c}^{+}_{i}$ and $\boldsymbol{c}^{-}_{i}$ are further concatenated into one vector $\boldsymbol{c}_{i}$, that is, $\boldsymbol{c}_{i}=[\boldsymbol{c}^{+}_{i};\boldsymbol{c}^{-}_{i}]$, for $\mathtt{d}_{i}$. The $\text{SDS}_{\text{cm}}$ between drugs $\mathtt{d}_{i}$ and $\mathtt{d}_{j}$ is calculated as the cosine similarity between $\boldsymbol{c}_{i}$ and $\boldsymbol{c}_{j}$. We use $\boldsymbol{c}^{+}_{i}$ and $\boldsymbol{c}^{-}_{i}$ to construct $\boldsymbol{c}_{i}$, instead of co-medication frequencies pooled over all events with and without ADRs, because the co-medication patterns from the two types of events can be very different, and thus one unified co-medication vector for both would not necessarily capture discriminative information among drugs.
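The construction of the co-medication vectors can be sketched as follows. This is an illustrative reading of the description above, under assumptions that the text does not spell out: each frequency vector is normalized to sum to 1, a drug is not counted as co-medicated with itself, and events are given as sets of drug names. The toy events and all function names are hypothetical.

```python
import math
from collections import Counter


def comedication_vector(drug, events, vocabulary):
    """Co-medication frequencies of `drug` over `vocabulary`, normalized to sum to 1.

    Assumptions: normalization by the total count, and no self co-medication.
    """
    counts = Counter()
    for combo in events:
        if drug in combo:
            for other in combo:
                if other != drug:
                    counts[other] += 1
    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in vocabulary]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sds_cm(d_i, d_j, adr_events, non_adr_events, vocabulary):
    """SDS_cm: cosine similarity of the concatenated vectors c_i = [c_i^+; c_i^-]."""
    c_i = (comedication_vector(d_i, adr_events, vocabulary)
           + comedication_vector(d_i, non_adr_events, vocabulary))
    c_j = (comedication_vector(d_j, adr_events, vocabulary)
           + comedication_vector(d_j, non_adr_events, vocabulary))
    return cosine(c_i, c_j)


# Toy events, made up for illustration: each event is the set of drugs taken.
vocabulary = ["a", "b", "c", "d"]
adr_events = [{"a", "c"}, {"b", "c"}, {"a", "d"}]
non_adr_events = [{"a", "c"}, {"b", "d"}]
sim_ab = sds_cm("a", "b", adr_events, non_adr_events, vocabulary)
```

In the toy data, drugs "a" and "b" never co-occur, yet they receive a non-zero similarity because both are co-medicated with "c" in events with ADRs.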

Drug combination kernels from graph matching

We formulate the problem of comparing drug combination similarities through matching drug combination graphs, and develop a graph-matching based kernel for drug combination similarities. Specifically, for a drug combination $D_{p}=\{\mathtt{d}_{p1},\mathtt{d}_{p2},\cdots,\mathtt{d}_{pk_{p}}\}$, we construct a complete graph $\mathcal{G}_{p}$ of $k_{p}$ nodes, in which each node represents a drug in $D_{p}$, and all the nodes are connected to one another. Thus, the similarity between drug combinations $D_{p}$ and $D_{q}$ can be measured based on how well $\mathcal{G}_{p}$ and $\mathcal{G}_{q}$ match each other. In matching such graphs, we consider SDSs so that drugs that are similar to each other should be matched, and the graph matching procedure should maximize the overall SDS from matched drugs. The underlying assumption is that if two drug combinations share similar drugs, they could have similar ADRs. Figure 1 illustrates the idea of complete graph matching for two drug combinations, in which the drugs connected by dashed lines are matched between $D_{p}$ and $D_{q}$. The similarity calculated from graph matching over two drug combinations, denoted as $\mathcal{S}_{\text{gm}}$, is the sum of the SDSs from matched drugs. $\mathcal{S}_{\text{gm}}$ is further converted to a valid kernel, denoted as $\mathcal{K}_{\text{gm}}$.

Graph matching algorithm for $\mathcal{K}_{\text{gm}}$

The drug combination graph matching problem can be solved as a well-known linear sum assignment problem (LSAP) [21]. The objective is to minimize the total cost of matching vertices in two graphs, and thus to find the graph matching with minimal total cost. In the case of high-order drug combinations, we define the cost of matching two drugs $\mathtt{d}_{i}$ and $\mathtt{d}_{j}$ as the dissimilarity between the drugs, that is,

$$\text{cost}(\mathtt{d}_{i},\mathtt{d}_{j})=1-\text{SDS}(\mathtt{d}_{i},\mathtt{d}_{j}), \tag{3}$$

where $\text{cost}(\mathtt{d}_{i},\mathtt{d}_{j})$ is the cost between $\mathtt{d}_{i}$ and $\mathtt{d}_{j}$, and SDS can be either $\text{SDS}_{\text{2d}}$ or $\text{SDS}_{\text{cm}}$. Thus, if two drugs are very similar (i.e., large SDS), the cost of matching them will be small and therefore they are more likely to be matched.

Therefore, the graph matching can be solved by solving the following LSAP:

$$\begin{aligned}
\min_{X}\quad & \text{trace}\big(C(\mathcal{G}_{p},\mathcal{G}_{q})X^{\mathsf{T}}\big)\\
\text{subject to}\quad & X\in\mathcal{P},\\
& \mathcal{P}\coloneqq\Big\{X \,\Big|\, X\in\mathbb{R}^{k_{p}\times k_{q}},\ X_{i,j}\in\{0,1\},\ \sum_{i=1}^{k_{p}}X_{i,j}\leq 1,\ \sum_{j=1}^{k_{q}}X_{i,j}\leq 1,\ \sum_{i=1}^{k_{p}}\sum_{j=1}^{k_{q}}X_{i,j}=\min(k_{p},k_{q})\Big\},
\end{aligned} \tag{4}$$

where $\text{trace}()$ is the trace of a matrix; $k_{p}$ and $k_{q}$ are the numbers of vertices in $\mathcal{G}_{p}$ and $\mathcal{G}_{q}$ (and thus the orders of $D_{p}$ and $D_{q}$), respectively; and $C(\mathcal{G}_{p},\mathcal{G}_{q})\in\mathbb{R}^{k_{p}\times k_{q}}$ is the pairwise drug-matching cost matrix for the two drug combinations $D_{p}$ and $D_{q}$ ($C(i,j)=\text{cost}(\mathtt{d}_{pi},\mathtt{d}_{qj})$, $\mathtt{d}_{pi}\in D_{p}$, $\mathtt{d}_{qj}\in D_{q}$). In Problem 4, $X$ is the assignment matrix that matches $\mathcal{G}_{p}$ and $\mathcal{G}_{q}$ (i.e., assigns a vertex in $\mathcal{G}_{p}$ to a vertex in $\mathcal{G}_{q}$), in which all the values are either 0 or 1, both the row sums and the column sums are either 0 or 1 (i.e., a vertex is either matched or not; if it is matched, it is matched to only one vertex in the other graph), and the sum of all the values is exactly the minimum of $k_{p}$ and $k_{q}$ (i.e., the vertices in the smaller graph all have to be matched). Essentially, $X$ assigns each of the vertices in the smaller graph of $\mathcal{G}_{p}$ and $\mathcal{G}_{q}$ to exactly one vertex in the larger graph. The optimization problem in Problem 4 can be solved by the Hungarian algorithm [22]. The drug-combination similarity $\mathcal{S}_{\text{gm}}$ is then calculated as

$$\mathcal{S}_{\text{gm}}(D_{p},D_{q})=\text{trace}\big((J-C(\mathcal{G}_{p},\mathcal{G}_{q}))X^{\mathsf{T}}\big), \tag{5}$$

where $J\in\mathbb{R}^{k_{p}\times k_{q}}$ is a matrix of all 1’s.
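The matching step and the resulting similarity can be illustrated with a small Python sketch. Note the substitution: instead of the Hungarian algorithm used in the paper, this sketch enumerates all one-to-one assignments by brute force, which gives the same optimum but is only practical for small combinations. The pairwise similarity values, drug labels, and function names are made up for illustration.

```python
from itertools import permutations


def graph_matching_similarity(combo_p, combo_q, sds):
    """S_gm: total SDS over the minimum-cost one-to-one matching of drugs.

    Brute-force stand-in for the Hungarian algorithm (assumption of this sketch).
    """
    small, large = sorted((list(combo_p), list(combo_q)), key=len)
    best_cost = None
    # Each vertex of the smaller graph is assigned to a distinct vertex of the larger one.
    for targets in permutations(large, len(small)):
        cost = sum(1.0 - sds(a, b) for a, b in zip(small, targets))
        if best_cost is None or cost < best_cost:
            best_cost = cost
    # With cost = 1 - SDS, the matched similarity is (#matched pairs) - (minimal cost).
    return len(small) - best_cost


# Hypothetical pairwise similarities between drugs of two combinations.
toy_pairs = {
    frozenset(("a", "x")): 0.9, frozenset(("a", "y")): 0.2, frozenset(("a", "z")): 0.1,
    frozenset(("b", "x")): 0.8, frozenset(("b", "y")): 0.7, frozenset(("b", "z")): 0.3,
}


def toy_sds(d1, d2):
    """Symmetric lookup; identical drugs have similarity 1, unknown pairs 0."""
    return 1.0 if d1 == d2 else toy_pairs.get(frozenset((d1, d2)), 0.0)


s_gm = graph_matching_similarity(["a", "b"], ["x", "y", "z"], toy_sds)
```

In this toy case the optimal matching pairs "a" with "x" and "b" with "y", leaving "z" unmatched, so the similarity is 0.9 + 0.7.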

The drug-combination similarity matrix $\mathcal{S}_{\text{gm}}$ is always symmetric but not necessarily positive semi-definite, and thus not always a valid kernel. To convert $\mathcal{S}_{\text{gm}}$ to a valid kernel $\mathcal{K}_{\text{gm}}$, we follow the approach in Saigo et al. [23]. Specifically, we first conduct an eigenvalue decomposition on $\mathcal{S}_{\text{gm}}$, subtract from the diagonal of the eigenvalue matrix its smallest negative eigenvalue, and reconstruct the matrix from the altered decomposition. The resulting matrix is positive semi-definite and is used as $\mathcal{K}_{\text{gm}}$.
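As a short worked derivation of this spectral shift (assuming the similarity matrix is real and symmetric, so that its eigenvector matrix is orthogonal; the symbols $S$, $Q$, $\Lambda$, and $\lambda_{\min}$ are introduced here for illustration and are not the paper's notation), let $S$ be the matrix of pairwise $\mathcal{S}_{\text{gm}}$ values with eigendecomposition $S=Q\Lambda Q^{\mathsf{T}}$ and smallest eigenvalue $\lambda_{\min}<0$. Then

$$\begin{aligned}
K &= Q(\Lambda-\lambda_{\min}I)Q^{\mathsf{T}}\\
  &= Q\Lambda Q^{\mathsf{T}}-\lambda_{\min}QQ^{\mathsf{T}}\\
  &= S+|\lambda_{\min}|\,I .
\end{aligned}$$

The eigenvalues of $K$ are $\lambda_{i}-\lambda_{\min}\geq 0$, so $K$ is positive semi-definite. Under these assumptions the shift leaves all off-diagonal similarities unchanged and only raises the self-similarities by $|\lambda_{\min}|$.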

Convolutional drug-combination kernels

Drug combination kernels from common drugs

We define a drug-combination kernel, denoted as $\mathcal{K}_{\text{cd}}$, based on common drugs among drug combinations. $\mathcal{K}_{\text{cd}}$ is calculated as the Tanimoto coefficient over the sets of drugs in the drug combinations, that is,

$$\mathcal{K}_{\text{cd}}(D_{p},D_{q})=\text{Tanimoto}(D_{p},D_{q}), \tag{6}$$

where $\text{Tanimoto}()$ is defined as in Equation 1. It has been proved that the Tanimoto coefficient is a valid kernel function [24]. $\mathcal{K}_{\text{cd}}$ essentially measures the proportion of drugs shared between two drug combinations. The underlying assumption is that if two drug combinations share many common drugs, they are likely to have similar properties.

To further enhance the similarity between two drug combinations from their common drugs, we also define an order-2 $\mathcal{K}_{\text{cd}}$ of drug combinations, denoted as $\mathcal{K}^{(2)}_{\text{cd}}$ ($\mathcal{K}_{\text{cd}}$ in Equation 6 is correspondingly referred to as order-1 $\mathcal{K}_{\text{cd}}$ and denoted as $\mathcal{K}^{(1)}_{\text{cd}}$). We first represent a drug combination $D=\{\mathtt{d}_{1},\mathtt{d}_{2},\cdots,\mathtt{d}_{k}\}$ by all its single drugs and drug pairs, denoted as $D^{(2)}=\{\mathtt{d}_{1},\mathtt{d}_{2},\cdots,\mathtt{d}_{k},(\mathtt{d}_{1},\mathtt{d}_{2}),(\mathtt{d}_{1},\mathtt{d}_{3}),\cdots,(\mathtt{d}_{k-1},\mathtt{d}_{k})\}$. Thus, $\mathcal{K}^{(2)}_{\text{cd}}$ on two drug combinations $D_{p}$ and $D_{q}$ can be calculated as the Tanimoto coefficient on $D^{(2)}_{p}$ and $D^{(2)}_{q}$, that is,

$$\mathcal{K}^{(2)}_{\text{cd}}(D_{p},D_{q})=\text{Tanimoto}(D^{(2)}_{p},D^{(2)}_{q}). \tag{7}$$

Intuitively, $\mathcal{K}^{(2)}_{\text{cd}}$ differentiates drug combinations with many shared drugs from those with fewer shared drugs better than $\mathcal{K}^{(1)}_{\text{cd}}$ does. We only extend $\mathcal{K}_{\text{cd}}$ to order 2, since higher-order extensions do not lead to better performance according to our experimental results. According to Equation 6, when the order $n$ becomes much higher, $\text{Tanimoto}(D^{(n)}_{p},D^{(n)}_{q})$ may become very small due to rapid combinatorial growth in the denominator and an insufficient number of common drug $n$-tuples (i.e., the number in the numerator). Thus, $\mathcal{K}_{\text{cd}}$ extended to much higher orders may lose the ability to differentiate drug combinations that contain more common drugs.
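The order-1 and order-2 common-drug kernels can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, drug pairs are stored as sorted tuples so that a pair is order-independent, and the two example combinations are chosen only to show the computation.

```python
from itertools import combinations


def tanimoto(s1, s2):
    """Tanimoto coefficient between two sets (Equation 1)."""
    inter = len(s1 & s2)
    denom = len(s1) + len(s2) - inter
    return inter / denom if denom else 0.0


def expand_order2(combo):
    """D^(2): all single drugs of a combination plus all unordered drug pairs."""
    drugs = sorted(set(combo))  # sorting makes each pair tuple canonical
    return set(drugs) | set(combinations(drugs, 2))


def k_cd(combo_p, combo_q, order=1):
    """Common-drug kernel of order 1 (Equation 6) or order 2 (Equation 7)."""
    if order == 1:
        return tanimoto(set(combo_p), set(combo_q))
    if order == 2:
        return tanimoto(expand_order2(combo_p), expand_order2(combo_q))
    raise ValueError("this sketch only covers orders 1 and 2")


d_p = {"atorvastatin", "simvastatin", "fenofibrate"}
d_q = {"atorvastatin", "simvastatin", "rosuvastatin"}
k1 = k_cd(d_p, d_q, order=1)  # 2 shared drugs out of 4 distinct drugs
k2 = k_cd(d_p, d_q, order=2)  # 2 shared drugs + 1 shared pair out of 9 distinct elements
```

For these two combinations the order-1 value is 2/4 and the order-2 value is 3/9, which shows how the pair terms enlarge the denominator faster than the numerator when the overlap is only partial.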

Drug combination kernels from drug similarities

The drug combination similarities can also be measured by the average drug similarities. The hypothesis is that if two drug combinations have drugs that are similar on average, they may share similar properties. Therefore, we define an average-drug-similarity based kernel for drug combinations, denoted as $\mathop{\mathcal{K}_{\text{ds}}}\limits$ , as follows,

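$$\mathcal{K}_{\text{ds}}(D_{p},D_{q})=\frac{1}{k_{p}k_{q}}\sum_{\mathtt{d}_{i}\in D_{p}}\sum_{\mathtt{d}_{j}\in D_{q}}\text{SDS}(\mathtt{d}_{i},\mathtt{d}_{j})$$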

where $k_{p}$ and $k_{q}$ are the orders of $\mathop{D_{p}}\limits$ and $\mathop{D_{q}}\limits$ , respectively, and $\mathop{\text{SDS}}\limits$ can be $\mathop{\text{SDS}_{\text{2d}}}\limits$ or $\mathop{\text{SDS}_{\text{cm}}}\limits$ . Intuitively, $\mathop{\mathcal{K}_{\text{ds}}}\limits$ tends to capture averaged and smoothed drug combination similarities. It has been proved that as long as the involved $\mathop{\text{SDSs}}\limits$ are valid kernels (i.e., positive semi-definite), $\mathop{\mathcal{K}_{\text{ds}}}\limits$ will also be a valid kernel [16].
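As an illustration of the averaging idea, the following sketch computes the mean pairwise single drug similarity between two combinations. The similarity values, the helper `toy_sds` and the drug names are made up for the example; in the paper the similarity would be $\text{SDS}_{\text{2d}}$ or $\text{SDS}_{\text{cm}}$.

```python
def k_ds(combo_p, combo_q, sds):
    """Average single drug similarity over all cross-combination drug pairs."""
    total = sum(sds(di, dj) for di in combo_p for dj in combo_q)
    return total / (len(combo_p) * len(combo_q))


# Hypothetical symmetric similarity table (not real drug data).
TOY_SIM = {
    frozenset(("a", "b")): 0.5,
    frozenset(("a", "c")): 0.2,
    frozenset(("b", "c")): 0.4,
}


def toy_sds(x, y):
    """Toy similarity: 1.0 for identical drugs, table lookup otherwise."""
    if x == y:
        return 1.0
    return TOY_SIM.get(frozenset((x, y)), 0.0)


p = ["a", "b"]
q = ["b", "c"]
print(k_ds(p, q, toy_sds))  # mean of 0.5, 0.2, 1.0 and 0.4
```

Because every drug in one combination is compared with every drug in the other, a single highly similar pair is diluted by the remaining pairs, which is the smoothing behavior described above.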

Probabilistic drug combination kernels from drug sets

We apply an ensemble kernel for drug combinations based on the idea in [25]. The key idea is to use a reproducing kernel to characterize sample similarities (i.e., $\mathop{\text{SDS}}\limits$ ), and to use a probabilistic distance in the reproducing kernel Hilbert space (RKHS) to measure the ensemble similarity. The resulting ensemble similarity matrix is a valid kernel matrix, denoted as $\mathop{\mathcal{K}_{\text{pb}}}\limits$ . This ensemble involves an eigenvalue decomposition, during which some similarity matrices may become numerically degenerate, causing the $\mathop{\mathcal{K}_{\text{pb}}}\limits$ calculation to fail. To deal with this issue, we increase the diagonals of the involved square matrices by a small value to guarantee that they are positive semi-definite.
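The diagonal adjustment can be sketched as follows. This is only an illustration of the general trick, not the authors' implementation: it uses a plain Cholesky attempt as a stand-in check for strict positive definiteness, and the names `cholesky_ok`, `add_jitter`, `stabilize` and the jitter schedule are assumptions.

```python
import math


def cholesky_ok(matrix):
    """Return True if a Cholesky factorization succeeds (positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - s
                if pivot <= 0.0:
                    return False  # zero or negative pivot: factorization fails
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return True


def add_jitter(matrix, eps):
    """Return a copy of `matrix` with `eps` added to every diagonal entry."""
    return [
        [v + (eps if i == j else 0.0) for j, v in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


def stabilize(matrix, eps=1e-8, max_tries=10):
    """Grow the diagonal jitter tenfold until the factorization succeeds."""
    current = matrix
    for _ in range(max_tries):
        if cholesky_ok(current):
            return current
        current = add_jitter(matrix, eps)
        eps *= 10.0
    raise ValueError("matrix could not be stabilized")


singular = [[1.0, 1.0], [1.0, 1.0]]  # rank-deficient similarity matrix
print(cholesky_ok(singular), cholesky_ok(stabilize(singular)))
```

Adding a small constant to the diagonal shifts every eigenvalue up by that constant, so a matrix that is singular or slightly indefinite because of rounding becomes usable while the off-diagonal similarities stay unchanged.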

Materials

Mining drug combinations

We extract high-order drug combinations from the FDA Adverse Event Reporting System (FAERS) [26]. We use myopathy as the $\mathop{\text{ADR}}\limits$ of particular interest, and extract 64,892 case (myopathy) events, in which patients report myopathy after taking multiple drugs, and 1,475,840 control (non-myopathy) events, in which patients do not report myopathy after taking drugs. Each of these events involves a combination of more than one drug.

Among all the involved drug combinations, 10,250 unique drug combinations appear in both case and control events. For those 10,250 drug combinations, we use the Odds Ratio ( $\mathop{\mathtt{OR}}\limits$ ) to quantify their $\mathop{\text{ADR}}\limits$ risks. The $\mathop{\mathtt{OR}}\limits$ for a drug combination $\mathop{D}\limits$ is defined based on the contingency table in Table 2, that is, it is the ratio of the following two values: 1) the odds that the $\mathop{\text{ADR}}\limits$ occurs when $\mathop{D}\limits$ is taken (i.e., $\frac{n_{1}}{m_{1}}$ in Table 2); and 2) the odds that the $\mathop{\text{ADR}}\limits$ occurs when $\mathop{D}\limits$ is not taken (i.e., $\frac{n_{2}}{m_{2}}$ in Table 2). $\mathop{\mathtt{OR}}\limits$ $<$ 1 indicates a decreased risk of $\mathop{\text{ADR}}\limits$ after a patient takes the drug combination, $\mathop{\mathtt{OR}}\limits$ $=1$ indicates no risk change, and $\mathop{\mathtt{OR}}\limits$ $>$ 1 indicates an increased risk. Among the 10,250 drug combinations, 8,986 combinations have $\mbox{$ \mathop{\mathtt{OR}}\limits $}>1$ and 1,264 combinations have $\mbox{$ \mathop{\mathtt{OR}}\limits $}<1$ . These two sets of drug combinations are denoted as $\mathop{\mathcal{M}^{0}}\limits$ and $\mathop{\mathcal{N}^{0}}\limits$ , respectively. In addition to these combinations, there are 27,387 unique drug combinations that only appear in case events and 621,449 unique drug combinations that only appear in control events. These two sets are denoted as $\mathop{\mathcal{M}^{+}}\limits$ and $\mathop{\mathcal{N}^{-}}\limits$ , respectively. 
The set of drug combinations in case events is denoted as $\mathop{\mathcal{M}}\limits$ (i.e., $\mbox{$ \mathop{\mathcal{M}}\limits $}=\mbox{$ \mathop{\mathcal{M}^{+}}\limits $}\cup\mbox{$ \mathop{\mathcal{M}^{0}}\limits $}$ ), and the set of drug combinations in control events is denoted as $\mathop{\mathcal{N}}\limits$ (i.e., $\mbox{$ \mathop{\mathcal{N}}\limits $}=\mbox{$ \mathop{\mathcal{N}^{-}}\limits $}\cup\mbox{$ \mathop{\mathcal{N}^{0}}\limits $}$ ). All these four sets together define a high-order drug combination dataset from FAERS, denoted as $\mathop{\mathcal{D}_{\text{FAERS}}}\limits$ . Table 3 presents the statistics of $\mathop{\mathcal{D}_{\text{FAERS}}}\limits$ .
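The odds ratio and the resulting split into $\mathcal{M}^{0}$ and $\mathcal{N}^{0}$ can be sketched as below. The counts follow the reading of Table 2 given in the text ($n_1$, $m_1$: case and control events with $D$; $n_2$, $m_2$: case and control events without $D$); the function names and the example counts are hypothetical.

```python
def odds_ratio(n1, m1, n2, m2):
    """OR = (odds of the ADR when D is taken) / (odds when D is not taken)."""
    return (n1 / m1) / (n2 / m2)


def risk_set(n1, m1, n2, m2):
    """Assign a combination seen in both case and control events to M0 or N0."""
    value = odds_ratio(n1, m1, n2, m2)
    if value > 1:
        return "M0"  # increased ADR risk
    if value < 1:
        return "N0"  # decreased ADR risk
    return None      # OR == 1: no risk change


# Made-up counts: 30 case / 10 control events with D,
# 100 case / 400 control events without D.
print(odds_ratio(30, 10, 100, 400), risk_set(30, 10, 100, 400))
```

Note that the ratio is only defined when $m_1$, $n_2$ and $m_2$ are non-zero, which holds for combinations that appear in both case and control events.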

Training data generation

As shown in Table 3, $\mathop{\mathcal{M}^{+}}\limits$ and $\mathop{\mathcal{M}^{0}}\limits$ of $\mathop{\mathcal{D}_{\text{FAERS}}}\limits$ have fewer drug combinations than $\mathop{\mathcal{N}^{-}}\limits$ and $\mathop{\mathcal{N}^{0}}\limits$ , and the drug combinations in $\mathop{\mathcal{M}^{+}}\limits$ are very infrequent (average frequency 1.402). To use more frequent and more confident drug combinations from case events, we further pruned drug combinations from $\mathop{\mathcal{M}^{+}}\limits$ and $\mathop{\mathcal{M}^{0}}\limits$ as follows. From $\mathop{\mathcal{M}^{+}}\limits$ , we retained the top 1,000 most frequent drug combinations. For $\mathop{\mathcal{M}^{0}}\limits$ , we applied right-tailed Fisher’s exact test on the drug combinations to further test the significance of their $\mathop{\mathtt{OR}}\limits$ s at 5% significance level. Then we retained drug combinations with statistically significant $\mathop{\mathtt{OR}}\limits$ s. Thus, the pruned $\mathop{\mathcal{M}^{+}}\limits$ and $\mathop{\mathcal{M}^{0}}\limits$ contain statistically confident drug combinations, which are very likely to induce myopathy, and therefore, these drug combinations are labeled as positive instances for classification model learning.
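A right-tailed Fisher's exact test on the 2x2 contingency table can be written with the hypergeometric distribution, as in the sketch below. It is a standard-library illustration rather than the authors' code; the function name and the toy counts are assumptions, and only the 5% significance level comes from the text.

```python
from math import comb


def fisher_right_tail(n1, m1, n2, m2):
    """P-value of the right-tailed Fisher's exact test.

    n1, m1: case and control events in which the combination is taken.
    n2, m2: case and control events in which it is not taken.
    Returns P(X >= n1) with the table margins held fixed.
    """
    with_d = n1 + m1             # events with the combination
    cases = n1 + n2              # events with the ADR
    total = n1 + m1 + n2 + m2
    denom = comb(total, with_d)
    upper = min(with_d, cases)
    tail = sum(
        comb(cases, x) * comb(total - cases, with_d - x)
        for x in range(n1, upper + 1)
    )
    return tail / denom


ALPHA = 0.05  # significance level used in the text

# Toy tables: (n1, m1, n2, m2), not real FAERS counts.
tables = {"combo_x": (9, 1, 11, 29), "combo_y": (3, 1, 1, 3)}
kept = [name for name, t in tables.items() if fisher_right_tail(*t) < ALPHA]
print(kept)
```

The right tail asks how likely it is to see at least as many case events with the combination as observed if the combination and the ADR were independent, so only combinations whose OR exceeds 1 by more than chance would explain are retained.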

We retained all 1,264 drug combinations in $\mathop{\mathcal{N}^{0}}\limits$ because this set is not large and contains informative drug combinations that may or may not induce myopathy. We further pruned $\mathop{\mathcal{N}^{-}}\limits$ and retained its most frequent drug combinations. The drug combinations from $\mathop{\mathcal{N}^{0}}\limits$ and the pruned $\mathop{\mathcal{N}^{-}}\limits$ are labeled as negative instances. To balance the positive and negative training sets, we retained 2,200 drug combinations from $\mathop{\mathcal{N}^{-}}\limits$ . The pruned dataset from $\mathop{\mathcal{D}_{\text{FAERS}}}\limits$ is denoted as $\mathop{\mathcal{D}^{*}}\limits$ . Table 3 presents the description of $\mathop{\mathcal{D}^{*}}\limits$ . $\mathop{\mathcal{D}^{*}}\limits$ is the set of labeled drug combinations that are used for model learning. In $\mathop{\mathcal{D}^{*}}\limits$ , there are in total 1,210 drugs involved. 71 out of these 1,210 drugs induce myopathy on their own based on the Side Effect Resource (SIDER) [27]. This set of 71 drugs is denoted as $\mathop{\mathcal{D}_{\text{Myo}}}\limits$ .

Evaluation protocol and metrics

The performance of the different methods is evaluated through five-fold cross validation. The dataset is randomly split into five folds of equal size (i.e., the same number of drug combinations). Four folds are used for model training and the remaining fold is used for testing. This process is performed five times, with a different fold used for testing each time. The final result is the average over the five experiments.
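The protocol above can be sketched with a small index-splitting helper. The function name, the seed and the placeholder items are assumptions; the model training and scoring steps are left out.

```python
import random


def five_fold_splits(items, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) for each of the n_folds rounds."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)  # random assignment to folds
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        test_idx = folds[held_out]
        train_idx = [
            i for f, fold in enumerate(folds) if f != held_out for i in fold
        ]
        yield train_idx, test_idx


combos = [f"combo_{i}" for i in range(10)]  # placeholder drug combinations
for train_idx, test_idx in five_fold_splits(combos):
    print(len(train_idx), len(test_idx))
```

Each drug combination lands in the test fold exactly once, so averaging a metric over the five rounds uses every labeled instance for testing once and for training four times.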

We use accuracy, precision, recall, F1 and AUC to evaluate the performance of the methods. Accuracy is defined as the fraction of all correctly classified instances (i.e., true positives and true negatives) over all the instances in the testing set. Precision is defined as the fraction of correctly classified positive instances (i.e., true positives) over all instances that are classified as positive instances (i.e., true positives and false positives). Recall is the fraction of correctly classified positive instances (i.e., true positives) over all positive instances in the testing set (i.e., true positives and false negatives). F1 is the harmonic mean of precision and recall. The AUC score is the normalized area under the curve that plots the true positive rate against the false positive rate for different classification thresholds [28]. Larger accuracy, precision, recall, F1 and AUC values indicate better classification performance.
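The metric definitions above translate directly into code. The sketch below is a plain-Python illustration with made-up labels and scores; it computes AUC through the equivalent rank formulation (the probability that a randomly chosen positive instance scores higher than a randomly chosen negative one, with ties counted as one half). The function names and the 0.5 threshold are assumptions.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and prediction values.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5
print(classification_metrics(y_true, y_pred), auc(y_true, scores))
```

Accuracy, precision, recall and F1 depend on the chosen threshold, whereas AUC only depends on how the prediction values rank positives against negatives.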

Results

Overall performance

Table 4 presents the performance comparison among the four different kernels in combination with different single drug similarities on dataset $\mathop{\mathcal{D}^{*}}\limits$ . Kernel $\mathop{\mathcal{K}_{\text{gm}}}\limits$ with $\mathop{\text{SDS}_{\text{cm}}}\limits$ outperforms the others in three (i.e., accuracy, F1 and AUC) out of five evaluation metrics. Specifically, in accuracy, $\mathop{\mathcal{K}_{\text{gm}}}\limits$ with $\mathop{\text{SDS}_{\text{cm}}}\limits$ outperforms the second best kernel $\mathop{\mathcal{K}_{\text{gm}}}\limits$ with $\mathop{\text{SDS}_{\text{2d}}}\limits$ by 0.84%. In F1, $\mathop{\mathcal{K}_{\text{gm}}}\limits$ with $\mathop{\text{SDS}_{\text{cm}}}\limits$ outperforms the second best kernels $\mathop{\mathcal{K}_{\text{gm}}}\limits$ with $\mathop{\text{SDS}_{\text{2d}}}\limits$ and $\mathop{\mathcal{K}_{\text{ds}}}\limits$ with $\mathop{\text{SDS}_{\text{cm}}}\limits$ by 0.98%. In AUC, $\mathop{\mathcal{K}_{\text{gm}}}\limits$ with $\mathop{\text{SDS}_{\text{cm}}}\limits$ outperforms the second best kernel order-2 $\mathop{\mathcal{K}_{\text{cd}}}\limits$ by 0.33%. In precision and recall, $\mathop{\mathcal{K}_{\text{gm}}}\limits$ with $\mathop{\text{SDS}_{\text{cm}}}\limits$ is the second best kernel, whereas $\mathop{\mathcal{K}_{\text{ds}}}\limits$ with $\mathop{\text{SDS}_{\text{cm}}}\limits$ and $\mathop{\mathcal{K}_{\text{ds}}}\limits$ with $\mathop{\text{SDS}_{\text{2d}}}\limits$ are the best, respectively. Overall, $\mathop{\mathcal{K}_{\text{gm}}}\limits$ with $\mathop{\text{SDS}_{\text{cm}}}\limits$ has the best performance compared to the other kernels. This indicates that it is effective to classify drug combinations by representing and comparing them as graphs (i.e., a set of drugs and their co-medication relations within the set), and measuring such graph similarities using their optimal matching (i.e., the optimal correspondence among drugs). 
In the following discussion, we use $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ to represent $\mathop{\mathcal{K}_{\text{gm}}}\limits$ with $\mathop{\text{SDS}_{\text{cm}}}\limits$ . More experimental results on other datasets are available in the supplementary materials (see Additional file 1).

$\mathop{\text{SDS}}\limits$ performance

Table 4 shows that $\mathop{\text{SDS}_{\text{cm}}}\limits$ on average outperforms $\mathop{\text{SDS}_{\text{2d}}}\limits$ across different kernels (with a few exceptions in precision for $\mathop{\mathcal{K}_{\text{ds}}}\limits$ and $\mathop{\mathcal{K}_{\text{pb}}}\limits$ ). $\mathop{\text{SDS}_{\text{2d}}}\limits$ considers drug intrinsic 2D structures. However, drug efficacy and side effects are the results of many complicated interactions and processes among drugs and various bioentities, which may not be sufficiently explained only by drug 2D structures. Compared to $\mathop{\text{SDS}_{\text{2d}}}\limits$ , $\mathop{\text{SDS}_{\text{cm}}}\limits$ measures drug similarity based on their co-medication patterns, which could be regarded as a high-level abstraction and representation of drug therapeutic properties that may or may not be explicitly explained by each drug and its intrinsic properties independently.

In $\mathop{\mathcal{K}_{\text{cd}}}\limits$ , order-2 representation (i.e., in $\mathop{\mathcal{K}^{(2)}_{\text{cd}}}\limits$ ) for drug combinations outperforms order-1 representation (i.e., in $\mathop{\mathcal{K}^{(1)}_{\text{cd}}}\limits$ ). In order-2 representation, in addition to single drugs, drug pairs are also used as a feature for a drug combination, which stresses the signals in drug combinations. This also conforms to common observations in other applications [29], in which higher-order features improve classification performance.

Classification

Figures 2 and 3 present the $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ prediction values with respect to drug combination orders. In Figure 2, $\mathop{\mathcal{M}^{+}}\limits$ drug combinations have higher orders (on average 7.615 as in Table 3), and higher and mostly positive prediction values, while $\mathop{\mathcal{N}^{-}}\limits$ drug combinations have lower orders (on average 2.678), and lower and mostly negative prediction values. Meanwhile, the mis-classification typically happens on $\mathop{\mathcal{N}^{-}}\limits$ drug combinations of higher orders, and on $\mathop{\mathcal{M}^{+}}\limits$ drug combinations of lower orders. Similar trends apply for $\mathop{\mathcal{M}^{0}}\limits$ and $\mathop{\mathcal{N}^{0}}\limits$ in Figure 3. This indicates that $\mathop{\mathcal{K}_{\text{gm}}}\limits$ and $\mathop{\text{SDS}_{\text{cm}}}\limits$ together are able to learn and make predictions that correspond to drug combination orders. In addition, drug combination orders are correlated with their $\mathop{\text{ADR}}\limits$ labels.

Figure 4 presents the $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ prediction values with respect to drug combination frequencies for $\mathop{\mathcal{M}^{+}}\limits$ and $\mathop{\mathcal{N}^{-}}\limits$ . $\mathop{\mathcal{M}^{+}}\limits$ drug combinations have lower frequencies (on average 5.520 as in Table 3), and higher and mostly positive prediction values, while $\mathop{\mathcal{N}^{-}}\limits$ drug combinations have higher frequencies (on average 42.082), and lower and mostly negative prediction values. For $\mathop{\mathcal{N}^{-}}\limits$ , the mis-classification typically happens on lower-frequency drug combinations (the mis-classification for $\mathop{\mathcal{M}^{+}}\limits$ does not show strong patterns with respect to drug combination frequencies). For $\mathop{\mathcal{M}^{+}}\limits$ and $\mathop{\mathcal{N}^{-}}\limits$ , drug combination frequencies are used to define $\mathop{\text{ADR}}\limits$ labels. Figure 4 shows that $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ is able to learn and make predictions that correspond to drug combination frequencies and thus $\mathop{\text{ADR}}\limits$ labels.

Figure 5 presents the $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ prediction values with respect to $\mathop{\mathtt{OR}}\limits$ values for $\mathop{\mathcal{M}^{0}}\limits$ and $\mathop{\mathcal{N}^{0}}\limits$ . $\mathop{\mathcal{M}^{0}}\limits$ drug combinations have higher $\mathop{\mathtt{OR}}\limits$ values and also higher and mostly positive prediction values, while $\mathop{\mathcal{N}^{0}}\limits$ drug combinations have lower $\mathop{\mathtt{OR}}\limits$ values and also lower and mostly negative prediction values. For $\mathop{\mathcal{N}^{0}}\limits$ , the mis-classification typically happens on drug combinations of higher $\mathop{\mathtt{OR}}\limits$ values (close to 1 and thus leaning more toward $\mathop{\text{ADR}}\limits$ ; the mis-classification for $\mathop{\mathcal{M}^{0}}\limits$ does not show strong patterns with respect to $\mathop{\mathtt{OR}}\limits$ values). As we use $\mathop{\mathtt{OR}}\limits$ to define $\mathop{\text{ADR}}\limits$ labels on $\mathop{\mathcal{M}^{0}}\limits$ and $\mathop{\mathcal{N}^{0}}\limits$ , Figure 5 shows that $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ is able to produce reasonably accurate prediction values on the drug combinations.

$\mathop{\mathcal{D}_{\text{Myo}}}\limits$ drug enrichment

Table 5 presents the average percentage of $\mathop{\mathcal{D}_{\text{Myo}}}\limits$ drugs among all the drug combinations. For each drug combination, the percentage is calculated as the number of its drugs that can cause myopathy on their own (i.e., drugs in $\mathop{\mathcal{D}_{\text{Myo}}}\limits$ ) divided by the drug combination order. As Table 5 shows, top-10 mis-classified $\mathop{\mathcal{N}}\limits$ drug combinations (i.e., $\mathop{\tilde{\mathcal{N}}^{10+}}\limits$ ) have almost twice as many $\mathop{\mathcal{D}_{\text{Myo}}}\limits$ drugs (30.7%) as those in $\mathop{\mathcal{N}}\limits$ (15.6%), and even more than those in $\mathop{\mathcal{M}}\limits$ drug combinations (24.3%). In addition, mis-classified $\mathop{\mathcal{N}}\limits$ drug combinations (i.e., $\mathop{\tilde{\mathcal{N}}^{+}}\limits$ ) also have significantly more $\mathop{\mathcal{D}_{\text{Myo}}}\limits$ drugs (18.6%) than those in $\mathop{\mathcal{N}}\limits$ (15.6%). Since $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ matches similar drugs, high $\mathop{\mathcal{D}_{\text{Myo}}}\limits$ drug enrichment could be a primary reason for the mis-classification.
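The enrichment statistic described above (per-combination fraction of $\mathcal{D}_{\text{Myo}}$ drugs, averaged over a set of combinations) can be sketched as follows. The drug names and the contents of the toy myopathy set are placeholders, not data from SIDER or FAERS.

```python
def myo_fraction(combo, myo_drugs):
    """Number of drugs in `combo` that are in `myo_drugs`, over the order."""
    return sum(1 for drug in combo if drug in myo_drugs) / len(combo)


def average_myo_percentage(combos, myo_drugs):
    """Average per-combination percentage of myopathy-inducing drugs."""
    fractions = [myo_fraction(combo, myo_drugs) for combo in combos]
    return 100.0 * sum(fractions) / len(fractions)


# Placeholder data for illustration only.
toy_myo = {"drug_a", "drug_b"}
toy_combos = [
    ("drug_a", "drug_b"),                      # 2 of 2 drugs in the set
    ("drug_a", "drug_c", "drug_d", "drug_e"),  # 1 of 4 drugs in the set
]
print(average_myo_percentage(toy_combos, toy_myo))
```

Because each combination is normalized by its own order before averaging, a short combination made up entirely of such drugs weighs as much as a long combination that contains only one.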

Top predictions

Top mis-classification on $\mathop{\mathcal{N}}\limits$

Table 6 lists the top-10 (in terms of prediction values) drug combinations in $\mathop{\mathcal{N}}\limits$ (i.e., without myopathy) that are mis-classified as positive (i.e., with myopathy) by $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ . For those drug combinations which appear in $\mathop{\mathcal{N}^{0}}\limits$ , we present their $\mathop{\mathtt{OR}}\limits$ values, otherwise only frequencies. Those top mis-classified $\mathop{\mathcal{N}}\limits$ drug combinations contain many single drugs, which on their own can induce myopathy (i.e., in $\mathop{\mathcal{D}_{\text{Myo}}}\limits$ , bold in Table 6). As a matter of fact, the percentage of $\mathop{\mathcal{D}_{\text{Myo}}}\limits$ drugs in top mis-classified $\mathop{\mathcal{N}}\limits$ drug combinations is significantly higher than average. In Table 6, one special mis-classified $\mathop{\mathcal{N}}\limits$ drug combination is {lansoprazole omeprazole pantoprazole rabeprazole}, which does not contain any $\mathop{\mathcal{D}_{\text{Myo}}}\limits$ drugs. This set of drugs is commonly used as proton pump inhibitors (PPIs) to decrease the amount of acid produced in the stomach. Some case studies show evidence of causality between the PPI drug class and myopathy [30, 31].

Top prediction on $\mathop{\mathcal{M}}\limits$

Table 7 presents the top-10 correctly predicted $\mathop{\mathcal{M}}\limits$ drug combinations by $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ . These drug combinations are significantly enriched with $\mathop{\mathcal{D}_{\text{Myo}}}\limits$ drugs (i.e., drugs that can induce myopathy on their own). As Table 5 shows, $\mathop{\tilde{\mathcal{M}}^{10+}}\limits$ has the most $\mathop{\mathcal{D}_{\text{Myo}}}\limits$ drugs (89.8%) compared to all the other sets and significantly more than $\mathop{\mathcal{M}}\limits$ . In particular, all of these combinations contain statin drugs (e.g., atorvastatin, simvastatin and rosuvastatin). These statin-related drugs have been studied in the literature as a drug class that is highly likely to induce myopathy [32, 33]. In addition, in Table 7, 4 out of the 6 $\mathop{\mathcal{M}^{0}}\limits$ drug combinations among the top 10 (i.e., the drug combinations that have $\mathop{\mathtt{OR}}\limits$ values) have $\mathop{\mathtt{OR}}\limits$ values higher than the average in $\mathop{\mathcal{M}^{0}}\limits$ (31.998 as in Table 3), and 3 out of the 4 $\mathop{\mathcal{M}^{+}}\limits$ drug combinations among the top 10 (i.e., the drug combinations that do not have $\mathop{\mathtt{OR}}\limits$ values) have frequencies higher than the average in $\mathop{\mathcal{M}^{+}}\limits$ (5.520 as in Table 3). Furthermore, among the top-20 drug combinations predicted by $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ , 7 out of 12 $\mathop{\mathcal{M}^{0}}\limits$ drug combinations have $\mathop{\mathtt{OR}}\limits$ values higher than the average in $\mathop{\mathcal{M}^{0}}\limits$ , and 5 out of 8 $\mathop{\mathcal{M}^{+}}\limits$ drug combinations have frequencies higher than the average in $\mathop{\mathcal{M}^{+}}\limits$ . 
The average $\mathop{\mathtt{OR}}\limits$ values of the top-10, top-20 and top-50 drug combinations from $\mathop{\mathcal{M}^{0}}\limits$ predicted by $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ are 55.725, 42.956 and 42.114, respectively, and they are all higher than the average 31.998 for $\mathop{\mathcal{M}^{0}}\limits$ . The average frequencies of the top-10, top-20 and top-50 drug combinations from $\mathop{\mathcal{M}^{+}}\limits$ predicted by $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ are 8.250, 6.875 and 6.524, respectively, and they are also all higher than the average 5.520 on $\mathop{\mathcal{M}^{+}}\limits$ . This indicates that $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ does learn signals from $\mathop{\mathcal{M}}\limits$ and correspondingly makes predictions.

Top non- $\mathop{\mathcal{D}_{\text{Myo}}}\limits$ prediction on $\mathop{\mathcal{M}}\limits$

Table 8 presents the top-10 correctly predicted $\mathop{\mathcal{M}}\limits$ drug combinations by $\mathop{\mathcal{K}_{\text{gm}}^{\text{cm}}}\limits$ that do not contain any drugs from $\mathop{\mathcal{D}_{\text{Myo}}}\limits$ (i.e., do not contain drugs that can induce myopathy on their own). 4 out of 7 drug combinations from $\mathop{\mathcal{M}^{0}}\limits$ in this table have their $\mathop{\mathtt{OR}}\limits$ values higher than the average in $\mathop{\mathcal{M}^{0}}\limits$ (31.998 as in Table 3). The 3 drug combinations from $\mathop{\mathcal{M}^{+}}\limits$ in this table have their frequency lower than the average in $\mathop{\mathcal{M}^{+}}\limits$ (5.520 as in Table 3) but very close. In Table 8, 8 out of top-10 drug combinations include alendronate. Case studies demonstrate that several events of severe muscle pain, which is the common symptom of myopathy, were reported after patients started therapy with alendronate [34], showing the association between the medical treatment with alendronate and myopathy.

Discussions

The experimental results show that the new methods with drug co-medication based single drug similarities outperform other kernels, such as convolutional kernels [16] and probabilistic kernels [25], and can accurately predict whether a drug combination is likely to induce ADRs of interest. The experimental results demonstrate the advantage of such single drug similarities that leverage co-medication patterns among high-order drug-drug interactions, and also inspire further exploration that learns such similarities in a purely data-driven fashion without pre-defined kernels, for example, via manifold learning. Further research would also include learning drug representations in a data-driven fashion such that the representations better quantify drug similarities in terms of their co-medication patterns. Deep learning would be a promising option for such drug representation learning.

Conclusions

In this manuscript, SVM-based classification methods were developed to predict whether a drug combination of arbitrary orders is likely to induce adverse drug reactions. Novel kernels over drug combinations of arbitrary orders were developed for such classification. These kernels were constructed from various single-drug information including drug co-medication patterns, and they compare drug combinations based on the single drugs they contain and the relations among those single drugs. Specifically, a novel kernel over drug combinations of arbitrary orders was developed based on graph matching over drug combination graphs. A dataset from the FDA Adverse Event Reporting System (FAERS) was constructed to test the new methods. The experimental results demonstrated that the new methods with drug co-medication based single drug similarities and graph matching based kernels achieve the best AUC of 0.912. The prediction also revealed strong patterns among drug combinations (e.g., statin enriched) that may be highly correlated with their induced ADRs.

List of abbreviations

DDI: Drug-Drug Interactions; ADR: Adverse Drug Reaction; SDS: Single Drug Similarities; ECFP: Extended Connectivity Fingerprints; and $\mathop{\mathtt{OR}}\limits$ : Odds Ratio.

Declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Availability of data and material

The data and materials will be made publicly available upon the acceptance of the manuscript.

Competing interests

The authors declare that they have no competing interests.

Funding

This material is based upon work supported by the National Science Foundation under Grant Number IIS-1566219 and IIS-1622526. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Author’s contributions

Wen-Hao Chiang implemented the methods and conducted the experiments. Li Shen, Lang Li and Xia Ning developed the methods and designed the experiments. Xia Ning analyzed the experimental results. Wen-Hao Chiang and Xia Ning wrote the manuscript.

Figures

Additional Files

Additional file 1 — Drug-Drug Interaction Prediction based on Co-Medication Patterns and Graph Matching (Supplementary Materials)

The additional file, named “supp.pdf”, includes more experimental results on other datasets and is provided in PDF format. Any PDF reader can be used to view the file.

Tables

Bibliography

[1] Ramirez, E., Carcas, A.J., Borobia, A.M., Lei, S.H., Piñana, E., Fudio, S., Frias, J.: A pharmacovigilance program from laboratory signals for the detection and reporting of serious adverse drug reactions in hospitalized patients. Clinical Pharmacology & Therapeutics 87 (1), 74–86 (2010). doi: 10.1038/clpt.2009.185
[2] Percha, B., Altman, R.B.: Informatics confronts drug-drug interactions. Trends in pharmacological sciences 34 (3), 178–184 (2013). doi: 10.1016/j.tips.2013.01.006
[3] The National Health and Nutrition Examination Survey. http://www.cdc.gov/NCHS/NHANES.htm
[4] Iyer, S.V., Harpaz, R., Le Pendu, P., Bauer-Mehren, A., Shah, N.H.: Mining clinical text for signals of adverse drug-drug interactions. Journal of the American Medical Informatics Association 21 (2), 353–362 (2014)
[5] Vilar, S., Uriarte, E., Santana, L., Lorberbaum, T., Hripcsak, G., Friedman, C., Tatonetti, N.P.: Similarity-based modeling in large-scale prediction of drug-drug interactions. Nature protocols 9 (9), 2147–2163 (2014)
[6] Hammann, F., Drewe, J.: Data mining for potential adverse drug–drug interactions. Expert Opinion on Drug Metabolism & Toxicology 10 (5), 665–671 (2014). doi: 10.1517/17425255.2014.894507. PMID: 24588496
[7] Luo, H., Zhang, P., Huang, H., Huang, J., Kao, E., Shi, L., He, L., Yang, L.: Ddi-cpi, a server that predicts drug–drug interactions through implementing the chemical–protein interactome. Nucleic Acids Research (2014). doi: 10.1093/nar/gku433
[8] Harpaz, R., Du Mouchel, W., Shah, N.H., Madigan, D., Ryan, P., Friedman, C.: Novel data-mining methodologies for adverse drug event discovery and analysis. Clinical Pharmacology & Therapeutics 91 (6), 1010–1021 (2012). doi: 10.1038/clpt.2012.50
