Finding melanoma drugs through a probabilistic knowledge graph



James P. McCusker1, Michel Dumontier2, Rui Yan1, Sylvia He1,
Jonathan S. Dordick3,4 and Deborah L. McGuinness1,4

1 Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA
2 Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA
3 Department of Chemical & Biological Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
4 Center for Biotechnology & Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, NY, USA

ABSTRACT
Metastatic cutaneous melanoma is an aggressive skin cancer with some progression-

slowing treatments but no known cure. The omics data explosion has created many

possible drug candidates; however, filtering criteria remain challenging, and systems

biology approaches have become fragmented with many disconnected databases.

Using drug, protein and disease interactions, we built an evidence-weighted

knowledge graph of integrated interactions. Our knowledge graph-based system,

ReDrugS, can be used via an application programming interface or web interface,

and has generated 25 high-quality melanoma drug candidates. We show that

probabilistic analysis of systems biology graphs increases drug candidate quality

compared to non-probabilistic methods. Four of the 25 candidates are novel

therapies, three of which have been tested with other cancers. All other candidates

have current or completed clinical trials, or have been studied in vivo or in vitro.

This approach can be used to identify candidate therapies for use in research or

personalized medicine.

Subjects Bioinformatics, Computational Biology, Data Science, World Wide Web and Web Science
Keywords Melanoma, Knowledge graphs, Drug repositioning, Uncertainty reasoning

INTRODUCTION
Metastatic cutaneous melanoma is an aggressive cancer of the skin with low prevalence

but very high mortality rate, with an estimated 5-year survival rate of 6% (Barth,

Wanek & Morton, 1995). There are currently no known therapies that can consistently

cure metastatic melanoma. Vemurafenib is effective against BRAF mutant melanomas

(Chapman et al., 2011) but resistant cells often result in recurrence of metastases (Le et al.,

2013). Melanoma itself may be best approached based on the individual genetics of the

tumor, as it has been shown to involve mutations in many different genes to produce

the same disease (Krauthammer et al., 2015). Because of this, an individualized approach

may be necessary to find effective treatments.

Drug repurposing, or the discovery of new uses for existing approved drugs, can often

lead to effective new treatments for diseases. A wide range of computational methods

have been developed in support of drug repositioning. Computational approaches

(Sanseau & Koehler, 2011) include topic modeling (Bisgin et al., 2012, 2014), side-effect

How to cite this article McCusker et al. (2017), Finding melanoma drugs through a probabilistic knowledge graph. PeerJ Comput. Sci.
3:e106; DOI 10.7717/peerj-cs.106

Submitted 27 April 2016
Accepted 27 December 2016
Published 13 February 2017

Corresponding authors
James P. McCusker,

mccusj@cs.rpi.edu

Deborah L. McGuinness,

dlm@cs.rpi.edu

Academic editor
Yonghong Peng


Copyright
2017 McCusker et al.

Distributed under
Creative Commons CC-BY 4.0



similarity (Yang & Agarwal, 2011; Ye, Liu & Wei, 2014), drug and/or disease similarity

(Chiang & Butte, 2009; Gottlieb et al., 2011), genome-wide association studies (Kingsmore

et al., 2008; Grover et al., 2014), and gene expression (Lamb et al., 2006; Sirota et al., 2011).

Systems biology has also provided a number of network analysis approaches (Yang &

Agarwal, 2011; Wu, Wang & Chen, 2013; Cheng et al., 2012; Emig et al., 2013; Harrold,

Ramanathan & Mager, 2013; Wu et al., 2013; Vogt, Prinz & Campillos, 2014) but the

field has been limited by a fragmentation of databases. Most systems biology databases

are not aligned with each other, and typically leave out crucial information about how

other biological entities, like drugs and diseases, interact with the systems biology

graph. Further, while some interaction databases provide human curation and validation

of pathway interactions, and others provide experimental evidence for the recorded

interactions, there has not yet been, to our knowledge, a resource that combines the two

approaches and quantifies the reliability of the evidence used to assert the interactions.

A knowledge graph is a compilation of facts and figures that can be used to provide

contextual meaning to searches. Google is using knowledge graphs to improve its search

and to analyze the information graph of the web; Facebook is using them to analyze

the social graph. We built our knowledge graph with the goal of unifying large parts of

biomedical domain knowledge for both mining and interactive exploration related to

drugs, diseases, and proteins. Our knowledge graph is enhanced by the provenance of each

fragment of knowledge captured, which is used to compute the confidence probabilities

for each of those fragments. Further, we use open standards from the world wide web

consortium (W3C), including the resource description framework (RDF) (Richard, David

& Markus, 2014), web ontology language (OWL) (Motik, Patel-Schneider & Cuenca Grau,

2009), and SPARQL (Harris, Seaborne & Prud’hommeaux, 2013). The representation of

the knowledge in our knowledge graph is aligned with best practice vocabularies and

ontologies from the W3C and the biomedical community, including the provenance

ontology (PROV-O) (Lebo, Sahoo & McGuinness, 2013), the HUPO proteomics standards

initiative molecular interactions (PSI-MI) ontology (Hermjakob et al., 2004), and the

semanticscience integrated ontology (SIO) (Dumontier et al., 2014). Use of these

standards, vocabularies, and ontologies makes it simple for ReDrugS to integrate with

similar efforts in the future.

We proposed and built a novel computational drug repositioning platform, which we

refer to as ReDrugS, that applies probabilistic filtering over individually supported

assertions drawn from multiple databases pertaining to systems biology, pharmacology,

disease association, and gene expression data. We use our platform to identify novel and

known drugs for melanoma.

RESULTS
We used ReDrugS to examine the drug–target–disease network and identify known, novel,

and well supported melanoma drugs. The ReDrugS knowledge base contained 6,180

drugs, 3,820 diseases, 69,279 proteins, and 899,198 interactions. The drugs included in

ReDrugS follow the distribution by the Anatomic Therapeutic Classification (ATC)

categories shown in Fig. 1.

McCusker et al. (2017), PeerJ Comput. Sci., DOI 10.7717/peerj-cs.106 2/20



We examined drug and gene connections that were three or fewer interaction steps

from melanoma, and additionally retained only interactions with a joint probability greater than or

equal to 0.93. We identified 25 drugs in the resulting drug–gene–disease network

surrounding melanoma as illustrated in Fig. 2.
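The filtering step described above can be sketched as a bounded upstream walk from the disease node. This is a minimal illustration and not the ReDrugS implementation: the edge-list format, the function name `find_drug_candidates`, and the toy nodes are hypothetical, and the joint probability is taken to be the product of the edge probabilities along a path, which assumes the interactions are independent.

```python
from collections import defaultdict


def find_drug_candidates(edges, node_types, disease, max_steps=3, min_joint_p=0.93):
    """Collect drugs that reach `disease` within `max_steps` interactions
    while the joint probability of the path stays >= `min_joint_p`.

    edges: iterable of (source, target, probability) tuples (hypothetical format).
    node_types: dict mapping node name -> type label such as "Drug".
    Returns {drug: (joint_probability, path_from_drug_to_disease)}.
    """
    incoming = defaultdict(list)
    for source, target, p in edges:
        incoming[target].append((source, p))

    best = {}

    def walk(node, steps, joint_p, path):
        if steps == max_steps:
            return
        for source, p in incoming.get(node, []):
            if source in path:  # avoid cycles
                continue
            new_p = joint_p * p  # product of probabilities, assuming independence
            if new_p < min_joint_p:  # probabilities only shrink, so prune here
                continue
            new_path = [source] + path
            if node_types.get(source) == "Drug":
                if new_p > best.get(source, (0.0, None))[0]:
                    best[source] = (new_p, new_path)
            walk(source, steps + 1, new_p, new_path)

    walk(disease, 0, 1.0, [disease])
    return best


# Toy graph with made-up nodes and probabilities.
edges = [
    ("ProteinX", "melanoma", 0.999),
    ("DrugA", "ProteinX", 0.999),
    ("ProteinY", "ProteinX", 0.99),
    ("DrugB", "ProteinY", 0.8),    # joint p falls below 0.93
    ("ProteinZ", "ProteinX", 0.99),
    ("DrugC", "ProteinZ", 0.95),   # three steps, joint p just above 0.93
    ("P2", "ProteinX", 0.999),
    ("P1", "P2", 0.999),
    ("DrugD", "P1", 0.999),        # four steps away, excluded by step limit
]
node_types = {"DrugA": "Drug", "DrugB": "Drug", "DrugC": "Drug", "DrugD": "Drug"}
candidates = find_drug_candidates(edges, node_types, "melanoma")
```

Because every edge probability is at most 1, the running product can only decrease, so a path can be abandoned as soon as it drops below the threshold.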

We then validated the set of 25 drugs by determining their position in the drug

discovery pipeline for melanoma. Table 1 shows that nearly all drugs uncovered by

ReDrugS had previously been identified as potential melanoma therapies, either in

clinical trials or in vivo or in vitro. Of the 25 drugs, 12 have been in Phase I, II, or III

clinical trials, five have been studied in vitro, four in vivo, one was investigated as a case

study, and three are novel.

To further evaluate our system, we examined the impact of decreasing the joint

probability or increasing the number of interaction steps. Figures 3A and 3B show

precision, recall, and f-measure curves while varying each parameter. Using these

information retrieval performance curves, we found that using a joint probability of

0.93 or greater with three or fewer interaction steps maximizes the precision and recall as

shown in Fig. 3.

By performing a sampled literature search on hypothesis candidates with a joint

probability of 0.5 or higher and six or fewer interaction steps, we were able to generate

precision, recall, and f-measure curves for both cutoffs to find our cutoff of 0.93 with three

or fewer interaction steps. The precision, recall, and f-measure curves are shown for

varying joint probability thresholds in Fig. 3A and for varying interaction step counts

in Fig. 3B.
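The threshold sweep behind such curves can be sketched as follows. This is an illustrative sketch only: the function names and the toy candidate scores are hypothetical, a "hit" set is assumed to come from the literature search described in the text, and the f-measure is computed with the standard F1 definition (the harmonic mean of precision and recall).

```python
def precision_recall_f(returned, known_hits):
    """Return (precision, recall, f_measure) for a set of returned candidates."""
    returned, known_hits = set(returned), set(known_hits)
    true_positives = len(returned & known_hits)
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / len(known_hits) if known_hits else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def sweep_thresholds(candidate_scores, known_hits, thresholds):
    """Evaluate each joint-probability threshold against the known hits."""
    curve = {}
    for threshold in thresholds:
        returned = {name for name, p in candidate_scores.items() if p >= threshold}
        curve[threshold] = precision_recall_f(returned, known_hits)
    return curve


# Made-up candidates (name -> joint probability) and validated hits.
candidate_scores = {"a": 0.98, "b": 0.95, "c": 0.93, "d": 0.80, "e": 0.60}
known_hits = {"a", "b", "d"}
curve = sweep_thresholds(candidate_scores, known_hits, [0.93, 0.5])
```

The same sweep can be repeated over the maximum number of interaction steps instead of the probability threshold to obtain the second curve.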

Figure 1 Percentage of approved drugs in each of the categories of the anatomic therapeutic classification (ATC) system.



DISCUSSION
We designed ReDrugS to quickly and automatically integrate and filter a heterogeneous

biomedical knowledge graph to generate high-confidence drug repositioning

candidates. Our results indicate that ReDrugS generates clinically plausible drug

candidates, of which roughly half are in various stages of clinical trials, while others are novel or

are being investigated in pre-clinical studies. By helping to consolidate the three

main datatypes (drug targets, protein interactions, and disease genes), ReDrugS can

amplify the ability of researchers to filter the vast amount of information down to what

is relevant for drug discovery.

Candidate significance
Three drugs were identified that have not previously been studied for melanoma

treatment. Framycetin, a CXCR4 inhibitor, has not previously been considered for

melanoma treatment. While it is nephrotoxic when administered orally (Greenberg,

1965), it is used topically as an antibacterial treatment. While it may not be of use for

metastasis, it might serve as a simple, inexpensive prophylactic treatment after excision of

primary tumors. Additionally, Lucanthone and Podofilox were identified as having

potential effects on melanoma through CDKN2A and MAP kinase, respectively.

Figure 2 The interaction graph of predicted melanoma drugs that have a joint probability of 0.93 or higher

and three or fewer intervening interactions between drug and disease. The “Explore” tab

contains the controls to expand the network in various ways, including the filtering parameters. Node

and edge detail tabs provide additional information about the selected node or edge, including

the probabilities of the edges selected. Users can control the layout algorithm and related options using

the “Options” tab.



One drug we identified, Vemurafenib, is approved for treatment of late-stage melanoma

and has been shown to inhibit the BRAF protein in BRAF-V600 mutant melanomas

(Chapman et al., 2011). However, cells can become resistant to Vemurafenib, thereby

leading to metastasis (Le et al., 2013).

A number of the drugs we identified are in clinical trials for treatment of melanoma.

We identified BRAF-oriented drugs, Dabrafenib (Hauschild et al., 2012), Sorafenib

(National Cancer Institute, 2005), and Regorafenib (Istituto Clinico Humanitas, 2015), that

have been evaluated in clinical trials, but have not yet been approved. Zidovudine or

Azidothymidine is a TERT inhibitor that has shown significant melanoma tumor

reductions in mouse models (Humer et al., 2008). Three MAP kinase-related compounds,

Vinblastine (Luikart, Kennealey & Kirkwood, 1984), Trametinib (Kim et al., 2012), and

Vinorelbine (Whitehead et al., 2004) were identified that are in clinical trials for melanoma

treatment. CDKN2A was another popular target, as Irinotecan (Fiorentini et al., 2009),

Table 1 Drug discovery status for 25 drug candidates identified using ReDrugS.

Status      Drug                                               Pathway          Steps  Joint p
Approved    Vemurafenib (Chapman et al., 2011)                 BRAF             2      0.98
Phase III   Dabrafenib (Hauschild et al., 2012)                BRAF             2      0.98
Phase III   Sorafenib (National Cancer Institute, 2005)        BRAF             2      0.98
Phase III   Vinblastine (Luikart, Kennealey & Kirkwood, 1984)  MAP kinase       3      0.93
Phase II    Zidovudine (Humer et al., 2008)                    TERT             2      0.98
Phase II    Trametinib (Kim et al., 2012)                      MAP kinase       2      0.98
Phase II    Regorafenib (Istituto Clinico Humanitas, 2015)     BRAF             2      0.98
Phase II    Nadroparin (Nagy, Turcsik & Blaskó, 2009)          MYC              3      0.97
Phase II    Vinorelbine (Whitehead et al., 2004)               MAP kinase       3      0.93
Phase II    Irinotecan (Fiorentini et al., 2009)               CDKN2A           3      0.93
Phase II    Topotecan (Kraut et al., 1997)                     CDKN2A           3      0.93
Phase I     Sodium stibogluconate (Naing, 2011)                CDKN2A           3      0.93
Case study  Ingenol mebutate (Mansuy et al., 2014)             PRKCA/BRAF       3      0.95
In vitro    Bosutinib (Homsi et al., 2009)                     MAP kinase       2      0.98
In vitro    Purvalanol (Smalley et al., 2007)                  MAP kinase/TP53  3      0.97
In vitro    Ellagic acid (Kim et al., 2008)                    PRKCA/BRAF       3      0.95
In vitro    Albendazole (Patel et al., 2011)                   CDKN2A           3      0.93
In vitro    Colchicine (Lemontt, Azzaria & Gros, 1988)         MAP kinase       3      0.93
In vivo     Plerixafor (D’Alterio et al., 2012)                CXCR4            3      0.97
In vivo     Vincristine (Sawada et al., 2004)                  MAP kinase       3      0.93
In vivo     L-Methionine (Clavo & Wahl, 1996)                  CDKN2A           3      0.93
In vivo     Mebendazole (Doudican et al., 2008)                CDKN2A           3      0.93
Novel       Framycetin                                         CXCR4            3      0.97
Novel       Lucanthone                                         CDKN2A           3      0.93
Novel       Podofilox                                          MAP kinase       3      0.93

Note:
“Pathway” refers to the target or pathway that the drug acts on. “Steps” is distance in number of interactions between the
drug and the disease, and “Joint p” is the joint probability that all of those interactions occur.



Topotecan (Kraut et al., 1997), and Sodium stibogluconate (Naing, 2011) are all drugs in

clinical trial that we identified as potential therapies.

Many other drugs were identified that are being studied in the lab. Additional drugs

were identified that target the MAP kinase pathway, including Bosutinib (Homsi et al.,

2009), Purvalanol (Smalley et al., 2007), Colchicine (Lemontt, Azzaria & Gros, 1988), and

Vincristine (Sawada et al., 2004). Podofilox has not yet been investigated in melanoma

treatments, but preliminary investigations have focused on treating chronic lymphocytic

leukemia (Shen et al., 2013) and non-small cell lung cancer (Peng et al., 2014). Since

these drugs attack MAPK2 and related proteins rather than BRAF or NRAS, they can

potentially synergize with other treatments (Homsi et al., 2009). Bosutinib in particular

has been investigated as a synergistic treatment for melanoma (Held et al., 2012). Another

possible treatment pathway is CXCR4 inhibition. Mouse models suggest that CXCR4

inhibitors like Plerixafor can reduce tumor metastasis and primary tumor growth

(D’Alterio et al., 2012). We identify both Plerixafor and Framycetin (Neomycin B) as

useful CXCR4 inhibitors. Two PRKCA activators, Ingenol mebutate and Ellagic acid, were

also identified. PRKCA binds with BRAF (Pardo et al., 2006), but it is mechanistically

unclear how PRKCA activation would result in treatment of melanoma. A number of

other therapies are also notable. Purvalanol can inhibit GSK3β, which in turn activates
TP53. Some, but not all, melanomas have TP53 deactivation (Smalley et al., 2007).

Nadroparin, a MYC inhibitor, may inhibit tumor progression (Nagy, Turcsik & Blaskó,

2009). More broadly, heparins can potentially inhibit the metastatic process in melanoma

and other cancers (Maraveyas et al., 2010).

[Figure 3 plots not reproduced. Panel (A), “Information Retrieval by Probability Threshold,” and panel (B), “Information Retrieval by Network Expansion Step,” each show precision, recall, and f-measure curves.]

Figure 3 Precision, recall, and f-measure by (A) varying thresholds for joint probability and (B) varying number of interaction steps. Precision is

the percentage of returned candidates that have been validated experimentally or have been in a clinical trial (a “hit”) versus all candidates returned.

Recall is the percentage of all known validated “hits” that are returned. f-Measure is the harmonic mean of precision and recall that provides a balanced evaluation of

the quality and completeness of the results.



The approach that we present here offers a novel, mechanism-focused exploration to

identify and examine drugs and targets related to cancer. This approach filters out noisy or

poorly supported parts of the knowledge graph to identify more confident mechanisms

between drugs, targets, and diseases. Thus, our approach can be used to explore high

confidence associations that are produced as a result of large scale computational

screens that use network connectivity (Yang & Agarwal, 2011; Wu, Wang & Chen, 2013;

Cheng et al., 2012; Emig et al., 2013; Harrold, Ramanathan & Mager, 2013; Wu et al., 2013;

Vogt, Prinz & Campillos, 2014), the complementarity in drug-disease gene expression,

and the similarity of chemical fingerprints, side-effects, targets, or indications (Yang &

Agarwal, 2011; Ye, Liu & Wei, 2014; Chiang & Butte, 2009; Gottlieb et al., 2011; Lamb et al.,

2006; Sirota et al., 2011). Importantly, since we focus on protein networks that are strongly

linked with diseases, we believe that our mechanism-focused approach will also aid in

the identification of disease-modifying drug candidates, rather than solely those that

would be useful for the treatment of symptomatic phenotypes or related co-morbid

conditions.

Architecture
ReDrugS uses a fairly straightforward web architecture, as shown in Fig. 4. It uses the

Blazegraph RDF database backend. The database layer is interchangeable except that

the full text search service needs to use Blazegraph-only properties to perform text

searches as text indexing is not yet standardized in the SPARQL query language. All

other aspects are standardized and should work with other RDF databases without

modification. ReDrugS currently uses the Python-based TurboGears web application

framework, hosted via the Web Server Gateway Interface (WSGI) standard on an Apache

HTTP server. TurboGears in turn hosts the semantic automated discovery and

integration (SADI) web services that drive the application and access the database. It also

serves up the static HTML and supporting files.

[Figure 4 diagram not reproduced. Components shown: a JavaScript web client (Cytoscape.js) exchanging JSON-LD with a Python + Apache web server that exposes /api/search, /api/upstream, and /api/downstream, and an RDF store queried via SPARQL.]

Figure 4 The ReDrugS software architecture. Using web standards and a three-layer architecture (RDF

store, web server, and rich web client), we were able to build a complete knowledge graph analysis

platform.



The user interface is implemented with AngularJS and Cytoscape.js; it submits

queries to the SADI web services using JSON-LD and aggregates results into the

networked view. The software relies exclusively on standardized protocols (HTTP, SADI,

SPARQL, RDF, and others) to make it simple to replace technologies as needed. The data

itself is processed using conversion scripts as shown in Fig. 5.

We have also adapted and featured ReDrugS in an immersive visualization laboratory

called the collaborative-research augmented immersive virtual environment (CRAIVE)

lab at RPI, as shown in Fig. 6. The goal of the demonstration was to explore new ways

to visualize, sonify, and interact with big data in large-scale virtual reality systems. We

also leveraged a gesture controller (Microsoft Kinect) to interact with the visualization.

With the 360° projection, multiple people can explore the visualization concurrently,
which speeds up exploration and discovery.

Limitations and future work
Our study has some limitations. First, it is limited by the sources of data used.

We used three databases (DrugBank, iRefIndex, and online Mendelian inheritance in man

(OMIM)) to construct the initial knowledge graph. These databases are continuously

changing and necessarily incomplete with respect to the total number of drugs, targets,

protein interactions, diseases, and disease genes. For instance, as of 8/15/2016 DrugBank

contained over 2,000 more FDA-approved drugs than the version that was

initially used. Second, the focus of our work is on the potential repositioning of FDA

[Figure 5 diagram not reproduced. Components shown: source data (e.g., iRefIndex) converted to nanopubs; experimental method assessment (evidence to probability); ontological resources (Protein/Protein Interaction Ontology, Semanticscience Integrated Ontology, Gene Ontology) supplying vocabularies and relationships; the ReDrugS RDF store; the ReDrugS API (interaction network search and expansion); the ReDrugS Cytoscape.js app; and analytical tools (Cytoscape, R, Python, etc.).]

Figure 5 The ReDrugS data flow. Data is selected from external databases and converted using scripts

into nanopublication graphs, which are loaded into the ReDrugS data store. This is combined with

experimental method assessments, expressed in OWL, and public ontologies into the RDF store. The web

service layer queries the store and produces aggregate analyses of those nanopublications, which are

consumed and displayed by the rich web client. The same APIs can be used by other tools for further analysis.



approved drugs, which means that tens of thousands of chemical compounds with protein

binding activity cannot be considered as candidates in the current study. Third, our path

expansion is currently limited to pairwise protein–protein interactions, which excludes

interactions as a result of protein complexes or regulatory pathways. Having a more

sophisticated understanding of non-direct interactions will help identify candidate drugs

that can regulate entire pathways in a more rational manner. Additionally, we aim to

incorporate knowledge of the complementarity of drug and disease gene expression

patterns as evidenced by the connectivity map (Lamb et al., 2006), which could suggest

therapeutic and adverse interactions. Finally, as we develop new hypotheses about

potential new drug effects, we plan to test them using a new three-dimensional cellular

microarray to perform high-throughput drug screening (Lee et al., 2008) with reference

samples. The integration of computational predictions with a high-throughput screening

platform will enable the systematic evaluation of any drug or mechanism of action against

any disease or adverse event.

MATERIALS AND METHODS
This research project did not involve human subjects. The ReDrugS platform consists of a

graphical web application, an application programming interface (API), and a knowledge

base. The graphical web application enables users to initiate a search using drug, gene, and

disease names and synonyms. Users can then interact with the application to expand the

network at an arbitrary number of interactions away from the entity of interest, and to

filter the network based on a joint probability between the source and target entities.

Drug–protein, protein–protein, and gene–disease interactions were obtained from several

datasets and integrated into ontology-annotated and provenance and evidence bearing

representations called nanopublications. The web application obtains information from

the knowledge base using semantic web services. Finally, we evaluated our approach by

Figure 6 The authors demonstrate the ReDrugS user interface in the collaborative-research augmented immersive virtual environment

(CRAIVE) lab at RPI.



examining the mechanistic plausibility of the drug in having melanoma-specific disease

modifying ability. We evaluated a large number of possible drug/disease associations with

varying joint probabilities and interaction steps to determine the thresholds with the

highest f-measure, resulting in our thresholds of three or fewer interactions and a joint

probability of 0.93 or higher.

Using the ReDrugS application page (http://redrugs.tw.rpi.edu) we initiate our search

for “melanoma,” and select the first suggestion obtained from the experimental factor

ontology (EFO) (http://www.ebi.ac.uk/efo/EFO_0000756). The application then provides

the immediate neighborhood of drugs and genes that are associated with melanoma. We

expanded the network by first selecting the melanoma node and setting the link

distance to |I| ≤ 3 and changing the minimum joint probability to p ≥ 0.93 in the search
options. Importantly, we also limit the node type to “Drug.” Finally, we click on the “find

incoming links” button (two left-facing arrows). When finished, the network will show all

drugs interacting with melanoma that meet the above criteria, as well as any intervening

entities and their interactions. The resulting network can be downloaded as an image, or a

summary CSV file. We used the CSV file to validate the links by searching Google Scholar

and ClinicalTrials.gov for each proposed drug/disease combination. We consider a “hit” to

be a pairing with a published positive experiment in vivo or in vitro or any pairing that has

been tested in a clinical trial. While this level of validation does not guarantee efficacy, it

does determine if the resulting connection is a plausible hypothesis that might be tested.

Data fusion
We developed a structured knowledge base containing data pertaining to drugs, targets,

interactions, and diseases. We used five data sources: iRefIndex (Razick, Magklaras &

Donaldson, 2008), DrugBank (Wishart et al., 2006), UniProt gene ontology annotations

(GOA) (Camon et al., 2004), the online Mendelian inheritance in man (OMIM) (Hamosh

et al., 2005), and the catalogue of somatic mutations in cancer (COSMIC) gene census

(Futreal et al., 2004).

iRefIndex contains protein–protein interactions and protein complexes and is an

amalgam of the biomolecular interaction network database (Bader, Betel & Hogue, 2003),

BioGRID (Stark et al., 2006), the comprehensive resource of mammalian protein

complexes (Ruepp et al., 2010), database of interacting proteins (Xenarios et al., 2002),

human protein reference database (Keshava Prasad et al., 2009), InnateDB (Lynn et al.,

2008), IntAct (Kerrien et al., 2011), MatrixDB (Chautard et al., 2011), molecular interaction

database (Chatr-aryamontri et al., 2008), MPact (Güldener et al., 2006), microbial protein

interaction database (Goll et al., 2008), MIPS mammalian protein–protein interaction

database (Pagel et al., 2005), and online predicted human interaction database (Brown &

Jurisica, 2005). DrugBank provides information about experimental/approved drugs

and their targets, and UniProt GOA describes proteins in terms of their biological

processes, cellular locations, and molecular functions. OMIM provides associations

between genes and inherited or genetically-driven diseases. The COSMIC gene census is a

curated list of genes that have causal associations with one or more cancer types.



Each association (e.g., drug–target, protein–protein, disease–gene) was captured using

the nanopublication (Groth, Gibson & Velterop, 2010) scheme. A nanopublication is a

digital artifact that consists of an assertion, its provenance, and information about the

digital publication. Our nanopublications are represented as linked data: each data item

is identified using a dereferenceable HTTP uniform resource identifier (URI) and

statements are represented using RDF. Each nanopublication corresponds to a single

interaction assertion from one of the databases. We used a number of automated scripts to

produce the nanopublications and load them into the SPARQL endpoint. An example

nanopublication is shown in Fig. 7. We used the SIO (Dumontier et al., 2014) as a global

schema to describe the nature and components of the associations, and coupled this

with the PSI-MI ontology (Hermjakob et al., 2004) to denote the types of interactions.

We used the W3C’s PROV-O (Lebo, Sahoo & McGuinness, 2013) to capture provenance

of the assertion (which data source it originated from). We loaded our nanopublications

into Blazegraph, an RDF database compatible with nanopublications. The data is accessed using

its native SPARQL endpoint by the web application.
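As a rough illustration of the three-part structure described above (assertion, provenance, and publication information), the following sketch models a nanopublication as three named graphs of triples. It is not the conversion code used for ReDrugS: the class, the `ex:` identifiers, and the `ex:` predicates are hypothetical placeholders standing in for SIO and PSI-MI terms; only `prov:wasDerivedFrom` and `prov:wasAttributedTo` are real PROV-O properties.

```python
from dataclasses import dataclass, field


@dataclass
class Nanopublication:
    """Three named graphs: assertion, provenance, and publication info."""
    name: str
    assertion: list = field(default_factory=list)
    provenance: list = field(default_factory=list)
    pubinfo: list = field(default_factory=list)

    def graph_name(self, part):
        # Naming pattern assumed for illustration, e.g. "ex:NanoPub_1_Assertion".
        return f"{self.name}_{part}"

    def quads(self):
        """Yield (subject, predicate, object, graph) for every statement."""
        parts = (
            ("Assertion", self.assertion),
            ("Provenance", self.provenance),
            ("PubInfo", self.pubinfo),
        )
        for part, triples in parts:
            for s, p, o in triples:
                yield (s, p, o, self.graph_name(part))


# One interaction assertion taken from one source record (all ex: terms are made up).
np1 = Nanopublication(name="ex:NanoPub_1")
np1.assertion += [
    ("ex:interaction_X", "rdf:type", "ex:DirectInteraction"),
    ("ex:interaction_X", "ex:hasAgent", "ex:protein_A"),
    ("ex:interaction_X", "ex:hasTarget", "ex:protein_B"),
]
np1.provenance.append(
    (np1.graph_name("Assertion"), "prov:wasDerivedFrom", "ex:source_record")
)
np1.pubinfo.append(("ex:NanoPub_1", "prov:wasAttributedTo", "ex:conversion_script"))

quads = list(np1.quads())
```

Keeping the provenance statement about the assertion graph itself, rather than about individual triples, is what lets a probability be attached to the whole fragment later on.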

Assertion probability
Each knowledge graph fragment, enclosed in a nanopublication, is assigned a probability

based on the quality of the methods used to create the assertions in the fragment.

We compute probabilities based on two different methods. Manually curated assertions,

from DrugBank, OMIM, and COSMIC gene census, are directly given a probability

p = 0.999. Assertions that have been derived from a specific experimental method are

given probabilities appropriate for that method. These probabilities are derived

from an expert-driven measure of the reliability of the experimental method used to

derive the association. Factors involved in the assessment of confidence include the

degree of indirection in the assay, the sensitivity and specificity of the approach, and

reproducibility of results under different conditions based on the comparative analyses of

techniques (Skrabanek et al., 2008; Sprinzak, Sattath & Margalit, 2003). Two expert

bioinformaticians rated the reliability of each method and assigned a score of 1–3, where 1

corresponds to low confidence and 3 to high confidence. After their initial assessment,

they conferred on their reasoning for each score to resolve differences where possible. The

experts considered level 1 to correspond to weak evidence that needs independent

verification. Level 2 methods are generally reliable, but should have additional biological

evidence. Level 3 methods are high-quality methods that produce few false positives.

We calculated inter-annotator agreement between the two annotators over the three

categories using Scott’s Pi. Scott’s Pi is similar to Cohen’s kappa in that it improves on

simple observed agreement by factoring in the extent of agreement that might be expected

by chance (Scott, 1955). We determined the agreement to be 0.56 (Scott’s Pi value of 0.26) across 104

experimental methods comprising 99.9999% of interaction annotations.
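For reference, Scott's Pi is defined as (Po - Pe) / (1 - Pe), where Po is the observed agreement and Pe is the agreement expected by chance, computed from the pooled category proportions of both annotators. With the reported Po of 0.56 and Pi of 0.26, this definition implies a chance agreement of roughly 0.41. The sketch below computes both quantities on made-up ratings; the function name and data are hypothetical, not the authors' annotations.

```python
from collections import Counter


def scotts_pi(ratings_a, ratings_b):
    """Return (observed_agreement, scotts_pi) for two equal-length rating lists."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Pool both annotators' ratings to estimate category proportions.
    pooled = Counter(ratings_a) + Counter(ratings_b)
    expected = sum((count / (2 * n)) ** 2 for count in pooled.values())
    if expected == 1.0:
        raise ValueError("Scott's Pi is undefined when only one category is used")
    return observed, (observed - expected) / (1 - expected)


# Made-up confidence scores (1 = low, 3 = high) for eight methods.
annotator_a = [1, 2, 3, 3, 2, 1, 3, 2]
annotator_b = [1, 2, 3, 2, 2, 3, 3, 1]
observed, pi = scotts_pi(annotator_a, annotator_b)
```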

The scores of 1, 2, and 3 were then assigned provisional probabilities of p = 0.8, p = 0.95,

and p = 0.99 respectively. We chose these probabilities as approximations of the conceptual

levels of probability for each rating by the experts, and feel that those probabilities

correspond to how often an experiment at that confidence level can be expected to be

accurate. We plan to provide a more rigorous assessment of the accuracy of each method

against gold standards in future work. These confidence values were encoded into an

OWL ontology along with the evidence codes. The full inferences were extracted using

Pellet (https://github.com/complexible/pellet) and loaded into the SPARQL endpoint,

where they were used to apply the probabilities to each assertion in the knowledge graph

that had experimental evidence.
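The agreement statistic and the score-to-probability mapping described above can be illustrated with a minimal Python sketch. The function name `scotts_pi`, the mapping name `LEVEL_TO_PROBABILITY`, and the toy ratings are assumptions made for illustration; they are not taken from the ReDrugS code base or its data.

```python
from collections import Counter

# Provisional probabilities for expert confidence levels 1-3 (values from the text).
LEVEL_TO_PROBABILITY = {1: 0.8, 2: 0.95, 3: 0.99}


def scotts_pi(ratings_a, ratings_b):
    """Return (observed agreement, Scott's Pi) for two equal-length rating lists."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Scott's Pi estimates chance agreement from the category proportions
    # pooled over both annotators (Cohen's kappa uses per-annotator proportions).
    pooled = Counter(ratings_a) + Counter(ratings_b)
    expected = sum((count / (2 * n)) ** 2 for count in pooled.values())
    if expected == 1.0:
        raise ValueError("Scott's Pi is undefined when only one category is used")
    return observed, (observed - expected) / (1 - expected)


# Toy ratings for four hypothetical methods (not the paper's data).
observed, pi = scotts_pi([1, 2, 3, 3], [1, 2, 3, 2])
probabilities = [LEVEL_TO_PROBABILITY[level] for level in (1, 2, 3, 3)]
```

Because chance agreement is subtracted, Pi is always lower than the raw observed agreement whenever the annotators disagree at all, which is consistent with the gap between the two figures reported above.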

Semantic web services
We developed four SADI web services (Wilkinson, Vandervalk & McCarthy, 2009) in
Python¹ to support easy access to the nanopublications in ReDrugS. The four services
are enumerated in Table 2.

The first service is a simple free-text lookup that takes a pml:Query² (McGuinness
et al., 2007) with a prov:value as a query and produces a set of entities whose labels
contain the substring. This is used for interactive typeahead completion of search terms
so users can look up URIs and entities without needing to know the details.
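The substring lookup behind the typeahead can be sketched in a few lines of Python. This is only an in-memory illustration: the actual service takes RDF input and runs against the SPARQL endpoint, and the case-insensitive matching, the ranking by label length, the `limit` parameter, and the example URIs are all assumptions.

```python
def search_labels(labels_by_uri, text, limit=10):
    """Return (uri, label) pairs whose label contains `text` as a substring."""
    needle = text.casefold()
    if not needle:
        return []
    matches = [
        (uri, label)
        for uri, label in labels_by_uri.items()
        if needle in label.casefold()
    ]
    # Shorter labels first, so near-exact matches surface at the top of the list.
    matches.sort(key=lambda pair: (len(pair[1]), pair[1]))
    return matches[:limit]


# Hypothetical URIs and labels, for illustration only.
labels = {
    "http://example.org/entity/1": "BRAF",
    "http://example.org/entity/2": "BRAF V600E",
    "http://example.org/entity/3": "vemurafenib",
}
hits = search_labels(labels, "braf")
```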

The other three SADI services look up interactions that contain a named entity. Two of

them look at the entity to find upstream and downstream connections, and the third

service assumes that the entity is a biological process and finds all interactions that

are related to that process. The services return only one interaction for each triple (source,

interaction type, target). There are often multiple probabilities per interaction, and

more than one interaction per interaction type. This is because the interaction may

have been recorded in multiple databases, based on different experimental methods.
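To make the grouping concrete, the sketch below collects all recorded probabilities under their (source, interaction type, target) key. It is a simplified illustration in Python: the tuple layout, the `ex:OtherInteraction` type, and the probability values are hypothetical, while the CA2/SLC4A8 pair is the example shown in Fig. 7.

```python
from collections import defaultdict


def group_by_triple(records):
    """Collect every recorded probability under its (source, type, target) key."""
    grouped = defaultdict(list)
    for source, interaction_type, target, probability in records:
        grouped[(source, interaction_type, target)].append(probability)
    return dict(grouped)


# Hypothetical records: the same interaction reported twice with different evidence,
# plus one record of a different (made-up) interaction type.
records = [
    ("CA2", "sio:DirectInteraction", "SLC4A8", 0.95),
    ("CA2", "sio:DirectInteraction", "SLC4A8", 0.80),
    ("CA2", "ex:OtherInteraction", "SLC4A8", 0.80),
]
grouped = group_by_triple(records)
```

Each key then carries a list of probabilities that still has to be reduced to a single score.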

Figure 7 Representation of a protein/protein interaction within a nanopublication. Three graphs are
represented. The assertion graph (NanoPub_501799_Assertion) states that an interaction (X) is of type
sio:DirectInteraction, and has the target of SLC4A8 and a participant of CA2. The supporting graph
(NanoPub_501799_Supporting) states that the assertion graph was generated by a pull down experiment
(one of many encoded experiment types used), a subclass of prov:Activity. The attribution graph
(NanoPub_501799_Attribution), in turn, states that the assertion had a primary source of (Loiselle et al.,
2004) and that the interaction was quoted from BioGRID.

1 For further information on developing web services in Python using SADI, see this tutorial:
https://github.com/markwilkinson/SADI-Semantic-Web-Services-Core/wiki/Building-Services-in-Python

2 PML 3, in development: https://github.com/timrdf/pml. This includes PML 2 constructs that are
not covered in PROV-O.

To provide a single probability score for each interaction of a source and target, the

interactions are combined. A single probability is generated per identified interaction

by taking the geometric mean of the probabilities for that interaction. However, this

method is undesirable when combining multiple interaction records of the same type.

We instead combine the interaction records using a form of probabilistic voting using

composite Z-scores. This is done to model that multiple experiments that produce the

same results reinforce each other, and should therefore give a higher overall probability

than would be indicated by taking their mean or even by Bayes' theorem. We do this

by converting each probability into a Z-score (aka standard score) using the quantile

function (Q()), summing the values, and applying the cumulative distribution function

(CDF()) to compute the corresponding probability:

$$P(x_{1 \ldots n}) = \mathrm{CDF}\left(\sum_{i=1}^{n} Q\big(P(x_i)\big)\right)$$

These composite Z-scores, which we transform back into probabilities, are frequently
used to combine multiple indicators of the same underlying phenomenon, as in Moller
et al. (1998). However, the approach has a drawback: it does not account for multiple
databases recording the same, non-independent experiment. This can inflate the
probabilities of interactions described by experiments that are published in more than
one database.
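The two combination strategies can be compared with a short Python sketch built on the standard library's `statistics.NormalDist`. It assumes the quantile function and CDF are those of the standard normal distribution, as the term Z-score implies; the function names and the example probabilities are illustrative, not taken from the ReDrugS implementation.

```python
from statistics import NormalDist, geometric_mean

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def combine_by_geometric_mean(probabilities):
    """Geometric mean of the probabilities."""
    return geometric_mean(probabilities)


def combine_by_composite_z(probabilities):
    """Apply the CDF to the summed Z-scores; inputs must lie strictly between 0 and 1."""
    z_total = sum(_STANDARD_NORMAL.inv_cdf(p) for p in probabilities)
    return _STANDARD_NORMAL.cdf(z_total)


records = [0.8, 0.8]  # two hypothetical records of the same interaction
mean_based = combine_by_geometric_mean(records)
vote_based = combine_by_composite_z(records)
```

For two records at p = 0.8, the geometric mean stays at 0.8 while the composite Z-score rises to about 0.95, which is the reinforcement effect described above. Note that a record at p = 0.5 has a Z-score of zero and leaves the result unchanged, and any record below 0.5 lowers it.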

Graph expansion using joint probability
In order to compute the probability that a given entity affects another, we compute

the joint probability that each of the intervening interactions is true. Joint probability is

the probability that every assertion in the set is true. This is computed by taking the

product of probabilities of each interaction:

$$P(x_1 \wedge \ldots \wedge x_n) = \prod_{i=1}^{n} P(x_i)$$

This joint probability is used as a threshold that users can set to stop graph expansion.

We also provide expansion limits using the number of interaction steps that are needed

to connect the two entities.
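A minimal Python sketch of threshold-limited expansion follows. It assumes the graph is held in memory as an adjacency mapping with one combined probability per edge, and it keeps only the best joint probability found for each reachable entity; the function name, parameters, and toy graph are hypothetical and do not reflect how ReDrugS queries its SPARQL endpoint.

```python
def expand(graph, start, min_joint_probability=0.5, max_steps=3):
    """Return {entity: best joint probability} for entities reachable from `start`.

    `graph` maps a source entity to a list of (target, probability) pairs.
    A path is abandoned once the product of its edge probabilities drops
    below `min_joint_probability` or it would exceed `max_steps` interactions.
    """
    best = {start: 1.0}
    frontier = {start: 1.0}
    for _ in range(max_steps):
        next_frontier = {}
        for node, joint_so_far in frontier.items():
            for target, edge_probability in graph.get(node, []):
                joint = joint_so_far * edge_probability
                if joint < min_joint_probability:
                    continue  # below the user-set threshold: stop expanding here
                if joint > best.get(target, 0.0):
                    best[target] = joint
                    next_frontier[target] = joint
        if not next_frontier:
            break
        frontier = next_frontier
    del best[start]
    return best


# Toy graph with made-up entities and probabilities.
graph = {
    "drugA": [("P1", 0.99), ("P2", 0.8)],
    "P1": [("P3", 0.95)],
    "P2": [("P3", 0.99)],
    "P3": [("melanoma", 0.999)],
}
reachable = expand(graph, "drugA", min_joint_probability=0.9, max_steps=3)
```

With a threshold of 0.9 the path through P2 is dropped at the first step, and melanoma is reached only through P1 and P3. Lowering `max_steps` to 2 would stop the expansion before melanoma is reached.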

Table 2 ReDrugS API SADI Web Services. The API endpoint prefix is http://redrugs.tw.rpi.edu/api/.

| Service name | Description | URL | Input | Output |
|---|---|---|---|---|
| Resource text search | Look up resources using free text search against their RDFS labels. This service is optimized for typeahead user interfaces. | search | pml:Query | pml:AnsweredQuery |
| Find interactions in a biological process | Find interactions whose participants or targets also participate in the input process. | process | sio:Process | sio:Process |
| Find upstream participants | Find interactions that the input entity is a target of in and have explicit participants. | upstream | sio:MaterialEntity | sio:Target |
| Find downstream targets | Find interactions that the input entity participates in and have explicit targets. | downstream | sio:MaterialEntity | sio:Agent |
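For client code, the entries of Table 2 can be held in a small lookup structure. The Python sketch below assumes that the URL column is a path relative to the API endpoint prefix; the names `SERVICES` and `service_url` are hypothetical, and no request is sent.

```python
from urllib.parse import urljoin

API_PREFIX = "http://redrugs.tw.rpi.edu/api/"

# (relative URL, input class, output class) for each service, as listed in Table 2.
SERVICES = {
    "Resource text search": ("search", "pml:Query", "pml:AnsweredQuery"),
    "Find interactions in a biological process": ("process", "sio:Process", "sio:Process"),
    "Find upstream participants": ("upstream", "sio:MaterialEntity", "sio:Target"),
    "Find downstream targets": ("downstream", "sio:MaterialEntity", "sio:Agent"),
}


def service_url(service_name):
    """Build the full endpoint URL for a service named in Table 2."""
    relative_url, _input_class, _output_class = SERVICES[service_name]
    return urljoin(API_PREFIX, relative_url)


upstream_url = service_url("Find upstream participants")
```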

User interface
The user interface was developed using the above SADI web services and uses Cytoscape.js

(http://cytoscape.github.io/cytoscape.js), angular.js (https://angularjs.org), and Bootstrap

3 (http://getbootstrap.com). An example network is shown in Fig. 2. Users can

search for biological entities and processes, which can then be autocompleted to specific

entities that are in the ReDrugS graph. Users can then add those entities and processes

to the displayed graph and retrieve upstream and downstream connections and link out to

more details for every entity. Cytoscape.js is used as the main rendering and network

visualization tool, and provides node and edge rendering, layout, and network analysis

capabilities, and has been integrated into a customized rich web client.

In order to evaluate this knowledge graph, we developed a demonstration web interface

(http://redrugs.tw.rpi.edu) based on the Cytoscape.js (http://cytoscape.github.io/

cytoscape.js) JavaScript library. The interface lets users enter biological entity names. As

the user types, the text is resolved to a list of entities. The user finishes by selecting from

the list, and submitting the search. The search returns interactions and nodes associated

with the entity selected, which are added to the Cytoscape.js graph. Users are also able to

select nodes and populate upstream or downstream connections. Figure 2 is an example

output of this process.

ACKNOWLEDGEMENTS
A special thanks to Pascale Gaudet who, with Michel Dumontier, evaluated the

experimental methods and evidence codes listed in the protein/protein interaction

ontology and gene ontology. Thank you also to Kusum Solanki and John Erickson for

evaluation, feedback, and planning in the initial stages of this project.

ADDITIONAL INFORMATION AND DECLARATIONS

Funding
The authors received no funding for this work.

Competing Interests
The authors declare that they have no competing interests.

Author Contributions
• James P. McCusker conceived and designed the experiments, performed the
experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote

the paper, prepared figures and/or tables, performed the computation work, reviewed

drafts of the paper.

• Michel Dumontier conceived and designed the experiments, analyzed the data,
contributed reagents/materials/analysis tools, wrote the paper, performed the

computation work, reviewed drafts of the paper.

• Rui Yan performed the experiments, contributed reagents/materials/analysis tools,
wrote the paper, prepared figures and/or tables, performed the computation work,

reviewed drafts of the paper.

• Sylvia He contributed reagents/materials/analysis tools, prepared figures and/or tables,
performed the computation work, reviewed drafts of the paper.

• Jonathan S. Dordick conceived and designed the experiments, reviewed drafts of the
paper.

• Deborah L. McGuinness conceived and designed the experiments, wrote the paper,
reviewed drafts of the paper.

Data Deposition
The following information was supplied regarding data availability:

Data can be found at https://data.rpi.edu/xmlui/handle/10833/1760.

REFERENCES
Bader GD, Betel D, Hogue CW. 2003. BIND: the biomolecular interaction network database.

Nucleic Acids Research 31(1):248–250 DOI 10.1093/nar/gkg056.

Barth A, Wanek L, Morton D. 1995. Prognostic factors in 1,521 melanoma patients with

distant metastases. Journal of the American College of Surgeons 181(3):193–201.

Bisgin H, Liu Z, Fang H, Kelly R, Xu X, Tong W. 2014. A phenome-guided drug

repositioning through a latent variable model. BMC Bioinformatics 15(1):267

DOI 10.1186/1471-2105-15-267.

Bisgin H, Liu Z, Kelly R, Fang H, Xu X, Tong W. 2012. Investigating drug repositioning

opportunities in FDA drug labels through topic modeling. BMC Bioinformatics 13(Suppl 1):S6

DOI 10.1186/1471-2105-13-s15-s6.

Brown KR, Jurisica I. 2005. Online predicted human interaction database. Bioinformatics

21(9):2076–2082 DOI 10.1093/bioinformatics/bti273.

Camon E, Magrane M, Barrell D, Lee V, Dimmer E, Maslen J, Binns D, Harte N, Lopez R,

Apweiler R. 2004. The gene ontology annotation (GOA) database: sharing knowledge in

UniProt with gene ontology. Nucleic Acids Research 32(Suppl 1):D262–D266

DOI 10.1093/nar/gkh021.

Chapman PB, Hauschild A, Robert C, Haanen JB, Ascierto P, Larkin J, Dummer R, Garbe C,

Testori A, Maio M, Hogg D, Lorigan P, Lebbe C, Jouary T, Schadendorf D, Ribas A, O’Day SJ,

Sosman JA, Kirkwood JM, Eggermont AM, Dreno B, Nolop K, Li J, Nelson B, Hou J, Lee RJ,

Flaherty KT, McArthur GA. 2011. Improved survival with vemurafenib in melanoma with

BRAF V600E mutation. New England Journal of Medicine 364(26):2507–2516.

Chatr-aryamontri A, Zanzoni A, Ceol A, Cesareni G. 2008. Searching the protein interaction

space through the MINT database. In: Thompson JD, Ueffing M, Schaeffer-Reiss C, eds.

Functional Proteomics: Methods and Protocols. Totowa: Humana Press, 305–317

DOI 10.1007/978-1-59745-398-1_20.

Chautard E, Fatoux-Ardore M, Ballut L, Thierry-Mieg N, Ricard-Blum S. 2011. MatrixDB, the

extracellular matrix interaction database. Nucleic Acids Research 39(Suppl 1):D235–D240

DOI 10.1093/nar/gkq830.

Cheng F, Liu C, Jiang J, Lu W, Li W, Liu G, Zhou W, Huang J, Tang Y. 2012. Prediction of drug-

target interactions and drug repositioning via network-based inference. PLoS Computational

Biology 8(5):e1002503 DOI 10.1371/journal.pcbi.1002503.

Chiang AP, Butte AJ. 2009. Systematic evaluation of drug-disease relationships to identify leads for

novel drug uses. Clinical Pharmacology and Therapeutics 86(5):507–510

DOI 10.1038/clpt.2009.103.

Clavo A, Wahl R. 1996. Effects of hypoxia on the uptake of tritiated thymidine, L-leucine,

L-methionine and FDG in cultured cancer cells. Journal of Nuclear Medicine 37:502–506.

D’Alterio C, Barbieri A, Portella L, Palma G, Polimeno M, Riccio A, Ieranò C, Franco R,

Scognamiglio G, Bryce J, Luciano A, Rea D, Arra C, Scala S. 2012. Inhibition of stromal

CXCR4 impairs development of lung metastases. Cancer Immunology Immunotherapy

61(10):1713–1720 DOI 10.1007/s00262-012-1223-7.

Doudican N, Rodriguez A, Osman I, Orlow SJ. 2008. Mebendazole induces apoptosis via Bcl-2

inactivation in chemoresistant melanoma cells. Molecular Cancer Research 6(8):1308–1315

DOI 10.1158/1541-7786.mcr-07-2159.

Dumontier M, Baker CJ, Baran J, Callahan A, Chepelev L, Cruz-Toledo J, Del Rio NR,

Duck G, Furlong LI, Keath N, Klassen D, McCusker JP, Queralt-Rosinach N, Samwald M,

Villanueva-Rosales N, Wilkinson MD, Hoehndorf R. 2014. The semanticscience integrated

ontology (SIO) for biomedical research and knowledge discovery. Journal of Biomedical

Semantics 5(1):14 DOI 10.1186/2041-1480-5-14.

Emig D, Ivliev A, Pustovalova O, Lancashire L, Bureeva S, Nikolsky Y, Bessarabova M. 2013.

Drug target prediction and repositioning using an integrated network-based approach.

PLoS ONE 8(4):e60618 DOI 10.1371/journal.pone.0060618.

Fiorentini G, Aliberti C, Del CA, Tilli M, Rossi S, Ballardini P, Turrisi G, Benea G. 2009.

Intra-arterial hepatic chemoembolization (TACE) of liver metastases from ocular melanoma

with slow-release irinotecan-eluting beads. Early results of a phase II clinical study. In Vivo

23(1):131–137.

Futreal PA, Coin L, Marshall M, Down T, Hubbard T, Wooster R, Rahman N, Stratton MR.

2004. A census of human cancer genes. Nature Reviews Cancer 4(3):177–183

DOI 10.1038/nrc1299.

Goll J, Rajagopala SV, Shiau SC, Wu H, Lamb BT, Uetz P. 2008. MPIDB: the microbial protein

interaction database. Bioinformatics 24(15):1743–1744 DOI 10.1093/bioinformatics/btn285.

Gottlieb A, Stein G, Ruppin E, Sharan R. 2011. PREDICT: a method for inferring novel drug

indications with application to personalized medicine. Molecular Systems Biology 7:496

DOI 10.1038/msb.2011.26.

Greenberg LH. 1965. Audiotoxicity and nephrotoxicity due to orally administered neomycin.

JAMA 194(7):827–828 DOI 10.1001/jama.194.7.827.

Groth P, Gibson A, Velterop J. 2010. The anatomy of a nanopublication. Information

Services and Use 30(1):51–56.

Grover MP, Ballouz S, Mohanasundaram KA, George RA, Sherman CDH, Crowley TM,

Wouters MA. 2014. Identification of novel therapeutics for complex diseases from genome-

wide association data. BMC Medical Genomics 7(Suppl 1):S8 DOI 10.1186/1755-8794-7-s1-s8.

Güldener U, Münsterkötter M, Oesterheld M, Pagel P, Ruepp A, Mewes H-W, Stümpflen V.

2006. MPact: the MIPS protein interaction resource on yeast. Nucleic Acids Research

34(Suppl 1):D436–D441 DOI 10.1093/nar/gkj003.

Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA. 2005. Online Mendelian

inheritance in man (OMIM), a knowledgebase of human genes and genetic disorders.

Nucleic Acids Research 33(Suppl 1):D514–D517.

Harris S, Seaborne A, Prud’hommeaux E. 2013. SPARQL 1.1 query language. W3C

Recommendation, 21. Available at https://www.w3.org/TR/sparql11-query/.

Harrold JM, Ramanathan M, Mager DE. 2013. Network-based approaches in drug discovery

and early development. Clinical Pharmacology and Therapeutics 94(6):651–658

DOI 10.1038/clpt.2013.176.

Hauschild A, Grob J-J, Demidov LV, Jouary T, Gutzmer R, Millward M, Rutkowski P, Blank CU,

Miller WH, Kaempgen E, Martín-Algarra S, Karaszewska B, Mauch C, Chiarion-Sileni V,

Martin A-M, Swann S, Haney P, Mirakhur B, Guckert ME, Goodman V, Chapman PB. 2012.

Dabrafenib in BRAF-mutated metastatic melanoma: a multicentre open-label, phase 3

randomised controlled trial. Lancet 380(9839):358–365 DOI 10.1016/s0140-6736(12)60868-x.

Held MA, Langdon CG, Platt JT, Graham-Steed T, Liu Z, Chakraborty A, Bacchiocchi A, Koo A,

Haskins JW, Bosenberg MW, Stern DF. 2012. Genotype-selective combination therapies for

melanoma identified by high-throughput drug screening. Cancer Discovery 3(1):52–67

DOI 10.1158/2159-8290.cd-12-0408.

Hermjakob H, Montecchi-Palazzi L, Bader G, Wojcik J, Salwinski L, Ceol A, Moore S,

Orchard S, Sarkans U, von Mering C, Roechert B, Poux S, Jung E, Mersch H, Kersey P,

Lappe M, Li Y, Zeng R, Rana D, Nikolski M, Husi H, Brun C, Shanker K, Grant SGN,

Sander C, Bork P, Zhu W, Pandey A, Brazma A, Jacq B, Vidal M, Sherman D, Legrain P,

Cesareni G, Xenarios I, Eisenberg D, Steipe B, Hogue C, Apweiler R. 2004. The HUPO

PSI’s molecular interaction format—a community standard for the representation of

protein interaction data. Nature Biotechnology 22(2):177–183 DOI 10.1038/nbt926.

Homsi J, Cubitt CL, Zhang S, Munster PN, Yu H, Sullivan DM, Jove R, Messina JL, Daud AI.

2009. Src activation in melanoma and Src inhibitors as therapeutic agents in melanoma.

Melanoma Research 19(3):167–175 DOI 10.1097/cmr.0b013e328304974c.

Humer J, Ferko B, Waltenberger A, Rapberger R, Pehamberger H, Muster T. 2008.

Azidothymidine inhibits melanoma cell growth in vitro and in vivo. Melanoma Research

18(5):314–321 DOI 10.1097/cmr.0b013e32830aaaa6.

Istituto Clinico Humanitas. 2015. Regorafenib in patients with metastatic solid tumors who

have progressed after standard therapy (RESOUND). Available at https://clinicaltrials.gov/ct2/

show/NCT02307500 (accessed 10 January 2016).

Kerrien S, Aranda B, Breuza L, Bridge A, Broackes-Carter F, Chen C, Duesbury M,

Dumousseau M, Feuermann M, Hinz U, Jandrasits C, Jimenez RC, Khadake J, Mahadevan U,

Masson P, Pedruzzi I, Pfeiffenberger E, Porras P, Raghunath A, Roechert B, Orchard S,

Hermjakob H. 2011. The IntAct molecular interaction database in 2012. Nucleic Acids Research

40:D841–D846 DOI 10.1093/nar/gkr1088.

Keshava Prasad TS, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S,

Telikicherla D, Raju R, Shafreen B, Venugopal A, Balakrishnan L, Marimuthu A, Banerjee S,

Somanathan DS, Sebastian A, Rani S, Ray S, Harrys Kishore CJ, Kanth S, Ahmed M,

Kashyap MK, Mohmood R, Ramachandra YL, Krishna V, Rahiman BA, Mohan S,

Ranganathan P, Ramabadran S, Chaerkady R, Pandey A. 2009. Human protein reference

database—2009 update. Nucleic Acids Research 37(Suppl 1):D767–D772

DOI 10.1093/nar/gkn892.

Kim KB, Kefford R, Pavlick AC, Infante JR, Ribas A, Sosman JA, Fecher LA, Millward M,

McArthur GA, Hwu P, Gonzalez R, Ott PA, Long GV, Gardner OS, Ouellet D, Xu Y,

DeMarini DJ, Le NT, Patel K, Lewis KD. 2012. Phase II study of the MEK1/MEK2 inhibitor

trametinib in patients with metastatic BRAF-mutant cutaneous melanoma previously

treated with or without a BRAF inhibitor. Journal of Clinical Oncology 31(4):482–489

DOI 10.1200/jco.2012.43.5966.

Kim S, Liu Y, Gaber MW, Bumgardner JD, Haggard WO, Yang Y. 2008. Development of

chitosan-ellagic acid films as a local drug delivery system to induce apoptotic death of human

melanoma cells. Journal of Biomedical Materials Research Part B: Applied Biomaterials

90B(1):145–155 DOI 10.1002/jbm.b.31266.

Kingsmore SF, Lindquist IE, Mudge J, Gessler DD, Beavis WD. 2008. Genome-wide association

studies: progress and potential for drug discovery and development. Nature Reviews Drug

Discovery 7(3):221–230 DOI 10.1038/nrd2519.

Kraut EH, Walker MJ, Staubus A, Gochnour D, Balcerzak SP. 1997. Phase II trial of topotecan in

malignant melanoma. Cancer Investigation 15(4):318–320 DOI 10.3109/07357909709039732.

Krauthammer M, Kong Y, Bacchiocchi A, Evans P, Pornputtapong N, Wu C, McCusker J,

Ma S, Cheng E, Straub R, Serin M, Bosenberg M, Ariyan S, Narayan D, Sznol M,

Kluger H, Mane S, Schlessinger J, Lifton R, Halaban R. 2015. Exome sequencing identifies

recurrent mutations in NF1 and RASopathy genes in sun-exposed melanomas. Nature

Genetics 47(9):996–1002 DOI 10.1038/ng.3361.

Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, Lerner J, Brunet J-P,

Subramanian A, Ross KN, Reich M, Hieronymus H, Wei G, Armstrong SA, Haggarty SJ,

Clemons PA, Wei R, Carr SA, Lander ES, Golub TR. 2006. The connectivity map: using

gene-expression signatures to connect small molecules, genes, and disease. Science

313(5795):1929–1935 DOI 10.1126/science.1132939.

Le K, Blomain ES, Rodeck U, Aplin AE. 2013. Selective RAF inhibitor impairs ERK1/2

phosphorylation and growth in mutant NRAS vemurafenib-resistant melanoma cells.

Pigment Cell & Melanoma Research 26(4):509–517 DOI 10.1111/pcmr.12092.

Lebo T, Sahoo S, McGuinness D. 2013. PROV-O: the PROV ontology.

Available at http://www.w3.org/TR/prov-o/.

Lee M-Y, Kumar RA, Sukumaran SM, Hogg MG, Clark DS, Dordick JS. 2008. Three-dimensional

cellular microarray for high-throughput toxicology assays. Proceedings of the National Academy

of Sciences of the United States of America 105(1):59–63 DOI 10.1073/pnas.0708756105.

Lemontt J, Azzaria M, Gros P. 1988. Increased mdr gene expression and decreased drug

accumulation in multidrug-resistant human melanoma cells. Cancer Research 48(22):6348–6353.

Loiselle FB, Morgan PE, Alvarez BV, Casey JR. 2004. Regulation of the human NBC3 Na+/HCO3-

cotransporter by carbonic anhydrase II and PKA. American Journal of Physiology-Cell Physiology

286(6):C1423–C1433 DOI 10.1152/ajpcell.00382.2003.

Luikart S, Kennealey G, Kirkwood J. 1984. Randomized phase III trial of vinblastine, bleomycin,

and cis-dichlorodiammine-platinum versus dacarbazine in malignant melanoma. Journal of

Clinical Oncology 2(3):164–168.

Lynn DJ, Winsor GL, Chan C, Richard N, Laird MR, Barsky A, Gardy JL, Roche FM, Chan

THW, Shah N, Lo R, Naseer M, Que J, Yau M, Acab M, Tulpan D, Whiteside MD,

Chikatamarla A, Mah B, Munzner T, Hokamp K, Hancock REW, Brinkman FSL. 2008.

InnateDB: facilitating systems-level analyses of the mammalian innate immune response.

Molecular Systems Biology 4(1):218 DOI 10.1038/msb.2008.55.

Mansuy M, Nikkels-Tassoudji N, Arrese JE, Rorive A, Nikkels AF. 2014. Recurrent in situ

melanoma successfully treated with ingenol mebutate. Dermatology and Therapy

4(1):131–135 DOI 10.1007/s13555-014-0051-4.

Maraveyas A, Johnson MJ, Xiao YP, Noble S. 2010. Malignant melanoma as a target

malignancy for the study of the anti-metastatic properties of the heparins. Cancer and

Metastasis Reviews 29(4):777–784 DOI 10.1007/s10555-010-9263-y.

McGuinness DL, Ding L, Silva PPD, Chang C. 2007. PML 2: a modular explanation interlingua.

In: Proceedings of the AAAI 2007 Workshop on Explanation-Aware Computing, Vancouver, 22–23.

Moller JT, Cluitmans P, Rasmussen LS, Houx P, Rasmussen H, Canet J, Rabbitt P, Jolles J,

Larsen K, Hanning CD, Langeron O, Johnson T, Lauven PM, Kristensen PA, Biedler A,

van Beem H, Fraidakis O, Silverstein JH, Beneken JEW, Gravenstein JS. 1998. Long-term

postoperative cognitive dysfunction in the elderly: ISPOCD1 study. Lancet 351(9106):857–861

DOI 10.1016/s0140-6736(97)07382-0.

Motik B, Patel-Schneider PF, Cuenca Grau B. 2009. OWL 2 Web Ontology Language: Direct

Semantics. Available at https://www.w3.org/TR/owl2-direct-semantics/.

Nagy Z, Turcsik V, Blaskó G. 2009. The effect of LMWH (nadroparin) on tumor progression.

Pathology & Oncology Research 15(4):689–692 DOI 10.1007/s12253-009-9204-7.

Naing A. 2011. Phase I dose escalation study of sodium stibogluconate (SSG) a protein

tyrosine phosphatase inhibitor, combined with interferon alpha for patients with solid tumors.

Journal of Cancer 2:81–89 DOI 10.7150/jca.2.81.

National Cancer Institute. 2005. Carboplatin and paclitaxel with or without sorafenib tosylate in

treating patients with stage III or stage IV melanoma that cannot be removed by surgery.

Available at https://clinicaltrials.gov/ct2/show/NCT00110019 (accessed 10 January 2016).

Pagel P, Kovac S, Oesterheld M, Brauner B, Dunger-Kaltenbach I, Frishman G, Montrone C,

Mark P, Stümpflen V, Mewes H-W, Ruepp A, Frishman D. 2005. The mips mammalian

protein-protein interaction database. Bioinformatics 21(6):832–834

DOI 10.1093/bioinformatics/bti115.

Pardo OE, Wellbrock C, Khanzada UK, Aubert M, Arozarena I, Davidson S, Bowen F, Parker PJ,

Filonenko VV, Gout IT, Sebire N, Marais R, Downward J, Seckl MJ. 2006. FGF-2 protects

small cell lung cancer cells from apoptosis through a complex involving PKCepsilon, B-Raf

and S6K2. EMBO Journal 25(13):3078–3088 DOI 10.1038/sj.emboj.7601198.

Patel K, Doudican NA, Schiff PB, Orlow SJ. 2011. Albendazole sensitizes cancer cells to ionizing

radiation. Radiation Oncology 6(1):160 DOI 10.1186/1748-717x-6-160.

Peng X, Wang F, Li L, Bum-Erdene K, Xu D, Wang B, Sinn AA, Pollok KE, Sandusky GE, Li L,

Turchi JJ, Jalal SI, Meroueh SO. 2014. Exploring a structural protein–drug interactome for

new therapeutics in lung cancer. Molecular BioSystems 10(3):581 DOI 10.1039/c3mb70503j.

Razick S, Magklaras G, Donaldson IM. 2008. iRefIndex: a consolidated protein interaction

database with provenance. BMC Bioinformatics 9(1):405 DOI 10.1186/1471-2105-9-405.

Richard C, David W, Markus L. 2014. RDF 1.1 Concepts and Abstract Syntax. W3C

Recommendation. Available at https://www.w3.org/TR/rdf11-concepts/.

Ruepp A, Waegele B, Lechner M, Brauner B, Dunger-Kaltenbach I, Fobo G, Frishman G,

Montrone C, Mewes H-W. 2010. Corum: the comprehensive resource of mammalian protein

complexes—2009. Nucleic Acids Research 38(Suppl 1):D497–D501 DOI 10.1093/nar/gkp914.

Sanseau P, Koehler J. 2011. Editorial: computational methods for drug repurposing. Briefings in

Bioinformatics 12(4):301–302 DOI 10.1093/bib/bbr047.

Sawada N, Kataoka K, Kondo K, Arimochi H, Fujino H, Takahashi Y, Miyoshi T, Kuwahara T,

Monden Y, Ohnishi Y. 2004. Betulinic acid augments the inhibitory effects of vincristine on

growth and lung metastasis of B16F10 melanoma cells in mice. British Journal of Cancer

90(8):1672–1678 DOI 10.1038/sj.bjc.6601746.

Scott WA. 1955. Reliability of content analysis: the case of nominal scale coding. Public Opinion

Quarterly 19(3):321–325 DOI 10.1086/266577.

Shen M, Zhang Y, Saba N, Austin CP, Wiestner A, Auld DS. 2013. Identification of therapeutic

candidates for chronic lymphocytic leukemia from a library of approved drugs. PLoS ONE

8(9):e75252 DOI 10.1371/journal.pone.0075252.

Sirota M, Dudley JT, Kim J, Chiang AP, Morgan AA, Sweet-Cordero A, Sage J, Butte AJ. 2011.

Discovery and preclinical validation of drug indications using compendia of public gene

expression data. Science Translational Medicine 3(96):96ra77

DOI 10.1126/scitranslmed.3001318.

Skrabanek L, Saini HK, Bader GD, Enright AJ. 2008. Computational prediction of protein-

protein interactions. Molecular Biotechnology 38(1):1–17 DOI 10.1007/s12033-007-0069-2.

Smalley KSM, Contractor R, Haass NK, Kulp AN, Atilla-Gokcumen GE, Williams DS,

Bregman H, Flaherty KT, Soengas MS, Meggers E, Herlyn M. 2007. An organometallic

protein kinase inhibitor pharmacologically activates p53 and induces apoptosis in human

melanoma cells. Cancer Research 67(1):209–217 DOI 10.1158/0008-5472.can-06-1538.

Sprinzak E, Sattath S, Margalit H. 2003. How reliable are experimental protein–protein

interaction data? Journal of Molecular Biology 327(5):919–923

DOI 10.1016/s0022-2836(03)00239-0.

Stark C, Breitkreutz B-J, Reguly T, Boucher L, Breitkreutz A, Tyers M. 2006. BioGRID: a general

repository for interaction datasets. Nucleic Acids Research 34(Suppl 1):D535–D539

DOI 10.1093/nar/gkj109.

Vogt I, Prinz J, Campillos M. 2014. Molecularly and clinically related drugs and diseases are

enriched in phenotypically similar drug-disease pairs. Genome Medicine 6(7):52

DOI 10.1186/s13073-014-0052-z.

Whitehead RP, Moon J, McCachren SS, Hersh EM, Samlowski WE, Beck JT, Tchekmedyian NS,

Sondak VK. 2004. A Phase II trial of vinorelbine tartrate in patients with disseminated

malignant melanoma and one prior systemic therapy. Cancer 100(8):1699–1704

DOI 10.1002/cncr.20183.

Wilkinson M, Vandervalk B, McCarthy L. 2009. SADI Semantic Web Services–cause you can’t

always GET what you want! In: 2009 IEEE Asia-Pacific Services Computing Conference (APSCC).

Singapore: IEEE, 13–18.

Wishart DS, Knox C, Guo AC, Shrivastava S, Hassanali M, Stothard P, Chang Z, Woolsey J.

2006. DrugBank: a comprehensive resource for in silico drug discovery and exploration.

Nucleic Acids Research 34(Suppl 1):D668–D672 DOI 10.1093/nar/gkj067.

Wu C, Gudivada RC, Aronow BJ, Jegga AG. 2013. Computational drug repositioning through

heterogeneous network clustering. BMC Systems Biology 7(Suppl 5):S6

DOI 10.1186/1752-0509-7-s5-s6.

Wu Z, Wang Y, Chen L. 2013. Network-based drug repositioning. Molecular BioSystems

9(6):1268–1281 DOI 10.1039/c3mb25382a.

Xenarios I, Salwinski L, Duan XJ, Higney P, Kim S-M, Eisenberg D. 2002. DIP, the database of

interacting proteins: a research tool for studying cellular networks of protein interactions.

Nucleic Acids Research 30(1):303–305 DOI 10.1093/nar/30.1.303.

Yang L, Agarwal P. 2011. Systematic drug repositioning based on clinical side-effects. PLoS ONE

6(12):e28025 DOI 10.1371/journal.pone.0028025.

Ye H, Liu Q, Wei J. 2014. Construction of drug network based on side effects and its application

for drug repositioning. PLoS ONE 9(2):e87864 DOI 10.1371/journal.pone.0087864.
