Finding melanoma drugs through a probabilistic knowledge graph

James P. McCusker¹, Michel Dumontier², Rui Yan¹, Sylvia He¹, Jonathan S. Dordick³,⁴ and Deborah L. McGuinness¹,⁴

¹ Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA
² Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA
³ Department of Chemical & Biological Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
⁴ Center for Biotechnology & Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, NY, USA

ABSTRACT
Metastatic cutaneous melanoma is an aggressive skin cancer with some progression-slowing treatments but no known cure. The omics data explosion has created many possible drug candidates; however, filtering criteria remain challenging, and systems biology approaches have become fragmented with many disconnected databases. Using drug, protein and disease interactions, we built an evidence-weighted knowledge graph of integrated interactions. Our knowledge graph-based system, ReDrugS, can be used via an application programming interface or web interface, and has generated 25 high-quality melanoma drug candidates. We show that probabilistic analysis of systems biology graphs increases drug candidate quality compared to non-probabilistic methods. Four of the 25 candidates are novel therapies, three of which have been tested with other cancers. All other candidates have current or completed clinical trials, or have been studied in vivo or in vitro. This approach can be used to identify candidate therapies for use in research or personalized medicine.
Subjects Bioinformatics, Computational Biology, Data Science, World Wide Web and Web Science
Keywords Melanoma, Knowledge graphs, Drug repositioning, Uncertainty reasoning

How to cite this article McCusker et al. (2017), Finding melanoma drugs through a probabilistic knowledge graph. PeerJ Comput. Sci. 3:e106; DOI 10.7717/peerj-cs.106
Submitted 27 April 2016
Accepted 27 December 2016
Published 13 February 2017
Corresponding authors James P. McCusker, mccusj@cs.rpi.edu; Deborah L. McGuinness, dlm@cs.rpi.edu
Academic editor Yonghong Peng
DOI 10.7717/peerj-cs.106 (http://dx.doi.org/10.7717/peerj-cs.106)
Copyright 2017 McCusker et al. Distributed under Creative Commons CC-BY 4.0 (http://www.creativecommons.org/licenses/by/4.0/)

INTRODUCTION
Metastatic cutaneous melanoma is an aggressive cancer of the skin with low prevalence but a very high mortality rate, with an estimated 5-year survival rate of 6% (Barth, Wanek & Morton, 1995). There are currently no known therapies that can consistently cure metastatic melanoma. Vemurafenib is effective against BRAF mutant melanomas (Chapman et al., 2011) but resistant cells often result in recurrence of metastases (Le et al., 2013). Melanoma itself may be best approached based on the individual genetics of the tumor, as it has been shown to involve mutations in many different genes to produce the same disease (Krauthammer et al., 2015). Because of this, an individualized approach may be necessary to find effective treatments.

Drug repurposing, or the discovery of new uses for existing approved drugs, can often lead to effective new treatments for diseases. A wide range of computational methods have been developed in support of drug repositioning. Computational approaches (Sanseau & Koehler, 2011) include topic modeling (Bisgin et al., 2012, 2014), side-effect similarity (Yang & Agarwal, 2011; Ye, Liu & Wei, 2014), drug and/or disease similarity (Chiang & Butte, 2009; Gottlieb et al., 2011), genome-wide association studies (Kingsmore et al., 2008; Grover et al., 2014), and gene expression (Lamb et al., 2006; Sirota et al., 2011). Systems biology has also provided a number of network analysis approaches (Yang & Agarwal, 2011; Wu, Wang & Chen, 2013; Cheng et al., 2012; Emig et al., 2013; Harrold, Ramanathan & Mager, 2013; Wu et al., 2013; Vogt, Prinz & Campillos, 2014) but the field has been limited by a fragmentation of databases. Most systems biology databases are not aligned with each other, and typically leave out crucial information about how other biological entities, like drugs and diseases, interact with the systems biology graph. Further, while some interaction databases provide human curation and validation of pathway interactions, and others provide experimental evidence for the recorded interactions, there has not yet been, to our knowledge, a resource that combines the two approaches and quantifies the reliability of the evidence used to assert the interactions.

A knowledge graph is a compilation of facts and figures that can be used to provide contextual meaning to searches. Google is using knowledge graphs to improve its search and to analyze the information graph of the web; Facebook is using them to analyze the social graph. We built our knowledge graph with the goal of unifying large parts of biomedical domain knowledge for both mining and interactive exploration related to drugs, diseases, and proteins.
Our knowledge graph is enhanced by the provenance of each fragment of knowledge captured, which is used to compute the confidence probabilities for each of those fragments. Further, we use open standards from the World Wide Web Consortium (W3C), including the resource description framework (RDF) (Richard, David & Markus, 2014), web ontology language (OWL) (Motik, Patel-Schneider & Cuenca Grau, 2009), and SPARQL (Harris, Seaborne & Prud’hommeaux, 2013). The representation of the knowledge in our knowledge graph is aligned with best practice vocabularies and ontologies from the W3C and the biomedical community, including the provenance ontology (PROV-O) (Lebo, Sahoo & McGuinness, 2013), the HUPO proteomics standards initiative molecular interactions (PSI-MI) ontology (Hermjakob et al., 2004), and the semanticscience integrated ontology (SIO) (Dumontier et al., 2014). Use of these standards, vocabularies, and ontologies makes it simple for ReDrugS to integrate with similar efforts in the future.

We proposed and built a novel computational drug repositioning platform, which we refer to as ReDrugS, that applies probabilistic filtering over individually supported assertions drawn from multiple databases pertaining to systems biology, pharmacology, disease association, and gene expression data. We use our platform to identify novel and known drugs for melanoma.

RESULTS
We used ReDrugS to examine the drug–target–disease network and identify known, novel, and well supported melanoma drugs. The ReDrugS knowledge base contained 6,180 drugs, 3,820 diseases, 69,279 proteins, and 899,198 interactions. The drugs included in ReDrugS follow the distribution across the Anatomic Therapeutic Classification (ATC) categories shown in Fig. 1.
We examined drug and gene connections that were three or fewer interaction steps from melanoma, and retained only interactions with a joint probability greater than or equal to 0.93. We identified 25 drugs in the resulting drug–gene–disease network surrounding melanoma, as illustrated in Fig. 2. We then validated the set of 25 drugs by determining their position in the drug discovery pipeline for melanoma. Table 1 shows that nearly all drugs uncovered by ReDrugS had previously been identified as potential melanoma therapies, either in clinical trials or in vivo or in vitro. Of the 25 drugs, 12 have been in Phase I, II, or III clinical trials, five have been studied in vitro, four in vivo, one was investigated as a case study, and three are novel.

To further evaluate our system, we examined the impact of decreasing the joint probability threshold or increasing the number of interaction steps. We performed a sampled literature search on hypothesis candidates with a joint probability of 0.5 or higher and six or fewer interaction steps, and used the results to generate precision, recall, and f-measure curves for both cutoffs. The curves are shown for varying joint probability thresholds in Fig. 3A and for varying interaction step counts in Fig. 3B. Using these information retrieval performance curves, we found that a joint probability of 0.93 or greater with three or fewer interaction steps maximizes the f-measure, balancing precision and recall (Fig. 3).

Figure 1 Percentage of approved drugs in each of the categories of the anatomic therapeutic classification (ATC) system.
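The threshold evaluation described in the Results can be sketched in a few lines of Python. This is a minimal illustration and not the authors' evaluation code: the `Candidate` record, the `evaluate` function, and the six sample candidates are hypothetical, and the sketch uses the conventional F1 score (the harmonic mean of precision and recall), which is an assumption about the exact f-measure variant behind the reported curves. Recall is computed against the validated hits in the sampled pool, since those are the only "known" hits available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str        # hypothetical drug label, not a real result
    joint_p: float   # joint probability of the path from drug to disease
    steps: int       # number of interactions on that path
    is_hit: bool     # True if literature or trial evidence was found


def evaluate(candidates, min_joint_p, max_steps):
    """Return (precision, recall, f_measure) for one pair of cutoffs."""
    returned = [c for c in candidates
                if c.joint_p >= min_joint_p and c.steps <= max_steps]
    total_hits = sum(c.is_hit for c in candidates)
    true_positives = sum(c.is_hit for c in returned)
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / total_hits if total_hits else 0.0
    denom = precision + recall
    # Conventional F1: harmonic mean of precision and recall (assumed variant).
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Invented sample pool, standing in for the sampled literature search.
SAMPLE = [
    Candidate("drug_a", 0.98, 2, True),
    Candidate("drug_b", 0.93, 3, True),
    Candidate("drug_c", 0.90, 3, False),
    Candidate("drug_d", 0.80, 4, True),
    Candidate("drug_e", 0.60, 5, False),
    Candidate("drug_f", 0.55, 6, False),
]

if __name__ == "__main__":
    # Sweep the probability cutoff at a fixed step limit, as in a Fig. 3A-style curve.
    for cutoff in (0.5, 0.6, 0.8, 0.9, 0.93, 0.98):
        p, r, f = evaluate(SAMPLE, cutoff, max_steps=6)
        print(f"p >= {cutoff:.2f}: precision={p:.2f} recall={r:.2f} f={f:.2f}")
```

Sweeping `max_steps` at a fixed probability cutoff gives the corresponding curve for the number of interaction steps.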
DISCUSSION
We designed ReDrugS to quickly and automatically integrate and filter a heterogeneous biomedical knowledge graph to generate high-confidence drug repositioning candidates. Our results indicate that ReDrugS generates clinically plausible drug candidates, about half of which are in various stages of clinical trials, while others are novel or are being investigated in pre-clinical studies. By helping to consolidate the three main datatypes (drug targets, protein interactions, and disease genes), ReDrugS can amplify the ability of researchers to filter the vast amount of available information down to what is relevant for drug discovery.

Candidate significance
Three drugs were identified that have not previously been studied for melanoma treatment. One of them is Framycetin, a CXCR4 inhibitor. Although it is nephrotoxic when administered orally (Greenberg, 1965), it is used topically as an antibacterial treatment. It may not be of use against metastasis, but it might serve as a simple, inexpensive prophylactic treatment after excision of primary tumors. Additionally, Lucanthone and Podofilox were identified as having potential effects on melanoma through CDKN2A and MAP kinase, respectively.

Figure 2 The interaction graph of predicted melanoma drugs that have a joint probability of 0.93 or higher and three or fewer intervening interactions between drug and disease. The “Explore” tab contains the controls to expand the network in various ways, including the filtering parameters. Node and edge detail tabs provide additional information about the selected node or edge, including the probabilities of the edges selected. Users can control the layout algorithm and related options using the “Options” tab.
Table 1 Drug discovery status for 25 drug candidates identified using ReDrugS.
Status | Drug | Pathway | Steps | Joint p
Approved | Vemurafenib (Chapman et al., 2011) | BRAF | 2 | 0.98
Phase III | Dabrafenib (Hauschild et al., 2012) | BRAF | 2 | 0.98
Phase III | Sorafenib (National Cancer Institute, 2005) | BRAF | 2 | 0.98
Phase III | Vinblastine (Luikart, Kennealey & Kirkwood, 1984) | MAP kinase | 3 | 0.93
Phase II | Zidovudine (Humer et al., 2008) | TERT | 2 | 0.98
Phase II | Trametinib (Kim et al., 2012) | MAP kinase | 2 | 0.98
Phase II | Regorafenib (Istituto Clinico Humanitas, 2015) | BRAF | 2 | 0.98
Phase II | Nadroparin (Nagy, Turcsik & Blaskó, 2009) | MYC | 3 | 0.97
Phase II | Vinorelbine (Whitehead et al., 2004) | MAP kinase | 3 | 0.93
Phase II | Irinotecan (Fiorentini et al., 2009) | CDKN2A | 3 | 0.93
Phase II | Topotecan (Kraut et al., 1997) | CDKN2A | 3 | 0.93
Phase I | Sodium stibogluconate (Naing, 2011) | CDKN2A | 3 | 0.93
Case study | Ingenol mebutate (Mansuy et al., 2014) | PRKCA/BRAF | 3 | 0.95
In vitro | Bosutinib (Homsi et al., 2009) | MAP kinase | 2 | 0.98
In vitro | Purvalanol (Smalley et al., 2007) | MAP kinase/TP53 | 3 | 0.97
In vitro | Ellagic acid (Kim et al., 2008) | PRKCA/BRAF | 3 | 0.95
In vitro | Albendazole (Patel et al., 2011) | CDKN2A | 3 | 0.93
In vitro | Colchicine (Lemontt, Azzaria & Gros, 1988) | MAP kinase | 3 | 0.93
In vivo | Plerixafor (D’Alterio et al., 2012) | CXCR4 | 3 | 0.97
In vivo | Vincristine (Sawada et al., 2004) | MAP kinase | 3 | 0.93
In vivo | L-Methionine (Clavo & Wahl, 1996) | CDKN2A | 3 | 0.93
In vivo | Mebendazole (Doudican et al., 2008) | CDKN2A | 3 | 0.93
Novel | Framycetin | CXCR4 | 3 | 0.97
Novel | Lucanthone | CDKN2A | 3 | 0.93
Novel | Podofilox | MAP kinase | 3 | 0.93
Note: “Pathway” refers to the target or pathway that the drug acts on. “Steps” is distance in number of interactions between the drug and the disease, and “Joint p” is the joint probability that all of those interactions occur.

One drug we identified, Vemurafenib, is approved for treatment of late-stage melanoma and has been shown to inhibit the BRAF protein in BRAF-V600 mutant melanomas (Chapman et al., 2011). However, cells can become resistant to Vemurafenib, thereby leading to metastasis (Le et al., 2013).

A number of the drugs we identified are in clinical trials for treatment of melanoma. We identified BRAF-oriented drugs, Dabrafenib (Hauschild et al., 2012), Sorafenib (National Cancer Institute, 2005), and Regorafenib (Istituto Clinico Humanitas, 2015), that have been evaluated in clinical trials, but have not yet been approved. Zidovudine or Azidothymidine is a TERT inhibitor that has shown significant melanoma tumor reductions in mouse models (Humer et al., 2008). Three MAP kinase-related compounds, Vinblastine (Luikart, Kennealey & Kirkwood, 1984), Trametinib (Kim et al., 2012), and Vinorelbine (Whitehead et al., 2004) were identified that are in clinical trials for melanoma treatment. CDKN2A was another popular target, as Irinotecan (Fiorentini et al., 2009), Topotecan (Kraut et al., 1997), and Sodium stibogluconate (Naing, 2011) are all drugs in clinical trials that we identified as potential therapies.

Many other drugs were identified that are being studied in the lab.
Additional drugs were identified that target the MAP kinase pathway, including Bosutinib (Homsi et al., 2009), Purvalanol (Smalley et al., 2007), Colchicine (Lemontt, Azzaria & Gros, 1988), and Vincristine (Sawada et al., 2004). Podofilox has not yet been investigated in melanoma treatments, but preliminary investigations have focused on treating chronic lymphocytic leukemia (Shen et al., 2013) and non-small cell lung cancer (Peng et al., 2014). Since these drugs attack MAPK2 and related proteins rather than BRAF or NRAS, they can potentially synergize with other treatments (Homsi et al., 2009). Bosutinib in particular has been investigated as a synergistic treatment for melanoma (Held et al., 2012).

Another possible treatment pathway is CXCR4 inhibition. Mouse models suggest that CXCR4 inhibitors like Plerixafor can reduce tumor metastasis and primary tumor growth (D’Alterio et al., 2012). We identify both Plerixafor and Framycetin (Neomycin B) as useful CXCR4 inhibitors. Two PRKCA activators, Ingenol mebutate and Ellagic acid, were also identified. PRKCA binds with BRAF (Pardo et al., 2006), but it is mechanistically unclear how PRKCA activation would result in treatment of melanoma.

A number of other therapies are also notable. Purvalanol can inhibit GSK3b, which in turn activates TP53. Some, but not all, melanomas have TP53 deactivation (Smalley et al., 2007). Nadroparin, a MYC inhibitor, may inhibit tumor progression (Nagy, Turcsik & Blaskó, 2009). More broadly, heparins can potentially inhibit the metastatic process in melanoma and other cancers (Maraveyas et al., 2010).
Figure 3 Precision, recall, and f-measure by (A) varying thresholds for joint probability and (B) varying number of interaction steps. Panel titles: (A) Information Retrieval by Probability Threshold; (B) Information Retrieval by Network Expansion Step. Precision is the percentage of returned candidates that have been validated experimentally or have been in a clinical trial (a “hit”) versus all candidates returned. Recall is the percentage of all known validated “hits” that are returned. f-Measure is the geometric mean of precision and recall that provides a balanced evaluation of the quality and completeness of the results.

The approach that we present here offers a novel, mechanism-focused exploration to identify and examine drugs and targets related to cancer. This approach filters out noisy or poorly supported parts of the knowledge graph to identify more confident mechanisms between drugs, targets, and diseases. Thus, our approach can be used to explore high confidence associations that are produced as a result of large scale computational screens that use network connectivity (Yang & Agarwal, 2011; Wu, Wang & Chen, 2013; Cheng et al., 2012; Emig et al., 2013; Harrold, Ramanathan & Mager, 2013; Wu et al., 2013; Vogt, Prinz & Campillos, 2014), the complementarity in drug-disease gene expression, and the similarity of chemical fingerprints, side-effects, targets, or indications (Yang & Agarwal, 2011; Ye, Liu & Wei, 2014; Chiang & Butte, 2009; Gottlieb et al., 2011; Lamb et al., 2006; Sirota et al., 2011).
Importantly, since we focus on protein networks that are strongly linked with diseases, we believe that our mechanism-focused approach will also aid in the identification of disease-modifying drug candidates, rather than solely those that would be useful for the treatment of symptomatic phenotypes or related co-morbid conditions.

Architecture
ReDrugS uses a fairly straightforward web architecture, as shown in Fig. 4. It uses the Blazegraph RDF database backend. The database layer is interchangeable, except that the full text search service needs to use Blazegraph-only properties to perform text searches, as text indexing is not yet standardized in the SPARQL query language. All other aspects are standardized and should work with other RDF databases without modification. ReDrugS currently uses the Python-based TurboGears web application framework, hosted via the Web Server Gateway Interface (WSGI) standard on an Apache HTTP server. TurboGears in turn hosts the semantic automated discovery and integration (SADI) web services that drive the application and access the database. It also serves the static HTML and supporting files.

Figure 4 The ReDrugS software architecture. Using web standards and a three-layer architecture (RDF store, web server, and rich web client), we were able to build a complete knowledge graph analysis platform. Diagram labels: RDF Store; Python + Apache Web Server; /api/search, /api/upstream, /api/downstream; Javascript Web Client; Cytoscape.js; SPARQL; JSON-LD.

The user interface is implemented with AngularJS and Cytoscape.js; it submits queries to the SADI web services using JSON-LD and aggregates results into the network view. The software relies exclusively on standardized protocols (HTTP, SADI, SPARQL, RDF, and others) to make it simple to replace technologies as needed.
The data itself is processed using conversion scripts as shown in Fig. 5.

We have also adapted and featured ReDrugS in an immersive visualization laboratory called the collaborative-research augmented immersive virtual environment (CRAIVE) lab at RPI, as shown in Fig. 6. The goal of the demonstration was to explore new ways to visualize, sonify, and interact with big data in large-scale virtual reality systems. We also leveraged a gesture controller (Microsoft Kinect) to interact with the visualization. With the 360° projection, multiple people can explore the visualization concurrently, which accelerates exploration and discovery.

Figure 5 The ReDrugS data flow. Data is selected from external databases and converted using scripts into nanopublication graphs, which are loaded into the ReDrugS data store. This is combined with experimental method assessments, expressed in OWL, and public ontologies into the RDF store. The web service layer queries the store and produces aggregate analyses of those nanopublications, which is consumed and displayed by the rich web client. The same APIs can be used by other tools for further analysis. Diagram labels: iRefIndex; Experimental Method Assessment; Ontological Resources (Protein/Protein Interaction Ontology, Semanticscience Integrated Ontology, Gene Ontology); ReDrugS RDF Store; ReDrugS API (interaction network search and expansion); ReDrugS Cytoscape.js App; Analytical Tools (Cytoscape, R, Python, etc.).

Limitations and future work
Our study has some limitations. First, our study is limited by the sources of data used. We used three databases (DrugBank, iRefIndex, and online Mendelian inheritance in man (OMIM)) to construct the initial knowledge graph. These databases are continuously changing and necessarily incomplete with respect to the total number of drugs, targets, protein interactions, diseases, and disease genes. For instance, as of August 15, 2016, DrugBank contains over 2,000 more FDA-approved drugs than the version that was initially used. Second, the focus of our work is on the potential repositioning of FDA-approved drugs, which means that tens of thousands of chemical compounds with protein binding activity cannot be considered as candidates in the current study. Third, our path expansion is currently limited to pairwise protein–protein interactions, which excludes interactions as a result of protein complexes or regulatory pathways. Having a more sophisticated understanding of non-direct interactions will help identify candidate drugs that can regulate entire pathways in a more rational manner. Additionally, we aim to incorporate knowledge of the complementarity of drug and disease gene expression patterns as evidenced by the connectivity map (Lamb et al., 2006), which could suggest therapeutic and adverse interactions. Finally, as we develop new hypotheses about potential new drug effects, we plan to test them using a new three-dimensional cellular microarray to perform high-throughput drug screening (Lee et al., 2008) with reference samples. The integration of computational predictions with a high-throughput screening platform will enable the systematic evaluation of any drug or mechanism of action against any disease or adverse event.

MATERIALS AND METHODS
This research project did not involve human subjects. The ReDrugS platform consists of a graphical web application, an application programming interface (API), and a knowledge base. The graphical web application enables users to initiate a search using drug, gene, and disease names and synonyms.
Users can then interact with the application to expand the network at an arbitrary number of interactions away from the entity of interest, and to filter the network based on a joint probability between the source and target entities. Drug–protein, protein–protein, and gene–disease interactions were obtained from several datasets and integrated into ontology-annotated, provenance- and evidence-bearing representations called nanopublications. The web application obtains information from the knowledge base using semantic web services. Finally, we evaluated our approach by examining the mechanistic plausibility of the drug in having melanoma-specific disease-modifying ability. We evaluated a large number of possible drug/disease associations with varying joint probabilities and interaction steps to determine the thresholds with the highest f-measure, resulting in our thresholds of three or fewer interactions and a joint probability of 0.93 or higher.

Figure 6 The authors demonstrate the ReDrugS user interface in the collaborative-research augmented immersive virtual environment (CRAIVE) lab at RPI.

Using the ReDrugS application page (http://redrugs.tw.rpi.edu), we initiated our search for “melanoma” and selected the first suggestion obtained from the experimental factor ontology (EFO) (http://www.ebi.ac.uk/efo/EFO_0000756). The application then provides the immediate neighborhood of drugs and genes that are associated with melanoma. We expanded the network by first selecting the melanoma node, expanding the link distance to |I| ≤ 3, and changing the minimum joint probability to p ≥ 0.93 in the search options. Importantly, we also limited the node type to “Drug.” Finally, we clicked on the “find incoming links” button (two left-facing arrows).
When finished, the network shows all drugs interacting with melanoma that meet the above criteria, as well as any intervening entities and their interactions. The resulting network can be downloaded as an image or as a summary CSV file. We used the CSV file to validate the links by searching Google Scholar and ClinicalTrials.gov for each proposed drug/disease combination. We consider a “hit” to be a pairing with a published positive experiment in vivo or in vitro or any pairing that has been tested in a clinical trial. While this level of validation does not guarantee efficacy, it does determine if the resulting connection is a plausible hypothesis that might be tested.

Data fusion
We developed a structured knowledge base containing data pertaining to drugs, targets, interactions, and diseases. We used five data sources: iRefIndex (Razick, Magklaras & Donaldson, 2008), DrugBank (Wishart et al., 2006), UniProt gene ontology annotations (GOA) (Camon et al., 2004), the online Mendelian inheritance in man (OMIM) (Hamosh et al., 2005), and the catalogue of somatic mutations in cancer (COSMIC) gene census (Futreal et al., 2004). iRefIndex contains protein–protein interactions and protein complexes and is an amalgam of the biomolecular interaction network database (Bader, Betel & Hogue, 2003), BioGRID (Stark et al., 2006), the comprehensive resource of mammalian protein complexes (Ruepp et al., 2010), database of interacting proteins (Xenarios et al., 2002), human protein reference database (Keshava Prasad et al., 2009), InnateDB (Lynn et al., 2008), IntAct (Kerrien et al., 2011), MatrixDB (Chautard et al., 2011), molecular interaction database (Chatr-aryamontri et al., 2008), MPact (Güldener et al., 2006), microbial protein interaction database (Goll et al., 2008), MIPS mammalian protein–protein interaction database (Pagel et al., 2005), and online predicted human interaction database (Brown & Jurisica, 2005).
DrugBank provides information about experimental/approved drugs and their targets, and UniProt GOA describes proteins in terms of their biological processes, cellular locations, and molecular functions. OMIM provides associations between genes and inherited or genetically-driven diseases. The COSMIC gene census is a curated list of genes that have causal associations with one or more cancer types.

Each association (e.g., drug–target, protein–protein, disease–gene) was captured using the nanopublication (Groth, Gibson & Velterop, 2010) scheme. A nanopublication is a digital artifact that consists of an assertion, its provenance, and information about the digital publication. Our nanopublications are represented as linked data: each data item is identified using a dereferenceable HTTP uniform resource identifier (URI) and statements are represented using RDF. Each nanopublication corresponds to a single interaction assertion from one of the databases. We used a number of automated scripts to produce the nanopublications and load them into the SPARQL endpoint. An example nanopublication is shown in Fig. 7. We used the SIO (Dumontier et al., 2014) as a global schema to describe the nature and components of the associations, and coupled this with the PSI-MI ontology (Hermjakob et al., 2004) to denote the types of interactions. We used the W3C’s PROV-O (Lebo, Sahoo & McGuinness, 2013) to capture the provenance of the assertion (which data source it originated from). We loaded our nanopublications into Blazegraph, an RDF database compatible with nanopublications. The data is accessed by the web application through Blazegraph’s native SPARQL endpoint.
Assertion probability
Each knowledge graph fragment, enclosed in a nanopublication, is assigned a probability based on the quality of the methods used to create the assertions in the fragment. We compute probabilities based on two different methods. Manually curated assertions, from DrugBank, OMIM, and COSMIC gene census, are directly given a probability p = 0.999. Assertions that have been derived from a specific experimental method are given probabilities appropriate for that method. These probabilities are derived from an expert-driven measure of the reliability of the experimental method used to derive the association. Factors involved in the assessment of confidence include the degree of indirection in the assay, the sensitivity and specificity of the approach, and reproducibility of results under different conditions based on the comparative analyses of techniques (Skrabanek et al., 2008; Sprinzak, Sattath & Margalit, 2003).

Two expert bioinformaticians rated the reliability of each method and assigned a score of 1–3, where 1 corresponds to low confidence and 3 to high confidence. After their initial assessment, they conferred on their reasoning for each score to resolve differences where possible. The experts considered level 1 to correspond to weak evidence that needs independent verification. Level 2 methods are generally reliable, but should have additional biological evidence. Level 3 methods are high-quality methods that produce few false positives. We calculated inter-annotator agreement between the two annotators over the three categories using Scott’s Pi. Scott’s Pi is similar to Cohen’s kappa in that it improves on simple observed agreement by factoring in the extent of agreement that might be expected by chance. We determined the agreement to be 0.56 (Scott’s Pi value of 0.26) across 104 experimental methods comprising 99.9999% of interaction annotations (Scott, 1955).
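As an illustration of the agreement statistic, the following Python sketch computes observed agreement and Scott’s Pi for two annotators using the standard definition (chance agreement is estimated from the pooled category proportions of both annotators). The function name `scotts_pi` and the two short rating lists are hypothetical; they are not the experts’ actual ratings of the 104 methods and do not reproduce the reported values.

```python
from collections import Counter


def scotts_pi(ratings_a, ratings_b):
    """Return (observed_agreement, pi) for two equal-length rating lists."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Fraction of items on which the two annotators gave the same score.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Pool both annotators' ratings to estimate each category's proportion.
    pooled = Counter(ratings_a) + Counter(ratings_b)
    expected = sum((count / (2 * n)) ** 2 for count in pooled.values())
    if expected == 1.0:
        raise ValueError("pi is undefined when only one category is used")
    return observed, (observed - expected) / (1 - expected)


# Invented confidence scores (1 = low, 3 = high) for eight methods.
RATER_A = [1, 2, 3, 3, 2, 1, 3, 2]
RATER_B = [1, 2, 3, 2, 2, 3, 3, 1]

if __name__ == "__main__":
    agreement, pi = scotts_pi(RATER_A, RATER_B)
    print(f"observed agreement = {agreement:.3f}, Scott's Pi = {pi:.3f}")
```

Unlike Cohen’s kappa, which uses each annotator’s own marginal distribution, this statistic assumes both annotators share one underlying distribution of scores.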
The scores of 1, 2, and 3 were then assigned provisional probabilities of p = 0.8, p = 0.95, and p = 0.99, respectively. We chose these probabilities as approximations of the conceptual levels of probability for each rating by the experts, and feel that those probabilities correspond to how often an experiment at that confidence level can be expected to be accurate. We plan to provide a more rigorous assessment of the accuracy of each method against gold standards in future work. These confidence values were encoded into an OWL ontology along with the evidence codes. The full inferences were extracted using Pellet (https://github.com/complexible/pellet) and loaded into the SPARQL endpoint, where they were used to apply the probabilities to each assertion in the knowledge graph that had experimental evidence.

Semantic web services
We developed four SADI web services (Wilkinson, Vandervalk & McCarthy, 2009) in Python¹ to support easy access to the nanopublications in ReDrugS; the four services are enumerated in Table 2. The first service is a simple free text lookup that takes a pml:Query² (McGuinness et al., 2007) with a prov:value as a query and produces a set of entities whose labels contain the substring. This is used for interactive typeahead completion of search terms so users can look up URIs and entities without needing to know the details. The other three SADI services look up interactions that contain a named entity. Two of them look at the entity to find upstream and downstream connections, and the third service assumes that the entity is a biological process and finds all interactions that relate to that process. The services return only one interaction for each triple (source, interaction type, target).
There are often multiple probabilities per interaction, and more than one interaction per interaction type. This is because the interaction may have been recorded in multiple databases, based on different experimental methods.

Figure 7 Representation of a protein/protein interaction within a nanopublication. Three graphs are represented. The assertion graph (NanoPub_501799_Assertion) states that an interaction (X) is of type sio:DirectInteraction, has the target SLC4A8, and has the participant CA2. The supporting graph (NanoPub_501799_Supporting) states that the assertion graph was generated by a pull down experiment (one of many encoded experiment types), a subclass of prov:Activity. The attribution graph (NanoPub_501799_Attribution), in turn, states that the assertion had a primary source of (Loiselle et al., 2004) and that the interaction was quoted from BioGrid.

Footnote 1: For further information on developing web services in Python using SADI, see this tutorial: https://github.com/markwilkinson/SADI-Semantic-Web-Services-Core/wiki/Building-Services-in-Python

Footnote 2: PML 3, in development: https://github.com/timrdf/pml. This includes PML 2 constructs that are not covered in PROV-O.

To provide a single probability score for each interaction of a source and target, the interactions are combined.
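As a rough illustration of how such a combination can be computed, the Python sketch below implements the three operations this part of the Methods describes: a geometric mean over the probabilities of one interaction record, a composite Z-score over several records of the same type, and a joint probability along a chain of interactions. The function names are hypothetical and not part of ReDrugS, and the sketch assumes that Q() and CDF() refer to the standard normal distribution, which the term "Z-score" suggests but the text does not state outright.

```python
from math import prod
from statistics import NormalDist, geometric_mean

# Assumption: Z-scores are taken with respect to the standard normal (mean 0, sd 1).
_STD_NORMAL = NormalDist()


def combine_within_record(probabilities):
    """Geometric mean of the probabilities attached to a single interaction."""
    return geometric_mean(probabilities)


def combine_records(probabilities):
    """Composite Z-score voting: CDF(sum_i Q(p_i)).

    Each probability must lie strictly between 0 and 1. Probabilities above
    0.5 reinforce each other; probabilities below 0.5 pull the total down.
    """
    z_total = sum(_STD_NORMAL.inv_cdf(p) for p in probabilities)
    return _STD_NORMAL.cdf(z_total)


def joint_probability(probabilities):
    """Probability that every interaction on a path holds (product rule)."""
    return prod(probabilities)


if __name__ == "__main__":
    # Two level-1 experiments (p = 0.8) reporting the same interaction.
    print(combine_records([0.8, 0.8]))
    # A two-step path whose edges have p = 0.99 and p = 0.95.
    print(joint_probability([0.99, 0.95]))
```

Under these assumptions, two agreeing low-confidence records yield a combined probability above either record alone, whereas the joint probability of a path can only decrease as more steps are added, which is why it can serve as a stopping threshold.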
A single probability is generated per identified interaction by taking the geometric mean of the probabilities for that interaction. However, this method is undesirable when combining multiple interaction records of the same type. We instead combine the interaction records using a form of probabilistic voting based on composite Z-scores. This models the idea that multiple experiments that produce the same results reinforce each other, and should therefore give a higher overall probability than would be indicated by taking their mean or even by Bayes' theorem. We do this by converting each probability into a Z-score (also known as a standard score) using the quantile function Q(), summing the values, and applying the cumulative distribution function CDF() to compute the corresponding probability:

$$P(x_{1 \ldots n}) = \mathrm{CDF}\left(\sum_{i=1}^{n} Q\big(P(x_i)\big)\right)$$

These composite Z-scores, which we transform back into probabilities, are frequently used to combine multiple indicators of the same underlying phenomenon, as in Moller et al. (1998). However, the strategy has a drawback: it does not account for multiple databases recording the same non-independent experiment. This can inflate the probabilities of interactions described by experiments that are published in more than one database.

Graph expansion using joint probability

In order to compute the probability that a given entity affects another, we compute the joint probability that each of the intervening interactions is true. Joint probability is the probability that every assertion in the set is true. This is computed by taking the product of the probabilities of each interaction, which treats the interactions as independent:

$$P(x_1 \wedge \ldots \wedge x_n) = \prod_{i=1}^{n} P(x_i)$$

This joint probability is used as a threshold that users can set to stop graph expansion. We also provide expansion limits using the number of interaction steps that are needed to connect the two entities.

Table 2 ReDrugS API SADI Web Services. The API endpoint prefix is http://redrugs.tw.rpi.edu/api/.
| Service name | Description | URL | Input | Output |
|---|---|---|---|---|
| Resource text search | Look up resources using free text search against their RDFS labels. This service is optimized for typeahead user interfaces. | search | pml:Query | pml:AnsweredQuery |
| Find interactions in a biological process | Find interactions whose participants or targets also participate in the input process. | process | sio:Process | sio:Process |
| Find upstream participants | Find interactions that the input entity is a target of and that have explicit participants. | upstream | sio:MaterialEntity | sio:Target |
| Find downstream targets | Find interactions that the input entity participates in and that have explicit targets. | downstream | sio:MaterialEntity | sio:Agent |

User interface

The user interface was developed using the above SADI web services and uses Cytoscape.js (http://cytoscape.github.io/cytoscape.js), angular.js (https://angularjs.org), and Bootstrap 3 (http://getbootstrap.com). An example network is shown in Fig. 2. Users can search for biological entities and processes, which can then be autocompleted to specific entities that are in the ReDrugS graph. Users can then add those entities and processes to the displayed graph, retrieve upstream and downstream connections, and link out to more details for every entity. Cytoscape.js is used as the main rendering and network visualization tool; it provides node and edge rendering, layout, and network analysis capabilities, and has been integrated into a customized rich web client.

In order to evaluate this knowledge graph, we developed a demonstration web interface (http://redrugs.tw.rpi.edu) based on the Cytoscape.js (http://cytoscape.github.io/cytoscape.js) JavaScript library. The interface lets users enter biological entity names. As the user types, the text is resolved to a list of entities.
The user finishes by selecting from the list and submitting the search. The search returns interactions and nodes associated with the selected entity, which are added to the Cytoscape.js graph. Users are also able to select nodes and populate upstream or downstream connections. Figure 2 is an example output of this process.

ACKNOWLEDGEMENTS

A special thanks to Pascale Gaudet who, with Michel Dumontier, evaluated the experimental methods and evidence codes listed in the protein/protein interaction ontology and gene ontology. Thank you also to Kusum Solanki and John Erickson for evaluation, feedback, and planning in the initial stages of this project.

ADDITIONAL INFORMATION AND DECLARATIONS

Funding
The authors received no funding for this work.

Competing Interests
The authors declare that they have no competing interests.

Author Contributions
• James P. McCusker conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, performed the computation work, reviewed drafts of the paper.
• Michel Dumontier conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, performed the computation work, reviewed drafts of the paper.
• Rui Yan performed the experiments, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, performed the computation work, reviewed drafts of the paper.
• Sylvia He contributed reagents/materials/analysis tools, prepared figures and/or tables, performed the computation work, reviewed drafts of the paper.
• Jonathan S. Dordick conceived and designed the experiments, reviewed drafts of the paper.
• Deborah L. McGuinness conceived and designed the experiments, wrote the paper, reviewed drafts of the paper.

Data Deposition
The following information was supplied regarding data availability:
Data can be found at https://data.rpi.edu/xmlui/handle/10833/1760.

REFERENCES

Bader GD, Betel D, Hogue CW. 2003. BIND: the biomolecular interaction network database. Nucleic Acids Research 31(1):248–250 DOI 10.1093/nar/gkg056.
Barth A, Wanek L, Morton D. 1995. Prognostic factors in 1,521 melanoma patients with distant metastases. Journal of the American College of Surgeons 181(3):193–201.
Bisgin H, Liu Z, Fang H, Kelly R, Xu X, Tong W. 2014. A phenome-guided drug repositioning through a latent variable model. BMC Bioinformatics 15(1):267 DOI 10.1186/1471-2105-15-267.
Bisgin H, Liu Z, Kelly R, Fang H, Xu X, Tong W. 2012. Investigating drug repositioning opportunities in FDA drug labels through topic modeling. BMC Bioinformatics 13(Suppl 1):S6 DOI 10.1186/1471-2105-13-s15-s6.
Brown KR, Jurisica I. 2005. Online predicted human interaction database. Bioinformatics 21(9):2076–2082 DOI 10.1093/bioinformatics/bti273.
Camon E, Magrane M, Barrell D, Lee V, Dimmer E, Maslen J, Binns D, Harte N, Lopez R, Apweiler R. 2004. The gene ontology annotation (GOA) database: sharing knowledge in UniProt with gene ontology. Nucleic Acids Research 32(Suppl 1):D262–D266 DOI 10.1093/nar/gkh021.
Chapman PB, Hauschild A, Robert C, Haanen JB, Ascierto P, Larkin J, Dummer R, Garbe C, Testori A, Maio M, Hogg D, Lorigan P, Lebbe C, Jouary T, Schadendorf D, Ribas A, O’Day SJ, Sosman JA, Kirkwood JM, Eggermont AM, Dreno B, Nolop K, Li J, Nelson B, Hou J, Lee RJ, Flaherty KT, McArthur GA. 2011. Improved survival with vemurafenib in melanoma with BRAF V600E mutation. New England Journal of Medicine 364(26):2507–2516.
Chatr-aryamontri A, Zanzoni A, Ceol A, Cesareni G. 2008. Searching the protein interaction space through the MINT database. In: Thompson JD, Ueffing M, Schaeffer-Reiss C, eds. Functional Proteomics: Methods and Protocols. Totowa: Humana Press, 305–317 DOI 10.1007/978-1-59745-398-1_20.
Chautard E, Fatoux-Ardore M, Ballut L, Thierry-Mieg N, Ricard-Blum S. 2011. MatrixDB, the extracellular matrix interaction database. Nucleic Acids Research 39(Suppl 1):D235–D240 DOI 10.1093/nar/gkq830.
Cheng F, Liu C, Jiang J, Lu W, Li W, Liu G, Zhou W, Huang J, Tang Y. 2012. Prediction of drug-target interactions and drug repositioning via network-based inference. PLoS Computational Biology 8(5):e1002503 DOI 10.1371/journal.pcbi.1002503.
Chiang AP, Butte AJ. 2009. Systematic evaluation of drug-disease relationships to identify leads for novel drug uses. Clinical Pharmacology and Therapeutics 86(5):507–510 DOI 10.1038/clpt.2009.103.
Clavo A, Wahl R. 1996. Effects of hypoxia on the uptake of tritiated thymidine, L-leucine, L-methionine and FDG in cultured cancer cells. Journal of Nuclear Medicine 37:502–506.
D’Alterio C, Barbieri A, Portella L, Palma G, Polimeno M, Riccio A, Ieranò C, Franco R, Scognamiglio G, Bryce J, Luciano A, Rea D, Arra C, Scala S. 2012. Inhibition of stromal CXCR4 impairs development of lung metastases. Cancer Immunology Immunotherapy 61(10):1713–1720 DOI 10.1007/s00262-012-1223-7.
Doudican N, Rodriguez A, Osman I, Orlow SJ. 2008. Mebendazole induces apoptosis via Bcl-2 inactivation in chemoresistant melanoma cells. Molecular Cancer Research 6(8):1308–1315 DOI 10.1158/1541-7786.mcr-07-2159.
Dumontier M, Baker CJ, Baran J, Callahan A, Chepelev L, Cruz-Toledo J, Del Rio NR, Duck G, Furlong LI, Keath N, Klassen D, McCusker JP, Queralt-Rosinach N, Samwald M, Villanueva-Rosales N, Wilkinson MD, Hoehndorf R. 2014. The semanticscience integrated ontology (SIO) for biomedical research and knowledge discovery. Journal of Biomedical Semantics 5(1):14 DOI 10.1186/2041-1480-5-14.
Emig D, Ivliev A, Pustovalova O, Lancashire L, Bureeva S, Nikolsky Y, Bessarabova M. 2013. Drug target prediction and repositioning using an integrated network-based approach. PLoS ONE 8(4):e60618 DOI 10.1371/journal.pone.0060618.
Fiorentini G, Aliberti C, Del CA, Tilli M, Rossi S, Ballardini P, Turrisi G, Benea G. 2009. Intra-arterial hepatic chemoembolization (TACE) of liver metastases from ocular melanoma with slow-release irinotecan-eluting beads. Early results of a phase II clinical study. In Vivo 23(1):131–137.
Futreal PA, Coin L, Marshall M, Down T, Hubbard T, Wooster R, Rahman N, Stratton MR. 2004. A census of human cancer genes. Nature Reviews Cancer 4(3):177–183 DOI 10.1038/nrc1299.
Goll J, Rajagopala SV, Shiau SC, Wu H, Lamb BT, Uetz P. 2008. MPIDB: the microbial protein interaction database. Bioinformatics 24(15):1743–1744 DOI 10.1093/bioinformatics/btn285.
Gottlieb A, Stein G, Ruppin E, Sharan R. 2011. PREDICT: a method for inferring novel drug indications with application to personalized medicine. Molecular Systems Biology 7:496 DOI 10.1038/msb.2011.26.
Greenberg LH. 1965. Audiotoxicity and nephrotoxicity due to orally administered neomycin. JAMA 194(7):827–828 DOI 10.1001/jama.194.7.827.
Groth P, Gibson A, Velterop J. 2010. The anatomy of a nanopublication. Information Services and Use 30(1):51–56.
Grover MP, Ballouz S, Mohanasundaram KA, George RA, Sherman CDH, Crowley TM, Wouters MA. 2014. Identification of novel therapeutics for complex diseases from genome-wide association data. BMC Medical Genomics 7(Suppl 1):S8 DOI 10.1186/1755-8794-7-s1-s8.
Güldener U, Münsterkötter M, Oesterheld M, Pagel P, Ruepp A, Mewes H-W, Stümpflen V. 2006. MPact: the MIPS protein interaction resource on yeast. Nucleic Acids Research 34(Suppl 1):D436–D441 DOI 10.1093/nar/gkj003.
Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA. 2005. Online Mendelian inheritance in man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Research 33(Suppl 1):D514–D517.
Harris S, Seaborne A, Prud’hommeaux E. 2013. SPARQL 1.1 query language. W3C Recommendation, 21. Available at https://www.w3.org/TR/sparql11-query/.
Harrold JM, Ramanathan M, Mager DE. 2013. Network-based approaches in drug discovery and early development. Clinical Pharmacology and Therapeutics 94(6):651–658 DOI 10.1038/clpt.2013.176.
Hauschild A, Grob J-J, Demidov LV, Jouary T, Gutzmer R, Millward M, Rutkowski P, Blank CU, Miller WH, Kaempgen E, Martín-Algarra S, Karaszewska B, Mauch C, Chiarion-Sileni V, Martin A-M, Swann S, Haney P, Mirakhur B, Guckert ME, Goodman V, Chapman PB. 2012. Dabrafenib in BRAF-mutated metastatic melanoma: a multicentre open-label, phase 3 randomised controlled trial. Lancet 380(9839):358–365 DOI 10.1016/s0140-6736(12)60868-x.
Held MA, Langdon CG, Platt JT, Graham-Steed T, Liu Z, Chakraborty A, Bacchiocchi A, Koo A, Haskins JW, Bosenberg MW, Stern DF. 2012. Genotype-selective combination therapies for melanoma identified by high-throughput drug screening. Cancer Discovery 3(1):52–67 DOI 10.1158/2159-8290.cd-12-0408.
Hermjakob H, Montecchi-Palazzi L, Bader G, Wojcik J, Salwinski L, Ceol A, Moore S, Orchard S, Sarkans U, von Mering C, Roechert B, Poux S, Jung E, Mersch H, Kersey P, Lappe M, Li Y, Zeng R, Rana D, Nikolski M, Husi H, Brun C, Shanker K, Grant SGN, Sander C, Bork P, Zhu W, Pandey A, Brazma A, Jacq B, Vidal M, Sherman D, Legrain P, Cesareni G, Xenarios I, Eisenberg D, Steipe B, Hogue C, Apweiler R. 2004. The HUPO PSI’s molecular interaction format—a community standard for the representation of protein interaction data. Nature Biotechnology 22(2):177–183 DOI 10.1038/nbt926.
Homsi J, Cubitt CL, Zhang S, Munster PN, Yu H, Sullivan DM, Jove R, Messina JL, Daud AI. 2009. Src activation in melanoma and Src inhibitors as therapeutic agents in melanoma. Melanoma Research 19(3):167–175 DOI 10.1097/cmr.0b013e328304974c.
Humer J, Ferko B, Waltenberger A, Rapberger R, Pehamberger H, Muster T. 2008. Azidothymidine inhibits melanoma cell growth in vitro and in vivo. Melanoma Research 18(5):314–321 DOI 10.1097/cmr.0b013e32830aaaa6.
Istituto Clinico Humanitas. 2015. Regorafenib in patients with metastatic solid tumors who have progressed after standard therapy (RESOUND). Available at https://clinicaltrials.gov/ct2/show/NCT02307500 (accessed 10 January 2016).
Kerrien S, Aranda B, Breuza L, Bridge A, Broackes-Carter F, Chen C, Duesbury M, Dumousseau M, Feuermann M, Hinz U, Jandrasits C, Jimenez RC, Khadake J, Mahadevan U, Masson P, Pedruzzi I, Pfeiffenberger E, Porras P, Raghunath A, Roechert B, Orchard S, Hermjakob H. 2011. The IntAct molecular interaction database in 2012. Nucleic Acids Research 40:D841–D846 DOI 10.1093/nar/gkr1088.
Keshava Prasad TS, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S, Telikicherla D, Raju R, Shafreen B, Venugopal A, Balakrishnan L, Marimuthu A, Banerjee S, Somanathan DS, Sebastian A, Rani S, Ray S, Harrys Kishore CJ, Kanth S, Ahmed M, Kashyap MK, Mohmood R, Ramachandra YL, Krishna V, Rahiman BA, Mohan S, Ranganathan P, Ramabadran S, Chaerkady R, Pandey A. 2009. Human protein reference database—2009 update. Nucleic Acids Research 37(Suppl 1):D767–D772 DOI 10.1093/nar/gkn892.
Kim KB, Kefford R, Pavlick AC, Infante JR, Ribas A, Sosman JA, Fecher LA, Millward M, McArthur GA, Hwu P, Gonzalez R, Ott PA, Long GV, Gardner OS, Ouellet D, Xu Y, DeMarini DJ, Le NT, Patel K, Lewis KD. 2012. Phase II study of the MEK1/MEK2 inhibitor trametinib in patients with metastatic BRAF-mutant cutaneous melanoma previously treated with or without a BRAF inhibitor. Journal of Clinical Oncology 31(4):482–489 DOI 10.1200/jco.2012.43.5966.
Kim S, Liu Y, Gaber MW, Bumgardner JD, Haggard WO, Yang Y. 2008. Development of chitosan-ellagic acid films as a local drug delivery system to induce apoptotic death of human melanoma cells. Journal of Biomedical Materials Research Part B: Applied Biomaterials 90B(1):145–155 DOI 10.1002/jbm.b.31266.
Kingsmore SF, Lindquist IE, Mudge J, Gessler DD, Beavis WD. 2008. Genome-wide association studies: progress and potential for drug discovery and development. Nature Reviews Drug Discovery 7(3):221–230 DOI 10.1038/nrd2519.
Kraut EH, Walker MJ, Staubus A, Gochnour D, Balcerzak SP. 1997. Phase II trial of topotecan in malignant melanoma. Cancer Investigation 15(4):318–320 DOI 10.3109/07357909709039732.
Krauthammer M, Kong Y, Bacchiocchi A, Evans P, Pornputtapong N, Wu C, McCusker J, Ma S, Cheng E, Straub R, Serin M, Bosenberg M, Ariyan S, Narayan D, Sznol M, Kluger H, Mane S, Schlessinger J, Lifton R, Halaban R. 2015. Exome sequencing identifies recurrent mutations in NF1 and RASopathy genes in sun-exposed melanomas. Nature Genetics 47(9):996–1002 DOI 10.1038/ng.3361.
Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ, Lerner J, Brunet J-P, Subramanian A, Ross KN, Reich M, Hieronymus H, Wei G, Armstrong SA, Haggarty SJ, Clemons PA, Wei R, Carr SA, Lander ES, Golub TR. 2006. The connectivity map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313(5795):1929–1935 DOI 10.1126/science.1132939.
Le K, Blomain ES, Rodeck U, Aplin AE. 2013. Selective RAF inhibitor impairs ERK1/2 phosphorylation and growth in mutant NRAS vemurafenib-resistant melanoma cells. Pigment Cell & Melanoma Research 26(4):509–517 DOI 10.1111/pcmr.12092.
Lebo T, Sahoo S, McGuinness D. 2013. PROV-O: the PROV ontology. Available at http://www.w3.org/TR/prov-o/.
Lee M-Y, Kumar RA, Sukumaran SM, Hogg MG, Clark DS, Dordick JS. 2008. Three-dimensional cellular microarray for high-throughput toxicology assays. Proceedings of the National Academy of Sciences of the United States of America 105(1):59–63 DOI 10.1073/pnas.0708756105.
Lemontt J, Azzaria M, Gros P. 1988. Increased mdr gene expression and decreased drug accumulation in multidrug-resistant human melanoma cells. Cancer Research 48(22):6348–6353.
Loiselle FB, Morgan PE, Alvarez BV, Casey JR. 2004. Regulation of the human NBC3 Na+/HCO3- cotransporter by carbonic anhydrase II and PKA. American Journal of Physiology-Cell Physiology 286(6):C1423–C1433 DOI 10.1152/ajpcell.00382.2003.
Luikart S, Kennealey G, Kirkwood J. 1984. Randomized phase III trial of vinblastine, bleomycin, and cis-dichlorodiammine-platinum versus dacarbazine in malignant melanoma. Journal of Clinical Oncology 2(3):164–168.
Lynn DJ, Winsor GL, Chan C, Richard N, Laird MR, Barsky A, Gardy JL, Roche FM, Chan THW, Shah N, Lo R, Naseer M, Que J, Yau M, Acab M, Tulpan D, Whiteside MD, Chikatamarla A, Mah B, Munzner T, Hokamp K, Hancock REW, Brinkman FSL. 2008. InnateDB: facilitating systems-level analyses of the mammalian innate immune response. Molecular Systems Biology 4(1):218 DOI 10.1038/msb.2008.55.
Mansuy M, Nikkels-Tassoudji N, Arrese JE, Rorive A, Nikkels AF. 2014. Recurrent in situ melanoma successfully treated with ingenol mebutate. Dermatology and Therapy 4(1):131–135 DOI 10.1007/s13555-014-0051-4.
Maraveyas A, Johnson MJ, Xiao YP, Noble S. 2010. Malignant melanoma as a target malignancy for the study of the anti-metastatic properties of the heparins. Cancer and Metastasis Reviews 29(4):777–784 DOI 10.1007/s10555-010-9263-y.
McGuinness DL, Ding L, Silva PPD, Chang C. 2007. PML 2: a modular explanation interlingua. In: Proceedings of the AAAI 2007 Workshop on Explanation-Aware Computing, Vancouver, 22–23.
Moller JT, Cluitmans P, Rasmussen LS, Houx P, Rasmussen H, Canet J, Rabbitt P, Jolles J, Larsen K, Hanning CD, Langeron O, Johnson T, Lauven PM, Kristensen PA, Biedler A, van Beem H, Fraidakis O, Silverstein JH, Beneken JEW, Gravenstein JS. 1998. Long-term postoperative cognitive dysfunction in the elderly: ISPOCD1 study. Lancet 351(9106):857–861 DOI 10.1016/s0140-6736(97)07382-0.
Motik B, Patel-Schneider PF, Cuenca Grau B. 2009. OWL 2 Web Ontology Language: Direct Semantics. Available at https://www.w3.org/TR/owl2-direct-semantics/.
Nagy Z, Turcsik V, Blaskó G. 2009. The effect of LMWH (nadroparin) on tumor progression. Pathology & Oncology Research 15(4):689–692 DOI 10.1007/s12253-009-9204-7.
Naing A. 2011. Phase I dose escalation study of sodium stibogluconate (SSG) a protein tyrosine phosphatase inhibitor, combined with interferon alpha for patients with solid tumors. Journal of Cancer 2:81–89 DOI 10.7150/jca.2.81.
National Cancer Institute. 2005. Carboplatin and paclitaxel with or without sorafenib tosylate in treating patients with stage III or stage IV melanoma that cannot be removed by surgery. Available at https://clinicaltrials.gov/ct2/show/NCT00110019 (accessed 10 January 2016).
Pagel P, Kovac S, Oesterheld M, Brauner B, Dunger-Kaltenbach I, Frishman G, Montrone C, Mark P, Stümpflen V, Mewes H-W, Ruepp A, Frishman D. 2005. The mips mammalian protein-protein interaction database. Bioinformatics 21(6):832–834 DOI 10.1093/bioinformatics/bti115.
Pardo OE, Wellbrock C, Khanzada UK, Aubert M, Arozarena I, Davidson S, Bowen F, Parker PJ, Filonenko VV, Gout IT, Sebire N, Marais R, Downward J, Seckl MJ. 2006. FGF-2 protects small cell lung cancer cells from apoptosis through a complex involving PKCepsilon, B-Raf and S6K2. EMBO Journal 25(13):3078–3088 DOI 10.1038/sj.emboj.7601198.
Patel K, Doudican NA, Schiff PB, Orlow SJ. 2011. Albendazole sensitizes cancer cells to ionizing radiation. Radiation Oncology 6(1):160 DOI 10.1186/1748-717x-6-160.
Peng X, Wang F, Li L, Bum-Erdene K, Xu D, Wang B, Sinn AA, Pollok KE, Sandusky GE, Li L, Turchi JJ, Jalal SI, Meroueh SO. 2014. Exploring a structural protein–drug interactome for new therapeutics in lung cancer. Molecular BioSystems 10(3):581 DOI 10.1039/c3mb70503j.
Razick S, Magklaras G, Donaldson IM. 2008. iRefIndex: a consolidated protein interaction database with provenance. BMC Bioinformatics 9(1):405 DOI 10.1186/1471-2105-9-405.
Richard C, David W, Markus L. 2014. RDF 1.1 Concepts and Abstract Syntax. W3C Recommendation. Available at https://www.w3.org/TR/rdf11-concepts/.
Ruepp A, Waegele B, Lechner M, Brauner B, Dunger-Kaltenbach I, Fobo G, Frishman G, Montrone C, Mewes H-W. 2010. Corum: the comprehensive resource of mammalian protein complexes—2009. Nucleic Acids Research 38(Suppl 1):D497–D501 DOI 10.1093/nar/gkp914.
Sanseau P, Koehler J. 2011. Editorial: computational methods for drug repurposing. Briefings in Bioinformatics 12(4):301–302 DOI 10.1093/bib/bbr047.
Sawada N, Kataoka K, Kondo K, Arimochi H, Fujino H, Takahashi Y, Miyoshi T, Kuwahara T, Monden Y, Ohnishi Y. 2004. Betulinic acid augments the inhibitory effects of vincristine on growth and lung metastasis of B16F10 melanoma cells in mice. British Journal of Cancer 90(8):1672–1678 DOI 10.1038/sj.bjc.6601746.
Scott WA. 1955. Reliability of content analysis: the case of nominal scale coding. Public Opinion Quarterly 19(3):321–325 DOI 10.1086/266577.
Shen M, Zhang Y, Saba N, Austin CP, Wiestner A, Auld DS. 2013. Identification of therapeutic candidates for chronic lymphocytic leukemia from a library of approved drugs. PLoS ONE 8(9):e75252 DOI 10.1371/journal.pone.0075252.
Sirota M, Dudley JT, Kim J, Chiang AP, Morgan AA, Sweet-Cordero A, Sage J, Butte AJ. 2011. Discovery and preclinical validation of drug indications using compendia of public gene expression data. Science Translational Medicine 3(96):96ra77 DOI 10.1126/scitranslmed.3001318.
Skrabanek L, Saini HK, Bader GD, Enright AJ. 2008. Computational prediction of protein-protein interactions. Molecular Biotechnology 38(1):1–17 DOI 10.1007/s12033-007-0069-2.
Smalley KSM, Contractor R, Haass NK, Kulp AN, Atilla-Gokcumen GE, Williams DS, Bregman H, Flaherty KT, Soengas MS, Meggers E, Herlyn M. 2007. An organometallic protein kinase inhibitor pharmacologically activates p53 and induces apoptosis in human melanoma cells. Cancer Research 67(1):209–217 DOI 10.1158/0008-5472.can-06-1538.
Sprinzak E, Sattath S, Margalit H. 2003. How reliable are experimental protein–protein interaction data? Journal of Molecular Biology 327(5):919–923 DOI 10.1016/s0022-2836(03)00239-0.
Stark C, Breitkreutz B-J, Reguly T, Boucher L, Breitkreutz A, Tyers M. 2006. BioGRID: a general repository for interaction datasets. Nucleic Acids Research 34(Suppl 1):D535–D539 DOI 10.1093/nar/gkj109.
Vogt I, Prinz J, Campillos M. 2014. Molecularly and clinically related drugs and diseases are enriched in phenotypically similar drug-disease pairs. Genome Medicine 6(7):52 DOI 10.1186/s13073-014-0052-z.
Whitehead RP, Moon J, McCachren SS, Hersh EM, Samlowski WE, Beck JT, Tchekmedyian NS, Sondak VK. 2004. A Phase II trial of vinorelbine tartrate in patients with disseminated malignant melanoma and one prior systemic therapy. Cancer 100(8):1699–1704 DOI 10.1002/cncr.20183.
Wilkinson M, Vandervalk B, McCarthy L. 2009. SADI Semantic Web Services–cause you can’t always GET what you want! In: 2009 IEEE Asia-Pacific Services Computing Conference (APSCC). Singapore: IEEE, 13–18.
Wishart DS, Knox C, Guo AC, Shrivastava S, Hassanali M, Stothard P, Chang Z, Woolsey J. 2006. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Research 34(Suppl 1):D668–D672 DOI 10.1093/nar/gkj067.
Wu C, Gudivada RC, Aronow BJ, Jegga AG. 2013. Computational drug repositioning through heterogeneous network clustering. BMC Systems Biology 7(Suppl 5):S6 DOI 10.1186/1752-0509-7-s5-s6.
Wu Z, Wang Y, Chen L. 2013. Network-based drug repositioning. Molecular BioSystems 9(6):1268–1281 DOI 10.1039/c3mb25382a.
Xenarios I, Salwinski L, Duan XJ, Higney P, Kim S-M, Eisenberg D. 2002. DIP, the database of interacting proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Research 30(1):303–305 DOI 10.1093/nar/30.1.303.
Yang L, Agarwal P. 2011. Systematic drug repositioning based on clinical side-effects. PLoS ONE 6(12):e28025 DOI 10.1371/journal.pone.0028025.
Ye H, Liu Q, Wei J. 2014. Construction of drug network based on side effects and its application for drug repositioning. PLoS ONE 9(2):e87864 DOI 10.1371/journal.pone.0087864.