key: cord-0127077-59sc7y7n authors: Kanatsoulis, Charilaos I.; Sidiropoulos, Nicholas D. title: TeX-Graph: Coupled tensor-matrix knowledge-graph embedding for COVID-19 drug repurposing date: 2020-10-22 journal: nan DOI: nan sha: de650a3b2885b85cc9895b5c2c5a67ca70a79d09 doc_id: 127077 cord_uid: 59sc7y7n

Knowledge graphs (KGs) are powerful tools that codify relational behaviour between entities in knowledge bases. KGs can simultaneously model many different types of subject-predicate-object and higher-order relations. As such, they offer a flexible modeling framework that has been applied to many areas, including biology and pharmacology and, most recently, the fight against COVID-19. The flexibility of KG modeling is both a blessing and a challenge from the learning point of view. In this paper we propose a novel coupled tensor-matrix framework for KG embedding. We leverage tensor factorization tools to learn concise representations of entities and relations in knowledge bases and employ these representations to perform drug repurposing for COVID-19. Our proposed framework is principled, elegant, and achieves 100% improvement over the best baseline in the COVID-19 drug repurposing task using a recently developed biological KG.

How does COVID-19 relate to better-studied viral infections and biological mechanisms? Can we use existing drugs to effectively treat COVID-19 symptoms? Since the COVID-19 pandemic has disrupted our lives, there is a pressing need to answer such questions, and COVID-19 research has swiftly risen to the top of the scientific agenda worldwide. While these questions will ultimately be answered by medical experts, data-driven methods can help to cut down the immense search space, thus helping to accelerate progress and optimize the allocation of precious research resources. In this paper, our goal is to derive such a method by using network science and multi-view analysis tools.
* Electrical and Computer Engineering Department, University of Minnesota. Email: kanat003@umn.edu
† Electrical and Computer Engineering Department, University of Virginia. Email: nikos@virginia.edu

Networks are powerful abstractions that model interactions between the entities of a system [3]. Networks and network science offer concise and informative modeling, analysis and processing for various biological, engineering and social systems, to name a few [11, 22]. Networks are usually represented by graphs, which are defined by a set of nodes and a set of edges connecting pairs of nodes. The entities of a system are usually represented by the nodes of the graph, and the interactions by the edges.

A knowledge graph (KG) models the relational behavior of various entities in knowledge bases. A KG is heterogeneous in the sense that it models interactions between entities of different types, e.g., drugs and diseases, and is also a multidimensional network (edge-labeled multi-graph) [4], since the edges (interactions) that connect the nodes (entities) can be multiple and of different types. KGs have recently attracted significant attention due to their applicability to various science and engineering tasks. Popular knowledge graphs include YAGO [32], DBpedia [1], NELL [8], Freebase [5], and the Google KG [29]. A recent trend codifies knowledge bases of biomedical components and processes, such as genes, diseases and drugs, into KGs, e.g., [14, 15, 17]. KGs can model any relations of the form subject-predicate-object, as well as higher-order generalizations. However, this broad modeling freedom can sometimes be a challenge, as the entities can be very diverse and the dimensions of the KG can become prohibitively large. A common way to exploit KGs for data mining and machine learning applications is via knowledge graph embedding.
KG embedding aims to extract meaningful low dimensional representations of the entities and the relations present in a KG. A plethora of methods have been proposed to perform KG embedding [2, 6, 12, 18, 20, 21, 23, 25, 26, 30, 33, 34]. The most popular among them adopt a single-layer perceptron or neural network approach, e.g., [6, 21, 30, 33, 34]. Various tensor factorization models have also been proposed, e.g., [2, 12, 20, 23, 25]. Matrix factorization is also a tool that has been utilized for KG embedding, e.g., [18, 26].

In this paper we propose TeX-Graph, a novel coupled tensor-matrix framework to perform KG embedding. The proposed KG coupled tensor-matrix modeling extracts meaningful information from a set of diverse entities with multi-modal interactions in a principled and concise manner. TeX-Graph avoids modeling inefficiencies in previously proposed tensor models, and relative to neural network approaches it offers a principled and effective way to produce unique KG representations. The proposed framework is used for drug repurposing, a pivotal tool in the fight against COVID-19 and other diseases. Learning concise representations for drug compounds, diseases, and the relations between them, our approach allows for link prediction between drug compounds and COVID-19 or other diseases. The impact is critical. First, compound repurposing enables drug design that drastically reduces the design exploration cycle and the failure rate. Second, it markedly reduces drug development cost, as developing new therapeutic drugs is tremendously expensive.

The contributions of our work can be summarized as follows:
• Novel KG modeling: We propose a principled coupled tensor-matrix model tailored to KG needs for efficient and parsimonious representations.
• Analysis: The TeX-Graph embeddings are unique and permutation invariant, a property which is important for consistency and necessary for interpretability.
• Algorithm: We design a scalable algorithmic framework with lightweight updates that can effectively handle very large KGs.
• Application: The proposed framework is developed to perform drug repurposing, a pivotal task in the fight against COVID-19.
• Performance: TeX-Graph achieves 100% performance improvement compared to the best available baseline for COVID-19 drug repurposing using a recently developed COVID-19 KG.

Reproducibility: The DRKG dataset used in the experiments is publicly available; we will also release our code upon publication of the paper.

Notation: Our notation is summarized in Table 1.

Table 1: Notation.
$x, y, z$: scalars
$(m, n), (h, r, t)$: ordered tuple
$\mathbf{x}, \mathbf{y}, \mathbf{z}$: vectors
$\lfloor x \rfloor$: largest integer that is less than or equal to $x$
nnz: number of non-zeros

Figure 1: The columns, rows and fibers of a third-order tensor.
Figure 2: The horizontal, vertical, and frontal slabs of a third-order tensor.

To facilitate the upcoming discussion, we now review some tensor algebra preliminaries. For more background on tensors the reader is referred to [19, 28]. A third-order tensor $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$ is a three-way array indexed by $i, j, k$ with elements $\mathcal{X}(i, j, k)$. It has three mode types: columns $\mathcal{X}(:, j, k)$ (where : denotes all relevant index values for that mode, i.e., from beginning to end), rows $\mathcal{X}(i, :, k)$, and fibers $\mathcal{X}(i, j, :)$; see Fig. 1. A third-order tensor can also be described by a set of matrices (slabs), i.e., horizontal slabs $\mathcal{X}(i, :, :)$, vertical slabs $\mathcal{X}(:, j, :)$ and frontal slabs $\mathcal{X}(:, :, k)$; see Fig. 2. A rank-one third-order tensor $\mathcal{Z} \in \mathbb{R}^{I \times J \times K}$ is defined as the outer product of three vectors. Recall that a rank-one matrix is the outer product of two vectors. Any third-order tensor can be decomposed into a sum of three-way outer products (rank-one tensors) as:

$$\mathcal{X} = \sum_{f=1}^{F} A(:, f) \circ B(:, f) \circ C(:, f),$$

where $\circ$ denotes the outer product and $A \in \mathbb{R}^{I \times F}$, $B \in \mathbb{R}^{J \times F}$, $C \in \mathbb{R}^{K \times F}$ are matrices collecting the respective mode factors, i.e., they hold in their columns the vectors involved in the $F$ three-way outer products.
The above expression is known as the polyadic decomposition (PD) of a third-order tensor. If $F$ is the minimum number of outer products required to synthesize $\mathcal{X}$, then $F$ is the rank of tensor $\mathcal{X}$ and the decomposition is known as the canonical polyadic decomposition (CPD) or parallel factor analysis (PARAFAC) [13]. For the rest of the paper we use the notation $\mathcal{X} = [\![A, B, C]\!]$ to denote the CPD of the tensor. A striking property of the CPD is that it is essentially unique under mild conditions, even if the rank is higher than the dimensions; see [9] for a generic result.

A tensor can be represented in matrix form by employing the matricization operation. There are three common ways to matricize (or unfold) a third-order tensor, by stacking columns, rows, or fibers of the tensor to form a matrix. To be more precise, let

$$X_k = \mathcal{X}(:, :, k), \quad k = 1, \dots, K,$$

where $X_k$ are the frontal slabs of tensor $\mathcal{X}$; in the context of this paper they model adjacency matrices. Then the mode-1, mode-2 and mode-3 unfoldings of $\mathcal{X}$ are:

$$X^{(1)} = \begin{bmatrix} X_1^T \\ \vdots \\ X_K^T \end{bmatrix} = (C \odot B) A^T, \qquad X^{(2)} = \begin{bmatrix} X_1 \\ \vdots \\ X_K \end{bmatrix} = (C \odot A) B^T, \qquad X^{(3)} = \begin{bmatrix} X_1(:, 1) & \cdots & X_K(:, 1) \\ \vdots & & \vdots \\ X_1(:, J) & \cdots & X_K(:, J) \end{bmatrix} = (B \odot A) C^T,$$

where $\odot$ denotes the Khatri-Rao (column-wise Kronecker) product.

Another important tensor model is the coupled CPD. In coupled CPD we are interested in decomposing an array of tensors that share at least one common latent factor. In particular, consider a collection of $N$ tensors $\mathcal{X}_n \in \mathbb{R}^{I \times J_n \times K_n}$, $n \in \{1, \dots, N\}$. The rank-$F$ coupled CPD of $\{\mathcal{X}_n\}$ can be expressed as:

$$\mathcal{X}_n = [\![A, B_n, C_n]\!], \quad n = 1, \dots, N,$$

where $A \in \mathbb{R}^{I \times F}$ is the common factor and $B_n \in \mathbb{R}^{J_n \times F}$, $C_n \in \mathbb{R}^{K_n \times F}$ are unshared factors. The coupled CPD is also unique under certain conditions, even if the individual CPDs of $\mathcal{X}_n$ are not unique. In this work we will use the following uniqueness theorem for coupled CPD: where $G$ is defined as: and $C_2(C_n)$ is the second compound matrix of $C_n$; see [10, p. 861-862]. In the context of coupled CPD, essential uniqueness corresponds to $A$ being unique and $\{B_n, C_n\}$ being identifiable up to column scaling and counter-scaling.

As mentioned in the introduction, knowledge graphs (KGs) have attracted significant attention over the past decade due to their tremendous modeling capabilities.
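As a quick numerical illustration of the tensor preliminaries, the following NumPy sketch builds a small rank-$F$ tensor from random factors and checks the frontal-slab and unfolding identities. The helper names `khatri_rao` and `cpd_tensor`, the toy dimensions, and the specific unfolding convention (stacking frontal slabs, with column-major vectorization for mode 3) are assumptions made for this illustration; other orderings of the unfoldings are equally valid.

```python
import numpy as np

def khatri_rao(P, Q):
    """Column-wise Kronecker product of P (m x F) and Q (n x F), giving (m*n) x F."""
    m, F = P.shape
    n, F2 = Q.shape
    assert F == F2, "both factors need the same number of columns"
    return (P[:, None, :] * Q[None, :, :]).reshape(m * n, F)

def cpd_tensor(A, B, C):
    """Synthesize X(i, j, k) = sum_f A(i, f) * B(j, f) * C(k, f)."""
    return np.einsum("if,jf,kf->ijk", A, B, C)

rng = np.random.default_rng(0)
I, J, K, F = 4, 5, 3, 2  # toy sizes, chosen arbitrarily
A = rng.standard_normal((I, F))
B = rng.standard_normal((J, F))
C = rng.standard_normal((K, F))
X = cpd_tensor(A, B, C)

# Each frontal slab satisfies X_k = A diag(C(k, :)) B^T.
slab_ok = all(
    np.allclose(X[:, :, k], A @ np.diag(C[k]) @ B.T) for k in range(K)
)

# Unfoldings built by stacking frontal slabs (assumed convention).
X1 = np.vstack([X[:, :, k].T for k in range(K)])  # (J*K) x I
X2 = np.vstack([X[:, :, k] for k in range(K)])    # (I*K) x J
X3 = np.stack(
    [X[:, :, k].flatten(order="F") for k in range(K)], axis=1
)                                                 # (I*J) x K

mode1_ok = np.allclose(X1, khatri_rao(C, B) @ A.T)
mode2_ok = np.allclose(X2, khatri_rao(C, A) @ B.T)
mode3_ok = np.allclose(X3, khatri_rao(B, A) @ C.T)
print(slab_ok, mode1_ok, mode2_ok, mode3_ok)
```

The same identities are what make alternating updates of one factor at a time a linear least-squares problem in that factor.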
In particular, KGs model triplets of subject-predicate-object, denoted as (head, relation, tail) or (h, r, t). Subjects (heads) and objects (tails) are entities that are represented as graph nodes, and predicates (relations) define the type of edge according to which the subject is connected to the object. A schematic representation of a KG, which models relations between genes, compounds and diseases, is presented in Fig. 3.

In this paper, we focus our attention on a biological KG that models relational triplets between biological entities. For example, (compound 1, interacts with, compound 2), (compound 1, activates, gene 1), (gene 1, regulates, gene 2), (compound 1, prevents, disease 1), (gene 2, is linked with, disease 2) are common triplets in numerous recently developed knowledge bases [14, 15, 17]. Modeling these types of relations as a KG enables embedding entities and relations in a Euclidean space, which can further facilitate any type of processing and analysis. For instance, obtaining a low dimensional representation of compounds, diseases and the 'prevents' relation allows measuring similarity, and thus predicting and testing hypotheses regarding (compound, prevents, disease) interactions. Drug repurposing can be performed by predicting candidate compounds for new and existing target diseases. Note that the proposed framework, to be introduced shortly, is not limited to biological KGs; it can be applied to a wide variety of interesting KGs.

Various methods have been proposed to learn low dimensional representations of KGs. To properly describe them we need to define the score function $f(\cdot)$ and the loss function $L(\cdot)$. Let $(h_n, r_n, t_n)$ be an available triplet and $\mathbf{h}_n \in \mathbb{R}^F$, $\mathbf{t}_n \in \mathbb{R}^F$ and $\mathbf{r}_n \in \mathbb{R}^d$ be the low dimensional embeddings we aim to learn. Note that entity and relation embeddings need not be of the same dimension. The score function determines the relation model between the head (subject) and the tail (object).
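In simple words, high values of the score function $f(\mathbf{h}_n, \mathbf{r}_n, \mathbf{t}_n)$ are desirable for existing triplets $(h_n, r_n, t_n)$ and low values of $f(\mathbf{h}_n, \mathbf{r}_n, \mathbf{t}_n)$ for non-existing ones. In order to produce the entity and relational embeddings we define the following forward model for each triplet $(h_n, r_n, t_n)$:

$$\hat{y}_n = f\big(\gamma(W_e^T \mathbf{e}_{h_n}),\ \delta(W_r^T \mathbf{e}_{r_n}),\ \gamma(W_e^T \mathbf{e}_{t_n})\big),$$

where $\mathbf{e}_{h_n}, \mathbf{e}_{t_n} \in \{0, 1\}^{L_e}$ and $\mathbf{e}_{r_n} \in \{0, 1\}^{K_r}$ are one-hot input vectors corresponding to the head, tail and relation index of the triplet $(h_n, r_n, t_n)$ respectively, with $L_e$, $K_r$ being the total number of entities (nodes) and types of relations; $\gamma(\cdot)$ and $\delta(\cdot)$ are element-wise functions and $W_e \in \mathbb{R}^{L_e \times F}$, $W_r \in \mathbb{R}^{K_r \times d}$ are matrices that contain the model parameters to be learned. Popular choices for $\gamma(\cdot)$ and $\delta(\cdot)$ are the identity function and the hyperbolic tangent. If $\gamma(\cdot)$ or $\delta(\cdot)$ are identity functions then the rows of $W_e$ or $W_r$ are the learned embeddings for entities and relations respectively. For TransE, DistMult and RotatE $F = d$, whereas for TransR and RESCAL $d \neq F$. In the TransR model $M_r \in \mathbb{R}^{d \times F}$ is a projection matrix associated with relation $r$, and in RESCAL $R \in \mathbb{R}^{F \times F}$.

Table 2: Score functions $f(\mathbf{h}, \mathbf{r}, \mathbf{t})$ of KG embedding models.
TransE [6]: $-\|\mathbf{h} + \mathbf{r} - \mathbf{t}\|_{1/2}$
TransR [21]: $-\|M_r \mathbf{h} + \mathbf{r} - M_r \mathbf{t}\|_2^2$
DistMult [34]: $\mathbf{h}^T \mathrm{diag}(\mathbf{r}) \mathbf{t}$
RESCAL [23]: $\mathbf{h}^T R \mathbf{t}$
RotatE [33]: $-\|\mathbf{h} \circ \mathbf{r} - \mathbf{t}\|$

In order to learn the embeddings, state-of-the-art methods attempt to minimize the following risk:

$$\min_{W_e, W_r} \ \frac{1}{N} \sum_{n=1}^{N} L\big(f(\mathbf{h}_n, \mathbf{r}_n, \mathbf{t}_n), y_n\big), \qquad (3.9)$$

where $N$ is the number of data points (triplets or non-triplets), and $y_n = 1$ if the triplet $(h_n, r_n, t_n)$ exists, else $y_n = 0$. Typical loss functions include the logistic loss, square loss, pairwise ranking loss, margin-based ranking loss and variants of them. In order to tackle the problem in (3.9) the most popular approach is stochastic gradient descent (SGD) or batch SGD [7].

Modeling a KG using a third-order tensor has been considered in [2, 12, 20, 23, 25].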
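To make the role of the score and loss functions concrete, here is a minimal NumPy sketch of three of the score functions discussed above, written from their standard definitions, together with an empirical risk using the logistic loss (one of the typical losses listed). The function names, the choice of norm parameter `p`, and the mapping of labels $y_n \in \{0, 1\}$ to signs $\pm 1$ inside the loss are assumptions made for illustration, not the exact setup of any referenced implementation.

```python
import numpy as np

def transe_score(h, r, t, p=2):
    """TransE: negative l_p distance between h + r and t (p = 1 or 2)."""
    return -float(np.linalg.norm(h + r - t, ord=p))

def distmult_score(h, r, t):
    """DistMult: h^T diag(r) t, i.e. a trilinear product."""
    return float(h @ (r * t))

def rescal_score(h, R, t):
    """RESCAL: h^T R t with a full F x F relation matrix R."""
    return float(h @ R @ t)

def logistic_risk(scores, labels):
    """Average logistic loss; labels in {0, 1} are mapped to {-1, +1} (assumed)."""
    s = np.asarray(scores, dtype=float)
    y = 2.0 * np.asarray(labels, dtype=float) - 1.0
    return float(np.mean(np.logaddexp(0.0, -y * s)))

# Toy embeddings (hypothetical values).
h = np.array([1.0, 2.0])
r = np.array([0.5, -1.0])
t = np.array([1.5, 1.0])

print(transe_score(h, r, t))            # t equals h + r here, so the distance is zero
print(distmult_score(h, r, t))
print(rescal_score(h, np.diag(r), t))   # coincides with DistMult when R is diagonal
print(logistic_risk([2.0, -2.0], [1, 0]))
```

The diagonal case is worth noting: RESCAL with a diagonal relation matrix reduces to the DistMult score, which is the CPD-style parametrization.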
In these works, the first and second modes of the tensor are the concatenation of all the available entities, regardless of their type, whereas the third mode represents the different types of relations, i.e., each frontal slab of the third-order tensor represents a certain interaction type between the entities of the KG. The methods in [2, 25] work with incomplete tensors, whereas [12, 20, 23] model each frontal slab as an adjacency matrix. To be more precise, let $\mathcal{Z} \in \{0, 1\}^{L_e \times L_e \times K_r}$ be the third-order tensor in [12, 20, 23]. Then $\mathcal{Z}(i, j, k) = 1$ if entity $i$ interacts with entity $j$ through relation $k$, and $\mathcal{Z}(i, j, k) = 0$ if there is no interaction between entities $i$ and $j$ via relation $k$. An important observation is that although the first and second modes of tensor $\mathcal{Z}$ represent the same entities, each frontal slab $Z_k$ is not necessarily symmetric. The reason is that subject-predicate-object does not necessarily imply object-predicate-subject.

The works in [12, 20] compute the CPD of $\mathcal{Z}$ (or scaled versions of $\mathcal{Z}$) and produce two embeddings for each entity, one as a subject and another as an object. Although this is not always a drawback, it can result in an overparametrized model, because in many applications entities usually act either as a subject or as an object, but not both. Furthermore, a single unified representation is usually preferable. In order to overcome this issue, RESCAL [23] proposed the following model for each frontal slab:

$$Z_k = A R_k A^T,$$

where $A \in \mathbb{R}^{L_e \times F}$ holds the entity embeddings and $R_k \in \mathbb{R}^{F \times F}$ is a square matrix holding the relation embeddings associated with relation $k$. Note that the RESCAL model is different from the traditional CPD (symmetric in modes 1 and 2) in the sense that $R_k$ is not constrained to be diagonal. Relaxing the diagonal constraints allows matrix $R_k$ to absorb in the relation embedding the direction in which different entities interact. On the downside, this type of relaxation forfeits the parsimony and uniqueness properties of the CPD.
This is an important point, since uniqueness is a prerequisite for model interpretability when we are interested in exploratory / explanatory analysis (and not simply in making 'black box' predictions).

Another important drawback of the three-way model is that it models unnecessary interactions. To see this, consider a KG that describes interactions between genes and diseases. Suppose that the observed interactions are of gene-gene and gene-disease type but there are no available data for disease-disease interactions. The three-way model involves disease-disease interactions in the learning process (as non-edges), even though there are no data to justify it. As we will see in the upcoming section, our proposed coupled tensor-matrix modeling addresses all the aforementioned challenges.

In this paper we leverage coupled tensor-matrix factorization to extract low dimensional representations of entities (head, tail) as well as representations for the interactions (relation). KGs can be naturally represented by a collection of tensors and matrices, as shown in Fig. 4. To see this, consider the previous example of gene, compound and disease entities. Gene-compound interactions of a certain type can be represented by an adjacency matrix. Since there are multiple types of interactions, multiple adjacency matrices are necessary to model every interaction, resulting in a tensor $\mathcal{X}_{g,c} \in \{0, 1\}^{L_g \times L_c \times K_{g,c}}$, where $L_g$, $L_c$ are the number of genes and compounds respectively, and $K_{g,c}$ is the number of different interactions between genes and compounds. The same idea can be applied to any (entity, interaction, entity) triplet. To facilitate the discussion let $\mathcal{X}_{m,n} \in \{0, 1\}^{L_m \times L_n \times K_{m,n}}$ be the tensor of interactions between entities of type-$m$ and type-$n$, e.g., $m$ codifies genes and $n$ codifies compounds. Also let LT be the total

Note that tensors $\mathcal{X}_{m,n}$ and $\mathcal{X}_{n,m}$ contain the same information, since $X^k_{m,n} = (X^k_{n,m})^T$. Therefore we only consider $(m, n)$ tuples where $m \leq n$.
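The bookkeeping described above, one tensor per pair of entity types with only pairs $m \leq n$ retained, can be sketched in a few lines of Python. The input format (tuples of head type, head index, relation name, tail type, tail index), the function name `build_coupled_tensors`, and the use of coordinate sets instead of a sparse tensor library are all assumptions made for illustration; they are not taken from the DRKG tooling or from the authors' code.

```python
def build_coupled_tensors(triplets):
    """Group typed triplets into one coordinate list per entity-type pair (m, n), m <= n.

    triplets: iterable of (head_type, head_idx, relation, tail_type, tail_idx),
              where types are integers and indices are local to each type.
    Returns:  {(m, n): {"coords": set of (i, j, k), "relations": {name: k}}},
              where k is the frontal-slab index of the relation within X_{m,n}.
    """
    tensors = {}
    for m, i, rel, n, j in triplets:
        if m > n:
            # X_{n,m} carries the same information as X_{m,n} (transposed slabs),
            # so store the entry in the tensor with the smaller type first.
            m, i, n, j = n, j, m, i
        entry = tensors.setdefault((m, n), {"coords": set(), "relations": {}})
        k = entry["relations"].setdefault(rel, len(entry["relations"]))
        entry["coords"].add((i, j, k))
    return tensors

# Toy KG: type 1 = gene, type 3 = disease (hypothetical indices and relations).
example = [
    (1, 0, "regulates", 1, 2),        # gene 0 regulates gene 2
    (1, 0, "is linked with", 3, 1),   # gene 0 is linked with disease 1
    (3, 1, "is linked with", 1, 2),   # stored as (gene 2, disease 1) in X_{1,3}
]
tensors = build_coupled_tensors(example)
for pair, entry in sorted(tensors.items()):
    print(pair, sorted(entry["coords"]), entry["relations"])
```

In this toy example no `(3, 3)` key is ever created, so disease-disease interactions are simply absent from the model rather than encoded as non-edges.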
Each of the tensors in the array $\{\mathcal{X}_{m,n}, (m, n) \in S\}$ admits a CPD and the overall model is cast as:

$$\mathcal{X}_{m,n} = [\![A_m, A_n, C_{m,n}]\!], \quad (m, n) \in S, \qquad (4.12)$$

where $A_n \in \mathbb{R}^{L_n \times F}$, $C_{m,n} \in \mathbb{R}^{K_{m,n} \times F}$. The $i$-th row of $A_n$ represents the $F$-dimensional embedding of the $i$-th type-$n$ entity, and the $k$-th row of $C_{m,n}$ represents the $F$-dimensional embedding of the $k$-th type of relation between type-$m$ and type-$n$ entities. Note that in the case where entities of type-$m$ interact with entities of type-$n$ via only one type of relation, $X_{m,n} \in \{0, 1\}^{L_m \times L_n}$ is a matrix and can be factored as:

$$X_{m,n} = A_m \mathrm{diag}(\mathbf{c}_{m,n}) A_n^T. \qquad (4.13)$$

The model in (4.12) is a coupled CPD, as the factors $A_n$ appear in multiple tensors. For instance, type-1-type-1 interactions (gene-gene), type-1-type-2 interactions (gene-compound), and type-1-type-3 interactions (gene-disease) result in the factor $A_1$ appearing in tensors $\mathcal{X}_{1,1} = [\![A_1, A_1, C_{1,1}]\!]$, $\mathcal{X}_{1,2} = [\![A_1, A_2, C_{1,2}]\!]$ and $\mathcal{X}_{1,3} = [\![A_1, A_3, C_{1,3}]\!]$.

The proposed TeX-Graph exhibits several favorable properties. First, the produced embeddings are unique, provided that they appear in more than one adjacency matrix.

Proposition 1. Suppose that the coupled tensor model in (4.12) is indeed low-rank, $F$, i.e., there exist entity and relation embedding vectors in $F$-dimensional space that generate the given knowledge base. Then the $F$-dimensional TeX-Graph embeddings for type-$n$ entities and type-$(m, n)$ relations are unique and permutation invariant provided that $\sum_{m \in S_n^+} K_{m,n} + \sum_{p \in S_n^-} K_{n,p} > 1$ and $K_{m,n} > 1$ respectively, where $S_n^+$, $S_n^-$ are defined in (4.16).

The proof of Proposition 1 utilizes the uniqueness results of Theorem 1 and is relegated to the journal version due to space limitations. In the case where $K_{m,n} = 1$ and type-$m$ entities appear in multiple tensors but type-$n$ entities only in one, the TeX-Graph model identifies $A_m$ and $A_n \mathrm{diag}(\mathbf{c}_{m,n})$, since there is rotational freedom between $A_n$ and $\mathbf{c}_{m,n}$.

Another important property of the proposed TeX-Graph is that it avoids modeling of spurious 'cross-product' relations that can never be observed.
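A minimal sketch of the coupled model, assuming a toy KG with two entity types (genes as type 1, diseases as type 3) and random factors, may help fix ideas. It shows the shared factor $A_1$ generating both $\mathcal{X}_{1,1}$ and $\mathcal{X}_{1,3}$, the trilinear score of a single triplet, and a count of how many tensor entries each modeling choice involves. All sizes, dictionary layouts and function names here are hypothetical; the entry count for the three-way model assumes a single $L_e \times L_e \times K_r$ tensor over all entities and all relation types.

```python
import numpy as np

rng = np.random.default_rng(0)
F = 3
L = {1: 6, 3: 4}              # hypothetical sizes: 6 genes (type 1), 4 diseases (type 3)
K = {(1, 1): 2, (1, 3): 2}    # number of relation types per observed type pair

# One entity factor per type (shared across tensors), one relation factor per pair.
A = {n: rng.standard_normal((L[n], F)) for n in L}
C = {pair: rng.standard_normal((K[pair], F)) for pair in K}

def model_tensor(m, n):
    """Synthesize the tensor [[A_m, A_n, C_{m,n}]] of the coupled model."""
    return np.einsum("if,jf,kf->ijk", A[m], A[n], C[(m, n)])

def score(m, n, i, j, k):
    """Score of (type-m entity i, relation k, type-n entity j): A_m(i,:) diag(C(k,:)) A_n(j,:)^T."""
    return float(A[m][i] @ (C[(m, n)][k] * A[n][j]))

X11 = model_tensor(1, 1)   # gene-gene, uses A_1 twice
X13 = model_tensor(1, 3)   # gene-disease, uses the same A_1

# Entries involved in each modeling choice.
coupled_entries = sum(L[m] * L[n] * K[(m, n)] for (m, n) in K)   # 6*6*2 + 6*4*2 = 120
three_way_entries = sum(L.values()) ** 2 * sum(K.values())       # 10*10*4 = 400

print(np.isclose(X13[2, 1, 0], score(1, 3, 2, 1, 0)))
print(coupled_entries, three_way_entries)
```

The gap between the two counts comes from disease-disease and disease-gene blocks, and from relation slabs that do not apply to a given type pair, all of which the single three-way tensor has to treat as zeros.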
The coupled tensor-matrix model allows for a concise KG representation that eliminates such spurious relations from the start, contrary to the three-way model. To see this, consider the previous example of a gene-disease KG that observes relational triplets of gene-gene and gene-disease type but not of disease-disease type. The proposed TeX-Graph does not model disease-disease interactions, whereas the three-way model treats them as non-edges.

It is worth noting that TeX-Graph makes the implicit assumption that $\mathcal{X}_{n,n}$ are symmetric in the first and second mode. This is not always the case, since interactions between some entity types might be directed. To overcome this issue we assume that (h, r, t) implies (t, r, h) for (h, t) of the same type. Although this assumption ignores the direction in this type of interaction, it results in a more parsimonious model for the entity embeddings.

The problem in (4.14) is non-convex and NP-hard in general. In order to tackle it we propose to fix all variables but one and update the remaining variable. This procedure is repeated in an alternating fashion. The update for $A_n$ is the solution to the system of linear equations in (4.15), and the update for $C_{m,n}$ is the solution to the system of linear equations in (4.17). The derivations for these updates as well as implementation details are presented in Appendix A. The proposed TeX-Graph is presented in Algorithm 1.

Algorithm 1: TeX-Graph
repeat
    for n ∈ {1, . . . , L_E} do
        A_n ← solve (4.15)
    end for
    for (m, n) ∈ S do
        C_{m,n} ← solve (4.17)
    end for
until criterion is met.

TeX-Graph is an iterative algorithm that tackles a problem that is non-convex and NP-hard in general. As a result, different initial points might produce different results. Although we have observed that random initialization is sufficient most of the time, we propose an alternative initialization procedure that yields consistent and reproducible results.
To be more specific, we form a symmetric version $\mathcal{Y}$ of tensor $\mathcal{Z}$, as given in (4.18). Then we compute the semi-symmetric CPD of $\mathcal{Y} = [\![A, A, C]\!]$ using sparse eigenvalue decomposition (EVD) [27]. The proposed initialization procedure is presented in Algorithm 2.

In terms of memory requirements and computational complexity, the main bottleneck of TeX-Graph lies in instantiating and computing the matricized tensor times Khatri-Rao product (MTTKRP) in the left hand side (LHS) of (4.15) and (4.17). The number of flops needed to compute the LHS of (4.15) and (4.17) is $O\big(F \cdot \big(\sum_{m \in S_n^+} \mathrm{nnz}(\mathcal{X}_{m,n}) + \sum_{p \in S_n^-} \mathrm{nnz}(\mathcal{X}_{n,p})\big)\big)$ and $O\big(F \cdot \mathrm{nnz}(\mathcal{X}_{m,n})\big)$ respectively. For small values of $F$, which is usually the case in practice, the complexity is linear in the number of triplets participating in each update. Furthermore, the Khatri-Rao products in the LHS of (4.15) and (4.17) are not instantiated, as shown in Appendix A.

In this section we apply TeX-Graph to a recently developed KG [17] in order to perform drug repurposing for COVID-19. All algorithms were implemented in Matlab or Python, and executed on a Linux server comprising 32 cores at 2GHz and 128GB RAM.

The dataset used in the experiments is the Drug Repurposing Knowledge Graph (DRKG) [17]. It codifies triplets of biological interactions between 97,238 different entities of 13 types, namely, genes, compounds, diseases, anatomy, tax, biological process, cellular component, pathway, molecular function, anatomical therapeutic chemical (Atc), side effect, pharmacological class, and symptom. The total number of triplets is 5,874,258 and there are 107 different types of interactions. The KG is organised in 6 adjacency tensors and 11 adjacency matrices. A detailed description of the dataset and the modeling can be found in Table 3. Each row denotes a different adjacency tensor or matrix, and # type-m entities, # type-n entities, # relation types correspond to the dimension of mode 1, mode 2, and mode 3 respectively.
The last column (sparsity) denotes the sparsity of each tensor, i.e., $\mathrm{nnz}(\mathcal{X}_{m,n}) / (L_m L_n K_{m,n})$.

5.2 Procedure

Drug repurposing refers to the task of discovering existing drugs that can effectively manage certain diseases, COVID-19 in our study. DRKG codifies relational triplets of (compound, treats, disease) and (compound, inhibits, disease). Therefore drug repurposing in the context of DRKG boils down to predicting new 'treats' and 'inhibits' edges (links) between compounds and diseases of interest. We follow the evaluation procedure proposed in [17]. In the training phase we learn low dimensional representations for the entities and relations, using all the edges in DRKG. In the testing phase, we assign a score to (compound, treats, disease) and (compound, inhibits, disease) triplets according to the scoring function used for training. For the proposed TeX-Graph, the scores assigned to the triplets (hyper-edges) (compound $i$, treats, disease $j$) and (compound $i$, inhibits, disease $j$) are:

$$\mathrm{score}_{i,j,2} = A_2(i, :)\, \mathrm{diag}\big(C_{2,3}(2, :)\big)\, A_3(j, :)^T,$$
$$\mathrm{score}_{i,j,9} = A_2(i, :)\, \mathrm{diag}\big(C_{2,3}(9, :)\big)\, A_3(j, :)^T,$$

since the 'treats' and 'inhibits' relations correspond to the second and ninth frontal slab of $\mathcal{X}_{2,3}$, respectively. The testing set consists of 34 coronavirus-related diseases, including SARS, MERS and SARS-CoV-2, and 8,103 FDA-approved drugs in Drugbank. Drugs with molecular weight less than 250 daltons are excluded from testing. Ribavirin was also excluded from the testing set, since there exists a 'treats' edge in the training set between Ribavirin and a target disease. In order to evaluate the performance of the proposed TeX-Graph and the alternatives, we retrieve the top-100 ranked drugs that appear in the highest-scoring test (hyper-)edges. These are the proposed candidate drugs for COVID-19. Then we assess how many of the 32 clinical trial drugs² (Ribavirin is excluded) appear in the proposed candidate top-100 drugs.

The methods used in the experiments are:
• TeX-Graph.
The proposed TeX-Graph algorithm, initialized with Algorithm 2. The embedding dimension is set to $F = 50$ and the algorithm runs for 10 iterations.
• TransE-DRKG [6, 17]. TransE learns low dimensional KG embeddings using the score function shown in Table 2. For the task of drug repurposing we use the specifications proposed in [17]. The $\ell_2$ norm is chosen in the score function and training is performed using the deep graph library for knowledge graphs [35]. To evaluate the performance of TransE-DRKG on the drug repurposing task we used the 400-dimensional pretrained embeddings in [17], with which the drug repurposing results were better than with the stand-alone code without pretraining.
• 3-way KG embeddings (3-way KGE). We add as a baseline the embeddings produced by computing the CPD of tensor $\mathcal{Y}$ in (4.18). Recall that we use an algebraic CPD of $\mathcal{Y}$ to initialize TeX-Graph. In 3-way KGE we initialize using the same procedure and also run 10 alternating least-squares iterations to compute the CPD of $\mathcal{Y}$. 3-way KGE is tested with $F = 50$.

² www.covid19-trials.com

Table 4 shows the clinical trial drugs that appear in the top-100 recommendations along with their [rank-order]. The proposed approach retrieves 10 clinical trial drugs in the top-100 positions, and 7 in the top-50. Compared to TransE-DRKG, which was the first algorithm proposed to perform drug repurposing for COVID-19, TeX-Graph achieves 75% and 100% improvement in precision in the top-50 and top-100 respectively. It is worth emphasizing that the proposed TeX-Graph retrieves approximately 1/3 of the COVID-19 clinical trial drugs in the top-100, among a testing set of 8,103 drugs. This result is remarkable and can help cut down the immense search space of medical research. For instance, consider the case of Dexamethasone, which is retrieved by TeX-Graph in the top ranked position (it attained the highest score among all 8,103 drugs).
At the onset of the pandemic, the initial guidance for Dexamethasone and other corticosteroids was indecisive. Guidelines from different sources issued either a weak recommendation to use Dexamethasone (with an asterisk that further evidence was required) or a weak recommendation against corticosteroids and Dexamethasone [24]. However, recent results indicate that treatment with Dexamethasone reduces mortality in patients with COVID-19 [16]. The results of TeX-Graph align with the latest evidence and rank Dexamethasone as the top recommended drug. This suggests that our proposed data-driven approach could have contributed to overturning the initial hesitancy to administer Dexamethasone as a first-line treatment.

Dexamethasone [4] Oseltamivir [89] Methylprednisolone [6] Colchine [8] Azithromycin [13] Methylprednisolone [16] Thalidomide [18] Oseltamivir

In this paper we proposed a novel coupled tensor-matrix framework for knowledge graph embedding. The proposed model is principled and enjoys several favorable properties, including parsimony and uniqueness. The developed algorithmic framework admits lightweight updates and can handle very large graphs. Finally, the proposed TeX-Graph showed very promising results in a timely application to drug repurposing, a task of paramount importance in the fight against COVID-19.

The authors would like to acknowledge Ioanna Papadatou, M.D., Ph.D., for contributing to the medical assessment of the produced results.

[1] DBpedia: A nucleus for a web of open data
[2] Tensor factorization for knowledge graph completion
[3] Network science. Cambridge University Press
[4] Foundations of multidimensional network analysis
[5] Freebase: a collaboratively created graph database for structuring human knowledge
[6] Translating embeddings for modeling multi-relational data
[7] Large-scale machine learning with stochastic gradient descent
[8] Toward an architecture for never-ending language learning
[9] On generic identifiability of 3-tensors of small rank
[10] On the uniqueness of the canonical polyadic decomposition of third-order tensors, part II: Uniqueness of the overall decomposition
[12] TripleRank: Ranking semantic web data by tensor decomposition
[13] Parallel factor analysis. Computational Statistics and Data Analysis
[14] Heterogeneous network edge prediction: a data integration approach to prioritize disease-associated genes
[15] Systematic integration of biomedical knowledge prioritizes drugs for repurposing
[16] Dexamethasone in hospitalized patients with COVID-19: preliminary report
[17] DRKG: drug repurposing knowledge graph for COVID-19
[18] Link prediction in multi-relational graphs using additive models
[19] Tensor decompositions and applications
[20] Higher-order web link analysis using multilinear algebra
[21] Learning entity and relation embeddings for knowledge graph completion
[23] A three-way model for collective learning on multi-relational data
[24] Corticosteroids in COVID-19 ARDS: evidence and hope during the pandemic
[25] Pairwise interaction tensor factorization for personalized tag recommendation
[26] Relation extraction with matrix factorization and universal schemas
[27] Tensorial resolution: a direct trilinear decomposition
[28] Tensor decomposition for signal processing and machine learning
[29] Introducing the knowledge graph: things, not strings. Official Google blog
[30] Reasoning with neural tensor networks for knowledge base completion
[31] Coupled canonical polyadic decompositions and (coupled) decompositions in multilinear rank-(L_{r,n}, L_{r,n}, 1) terms, part II
[32] YAGO: a core of semantic knowledge
[33] RotatE: Knowledge graph embedding by relational rotation in complex space
[34] Embedding entities and relations for learning and inference in knowledge bases
[35] DGL-KE: Training knowledge graph embeddings at scale