key: cord-0786779-k02cua3x authors: Vahid, Milad R.; Kurlovs, Andre; Auge, Franck; Olfati-Saber, Reza; de Rinaldis, Emanuele; Rapaport, Franck; Savova, Virginia title: DiSiR: a software framework to identify ligand-receptor interactions at subunit level from single-cell RNA-sequencing data date: 2022-03-27 journal: bioRxiv DOI: 10.1101/2022.03.25.485741 sha: 17ceffa49e8a606855989de60bfd9819abe58cfc doc_id: 786779 cord_uid: k02cua3x Most of cell-cell interactions and crosstalks are mediated by ligand-receptor interactions. The advent of single-cell RNA-sequencing (scRNA-seq) techniques has enabled characterizing tissue heterogeneity at single-cell level. Over the past recent years, several methods have been developed to study ligand-receptor interactions at cell type level using scRNA-seq data. However, there is still no easy way to query the activity of a specific user-defined signaling pathway in a targeted way or map the interactions of the same subunit with different ligands as part of different receptor complexes. Here, we present DiSiR, a fast and easy-to-use permutation-based software framework to investigate how individual cells are interacting with each other by analyzing signaling pathways of multi-subunit ligand-activated receptors from scRNA-seq data, not only for available curated databases of ligand-receptor interactions, but also for interactions that are not listed in these databases. We show that, when utilized to infer melanoma disease map on a gold-standard dataset, DiSiR outperforms other well-known permutation-based methods, e.g., CellPhoneDB and ICELLNET. To demonstrate DiSiR’s utility in exploring data and generating biologically relevant hypotheses, we apply it to COVID lung and rheumatoid arthritis (RA) synovium scRNA-seq data and highlight potential differences between inflammatory pathways at cell type level for control vs. disease samples. 127 immune cell type associated with IL6 secretion in the review -macrophages -were not major 128 secretors of IL6 in the AMP RA dataset. We did not observe a high level of expression of IL6 in the 129 macrophages from the AMP RA dataset (see S2b Fig) to account for the macrophage discrepancy. 130 The role of IL6 signaling to CD4+ T-cells has also attracted researchers' attention [30] . 153 We next calculate optimal cut-off points in ROC and precision-recall curves to select the best value 154 for the threshold parameter ℎ using Youden's J statistic and sensitivity score, respectively. Since the 155 sensitivity criterion is more important for us, we proceed with the optimal threshold obtained in this 156 case ( ℎ = 0). This low optimal threshold value suggests that DiSiR performs well even when there 158 the default value of the threshold parameter in our tool and use it to analyze other scRNA-seq data 159 sets in this paper, e.g., RA and COVID lung scRNA-seq data. 160 We have also compared the running time of DiSiR and CellPhoneDB (searching over its entire 161 database) applied to the AMP RA Phase 1 data on our system (2.4 GHz 8-Core Intel Core i9 212 We next explain each step in more details and introduce the filtering approach that we use to remove 213 weak interactions. We also describe the plotting capabilities that we have included in the DiSiR 214 framework in order to better visualize complex signaling pathways. 245 types, from ligand-expressing cells to receptor-expressing cells (Fig 2c) . 431 DiSiR also provides an interactive graph-based representation module and heatmap/bubble plots to 432 show the resulting interactions for complex signaling pathways at different levels. R" subunits by cell type "B". This 97 interaction is considered by DiSiR as significant if these products are significantly higher than the 98 result of random shuffling of cell type labels for all "L-R" subunit combinations. We filter out 99 interactions between cell types with low fractions of cells expressing the ligand/receptor subunits 100 (see This result serves as proof of concept that DiSiR is capable of quickly interrogating a 111 pathway, as well as sheds additional light on the biology of this interaction. The higher weights of the 112 links connected to macrophages (Fig. 2c) suggest that macrophages are the primary signaling cells, 113 signaling primarily to themselves and to monocytes. Note that we use the following notation in the 114 interaction graph: there is a link L1 | R1 + L2 | R2 between two nodes C1 and C2, if cell types C1 and 115 C2 communicate with each other through the ligand L-receptor R interactions, where L1, L2 and R1, 116 R2 are subunits of L and R, respectively. In addition, some signaling took place in cells that are not 117 part of the immune system (annotated as 'Fibroblasts' and 'NonImmune'). These findings are 118 consistent with the existing paradigm [28] that CLP is released by phagocytes and TLR4 binds to it 119 in phagocytic cells as well as in cells that are not immune DiSiR has predicted that in 124 RA synovium, IL6 is primarily secreted by B-cells, while CD4+ T-cells are the main immune cell 125 types on the receiving end. As with CLP, the IL6 pathway is also elevated in RA We tested DiSiR on a couple of pathways in two publicly available single-cell datasets. We obtained 217 lung COVID-19 single-cell We subsequently 219 filtered the data based on empirically customized standard procedures. We only kept cells for further 220 analysis if they had at least 1000 genes and at most 20% of the genes were mitochondrial (COVID), 221 or if they had at least 200 genes and at most 10% of the genes were mitochondrial (AMP RA Phase 222 1). We then normalized total barcode counts in each of the two datasets so that each cell had the same 223 total normalized read count. Afterwards, we used SPRING [37], a web-based software tool to 224 visualize high dimensional transcriptomic data neural-network-based cell annotation tool that uses expression data and is also aided by the 227 edges from the SPRING plot. Finally, we transferred the resulting annotated expression data into DiSiR uses a filtering approach to discard weak interactions that might be associated with noise by 231 thresholding the fraction of cells expressing each ligand/receptor subunit within its corresponding 232 cell type. The user can visualize this metric for combinations of signaling pathway components and 233 cell types using a bubble plot in which the size of each circle is associated with the corresponding 234 fraction and the color of each circle is related to the max-normalized average expression of each 235 ligand/receptor subunit within its corresponding cell type (Fig 2b). A ligand-receptor We provide an analysis framework using precision-recall 238 curves, across all possible threshold values, to determine the optimal value for Th (see the "Results 239 and discussion" section as well as S3 Fig) In DiSiR, we visualize output cell-cell interactions in two ways: graph representation and heatmap 243 plots. Graph representation is made of a directed graph in which nodes are associated with the cell 297 IL6ST interactions between different cell types identified by DiSiR Technical assessment of DiSiR using gold-standard melanoma cellular network based 300 on precision-recall curve. (a) Recall and precision, across different threshold values, are defined in 301 relation to positive and negative groups for DiSiR vs. CellPhoneDB and ICELLNET (AUC, area 302 under the curve). Best precision in DiSiR is obtained for ℎ = 0 (optimal threshold value) (b) Cell-303 cell interaction network between cancer and the normal cells in the melanoma Franck Auge, Reza Olfati-Saber, Emanuele de Rinaldis, Virginia Savova, 310 Franck Rapaport. 311 Investigation: Milad R. Vahid Project administration: Virginia Savova, Franck Rapaport Writing -original draft: Milad R. Vahid, Franck Rapaport Inference and 326 analysis of cell-cell communication using CellChat RNA velocity 330 of single cells Deciphering cell-cell interactions and 332 communication from gene expression Immune Landscape of Viral-334 and Carcinogen-Driven Head and Neck Cancer iTALK: an R package to 336 characterize and illustrate intercellular communication Autocrine-Paracrine Networks from Human Islet scRNA-Seq Defining 340 inflammatory cell states in rheumatoid arthritis joint synovial tissues by integrating single-cell 341 transcriptomics and mass cytometry SingleCellSignalR: inference of intercellular networks from single-cell transcriptomics Dissection of intercellular communication using the transcriptome-based framework ICELLNET. 347 Nature communications Giotto: a toolbox for integrative analysis 349 and visualization of spatial expression data NicheNet: modeling intercellular communication by 351 linking ligands to target genes Transcriptome analysis of individual 353 stromal cell populations identifies stroma-tumor crosstalk in mouse lung cancer model Inferring spatial and signaling relationships between cells from single cell 356 transcriptomic data Cell lineage and communication network 358 inference via optimization for single-cell transcriptomics Uncovering hypergraphs of cell-cell interaction from single 360 cell RNA-sequencing data A 362 draft network of ligand-receptor-mediated multicellular signalling in human Signaling receptome: a 367 genomic and evolutionary perspective of plasma membrane receptors involved in signal transduction Guide to PHARMACOLOGY in 2018: updates and expansion to encompass the new 371 guide to IMMUNOPHARMACOLOGY The Universal 373 Protein Resource (UniProt): an expanding universe of protein information CellPhoneDB: inferring 376 cell-cell communication from combined expression of multi-subunit ligand-receptor complexes Single-cell reconstruction of the early maternal-fetal interface in humans The PageRank Citation Ranking: Bringing 382 Order to the Web. Stanford InfoLab Nonnegative tucker decomposition Extracting intercellular signaling 386 network of cancer tissues using ligand-receptor expression patterns from whole-tumor and single-cell 387 transcriptomes Calprotectin Level and Disease Activity in Patients with Rheumatoid Arthritis The Role of Calprotectin in Rheumatoid Arthritis B Cells in Rheumatoid 394 Arthritis:Pathogenic Mechanisms and Treatment Prospects. Frontiers in Immunology. 2021;12. 395 30. Dienz O, Rincon M. The effects of IL-6 on CD4 T cell responses Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq Emerging roles for IL-11 signaling in cancer 401 development and progression: Focus on breast cancer Single-cell landscape of bronchoalveolar 404 immune cells in patients with COVID-19 Single-cell analysis of two severe COVID-406 19 patients reveals a monocyte-associated and tocilizumab-responding cytokine storm IL-1R3 409 blockade broadly attenuates the functions of six members of the IL-1 family, revealing their 410 contribution to models of disease Shotgun transcriptome, 412 spatial omics, and isothermal profiling of SARS-CoV-2 infection reveals unique host responses, viral 413 diversification, and drug interactions SPRING: a kinetic interface for visualizing high 415 dimensional single-cell expression data Cell type classification 417 and discovery across diseases, technologies and tissues reveals conserved gene signatures and 418 enables standardized single-cell readouts Fast, sensitive and 420 accurate integration of single-cell data with Harmony 422 uses 1) single-cell gene expression matrix, 2) cell type annotations, and 3) list of desired ligand-423 receptor interactions at subunit level as input data. DiSiR then characterizes the ligand "L"-receptor 424 "R" interaction between cell types "A" and "B" based on the products of expressions of "L" subunits 425 by cell type "A" and expressions of DiSiR as a significant interaction if these products are significantly higher than random shuffling 427 results for all "L-R" subunit combinations. (b) Diagram that shows the workflow of data processing, 428 filtering, and visualization of the obtained results Analysis of Calprotectin signaling pathways at cell type level from RA Synovium scRNA-435 seq data. (a) UMAP representation of RA Synovium scRNA-seq data and the corresponding cell 436 type label assigned to each cell. (b) Bubble plot illustrating max-normalized average expressions of 437 calprotectin signaling pathway components including ligand subunits, S100A8 and S100A9, and 438 receptor TLR4 per cell type (color of the circles) and fraction of cells expressing them within its 439 corresponding cell type (size of the circles). (c) Left panel is the heatmap depicting all cell-cell 440 communication through S100A8|TLR4 + S100A9|TLR4 interactions (when both interactions are 441 presented). The colormap shows the strength of interaction between two cell types Study of IL6/IL11 signaling pathways for COVID vs. healthy control samples using 446 DiSiR from lung scRNA-seq data. (a) UMAP representation of lung scRNA-seq data and DiSiR: identify LR interactions at subunit level 19 447 corresponding cell type control samples in lung scRNA-seq data. The 450 colormap shows the strength of interaction between two cell types. (c) Graph representation of the 451 significant ligand-receptor interactions, involved in the IL6/IL11 signaling pathway, i.e., IL6|IL6R + 452 IL6|IL6ST and IL11|IL116A + IL11|IL6ST, between different cell types identified by DiSiR for 453 healthy vs. COVID samples using the default threshold level control samples using DiSiR from lung scRNA-seq data. (a) Heatmap plot illustrating DiSiR: identify LR interactions at subunit level 21 458 the differences between cell-cell communications identified by DiSiR for IL1RAP signaling pathway 459 between COVID vs. control samples in lung scRNA-seq data. The colormap shows the strength of 460 interaction between two cell types. (b) Graph representation of significant ligand-receptor 461 interactions IL36RN|IL1RL2, between different cell types identified by DiSiR for healthy vs. (c) COVID samples 465 using the default threshold level