key: cord-266869-fs8dn7ir authors: Kim, So Young; Jin, Weihua; Sood, Amika; Montgomery, David W.; Grant, Oliver C.; Fuster, Mark M.; Fu, Li; Dordick, Jonathan S.; Woods, Robert J.; Zhang, Fuming; Linhardt, Robert J. title: Glycosaminoglycan binding motif at S1/S2 proteolytic cleavage site on spike glycoprotein may facilitate novel coronavirus (SARS-CoV-2) host cell entry date: 2020-04-15 journal: bioRxiv DOI: 10.1101/2020.04.14.041459 sha: doc_id: 266869 cord_uid: fs8dn7ir Severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2) has resulted in a pandemic and continues to spread around the globe at an unprecedented rate. To date, no effective therapeutic is available to fight its associated disease, COVID-19. Our discovery of a novel insertion of glycosaminoglycan (GAG)-binding motif at S1/S2 proteolytic cleavage site (681-686 (PRRARS)) and two other GAG-binding-like motifs within SARS-CoV-2 spike glycoprotein (SGP) led us to hypothesize that host cell surface GAGs might be involved in host cell entry of SARS-CoV-2. Using a surface plasmon resonance direct binding assay, we found that both monomeric and trimeric SARS-CoV-2 spike more tightly bind to immobilized heparin (KD = 40 pM and 73 pM, respectively) than the SARS-CoV and MERS-CoV SGPs (500 nM and 1 nM, respectively). In competitive binding studies, the IC50 of heparin, tri-sulfated non-anticoagulant heparan sulfate, and non-anticoagulant low molecular weight heparin against SARS-CoV-2 SGP binding to immobilized heparin were 0.056 μM, 0.12 μM, and 26.4 μM, respectively. Finally, unbiased computational ligand docking indicates that heparan sulfate interacts with the GAG-binding motif at the S1/S2 site on each monomer interface in the trimeric SARS-CoV-2 SGP, and at another site (453-459 (YRLFRKS)) when the receptor-binding domain is in an open conformation. Our study augments our knowledge in SARS-CoV-2 pathogenesis and advances carbohydrate-based COVID-19 therapeutic development. 
In March 2020, the World Health Organization declared severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2) a pandemic less than three months after its initial emergence in Wuhan, China [1, 2]. SARS-CoV-2 is a zoonotic Betacoronavirus transmitted by person-to-person contact via airborne and fecal-oral routes, and has caused over 693,000 confirmed coronavirus disease 2019 (COVID-19) cases and 33,000 associated deaths worldwide [2] [3] [4] [5]. While there is limited understanding of SARS-CoV-2 pathogenesis, extensive studies have been performed on how its closely related cousins, SARS-CoV and MERS-CoV (Middle East respiratory syndrome-related coronavirus), invade host cells. Upon initially contacting the surface of a host cell, SARS-CoV and MERS-CoV exploit host cell proteases to prime their surface spike glycoproteins (SGPs) for fusion activation, which is achieved by receptor binding, low pH, or both [6, 7]. The receptor binding domain (RBD) resides within subunit 1 (S1) while subunit 2 (S2) facilitates viral-host cell membrane fusion [6]. Activated SGP undergoes a conformational change followed by an initiated fusion reaction with the host cell membrane [6]. Endocytosed virions are further processed by the endosomal protease cathepsin L in the late endosome [7, 8]. Both MERS-CoV and SARS-CoV require proteolytic cleavage at their S2' site, but not at their S1-S2 junction, for successful membrane fusion and host cell entry [6, 7]. Additionally, receptors involved in fusion activation of SARS-CoV and MERS-CoV include heparan sulfate (HS) and angiotensin-converting enzyme 2 (ACE2), and dipeptidyl peptidase 4 (DPP4), respectively [9] [10] [11]. SARS-CoV and other pathogens arrive at a host cell surface by clinging, through their surface proteins, to linear, sulfated polysaccharides called glycosaminoglycans (GAGs) [12] [13] [14].
The repeating disaccharide units of GAGs, comprised of a hexosamine and a uronic acid or a galactose residue, are often sulfated (S1 Fig) [15] . GAGs are generally found covalently linked to core proteins as proteoglycans (PGs) and reside inside the cell, at the cell surface, and in the extracellular matrix (ECM) [15] . GAGs facilitate various biological processes, including cellular signaling, pathogenesis, and immunity, and possess diverse therapeutic applications [15] . For example, an FDA approved anticoagulant heparin (HP) is a secretory GAG released from granules of mast cells during infection [15, 16] . Some GAG binding proteins can be identified by amino acid sequences known as Cardin-Weintraub motifs corresponding to 'XBBXBX' and 'XBBBXXBX', where X is a hydropathic residue and B is a basic residue, such as arginine and lysine, responsible for interacting with the sulfate groups present in GAGs [17, 18] . Examination of the SARS-CoV-2 SGP sequence revealed that the GAG-binding motif resides within S1-S2 proteolytic cleavage motif (furin cleavage motif BBXBB) that is not present in SARS-CoV or MERS-CoV SGPs (Fig 1, S2 Fig, S3 Fig) [19] . Additionally, we discovered GAG-binding-like motifs within RBD and S2' proteolytic cleavage site in SARS-CoV-2 SGP (Fig 1, S2 Fig, S3 Fig) . This discovery prompted us to hypothesize that GAGs may contribute to SARS-CoV-2 fusion activation and host cell entry as a novel mechanism through SGP binding. We performed surface plasmon resonance (SPR)-based binding assays to determine binding kinetics of the interactions between various GAGs and SARS-CoV-2 SGP in comparison with SARS-CoV, and MERS-CoV SGP to address this question. Lastly, we performed blind docking on the trimeric SARS-CoV-2 SGP model to objectively identify the preferred binding GAG-binding sites on the SGP. Previous reports showed that various CoV bind GAGs through their SGPs to invade host cells [13] . 
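The motif search described above can be illustrated with a short script. This is a minimal sketch, not the procedure used in this study: it assumes that B is restricted to arginine and lysine and that X is any other residue, and the sequence fragment and its residue numbering are illustrative, chosen so that PRRARS falls at 681-686. The function names are hypothetical.

```python
import re

# Consensus symbols as quoted in the text: B = basic residue, X = other residue.
# Assumption: only Arg (R) and Lys (K) count as basic here.
BASIC = "[KR]"
OTHER = "[^KR]"


def compile_motif(pattern):
    """Translate an X/B consensus string such as 'XBBXBX' into a regex."""
    if set(pattern) - {"X", "B"}:
        raise ValueError("pattern may contain only 'X' and 'B'")
    body = "".join(BASIC if symbol == "B" else OTHER for symbol in pattern)
    # A lookahead is used so that overlapping occurrences are all reported.
    return re.compile(f"(?=({body}))")


def find_motifs(sequence, patterns=("XBBXBX", "XBBBXXBX"), first_residue=1):
    """Return (pattern, start, end, matched_text) tuples, 1-based numbering."""
    hits = []
    for pattern in patterns:
        for match in compile_motif(pattern).finditer(sequence):
            start = first_residue + match.start()
            end = start + len(pattern) - 1
            hits.append((pattern, start, end, match.group(1)))
    return hits


if __name__ == "__main__":
    # Illustrative fragment around the S1/S2 junction (numbering assumed).
    fragment = "TNSPRRARSVAS"
    for hit in find_motifs(fragment, first_residue=678):
        print(hit)
```

Under these assumptions the strict XBBXBX pattern matches PRRARS, whereas a sequence such as YRLFRKS does not match either strict pattern, which is consistent with describing it as a GAG-binding-like motif.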
In the current study, we used SPR to measure the binding kinetics and affinity of the SGP-HP interaction for monomeric and trimeric SARS-CoV-2 SGP and for monomeric SARS-CoV and MERS-CoV SGPs, using a sensor chip with immobilized HP. Sensorgrams of CoV SGP-HP interactions are shown in Fig 2. The sensorgrams were fit globally to obtain the association rate constant (ka), dissociation rate constant (kd) and equilibrium dissociation constant (KD) (Table 1) using the BiaEvaluation software and assuming a 1:1 Langmuir model. SARS-CoV-2 and MERS-CoV SGPs exhibited a markedly low dissociation rate constant (kd ~ 10⁻⁷ s⁻¹), indicating very slow dissociation and hence tight binding. The HP binding properties of monomeric SARS-CoV-2 SGP were comparable to those of the trimeric form (KD of monomer and trimer were 40 pM and 73 pM, respectively). In comparison, SARS-CoV SGP, previously known to bind HP, showed markedly lower affinity (KD = 500 nM), roughly four orders of magnitude weaker. The extremely high binding affinity of SARS-CoV-2 SGP to HP was supported by the chip surface regeneration conditions. The immobilized HP surface could only be regenerated using a harsh regeneration reagent, 0.25% SDS, instead of the standard 2 M NaCl solution used for removing HP-binding proteins. One possible reason for the extremely high affinity of the SARS-CoV-2 SGP monomer and trimer for immobilized heparin is that the high density of surface-bound ligand might promote polyvalent interactions. The difference in binding kinetics and affinity of CoV SGPs to HP may also be due in part to the difference in protein sequence of the CoV SGPs. Based on amino acid alignment analysis using the Basic Local Alignment Search Tool (BLAST), SARS-CoV and SARS-CoV-2 SGPs share 76% similarity. The association rate constant (ka) of MERS-CoV SGP (339 (± 27) M⁻¹ s⁻¹) was the lowest, followed by those of monomeric and trimeric SARS-CoV-2 SGP (2.56 × 10³ (± 62.7) M⁻¹ s⁻¹ and 1.6 × 10³ (± 127) M⁻¹ s⁻¹, respectively) (Table 1). SARS-CoV SGP had the highest ka, 4.12 × 10⁴ (± 136) M⁻¹ s⁻¹.
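For a 1:1 Langmuir interaction, the equilibrium dissociation constant follows from the fitted rate constants as KD = kd / ka. The sketch below illustrates that relationship and the standard association-phase response of the 1:1 model. It is not the BiaEvaluation fitting routine; the function names are hypothetical, and the kd value used in the example is only the order of magnitude quoted in the text, so the result is approximate.

```python
import math


def equilibrium_dissociation_constant(ka, kd):
    """Return KD in M for a 1:1 interaction, given ka (1/(M s)) and kd (1/s)."""
    if ka <= 0 or kd < 0:
        raise ValueError("ka must be positive and kd non-negative")
    return kd / ka


def association_response(t, conc, ka, kd, r_max):
    """Association-phase response of the 1:1 Langmuir model at time t (s).

    R(t) = R_eq * (1 - exp(-(ka * C + kd) * t)), with R_eq = R_max * C / (C + KD).
    """
    kd_eq = equilibrium_dissociation_constant(ka, kd)
    r_eq = r_max * conc / (conc + kd_eq)
    return r_eq * (1.0 - math.exp(-(ka * conc + kd) * t))


if __name__ == "__main__":
    # Monomeric SARS-CoV-2 SGP: ka from the text, kd taken as ~1e-7 1/s.
    kd_molar = equilibrium_dissociation_constant(ka=2.56e3, kd=1e-7)
    print(f"KD is approximately {kd_molar * 1e12:.0f} pM")

    # Ratio of the reported KD values for SARS-CoV (500 nM) and SARS-CoV-2 (40 pM).
    print(f"fold difference: {500e-9 / 40e-12:.0f}")
```

With these inputs the computed KD is close to the reported 40 pM, and the ratio of the reported SARS-CoV and SARS-CoV-2 monomer KD values is on the order of 10⁴.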
The differences in ka values suggest that, in addition to differing in binding strength, the SGPs may bind HP through different mechanisms. Solution/surface competition experiments were performed by SPR to examine the effect of the saccharide chain length of HP on the SARS-CoV-2 SGP-HP interaction. HP-derived oligosaccharides of different lengths, from tetrasaccharide (dp4) to octadecasaccharide (dp18), were used in these competition studies. HP oligosaccharides, each at the same concentration (1000 nM), were mixed into the SARS-CoV-2 SGP (50 nM)/HP interaction solution. Negligible competition was observed (S4 Fig) when 1000 nM of oligosaccharides (from dp4 to dp18) were present in the protein solution, suggesting that the SARS-CoV-2 SGP-HP interaction is chain-length dependent and that the SGP prefers to bind full-chain (~dp30) HP. Competition levels measured by SPR for chemically modified HP derivatives are shown in Using a modified version of Autodock Vina tuned for use with carbohydrates (Vina-Carb) [20, 21], we performed blind docking on the trimeric SARS-CoV-2 SGP model to identify objectively the preferred GAG-binding sites on the SGP surface. The SGP contains three putative GAG-binding motifs with the following sequences: 453-459 (YRLFRKS), 681-686 (PRRARS), and 810-816 (SKPSKRS), which we define as sites 1, 2, and 3, respectively (Fig 1, S2 Fig, S3 Fig). An HS hexasaccharide fragment (GlcA(2S)-GlcNS(6S))3 binds site 2 in each monomer chain in the trimeric SGP (Fig 5C, S3 Fig). The docking results also indicate that HS may bind to site 1 when the apex of the S1 monomer is in an open conformation, as this allows basic residues to be more accessible to ligand binding. The site 1 residues are less accessible for GAG binding when the domain is in a closed conformation (Fig 5D).
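The blind-docking setup can be sketched as follows, using the box dimensions and search parameters reported in the Methods (box of 190 × 223 × 184 Å centered on the protein, exhaustiveness = 80, chi_cutoff = 2, chi_coeff = 0.5). This is an illustration under stated assumptions rather than the authors' script: the geometric center is taken as the mean of the atomic coordinates, the receptor and ligand file names are hypothetical, and it is assumed that Vina-Carb accepts chi_coeff and chi_cutoff as configuration-file keys in the same way as the standard Vina options.

```python
def geometric_center(coords):
    """Mean x, y, z of a list of (x, y, z) coordinates in Å."""
    if not coords:
        raise ValueError("no coordinates supplied")
    n = len(coords)
    return tuple(sum(atom[axis] for atom in coords) / n for axis in range(3))


def box_encloses(coords, center, size):
    """True if every coordinate lies inside a box of the given size and center."""
    return all(
        abs(atom[axis] - center[axis]) <= size[axis] / 2.0
        for atom in coords
        for axis in range(3)
    )


def docking_config(center, size, receptor="sgp_trimer.pdbqt",
                   ligand="hs_hexasaccharide.pdbqt",
                   exhaustiveness=80, chi_cutoff=2, chi_coeff=0.5):
    """Build configuration text; file names are placeholders (hypothetical)."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value:.3f}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value}")
    lines.append(f"exhaustiveness = {exhaustiveness}")
    lines.append(f"chi_cutoff = {chi_cutoff}")  # assumed Vina-Carb key
    lines.append(f"chi_coeff = {chi_coeff}")    # assumed Vina-Carb key
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy coordinates standing in for the protein atoms (not real data).
    atoms = [(0.0, 0.0, 0.0), (20.0, 40.0, 60.0), (10.0, 20.0, 30.0)]
    center = geometric_center(atoms)
    size = (190, 223, 184)
    assert box_encloses(atoms, center, size)
    print(docking_config(center, size))
```

Checking that the box encloses every atom before docking is a simple guard that the search is genuinely blind, that is, that no part of the protein surface is excluded.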
The electrostatic potential surface representation of the trimeric SGP confirms that the GAG-binding poses generally prefer regions of positive charge, as expected, and illustrates that basic residues within site 3 are not exposed for binding to HS on any of the chains (Fig 5A). Finally, our blind docking analysis reveals that a longer HS polymer may span an inter-domain channel that contains site 2. The original SARS-CoV and numerous pathogens exploit host cell surface GAGs during the initial step of host cell entry [12] [13] [14]. Based on our discovery of GAG-binding and GAG-binding-like motifs at site 1 (within the RBD, Y453-S459), site 2 (at the proteolytic cleavage site at S1/S2 junction, P681-S686), and site 3 (at the S2' proteolytic cleavage site, S810-S816), we hypothesized that SARS-CoV-2 may also interact with host cell surface GAGs through its SGPs to invade host cells (Fig 1, S2 Fig, S3 Fig). The predominant GAG in normal human lung is HS followed by chondroitin sulfate (CS) [22], and it is noteworthy that lung tissue is rich in mast cells and has been a source of commercial HP [23]. Using unbiased docking, we found that TriS HS hexasaccharide 5D ). Docking results indicated that the HS hexasaccharides could span an inter-domain channel that includes site 2, suggesting a mechanism for the binding of a longer HS sequence (Fig 5C). Next, we experimentally determined binding kinetics for the interactions between HP (rich (60-80%) in TriS domains) and monomeric SARS-CoV-2, trimeric SARS-CoV-2, monomeric SARS-CoV, and monomeric MERS-CoV SGPs using SPR binding assays (Fig 2 and Table 1). GAG-protein interactions are mainly electrostatically driven [24]; thus, HS-binding proteins generally bind HP due to its higher degree of sulfation [15]. We discovered that HP binds both monomeric and trimeric SARS-CoV-2 SGP with remarkable affinity (KD = 40 pM and 73 pM, respectively) (Fig 2 and Table 1).
This was unexpectedly tight binding for a GAG-protein interaction, as even one of the prototypical HP-binding proteins, fibroblast growth factor 2 (FGF2), has a KD of 39 nM [25]. In comparison, SARS-CoV and MERS-CoV SGPs also bind HP, but much more weakly, with KD = 500 nM and 1 nM, respectively (Fig 2 and Table 1). While HS facilitates SARS-CoV host cell entry and is an essential host cell surface receptor, its involvement in MERS-CoV host cell entry or binding kinetics for SARS-CoV and MERS-CoV SGPs had not previously been reported [10]. After discovering the high binding affinity between HP and SARS-CoV-2 SGP, we next found that the degree and position of sulfation within HP were important for its successful binding to monomeric SARS-CoV-2 SGP (Figs 3 and 4). The low IC50 values of these GAGs suggest that the FDA approved anticoagulant HP, or its nonanticoagulant derivatives, might have therapeutic potential against SARS-CoV-2 infection as competitive inhibitors. The location of proposed GAG-binding sites is also of interest. Unlike SARS-CoV and MERS-CoV SGPs, SARS-CoV-2 SGP has a novel insert in the amino acid sequence (681-686 (PRRARS)) that fully conforms to the GAG-binding Cardin-Weintraub motif (XBBXBX) and a furin-cleavage motif (BBXBB) at the S1/S2 junction (Fig 1). This site was also shown to be a preferred GAG-binding motif by our unbiased docking study (Fig 5). Proteolytic cleavage at S1/S2 is not required for successful viral-host cellular membrane fusion in SARS-CoV and MERS-CoV SGPs [6, 7]. Proteolytic cleavage primes the SGP for fusion activation and may additionally influence cell-cell fusion, host cell entry, and/or the infectivity of the virus [13, 27]. possess both GAG-binding and furin cleavage motifs at their S1/S2 junction in their SGPs [13].
In the cases of MHV and IBV spike proteins, a single amino acid mutation near the GAG-binding and furin cleavage motifs that resulted from cell culture adaptation determines whether a virion binds GAGs or exploits host cell surface protease, but not both [28]. While not within the CoV family, human immunodeficiency virus type 1 (HIV-1) requires HS binding to achieve optimal furin processing because HS binding allows selective exposure of the furin cleavage site [29]. While the idea of repurposing HP as a COVID-19 therapeutic sounds appealing, further questions, including in vitro relevance of GAGs as host cell surface receptors, proteolytic processing of SGPs at the S1/S2 junction, and their relationship in host cell entry and infectivity, must first be carefully evaluated. Based on our findings, we propose a model of how GAGs may facilitate host cell entry of SARS-CoV-2 (Fig 6). First, virions land on the epithelial surface in the airway by binding to HS through their SGPs (Fig 6A). Host cell surface proteoglycans utilize their long HS chains to securely wrap around the trimeric SGP (Fig 6A). During this step, heavily sulfated HS chains span the inter-domain channel containing GAG-binding site 2 on each monomer in the trimeric SGP and bind site 1 within the RBD in an open conformation (Fig 5). Host cell surface and extracellular proteases, such as furin and transmembrane serine protease 2 (TMPRSS2), may process site 2 (S1/S2 junction) and/or site 3 (S2'), and GAG chains come off from site 2 upon cleavage (Fig 6B). HS and ACE2 binding to the more readily accessible RBD containing site 1 may drive the conformational change of SGP and activate viral-cellular membrane fusion [30]. Finally, SGP on the endocytosed virion may utilize an endosomal host cell protease, such as cathepsin L, to further execute viral-cellular membrane fusion (Fig 6C). In conclusion, the current work provides evidence that GAGs may facilitate host cell entry of SARS-CoV-2 by binding to its SGP.
SPR studies demonstrate that both monomeric and trimeric SARS-CoV-2 SGP bind HP with remarkably high affinity and that the SGP prefers long, heavily sulfated (TriS rich) structures. Additionally, we report low IC50 values of HP and its derivatives against the HP-SARS-CoV-2 SGP interaction, suggesting therapeutic potential of HP and its derivatives as competitive inhibitors against COVID-19. Lastly, unbiased computational ligand docking indicated that a TriS HS oligosaccharide preferably interacts with GAG-binding motifs at the S1/S2 junction and within the receptor binding domain, and hinted at a mechanism of binding. This study adds to our current understanding of SARS-CoV-2 pathogenesis and serves as a foundation for designing glycoconjugate vaccines and therapeutics to successfully contain and eliminate COVID-19. [31]. The 6-O-desulfated HP derivative, 6-DeS HP, Mw = 13 kDa, was generously provided by Prof. Lianchun Wang (University of South Florida). Nonanticoagulant low molecular weight HP (NACH) was synthesized from dalteparin, a nitrous acid depolymerization product of porcine intestinal HP, followed by periodate oxidation as described in our previous work [26]. TriS HS (NS2S6S) was synthesized from N-sulfo heparosan with subsequent modification with C5-epimerase and 2-O- and 6-O-sulfotransferases (2OST and 6OST1/6OST3) [32]. HP oligosaccharides included tetrasaccharide (dp4), hexasaccharide (dp6), octasaccharide (dp8), decasaccharide (dp10), dodecasaccharide (dp12), tetradecasaccharide (dp14), hexadecasaccharide (dp16) and octadecasaccharide (dp18) and were prepared from porcine intestinal HP by controlled partial heparin lyase 1 treatment followed by size fractionation. The chemical structures of the GAGs are shown in S1 Fig. Sensor SA chips were from GE Healthcare (Uppsala, Sweden). SPR measurements were performed on a BIAcore 3000 operated using BIAcore 3000 control and BIAevaluation software (version 4.0.1).
Biotinylated HP was prepared by conjugating its reducing end to amine-PEG3-Biotin (Pierce, Rockford, IL). In brief, HP (2 mg) and amine-PEG3-Biotin (2 mg, Pierce, Rockford, IL) were dissolved in 200 µl H2O, and 10 mg NaCNBH3 was added. The reaction mixture was heated at 70 °C for 24 h, after which a further 10 mg NaCNBH3 was added and the reaction was heated at 70 °C for another 24 h. After cooling to room temperature, the mixture was desalted with a spin column (3,000 MWCO). Biotinylated HP was collected, freeze-dried and used for SA chip preparation. The biotinylated HP was immobilized on a streptavidin (SA) chip based on the manufacturer's protocol. The successful immobilization of HP was confirmed by the observation of a 200-resonance unit (RU) increase on the sensor chip. The control flow cell (FC1) was prepared by a 2 min injection with saturated biotin. L/min, respectively. After each run, the dissociation and the regeneration were performed as described above. Solution competition studies between surface HP and soluble glycans (HP, TriS HS and NACH) to measure IC50 were performed using SPR [33]. In brief, SARS-CoV-2 S-protein (50 nM) samples alone or mixed with different concentrations of glycans in SPR buffer were injected over the HP chip at a flow rate of 30 µL/min. After each run, dissociation and regeneration were performed as described above. For each set of competition experiments, a control experiment (only protein without glycan) was performed to ensure the surface was completely regenerated. The 3D coordinates for the SGP trimer (NCBI reference sequence YP_009724390.1) were downloaded from the SWISS-MODEL homology modeling server [34]. The selected model was generated with the Cryo-EM structure PDB ID 6VSB as a template, which has a 99.26% sequence identity and 95% coverage for amino acids 27 to 1146.
The template and the resulting model are the "prefusion" structure with one of the three receptor binding domains (Chain A) in the "up" or "open" conformation [30]. Cryo-EM studies have revealed that the SARS-CoV-2 SGP trimer exists in two conformational states in approximately equal abundance [35]. In one state, all SGP monomers have their hACE2-binding domain closed, and in the other, one monomer has its hACE2-binding domain open, where it is positioned away from the interior of the protein. Initial coordinates for a hexasaccharide fragment of HS (GlcA(2S)-GlcNS(6S))3 were generated using the GAG-Builder tool [36] at GLYCAM-Web (glycam.org) and used for unbiased (blind) docking. A hexasaccharide was chosen as being sufficiently long to represent a typical GAG length found in protein co-complexes [36] and to avoid introducing so many degrees of internal flexibility that the efficiency of the docking conformational search algorithm was impaired. Docking was performed using a version of Vina-Carb [21] that has been modified to improve its performance for GAGs. A grid box with dimensions (x = 190, y = 223, z = 184 Å) was placed at the geometric center of the protein, enclosing its entire surface. Docking was performed with default values, with the following exceptions: exhaustiveness = 80, chi_cutoff = 2, and chi_coeff = 0.5. All sulfate and hydroxyl groups and glycosidic torsion angles were treated as flexible, resulting in 83 ligand poses.
[1] World Health Organization. WHO Director-General's opening remarks at the mission briefing on COVID-19 - 11
[2] A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
[3] Aerosol and Surface Stability of SARS-CoV-2 as Compared with SARS-CoV-1
[4] Enteric involvement of coronaviruses: is faecal-oral transmission of SARS-CoV-2 possible?
[5] World Health Organization. Coronavirus disease 2019 (COVID-19) Situation Report. In: World Health Organization
[6] Activation of the SARS coronavirus spike protein via sequential proteolytic cleavage at two distinct sites
[7] Middle East Respiratory Syndrome Coronavirus Spike Protein Is Not Activated Directly by Cellular Furin during Viral Entry into Target Cells
[8] SARS coronavirus, but not human coronavirus NL63, utilizes cathepsin L to infect ACE2-expressing cells
[9] Receptor and viral determinants of SARS-coronavirus adaptation to human ACE2
[10] Inhibition of SARS pseudovirus cell entry by lactoferrin binding to heparan sulfate proteoglycans
[11] Dipeptidyl peptidase 4 is a functional receptor for the emerging human coronavirus-EMC
[12] Glycosaminoglycans in infectious disease
[13] Mechanisms of coronavirus cell entry mediated by the viral spike protein
[14] Interaction of Zika Virus Envelope Protein with Glycosaminoglycans
[15] Proteoglycans and Sulfated Glycosaminoglycans
[16] Copper regulates the interactions of antimicrobial piscidin peptides from fish mast cells with formyl peptide receptors and heparin
[17] Molecular modeling of protein-glycosaminoglycan interactions
[18] Glycosaminoglycan-protein interactions: definition of consensus sites in glycosaminoglycan binding proteins
[19] The spike glycoprotein of the new coronavirus 2019-nCoV contains a furin-like cleavage site absent in CoV of the same clade
[20] Software News and Updates. AutoDock Vina: Improving the Speed and Accuracy of Docking with a New Scoring Function, Efficient Optimization, and Multithreading
[21] Vina-Carb: Improving Glycosidic Angles during Carbohydrate Docking
[22] Matrix proteoglycans as effector molecules for epithelial cell function
[23] Comparison of Low-Molecular-Weight Heparins Prepared From Bovine Lung Heparin and Porcine Intestine Heparin
[24] Thermodynamic Analysis of the Heparin Interaction with a Basic Cyclic Peptide Using Isothermal Titration Calorimetry
[25] Kinetic Model for FGF, FGFR, and Proteoglycan Signal Transduction Complex Assembly
[26] Further evidence that periodate cleavage of heparin occurs primarily through the antithrombin binding site
[27] Furin cleavage of the SARS coronavirus spike glycoprotein enhances cell-cell fusion but does not affect virion entry
[28] Cleavage of Group 1 Coronavirus Spike Proteins: How Furin Cleavage Is Traded Off against Heparan Sulfate Binding upon Cell Culture Adaptation
[29] Heparin enhances the furin cleavage of HIV-1 gp160 peptides
[30] Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science
[31] 1H and 13C NMR spectral assignments of the major sequences of twelve systematically modified heparin derivatives
[32] Enzymatic synthesis of glycosaminoglycan heparin
[33] Structural Characterization of Pharmaceutical Heparins Prepared from Different Animal Tissues
[34] SWISS-MODEL: Homology modelling of protein structures and complexes
[35] Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein
[36] GAG Builder: a web-tool for modeling 3D structures of glycosaminoglycans
[37] VMD: Visual Molecular Dynamics
[38] Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan
We thank Prof. Jason McLellan of the University of Texas at Austin for providing trimeric SARS-CoV-2 SGP. Additionally, we thank Prof. Lianchun Wang of the University of South Florida for providing the 6-O-desulfated HP derivative.