key: cord-0910928-oc3sttwb
authors: Ryzhikov, Alexandr B.; Onkhonova, Galina S.; Imatdinov, Ilnaz R.; Gavrilova, Elena V.; Maksyutov, Rinat A.; Gordeeva, Elena A.; Pazynina, Galina V.; Ryzhov, Ivan M.; Shilova, Nadezhda V.; Bovin, Nicolai V.
title: Recombinant SARS-CoV-2 S Protein Binds to Glycans of the Lactosamine Family in vitro
date: 2021-02-25
journal: Biochemistry (Mosc)
DOI: 10.1134/s0006297921030019
sha: 31c1d976888e062b4ec7e640fe3cdb20574fc60c
doc_id: 910928
cord_uid: oc3sttwb

Many viruses, besides binding to their main cell target, interact with other molecules that promote virus adhesion to the cell; often, these additional targets are glycans. The main receptor for SARS-CoV-2 is a peptide motif in the ACE2 protein. We studied the interaction of the recombinant SARS-CoV-2 spike (S) protein with an array of glycoconjugates, including various sialylated, sulfated, and other glycans, and found that the S protein binds some (but not all) glycans of the lactosamine family. We suggest that parallel influenza infection will promote SARS-CoV-2 adhesion to the respiratory epithelial cells due to the unmasking of lactosamine chains by the influenza virus neuraminidase. ELECTRONIC SUPPLEMENTARY MATERIAL: Supplementary material is available in the online version of this article at 10.1134/S0006297921030019 and on the journal website (http://protein.bio.msu.ru/biokhimiya).

The key role in the adhesion of coronaviruses (CoVs) to host cells belongs to the spike (S) protein. The S protein is a homotrimer, each monomer consisting of two subunits, S1 and S2. The S2 subunit is anchored in the viral membrane and is responsible for the fusion of the virus with the host cell [1, 2]. The S1 ectodomain consists of four subdomains (S1A-S1D). Each of these subdomains can bind to the receptor, but it is still unclear whether they act in concert or separately [3].
In MERS-CoV (Middle East respiratory syndrome coronavirus), the S1B subdomain is responsible for the interaction with dipeptidyl peptidase 4 and is critical for virus penetration into the host cell, while the lectin-like S1A subdomain binds O-acetylated sialoglycan* [4, 5]. The S1B subdomain of SARS-CoV-1 and SARS-CoV-2 recognizes angiotensin-converting enzyme 2 (ACE2) [6]. The specificity of the other subdomains has yet to be established, although ACE2 is not the only possible target of the S proteins of SARS-CoV-1 and SARS-CoV-2 (e.g., they also interact with the CD147 glycoprotein) [7, 8]. The S proteins of other human coronaviruses, HKU1 and OC43, as well as of bovine BCoV, have a lectin-like S1A domain that binds 9-O-acetylated sialic acid; this interaction is of low affinity, but nevertheless contributes to virus penetration to its main target [9]. The data on the sialo-binding activity of the SARS-CoV-2 S protein are contradictory: one study [10] reported the absence of such interaction, whereas other studies have suggested that the S protein may bind gangliosides [11]. It should be noted that all SARS-CoV-2 surface proteins are sialylated, especially the S protein with its 22 glycosylation sites. Since the virus should not bind itself, its glycan-binding protein cannot bind regular sialic acid (Sia)-containing terminal motifs, such as Neu5Acα2-3Gal and Neu5Acα2-6Gal, although this does not exclude its interaction with O-acetylated sialosides, which are uncommon for host epithelial cells (as is the case for the MERS-CoV S protein). In other words, if SARS-CoV-2 binds Sia-containing glycosides, the latter should differ from the usual Neu5Acα2-3Gal and Neu5Acα2-6Gal motifs.

It was shown [12] that low-molecular-weight heparin exerts a significant anti-inflammatory effect in COVID-19 patients due to a pronounced decrease in the content of the pro-inflammatory cytokine IL-6.
It is also believed that heparin can directly interact with SARS-CoV-2. Glycosaminoglycans (GAGs) can act as adhesion factors for adenoviruses, herpesviruses, papillomavirus, cytomegalovirus, and others, and this binding is efficiently inhibited by the soluble form of GAG [13]. Coronaviruses are no exception: NL63 and SARS-CoV-1 (in the form of a pseudovirus) use GAGs for adhesion to the host cell along with the ACE2 receptor [14]. The trimeric form of the S protein from the pandemic SARS-CoV-2 binds to full-length heparin with a remarkably high affinity of 40 pM, which is several orders of magnitude better than for the MERS-CoV S protein [15]. The presence of 2-O- and 6-O-sulfates is important for the binding [16].

Considering all of the above, we examined the glycan-binding specificity of the recombinant SARS-CoV-2 S protein. Although we expected to discover the ability of the S protein to bind some other carbohydrate receptors, such as sulfated or uncommon sialylated mammalian glycans (see above), we found that the S protein bound primarily lactosamine-type glycans with high affinity.

Glyc-PAA-biot glycoconjugates (where Glyc is glycan, PAA is polyacrylamide, and biot is biotin with the C5 spacer; 20 kDa, containing 20 mol% glycan and 5 mol% biotin) were from GlycoNZ (Auckland, New Zealand).

S protein homotrimer. Recombinant trimeric SARS-CoV-2 S protein was prepared using the original expression plasmid vector SBW4G_S FdT4 carrying the chimeric gene encoding the ectodomain of glycoprotein S and the beta-propeller trimeric domain of bacteriophage T4 fibritin with the C-terminal eight-histidine tag (His8 tag) under control of the CAG promoter. Transient expression was achieved by transfection of cultured HEK293 cells with the recombinant plasmid SBW4G_S FdT4. Affinity chromatography purification was performed on a Ni-NTA Superflow resin (Qiagen, Germany).
The purity and integrity of all purified recombinant proteins were confirmed by SDS-PAGE; the molecular weight of the protein subunit is 139 kDa (see Supplement for the amino acid sequence).

Screening by enzyme-linked immunosorbent assay (ELISA). The recombinant S protein (5 μg/ml in phosphate-buffered saline, PBS) was used for coating the microplate wells (Nunc MaxiSorp, Thermo Fisher, USA) overnight at room temperature. After coating, the plate was left to dry and then washed with PBS containing 0.1% Tween 20 (Merck, USA) (PBS-T). Glyc-PAA-biot glycoconjugates were serially diluted in the serum dilution solution (Cat. # PPO0520, Epitek, Novosibirsk, Russia) starting with a concentration of 10 μg/ml and then added to the microplate wells. The microplate was incubated for 90 min on a shaker at 37°C and washed, and 100 μl of horseradish peroxidase conjugate (Epitek, Novosibirsk, Russia; dilution, 1 : 20 in PBS-T) was added to each well. The plate was incubated for another 40 min on the thermoshaker at 37°C and washed. Next, 100 μl of tetramethylbenzidine solution (Thermo Fisher, USA) was added to each well. The plate was incubated for 30 min in the dark at room temperature. The reaction was stopped with 5% H2SO4, and the absorbance of the colored product was measured with a Bio-Rad Model 680 microplate reader (Bio-Rad, USA) at a wavelength of 450 nm.

Eleven glycans that demonstrated the highest binding activity in ELISA screening were added to a strip 96-well Nunc MaxiSorp plate (Thermo Fisher, USA) coated with the S protein (see above). The strips were incubated on a shaker at 37°C; every 10 min, one of the strips was washed with PBS-T and placed in a refrigerator (4°C). After 100 min, all strips were processed as described above, and the data for the strips incubated for different time periods were compared.
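For orientation, time courses of this kind are commonly analyzed with a pseudo-first-order binding model. The sketch below is a textbook form that assumes the glycoconjugate is in large excess over the immobilized S protein; it is not necessarily identical to the authors' equations (1)-(4). Following the symbol definitions used in the text, [R]_0 is the initial glycan concentration and k_+ and k_- are the rate constants; the plateau symbol [LR]_eq is introduced here for illustration only.

```latex
\begin{align}
[LR](t) &= [LR]_{\mathrm{eq}}\left(1 - e^{-kt}\right), \\
k &= k_{+}[R]_0 + k_{-}, \\
K_d &= \frac{k_{-}}{k_{+}}.
\end{align}
```

Under these assumptions, k is obtained from an exponential fit at each glycan concentration, the slope and intercept of k plotted against [R]_0 give k_+ and k_-, and their ratio gives K_d.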
The concentration of the glycan-S protein complex is described by the ligand-receptor interaction equations (1)-(4), where [LR] is the complex concentration; [R]0 and [L]0 are the initial concentrations of glycan and S protein, respectively; k+ and k− are the rate constants. The k exponent was calculated by approximating the experimental dependence of the glycan-S protein complex concentration on time with function (1) using the OriginPro software package. Since k was determined at several conjugate concentrations, plotting the calculated k value against the concentration of the added glycoconjugates produced linear equation (3). Kd was calculated according to equation (4).

Although our group makes extensive use of the printed glycan array (PGA) [17] to study glycan-binding proteins (including viral ones), in the case of the SARS-CoV-2 S protein we used ELISA, for which the polystyrene plates were coated with the S protein and then incubated with Glyc-PAA-biot glycopolymers to study their binding to the immobilized protein. This is a more time-consuming method, and the analysis of the results is complicated by the contribution of non-specific binding due to the presence of excess biotin residues in Glyc-PAA-biot conjugates (>10 per PAA chain on average [18]). However, we believe that this "reverse" method has a definite advantage: the comparison of the optical density (OD) values for the S protein binding to different glycans should give reliable relative values of the interaction affinity, whereas the PGA [19], where ligands are immobilized, does not guarantee an equal degree of ligand immobilization on the solid phase.

Using ELISA, we were able to examine S protein interaction with 155 glycoconjugates, including sialoglycans typical for human cells, various sulfated glycans, N-acetyllactosamine oligosaccharides, glycans representing the ABH blood group antigens, and many others.
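The two-step kinetic analysis described at the start of this passage (an exponential fit of each time course, then a linear fit of k against glycan concentration) can be sketched in Python. This is a minimal illustration, not the authors' OriginPro procedure: it assumes the pseudo-first-order form signal(t) = A(1 − exp(−kt)) with k = k+·c + k−, and all function names, the search range for k, and the synthetic data (rate constants, concentrations in nM, plateau scale) are hypothetical.

```python
import math


def _amplitude_and_sse(k, times, signal):
    """For a fixed rate k, return the least-squares plateau A and the residual sum of squares."""
    shape = [1.0 - math.exp(-k * t) for t in times]
    a = sum(y * s for y, s in zip(signal, shape)) / sum(s * s for s in shape)
    sse = sum((y - a * s) ** 2 for y, s in zip(signal, shape))
    return a, sse


def fit_rate(times, signal, k_min=1e-4, k_max=1.0, grid=200, iters=100):
    """Fit signal(t) = A * (1 - exp(-k * t)) and return (k, A).

    The search range [k_min, k_max] is an assumption of this sketch.
    """
    def sse(k):
        return _amplitude_and_sse(k, times, signal)[1]

    # Coarse log-spaced scan to bracket the best k.
    ks = [k_min * (k_max / k_min) ** (i / (grid - 1)) for i in range(grid)]
    best = min(range(grid), key=lambda i: sse(ks[i]))
    lo, hi = ks[max(best - 1, 0)], ks[min(best + 1, grid - 1)]

    # Golden-section refinement inside the bracket.
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = hi - phi * (hi - lo)
        d = lo + phi * (hi - lo)
        if sse(c) < sse(d):
            hi = d
        else:
            lo = c
    k = 0.5 * (lo + hi)
    return k, _amplitude_and_sse(k, times, signal)[0]


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_kd(concentrations, time_courses, times):
    """Return (Kd, k_on, k_off) from time courses measured at several concentrations."""
    k_obs = [fit_rate(times, course)[0] for course in time_courses]
    k_on, k_off = linear_fit(concentrations, k_obs)  # k = k_on * c + k_off
    return k_off / k_on, k_on, k_off


# Synthetic, noise-free demonstration data (hypothetical values, not from the paper).
times = list(range(10, 101, 10))   # min; one strip every 10 min, as in the protocol
TRUE_K_ON, TRUE_K_OFF = 1e-3, 2e-2  # 1/(nM*min) and 1/min
concs = [10.0, 20.0, 40.0, 80.0]    # nM
courses = []
for conc in concs:
    k_true = TRUE_K_ON * conc + TRUE_K_OFF
    plateau = 3.0 * conc / (conc + TRUE_K_OFF / TRUE_K_ON)
    courses.append([plateau * (1.0 - math.exp(-k_true * t)) for t in times])

kd, k_on, k_off = estimate_kd(concs, courses, times)
print(f"Estimated Kd: {kd:.2f} nM")
```

With real OD readings the fit would also need to account for background signal and noise; a general-purpose nonlinear least-squares routine would be the usual choice there.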
The maximum reliable OD value in this version of ELISA is a little over 3, so signals within the 2.0 to 3.2 range were referred to as strong, and OD values less than 1.0 were interpreted as a lack of binding. Despite the fact that the assay conditions were intentionally "tuned" to amplify the lowest signals, the assay revealed no significant S protein binding to sulfated or common sialylated glycans, i.e., those with the terminal Neu5Acα2-3Gal, Neu5Acα2-6Gal, or Neu5Acα2-6GalNAc moieties or 9-NAc-Neu5Acα structural motifs (9-NAc derivatives were taken as mimetics of the corresponding O-Ac sialosides). The only sialoligand that exhibited high-affinity binding to the S protein was the Neu5Acα2-8Neu5Acα disaccharide (#11, table), although the tetrasaccharide of GD2 ganglioside (Neu5Acα2-8Neu5Acα2-3Galβ1-4Glcβ) and the trisaccharide Neu5Acα2-8Neu5Acα2-8Neu5Acα showed no binding in our assay. The binding of the Neu5Acα2-8Neu5Acα disaccharide is mainly due to the two carboxyl groups of the two Neu5Ac residues (i.e., to Coulomb interactions), which is consistent with the recently published paper on the virus binding to clusters of Neu5Ac monosaccharides [20]. These two residues happen to be positioned at a distance that favors their interaction with the lectin-like site; this interaction is abolished by a bulky substituent R in Neu5Acα2-8Neu5Acα-R (as in gangliosides, where R is the inner carbohydrate core).

Table. The structure of glycans that bind the S protein with a high affinity in ELISA

The table shows all glycans binding the S protein with a high affinity (OD value over 2.0). Out of the 10 top glycans, eight (shown in grey in the table) were typical ligands for human galectins. These glycans were oligolactosamines, the glycan part of the glycosphingolipid asialoGM1, and tetrasaccharides of blood groups A (type 4) and B (type 4).
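The OD cut-offs stated above (strong: 2.0 to 3.2; no binding: below 1.0) can be written as a small classifier. This is only an illustrative sketch: the text does not name the band between 1.0 and 2.0 or say how readings above 3.2 were treated, so the labels "intermediate" and "above reliable range" are assumptions, as are the function names and the example readings.

```python
NO_BINDING_MAX = 1.0  # from the text: OD below 1.0 read as lack of binding
STRONG_MIN = 2.0      # from the text: 2.0 to 3.2 referred to as strong
RELIABLE_MAX = 3.2    # from the text: upper end of the strong range


def classify_od(od):
    """Map a single OD(450 nm) reading to a binding category."""
    if od < NO_BINDING_MAX:
        return "no binding"
    if od < STRONG_MIN:
        return "intermediate"          # assumption: band not named in the text
    if od <= RELIABLE_MAX:
        return "strong"
    return "above reliable range"      # assumption: handling not given in the text


def strong_binders(readings):
    """Return (name, OD) pairs classified as strong, sorted by descending OD."""
    hits = [(name, od) for name, od in readings.items() if classify_od(od) == "strong"]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical readings for illustration only (not data from the paper).
example_readings = {"glycan_a": 2.9, "glycan_b": 0.4, "glycan_c": 1.5, "glycan_d": 2.1}
for name, od in strong_binders(example_readings):
    print(name, od)
```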
Among the "second-tier" glycans (OD values in the range from 2.0 to 2.6), four compounds were lactosamine derivatives and two were the galactose monosaccharide and its derivative with a sulfate group at position 3. Therefore, the S protein preferentially bound beta-galactosides. Interestingly, the unsubstituted Galβ1-4GlcNAcβ disaccharide (monomeric LN) showed lower binding efficiency compared to the top lactosamines from the table, which is also typical for human galectins [21].

The undoubted similarity of the recognition patterns of the SARS-CoV-2 S protein and galectins can be explained by the structural features of the former, namely, the presence of a specific galectin fold in the S1A domain of the S proteins from various CoVs (MERS-CoV, HCoV-OC43, BCoV, and TGEV) [9, 22]. However, until now, no binding to galactose-containing glycans (galectin ligands) has been reported. It is assumed that the galectin fold in the S protein has been evolutionarily borrowed from the host cell in a truncated form and therefore does not function as a full-fledged lectin. That is, this domain is unable to bind βGal motifs; however, in the course of evolution of some viruses, it has acquired the ability to recognize other carbohydrate motifs [23, 24].

Not all of the top glycans are structurally similar to galectin ligands. We have already discussed the disialoside above. The trisaccharide LeX (Galβ1-4(Fucα1-3)GlcNAc) does not bind to any of the known human galectins but, at the same time, contains N-acetyllactosamine. Therefore, the recognition of a not entirely conventional ligand by a not entirely conventional galectin does not seem surprising to us. The blood group A trisaccharide GalNAcα1-3(Fucα1-2)Gal is also known not to be a galectin ligand. We explain its presence on the list of top S protein ligands as follows: this glycan differs from the other ligands in the array in its hydrophobicity, as it has the O(CH2)3NHCO(CH2)5NH spacer.
The same trisaccharide A without the hydrophobic spacer is a poor ligand for the S protein. Taken together, these observations suggest that the observed binding of the trisaccharide A* is a result of a fortunate combination of two components: nonspecific (hydrophobic) binding and specific (GalNAcα) binding, the latter being relatively weak and therefore insufficient for the manifestation of affinity for the S protein without the "assistance" of the unusual spacer.

The revealed ability to recognize glycans with a terminal glucosamine moiety, for instance GlcNAcα1-3GalNAcβ, as well as other glycans that do not belong to the lactosamine family (see above), at first glance does not fit the hypothesis of a galectin-like binding site in the S protein. Nevertheless, there are examples in which replacing just one or two amino acids in a lectin not only abolished the binding of carbohydrates but also led to the ability to recognize another glycan [25]. In our case, the difference in the amino acid sequence is very significant (data not shown), so we are talking only about the similarity of the folds. Hence, the difference between the S protein and glycan-recognizing galectins is not surprising in itself. More surprising is the "restoration" of the ability of the galectin fold in the S protein of the latest coronavirus to bind typical galectin ligands, which has been lost by other coronaviruses.

We also used ELISA in the kinetic mode (see Materials and methods section) to approximately estimate the affinity of the recombinant S protein interaction with the top 11 glycans in the form of Glyc-PAA-biot conjugates. The highest affinity was found for the disialoside (Kd = 10 nM; all values refer to the glycan and not to its polymeric conjugate) and Galβ1-4GlcNAcβ1-6(Galβ1-3)GalNAcα (LN6TF) (Kd = 20 nM). The Kd values for ACE2 reported in the literature range from 1.2 nM [2] to 95 nM [26].
Although it would be incorrect to directly compare data obtained in completely different experimental systems, the measured Kd values suggest a contribution of carbohydrate-mediated reception in vivo.

Obviously, it is especially important to know which glycans are actual or potential/additional targets of the virus in the human epithelium. Based on the data obtained, it can be assumed that this is the N-acetyllactosamine glycan of a glycoprotein O-chain, since the tetrasaccharide that showed one of the best Kd values was LN6TF, whereas the oligolactosamine glycans typical for N-chains demonstrated lower affinity. Most lactosamine motifs on the surface of lung epithelial cells are masked by sialic acid attached to them; therefore, in a healthy person, an additional affinity of the virus (for lactosamines) is unlikely to make a significant contribution to the primary virus adhesion. However, neuraminidase, which is present in many pathogens (first of all, in the influenza virus), may release sialic acid residues, thus exposing lactosamine residues. Hence, we hypothesize that co-infection with a pathogen possessing strong neuraminidase activity will promote adhesion of SARS-CoV-2 and, therefore, increase its virulence.

Funding. The work was supported by the Russian Foundation for Basic Research (project no. 20-04-60335).

* Here, we should mention without further discussion that the SARS-CoV-2 virus preferentially infects individuals with the group A blood type [27].

Ethics declarations. The authors declare no conflict of interest. This article does not contain description of studies with the involvement of humans or animal subjects.

Electronic supplementary material. Supplementary material is available in the online version of this article at https://doi.org/10.1134/S0006297921030019 and on the journal website (http://protein.bio.msu.ru/biokhimiya).
1. Structural basis for human coronavirus attachment to sialic acid receptors
2. Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein (2020)
3. A human monoclonal antibody blocking SARS-CoV-2 infection
4. Identification of sialic acid-binding function for the Middle East respiratory syndrome coronavirus spike glycoprotein
5. Structures of MERS-CoV spike glycoprotein in complex with sialoside attachment receptors
6. Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV
7. Function of HAb18G/CD147 in invasion of host cells by severe acute respiratory syndrome coronavirus
8. SARS-CoV-2 invades host cells via a novel route: CD147-spike protein, bioRxiv
9. Distinct roles for sialoside and protein receptors in coronavirus infection
10. Binding of the SARS-CoV-2 spike protein to glycans, bioRxiv
11. Dual function of sialic acid in gastrointestinal SARS-CoV-2 infection
12. Clinical observations of low molecular weight heparin in relieving inflammation in COVID-19 patients: a retrospective cohort study, medRxiv
13. Glycan engagement by viruses: receptor switches and specificity
14. Human Coronavirus NL63 utilizes heparan sulfate proteoglycans for attachment to target cells
15. Glycosaminoglycan binding motif at S1/S2 proteolytic cleavage site on spike glycoprotein may facilitate novel coronavirus (SARS-CoV-2) host cell entry, bioRxiv
16. Heparin inhibits cellular invasion by SARS-CoV-2: structural dependence of the interaction of the surface protein (spike) S1 receptor binding domain with heparin, bioRxiv
17. Printed glycan array: a sensitive technique for the analysis of the repertoire of circulating anti-carbohydrate antibodies in small animals
18. Polyacrylamide-based glycoconjugates as tools in glycobiology
19. Printed covalent glycan array for ligand profiling of diverse glycan binding proteins
20. The SARS-COV-2 spike protein binds sialic acids and enables rapid detection in a lateral flow point of care diagnostic device
21. Solid-phase assays for study of carbohydrate specificity of galectins
22. Receptor recognition mechanisms of coronaviruses: a decade of structural studies
23. Crystal structure of mouse coronavirus receptor-binding domain complexed with its murine receptor
24. Crystal structure of bovine coronavirus spike protein lectin domain
25. Biology of animal lectins
26. Structural and functional basis of SARS-CoV-2 entry by using human ACE2
27. The ABO blood group locus and a chromosome 3 gene cluster associate with SARS-CoV-2 respiratory failure in an Italian-Spanish genome-wide association analysis, medRxiv