key: cord-0740904-dsc6gsj2 authors: Acosta-Gutiérrez, Silvia; Buckley, Joseph; Battaglia, Giuseppe title: The role of host cell glycans on virus infectivity: The SARS-CoV-2 case date: 2021-05-09 journal: bioRxiv DOI: 10.1101/2021.05.08.443212 sha: 061204a4f06b39898bc295969b8ead9484594025 doc_id: 740904 cord_uid: dsc6gsj2

Long and complex chains of sugars, called glycans, often coat both the cell and protein surface. Glycans both modulate specific interactions and protect cells. On the cell surface, these sugars form a cushion known as the glycocalyx. Here, we show that heparan sulfate (HS) chains, which are part of the glycocalyx, and other glycans expressed on the surface of both host and virus proteins have a critical role in modulating both attractive and repulsive potentials during viral infection. We analyse the SARS-CoV-2 virus, modelling its spike proteins binding to HS chains and two key entry receptors, ACE2 and TMPRSS2. We include the volume exclusion effect that the HS chains impose during virus insertion into the glycocalyx and the steric repulsion caused by changes in the conformation of the ACE2 glycans involved in binding to the spike. We then combine all these interactions, showing that the interplay of all these components is critical to the behaviour of the virus. We show that the virus tropism depends on the combinatorial expression of both HS chains and receptors. Finally, we demonstrate that when both HS chains and entry receptors are expressed at high density, steric effects dominate the interaction, preventing infection.

The specificity of certain viruses for particular hosts and host tissues, known as viral-host tropism, can be described in terms of the number of viruses bound to a given combination of receptors, or phenotype. The binding to the cell surface can be estimated using a Langmuir isotherm.
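As a rough illustration of how a Langmuir isotherm turns a viral titre into a surface coverage, the sketch below assumes the standard form theta = x / (1 + x) with x = rho * v_B * Q, where rho is the viral titre, v_B the binding volume, and Q the partition function of the bound state. The function name `langmuir_coverage` and the numerical values of `v_b` and `q` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def langmuir_coverage(rho, v_b, q):
    """Langmuir surface coverage theta = x / (1 + x), with x = rho * v_b * q.

    rho : viral titre (copies per unit volume)
    v_b : binding volume (same volume unit as rho)
    q   : dimensionless partition function of the bound state
    """
    x = rho * v_b * q
    return x / (1.0 + x)


if __name__ == "__main__":
    # Assumed, illustrative values: v_b and the q values are NOT from the paper.
    v_b = 1.0e-15  # binding volume in ml (assumption)
    for rho in (3.01e7, 6.02e7):  # titres in copies/ml
        for q in (1.0e6, 1.0e8, 1.0e10):
            theta = langmuir_coverage(rho, v_b, q)
            print(f"rho={rho:.2e} copies/ml  Q={q:.0e}  theta={theta:.4f}")
```

The coverage saturates towards 1 as rho * v_B * Q grows and is linear in the titre when that product is small, which is why the same titre can give very different coverages for different receptor phenotypes.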
We define two surface coverages: the glycocalyx surface coverage, $\theta_G$, which represents the fraction of the glycocalyx occupied by virions, and the receptor surface coverage, $\theta_R$, which represents the fraction of the cell surface occupied by viruses [18, 6, 19]. They are equal to

$$\theta_G = \frac{\rho v_B Q_G}{1 + \rho v_B Q_G}, \qquad \theta_R = \frac{\rho_G v_B Q_R}{1 + \rho_G v_B Q_R},$$

where $\rho$ is the bulk viral titre (the number of virus copies per unit volume, or viral load), $\rho_G$ is the viral titre in the glycocalyx, $v_B$ is the binding volume, and $Q_R$ and $Q_G$ are the grand canonical partition functions [...] (Figure 2a).

Only a fraction of these units bind to the virion spike proteins, normally 8 to 10 monomers or binding motifs [20]. Therefore, the total binding energy between a single HS chain and the virus, $\epsilon_{HS}$ (Supporting Information), depends on the number of binding motifs, $N_L$. The more binding motifs, the higher the attractive energy (Figure 2b). This energy also increases with the number of HS chains, which is proportional to the density of proteoglycans (Figure 2b). The repulsive contribution generated by the insertion of the virus into the HS, $q^{steric}_{HS}$ (Supporting Information), increases dramatically with the number of monomers per chain and the number of chains (proteoglycan density) (Figure 2c).

Once a virus has inserted into the glycocalyx, it can bind to the ACE2 receptor via the RBD of the spike glycoprotein (Figure 1). Upon binding, glycans at the binding site are compressed, inducing a repulsive component to this interaction (Figure 1 (lower right panel), Figure 3a, [...] In Figure 3 we also analyse the influence of the trapped glycans on the dissociation con[...]

The authors declare no competing financial interests.

Viral titre (copies/ml): 3.01E+07, 6.02E+07, 3.01E+07, 3.01E+07, 3.01E+07, 3.01E+07, 6.02E+07, 6.02E+07, 6.02E+07

As discussed in the main text, binding between coronaviruses and the host cells proceeds in a two-step process. First, the virus binds to heparan sulfate, the long-chain sugars that coat the cell surface.
After binding to the heparan sulfate, the virus can then form bonds with the receptors on the surface, particularly ACE2 and TMPRSS2.

We define the glycocalyx surface coverage, $\theta_G$, as the probability that a virus is bound to an HS chain at a given site, and the cell-surface coverage, $\theta_R$, as the probability that a virus is bound to a receptor at a given site. Equation 1 in the main text gives the general form of the surface coverage, derived from the Langmuir isotherm, which we specialise to each case, giving

$$\theta_G = \frac{\rho v_B Q_G}{1 + \rho v_B Q_G}, \qquad \theta_R = \frac{\rho_G v_B Q_R}{1 + \rho_G v_B Q_R},$$

where the binding volume, $v_B$, is determined from the viral radius, $R$, and the spike length, $d$, as [...], $\rho$ is the viral bulk concentration, and $\rho_G$ is the viral concentration in the glycocalyx. The cell-surface coverage can be linked to the glycocalyx surface coverage as [...]; however, to limit the complexity of our calculation, we treat $\rho_G$ as a parameter, effectively decoupling the system. [...]

Likewise, for $Q_R$ we have contributions from the receptor binding, $q^{binding}_{R}$, as well as the steric repulsion generated by the surrounding HS, $q^{insertion}_{HS}$, [...] It should be noted that $q^{insertion}_{HS}$ and $q^{insertion}_{HS*}$ are slightly different, owing to differences in the geometry of binding in the two cases. The individual partition functions are discussed in more detail in the following subsections. [...]

Then, we can calculate the total partition function for HS-spike binding from the binding energy [...]

As multiple HS chains bind to a single proteoglycan, we expect the density of the HS chains to be non-uniformly distributed, with areas of locally high density around each proteoglycan. This means that when the virus is bound only to HS chains, it experiences an effectively higher local density than when it is bound to the receptors.

We therefore have to consider two cases for the steric repulsion.
The first is when the virus is bound only to HS chains, and the second is when the virus is bound to the receptors. For the receptor binding case, we take the global HS density, $\rho_{HS}$, while for the HS binding case we take the higher local density.

We can then calculate the energy of insertion in terms of the density of chains, the inserted volume, $V_\mu(z)$, and the insertion parameter, $\delta$. This gives [...] From this expression, we can calculate the partition function as [...] The inserted volume can be calculated geometrically by considering the distance from the surface, $z$, as [...]

In the presence of receptor-spike binding, the cell-virus distance is equal to the receptor tether length, $d_r$. In the absence of receptor-spike binding, however, the situation is more complex. We number the binding sites of an HS chain $n = 1, ..., N_L$, where $n = 1$ is the site farthest from the cell surface. We can then define the distance between two sites, $n$ and $n+1$, as $d_n$. It is not possible to identify each distance between sites, so we assume that the distances can be approximated by the average distance, an assumption that holds for long chains with a roughly equal distribution of sites. We can then express the inter-site distance as [...]

As we cannot know the distance of each virus from the cell surface, we apply a mean-field approximation, treating the repulsion as the repulsion at the average distance. To do this, we must find the probability that a given site, $n$, is the closest bound site to the [...] The expectation value of the cell-virus distance is then given by [...] Then, we can define the energy of binding in terms of the probability that a given site is [...]

The probability of a linker being unbound can be calculated by closure (i.e. all probabilities must sum to 1).
Defining $p_{ij}$ as the probability that linker $i$ is bound to linker $j$, and $p_i$ as the probability that linker $i$ is unbound, we have

$$p_i = 1 - \sum_j p_{ij}.$$

Finally, $p_{ij}$ can be defined as the product of the probability that both $i$ and $j$ are unbound and the probability that a bond will form, which is given by the Boltzmann factor, the negative exponential of the binding energy. This gives the expression

$$p_{ij} = p_i \, p_j \, e^{-\beta f_{R-Spike}},$$

where $f_{R-Spike}$ is the receptor-spike binding energy and $\beta = 1/k_B T$. [...]

ACE2, meanwhile, has three binding sites, and we expect to see the presence of glycans upon binding. Therefore, we need to calculate the form of the repulsive contribution.

We start by considering the Kuhn length of the glycans, $b_G$: the segment size at which each unit can move freely and orient in any direction. All details of chemical arrangements, bond rotation constraints, monomer-monomer interactions, and monomer interactions with the local environment are described by this parameter. A larger Kuhn length corresponds to a more rigid, and therefore less coiled, chain. The mean squared end-to-end distance of a coiled polymer of $N_G$ Kuhn segments is then given by

$$\langle r^2 \rangle = N_G b_G^2,$$

while the length of a fully extended chain is given by

$$L = N_G b_G.$$

[...] determined from a Gaussian distribution,

$$p(r, N_G) = \left( \frac{3}{2 \pi N_G b_G^2} \right)^{3/2} \exp\left( -\frac{3 r^2}{2 N_G b_G^2} \right). \quad (22)$$

Each branch can explore a hemispherical volume of radius $r$. Therefore, integrating equation (22) over this volume gives the partition function for a given branch, [...] (23)

Hence the energy loss associated with the compression of a glycan with $B$ branches upon binding is determined by the change in $r$ from the average chain length, $r_0$, to the binding dis[...]

The total free energy of repulsion can now be found by summing over each glycan, $u_R = \sum^{N_{Glycan}} U_{glycans}$, where $N_{Glycan}$ is the number of glycans.
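The chain statistics described above can be sketched numerically. The block below uses the standard freely jointed chain results for the mean squared end-to-end distance, the contour length, and the Gaussian end-to-end density. The hemisphere integral and the compression energy are an assumed reading of the missing equation (23): the density is integrated in closed form over a hemisphere of radius r, and the free energy cost per branch is taken as minus the logarithm of the ratio of partition functions, in units of k_B T. All function names and the numerical values in the usage section are hypothetical.

```python
import math


def mean_squared_end_to_end(n_g, b_g):
    """<r^2> = N_G * b_G^2 for a freely jointed chain of N_G Kuhn segments."""
    return n_g * b_g ** 2


def contour_length(n_g, b_g):
    """Length of the fully extended chain, L = N_G * b_G."""
    return n_g * b_g


def gaussian_end_to_end_density(r, n_g, b_g):
    """3D Gaussian density p(r, N_G) of the end-to-end vector at distance r."""
    a2 = 3.0 / (2.0 * n_g * b_g ** 2)
    return (a2 / math.pi) ** 1.5 * math.exp(-a2 * r ** 2)


def hemisphere_partition(r, n_g, b_g):
    """Integral of the Gaussian density over a hemisphere of radius r.

    Assumed form: half of the full-sphere integral,
    erf(a r) - (2 a r / sqrt(pi)) * exp(-(a r)^2), with a^2 = 3 / (2 N_G b_G^2).
    """
    a = math.sqrt(3.0 / (2.0 * n_g * b_g ** 2))
    x = a * r
    full_sphere = math.erf(x) - (2.0 * x / math.sqrt(math.pi)) * math.exp(-x * x)
    return 0.5 * full_sphere


def compression_energy_kt(r, r0, n_g, b_g, branches):
    """Assumed free energy cost (in k_B T) of confining `branches` branches
    from radius r0 to radius r: -B * ln(q(r) / q(r0))."""
    ratio = hemisphere_partition(r, n_g, b_g) / hemisphere_partition(r0, n_g, b_g)
    return -branches * math.log(ratio)


if __name__ == "__main__":
    # Illustrative values only (assumptions, not taken from the paper).
    n_g, b_g, branches = 10, 1.0, 2          # segments, Kuhn length (nm), branches
    r0 = math.sqrt(mean_squared_end_to_end(n_g, b_g))
    for fraction in (1.0, 0.75, 0.5, 0.25):
        u = compression_energy_kt(fraction * r0, r0, n_g, b_g, branches)
        print(f"r/r0={fraction:.2f}  U={u:.3f} kT")
```

In this sketch the cost is zero at r = r0 and grows as the glycan is squeezed into a smaller hemisphere, which is the qualitative behaviour the text attributes to the compressed ACE2 glycans.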
Assuming that each glycan contains a similar number of units, we can approximate

$$u_R \approx N_{Glycan} \, U_{glycans}.$$

Coronavirus biology
In situ structural analysis of SARS-CoV-2 spike reveals flexibility mediated by three hinges
Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2
Comparison of simple potential functions for simulating liquid water
CHARMM36m: an improved force field for folded and intrinsically disordered proteins
GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers
Canonical sampling through velocity rescaling
Polymorphic transitions in single crystals: A new molecular dynamics method
Promoting transparency and reproducibility in enhanced molecular simulations
PLUMED 2: New feathers for an old bird