key: cord-0851983-xvcx26x7 authors: Arutyunova, Elena; Khan, Muhammad Bashir; Fischer, Conrad; Lu, Jimmy; Lamer, Tess; Vuong, Wayne; van Belkum, Marco J.; McKay, Ryan T.; Tyrrell, D. Lorne; Vederas, John C.; Young, Howard S.; Lemieux, M. Joanne title: N-Terminal finger stabilizes the reversible feline drug GC376 in SARS-CoV-2 Mpro date: 2021-02-16 journal: bioRxiv DOI: 10.1101/2021.02.16.431021 sha: df5dd7c0d984ae80f3cd3104ad628c0a16026c53 doc_id: 851983 cord_uid: xvcx26x7 The main protease (Mpro, also known as 3CL protease) of SARS-CoV-2 is a high priority drug target in the development of antivirals to combat COVID-19 infections. A feline coronavirus antiviral drug, GC376, has been shown to be effective in inhibiting the SARS-CoV-2 main protease and live virus growth. As this drug moves into clinical trials, further characterization of GC376 with the main protease of coronaviruses is required to gain insight into the drug’s properties, such as reversibility and broad specificity. Reversibility is an important factor for therapeutic proteolytic inhibitors to prevent toxicity due to off-target effects. Here we demonstrate that GC376 has nanomolar Ki values with the Mpro from both SARS-CoV-2 and SARS-CoV strains. Restoring enzymatic activity after inhibition by GC376 demonstrates reversible binding with both proteases. In addition, the stability and thermodynamic parameters of both proteases were studied to shed light on physical chemical properties of these viral enzymes, revealing higher stability for SARS-CoV-2 Mpro. The comparison of a new X-ray crystal structure of Mpro from SARS-CoV complexed with GC376 reveals similar molecular mechanism of inhibition compared to SARS-CoV-2 Mpro, and gives insight into the broad specificity properties of this drug. In both structures, we observe domain swapping of the N-termini in the dimer of the Mpro, which facilitates coordination of the drug’s P1 position. 
These results validate that GC376 is a drug with an off-rate suitable for clinical trials.

In late 2019, a respiratory infection initially detected in China sparked fear of a viral outbreak [1]. This respiratory infection, attributed to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), led to the ongoing coronavirus disease 2019 (COVID-19) pandemic with millions infected worldwide (https://coronavirus.jhu.edu/map.html). This respiratory illness was similar to a previous infection by SARS-CoV that led to the SARS outbreak in 2002/3, as well as to the Middle East respiratory syndrome (MERS) outbreak in 2012 [2,3]. All of these outbreaks stem from related betacoronavirus infections, suggesting these strains will likely lead to future viral outbreaks. Vaccines have been developed and will be important for prevention of new infections in the future. However, even with a 95% immunity rate, there will be a significant proportion of people worldwide who will require therapeutic treatment. Antiviral development remains a priority because of the importance of immediate mitigation of acute infections, vaccine hesitancy, and the inability to vaccinate some individuals. The outbreaks of SARS in 2003 and MERS in 2012, along with the current pandemic, remind us that pan-inhibitors may provide a means for initial control of outbreaks, thereby preventing or quickly controlling pandemics in the future [4].

SARS-CoV-2 is a 30-kb positive-sense single-stranded RNA virus that is translated by the host's cellular machinery to generate two alternatively spliced long polypeptides, PP1a and PP1ab. These long polypeptides release non-structural proteins (nsps), including the RNA-dependent RNA polymerase, that are essential for viral replication after proteolytic cleavage by proteases from domains nsp3 and nsp5: a papain-like protease (PLpro) and a chymotrypsin-like main protease (Mpro or 3CLpro), respectively [5].
Similar to SARS-CoV, the SARS-CoV-2 Mpro enzyme recognises the sequence Leu-Gln↓Ser-Ala-Gly, where ↓ marks the cleavage site, and this sequence is widely employed for generation of substrates for kinetic analysis and for development

Reversibility was tested by measuring catalytic activity post dialysis. Incubation of SARS-CoV Mpro and SARS-CoV-2 Mpro with GC376 followed by dialysis resulted in an increase in enzymatic activity over time, indicative of a reversible dissociation of the inhibitor (Fig. 3). We observed a recovery of 10% of activity after 22 hours of dialysis, which reached 30-40% of initial activity for SARS-CoV and 40-60% for SARS-CoV-2 after 4 days of dialysis, suggesting that over time the substrate competed for the enzyme binding site. To ensure the proteins remained stable over this time period, we also monitored the stability of the uninhibited enzymes, which was compared with the activity of the recovered enzymes. After 4 days the residual protease activity for the uninhibited Mpro of SARS-CoV and SARS-CoV-2 was 30-40%, which allowed us to conclude that the drug was fully reversible.

After observing the high kinetic stability of both viral proteases at room temperature, we characterized their thermal stability and assessed their thermodynamic parameters, including activation energies of inactivation. Thermal stability is a characteristic used to describe the kinetic stability of enzymes, and many individual proteins or protein complexes are known to have high kinetic stability [27] [28] [29] [30] [31]. For viral proteins, particularly the structural ones, this feature is crucial because virus particles must be able to resist harsh environmental conditions until they find a new host to infect and must also remain stable during infection [10,13,32].
For example, determination of thermodynamic parameters of the HIV protease in the presence of various inhibitors was used to reveal the differences in protein stability upon forming inhibitor-protein complexes, which informed inhibitor design [33].

Thermal inactivation of SARS-CoV Mpro (Fig 4A and 4B) and SARS-CoV-2 Mpro (Fig 4D and 4E) was studied over the temperature range of 24-70 °C in a time-dependent manner. The semilogarithmic plots of residual activity versus incubation time were linear at all temperatures for both proteins, which was indicative of a simple first-order monophasic kinetic process. Inactivation rate constants were calculated from the slopes of the semilogarithmic plots and are given in Table 2. For both proteases, the rate constant progressively increased with increasing temperature, whereas the half-life (t1/2) and the decimal reduction time (Dt), two important parameters used in characterization of enzyme stability, decreased.

The dependence of inactivation rate constants on temperature was plotted using the Arrhenius equation (Fig 4C and 4F), from which apparent activation energies of inactivation (Ea) were calculated. Interestingly, Arrhenius plots for both proteases were not linear and showed upward curvature, suggesting two denaturation processes, each with its own temperature dependence and activation energy. At temperatures above 37 °C inactivation is a result of protein unfolding with high activation energy, with the rate of this process strongly dependent on temperature. At temperatures of 37 °C and below this rate becomes insignificant and other processes with low activation energy prevail. The activation energies for the high temperature range were found to be high and similar for SARS-CoV Mpro (Ea = 243.6 kJ/mol) and SARS-CoV-2 Mpro (Ea = 234.2 kJ/mol).
However, for the low temperature range the activation energies were 10-20% of those determined at high temperature, confirming that Mpro inactivation involves both high- and low-activation energy processes. Interestingly, the parameters of the inactivation process in the low temperature range (24-37 °C) are different for Mpro from SARS-CoV and SARS-CoV-2, showing Ea of 16.4 kJ/mol and 41.4 kJ/mol and t1/2 (at 24 °C) of 38.5 h and 57.7 h respectively, suggesting higher stability for SARS-CoV-2 Mpro.

Determination of all thermodynamic parameters of inactivation can provide further information on enzyme stability. The Gibbs free energy, ΔG, is the energy barrier for enzyme inactivation and is directly related to protein stability. We see a significant decrease in ΔG for temperatures above 55 °C, indicating that the destabilization process occurs rapidly in this temperature range (Table 2).

To gain a deeper insight into the driving forces of SARS-CoV Mpro and SARS-CoV-2 Mpro stability, the Gibbs free energy was decomposed into its enthalpic and entropic contributions. Enthalpy, ΔH, measures the number of non-covalent bonds broken during transition state formation for enzyme inactivation, allowing us to compare the energy landscapes of SARS-CoV Mpro and SARS-CoV-2 Mpro. For temperatures ranging from 37 °C to 70 °C we observed consistently high ΔH values, which is in agreement with a temperature-dependent inactivation process. Interestingly, between 24 °C and 37 °C a significant jump in ΔH occurred for both proteases, however, with different initial enthalpy values for SARS-CoV Mpro and SARS-CoV-2 Mpro at 24 °C (13.9 and 38.9 kJ/mol respectively), again highlighting the higher stability of the latter at physiological temperatures (Table 2).
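The low-temperature values quoted above can be cross-checked with a short calculation. The sketch below assumes the commonly used relation ΔH = Ea − RT between the Arrhenius activation energy and the activation enthalpy; the text does not state which relation was used, but the reported ΔH values at 24 °C (13.9 and 38.9 kJ/mol) are consistent with it. It also converts the reported half-lives into first-order rate constants and decimal reduction times. The function names are illustrative only.

```python
import math

R = 8.31  # universal gas constant, J mol^-1 K^-1 (value quoted in the Methods)


def activation_enthalpy(ea_kj_mol, temp_c):
    """Activation enthalpy (kJ/mol), assuming dH = Ea - R*T."""
    temp_k = temp_c + 273.15
    return ea_kj_mol - R * temp_k / 1000.0


def rate_constant_from_half_life(t_half):
    """First-order rate constant k = ln(2) / t_half (units: 1 / units of t_half)."""
    return math.log(2) / t_half


def decimal_reduction_time(k):
    """Time for activity to fall to 10% of its initial value, D = ln(10) / k."""
    return math.log(10) / k


if __name__ == "__main__":
    # Values reported in the text for the 24 °C data: (Ea in kJ/mol, t1/2 in h).
    reported = {"SARS-CoV Mpro": (16.4, 38.5), "SARS-CoV-2 Mpro": (41.4, 57.7)}
    for name, (ea, t_half) in reported.items():
        dh = activation_enthalpy(ea, 24.0)
        k = rate_constant_from_half_life(t_half)
        d = decimal_reduction_time(k)
        print(f"{name}: dH = {dh:.1f} kJ/mol, k = {k:.4f} 1/h, D = {d:.0f} h")
```

Because D and t1/2 are both inversely proportional to k, their ratio is fixed at ln(10)/ln(2), so the longer half-life of SARS-CoV-2 Mpro at 24 °C implies a proportionally longer decimal reduction time.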
The compactness of the protein molecular structure, as well as enzyme and solvent disorder, can be inferred through the quantitative analysis of entropy (ΔS) values [34, 35]. Small negative entropy values at 24 °C for both SARS-CoV Mpro and SARS-CoV-2 Mpro indicated no disorder in protein structure upon inactivation; however, at higher temperatures all values of ΔS were positive and similar, suggesting that unfolding is a rate-limiting step in this range (Table 2).

We previously reported increased catalytic activity of SARS-CoV-2 Mpro in comparison to SARS-CoV Mpro, with the catalytic turnover rate being almost 5 times higher for the former using a FRET-peptide as substrate [13]. We were interested in a structural comparison of the Mpro from SARS-CoV and SARS-CoV-2, for both apo and drug-bound forms, to reveal differences that account for the enhancement in activity. Crystal structures of apo-Mpro from SARS-CoV and SARS-CoV-2, and of the forms bound to the bisulphite prodrug (GC376) and the aldehyde drug (GC373), were determined. The two proteins share 96% sequence identity, with only 12 out of 306 residues being different (S1 Fig). Therefore, as expected, there is little change in the overall structures of apo-SARS-CoV and SARS-CoV-2 Mpro (Fig 5), with an RMSD of 0.6 Å. We observed a new helical feature at η2 (residues 47-50) in SARS-CoV-2, which is unfolded in SARS-CoV (S1 and S2 Fig).

analysis of the biological dimer of the two proteases revealed that the main differences are located at the dimer interface. In the Mpro of SARS-CoV-2, we observed a slight shift of the chymotrypsin-like domains away from each other, compared to the Mpro of SARS-CoV (Fig 5B). However, the biggest change is the difference in association between the dimerization domains (Fig 5C and 5D).
The dimer interface of SARS-CoV and SARS-CoV-2 Mpro is facilitated by several interactions between the two protomers, one of which is between the helical domain III of each protomer, comprising residues 284-286, specifically Ser-Thr-Ile (STI) in SARS-CoV Mpro and Ser-Ala-Leu (SAL) in SARS-CoV-2 Mpro. This unstructured loop self-associates between protomers in the dimer. Importantly, this region harbors a non-conservative substitution at the dimer interface, where Thr285 in SARS-CoV Mpro is altered to Ala285 in SARS-CoV-2 Mpro (Fig 5E and 5F). The SAL motif forms a tight van der Waals interaction, and the residues from each protomer interdigitate to form a complementary interface that readily explains the observed enhanced stability.

We recently presented the structure of GC373 with the SARS-CoV-2 Mpro [13]. The structure of SARS-CoV-2 Mpro with drug GC373, as well as with prodrug GC376, which converts to GC373, reflects the specificity of the enzyme for a glutamine surrogate in the P1 position and a leucine, which is preferred in the P2 position. A benzyl group is in the P3 position. Here we determined the crystal structure of the SARS-CoV Mpro with the prodrug GC376 and the drug GC373 to examine features that determine its efficacy and to compare this with the previously determined SARS-CoV-2 structure (Fig 6).

SARS-CoV Mpro was incubated with GC373 and GC376 prior to crystallization. The best crystals diffracted to 2.0 Å, and the data were refined with good statistics (Table 3). Overall comparison of the SARS-CoV Mpro and SARS-CoV-2 Mpro structures with GC373 showed agreement similar to that of the apo-Mpro structures, with an RMSD of 0.6 Å (Fig 6). The drug binding is supported by H-bonding with the main chain of the oxyanion hole residues Asn142, Gly143 and Ser144, which are identical for both proteases (Fig 6B, S4 Fig and S5 Fig).
A good fit was observed for both the P1 and P2 positions, supported structurally by hydrogen bonding and van der Waals interactions respectively, with the H-bonds for the P1 position being identical for Mpro from SARS-CoV and SARS-CoV-2 (Fig 6C, S4 Fig and S5 Fig).

CoV-2 Mpro and SARS-CoV Mpro, we observe that the N-termini interact with residues near the S1 substrate-binding subsite in a hairpin adjacent to the oxyanion hole of the active site (Fig 7).

(Fig. 8), likely adding to its increased catalytic activity. The proper conformation of the S1 pocket is also important for drug binding and, importantly, the P1 position of GC373 is also stabilized by hydrogen bonding between the side chain of Glu166 (3.3 Å) and the backbone carbonyl of Phe140 (3.3 Å) (Fig 8). Thus, a hydrogen bond network between the protomers of the Mpro dimer stabilizes the S1 subsite for substrate binding and hence inhibitor binding.

CoV-2 Mpro are very similar; however, one key change at the domain III interface, namely Thr285Ala in SARS-CoV-2 Mpro, results in a significant alteration in the distance between the domains of the protomers in the SARS-CoV-2 Mpro dimer compared to SARS-CoV Mpro (Fig 5). This mutation leads to residues in the domain III interface forming a hydrophobic zipper clearly aligning the two domains, and thus likely enhancing the t1/2 at low temperatures, as we have observed above. The high degree of stability of the enzymes for both SARS-CoV and SARS-CoV-2 is an interesting feature that likely contributes to viral potency.

Another structural feature that might explain the increased activity and stability is a closer association between the N-finger Ser1 and Phe140 in the oxyanion loop in the Mpro of SARS-CoV-2 compared to SARS-CoV (Fig 8).
This interaction plays a critical role for activity since it sustains the correct conformation of the oxyanion loop, therefore precise coordination of the N-finger in

We demonstrated that the NH group of Ser1 donates H-bonds to Phe140 and Glu166, the residues that coordinate the N-termini of each protomer in the dimer. Importantly, these residues also interact with the P1 position of GC373 in both SARS-CoV and SARS-CoV-2, demonstrating a strong hydrogen bond network near the active site and stabilization of the S1 subsite pocket.

The 13C-labelled GC376 inhibitor was synthesized according to previously documented procedures, and initial HSQC NMR experiments involving only enzyme, only inhibitor, and both co-incubated were prepared as previously described [13]. The sample used for the reversibility experiment was prepared by subjecting a previously co-incubated sample containing both enzyme and inhibitor to washing steps with buffer (D2O, 50 mM phosphate, pD 7.5 with 20 mM DTT). This involved depositing the sample in an Amicon micro-spinfilter with a 10 kDa cutoff and spinning down the sample at 6600 g for 15 min. The sample was then diluted to 300 µL, and the spin-down and dilution steps were repeated once more, to a final volume of 300 µL. This sample was then analyzed by NMR in an HSQC experiment, following protocols identical to those previously described [13].

Reversibility of 3CL protease inhibition with GC376 was determined by the dialysis method. The proteases were incubated with a single concentration (20 µM) of the GC376 compound for 15 min at RT to allow for full inhibition. The enzyme-inhibitor mixture was then placed in a 6-8 kDa MWCO dialysis membrane (Fisher Scientific, Canada) and dialyzed against 2 L of 50 mM Tris-HCl, pH 7.8, 150 mM NaCl, 5% glycerol, 1 mM DTT at RT. The dialysis buffer was changed every 24 hours.
Control experiments, which included dialyzing apo-proteases at the same concentration in the same dialysis buffer but in different beakers, were performed simultaneously. Aliquots of the dialyzing samples were taken out at certain time points and used for activity measurements. The data were represented as a percentage of the initial protease activity at the zero time point.

The thermal stability was determined by heating a 2 µM solution of SARS-CoV Mpro or SARS-CoV-2 Mpro in 50 mM Tris-HCl, pH 7.8, 150 mM NaCl, 5% glycerol, 1 mM DTT buffer in a thermostatted water bath at various temperatures. Protein samples of 30 µl were taken out at specific time points and immediately incubated on ice until activity measurements were performed as described above. Residual activities were expressed relative to the maximal activity, which was the activity of the proteases at the zero time point.

The enzyme inactivation over time is described by a first-order equation:

$$A = A_0 e^{-kt}$$

where A represents enzyme activity at time t, A0 is the initial activity at time zero, k is the rate constant (min^-1), and t is time (min). Inactivation rate constants (kd) were obtained from the slopes of semilogarithmic plots of residual activity versus incubation time at each temperature. Calculated rate constants were replotted in Arrhenius plots as natural logarithms of k versus the reciprocal of absolute temperature. The Arrhenius law describes the temperature dependence of the rate constant as

$$\ln k = -\frac{E_a}{RT} + \text{const}$$

where Ea is the activation energy, R is the universal gas constant (8.31 J mol^-1 K^-1), and T is the absolute temperature. Ea was calculated from the slope of the Arrhenius plot.
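The two fitting steps just described (a rate constant from the slope of ln(residual activity) versus time, then Ea from the slope of ln k versus 1/T) can be sketched as follows. This is a minimal illustration using an ordinary least-squares slope written with the standard library only; the fitting software actually used is not stated in the text, and the function names are hypothetical.

```python
import math

R = 8.31  # universal gas constant, J mol^-1 K^-1 (value quoted in the text)


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def inactivation_rate_constant(times, residual_activity):
    """First-order rate constant from ln(A/A0) = -k*t.

    residual_activity holds A/A0 as fractions (1.0 at time zero);
    k is returned in reciprocal units of `times`.
    """
    ln_activity = [math.log(a) for a in residual_activity]
    return -least_squares_slope(times, ln_activity)


def activation_energy(temps_celsius, rate_constants):
    """Apparent activation energy (J/mol) from the slope of ln k versus 1/T."""
    inv_temp = [1.0 / (t + 273.15) for t in temps_celsius]
    ln_k = [math.log(k) for k in rate_constants]
    return -least_squares_slope(inv_temp, ln_k) * R
```

Since the Arrhenius plots reported here are curved, a fit of this kind would be applied separately to the low-temperature (24-37 °C) and high-temperature (above 37 °C) points rather than to the full range at once.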
The half-life of the proteases (t1/2), defined as the time after which activity is reduced to 50% of its initial value [46], was determined as

$$t_{1/2} = \frac{\ln 2}{k}$$

Another common way to present the inactivation rate is as the D value (decimal reduction time), which is the time required to reduce activity to 10% of the original value and is calculated as:

$$D = \frac{\ln 10}{k}$$

A new coronavirus associated with human respiratory disease in China
Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
Genome Composition and Divergence of the Novel Coronavirus (2019-nCoV) Originating in China
The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2
From SARS to MERS: crystallographic studies on coronaviral proteases enable antiviral drug design
Evaluating the 3C-like protease activity of SARS-Coronavirus: recommendations for standardized assays for drug discovery
SARS-CoV 3CL protease cleaves its C-terminal autoprocessing site by novel subsite cooperativity
SARS-CoV-2 M(pro) inhibitors and activity-based probes for patient-sample imaging
Malleability of the SARS-CoV-2 3CL M(pro) Active-Site Cavity Facilitates Binding of Clinical Antivirals
Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved alpha-ketoamide inhibitors
Structure of M(pro) from SARS-CoV-2 and discovery of its inhibitors
In silico prediction of potential inhibitors for the main protease of SARS-CoV-2 using molecular docking and dynamics simulation based drug-repurposing
Feline coronavirus drug inhibits the main protease of SARS-CoV-2 and blocks virus replication
Viral Protease Inhibitors Boceprevir, GC-376, and calpain inhibitors II, XII inhibit SARS-CoV-2 viral replication by targeting the viral main protease
Broad-spectrum antivirals against 3C or 3C-like proteases of picornaviruses, noroviruses, and coronaviruses
Reversal of
the Progression of Fatal Coronavirus Infection in Cats by a Broad-Spectrum Coronavirus
3C-like protease inhibitors block coronavirus replication in vitro and improve survival in MERS-CoV-infected mice
Structure-guided design and optimization of dipeptidyl inhibitors of norovirus 3CL protease. Structure-activity relationships and biochemical, X-ray crystallographic, cell-based, and in vivo studies
Broad-spectrum inhibitors against 3C-like proteases of feline coronaviruses and feline caliciviruses
Structure-guided design of potent and permeable inhibitors of MERS coronavirus 3CL protease that utilize a piperidine moiety as a novel design element
Protease inhibitors broadly effective against feline, ferret and mink coronaviruses
Antiviral Drug Discovery: Norovirus Proteases and Development of Inhibitors
Emerging principles in protease-based drug discovery
Based Covalent Inhibitors of Coronavirus 3CL Proteases for the Potential Therapeutic Treatment of COVID-19
Targeting proteases: successes, failures and future prospects
Kinetic partitioning during protein folding yields multiple native states
A soluble domain of the membrane-anchoring chain of influenza virus hemagglutinin (HA2 Escherichia coli into the low-pH-induced conformation
Guanidine hydrochloride-induced denaturation and refolding of transthyretin exhibits a marked hysteresis: equilibria with high kinetic barriers
The domain-swapped dimer of cyanovirin-N is in a metastable folded state: reconciliation of X-ray and NMR structures
SNARE assembly and disassembly exhibit a pronounced hysteresis
Peptide aldehyde inhibitors of hepatitis A virus 3C proteinase
Kinetic and thermodynamic characterisation of HIV-protease inhibitors against E35D↑G↑S mutant in the South African HIV-1 subtype C protease
Kinetics and thermodynamics of the thermal inactivation of polyphenol oxidase in an aqueous extract from Agaricus bisporus
Characterization of three-phase partitioned exo-polygalacturonase from Aspergillus sojae with unique properties
Two adjacent mutations on the dimer interface of SARS coronavirus 3C-like protease cause different conformational changes in crystal structure
Residues on the dimer interface of SARS coronavirus 3C-like protease: dimer stability characterization and enzyme catalytic activity analysis
Production of authentic SARS-CoV M(pro) with enhanced activity: application as a novel tag-cleavage endopeptidase for protein overproduction
Crystallographic structure of wild-type SARS-CoV-2 main protease acyl-enzyme intermediate with physiological C-terminal autoprocessing site
The advantages of describing covalent inhibitor in vitro potencies by IC50 at a fixed time point. IC50 determination of covalent inhibitors provides meaningful data to medicinal chemistry for SAR optimization
The Discovery and Development of Boceprevir: A Novel First-generation Inhibitor of the Hepatitis C Virus NS3/4A Serine Protease
Molecular and Dynamic Mechanism Underlying Drug
Coronavirus main proteinase (3CLpro) structure: basis for design of anti-SARS drugs
The catalysis of the SARS 3C-like protease is under extensive regulation by its extra domain
Structural Basis for Inhibiting Porcine Epidemic Diarrhea Virus Replication with the 3C-Like Protease Inhibitor GC376
Web-Ice: integrated data collection and analysis for macromolecular crystallography
Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix
PHENIX: a comprehensive Python-based system for macromolecular structure solution

Table 3. Data collection and refinement statistics (molecular replacement) for SARS-CoV Mpro with drug GC373 and prodrug GC376

Table 1 ).
For the complex of SARS-CoV-2 Mpro with GC376 and GC373, the data sets collected were processed at resolutions of 1.9 Å and 2.0 Å, in space group C2 (Supplementary Table 1). All three structures were determined by molecular replacement with the crystal structure of the free enzyme of the SARS-CoV-2 Mpro (PDB entry 6Y7M) as search model, using the Phaser program from Phenix [49], version v1.18.1-