key: cord-298326-f5q7j3iu authors: Nick, Benjamin C.; Pandya, Mansi C.; Lu, Xiaotao; Franke, Megan E.; Callahan, Sean M.; Hasik, Emily F.; Berthrong, Sean T.; Denison, Mark R.; Stobart, Christopher C. title: Identification of a critical horseshoe-shaped region in the nsp5 (Mpro, 3CLpro) protease interdomain loop (IDL) of coronavirus mouse hepatitis virus (MHV) date: 2020-06-19 journal: bioRxiv DOI: 10.1101/2020.06.18.160671 sha: doc_id: 298326 cord_uid: f5q7j3iu

Human coronaviruses are enveloped, positive-strand RNA viruses which cause respiratory diseases ranging in severity from the seasonal common cold to SARS and COVID-19. Of the 7 human coronaviruses discovered to date, 3 emergent and severe human coronavirus strains (SARS-CoV, MERS-CoV, and SARS-CoV-2) have jumped to humans in the last 20 years. The COVID-19 pandemic spawned by the emergence of SARS-CoV-2 in late 2019 has highlighted the importance of developing effective therapeutics to target emerging coronaviruses. Upon entry, the replicase genes of coronaviruses are translated and subsequently proteolytically processed by virus-encoded proteases. Of these proteases, nonstructural protein 5 (nsp5, Mpro, or 3CLpro) mediates the majority of these cleavages and remains a key drug target for therapeutic inhibitors. Efforts to develop nsp5 active-site inhibitors for human coronaviruses have thus far been unsuccessful, establishing the need for identification of other critical and conserved non-active-site regions of the protease. In this study, we describe the identification of an essential, conserved horseshoe-shaped region in the nsp5 interdomain loop (IDL) of mouse hepatitis virus (MHV), a common coronavirus replication model. Using site-directed mutagenesis and replication studies, we show that several residues comprising this horseshoe-shaped region either fail to tolerate mutagenesis or are associated with viral temperature-sensitivity.
Structural modeling and sequence analysis of these sites in other coronaviruses, including all 7 human coronaviruses, suggest that the identified structure and sequence of this horseshoe region are highly conserved and may represent a new, non-active-site regulatory region of the nsp5 (3CLpro) protease to target with coronavirus inhibitors.

Importance. In December 2019, a novel coronavirus (SARS-CoV-2) emerged in humans and triggered a pandemic which has to date resulted in over 8 million confirmed cases of COVID-19 across more than 180 countries and territories (June 2020). SARS-CoV-2 represents the third emergent coronavirus in the past 20 years and the future emergence of new coronaviruses in humans remains certain. Critically, there remains neither a vaccine nor established therapeutics to treat cases of COVID-19. The coronavirus nsp5 protease is a conserved and indispensable virus-encoded enzyme which remains a key target for therapeutic design. However, past attempts to target the active site of nsp5 with inhibitors have failed, stressing the need to identify new conserved non-active-site targets for therapeutic development. This study describes the discovery of a novel conserved structural region of the nsp5 protease of coronavirus mouse hepatitis virus (MHV) which may provide a new target for coronavirus drug development.

we used a combination of alanine-scanning mutagenesis and C-terminal additions and deletions to initially mutate the MHV nsp5 IDL (Table 1). Of the 16 amino acids comprising the loop, a total of 8 virus mutants were successfully recovered (P184A, R186A, A188I, V190I, V191I, P194A, Q196A, and Y198A), 5 amino acid residues failed to permit virus recovery despite multiple attempts at rescue (Y185A, D187A, Q189A, Q192A, and T199A), and 3 amino acid residues were not evaluated (L193, V195, and D197).
Among the unrecovered mutants, additional attempts to rescue using more conservative amino acid substitutions at residues D187 (D187E) and Q192 (Q192N) were also unsuccessful. A total of four different C-terminal modifications were also attempted, which included 2 different C-terminal additions (a duplication of residues 197-199 and a duplication of residue 199) and 2 different C-terminal deletions (a deletion of residues 197-199 and a deletion of residue 199). All four of these C-terminal modifications to the nsp5 IDL failed to permit virus recovery.

Analyses of plaque formation, replication, and protease activity reveal a novel temperature-sensitive mutant in the MHV nsp5 IDL. To evaluate the replication kinetics of each of the recovered MHV nsp5 IDL mutants, we infected confluent DBT-9 cells at an MOI of 0.01 with each of the IDL mutants and titered aliquots over a 24 h period (Fig. 2A). All 8 recovered MHV IDL mutants exhibited replication kinetics indistinguishable from those of WT MHV. Previously, we described a total of 3 separate temperature-sensitive mutations (tsV148A, tsS133A, and tsF219L) in the MHV nsp5 protease whose phenotypes could be suppressed through long-distance second-site suppressor mutations (28, 29, 42). To evaluate whether any of the recovered MHV nsp5 IDL mutants may exhibit a temperature-sensitive phenotype, we performed an efficiency of plating (EOP) analysis by comparing the titers of each IDL virus determined by plaque assay at a physiologic (37°C) and an elevated temperature (40°C) (Fig. 3A). Average EOP values were determined as the average ratio of titers at 40°C to titers at 37°C; EOP values less than 10^-1, indicating a greater than 10-fold reduction in titer at the elevated temperature, were classified as temperature-sensitive (ts). WT MHV exhibited an average EOP of 7.6 x 10^-1.
In contrast, the previously described ts nsp5 mutant virus S133A exhibited an average EOP of 1.49 x 10^-4, consistent with the EOP previously reported (29). Two separate MHV nsp5 IDL mutants exhibited average EOP values that were less than 10^-1 and significantly lower than that of WT MHV (p<0.05): P184A and R186A. Mutant P184A exhibited an average EOP of 1.39 x 10^-2. In contrast, IDL mutant R186A resulted in a much lower average EOP of 7.6 x 10^-4, which was not significantly different from that of the known ts mutant S133A. No other IDL mutants exhibited average EOPs significantly different from that of WT MHV. These data suggested that mutagenesis of two separate IDL residues (P184A and R186A) resulted in novel temperature-sensitive phenotypes. To determine whether the observed differences in phenotype for IDL mutants P184A and R186A are due specifically to defects in nsp5 protease activity or some other long-distance effect, we performed a Western blot to evaluate the ability of the P184A and R186A nsp5 proteases to process the maturation cleavage of a downstream replicase (pp1ab) protein, nsp8, during virus replication (Fig. 3B). Lysates from WT-, P184A-, and R186A-infected DBT-9 cells were compared for nsp5-mediated nsp8 processing at 37°C and 40°C. WT MHV and P184A exhibited approximately equivalent levels (ratios of 1.08 and 0.99, respectively) of nsp8 protein detected at both temperatures. Consistent with its temperature-sensitive EOP, virus mutant R186A exhibited reduced nsp8 protein detected at 40°C compared to 37°C (ratio of 0.78) and, when normalized to WT, exhibited an approximately 27% reduction in mature nsp8 protein produced at the elevated temperature. These data demonstrate that MHV nsp5 IDL mutation R186A is associated with reduced nsp5 activity at 40°C, whereas no appreciable difference in processing at 40°C was detected for mutant P184A.
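The EOP and nsp8 arithmetic reported above can be sketched in a few lines of Python. This is only an illustration of the calculations as described in the text, not the analysis pipeline used in the study: the function names, the threshold constant, and the dictionary layout are assumptions introduced here, and the only numbers used are the average EOP values and nsp8 ratios quoted in the text.

```python
from statistics import mean

# Threshold from the text: an average EOP below 10^-1 (a >10-fold drop in
# titer at 40°C relative to 37°C) is classified as temperature-sensitive.
TS_EOP_THRESHOLD = 1e-1  # constant name is an assumption for this sketch


def eop(titer_40c, titer_37c):
    """Efficiency of plating: titer at 40°C divided by titer at 37°C."""
    if titer_37c <= 0:
        raise ValueError("titer at 37°C must be positive")
    return titer_40c / titer_37c


def average_eop(paired_titers):
    """Average the per-replicate EOP ratios for (titer_40c, titer_37c) pairs."""
    return mean(eop(t40, t37) for t40, t37 in paired_titers)


def is_temperature_sensitive(avg_eop, threshold=TS_EOP_THRESHOLD):
    """True when the average EOP falls below the ts threshold."""
    return avg_eop < threshold


def percent_reduction_vs_wt(mutant_ratio, wt_ratio):
    """Percent reduction of a mutant 40°C/37°C ratio after normalizing to WT."""
    return (1.0 - mutant_ratio / wt_ratio) * 100.0


# Average EOP values as reported in the text (hypothetical container name).
REPORTED_AVERAGE_EOP = {
    "WT": 7.6e-1,
    "S133A": 1.49e-4,
    "P184A": 1.39e-2,
    "R186A": 7.6e-4,
}

if __name__ == "__main__":
    for virus, value in REPORTED_AVERAGE_EOP.items():
        label = "ts" if is_temperature_sensitive(value) else "not ts"
        print(f"{virus}: average EOP {value:.2e} -> {label}")
    # nsp8 ratios (40°C vs 37°C) reported for R186A (0.78) and WT (1.08).
    print(f"R186A nsp8 reduction vs WT: {percent_reduction_vs_wt(0.78, 1.08):.1f}%")
```

With the published ratios, the normalization evaluates to about 27.8%, close to the approximately 27% reduction stated in the text; the small difference may reflect rounding of the reported ratios.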
To assess the impact of elevated temperature on replication of the recovered MHV IDL mutant viruses, we repeated the MOI 0.01 replication assay in DBT-9 cells at 40°C (Fig. 2B). In contrast to replication at 37°C, the replication kinetics among the MHV IDL strains were far more variable, with most strains exhibiting a delay in logarithmic growth compared to WT MHV. Mutant P184A, which had shown a temperature-sensitive EOP of 1.39 x 10^-2, failed to exhibit replication kinetics that were significantly different from those of wild-type or the other MHV IDL strains. In contrast, mutant strain R186A was significantly delayed in reaching the maximal logarithmic growth rate (p<0.05) compared to WT MHV, consistent with its temperature-sensitive EOP of 7.6 x 10^-4. Collectively, these data indicate that mutant R186A exhibits both significantly reduced capacity to form plaques and delayed replication kinetics at the elevated temperature of 40°C compared to WT MHV.

Reversion analysis of ts MHV nsp5 IDL mutant R186A reveals three compensatory second-site suppressor mutations. To identify potential interacting residues and novel regulatory networks within the MHV nsp5 protease associated with residue R186, we performed reversion analysis at 40°C by expanding and sequencing plaques formed at the inhibitory temperature (Fig. 4A). A total of 10 plaques were selected and expanded in T25 flasks for virus collection and sequencing. Of these, 6 plaques yielded the original R186A mutant virus, while 3 plaques yielded R186A in addition to one of three different putative second-site suppressor mutations in nsp5: P184S, L141V, and L141I (Fig. 4B). Additional sequencing was performed on these 3 recovered viruses throughout the ORF1ab coding region and no other mutations were identified.
The P184S mutation arose within the MHV nsp5 IDL, while residue L141 is located on the same loop housing the C145 catalytic residue of the active site.

To evaluate whether the emergence of these second-site suppressor mutations aids in viral growth at 40°C, an EOP analysis was performed using these viruses at 37°C and 40°C (Fig. 4C).

bottom part of the binding pocket for residues P2-P5 of the substrate (Fig. 5A and B). In modeling using the crystal structure of SARS-CoV-2, residues D187 and T188 form a distinct pocket in and around the P2 residue (Leu), residues T188 and Q189 establish the back wall of the P3 binding pocket, and residues Q189, T190, and Q192 are responsible for forming the back (Q189 and T190) and base (Q192) of the P4 and P5 binding pockets.

Among the MHV IDL mutants which failed to rescue were D187A, Q189A, and Q192A. Amino acid residues D187 and Q192 are structurally conserved in all sequenced nsp5 proteases to date (Fig. 1B). Both D187 and Q192 are located in a conserved horseshoe-shaped region in the N-terminus of the IDL. The D187 side chain projects from the top of the horseshoe-shaped region towards domain 1 and the protease active site and forms the inner wall pocket for the P2 binding site. In an alignment of the D187 residues of MHV, SARS-CoV, MERS-CoV, and SARS-CoV-2, the positioning and orientation of the side chain are highly conserved, with predicted polar contacts with two additional highly conserved residues, R40 (which is immediately adjacent to the catalytic H41) and Y54 (Fig. 5C). The Q192 side chain is conserved in its positioning towards the center of the horseshoe-shaped region, where it shares predicted polar contacts with several other IDL residues including A188 and R186 (in MHV), R186 and R188 (in SARS-CoV-2), K191 (in MERS-CoV), and T190 (in SARS-CoV) (Fig.
5D).

(including the current SARS-CoV-2 pandemic) which collectively highlight not only the importance of rapid development of effective therapeutics for the treatment of COVID-19, but also the need to be prepared for potential future coronavirus outbreaks. In the present study, we evaluated the structure and function of the nsp5 protease IDL, a poorly studied and structurally conserved region of the protease. Using site-directed mutagenesis, we demonstrated that some residues and regions of the protease were capable of accepting mutations without apparent defects in viral replication; however, a number of residues mostly located within a horseshoe-shaped region in the N-terminus of the protease either failed to permit virus recovery or resulted in viral temperature-sensitivity. Of the 16 amino acid residues comprising the loop, we were able to successfully recover viral mutants at 8 different locations (Table 1).

Despite the overall structural conservation of the entirety of the loop, the majority of these mutations resulted in no apparent defects in viral replication compared to WT. A few of these residues (A188, V190, and V191) with no apparent viral defects are known to form the basis of part of the P3-P5 substrate binding pockets of the protease (24, 26). Yet, compared to the rest of the IDL, these residue positions showed among the least sequence conservation (Figure 1B), which may explain the plasticity with which these residues could tolerate mutagenesis as well as cleavage site variability among coronaviruses (16). Similarly, the more C-terminal residues P194, Q196, and Y198 are also found in more variable sequence locations within the IDL. Collectively, these 8 residue positions may simply represent flexible linker residues rather than serving additional structural supportive or enzymatic roles within the protease.
Residues P184 and R186, while rescued when mutated to alanine, exhibited reduced capacity to form plaques at 40°C. P184 is found at a bend leading into the horseshoe-shaped region of the IDL and may be responsible for helping stabilize the N-terminal anchor of the loop within domain 2. Replication analysis and Western blots of the P184A mutant virus failed to show significant differences from WT MHV; however, the selection of a P184S mutation in reversion analysis of R186A may suggest that these two residues represent stabilizing and interacting nodes within the protease (Figure 4B). We previously described 3 different temperature-sensitive mutations in MHV-A59 (S133A, V148A, and F219L) which all shared overlapping compensatory second-site suppressor mutations (28, 29, 42). All 3 viruses selected for an H134Y mutation, while the temperature-sensitive V148A mutation selected for an S133N mutation. Furthermore, second-site mutations were identified for F219L which were located greater than 20 Å away from the initial mutation. P184A is located on a loop in domain 2 adjacent to both S133 and H134 (less than 6 Å away; not shown). MHV viral mutant R186A was found to exhibit delayed replication kinetics (Figure 2), reduced capacity to form plaques (Figure 3), and reduced nsp5-mediated proteolytic processing at the elevated temperature of 40°C, consistent with a temperature-sensitive phenotype (Figure 3). Perhaps surprisingly, the R186 residue position was the most variable and least conserved structurally among all 7 HCoVs evaluated (Figure 1B). Structural analysis of MHV, SARS-CoV, SARS-CoV-2, and MERS-CoV revealed that the side chain of the 100% conserved Q192 appears to form conserved polar interactions with the backbone amino and carboxyl termini of the residue 186 position (Figure 5D).
These data may suggest that Q192 is stabilized within the horseshoe

show a high level of amino acid conservation with two of these residues (D187 and Q192) being 100% conserved across all known coronavirus nsp5 protease sequences to date (Figure 1B). All four of these residues are found within a conserved horseshoe-shaped region within the N-terminus of the nsp5 IDL. We propose that this horseshoe-shaped region is a critical region of the protease for both structure and function based on the following observations: (1) residues

with the catalytic dyad H41 and C145 residues labeled. Predicted polar contacts between Q192 and other residues of the IDL are shown. SARS-CoV has an additional and unique predicted polar interaction with T190 (shown in red

Coronaviruses: an overview of their replication and pathogenesis
Conservation of substrate specificities among coronavirus main
Inhibition of SARS-CoV 3CL protease by flavonoids
Prediction of Novel Inhibitors of the Main Protease (M-pro) of SARS-CoV-2 through Consensus Docking and Drug Reposition
Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors
Evaluation of a Non-Prime Site Specificities of 3C and 3C-like Proteases by Zinc-coordinating and Peptidomimetic Insights for Wide Spectrum Anti-Coronavirus Drug Design
Structure of the main protease from a global infectious human coronavirus, HCoV-HKU1
Human coronavirus OC43 3CL protease and the potential of ML188 as a broad-spectrum lead compound: Homology modelling and molecular dynamic studies
Modeling of the