key: cord-257058-wf6oxzrk authors: Kim, Sinae; Lee, Jong Ho; Lee, Siyoung; Shim, Saerok; Nguyen, Tam T.; Hwang, Jihyeong; Kim, Heijun; Choi, Yeo-Ok; Hong, Jaewoo; Bae, Suyoung; Jhun, Hyunjhung; Yum, Hokee; Lee, Youngmin; Chan, Edward D.; Yu, Liping; Azam, Tania; Kim, Yong-Dae; Yeom, Su Cheong; Yoo, Kwang Ha; Kang, Lin-Woo; Shin, Kyeong-Cheol; Kim, Soohyun title: The Progression of SARS Coronavirus 2 (SARS-CoV2): Mutation in the Receptor Binding Domain of Spike Gene date: 2020-10-26 journal: Immune Netw DOI: 10.4110/in.2020.20.e41 sha: doc_id: 257058 cord_uid: wf6oxzrk

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV2) is a positive-sense single-stranded RNA (+ssRNA) virus that causes coronavirus disease 2019 (COVID-19). The viral genome encodes twelve genes for viral replication and infection. The third open reading frame is the spike (S) gene, which encodes the spike glycoprotein that interacts with a specific cell surface receptor, angiotensin converting enzyme 2 (ACE2), on the host cell membrane. Recent studies identified a single point mutation in the S gene that substitutes glycine for aspartic acid at codon 614 (D614G) and results in greater infectivity than wild-type SARS-CoV2. We investigated the mutation region of the S gene of SARS-CoV2 from Korean COVID-19 patients. New mutation sites were found in the critical receptor binding domain (RBD) of the S gene, which is adjacent to the aforementioned D614G mutation residue. These sequence data demonstrate the active progression of SARS-CoV2 by mutations in the RBD of the S gene. The sequence information on the new mutations is critical to the development of recombinant SARS-CoV2 spike antigens, which may be required to improve and advance the strategy against a wide range of possible SARS-CoV2 mutations.
A positive-sense single-stranded RNA (+ssRNA) coronavirus causing severe acute respiratory syndrome (SARS-CoV) was first reported nearly two decades ago (1). Indeed, beta coronaviruses (CoV) have caused zoonotic epidemics or pandemics in humans three times in the past 20 years: SARS-CoV in 2002, MERS-CoV in 2012, and SARS-CoV2 beginning in late 2019. However, SARS-CoV2 is highly contagious and has caused the worst pandemic of the past 100 years, with the capacity to cause a severe respiratory illness with high mortality. In the absence of an effective treatment or vaccine, SARS-CoV2 has become a global pandemic with nearly 34.14 million cases and 1.018 million deaths as of early October 2020 (https://www.worldometers.info/coronavirus/). The genome sequence of SARS-CoV2 shares approximately 80% identity with that of SARS-CoV. Furthermore, SARS-CoV2 uses the same cell entry receptor, ACE2, as SARS-CoV (2). SARS-CoV2 has nevertheless emerged as a pandemic, even though the two viruses share high similarity in genome sequence as well as the same receptor on the host cell membrane. The S gene of SARS-CoV encodes the spike glycoprotein on the viral envelope, which recognizes a receptor on the membrane of specific host cells. The spike protein is cleaved into S1 and S2 subunits by a protease during the interaction of SARS-CoV with the cell surface ACE2 receptor (3-6). A recent study using an S gene pseudovirus system showed that human ACE2 is the receptor for SARS-CoV2 and that spike pseudovirus entered 293T/hACE2 cells mainly via endocytosis (7). The data also suggested that the spike glycoprotein of SARS-CoV2 is less stable than that of SARS-CoV. In addition, polyclonal anti-SARS S1 subunit antibodies showed no antigen cross-reactivity between SARS-CoV and SARS-CoV2: the entry of SARS-CoV spike pseudovirions was sufficiently blocked, but the same polyclonal antibodies failed to block SARS-CoV2 infection.
In addition, the limited neutralization effect of sera from coronavirus disease 2019 patients implies that recovery from a first infection may not be protective against subsequent infection (7). The receptor binding domain (RBD; amino acids 318-513) within the S1 subunit of the SARS-CoV spike protein shares 73% identity with the RBD (amino acids 307-527) of SARS-CoV2 (8), which is crucial for binding to ACE2 (9, 10). While the RBD of the S1 subunit binds to the peptidase domain (PD) of ACE2 on the host cell membrane, the cleavage site of the S2 subunit is exposed for further cleavage by a host protease, which is essential for SARS-CoV2 infection (3, 5, 11). A recent study compared the cleavage site of SARS-CoV with that of SARS-CoV2 and found a difference in the cleavage of the S1 and S2 subunits between the two viruses, which likely affects viral infectivity (12). The extracellular domain of the SARS-CoV2 spike protein binds to the PD of ACE2 with high affinity (13). Although ACE2 is the known receptor for SARS-CoV2, its primary role in host cells is in processing angiotensin, which controls vasoconstriction and blood pressure. ACE2 is a membrane protein that is expressed ubiquitously, but mainly in the lungs, heart, kidneys, and intestine (14, 15). The reduction of ACE2 expression in cells may exacerbate cardiovascular diseases (1, 16). The claw-like structure of the PD of ACE2 in complex with the RBD of the SARS-CoV spike glycoprotein is supported by the molecular interactions at the binding sites between the RBD of the spike protein and the PD of ACE2 (6, 9, 10, 13, 15). The original report of the receptor for SARS-CoV remains inconclusive because it provided no solid biochemical results other than immunoprecipitation and immunohistochemistry studies following cell infection (17), although there are different crystal structure studies (6, 9, 10, 13, 15).
The present study, which reports novel mutations in the critical RBD of the S gene, may help explain the high pathogenicity of SARS-CoV2 through precise biochemical results with recombinant spike proteins as well as cell infectivity experiments.

This study was approved by the Institutional Review Board of Yeungnam University Medical Center, Korea (approval No. 2020-07-063).

Briefly, cDNA was synthesized from the RNA template of COVID-19 patients using the MMLV Reverse Transcription kit (Millipore Sigma, St. Louis, MO, USA) according to the manufacturer's instructions. The forward and reverse primer sequences used to amplify the S gene of SARS-CoV2 are shown in Table 1. Normalization was based on GAPDH using the following primers: forward, 5′-ACCACAGTCCATGCCATCAC-3′, and reverse, 5′-TCCACCACCCTGTTGCTGTA-3′. The PCR product of the S gene of SARS-CoV2 was ligated into the yT&A cloning vector according to the manufacturer's instructions (DonginBiotech, Seoul, Korea). The positive plasmid vector (0.8 µg) was digested with Hind III restriction enzyme (7.5 units; Takara, Japan) in the provided buffer to release the S gene of SARS-CoV2.

[Table 1] 5′ CCGCCGAGGAGAATTAGTCT. The forward and reverse primer sequences are shown in three panels. The expected PCR product size of the SARS-CoV2 spike gene is indicated at the top of each panel.

The positive clone from colony PCR screening was cultured overnight in 3 ml of LB broth at 37°C. Plasmid DNA was isolated from the bacterial cells according to the manufacturer's instructions (Promega, Madison, WI, USA). The plasmid was digested with Hind III restriction enzyme to confirm the cDNA insert of the SARS-CoV2 S gene. The positive SARS-CoV2 S gene plasmid was sent for DNA sequencing analysis (Cosmotech, Seoul, Korea).

RNA samples were isolated from patients' nasopharyngeal swabs to test for SARS-CoV2 infection.
The RNA samples from Yeungnam (YN) University Hospital were examined by RT-PCR according to the guidelines of the Korea Centers for Disease Control and Prevention. SARS-CoV2 infection was confirmed by the ~Ct value, as shown in Table 2. We used four SARS-CoV2-positive RNA samples to isolate the S gene. The full-length (FL) open reading frame (ORF) comprises 3,822 bp, which encode 1,273 amino acid residues followed by the terminal "TAA" stop codon. Sense and reverse primers were designed (Table 1, upper panel) to amplify the FL S gene of SARS-CoV2. This primer pair failed to amplify the FL spike ORF of 3,822 bp (Fig. 1A), whereas the housekeeping gene GAPDH was amplified adequately from the YN patients' RNA (Fig. 1A, bottom panel). This finding suggested either that the SARS-CoV2 spike RNA transcript was unstable or that the positive-sense single-stranded RNA (+ssRNA) genome of SARS-CoV2 was not abundant enough to amplify the FL S gene. Therefore, we focused on the specific mutation region of the S gene in which aspartic acid 614 was mutated into glycine (D614G). Two additional pairs of forward and reverse primers were designed (Table 1, middle and bottom panels) to amplify the mutated region of the S gene. The forward CoV2sF-2 and reverse CoV2sR-2 primers successfully amplified a 719 bp PCR product from all four YN patients (Fig. 1B), but the forward CoV2sF-3 and CoV2R-3 primers failed to do so (Fig. 1C). The yield of PCR product in Fig. 1B correlated with that of GAPDH, but it did not correspond to the ~Ct value (Table 2). The PCR products of the four patients were ligated into the TA cloning vector, and the ligated samples were then transformed into competent bacterial cells. Five white bacterial colonies from each patient were screened for a positive plasmid. The same primers were used for PCR screening to identify a positive plasmid clone containing the SARS-CoV2 S gene cDNA.
PCR screening identified a different number of positive clones for each patient: four clones for YN1 (Fig. 2A), five clones for YN2 (Fig. 2B), five clones for YN3 (Fig. 2C), and three clones for YN4 (Fig. 2D). Two positive plasmids from each patient, indicated by red numbers, were selected for mini-prep plasmid purification (Fig. 2). As shown in Fig. 3A, Hind III restriction sites lie 15 bp upstream (5′) and 44 bp downstream (3′) of the SARS-CoV2 spike mutation region. The isolated cDNA plasmid was digested with Hind III restriction enzyme to release the S gene cDNA insert. The resulting band was slightly larger than the 719 bp PCR product because an additional 59 bp of vector sequence was released together with the S gene insert of SARS-CoV2 (Fig. 3B). DNA sequencing of the S gene cDNA revealed three mutations in addition to the known D614G mutation. The N-terminal region of the SARS-CoV2 S gene was translated into amino acid sequence and deposited in the NCBI database (GenBank accession number: MW052550). The alignment of the four SARS-CoV2 spike amino acid sequences with the wild-type sequence revealed two additional mutations in the critical RBD and another mutation in subdomain (SD) 2, which is very close to the known mutation residue D614G (Fig. 4). The known mutation residue (D614G) is highlighted in light blue with a red amino acid residue. The three new mutation residues are highlighted in green with red amino acid residues. Eight cysteine residues are marked in yellow, and the potential glycosylation sites are indicated in gray (Fig. 4). The abundant cysteine residues are probably involved in inter- or intra-disulfide bonds for the ternary structure of the spike glycoprotein. Also, the three potential glycosylation sites may influence the structure and stability of the spike glycoprotein on the lipid envelope of SARS-CoV2.
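The comparison behind these substitutions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis pipeline: the two sequences are the Cov2 reference and YN1 rows for residues 499-618 as printed in the Fig. 4 alignment, the codon pairs are those reported for Fig. 5, and the function names and the five-entry codon table are hypothetical helpers introduced here.

```python
# Cov2 reference and YN1 rows for residues 499-618, copied from the Fig. 4 alignment.
REF = (
    "PTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKK"
    "FLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQDVNCT"
)
YN1 = (
    "PTNGVDYQPYRVVVLSFELLHAPATDCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKK"
    "FLPFQQFGRDIADTTDAVRDLQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCT"
)
FIRST_RESIDUE = 499  # residue number of the first character in REF and YN1

# Only the five codons named in the text; this is not a full genetic-code table.
CODON_TO_AA = {"GAT": "D", "GGT": "G", "GTT": "V", "CCA": "P", "CTA": "L"}

# (residue position, reference codon, observed codon) as reported for Fig. 5.
CODON_CHANGES = [
    (504, "GGT", "GAT"),
    (524, "GTT", "GAT"),
    (579, "CCA", "CTA"),
    (614, "GAT", "GGT"),
]


def substitutions(ref, sample, first_residue):
    """List substitutions (e.g. 'D614G') between two gap-free aligned sequences."""
    if len(ref) != len(sample):
        raise ValueError("aligned sequences must have the same length")
    return [
        f"{r}{first_residue + i}{s}"
        for i, (r, s) in enumerate(zip(ref, sample))
        if r != s
    ]


def codon_substitutions(changes):
    """Translate each reported codon pair into an amino acid substitution label."""
    return [f"{CODON_TO_AA[a]}{pos}{CODON_TO_AA[b]}" for pos, a, b in changes]


if __name__ == "__main__":
    print(substitutions(REF, YN1, FIRST_RESIDUE))
    print(codon_substitutions(CODON_CHANGES))
```

Both calls should report the same four changes, which is a quick consistency check between the protein alignment and the codon-level sequencing results.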
Interestingly, the known D614G mutation site and the three novel mutation sites are conserved between severe acute respiratory syndrome coronavirus (SARS-CoV) and SARS-CoV2 (Fig. 4B). The known single point mutation D614G in SD2 of the spike protein has been reported by different groups. The chromatographic DNA sequencing result showed that the aspartic acid 614 codon "GAT" (reference sequence) was substituted by the glycine 614 codon "GGT" (D614G) (Fig. 5A). This known D614G mutation was present in all four patients (Fig. 4). The three additional amino acid residue mutations were also analyzed. Interestingly, these mutations were common to all four patients (Fig. 4). The chromatographic DNA sequencing result of the four patients showed that the glycine 504 codon "GGT" (reference sequence) was substituted by the aspartic acid 504 codon "GAT" (G504D) (Fig. 5B). The valine 524 codon "GTT" (reference sequence) was substituted by the aspartic acid 524 codon "GAT" (V524D) (Fig. 5C). The last mutation resulted from the proline 579 codon "CCA" (reference sequence) being replaced by the leucine 579 codon "CTA" (P579L) in all four patients (Fig. 5A and D).

The COVID-19 pandemic began near the end of 2019 and has caused extraordinary medical, social, and economic destruction in many parts of the world (12). So far there is no specific treatment or vaccine for COVID-19, although there are promising leads in several lines of investigation.
[Fig. 4. Alignment of spike amino acid sequences. (A) Wild-type SARS-CoV2 (Cov2) aligned with the YN1, YN2, YN3, and YN4 sequences; the rows shown end at residues 498, 558, 618, and 677. In the rows shown, the four YN sequences are identical to one another and differ from Cov2 at G504D, V524D, P579L, and D614G. (B) SARS-CoV (Cov; rows ending at residues 484, 544, 604, and 663) aligned with Cov2 and YN1 over the same region.]

Elucidating more precisely how SARS-CoV2 enters host cells is a priority, as disruption of this entry can mitigate the replication and spread of SARS-CoV2. It is fairly well established that, following the cleavage of the SARS-CoV2 envelope spike protein by the host cell protease (3, 5, 11) and the interaction of the RBD with the host cell receptor ACE2, the virus gains entry into cells (4, 6, 7, 9, 10, 12, 13). A recent study reported a single point mutation in the S gene in which aspartic acid 614 is replaced by glycine (D614G), resulting in enhanced infectivity of SARS-CoV2 (18). However, the significance of this mutation is uncertain since the D614G mutation is located in SD2, which has no known specific function, rather than in the RBD of the S gene (Fig. 6). We investigated the RBD of the SARS-CoV2 S gene from four Korean COVID-19 patients. DNA sequencing revealed the D614G mutation in all four Korean COVID-19 patients (Fig. 4), suggesting that the SARS-CoV2 in Korean COVID-19 patients probably came from the same origin. The D614G mutation has been reported from different regions of the world, including continental Europe, Asia, North America, and other continents with fewer COVID-19 cases (18).
The frequency of the D614G mutation in the S gene of SARS-CoV2 increased dramatically after March 21, 2020 compared with before March 1, 2020 (18). Many SARS-CoV2 isolates carried the D614G mutation in Europe prior to March 1, 2020, but few did in China and North America. However, the D614G mutation was dominant all over the world between March 21 and 30, 2020. What is occurring now? The present study with respiratory samples from Korean COVID-19 patients suggests a rapid progression of SARS-CoV2. It is interesting that the D614G mutation exists in all four Korean COVID-19 patients. In addition, we found two additional mutations, G504D and V524D, within the critical RBD of the S gene (4, 6, 7, 9, 10, 12, 13). A third mutation, P579L, is located near the known D614G mutation in SD2, which has no known specific function. Therefore, it is necessary to investigate the function of the different domains of the spike gene to explain the significance of these mutations in SARS-CoV2 pathogenicity. Fig. 6 illustrates the genetic map of SARS-CoV2, which is a positive-sense single-stranded RNA (+ssRNA) virus with a genome of 29,903 bp. The genome of SARS-CoV2 consists of twelve ORFs, each encoding an independent transcript, which are depicted according to their location in the single-stranded genomic RNA. The S gene is located in the third ORF, which contains 3,822 bp encoding 1,273 amino acid residues (Fig. 6, green bar). The large spike glycoprotein is divided into 16 subdomains, including a hydrophobic signal peptide and a transmembrane domain, as described by its physical map. The boundaries of each subdomain are marked by amino acid residue numbers based on recent studies (8, 19). The progression of SARS-CoV2 in Korean COVID-19 patients was demonstrated by novel mutations in the critical RBD of the S gene of SARS-CoV2. The D614G mutation is indicated in red letters above the spike gene. The D614G and P579L mutations are present in the SD2 region (Fig. 6, pink bar), on the C-terminal side of the RBD.
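The numbers quoted above can be cross-checked with a short calculation. The following Python sketch is illustrative only: it assumes the SARS-CoV2 RBD spans residues 307-527, the range quoted in the Introduction (8), and the function names are hypothetical. Boundaries for SD1 and SD2 are not stated in the text, so only RBD membership is tested.

```python
S_ORF_BP = 3822          # length of the S gene ORF quoted in the text
RBD_RANGE = (307, 527)   # assumed RBD residue range of SARS-CoV2, as quoted from (8)
MUTATIONS = {"G504D": 504, "V524D": 524, "P579L": 579, "D614G": 614}


def orf_residue_count(orf_bp):
    """Number of encoded residues for an ORF whose final codon is a stop codon."""
    codons, remainder = divmod(orf_bp, 3)
    if remainder:
        raise ValueError("ORF length is not a multiple of three")
    return codons - 1  # the last codon (TAA) does not encode a residue


def in_rbd(position, rbd=RBD_RANGE):
    """True if a residue position falls inside the assumed RBD range (inclusive)."""
    return rbd[0] <= position <= rbd[1]


if __name__ == "__main__":
    print(orf_residue_count(S_ORF_BP))
    for name, pos in MUTATIONS.items():
        print(name, "RBD" if in_rbd(pos) else "outside RBD")
```

Under these assumptions, 3,822 bp corresponds to 1,274 codons, that is, 1,273 residues plus the stop codon, and only G504D and V524D fall inside the RBD range.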
However, the two novel mutations in the S gene from the Korean COVID-19 patients were present in the RBD (Fig. 6, green bar), a site that is important for binding to the ACE2 receptor on the host cell membrane. There were no mutations in the SD1 region (Fig. 6, light blue) between the RBD and SD2. Unfortunately, there are not enough functional studies of the SD1 and SD2 regions of the S gene. A previous report that analyzed the architecture of the SARS-CoV2 transcriptome using advanced technology sequenced the whole genome of SARS-CoV2 from Korean COVID-19 patients, but that study did not mention the four mutations found here in the S gene from Korean COVID-19 patients (20). Studies are underway to identify the precise functional significance of the novel mutations in the S gene by expressing these mutants in mammalian as well as bacterial cells. The cleavage site for furin or another host enzyme is indicated as the S cleavage site on the N-terminal side of Arg (R) 685. Unlike SARS-CoV2, SARS-CoV does not need proteolytic cleavage to enter host cells, suggesting a lower infectivity than SARS-CoV2 (3, 5). Interestingly, the novel P579L mutation is close to the S cleavage site and may therefore influence the infectivity of SARS-CoV2, as reported in a previous study (12). Since viruses can proliferate only in live cells, it is generally not to their advantage to eliminate the host before replicating and infecting another host. While viral mutation is generally associated with increasing pathogenicity, through immune evasion or drug resistance, it is important to be cognizant of the teleological concept that viruses also need the host to survive. Therefore, it is necessary to thoroughly investigate whether SARS-CoV2 mutations enhance or decrease viral infectivity and pathogenicity. Future studies of S gene mutations should eventually define how the virus and humans adapt to each other and help clarify whether a given viral mutation is beneficial or deleterious to the host.
1. Angiotensin-converting enzyme 2 is an essential regulator of heart function
2. A pneumonia outbreak associated with a new coronavirus of probable bat origin
3. Activation of the SARS coronavirus spike protein via sequential proteolytic cleavage at two distinct sites
4. Coronavirus spike proteins in viral entry and pathogenesis
5. Proteolytic activation of the SARS-coronavirus spike protein: cutting enzymes at the cutting edge of antiviral research
6. Cryo-EM structure of the SARS coronavirus spike glycoprotein in complex with its host cell receptor ACE2
7. Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV
8. Exploring the genomic and proteomic variations of SARS-CoV-2 spike glycoprotein: a computational biology approach
9. Structure of SARS coronavirus spike receptor-binding domain complexed with receptor
10. Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2
11. Inhibitors of cathepsin L prevent severe acute respiratory syndrome coronavirus entry
12. Cell entry mechanisms of SARS-CoV-2
13. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
14. Heart block, ventricular tachycardia, and sudden death in ACE2 transgenic mice with downregulated connexins
15. Assessing ACE2 expression patterns in lung tissues in the pathogenesis of COVID-19
16. Angiotensin-converting enzyme 2 is an essential regulator of heart function
17. Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus
18. Tracking changes in SARS-CoV-2 spike: evidence that D614G increases infectivity of the COVID-19 virus
19. Cryo-EM structures of MERS-CoV and SARS-CoV spike glycoproteins reveal the dynamic receptor binding domains
20. The architecture of SARS-CoV-2 transcriptome