key: cord-0698338-sefrctgd
authors: Bai, Chongzhi; Zhong, Qiming; Gao, George Fu
title: Overview of SARS-CoV-2 genome-encoded proteins
date: 2021-08-10
journal: Sci China Life Sci
DOI: 10.1007/s11427-021-1964-4
sha: 0653277b2654da9b489fcda64ee3f67c3fa91177
doc_id: 698338
cord_uid: sefrctgd
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has spread rapidly throughout the world. SARS-CoV-2 is an enveloped, positive-sense, single-stranded RNA virus with a genome of approximately 30,000 nucleotides. The SARS-CoV-2 genome encodes 29 proteins, including 16 nonstructural, 4 structural and 9 accessory proteins. To date, more than 1,228 experimental structures of SARS-CoV-2 proteins have been deposited in the Protein Data Bank (PDB), including 16 protein structures, two functional domain structures of nucleocapsid (N) protein, and scores of complexes. Overall, they exhibit high similarity to SARS-CoV proteins. Here, we summarize the progress of structural and functional research on SARS-CoV-2 proteins. These studies provide structural and functional insights into proteins of SARS-CoV-2, and further elucidate the intricate relationships between different components at the atomic level in the viral life cycle, including attachment to the host cell, viral genome replication and transcription, genome packaging and assembly, and virus release. Understanding the structural and functional properties of SARS-CoV-2 proteins is important because it will facilitate the development of anti-CoV drugs and vaccines to prevent and control the current SARS-CoV-2 pandemic.
In December 2019, an outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was reported that rapidly spread to nearly every country in the world (Wu et al., 2020). It was declared a "public health emergency of international concern" on January 30, 2020, and a pandemic on March 11, 2020 by the World Health Organization (WHO).
As a result, coronavirus disease 2019 (COVID-19) has become a serious threat to humans worldwide. As of April 29, 2021, more than 148.3 million cases and 3.1 million deaths had been reported by the WHO (https://www.who.int/emergencies/diseases/novel-coronavirus-2019). The most likely explanation for the emergence of SARS-CoV-2 is animal-to-human interspecies transmission, followed by human-to-human transmission. Multiple research teams from China and other countries successfully isolated the virus and completed whole genome sequencing. The data indicate that SARS-CoV-2 belongs to the β-coronavirus genus (Xu et al., 2020b) and provide theoretical support for subsequent epidemic prevention, control and clinical treatment.
SARS-CoV-2 displays the typical genome organization of the β-coronaviruses. The genome contains 14 functional open reading frames (ORFs), flanked by noncoding regions at both ends, with multiple regions that encode nonstructural proteins (NSPs), accessory proteins and structural proteins (Wu et al., 2020) (Table 1). ORF1a and ORF1b encode 16 NSPs (nsp1-nsp16) that are required for viral RNA synthesis (Figure 1). There are four structural proteins, the spike (S) protein, envelope (E) protein, membrane (M) protein and nucleocapsid (N) protein, which are essential for viral assembly. Nine accessory proteins provide a selective advantage in the infected host (Bartlam et al., 2007; Ziebuhr, 2004) (Figure 2). Blocking virus entry into host cells and preventing viral replication and transcription are two primary strategies for the design of drugs that target SARS-CoV-2. The systematic determination of the three-dimensional structure of each protein encoded by SARS-CoV-2 contributes to clarifying their function and identifying potential therapeutic targets.
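The protein inventory described above (16 NSPs, 4 structural proteins and 9 accessory proteins, 29 in total) can be written down as a small lookup table. The following is only an illustrative sketch: the dictionary layout and the function name `count_proteins` are assumptions made here, and the accessory protein names follow the list given later in this review.

```python
# Illustrative tally of the SARS-CoV-2 protein classes named in this review.
# The data layout is a hypothetical convenience, not a format used by the authors.
PROTEIN_CLASSES = {
    # nsp1-nsp16, encoded by ORF1a and ORF1b
    "nonstructural": [f"nsp{i}" for i in range(1, 17)],
    # spike, envelope, membrane and nucleocapsid proteins
    "structural": ["S", "E", "M", "N"],
    # accessory proteins as enumerated in the accessory-protein section
    "accessory": ["ORF3a", "ORF3b", "ORF6", "ORF7a", "ORF7b",
                  "ORF8", "ORF9b", "ORF9c", "ORF10"],
}


def count_proteins(classes):
    """Return the number of proteins in each class."""
    return {name: len(members) for name, members in classes.items()}


counts = count_proteins(PROTEIN_CLASSES)
total = sum(counts.values())
print(counts)
print("total proteins:", total)
```

The per-class counts sum to the 29 proteins stated in the abstract, which is a quick consistency check on the numbers used throughout the text.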
In this review, we summarize the structural and functional studies conducted on SARS-CoV-2 proteins with a focus on the proteins associated with viral genome replication and transcription, as well as those that mediate the fusion of the viral membrane with the host cell membrane. These results are significant to combating SARS-CoV-2, and structural characterization of these proteins will be useful for vaccine design and for the development of antiviral therapeutics against SARS-CoV-2.
SARS-CoV-2 ORF1a and ORF1b encode 16 NSPs, which carry multiple enzymatic functions (Figure 1, Table 1). These NSPs regulate viral RNA replication and transcription and act as subunits of the replication/transcription complexes. Some of them are common enzymes that perform key functions, including papain-like protease activity (PLpro, nsp3), 3C-like cysteine protease activity (Mpro or 3CLpro, nsp5), RNA-dependent RNA polymerase activity (RdRp, nsp12), and superfamily 1-like helicase and ATPase activity (nsp13). However, others represent less common enzymes and may be related to the unique characteristics of the coronavirus, including primase activity (nsp8), exoribonuclease activity (ExoN homolog, nsp14), nidoviral RNA uridylate-specific endoribonuclease activity (NendoU homolog, nsp15), and ribose 2′-O-methyltransferase activity (2′-O-MT, nsp16). Two main ORFs of SARS-CoV-2, ORF1a and ORF1b, account for the majority (~70%) of the viral genome. ORF1ab is translated into the polyproteins pp1a or pp1ab, the latter through a ribosomal frameshift (Arya et al., 2021; Helmy et al., 2020).
nsp1 is the N-terminal cleavage product of ORF1ab and one of the least characterized NSPs of the coronaviruses. It functions to suppress host gene expression through ribosome association.
The cryo-electron microscopy (cryo-EM) structure of the nsp1 and human 40S ribosomal subunit complex reveals that the nsp1 C terminus binds to and obstructs the mRNA entry tunnel. This supports the view that nsp1 specifically binds to the 40S ribosome of the host cell and accelerates the degradation of mRNA. This causes a reduction in cellular protein synthesis and promotes virus survival (Kamitani et al., 2006; Thoms et al., 2020). However, the mechanism through which the virus circumvents the nsp1-mediated translation barrier to produce its own proteins needs to be further investigated. nsp1 effectively blocks the RIG-I-dependent innate immune response, which hinders clearance of the viral infection (Hu et al., 2017; Sparrer et al., 2012; Strahle et al., 2007; Thoms et al., 2020). Therefore, nsp1 may be a potential antiviral target and may act as a virulence factor because of its participation in multiple stages of the coronavirus life cycle.
Figure 1 Structures of SARS-CoV-2 nonstructural proteins. The full SARS-CoV-2 genome and its 16 nonstructural proteins are shown as a cartoon representation with rainbow colors. The SARS-CoV-2 proteins with resolved structures are shown in black, while those with unresolved structure or no Protein Data Bank (PDB) ID (nsp4 and nsp14) are shown in red (using resolved structures of the SARS-CoV proteins instead here).
The role of nsp2 is uncertain and it is the most variable among the coronaviruses (Graham et al., 2006). The nsp2 of SARS-CoV is not required for viral replication in culture, but deletion of the nsp2 coding sequence attenuates viral growth and RNA synthesis (Graham et al., 2006). A recent study indicates that the structural similarity of this region is subject to positive selective pressure to protect against mutations occurring in the endosome-associated-protein-like domain of the nsp2 protein, which may explain, in part, why SARS-CoV-2 is more contagious than SARS-CoV.
The amino acid sequences of SARS-CoV-2 nsp3 and SARS-CoV nsp3 are highly conserved (75% homology). nsp3 is the largest NSP of SARS-CoV-2, with a molecular mass of 213 kDa. It is generated by proteolytic cleavage of the pp1a polyprotein at two sites and contains at least seven subdomains: an N-terminal acidic domain (Ac, also called nsp3a), an X-domain (ADRP or nsp3b), nsp3c, a papain-like proteinase (PLpro, also called nsp3d), and additional domains (nsp3e-g) (Tan et al., 2009). The crystal structure of SARS-CoV-2 nsp3 macrodomain-X reveals that the unique domain of the coronavirus nsp3 core forms a head-to-tail homodimer, and each monomer consists of two subdomains: subdomain A and subdomain B (Alhammad et al., 2021) (Figure 3A and B). nsp3 acts as a scaffold, which binds to host proteins to promote virus survival (Ma-Lauer et al., 2016). It is also involved in mediating viral replication or the host cell's response to viral infection (Tan et al., 2009).
The PLpro is a cysteine protease that recognizes the LXGG tetrapeptide motif between nsp1 and nsp2, nsp2 and nsp3, and nsp3 and nsp4. The hydrolysis of the peptide leads to the release of nsp1, nsp2 and nsp3, which are essential for viral replication (Harcourt et al., 2004). Recently, the crystal structures of two inhibitors in complex with SARS-CoV-2 PLpro revealed their inhibitory mechanisms, providing a molecular basis for the observed substrate specificity profiles (Rut et al., 2020). Notably, PLpro is an attractive antiviral drug target, and the crystal structure of SARS-CoV-2 PLpro in complex with ubiquitin-like interferon-stimulated gene 15 protein (ISG15) reveals the high affinity and specificity of the interaction of these two proteins. The inhibition of the PLpro domain of nsp3 inhibits the virus-induced cytopathogenic effect, maintains the antiviral interferon pathway, and reduces viral replication in infected cells (Shin et al., 2020).
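The LXGG recognition motif described above for PLpro (leucine, any residue, glycine, glycine) lends itself to a simple pattern search. The sketch below is a minimal illustration using Python's standard `re` module. The function name `find_lxgg_sites`, the toy sequence, and the convention of reporting the cleavage position immediately after the second glycine are assumptions made for this example, not details taken from the cited studies.

```python
import re

# L-X-G-G in one-letter amino acid codes: leucine, any residue, glycine, glycine.
# The lookahead wrapper lets overlapping candidate motifs be reported as well.
LXGG_PATTERN = re.compile(r"(?=(L[A-Z]GG))")


def find_lxgg_sites(sequence):
    """Return (motif, cleavage_index) pairs for every LXGG match.

    cleavage_index is the 0-based index of the first residue after the motif,
    under the assumption that cleavage follows the second glycine.
    """
    sequence = sequence.upper()
    return [(m.group(1), m.start() + 4) for m in LXGG_PATTERN.finditer(sequence)]


# Hypothetical toy sequence, not a real polyprotein fragment.
toy_sequence = "AAALNGGSSSTTTLKGGAAA"
sites = find_lxgg_sites(toy_sequence)
for motif, index in sites:
    print(f"{motif} found; putative cleavage before residue index {index}")
```

A motif match alone does not establish that a site is cleaved in vivo, so a scan like this is best read as a way to list candidate sites for further inspection.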
Although nsp3 shows a high rate of genetic diversity during coronavirus evolution (Forni et al., 2016), it still retains several crucial functions that warrant in-depth study.
To date, of the 16 coronavirus NSPs, only nsp3, nsp4 and nsp6 contain transmembrane domains (Oostra et al., 2008; Ziebuhr, 2005) that participate in the interaction of the replication complex with cytoplasmic membranes and function as membrane anchors or as replication complex scaffolds (Imbert et al., 2008). nsp4, which is approximately 500 amino acid residues in length, plays a significant role in coronavirus replication and the formation of the reticulovesicular network, which integrates convoluted membranes and numerous interconnected double-membrane vesicles (DMVs) (Knoops et al., 2008; Wolff et al., 2020). Studies have indicated that comprehensive membrane pairing results from full-length coexpression of nsp3 and nsp4, in which paired membranes are kept at the same distance as in authentic DMVs (Angelini et al., 2013). Overall, nsp4 is a transmembrane protein with a complex structure. The transmembrane regions of nsp4 may mediate the interaction of the coronavirus replication complex with cellular membranes.
nsp5 is a 33 kDa cysteine protease, also known as the main protease (Mpro). The polyproteins encoded by coronavirus ORF1a and ORF1b are processed by nsp5 at 11 conserved cleavage sites to generate 12 functional proteins, which subsequently form a complex responsible for many aspects of virus genome replication (Masters, 2006; Perlman and Netland, 2009). For all coronaviruses in the β-coronavirus subgroup, nsp5 is required for coronaviral polyprotein processing and is only active upon ligand-induced dimerization (Tomar et al., 2015). Similar to other β-coronavirus nsp5 proteins, SARS-CoV-2 nsp5 functions as an active homodimer with an approximate C2 symmetry (Figure 3C and D).
Compared with the dimer of SARS-CoV nsp5, the SARS-CoV-2 nsp5 dimer is tighter and closer because of a T285A substitution, which results in a higher catalytic efficiency. Some inhibitors targeting the SARS-CoV nsp5 are effective against all coronavirus-related diseases (Yang et al., 2005). Therefore, many compounds, including approved drugs and natural products, have been screened, and some of them, such as carmofur and ebselen, exhibited favorable IC50 values when tested against SARS-CoV-2 (Jin et al., 2020). The crystal structures of SARS-CoV-2 nsp5 in complex with various inhibitors have been deposited in the Protein Data Bank (PDB). These structures are useful for understanding the molecular basis of the interaction. Overall, nsp5 is an attractive target for the development of anticoronavirus therapeutics because of its indispensable role during virus replication.
nsp6 is a transmembrane protein of the coronavirus lineages. It localizes to the endoplasmic reticulum (ER) and generates autophagosomes (Forni et al., 2017). The overexpression of nsp6 disturbs intracellular membrane trafficking, leading to the accumulation of single-membrane vesicles around the complex of microtubules (Cottam et al., 2011; Oostra et al., 2008). nsp6 most likely restricts the expansion of autophagosomes induced by starvation or chemical inhibition of mTOR signaling (Tang et al., 2006). However, there is no direct evidence that autophagy and mutations associated with nsp6 contribute to viral replication and escape from cellular immunity. nsp6 may lead to changes in critical host antiviral defenses, such as the autophagic lysosomal machinery. Changes in these uncommon regions should be constantly monitored because the pathogenicity of SARS-CoV-2 may be modified upon mutation.
nsp9 is a single-stranded RNA-binding protein implicated in the virulence of the virus (Miknis et al., 2009). The crystal structure of SARS-CoV-2 nsp9 shows that it contains a dimer in the asymmetric unit (Figure 3E and F).
In each monomer, seven β-strands and one α-helix are arranged into a single compact domain to form a cone-shaped β-barrel flanked by the C-terminal α-helix. The dimeric form of nsp9 provides a single, uninterrupted nucleic acid-binding site. This supports the view that nsp9 functions as a nucleic acid-binding protein (Figure 3E and F) (Littler et al., 2020). nsp9 can bind to the nsp12 nidovirus RdRp-associated nucleotidyltransferase (NiRAN) domain, enabling its N terminus to be inserted into the catalytic center of nsp12 NiRAN. This inhibits its activity (Yan et al., 2021) and disrupts the dimer interface of nsp9, which may be a useful strategy for anticoronavirus drug design.
The crystal structure of SARS-CoV-2 nsp15 has been determined (Kim et al., 2020) (Figure 3G and H). The nsp15 proteins are conserved in coronaviruses, with each containing a C-terminal catalytic domain belonging to the EndoU enzyme family (Kim et al., 2020), in which the active biological unit is a hexamer (Ricagno et al., 2006) (Figure 3I). These types of enzymes are found in all living entities, where they perform various biological functions related to RNA processing. It has been reported that the NendoU activity of nsp15 is related to protein interference with the innate immune response, because nsp15 can degrade viral RNA to hide it from host defenses (Deng et al., 2017). Mutations in the nsp15 protein may reduce viral replication, resulting in greatly attenuated disease in mice (Kindler et al., 2017). Therefore, nsp15 is vital to the viral life cycle and may also be a potential target for therapeutic intervention against coronavirus infections.
The structure of the SARS-CoV-2 polymerase complex, consisting of nsp12 and the cofactors nsp7-nsp8, was determined by several research groups (Hillen et al., 2020; Peng et al., 2020).
The SARS-CoV-2 nsp7-nsp8 complex displays a similar structure to that of the SARS-CoV nsp7-nsp8 complex (Kirchdoerfer and Ward, 2019; Zhai et al., 2005) (Figure 4A-C), and it shares 97% amino acid sequence homology with that of SARS-CoV. The function of nsp8 is to catalyze the synthesis of RNA primers for the primer-dependent RdRp, nsp12, which is a typical characteristic of coronaviruses. nsp7 most likely acts as a "mortar" to stabilize nsp8 (Bartlam et al., 2007).
nsp12 is a key component of the SARS-CoV-2 replication and transcription machinery. SARS-CoV-2 nsp12 shares ~96% amino acid sequence homology with its SARS-CoV homolog and adopts the conserved architecture of the viral polymerase family. In addition, a new β-hairpin domain at its N terminus was identified (Figure 4D). nsp12 catalyzes the synthesis of viral RNA with the assistance of nsp7-nsp8 (Bartlam et al., 2007) and is a key target for nucleotide analog antiviral inhibitors such as remdesivir, which has been reported to effectively inhibit SARS-CoV-2 proliferation (Holshue et al., 2020). The model of remdesivir binding to nsp12 and its possible inhibitory mechanisms have been revealed. Part of the double-stranded RNA template inserts into the central channel of nsp12, where remdesivir covalently incorporates into the primer strand at the first replicated base pair to terminate chain elongation. This provides rational insight into the anti-SARS-CoV-2 mechanism of remdesivir (Hillen et al., 2020; Yin et al., 2020). The SARS-CoV-2 nsp12 protein shares high structural similarity with the apo state of hepatitis C virus (HCV) ns5b. Sofosbuvir is an effective drug for the treatment of HCV infection and targets HCV ns5b (Gane et al., 2013).
Based on the structural similarity of the nucleoside analogs, such as remdesivir and sofosbuvir, and the structural conservation of the catalytic site between SARS-CoV-2 nsp12 and the HCV ns5b polymerase (Appleby et al., 2015; Gane et al., 2013), nucleoside analogs are recommended candidates in the search for anti-SARS-CoV-2 drugs. In addition to antiviral research, the relatively weaker activity and lower thermal stability of the SARS-CoV-2 core polymerase complex compared with those of SARS-CoV suggest an adaptation of SARS-CoV-2 toward humans, which have lower body temperatures than the natural bat hosts.
SARS-CoV-2 nsp13 shares ~99% amino acid sequence homology with its SARS-CoV homolog and their structures are similar. nsp13 exhibits helicase and ATPase activity (Jia et al., 2019), and the helicase activity of nsp13 is enhanced approximately two-fold by nsp12, which increases the step size of nucleic acid unwinding (Adedeji et al., 2012). nsp13 consists of five domains arranged in a triangular pyramidal shape, and all five well-ordered domains participate in helicase activity (Jia et al., 2019). The cryo-EM structure of SARS-CoV-2 nsp13 complexed with nsp7, nsp8 and nsp12 revealed that nsp13 interacts with nsp8 and nsp12 (Figure 4C and H), which may be involved in the modulation of helicase activity. The atomic cryo-EM structure of an extended replication and transcription complex (RTC), which is composed of nsp7-nsp8₂-nsp12-nsp13₂-RNA and nsp9, reveals an intermediate state of the RTC toward mRNA synthesis, and it is very important for understanding the architecture of the RTC (Yan et al., 2021). Because of its high sequence conservation in coronaviruses and its pivotal role in catalyzing the unwinding of duplex oligonucleotides into single strands in an NTP-dependent manner, nsp13 may be considered an ideal target for the development of anti-coronavirus drugs.
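The sequence-homology percentages quoted in this section (for example ~99% for nsp13) are the kind of figure usually obtained as percent identity over a pairwise alignment. The review does not state how its values were computed, so the following is only a sketch under that assumption. The function name `percent_identity`, the choice to ignore gapped columns, and the toy sequences are all hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns containing a gap in either sequence are skipped; this is one of
    several common conventions and is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: one substitution in ten ungapped columns.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
print(f"{identity:.1f}% identity")
```

Different denominators (alignment length, shorter sequence length, ungapped columns) give slightly different percentages, which is one reason values reported by different studies for the same protein pair may not match exactly.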
The crystal structure of the SARS-CoV-2 nsp16-nsp10 complex has been elucidated (to be published) (Figure 4E-G). nsp10 consists of an antiparallel β-sheet, a helical domain and two zinc-finger motifs (Rogstam et al., 2020) (Figure 4E). It interacts with both nsp14 and nsp16 as a cofactor to stimulate their respective 3′-5′ exoribonuclease and 2′-O-methyltransferase activities (Bouvet et al., 2014). nsp10, nsp14 and nsp16 are highly conserved within the Coronaviridae family. They share approximately 98%, 95% and 93% amino acid homology with their SARS-CoV homologs, respectively (Kim et al., 2020). Since nsp10 is a vital regulator of SARS-CoV-2 replicase function, and its interaction with nsp14, nsp16 or possibly other proteins associated with viral replication is essential for optimal activity (Bouvet et al., 2014), nsp10 may be a target for the development of anti-SARS-CoV-2 drugs.
nsp14 plays a key role in decreasing the incidence of mismatched nucleotides through its exoribonuclease domain (ExoN) (Eckerle et al., 2010), and the ExoN activity of nsp14 can be released only by binding to nsp10 (Bouvet et al., 2012). nsp14 has three zinc-finger motifs, which are crucial for its ExoN activity and its interaction with nsp10 (Ma et al., 2015). nsp10 interacts with the ExoN domain of nsp14 to stabilize its structure and facilitate its proofreading function.
nsp16 2′-O-MTase activity is stimulated by nsp10 through stabilization of the S-adenosyl-L-methionine (SAM)-binding pocket and extension of the substrate RNA-binding groove of nsp16 (Bouvet et al., 2010; Decroly et al., 2011). This process facilitates virus genome replication and evasion from innate immunity.
The nsp10-nsp16 complex has robust 2′-O-methyltransferase activity, and large conformational changes associated with substrate binding were observed as the enzyme transitions from a binary to a ternary state (Viswanathan et al., 2020), which provides mechanistic insights into the 2′-O-methylation of the viral mRNA cap. The complex structures of the nsp16-nsp10 heterodimers bound to the methyl donor S-adenosyl methionine (SAM), the reaction product S-adenosyl homocysteine (SAH) and the SAH analog sinefungin (SFG) reveal the location of the ribonucleic acid backbone phosphates in the ribonucleotide-binding groove and other newly identified nucleotide-binding sites on the face of the protein opposite the active site (Viswanathan et al., 2020). These sites may be developed as antiviral drug targets. MTase function is associated with the nsp16 protein. The crystal structure of the nsp10-nsp16 complex bound to the pan-MTase inhibitor sinefungin reveals the mechanism of the methyltransferase domain and the atomic details of how sinefungin inhibits nsp16 MTase (Krafcikova et al., 2020). Mutation of SARS-CoV nsp16 results in a ten-fold reduction in the synthesis of viral RNA (Ma et al., 2015); therefore, nsp16 is an attractive drug target for anti-CoV strategies.
The SARS-CoV-2 genome encodes four structural proteins, the S protein, N protein, M protein and E protein, which are essential for viral assembly and the production of a structurally complete viral particle. The three-dimensional structures of the S protein and two main domains of the N protein have been determined. The structures of the M protein and E protein have not been resolved because of their complicated transmembrane motifs.
The S protein mediates attachment of the virus to host cell surface receptors and promotes subsequent fusion between the viral and host cell membranes to facilitate viral entry into the host cell.
The cryo-EM structure of the ectodomain trimer of the SARS-CoV-2 S protein has been determined (Walls et al., 2020) (Figure 5A and B). The SARS-CoV-2 S protein shares ~78% amino acid sequence homology with the SARS-CoV S protein (Walls et al., 2020). Similar to its SARS-CoV homolog and other type I virus membrane fusion proteins, the SARS-CoV-2 S protein is composed of two functional subunits: an N-terminal half (S1) (Figure 5C and D) and a C-terminal half (S2) (Walls et al., 2020) (Figure 5E and F). SARS-CoV-2 contains two cleavage sites at the boundary of the S1/S2 subunits. Cleavage at S1/S2 of the spike protein is essential for efficient viral entry into its target cells. The furin site is cleaved by host proteases primarily during maturation, and S2′ site cleavage can occur upon binding to the surface of the host cell. This is a very significant difference between SARS-CoV and other β-coronaviruses (Örd et al., 2020; Robson, 2020).
The SARS-CoV-2 S1 subunit contains the receptor-binding domain (RBD), which is responsible for binding to cellular receptors and uses angiotensin-converting enzyme 2 (ACE2) to enter target cells (Lei et al., 2020; Walls et al., 2020; Yan et al., 2020) (Figure 5D). Importantly, the ACE2-binding affinity of the RBD in the S1 subunit of SARS-CoV-2 is 3- to 4-fold higher than that of SARS-CoV (Wrapp et al., 2020; Yan et al., 2020), which may explain the higher infectivity and transmissibility of SARS-CoV-2 compared with SARS-CoV.
The SARS-CoV-2 S2 subunit is responsible for mediating the membrane fusion process. The cleavage site on S2 is exposed and cleaved by host proteases after the S1 RBD binds to the host receptor, ACE2. This process is pivotal to viral infection. There are two regions within the S2 subunit, heptad repeat 1 region (HR1) and heptad repeat 2 region (HR2), and their sequences are highly homologous to those of other coronaviruses.
The HR1 and HR2 domains interact with each other to form an antiparallel six-helix bundle (6-HB) structure known as the fusion core. This allosteric process begins immediately after RBD-ACE2 binding and S2 cleavage, leading to viral-cellular membrane fusion (Bosch et al., 2004). A broad-spectrum coronavirus fusion inhibitor (EK1C4) has been developed, which targets the HR1 domain and is highly effective against membrane fusion and infection by coronaviruses, including SARS-CoV-2. The structural basis for membrane fusion mediated by the RBD and 6-HB fusion core structure of the SARS-CoV-2 proteins is important for the rational design of novel coronavirus fusion inhibitors. Currently, the S protein is the major target for vaccine development, but various mutations in the antigenic epitopes of the S protein may result in conformational changes in the protein structure. This should be carefully considered during the design of a universal vaccine.
The M protein is a type III glycoprotein, the most abundant structural protein of SARS-CoV-2, and it defines the shape of the viral envelope (de Haan et al., 2000). The M protein interacts with other structural viral proteins and has a central organizing role in coronavirus assembly as it directs envelope formation and provides the matrix to which the nucleocapsid can attach for budding (Neuman et al., 2011). The M protein is recognized by the humoral and cellular immune responses as a significant immunogen (Liu et al., 2010). It was also identified as a negative regulator of the innate immune response because of its interaction with a common adaptor protein, MAVS. These features suggest that the M protein is a potential target for the development of SARS-CoV-2 interventions, including vaccines.
The E protein is the smallest of the major structural proteins.
It is a short polypeptide (76-109 amino acids) with a single α-helical transmembrane domain (Parthasarathy et al., 2008) that forms homopentameric ion channels with poor ion selectivity (Verdiá-Báguena et al., 2012). E proteins are primarily located in the ER-Golgi intermediate compartment (Liao et al., 2006) and are expressed abundantly inside infected cells during the replication cycle. However, only a small portion of the E protein is incorporated into the virion envelope (Venkatagopalan et al., 2015). Because of the importance of the E protein to pathogenesis, live attenuated vaccines based on the deletion of the E protein have been developed for SARS-CoV and SARS-CoV-2 (Abdelmageed et al., 2020; Jimenez-Guardeño et al., 2015; Surya et al., 2018).
The N protein is one of the most abundant structural proteins in SARS-CoV-2 and contains three intrinsically disordered regions (the N-arm, central linker region and C-tail) and two structural domains: the N-terminal domain (Kang et al., 2020) (Figure 2) and the C-terminal domain (Zinzula et al., 2021) (Figure 2). Unlike the other major structural proteins, the N protein is the only protein that primarily functions to bind to the coronavirus RNA genome to constitute the nucleocapsid (Chang et al., 2016; Schoeman and Fielding, 2019). Thus, the N protein binds to the viral RNA genome to form a ribonucleoprotein complex, which is essential for keeping RNA in an ordered conformation for replication and transcription (Liang et al., 2020; Masters et al., 1990). The N protein acts as a molecular chaperone and is indispensable for viral RNA synthesis (Zuniga et al., 2010) and for the regulation of cellular processes, such as cell apoptosis and cell cycle progression (Du et al., 2008). Furthermore, the N protein is stable, conserved and highly immunogenic, in addition to being less prone to mutation during infection.
It is considered a pivotal diagnostic molecular marker and prophylactic target (Chang et al., 2016; Rahman et al., 2021; Shekhar et al., 2020; Takeda et al., 2008).
The SARS-CoV-2 genome encodes nine accessory proteins (3a, 3b, 6, 7a, 7b, 8, 9b, 9c and 10), which play important roles in its interaction with host cells to help the virus evade the immune system and enhance its virulence.
The cryo-EM structure of SARS-CoV-2 ORF3a indicates that it adopts a novel fold and is captured in a closed or inactivated state, forming three transmembrane regions to constitute large conductance cation channels in the form of either a dimer or a tetramer (Figure 2) (Hachim et al., 2020). ORF3a is one of the three putative ion channels encoded by SARS-CoV (Castaño-Rodriguez et al., 2018), and it is highly conserved within the β-coronavirus subgenus, Sarbecovirus (Andersen et al., 2020). ORF3a plays an important role in virus release and is associated with inflammasome activation and cell death (Nieva et al., 2012). The deletion of ORF3a reduces viral titer and morbidity in animal models (Castaño-Rodriguez et al., 2018), suggesting that ORF3a may be a potential therapeutic target for COVID-19.
The structure of ORF3b could not be resolved because of its disorder. However, it can be detected in the cytoplasm, nucleolus and outer membrane of mitochondria of host cells. Its overexpression is associated with the activation of AP-1 through the ERK and JNK pathways.
ORF6, ORF7a and ORF7b are conserved with respect to amino acid substitutions, similar to the E and M proteins (Laha et al., 2020). SARS-CoV-2 ORF6 interacts with the mRNA export proteins, RAE1 and NUP98, and may inhibit cellular translation (Gordon et al., 2020a).
The crystal structure of ORF7a, an accessory protein of SARS-CoV-2, has been determined (to be published) (Figure 2). The ORF7a protein is also known as X4.
It is a 122-amino-acid, type 1 transmembrane protein containing a 15-residue N-terminal signal peptide sequence, an 81-residue ectodomain and a 21-residue C-terminal transmembrane domain, followed by a short 5-residue cytoplasmic tail. ORF7a interacts with the viral structural proteins M, E and S, and with the ribosomal transport proteins HEATR3 and MDN1 (Gordon et al., 2020a; Huang et al., 2006). It has been shown to inhibit cellular translation in SARS-CoV (Kopecky-Bromberg et al., 2006), implying that ORF7a plays an important role in the viral life cycle and may function early in the course of viral infection. ORF7a also interacts with the accessory protein ORF3a, which interacts with the M, E and S structural proteins, indicating that these viral proteins likely form complexes in coronavirus-infected host cells (McBride and Fielding, 2012; Tan et al., 2004).
ORF8 is a rapidly evolving accessory protein that exhibits frequent mutations and is thought to interfere with the immune response (Laha et al., 2020). The crystal structure of SARS-CoV-2 ORF8 reveals a ~60-residue core with two unique dimerization interfaces. This reveals how SARS-CoV-2 ORF8 is able to form unique large-scale assemblies, which are not observed during SARS-CoV infection. This unique higher-order assembly may mediate immune suppression and evasion activities (Flower et al., 2021) (Figure 2).
The structures of both SARS-CoV ORF9b and SARS-CoV-2 ORF9b reveal that these proteins have a dimeric tent-like β-sheet structure with an amphipathic outer surface and a long hydrophobic lipid-binding tunnel (Meier et al., 2006) (Figure 2). The unique structural properties imply that ORF9b may interact with components of the ER-Golgi network as an accessory protein during the assembly of the coronavirus virion (Meier et al., 2006).
The structure of the human mitochondrial import receptor Tom70 in complex with SARS-CoV-2 ORF9b implies that the loss of mitochondrial import efficiency resulting from the binding of ORF9b to the Tom70 substrate-binding pocket may induce mitophagy. This provides insight into how ORF9b may modulate the host immune response (Gordon et al., 2020b). These results suggest that ORF9b could be crucial as a pan-coronavirus therapeutic target.
ORF9c is annotated as a hypothetical protein and is not predicted to encode any functional protein, but it might modify the mitochondrial activity of the host cell (Gordon et al., 2020b; Jungreis et al., 2021).
ORF10 is thought to be unique to SARS-CoV-2 and codes for a 38-amino-acid peptide. However, there is no evidence to indicate that ORF10 is expressed during SARS-CoV-2 infection.
The COVID-19 pandemic erupted and rapidly spread around the world. The rapid transmission and the mortality rate of more than 2% associated with SARS-CoV-2 have had particularly devastating effects on human health and the global economy. The proteins of SARS-CoV-2 protect the pathogen, assist the virus in attaching to host cells, block the expression of host genes, and evade the innate immune response to facilitate virus replication and proliferation. In the past year, the important structures of the proteins and protein complexes of SARS-CoV-2 have been elucidated by structural biologists, providing an atomic view of the various processes that occur in the viral life cycle. These structures are significant not only for understanding the mechanisms by which proteins and their complexes perform specific switching or regulatory functions, but also for providing insight into viral immune evasion strategies.
The direct source of the COVID-19 pathogen has not yet been determined. Finding the potential intermediate host of SARS-CoV-2 is essential to prevent the further spread of the epidemic.
Through structural biology, the binding characteristics of viral proteins and different receptors may be used to evaluate the effectiveness of currently available antiviral therapies and to conduct viral tracing studies. Scientists are re-screening compounds previously known to act against SARS-CoV, MERS-CoV and other viruses, and are also using monoclonal antibodies and convalescent plasma from patients who have recovered from SARS-CoV-2 infection to inhibit the virus. Although these drugs have been shown to be effective in vitro, their clinical usefulness has not yet been clearly established. Structural studies of viral targets will advance the development of safe and effective anti-COVID-19 therapies.
Inducing a high neutralizing antibody (NAb) titer is crucial for COVID-19 vaccine development, and structure-guided design has generalized the single-chain dimer (sc-dimer) strategy to coronavirus vaccines against SARS-CoV-2 and SARS-CoV, achieving a 10- to 100-fold enhancement of NAb titers. This immunogen design framework can be applied universally to other coronavirus vaccines to counter emerging threats. As more SARS-CoV-2 protein structures are determined, structural biology is proving to be an important tool for solving significant health-related problems, including the functional annotation of proteins, the identification of drug targets, and the development of antibodies and vaccines. Vaccines remain the most effective preventive measure against this novel coronavirus; however, mutations in the SARS-CoV-2 genome could pose significant challenges to existing vaccines that target the prevalent viral strains. Extensive studies of vaccine safety and antibody-dependent enhancement are necessary, and only a united, vigorous effort by the international biomedical community can achieve success.
The author(s) declare that they have no conflict of interest.
Design of a multiepitope-based peptide vaccine against the E protein of human COVID-19: An immunoinformatics approach
Mechanism of nucleic acid unwinding by SARS-CoV helicase
The SARS-CoV-2 conserved macrodomain is a mono-ADP-ribosylhydrolase
The proximal origin of SARS-CoV-2
COVID-2019: The role of the nsp2 and nsp3 in its pathogenesis
Severe acute respiratory syndrome coronavirus nonstructural proteins 3, 4, and 6 induce double-membrane vesicles
Structural basis for RNA replication by the hepatitis C virus polymerase
Structural insights into SARS-CoV-2 proteins
Structural proteomics of the SARS coronavirus: A model response to emerging infectious diseases
Evolutionary analysis of SARS-CoV-2: How mutation of non-structural protein 6 (nsp6) could affect viral autophagy
Severe acute respiratory syndrome coronavirus (SARS-CoV) infection inhibition using spike protein heptad repeat-derived peptides
In vitro reconstitution of SARS-coronavirus mRNA cap methylation
RNA 3′-end mismatch excision by the severe acute respiratory syndrome coronavirus nonstructural protein nsp10/nsp14 exoribonuclease complex
Coronavirus nsp10, a critical co-factor for activation of multiple replicative enzymes
Role of severe acute respiratory syndrome coronavirus viroporins E, 3a, and 8a in replication and pathogenesis
A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: A study of a family cluster
Recent insights into the development of therapeutics against coronavirus diseases by targeting N protein
Coronavirus nsp6 proteins generate autophagosomes from the endoplasmic reticulum via an omegasome intermediate
Assembly of the coronavirus envelope: Homotypic interactions between the M proteins
Crystal structure and functional analysis of the SARS-coronavirus RNA cap 2′-O-methyltransferase nsp10/nsp16 complex
Coronavirus nonstructural protein 15 mediates evasion of dsRNA sensors and limits apoptosis in macrophages
A guideline for homology modeling of the proteins from newly discovered betacoronavirus, 2019 novel coronavirus (2019-nCoV)
Priming with rAAV encoding RBD of SARS-CoV S protein and boosting with RBD-specific peptides for T cell epitopes elevated humoral and cellular immune responses against SARS-CoV infection
Infidelity of SARS-CoV nsp14-exonuclease mutant virus replication is revealed by complete genome sequencing
Structure of SARS-CoV-2 ORF8, a rapidly evolving immune evasion protein
Extensive positive selection drives the evolution of nonstructural proteins in lineage C betacoronaviruses
SARS-CoV-2 membrane glycoprotein M antagonizes the MAVS-mediated innate antiviral response
Nucleotide polymerase inhibitor sofosbuvir plus ribavirin for hepatitis C
Structure of the RNA-dependent RNA polymerase from COVID-19 virus
A SARS-CoV-2 protein interaction map reveals targets for drug repurposing
Comparative host-coronavirus protein interaction networks reveal pan-viral disease mechanisms
The nsp2 proteins of mouse hepatitis virus and SARS coronavirus are dispensable for viral replication
ORF8 and ORF3b antibodies are accurate serological markers of early and late SARS-CoV-2 infection
Identification of severe acute respiratory syndrome coronavirus replicase products and characterization of papain-like protease activity
The COVID-19 pandemic: A comprehensive review of taxonomy, genetics, epidemiology, diagnosis, treatment, and control
Structure of replicating SARS-CoV-2 polymerase
First case of 2019 novel coronavirus in the United States
The severe acute respiratory syndrome coronavirus nucleocapsid inhibits type I interferon production by interfering with TRIM25-mediated RIG-I ubiquitination
Severe acute respiratory syndrome coronavirus 7a accessory protein is a viral structural protein
The SARS-coronavirus PLnc domain of nsp3 as a replication/transcription scaffolding protein
Delicate structural coordination of the severe acute respiratory syndrome coronavirus Nsp13 upon ATP hydrolysis
Identification of the mechanisms causing reversion to virulence in an attenuated SARS-CoV for the design of a genetically stable vaccine
Structure of Mpro from SARS-CoV-2 and discovery of its inhibitors
SARS-CoV-2 gene content and COVID-19 mutation impact by comparing 44 Sarbecovirus genomes
Severe acute respiratory syndrome coronavirus nsp1 protein suppresses host gene expression by promoting host mRNA degradation
Crystal structure of SARS-CoV-2 nucleocapsid protein RNA binding domain reveals potential unique drug targeting sites
Crystal structure of Nsp15 endoribonuclease NendoU from SARS-CoV-2
Early endonuclease-mediated evasion of RNA sensing ensures efficient coronavirus replication
Structure of the SARS-CoV nsp12 polymerase bound to nsp7 and nsp8 co-factors
SARS-coronavirus replication is supported by a reticulovesicular network of modified endoplasmic reticulum
7a protein of severe acute respiratory syndrome coronavirus inhibits cellular protein synthesis and activates p38 mitogen-activated protein kinase
Structural analysis of the SARS-CoV-2 methyltransferase complex involved in RNA cap creation bound to sinefungin
Characterizations of SARS-CoV-2 mutational profile, spike protein stability and viral transmission
Activation and evasion of type I interferon responses by SARS-CoV-2
Highlight of immune pathogenic response and hematopathologic effect in SARS-CoV, MERS-CoV, and SARS-CoV-2 infection
Biochemical and functional characterization of the membrane association and membrane permeabilizing activity of the severe acute respiratory syndrome coronavirus envelope protein
Crystal structure of the SARS-CoV-2 non-structural protein 9
The membrane protein of severe acute respiratory syndrome coronavirus acts as a dominant immunogen revealed by a clustering region of novel functionally and structurally defined cytotoxic T-lymphocyte epitopes
Genomic characterisation and epidemiology of 2019 novel coronavirus: Implications for virus origins and receptor binding
p53 down-regulates SARS coronavirus replication and is targeted by the SARS-unique domain and PLpro via E3 ubiquitin ligase RCHY1
Structural basis and functional analysis of the SARS coronavirus nsp14-nsp10 complex
The molecular biology of coronaviruses
Structure and function studies of the nucleocapsid protein of mouse hepatitis virus
The role of severe acute respiratory syndrome (SARS)-coronavirus accessory proteins in virus pathogenesis
The crystal structure of ORF-9b, a lipid binding protein from the SARS coronavirus
Severe acute respiratory syndrome coronavirus nsp9 dimerization is essential for efficient viral growth
A structural analysis of M protein in coronavirus assembly and morphology
Viroporins: Structure and biological functions
Topology and membrane anchoring of the coronavirus replication complex: Not all hydrophobic domains of nsp3 and nsp6 are membrane spanning
The sequence at spike S1/S2 site enables cleavage by furin and phospho-regulation in SARS-CoV2 but not in SARS-CoV1 or MERS-CoV
Structural flexibility of the pentameric SARS coronavirus envelope protein ion channel
Structural and biochemical characterization of the nsp12-nsp7-nsp8 core polymerase complex from SARS-CoV-2
Coronaviruses post-SARS: Update on replication and pathogenesis
Evolutionary dynamics of SARS-CoV-2 nucleocapsid protein and its consequences
Crystal structure and mechanistic determinants of SARS coronavirus nonstructural protein 15 define an endoribonuclease family
COVID-19 Coronavirus spike protein analysis for synthetic vaccines, a peptidomimetic antagonist, and therapeutic drugs, and analysis of a proposed achilles' heel conserved region to minimize probability of escape mutations and drug resistance
Crystal structure of non-structural protein 10 from severe acute respiratory syndrome coronavirus-2
Activity profiling and crystal structures of inhibitor-bound SARS-CoV-2 papain-like protease: A framework for anti-COVID-19 drug design
Coronavirus envelope protein: Current knowledge
In silico Structure-based repositioning of approved drugs for spike glycoprotein S2 domain fusion peptide of SARS-CoV-2: Rationale from molecular dynamics and binding free energy calculations
Papain-like protease regulates SARS-CoV-2 viral spread and innate immunity
Measles virus C protein interferes with beta interferon transcription in the nucleus
Activation of the beta interferon promoter by unnatural Sendai virus infection requires RIG-I and is inhibited by viral C proteins
Structural model of the SARS coronavirus E channel in LMPG micelles
Solution structure of the C-terminal dimerization domain of SARS coronavirus nucleocapsid protein solved by the SAIL-NMR method
The SARS-unique domain (SUD) of SARS coronavirus contains two macrodomains that bind G-quadruplexes
A novel severe acute respiratory syndrome coronavirus protein, U274, is transported to the cell surface and undergoes endocytosis
The large 386-nt deletion in SARS-associated coronavirus: evidence for quasispecies?
Structural basis for translational shutdown and immune evasion by the nsp1 protein of SARS-CoV-2
Ligand-induced dimerization of Middle East respiratory syndrome (MERS) coronavirus nsp5 protease (3CLpro)
Protein-protein interactions of viroporins in coronaviruses and paramyxoviruses: New targets for antivirals? Viruses
Coronavirus envelope (E) protein remains at the site of assembly
Coronavirus E protein forms ion channels with functionally and structurally-involved membrane lipids
Structural basis of RNA cap modification by SARS-CoV-2
Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein
Bivalent binding of a fully human IgG to the SARS-CoV-2 spike proteins reveals mechanisms of potent neutralization
Structural basis for RNA replication by the SARS-CoV-2 polymerase
A molecular pore spans the double membrane of the coronavirus replication organelle
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
A new coronavirus associated with human respiratory disease in China
Inhibition of SARS-CoV-2 (previously 2019-nCoV) infection by a highly potent pan-coronavirus fusion inhibitor targeting its spike protein that harbors a high capacity to mediate membrane fusion
Systematic comparison of two animal-to-human transmitted human coronaviruses: SARS-CoV-2 and SARS-CoV
Evolution of the novel coronavirus from the ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission
Cryo-EM structure of an extended SARS-CoV-2 replication and transcription complex reveals an intermediate state in cap synthesis
Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2
Design of wide-spectrum inhibitors targeting coronavirus main proteases
Structural basis for inhibition of the RNA-dependent RNA polymerase from SARS-CoV-2 by remdesivir
Mitochondrial location of severe acute respiratory syndrome coronavirus 3b protein
Insights into SARS-CoV transcription and replication from the structure of the nsp7-nsp8 hexadecamer
Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors
Molecular biology of severe acute respiratory syndrome coronavirus
Coronavirus Replication and Reverse Genetics. Current Topics in Microbiology and Immunology
High-resolution structure and biophysical characterization of the nucleocapsid phosphoprotein dimerization domain from the COVID-19 severe acute respiratory syndrome coronavirus 2
Coronavirus nucleocapsid protein facilitates template switching and is required for efficient transcription