key: cord-312509-m3p9fuq0
authors: Tohidinia, Maryam; Sefid, Fatemeh
title: Identification B and T-Cell epitopes and functional exposed amino acids of S protein as a potential vaccine candidate against SARS-CoV-2/COVID-19
date: 2020-08-21
journal: Microb Pathog
DOI: 10.1016/j.micpath.2020.104459
sha:
doc_id: 312509
cord_uid: m3p9fuq0

Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus that has spread all over the world. Coronaviruses are single-stranded, positive-sense RNA viruses with a genome of approximately 30 kb, the largest genome among RNA viruses. Most people infected with the COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment. Older people and those with underlying medical problems such as cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness. At this time, there are no specific vaccines or treatments for COVID-19, so there is an urgent need for vaccines and antiviral strategies. The spike protein is the major surface protein that the virus uses to bind to a receptor, another protein that acts as a doorway into a human cell. The putative antigenic epitopes may prove effective as novel vaccines for combating and eradicating COVID-19 infection. A combination of available bioinformatics tools was used to design such peptides, which are important for the development of a vaccine. In conclusion, amino acids 250–800 were selected as containing effective B cell epitopes, T cell epitopes, and functional exposed amino acids for the design of a recombinant vaccine against the coronavirus.

At the end of 2019, a cluster of pneumonia cases caused by a newly identified coronavirus (COVID-19) emerged in Wuhan, China, and the virus has since spread to nearly all countries in the world [1] [2] [3]. The COVID-19 outbreak has become one of the most challenging health emergencies worldwide [4, 5].
Coronaviruses are lipid-enveloped, positive-sense, single-stranded RNA viruses [6]. These viruses enter cells through a two-step process: they first recognize a host-cell-surface receptor for viral attachment and then fuse viral and host membranes for entry [7]. Receptors not only determine the viral attachment step, but also play important roles in the membrane fusion process [8]. Coronaviruses can be divided into four genera: α, β, γ, and δ. For coronaviruses from all four genera, an envelope-anchored spike protein guides coronavirus entry into host cells [9]. ACE2, found in the lower respiratory tract of humans, is known as the cell receptor for SARS-CoV. The S glycoprotein on the surface of the coronavirus can attach to the ACE2 receptor on the surface of human cells and causes membrane fusion between cells and viruses [3, 10, 11]. After membrane fusion, the viral genomic RNA is released into the cytoplasm. The genomic RNA is used as a template to directly translate two polyproteins, pp1a and pp1ab, which encode nonstructural proteins (nsps) that form the replication-transcription complex (RTC) in double-membrane vesicles. The RTC replicates the genome and synthesizes a nested set of subgenomic RNAs, which encode accessory proteins and structural proteins. Lastly, the virion-containing vesicles fuse with the plasma membrane to release the virus. The binding of the SARS-CoV-2 spike (S) glycoprotein to the ACE2 receptor is therefore a critical step for virus entry [12, 13].

The number of deaths associated with COVID-19 is much higher than for the other two coronaviruses (SARS-CoV and MERS-CoV), and the outbreak is still ongoing, which is of global concern to public health and the economy [14]. The most common symptoms of coronavirus disease are fever, tiredness, and dry cough. Most people (about 80%) recover from the disease without needing special treatment [15].
However, the coronavirus is considered a major threat to older people and people with pre-existing medical conditions such as asthma, diabetes, and heart disease [16]. There are no specific vaccines or treatments for COVID-19 [17], so there is an urgent need for vaccines and antiviral strategies [18]. Infection control measures are necessary to prevent the virus from spreading further and to help control the epidemic, which at present remains a serious challenge. Given the important role of the spike (S) glycoprotein in virus entry [19], the main aim of the current study is to use bioinformatics tools to identify potential B- and T-cell epitope(s) of S protein with high antigenicity that could be used to develop promising vaccines [20]. This paper briefly discusses and explores bioinformatics tools in vaccine design. The best antigenic region of S protein has been determined as a novel

Homology modeling is the construction of an atomic model of a target protein based solely on the target's amino acid sequence and the experimentally determined structures of homologous proteins, referred to as templates. Many tools and servers are used for homology modeling, and no single modeling program or server is superior to the others in every aspect. Since the usefulness of the model depends on the quality of the generated protein 3D structure, maximizing the quality of homology modeling is crucial. Phyre2 [24]

Analysis, is a composite scoring function describing the major geometrical aspects of protein structures. All 3D models of the protein were qualitatively estimated by GMQE and QMEAN scores. Qualitative evaluation of the 3D models was done by ProSA at https://prosa.services.came.sbg.ac.at [26].
ProSA specifically addresses the needs encountered in the validation of protein structures obtained from X-ray analysis, NMR spectroscopy, and theoretical calculations. Rampage [27] at http://mordred.bioc.cam.ac.uk/rapper/rampage.php was also employed for estimation of model quality using the Ramachandran plot, which is an algorithm for atomic-level, high-resolution protein structure improvement.

InterProScan at http://www.ebi.ac.uk/interpro/about.html [28] was used for functional analysis of the protein

Functionally conserved amino acids allow a more rational approach to epitope prediction and to the identification of protein binding surfaces. There are different servers to predict functionally and structurally important residues. In this study, InterProSurf at http://curie.utmb.edu/pattest9.html was used to predict functional sites on the protein surface using patch analysis [30].

Parameters such as hydrophilicity, flexibility, accessibility, turns, exposed surface, polarity, and the antigenic propensity of polypeptide chains have been related to the location of B cell epitopes. This information allows the position of B cell epitopes to be predicted from certain features of the protein sequence. IEDB at http://tools.iedb.org/bcell/ was used to predict the average score of physico-chemical properties (hydrophobicity, flexibility/mobility, accessibility, polarity, exposed surface, and turns) [31]. In addition, the Bcepred server at http://www.imtech.res.in/raghava/bcepred/ [32] was employed to predict linear B-cell epitopes in a protein sequence.

The identification of B cell epitopes has an important role in vaccine design, immunodiagnostic tests, and antibody production. There are many tools to predict B cell epitopes. Each tool uses its own algorithms and methods for epitope prediction.
Therefore, using several tools to predict linear B-cell epitopes in protein sequences is more reliable. Ellipro at http://tools.immuneepitope.org [33] was used to predict linear and discontinuous antibody epitopes based on a protein antigen's 3D structure. Also, in order to predict linear epitopes based on antigenic epitopes using support vector machine to integrate tri-peptide

Fig. 9 and Table 1. Also, the list of 4 discontinuous epitopes with the highest PI (protrusion index) is given in Table 2.

The ABCpred result shows 40 hits of 16-mer peptide sequences as B-cell epitopes, ranked by score. The best epitopes predicted by the server, with a score above 0.85, are the sequences "AGTITSGWTFGAGAAL", "GVSVITPGTNTSNQVA", "GWTAGAAAYYVGYLQP", "PQIITTDNTFVSGNCD", "HRSYLTPGDSSSGWTA", "GSTPCNGVEGFNCYFP", "TVEKGIYQTSNFRVQP", "ESPIRATRYSYNDRME", "GCLIGAEHVNNSYECD", "LQSYGFQPTNGVGYQP", "TEIYQAGSTPCNGVEG", "TRFQTLLALHRSYLTP", "IGKIQDSLSSTASALG", "FAMQMAYRFNGIGVTQ", "SWMESEFRVYSSANNC", "CCSCLKGCCSCGSCCK", "TKTSVDCTMYICGDST", and "EVRQIAPGQTGKIADY". The predicted B cell epitopes mentioned above were assessed by the Vaxijen server, and epitopes with a score ≥ 0.4 (

One region covering residues 250-800 was selected as the best vaccine candidate region by various methodologies and software tools (Fig. 9). Several properties, such as the Vaxijen antigenicity score, PI, and solubility, have been correlated with the location of continuous epitopes. All results for the candidate region and the S protein are summarized in Table 5. The properties mentioned have led to a search for empirical rules that would allow the position of continuous epitopes to be predicted from certain features of the protein sequence.

among the forefronts of these pathogens [4]. The recently emerged SARS-CoV-2 infects humans and causes severe pneumonia, which is deadly in many cases [43, 44].
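The selection of a residue range such as 250-800 rests on where the predicted epitopes sit in the full-length protein. As a minimal sketch of that bookkeeping (not the procedure used by any of the servers named in this study), the following Python function locates each peptide by exact string match and reports whether it lies entirely inside a given region. The function name `locate_epitopes` and the toy sequence in the usage example are hypothetical; the real S protein sequence is not reproduced here and would have to be supplied by the reader.

```python
def locate_epitopes(protein_seq, epitopes, region=(250, 800)):
    """Locate peptides in a protein sequence by exact match.

    Returns a list of (peptide, start, end, inside_region) tuples.
    Positions are 1-based and inclusive, matching residue numbering.
    Peptides not found in the sequence get start = end = None.
    Only the first occurrence of each peptide is reported.
    """
    lo, hi = region
    hits = []
    for pep in epitopes:
        idx = protein_seq.find(pep)  # 0-based index, or -1 if absent
        if idx == -1:
            hits.append((pep, None, None, False))
            continue
        start, end = idx + 1, idx + len(pep)
        hits.append((pep, start, end, lo <= start and end <= hi))
    return hits


if __name__ == "__main__":
    # Toy sequence (an assumption for illustration only, NOT the S protein):
    # filler residues with two of the reported 16-mers embedded in it.
    toy_seq = "M" * 259 + "AGTITSGWTFGAGAAL" + "K" * 600 + "EVRQIAPGQTGKIADY"
    peptides = ["AGTITSGWTFGAGAAL", "EVRQIAPGQTGKIADY", "GVSVITPGTNTSNQVA"]
    for pep, start, end, inside in locate_epitopes(toy_seq, peptides):
        print(pep, start, end, inside)
```

Exact matching is a deliberate simplification: it reports only the first occurrence and would miss a peptide that differs from the reference sequence by even one residue.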
immunogenic data and vaccine development and this approach can reduce time and cost [47]. Epitope-based vaccines can enhance immune responses by selecting only the antigenic parts of proteins exposed on the surface. Thus, the employment of bioinformatics tools to select the appropriate region as a vaccine candidate seems logical [41, 48]. Some of these designed vaccine candidates have been validated by experimental analyses, in which in vivo analyses confirmed the in silico predictions [49] [50] [51].

The spike protein on the surface of the viral particle plays key roles in binding the cell receptor and in membrane fusion, by which the host range is firmly determined. ACE2, found in the lower respiratory tract of humans, is known as the cell receptor for SARS-CoV [52]. A vaccine developed based on the S protein could be effective against SARS-CoV-2 [53] [54] [55]

has much better performance than the scales based on the single amino acid propensity. So, we combined all the data from diverse servers to predict the best B cell epitopes [59]. These epitopes can be classified into two types: linear and discontinuous. Continuous (linear) epitopes consist of residues that are contiguous in the amino acid sequence, whereas discontinuous epitopes consist of residues brought together by the folded conformation of the protein. Linear and discontinuous epitopes in the S protein were predicted by various software tools and algorithms to achieve consensus epitopes. Consensus epitopes obtained from various algorithms are more reliable for selection. According to the predictions of the Svmtrip, ABCpred, Ellipro, and the antigenic scores belonging to them, the

The B-cell epitopes that exhibited antigenic potential above the threshold score were analyzed for the identification of T-cell epitopes. Adaptive immunity is mediated by the recognition of peptide antigens (T-cell epitopes) bound to MHC molecules.
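The B-cell epitopes reported by ABCpred in this study are 16-mers, while the T-cell epitopes considered are 9-mers, so each B-cell epitope contains several overlapping candidate peptides. As a minimal sketch of how such candidates can be enumerated (this only lists the windows; it does not score MHC binding, which is what the prediction servers do), the hypothetical helper `sliding_peptides` below returns every overlapping window of length k with its 1-based start position.

```python
def sliding_peptides(sequence, k=9):
    """Return all overlapping k-mers of `sequence` as (start, peptide) pairs.

    `start` is the 1-based position of the window within `sequence`.
    A sequence shorter than k yields an empty list.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return [(i + 1, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    # One of the 16-mer B-cell epitopes listed in the results.
    for start, peptide in sliding_peptides("AGTITSGWTFGAGAAL", k=9):
        print(start, peptide)
```

A 16-mer yields 16 - 9 + 1 = 8 overlapping 9-mers, each of which would then be submitted to a T-cell epitope predictor.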
In our study, 9-mer T-cell epitopes were predicted from the antigenic B-

All these analyses reveal that the majority, as well as the best, of the B cell and T cell epitopes are located in the 250-800 region. So, this region was selected as a vaccine candidate. The average of each single-scale amino acid propensity was higher in the candidate vaccine than in its parent protein, S. The predicted epitopes should be tested for therapeutic potency in future studies. In the present study, bioinformatics analysis identifies surface-exposed peptides rather than focusing on the whole pathogen, which is more efficient. We predict and

Epidemiological and Clinical Features of Hospitalized Patients with Corona Virus Disease 2019 in Yichang, China: A Descriptive Study
The 2019/2020 Novel Corona Virus Outbreak: An International Health
Design of a peptide-based subunit vaccine against novel coronavirus SARS-CoV
Management strategies of neonatal jaundice during the coronavirus disease
Corona virus international public health emergencies: implications for radiology management
Provide Insights into Evolution and Pathogenesis Mechanisms of SARS-Related Coronaviruses
Biochemical analysis of coronavirus spike glycoprotein conformational intermediates during membrane fusion
A Fusion Peptide in the Spike Protein of MERS Coronavirus
The novel coronavirus 2019 (2019-nCoV) uses the SARS-coronavirus receptor ACE2 and the cellular protease TMPRSS2 for entry into target cells
Biophysical characterization of the SARS-CoV2 spike protein binding with the ACE2 receptor explains increased COVID-19 pathogenesis
A pan-coronavirus fusion inhibitor targeting the HR1 domain of human coronavirus spike
Functional assessment of cell entry and receptor usage for lineage B β-coronaviruses
A conceptual model for the coronavirus disease 2019 (COVID-19) outbreak in Wuhan, China with individual reaction and governmental action
An updated estimation of the risk of transmission of the novel coronavirus (2019-nCov)
Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
Drug treatment options for the 2019-new coronavirus (2019-nCoV)
Human challenge studies to accelerate coronavirus vaccine licensure
ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission
Subunit vaccines against emerging pathogenic human coronaviruses
PRALINE™: a strategy for improved multiple alignment of transmembrane proteins
PRED-TMBB: a web server for predicting the topology of β-barrel outer membrane proteins
Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes
The Phyre2 web portal for protein modeling, prediction and analysis
SWISS-MODEL: an automated protein homology-modeling server
ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins
Data set for phylogenetic tree and RAMPAGE Ramachandran plot analysis of SODs in Gossypium raimondii and G. arboreum
InterProScan-an integration platform for the signature-recognition methods in InterPro
The 'second' global shift: The offshoring or global sourcing of corporate services and the rise of distanciated emotional labour
Comprehensive 3D-modeling of allergenic proteins and amino acid composition of potential conformational IgE epitopes
The immune epitope database (IEDB) 3.0
BcePred: prediction of continuous B-cell epitopes in antigenic sequences using physico-chemical properties
ElliPro: a new structure-based tool for the prediction of antibody epitopes
SVMTriP: A Method to Predict B-Cell Linear Antigenic Epitopes
Linear B-Cell Epitope Prediction Software from a Users' Perspective
Prediction methods for B-cell epitopes
Linear B-cell epitope prediction for S1 protein of Infectious Bronchitis Virus
ProPred: prediction of HLA-DR binding sites
Peptide toxicity prediction. Computational Peptidology
AllergenFP: allergenicity prediction by descriptor fingerprints
Amino Acids of CarO Analysis as a Potential Vaccine Candidate in Acinetobacter Baumannii
Depth: a web server to compute depth, cavity sizes, detect potential small-molecule ligand-binding cavities and predict the pKa of ionizable residues in proteins
The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China
Approach for a Novel Pathogen: using a home assessment team to evaluate patients for 2019 novel coronavirus (SARS-CoV-2)
The continuing 2019-nCoV epidemic threat of novel coronaviruses to global health-The latest
World Health Organization declares global emergency: A review of the 2019 novel coronavirus
An overview of biotechnology in vaccine development
Epitope-based peptide vaccine design and target site depiction against Middle East Respiratory Syndrome Coronavirus: an immune-informatics study
Acinetobacter baumannii infection via its functional deprivation of biofilm associated protein (Bap)
Immunogenicity of cork and loop domains of recombinant Baumannii acinetobactin utilization protein in murine model
Immunogenicity of conserved cork and ß-barrel domains of baumannii acinetobactin utilization protein in an animal model
Structure of mouse coronavirus spike protein complexed with receptor reveals mechanism for viral entry
Antigenic and immunogenic characterization of recombinant baculovirus-expressed severe acute respiratory syndrome coronavirus spike protein: implication for vaccine design
Recombinant modified vaccinia virus Ankara expressing the spike glycoprotein of severe acute respiratory syndrome coronavirus induces protective neutralizing antibodies primarily targeting the receptor binding region
S glycoprotein vaccine elicits high titers of SARS-associated coronavirus (SARS-CoV)
Genomic And Proteomic Studies Using Computational Approaches
Use of Isoelectric Point for Fast Identification of Anti-SARS CoV-2 Coronavirus Proteins
Antigen recognition by T cells. Immunobiology: The Immune System in Health and Disease, 5th edition. Garland Science
Prediction of linear B-cell epitopes using amino acid pair antigenicity scale

Journal Pre-proof

The authors have received no financial support for the elaboration of this manuscript. Yazd University did not play any decision-making role in the study analysis or writing of the manuscript. All authors declare no potential conflicts of interest.