key: cord-0838558-sop5nqqe authors: Sáenz Hinojosa, Samantha; Romero, Vanessa title: Risk HLA alleles in South America and potential new epitopes for SARS-CoV2 date: 2021-04-21 journal: Hum Immunol DOI: 10.1016/j.humimm.2021.04.005 sha: bf6f80842c1349a89a6ba35a4a2dcdb3aea26815 doc_id: 838558 cord_uid: sop5nqqe

HLA alleles are associated with the body's response to infection and the regulation of the immune system, and they have been reported to be involved in the response to viral infections such as SARS-CoV2. Our study reviews the HLA alleles associated with protection from or susceptibility to SARS-CoV2 and the prevalence of these alleles in South America. Previous studies on HLA and SARS-CoV2 infection reported that HLA-A*02:02, HLA-B*15:03, and HLA-C*12:03 are protective, while HLA-A*25:01, HLA-B*46:01, and HLA-C*01:02 increase susceptibility. We found that these alleles are not frequent in South America, confirmed that the spike protein is the most immunogenic protein of SARS-CoV2, and detected new immunogenic epitopes that bind to protective HLA alleles and to HLA alleles common in South America (binding score>0.90). These epitopes could be used as vaccine targets.

Coronaviruses have been linked to many zoonotic respiratory infections and are responsible for three documented pandemics in humans: Severe Acute Respiratory Syndrome (SARS-CoV) in 2002, Middle East Respiratory Syndrome (MERS-CoV) in 2012, and the current SARS-CoV2 pandemic, which began at the end of 2019 and has so far been responsible for over 58 million infections [1]. These three viruses are betacoronaviruses, and it has been reported that 38% of HLA class I epitopes are conserved among them. Some of the most immunogenic regions of these viruses lie in the structural proteins, including the spike glycoprotein, membrane protein, and nucleocapsid phosphoprotein. Additionally, the coronavirus spike glycoprotein is the target of neutralizing antibodies and mediates membrane fusion and virus entry [2].
A variety of studies are under way to understand the interaction between SARS-CoV2 and the host immune system. This work will help to develop preventive measures against the pandemic, such as vaccines. The way in which HLA binds to different regions of the virus influences the adequacy of the immune response and the clearance of the virus. It also affects susceptibility to infection and, in some cases, the development of complications. We aimed to understand how HLA influences susceptibility or protection to SARS-CoV2 infection; therefore, we performed a literature review to find the HLA alleles that confer susceptibility or protection, and evaluated how these specific HLA alleles bind to the most immunogenic structural regions of SARS-CoV2. Additionally, as most of the studies are limited to Asian populations, we analyzed the frequency of these HLA alleles in South America and performed a binding prediction between the most common HLA alleles on this continent and the SARS-CoV2 immunogenic regions.

This literature review focuses on the available information on susceptibility or

We performed an additional review to obtain the prevalence in South America of the HLA alleles found to be related to SARS-CoV2, as well as the most common HLA alleles in this region, from the PubMed database and the Allele Frequencies Net Database [3]. The Epitope Analysis Resource from the Immune Epitope Database and Analysis Resource (IEDB) [6] was used to predict epitopes and MHC I and MHC II binding, and high and low binders were inferred quantitatively. Peptide length was variable, and the ANN validation is part of the IEDB. The peptides were reviewed manually to exclude duplicates.
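The peptide preparation step described above (variable peptide lengths, duplicates excluded) can be illustrated with a minimal Python sketch. The function name `peptide_windows`, the default lengths of 8 to 11 residues, and the toy sequence are assumptions made for illustration; the source does not state which lengths were used or how duplicates were detected beyond manual review.

```python
def peptide_windows(sequence, lengths=(8, 9, 10, 11)):
    """Return unique overlapping peptides of the given lengths.

    `lengths` is an illustrative assumption (typical MHC class I range),
    not a value taken from the paper. Order of first appearance is kept.
    """
    seen = set()
    peptides = []
    for k in lengths:
        # Slide a window of size k along the protein sequence.
        for start in range(len(sequence) - k + 1):
            peptide = sequence[start:start + k]
            if peptide not in seen:  # drop exact duplicates
                seen.add(peptide)
                peptides.append(peptide)
    return peptides


# Toy sequence (hypothetical, not a SARS-CoV2 protein).
toy_sequence = "ABCDEFGHIJ"
toy_peptides = peptide_windows(toy_sequence, lengths=(9, 10))
```

Deduplicating programmatically before a manual pass reduces the chance that the same peptide is scored twice against the same allele.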
First, we calculated binding predictions using the IEDB 2020.4 prediction method (NetMHCpan EL 4.0) [6] between the six HLA alleles found in our systematic review, the four most common HLA alleles worldwide, and the epitopes described by Grifoni et al. [7], which included five highly immunogenic SARS-CoV regions, six dominant SARS-CoV2 B cell epitopes, and seventeen T cell epitopes shown to have ≥90% identity with SARS-CoV regions. Second, Grifoni et al. [7] reported SARS-CoV2 epitopes that bound strongly to the most common HLA alleles worldwide. We considered it useful to identify the epitopes that bind strongly to the most protective HLA alleles, since this can help identify new immunogenic regions. Therefore, we used the nucleotide sequences of the SARS-CoV2 nucleocapsid, membrane glycoprotein, and spike protein to calculate binding predictions for the HLA alleles most associated with susceptibility or protection in the literature review. Finally, we calculated binding predictions between the most immunogenic SARS-CoV2 regions and the most common HLA alleles in South America identified in our literature review. To confirm the consistency of the results, we repeated the binding analysis with an additional algorithm, ANN 4.0 [6].

The S protein of SARS-CoV2 shares over 70% identity with that of SARS-CoV (Table 1). Grifoni et al. [7] identified five highly immunogenic SARS-CoV regions, six dominant SARS-CoV2 B cell epitopes, and seventeen SARS-CoV2 T cell epitopes that have ≥90% identity with SARS-CoV regions. We calculated the binding scores between each epitope individually and the six susceptibility/protective HLA alleles (Table 1). The most immunogenic regions of SARS-CoV2 are the spike glycoprotein, membrane protein, and nucleoprotein [7], and the most protective HLA alleles are HLA-B*15:03, HLA-A*02:02, and HLA-C*12:03 [9].
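The consistency check between the two prediction algorithms mentioned in the methods can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the ANN cutoff of 50 nM (a commonly used high-affinity threshold for predicted IC50), and all numeric values are hypothetical placeholders, not results or settings from the study. The 0.90 score cutoff follows the binding score threshold quoted in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    allele: str
    peptide: str
    el_score: float  # elution-likelihood style score in [0, 1]; higher is better
    ann_ic50: float  # ANN-style predicted IC50 in nM; lower is better


EL_CUTOFF = 0.90       # binding score threshold quoted in the abstract
ANN_CUTOFF_NM = 50.0   # assumption: common high-affinity IC50 cutoff


def concordant_strong_binders(predictions,
                              el_cutoff=EL_CUTOFF,
                              ann_cutoff=ANN_CUTOFF_NM):
    """Keep allele/peptide pairs that both methods call strong binders."""
    return [
        p for p in predictions
        if p.el_score > el_cutoff and p.ann_ic50 < ann_cutoff
    ]


# Placeholder records with invented values, for illustration only.
example_predictions = [
    Prediction("ALLELE-1", "PEPTIDEAA", 0.95, 20.0),   # strong by both
    Prediction("ALLELE-1", "PEPTIDEBB", 0.95, 800.0),  # strong by EL only
    Prediction("ALLELE-2", "PEPTIDECC", 0.40, 20.0),   # strong by ANN only
]
agreed = concordant_strong_binders(example_predictions)
```

Requiring agreement between a score-based and an affinity-based method is a conservative choice: it discards candidates that only one algorithm supports.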
We considered it useful to identify the epitopes that bind strongly to the most protective HLA alleles, since this can help identify new immunogenic regions. Thus, we calculated the binding prediction for the spike glycoprotein, the nucleoprotein, and the membrane protein and detected good binding scores (score>90), especially for the spike glycoprotein and the nucleoprotein (Table 3). Table 3 shows the best binding scores of the most common HLA alleles and the protective HLA alleles for the dominant SARS-CoV2 B cell epitopes, the dominant SARS-CoV2 T cell epitopes, and the most immunogenic SARS-CoV2 epitopes. We found four new SARS-CoV2 regions that bind well to the most protective HLA alleles and could represent vaccine targets.

The prevalence of the HLA alleles related to SARS-CoV2 (Table 1) (Table 4). In Ecuador, in a sample of 1010 Ecuadorians [16], the most frequent allele was HLA-A*02, followed by HLA-B*15:03 and HLA-A*25:01. None of the six HLA alleles that contribute to susceptibility or protection to SARS-CoV2 infection had a frequency higher than 0.58. At the time we performed our research, no data were available from Paraguay, Bolivia, Guyana, Suriname, and French Guiana. Supplementary Table 2 presents the complete data, including the number of populations for each country, and shows no significant differences from the summary scores in Table 4.

Additionally, we found that the most common HLA alleles in South America are, for HLA class I, HLA-A*02, HLA-A*24, HLA-B*35, HLA-B*44, and HLA-C*04, and for HLA class II, HLA-DQB1*03, HLA-DQB1*02, HLA-DQB1*04, and HLA-DRB1*04. When looking for the most common HLA alleles in South America, we only found alleles defined at the first field. We performed a binding prediction between these alleles and the most immunogenic SARS-CoV2 regions.
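Because the frequency data were only available at first-field resolution while the IEDB tool works with second-field alleles, each first-field allele has to be replaced by a second-field one; the study used the first second-field allele listed in the software. A minimal Python sketch of that substitution is given here. The function name and the `supported` list (including its order) are hypothetical stand-ins for the tool's allele menu, not its actual contents.

```python
def resolve_to_second_field(first_field, supported_alleles):
    """Return the first supported second-field allele for a first-field allele.

    `supported_alleles` is assumed to be in the order the tool lists them.
    Returns None when no second-field allele matches.
    """
    prefix = first_field + ":"  # e.g. "HLA-A*02" -> "HLA-A*02:"
    for allele in supported_alleles:
        if allele.startswith(prefix):
            return allele
    return None


# Hypothetical, abbreviated allele menu for illustration only.
supported = [
    "HLA-A*01:01",
    "HLA-A*02:01",
    "HLA-A*02:02",
    "HLA-A*24:02",
    "HLA-B*35:01",
]
resolved = resolve_to_second_field("HLA-A*02", supported)
```

Matching on the prefix with the trailing colon avoids confusing, for example, a first-field name that happens to be a leading substring of another. The choice of the first listed allele is a convenience rule, so the resulting scores are an approximation for the first-field group rather than a property of every allele in it.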
The software used to calculate the binding, IEDB, only supports binding predictions for alleles defined at the second field; therefore, for each first-field allele we calculated the binding with the closest second-field allele, which corresponds to the first second-field HLA allele listed in the software. The binding between the MHC I alleles and SARS-CoV2 regions is summarized in Table 5 Table 4 ).

In December 2019, patients with respiratory symptoms of unknown origin were described in the city of Wuhan, China. These patients shared a common exposure to the same seafood market in the city [18]. The disease was recognized to be caused by a novel coronavirus, later named SARS-CoV2 [18]. The virus was rapidly transmitted and spread worldwide. By the end of January 2020, the World Health Organization (WHO) declared it a "public health emergency of international concern" [19]. The rapid spread of the virus may be due to a characteristic of either the virus or the host. Thus, we investigated the association with one of the main features of the immune system: the HLA. The development of protection or susceptibility to a specific disease is influenced by the binding between parts of the pathogen and the HLA type [20]. For instance, in HIV infection, HLA*74:01 is associated with a lower viral load (protection), while HLA-A*36:01 is associated with a higher viral load (susceptibility) [20].

We demonstrated that the protective HLA alleles identified in our systematic review bound more tightly to the immunogenic epitopes described by Grifoni et al. [7]. Additionally, we identified four new epitopes with strong binding to the most protective HLA alleles (HLA-B*15:03, HLA-A*02:02, and HLA-C*12:03), which could be used as potential vaccine targets and contribute to an enhanced immune response [24] (Table 5). These epitopes are mainly found in the spike protein, the most virulent and immunogenic region [7] [17].
We searched for regional HLA frequency databases in South America that could provide accurate data; however, the information available is limited. The published studies primarily analyzed organ donors, such as renal and bone marrow donors in Colombia or solid organ donors in Brazil. Furthermore, the Allele Frequencies Net Database did not provide information from Paraguay, Bolivia, Guyana, Suriname, and French Guiana. Our literature review found that the six HLA alleles associated with protection or susceptibility to SARS-CoV2 are uncommon in the region [25].

For the most common HLA alleles in South America, we calculated the binding to the SARS-CoV2 immunogenic regions. As mentioned, the IEDB software only supports binding predictions for second-field HLA alleles; therefore, we calculated the binding with the closest second-field allele for each first-field allele, which corresponds to the first second-field HLA allele listed in the software. We are aware that a first-field HLA allele does not always correspond to that second-field HLA allele. However, our results give a general

Additionally, we detected the epitopes interacting with the most common HLA alleles in South America. Requena et al. reported a list of the best HLA-I candidate epitopes, and our data overlap with IPFAMQMAY, YLQPRTFLL, VASQSIIAY, and YYVGYLQPRTF (score>90) [17]. The epitopes IPFAMQMAY, YLQPRTFLL, and SYFIASFRLF are also mentioned by Cuspoca et al. as potential vaccine peptides [25]. Moreover, we report new epitopes for HLA-I that could be useful in the development of a regional vaccine (Table 5).
In conclusion, we provide bioinformatic evidence that HLA-B

Coronavirus Update (Live): 58,317,427 Cases and 1,384,174 Deaths from COVID-19 Virus Pandemic - Worldometer
Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV
Allele frequency net database (AFND) 2020 update: gold-standard data classification, open access genotype data and new query tools
Database resources of the National Center for Biotechnology Information
The Protein Data Bank
The Immune Epitope Database (IEDB): 2018 update
A Sequence Taiwan in 2003
Association of HLA class I with severe acute respiratory syndrome coronavirus infection
Association of human leukocyte antigen class II alleles with severe Middle East respiratory syndrome-coronavirus infection
Genetic diversity of the HLA system in human populations from the Sierra (Andean), Oriente (Amazonian) and Costa (Coastal) regions of
Identification of Novel Candidate Epitopes on SARS-CoV-2 Proteins for South America: A Review of HLA Frequencies by Country
The SARS-CoV-2 outbreak: What we know
Transmission dynamics and evolutionary history of 2019-nCoV
Human Leukocyte Antigen (HLA) and Immune Regulation: How Do Classical and Non-Classical HLA Alleles Modulate Immune Response to Human Immunodeficiency Virus and Hepatitis C Virus Infections?
HLA and Infectious Diseases. HLA and Associated Important Diseases, InTech
Binding affinities of 438 HLA proteins to complete proteomes of seven pandemic viruses and distributions of strongest and weakest HLA peptide binders in populations worldwide
Preliminary Identification of Potential Vaccine Targets for the COVID-19 Coronavirus (SARS-CoV-2) Based on SARS-CoV Immunological Studies
Major histocompatibility complex genomics and human disease
A Multi-Epitope Vaccine Against Sars-Cov-2 Directed Towards the Latin American Population: An Immunoinformatics Approach