key: cord-0682993-62prhi1o authors: Byun, Jinyoung; Han, Younghun; Walsh, Kyle M.; Park, Amy S.; Bondy, Melissa L.; Amos, Christopher I. title: Shared genomic architecture between COVID-19 severity and numerous clinical and physiologic parameters revealed by LD score regression analysis date: 2022-02-03 journal: Sci Rep DOI: 10.1038/s41598-022-05832-5 sha: 049e9cc6dc6bce140b505ac1c8ff4500143def9f doc_id: 682993 cord_uid: 62prhi1o The COVID-19 pandemic has produced broad clinical manifestations, from asymptomatic infection to hospitalization and death. Despite progress from genomic and clinical epidemiology research, risk factors for developing severe COVID-19 are incompletely understood and identification of modifiable risk factors is desperately needed. We conducted linkage disequilibrium score regression (LDSR) analysis to estimate cross-trait genetic correlation between COVID-19 severity and various polygenic phenotypes. To attenuate the genetic contribution of smoking and BMI, we further conducted sensitivity analyses by pruning genomic regions associated with smoking/BMI and repeating LDSR analyses. We identified robust positive associations between the genetic architecture of severe COVID-19 and both BMI and smoking. We observed strong positive genetic correlation (rg) with diabetes (rg = 0.25) and shortness of breath walking on level ground (rg = 0.28) and novel protective associations with vitamin E (rg = − 0.53), calcium (rg = − 0.33), retinol (rg = − 0.59), Apolipoprotein A (rg = − 0.13), and HDL (rg = − 0.17), but no association with vitamin D (rg = − 0.02). Removing genomic regions associated with smoking and BMI generally attenuated the associations, but the associations with nutrient biomarkers persisted. This study provides a comprehensive assessment of the shared genetic architecture of COVID-19 severity and numerous clinical/physiologic parameters. 
Associations with blood and plasma-derived traits identified biomarkers for Mendelian randomization studies to explore causality and nominated therapeutic targets for clinical evaluation.

We downloaded the GWAS summary statistics (COVID19-hg GWAS meta-analyses round 5, released on January 18, 2021; https://www.covid19hg.org/results/r5/) from the COVID-19 Host Genetics Initiative (COVID-19 HGI) [11] [12] [13], comprising (1) A2 (critical illness) 13: 4,606 very severe, respiratory-confirmed COVID-19 patients versus 702,801 population-based controls (A2_ALL_eur_leave_23andme) and (2) B2 (hospitalization) 13: 9,373 hospitalized COVID-19 patients versus 1,197,256 population-based controls (B2_ALL_eur_leave_23andme) (Table 1 and Supplementary Tables 1 and 2). While the summary statistics of COVID-19-hg GWAS meta-analyses across multiple populations have been deposited at the COVID-19 HGI 11,13, we restricted analyses to European-ancestry subjects (to align with the ancestral background of participants in GWAS of traits used in our downstream LDSR analyses) and did not include the 23andMe cohort (due to the data-use constraint, which makes only the top 10,000 SNPs publicly available) 14. All methods were performed in accordance with the relevant guidelines and regulations. To estimate cross-trait genetic correlation patterns between COVID-19 disease severity and multiple polygenic traits, we harmonized publicly available GWAS summary-level data from the UK Biobank (UKBB), a prospective population-based cohort study consisting of ~500,000 individuals, aged 40-69 years, who were recruited in the United Kingdom between 2006 and 2010 15,16. GWAS summary-level data used for the LDSR analyses of UKBB traits are from publicly posted results generated by the Neale lab (http://www.nealelab.is/uk-biobank/).
These association analyses are adjusted for the first 20 principal components, which account for population-level variability in allele frequencies. The GWAS summary-level data of UKBB used in our study are restricted to "British ancestry", determined using the first 6 principal components, and further filtered by self-reported ethnicity of "white-British", "Irish", or "White". The sample sizes and more details for the tested traits are shown in Supplementary Table 2.

Estimating SNP-heritability and cross-trait genetic correlation of COVID-19.
LD score regression analysis, with 1000 Genomes Project European (EUR) samples as a reference for the pattern of genome-wide LD, quantifies the co-heritability of diverse traits 4,9,10,17 using GWAS summary statistics for common genetic variants (i.e., SNPs). In brief, the LDSR method regresses χ2 statistics from GWAS on LD scores, allowing the estimation of genetic correlation without bias due to population stratification or cryptic relatedness 4,9,10,18,19. By regressing the product of SNP-level Z scores for two traits (e.g., Z COVID19_A2 × Z UKBB_BMI) on each SNP's LD Score (an estimate of the amount of total genetic variation tagged by each variant), one can estimate the magnitude and direction of shared genomic architecture between these traits.

Table 1. Study description.
Trait                  Sample size    SNPs
A2: very severe, respiratory-confirmed COVID-19 patients versus population-based controls
COVID-19 A2            707,407        1,140,193
B2: hospitalized COVID-19 patients versus population-based controls
COVID-19 B2            1,206,629      1,141,302
Exclusion of smoking-associated genomic regions
A2: very severe, respiratory-confirmed COVID-19 patients versus population-based controls
COVID-19 A2⟂Smoke      707,407        1,001,866
B2: hospitalized COVID-19 patients versus population-based controls
COVID-19 B2⟂Smoke      1,206, [table truncated in source]
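The regression described above can be illustrated with a minimal Python sketch. It is not the ldsc implementation: it uses unweighted ordinary least squares instead of the weighted regression with block-jackknife standard errors that ldsc applies, it treats SNPs as independent, and it assumes no sample overlap between the two GWAS. The function names (`ols`, `ldsc_rg`, `simulate_z`) and all simulation parameters are hypothetical. The sketch relies on the standard LD score regression expectations, E[χ2_j] ≈ N h2 ℓ_j / M + intercept and E[z1_j z2_j] ≈ sqrt(N1 N2) ρg ℓ_j / M + intercept, with rg = ρg / sqrt(h2_1 h2_2).

```python
import numpy as np


def ols(x, y):
    """Return (slope, intercept) of an unweighted least-squares line y ~ x."""
    x = np.asarray(x, dtype=float)
    design = np.column_stack([x, np.ones_like(x)])
    (slope, intercept), *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return slope, intercept


def ldsc_rg(z1, z2, ld, n1, n2, m):
    """Estimate (rg, h2_1, h2_2) from Z scores and LD scores.

    z1, z2 : per-SNP Z scores for the two traits
    ld     : per-SNP LD scores
    n1, n2 : GWAS sample sizes; m : number of SNPs used to compute LD scores
    """
    z1, z2, ld = (np.asarray(a, dtype=float) for a in (z1, z2, ld))
    slope1, _ = ols(ld, z1 ** 2)    # chi-square of trait 1 on LD score
    slope2, _ = ols(ld, z2 ** 2)    # chi-square of trait 2 on LD score
    slope12, _ = ols(ld, z1 * z2)   # product of Z scores on LD score
    h2_1 = slope1 * m / n1
    h2_2 = slope2 * m / n2
    rho_g = slope12 * m / np.sqrt(n1 * n2)
    return rho_g / np.sqrt(h2_1 * h2_2), h2_1, h2_2


def simulate_z(ld, n, m, h2_1, h2_2, rg, seed=0):
    """Draw Z scores whose second moments follow the LDSR expectations (toy model)."""
    rng = np.random.default_rng(seed)
    var1 = 1 + n * h2_1 * ld / m
    var2 = 1 + n * h2_2 * ld / m
    cov = n * rg * np.sqrt(h2_1 * h2_2) * ld / m
    e1 = rng.standard_normal(ld.size)
    e2 = rng.standard_normal(ld.size)
    z1 = np.sqrt(var1) * e1
    z2 = cov / np.sqrt(var1) * e1 + np.sqrt(var2 - cov ** 2 / var1) * e2
    return z1, z2


# Toy data: all values below are illustrative assumptions, not study data.
ld_scores = np.random.default_rng(1).uniform(1, 200, size=200_000)
z_a, z_b = simulate_z(ld_scores, n=100_000, m=1_000_000, h2_1=0.2, h2_2=0.3, rg=0.5)
rg_hat, h2_a, h2_b = ldsc_rg(z_a, z_b, ld_scores, n1=100_000, n2=100_000, m=1_000_000)
print(f"rg = {rg_hat:.2f}, h2 (trait 1) = {h2_a:.2f}, h2 (trait 2) = {h2_b:.2f}")
```

With real summary statistics the ldsc software should be used, since the weighting and jackknife steps omitted here are what make the estimates and their standard errors reliable.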
To control the multiple testing burden, we restricted analyses to UKBB traits showing heritability ≥ 1% and for which prior studies have suggested correlations between COVID-19 and risk for severe outcomes, or traits that were correlated with traits that have been associated with severe outcomes. We conservatively set the test-wise level of significance after Bonferroni correction to be 0.05/(6 × 64), adjusting for analysis of six COVID-19 severity phenotypes (A2 and B2, each with and without removal of BMI- and smoking-associated SNPs) against 64 UKBB traits. We first used the "munge_sumstats.py" script of LD Score (https://github.com/bulik/ldsc; ldsc v1.0.1) to generate the ".sumstats" format from the GWAS summary statistics, after selecting ~1.14 M HapMap3 SNPs with MAF > 1% as recommended. Multi-allelic SNPs and the major histocompatibility complex (MHC) region (Chr6:25-34 Mb) were excluded from summary statistics because of the complex and unusual LD pattern and genetic architecture of the MHC region 4. We then applied "ldsc.py --rg covid19.A2.sumstats.gz,trait1.sumstats.gz --ref-ld-chr eur_w_ld_chr/ --w-ld-chr eur_w_ld_chr/ --out covid19.A2_trait1".

Exclusion of genomic regions related to smoking behavior and BMI.
Although a clearer picture is emerging, the contribution of cigarette smoking to COVID-19 disease severity remains incompletely understood, with most studies suggesting increased disease severity among former smokers versus never-smokers, but some studies observing a protective effect for current smoking 20 and others showing an increased risk for more severe symptoms in smokers 21. Since smoking behaviors are heritable traits that correlate with many other complex diseases, we performed sensitivity analyses by excluding chromosomal regions (± 500 kb) around 473 SNPs previously associated with various smoking behaviors (⟂Smoke) to attenuate the genetic contribution of smoking-related variants 4.
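The region-exclusion step can be sketched as follows. This is only an illustration of dropping every SNP that lies within ± 500 kb of a previously reported index SNP on the same chromosome; the function names, the tuple layout, and the example identifiers and coordinates are hypothetical, and the authors' actual pruning script is not described in the text.

```python
from bisect import bisect_left
from collections import defaultdict

WINDOW_BP = 500_000  # ± 500 kb, as stated in the text


def build_index(index_snps):
    """Group index-SNP positions by chromosome and sort them for binary search."""
    by_chrom = defaultdict(list)
    for chrom, pos in index_snps:
        by_chrom[chrom].append(pos)
    return {chrom: sorted(positions) for chrom, positions in by_chrom.items()}


def near_index_snp(chrom, pos, index, window=WINDOW_BP):
    """True if (chrom, pos) lies within `window` bp of any index SNP."""
    positions = index.get(chrom, [])
    i = bisect_left(positions, pos)
    # Only the closest index SNP on each side needs to be checked.
    neighbours = positions[max(i - 1, 0): i + 1]
    return any(abs(pos - p) <= window for p in neighbours)


def prune(snps, index_snps, window=WINDOW_BP):
    """Keep SNPs (snp_id, chrom, pos) that are outside all exclusion windows."""
    index = build_index(index_snps)
    return [s for s in snps if not near_index_snp(s[1], s[2], index, window)]


# Hypothetical summary-statistic rows and one hypothetical smoking index SNP.
gwas_snps = [
    ("rsA", "1", 1_000_000),
    ("rsB", "1", 1_400_000),
    ("rsC", "1", 1_600_001),
    ("rsD", "2", 1_000_000),
]
smoking_index = [("1", 1_100_000)]
print(prune(gwas_snps, smoking_index))
```

Sorting the index positions once and using a binary search keeps the filter fast even for roughly a million SNPs; the pruned list would then be passed to munge_sumstats.py and ldsc.py as before.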
The removed genomic regions were related to cigarettes per day, smoking initiation, smoking cessation, initiation age of regular smoking, and nicotine dependence (Supplementary Tables 3 and 4). Although obesity increases risk of systemic inflammation, pulmonary clots, stroke, and myocardial infarction, it remains unclear whether reported associations between BMI and COVID-19 disease severity are confounded by socioeconomic status or concurrent health issues. We performed sensitivity analyses by excluding genomic regions (± 500 kb) around 941 SNPs previously associated with BMI (⟂BMI) to attenuate the genetic contribution of BMI-related variants (Supplementary Tables 4 and 5).

We implemented cross-trait LDSR analysis to examine shared genetic contributions to COVID-19 disease severity and multiple clinical and epidemiologic traits using pairwise genetic correlations (rg) and the observed-scale heritability (h 2 , representing the proportion of phenotypic variance explained by all common SNPs). The flow chart presented in Fig. 1 summarizes the steps from data preparation to LDSR analysis for COVID-19 severity versus the 64 polygenic traits we studied. A prior GWAS analysis 13 of very severe, respiratory-confirmed COVID-19 (phenotype A2: critical illness; 4,606 cases and 702,801 controls, European descent only) identified loci on chromosomes 3, 12, 17, 19, and 21 that reached genome-wide statistical significance (P < 5.0 × 10 −8 , shown as the red horizontal line), with a genomic inflation factor of 1.047 and an estimated h 2 of 0.35%. Sensitivity analysis excluding chromosomal regions known to be associated with smoking reduced the genomic inflation factor to 1.041 and h 2 to 0.34%. Sensitivity analysis excluding chromosomal regions known to be associated with BMI increased the genomic inflation factor to 1.050 and h 2 to 0.35% (Fig. 2, Supplementary Table 6).
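For readers unfamiliar with the genomic inflation factor quoted above, the following sketch shows the conventional definition: the median of the observed 1-df χ2 statistics divided by the median of the χ2 distribution with one degree of freedom (approximately 0.4549). That the reported values were computed exactly this way is an assumption; the function name and the simulated Z scores are hypothetical.

```python
import numpy as np

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(z_scores):
    """Genomic inflation factor (lambda GC) from per-SNP Z scores."""
    chi2 = np.asarray(z_scores, dtype=float) ** 2
    return float(np.median(chi2) / CHI2_1DF_MEDIAN)


# Under the null, Z scores are standard normal and lambda should be close to 1.
null_z = np.random.default_rng(0).standard_normal(1_000_000)
print(f"lambda GC under the null: {genomic_inflation(null_z):.3f}")
```

Values modestly above 1, such as those reported here, are expected for polygenic traits in large samples and do not by themselves indicate confounding; the LDSR intercept is the more informative diagnostic.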
A prior GWAS analysis of hospitalized COVID-19 (phenotype B2: hospitalization; 9373 cases, 1,197,256 controls in only European-descent) identified loci on chromosomes 3, 12, 19, and 21 at genome-wide statistical significance, with a genomic inflation factor of 1.041, and an estimated h 2 of 0.19% (Fig. 2 ). Sensitivity analysis excluding chromosomal regions known to be associated with smoking reduced the genomic inflation factor to 1.038 and h 2 to 0.19%. Sensitivity analysis excluding chromosomal regions known to be associated with BMI reduced the genomic inflation factor to 1.035 and h 2 to 0.17% (Fig. 2, Supplementary Table 6 ). Using these GWAS results, we next performed LDSR analyses with two phenotypes for COVID-19 severity (COVID-19 A2, and COVID-19 B2) considering four phenotypes for exclusions of genomic regions related to BMI and Smoking (COVID19_A2⟂BMI, COVID19_A2⟂Smoke, COVID19_B2⟂BMI, and COVID19_ B2⟂Smoke) and 64 UKBB polygenic traits that had SNP array-based heritability (h 2 ) ≥ 1% (to maximize study power and to provide reliable inferences). Twenty-three diverse traits showed moderate to strong co-heritability with COVID-19 disease severity ( Table 2, Supplementary Table 7) , including several at Bonferroni-corrected significance level (P < 1.30 × 10 −4 ). Very severe, respiratory-confirmed COVID-19 illness (A2) and COVID-19 hospitalization (B2) showed strong genomic correlation with traits related to adiposity, diabetes, digestive diseases, smoking behaviors, hematologic traits, and selected nutrient levels (Fig. 3 , Supplementary Table 7) . 
Among physical traits, the genetic architecture of COVID-19 disease severity was positively correlated with BMI (rg COVID19_A2 = 0.20, P COVID19_A2 = 1.51 × 10 −5 ; rg COVID19_B2 = 0.34, P COVID19_B2 = 1.99 × 10 −8 ), weight (rg COVID19_A2 = 0.17, P COVID19_A2 = 1.24 × 10 −4 ; rg COVID19_B2 = 0.27, P COVID19_B2 = 7.23 × 10 −7 ), and whole body fat mass (rg COVID19_A2 = 0.20, P COVID19_A2 = 7.81 × 10 −6 ; rg COVID19_B2 = 0.33, P COVID19_B2 = 2.24 × 10 −8 ). After excluding genomic regions previously associated with BMI, both BMI and whole body fat mass continued to show strongly significant positive correlation with COVID-19 disease severity (rg COVID19_A2⟂BMI = 0.17 and P COVID19_A2⟂BMI = 2.37 × 10 −3 ; rg COVID19_B2⟂BMI = 0.28 and P COVID19_B2⟂BMI = 1.82 × 10 −5 ). Among medical conditions, the genetic architecture of COVID-19 disease severity was positively correlated with shortness of breath walking on level ground (rg COVID19_A2 = 0.28, P COVID19_A2 = 2.87 × 10 −3 ; rg COVID19_B2 = 0.43, P COVID19_B2 = 4.56 × 10 −5 ), diabetes (rg COVID19_A2 = 0.54, P COVID19_A2 = 7.10 × 10 −4 ; , and diseases of the musculoskeletal system and connective tissue (rg COVID19_A2 = 0.24, P COVID19_A2 = 4.84 × 10 −4 ; rg COVID19_B2 = 0.34, P COVID19_B2 = 3.54 × 10 −6 ). Excluding genomic regions associated with smoking behaviors or BMI generally attenuated these correlations, although most remained nominally associated at P < 0.05 and the association between COVID-19 hospitalization and diseases of the digestive system remained significant at Bonferroni-corrected levels after exclusion of smoking-associated loci (rg COVID19_B2⟂Smoke = 0.38, P COVID19_B2⟂Smoke = 2.30 × 10 −5 ). 
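The significance calls in this section can be checked against the Bonferroni threshold defined in the Methods, 0.05/(6 × 64). The short Python sketch below recomputes that threshold and compares it with a few P values quoted in the paragraph above; the dictionary keys are informal labels chosen for this illustration.

```python
ALPHA = 0.05
N_COVID_PHENOTYPES = 6   # A2 and B2, each in full, without BMI regions, without smoking regions
N_UKBB_TRAITS = 64

threshold = ALPHA / (N_COVID_PHENOTYPES * N_UKBB_TRAITS)

# P values copied from the text above.
reported_p = {
    "BMI vs A2": 1.51e-5,
    "weight vs A2": 1.24e-4,
    "BMI vs A2 (BMI regions removed)": 2.37e-3,
    "shortness of breath vs A2": 2.87e-3,
    "digestive diseases vs B2 (smoking regions removed)": 2.30e-5,
}

passes_bonferroni = {label: p < threshold for label, p in reported_p.items()}

print(f"Bonferroni threshold: {threshold:.3e}")
for label, ok in passes_bonferroni.items():
    print(f"{label}: {'significant' if ok else 'nominal only'}")
```

The threshold evaluates to about 1.30 × 10 −4 , matching the value used in Table 2, so correlations with P values in the 10 −3 range are nominal rather than Bonferroni-significant.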
Among smoking behaviors, current tobacco smoking (rg COVID19_B2 = 0.34, P COVID19_B2 = 2.01 × 10 −6 ) and exposure to tobacco smoke at home (rg COVID19_B2 = 0.47, P COVID19_B2 = 1.73 × 10 −5 ) presented strongly significant positive genetic correlations with COVID-19 hospitalization, which were only modestly attenuated when removing known smoking-associated loci from analysis. Current tobacco smoking was more modestly associated with severe, respiratory-confirmed COVID-19 illness (rg COVID19_A2 = 0.13, P COVID19_A2 = 0.021) and this association became non-significant after removing smoking-associated loci from analysis (Table 2), supporting a link between known smoking risk loci and risk for severe COVID-19 outcomes. Examining hematologic traits, both high light scatter reticulocyte percentage and count were significantly positively correlated with COVID-19 hospitalization, as was immature reticulocyte fraction. These traits were also positively correlated with severe COVID-19 illness, but not at Bonferroni-corrected levels of statistical significance. C-reactive protein levels were also positively correlated with COVID-19 disease severity. Interestingly, serum (not urinary) albumin was negatively correlated with COVID-19 disease severity at nominal statistical significance (rg COVID19_A2 = − 0.12, P COVID19_A2 = 0.026; rg COVID19_B2 = − 0.16, P COVID19_B2 = 0.011), as were HDL, apolipoprotein A levels, and levels of serum IGF-1 (Table 2). We also examined the pairwise genetic relationship between COVID-19 disease severity and nutrient-related traits in UKBB.
Although we were not able to observe any significant associations between COVID-19 critical illness and hospitalization and nutrient-related traits at Bonferroni-corrected levels, we identified suggestive negative correlations with magnesium (rg COVID19_A2 = − 0.39, P COVID19_A2 = 2.28 × 10 −3 ; rg COVID19_B2 = − 0.36, P COVID19_B2 = 5.17 × 10 −3 ), retinol (rg COVID19_A2 = − 0.59, P COVID19_A2 = 0.041; rg COVID19_B2 = − 0.59, P COVID19_B2 = 0.029), and vitamin E (rg COVID19_A2 = − 0.53, P COVID19_A2 = 2.16 × 10 −3 ; rg COVID19_B2 = − 0.53, P COVID19_B2 = 3.10 × 10 −3 ) ( Table 2 and Supplementary Table 7 ). Vitamin D levels were not associated with risk for severe COVID-19 (rg COVID19_A2 = − 0.023, P COVID19_A2 = 0.67; rg COVID19_B2 = − 0.043, P COVID19_B2 = 0.44). We investigated the genetic correlations between COVID-19 disease severity (A2:critical illness and B2:hospitalization) with a variety of clinical and physiologic traits using summary-level GWAS data from extremely large patient cohorts, observing shared genomic architecture with a number of illnesses and biomarkers of somatic well-being. We identify a suite of medical conditions and physiological traits that appear to share the genetic architecture with that of COVID-19 severity. Many of these traits overlap those previously identified in the large databases of COVID-19 patient outcomes, including traits related to adiposity, kidney function, and pulmonary insufficiency. We also identified additional traits that have received comparatively little attention, such as blood and serum levels of several vitamins and nutrients. Although our datasets are quite large (COVID-19 severity GWAS n = 707,407 and 1,206,629 for critical illness (A2) and hospitalization (B2), respectively; UKBB GWAS n = 361,194), larger datasets would likely identify many of these same associations and could potentially bring some of the nominally associated associations to a corrected level of statistical significance. 
Using an orthogonal genomics-driven approach that complements previous COVID-19 clinical epidemiology research, we confirm a link between the development of severe COVID-19 illness and both elevated BMI and diabetes. We also clarify associations with current smoking status, observing that it was positively correlated with COVID-19 disease severity, and note new associations with diverticulosis and reticulocyte traits. Additionally, we observe a suggestive association between increased disease severity and reduced levels of IGF-1 (a marker of nutritional status) and additional suggestive protective associations with magnesium, retinol, and vitamin E levels. COVID-19 is primarily a respiratory illness. We observed that higher forced vital capacity (FVC) was negatively (protectively) associated with COVID-19 disease severity and observed a strongly positive correlation between the genetic architecture of 'shortness of breath while walking on level ground' and development of severe COVID-19 illness. Chest pain and discomfort have previously been associated with COVID-19 hospitalization, and the U.S. Centers for Disease Control and Prevention (CDC) announced that individuals with chronic lung diseases including emphysema, chronic bronchitis, COPD, and interstitial lung disease are at high risk for becoming critically ill from SARS-CoV-2 1. Our study demonstrates a positive correlation between the genetic architecture of these risk factors and COVID-19 disease severity through LDSR analyses. In this study, a differential diagnosis of COPD was strongly positively correlated with COVID-19 hospitalization, regardless of the exclusion of genomic regions related to BMI and smoking behaviors.
Since chronic inflammation is an important feature in developing both emphysema and bronchitis, these findings suggest a potential shared genetic contribution between COPD and COVID-19 hospitalization separate from the contributions of known BMI and smoking-related variants. Variants located in immune-related genes and contributing to increased pulmonary inflammation could be evaluated in future work.

Table 2. Cross-trait genetic correlations of COVID-19 on inclusion/exclusion of genomic regions associated with BMI and smoking. P-values in bold indicate P ≤ 1.30 × 10 −4 . COVID19_A2, very severe respiratory confirmed covid versus population including whole genomic regions; A2⟂BMI, very severe respiratory confirmed covid versus population with exclusion of genomic regions related to BMI; A2⟂Smoke, very severe respiratory confirmed covid versus population with exclusion of genomic regions related to smoking behaviors; COVID19_B2, hospitalized covid versus population including whole genomic regions; B2⟂BMI, hospitalized covid versus population with exclusion of genomic regions related to BMI; B2⟂Smoke, hospitalized covid versus population with exclusion of genomic regions related to smoking behaviors.

Traits related to smoking behaviors were generally associated with increased COVID-19 disease severity in our analyses, including current smoking, exposure to tobacco smoke either at home or outside home, in utero tobacco smoke exposure, and cumulative pack-years. Conversely, never-smoker status showed negative genomic correlation with COVID-19 disease severity. Although UKBB does not delineate former smokers in ascribing smoking status, our analyses indicate that the genetic determinants of current smoking are associated with increased COVID-19 disease severity and do not support the clinical observations that current smoking may protect against severe COVID-19 illness.
Given the lack of a COVID-19 vaccine during the first year of the pandemic and continued supply scarcity in numerous regions, many studies have sought to identify alternative strategies to minimize risk of developing severe COVID-19 following SARS-CoV-2 infection and also to treat severe COVID-19. In addition to evaluations of existing pharmacologic agents (e.g., ivermectin, hydroxychloroquine, azithromycin, and dexamethasone), vitamin and nutrient supplementation has been widely studied. Global mortality rate differences associated with latitude and clinical observations of low serum 25-hydroxyvitamin D levels among hospitalized COVID-19 patients have perhaps garnered the greatest attention 22, but we did not observe a significant association between genetic determinants of vitamin D levels and COVID-19 severity. However, we observed nominally significant protective effects for less-studied nutrient-related traits, including magnesium, calcium, retinol, and vitamin E. A vitamin D/magnesium/vitamin B12 combination was associated with a reduction in the proportion of elderly COVID-19 patients requiring oxygen support and intensive care support in a small prospective cohort 23, and lower plasma retinol levels have also been observed in hospitalized COVID-19 patients 24. We did not observe a significant association between serum Vitamin D levels and risk for COVID-19 or severe outcomes. Vitamin E levels have not been widely examined in the context of COVID-19, but deficiency is frequently associated with intestinal malabsorption rather than dietary insufficiency and thus may reinforce the observed genetic correlation between COVID-19 disease severity and diverticular disease in our analyses. In our study, we do observe an association between higher levels. Further, we observe protective associations for both HDL and serum concentration of apolipoprotein A, a major component of the HDL complex involved in clearing fat.
HDL is involved in vitamin E absorption and contains approximately 40% of circulating α-tocopherol, the main dietary source of vitamin E 25. Integration and harmonization of extant large-scale GWAS datasets has become a popular approach to reveal novel epidemiologic associations. Still, access to individual-level GWAS datasets remains limited because of data use restrictions. The LDSR method does not require individual-level genotype data or LD pruning and can quantify the shared genetic architecture of traits having undergone GWAS analysis. However, LDSR analysis assumes absence of population stratification in the underlying summary statistics used and necessitates incorporation of GWAS data from populations expected to have similar genomic architecture. This assumption restricted our analysis to use of GWAS data from British-ancestry individuals, limiting our ability to make conclusions about the shared genetic architectures among other racial/ethnic groups. Given that COVID-19 disease severity has been associated with racial/ethnic background, as well as socioeconomic status and somatic well-being, it is imperative that efforts be made to enrich future genetic epidemiology studies for participants of non-European descent to expand generalizability of results. Interpretation of our results is also limited by the strong correlation between many of the traits studied with BMI and smoking behaviors. Although we made efforts to limit the impact of BMI and smoking-associated genetic variation by excluding known loci from LDSR analysis, such sensitivity analyses cannot account for polygenic contributions not yet having reached genome-wide statistical significance in prior research. To estimate cross-trait genetic correlation, we restricted the range of UKBB traits with an arbitrary threshold of h 2 ≥ 1% to improve reliability.
For instance, there were additional subtypes of diabetes derived from the various medical records in UKBB and their estimates of SNP-heritability showed h 2 Type 1 Diabetes = 0.3% and h 2 Type 2 Diabetes = 0.4%. Therefore, we did not include results from them. The type of diabetes reported in Table 1 , described as "Diabetes diagnosed by doctor" (UKBB Field Identifier:2443), is not specified for the type of diabetes, but given the age of participants and the general prevalence of T2D versus T1D, the association between diabetes and COVID-19 severity is ostensibly driven by the shared genetic architecture between COVID-19 severity and the genetic architecture of T2D. Furthermore, LDSR analysis relies on the common genetic variants with MAF > 1% and therefore it can fail to capture the SNP-heritability on the observed scale due to underlying low-frequency or rare variants 4 . If a polygenic trait in UKBB shows a significant genetic correlation with COVID-19 severity, this does not imply a causal association. Both the tested trait and COVID-19 severity risk may be jointly influenced by an unmodeled trait that is independently associated with each 19 . Although our study relies on associations with common genetic variants (generally, MAF > 1%), the inclusion of additional rare genetic variants might be valuable as it could increase the overall trait heritability being modeled. Inferences from LDSR rely on normality assumptions that may be violated when rare variants are studied, and we therefore restricted analysis to more common variants. Additionally, LDSR does not explicitly model confounding effects which can arise when studying multiple correlated traits. Therefore, the method identifies novel associations that can be further studied using Mendelian Randomization or direct analyses of the nominated trait phenotypes for further confirmation of causal relationships. 
LDSR is a useful approach for identifying potential novel associations that will warrant further epidemiological analysis to tease apart causal associations from associations that are influenced by confounding. Our findings support previously identified risk factors for severe COVID-19 illness, including elevated BMI, diabetes, and numerous pulmonary conditions (e.g., COPD, reduced FEV, shortness of breath during mild activity). We also observe protective associations between the genetic underpinnings of COVID-19 severity and that of non-smoking, serum albumin, apolipoprotein A, HDL cholesterol level, and several nutrients. Further studies using Mendelian randomization approaches may help to dissect causal associations between COVID-19 disease severity and these traits, potentially nominating targets for therapeutic intervention. The datasets supporting the conclusions of this article are publicly available in the COVID-19 HGI website for COVID-19 severity GWAS summary statistics [https://www.covid19hg.org/results/r5/] and Neale's lab repository for UK Biobank GWAS summary statistics [https://github.com/Nealelab/UK_Biobank_GWAS], and therefore no approvals were required. Geneva: World Health Organization Genomewide association study of severe Covid-19 with respiratory failure Genetic testing and common disorders in a public health framework: How to assess relevance and possibilities. Background document to the ESHG recommendations on genetic testing and common disorders The shared genetic architectures between lung cancer and multiple polygenic phenotypes in genome-wide association studies Role of comorbidities like diabetes on severe acute respiratory syndrome coronavirus-2: A review Case characteristics, resource use, and outcomes of 10 021 patients with COVID-19 admitted to 920 German hospitals: An observational study Are patients with hypertension and diabetes mellitus at increased risk for COVID-19 infection?
Genetic correlations of polygenic disease traits: From theory to practice An atlas of genetic correlations across human diseases and traits LD Score regression distinguishes confounding from polygenicity in genome-wide association studies The COVID-19 Host Genetics Initiative The COVID-19 host genetics initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic Covid-19 Host Genetics Initiative. Mapping the human genetic architecture of COVID-19 Shared genetic etiology between idiopathic pulmonary fibrosis and COVID-19 severity Smoking, alcohol consumption, and cancer: A mendelian randomisation study in UK Biobank and international genetic consortia participants Genetic correlation profile of schizophrenia mirrors epidemiological results and suggests link between polygenic and rare variant (22q11.2) cases of schizophrenia Partitioned glioma heritability shows subtype-specific enrichment in immune cells The shared genetic architecture between epidemiological and behavioral traits with lung cancer Smoking and risk of COVID-19 hospitalization Current smoking and COVID-19 risk: Results from a population symptom app in over 2.4 million people The relationship between the severity and mortality of SARS-CoV-2 infection and 25-hydroxyvitamin D concentration: A metaanalysis Cohort study to evaluate the effect of vitamin D, magnesium, and vitamin B12 in combination on progression to severe outcomes in older patients with coronavirus (COVID-19). Nutrition 79-80, 111017 Vitamin A plasma levels in COVID-19 patients: A prospective multicenter study and hypothesis Mechanisms for the prevention of vitamin E excess We thank all individuals who have contributed their samples and clinical data for this study, and we also thank all members in the COVID-19 Host Genetics Initiative community for sharing GWAS summary statistics of COVID-19. The authors declare no competing interests. 
The online version contains supplementary material available at https://doi.org/10.1038/s41598-022-05832-5. Correspondence and requests for materials should be addressed to J.B. or C.I.A. Reprints and permissions information is available at www.nature.com/reprints. Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.