key: cord-0028913-nwvsubtj authors: Shin, Cheol Min; Park, Kyungtaek; Kim, Nayoung; Won, Sungho; Ohn, Jung Hun; Lee, Sejoon; Park, Ji Hyun; Kang, Seung Joo; Kim, Joo Sung; Lee, Dong Ho title: rs2671655 single nucleotide polymorphism modulates the risk for gastric cancer in Helicobacter pylori–infected individuals: a genome-wide association study in the Korean population date: 2022-03-24 journal: Gastric Cancer DOI: 10.1007/s10120-022-01285-x sha: 8bf48aa3c3978f3a235f9a0740b4ee335632bbb9 doc_id: 28913 cord_uid: nwvsubtj OBJECTIVE: To identify genetic variations that are associated with gastric cancer (GC) risk according to Helicobacter pylori infection. METHODS: This study incorporated 527 GC patients and 441 controls from a cohort at Seoul National University Bundang Hospital. The associations between GC risk and single nucleotide polymorphisms were calculated, stratified by H. pylori status, adjusting for age, sex, and smoking. mRNA expression from non-cancerous gastric mucosae was evaluated using reverse transcription quantitative polymerase chain reaction. RESULTS: In the entire cohort, no variant reached the genome-wide significance level in the genome-wide association study. In the H. pylori–positive group, rs2671655 (chr17:47,468,020; hg19, GH17J049387 enhancer region) was identified at a genome-wide significance level, and the association was more pronounced in diffuse type GC. There was no significant variant in the H. pylori–negative group, indicating the effect modification of rs2671655 by H. pylori. Among the target genes of the GH17J049387 enhancer (PHB1, ZNF652 and SPOP), PHB1 mRNA was expressed more in cases than in controls, regardless of H. pylori status. By contrast, an increase in ZNF652 and SPOP in GC was observed only in the H. pylori–negative group (P < 0.05). Mediation analysis showed that PHB1 (P = 0.0238) and SPOP (P = 0.0328) mediated the effect of rs2671655 on GC risk. 
The polygenic risk score was associated with the number of rs2671655 risk alleles only in the H. pylori–positive group (P = 0.0112). CONCLUSION: After H. pylori infection, rs2671655 may increase GC risk, especially in diffuse-type GC, by regulating the expression of several genes that consequently modify susceptibility to GC. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10120-022-01285-x. Gastric cancer (GC) is the fifth most diagnosed cancer and the third leading cause of cancer mortality worldwide, with approximately 1 million incident cases and more than 720,000 deaths annually [1, 2]. Helicobacter pylori infection is associated with GC risk. However, only 2-3% of individuals infected with H. pylori develop GC [3]. Moreover, H. pylori eradication decreases but does not eliminate GC risk. Previous studies have reported that a family history of GC in first-degree relatives increases GC risk [4], and a large twin cohort study showed that host genetics increased GC risk by 28% [5]. Genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with GC risk, such as rs4072037 (mucin 1 [MUC1]), rs9841504 (ZBTB20), rs13361707 (protein kinase AMP-activated catalytic subunit α1 [PRKAA1]), rs2294008 (prostate stem cell antigen [PSCA]), and rs2274223 (PLCE1) [6] [7] [8], which could contribute to H. pylori-associated GC risk. A synergistic interaction between H. pylori infection and family history of GC has been observed [9, 10], and the effect of H. pylori infection on GC risk might differ according to host genetics [11] [12] [13]. However, most genetic studies have evaluated the interaction only by using a candidate gene approach, and no GWAS has considered the interactions between host genetics and H. pylori in GC. Against this background, we hypothesized that a comprehensive GWAS that accounts for H. pylori infection would reveal genes that modify susceptibility to GC. 
The aim of this study was to elucidate genomic loci associated with GC risk that are differently affected by H. pylori infection in the Korean population. This study consecutively incorporated a total of 1216 subjects (610 GC patients and 606 controls) registered at Seoul National University Bundang Hospital (SNUBH). Most of the controls had undergone standard esophagogastroduodenoscopy as part of a screening program for premalignant gastric mucosal lesions or GC. The subjects were enrolled as controls when esophagogastroduodenoscopy results showed no evidence of GC, dysplasia, mucosa-associated lymphoid tissue lymphoma, or esophageal cancer. All cases were histologically identified as gastric adenocarcinomas after surgery or endoscopic therapy. All subjects were 26-80 years of age, provided informed consent, and were asked to complete a questionnaire under the supervision of a well-trained interviewer. The questionnaire included questions regarding demographic data (age, sex, and current and childhood residences), socioeconomic data (smoking, monthly income, and education level), dietary data (salty and spicy food diet), and history of H. pylori eradication therapy. The study protocol was approved by the Ethical Committee at SNUBH (IRB No. B-1610-366-303). In addition, this study was registered at ClinicalTrials.gov (NCT03486574; date of study start: December 7, 2016; date of primary completion: December 31, 2022 [Anticipated]). Blood samples were obtained from all study subjects, and DNA was extracted from the buffy coat layer using the QIAamp® DNA blood mini kit (Qiagen, CA, USA) according to the manufacturer's instructions. The purity and concentration of the extracted DNA were assessed using a Nanodrop spectrophotometer (Thermo Scientific, USA). The extracted DNA samples were stored at −20 °C for further analysis. The study subjects were genotyped using the Affymetrix Axiom Korean Chip (v1.1) on 796,769 variants [14]. 
The genotypes were clustered using K-medoids [15], and data were trimmed by following the quality control (QC) steps suggested by Anderson et al. [16] (Suppl. Fig. S1). Participants were excluded from analysis when (1) genotype-estimated sex was discordant with biological sex, (2) the missing rate of the genotype was larger than 0.03, (3) the heterozygosity rate was more than three standard deviations from the mean of all participants, or (4) the pairwise identity-by-descent estimate was larger than 0.185 with a larger missing genotype rate than that of the counterpart. Variants were excluded when (1) they were located on a sex chromosome, (2) their missing rate was significantly different (P < 1 × 10⁻⁵) between cases and controls and was larger than 0.03, (3) the minor allele frequency (MAF) was less than 0.005, or (4) the P-value of the Hardy-Weinberg equilibrium (HWE) test was < 0.001. After QC, 1,038 participants and 606,270 variants remained, and these data were imputed using the Michigan Imputation Server (v.1.1) [17]. Haplotype Reference Consortium r1.1 2016 with non-European and mixed populations, Eagle (v.2.4), Minimac4, and Asian were designated as the reference panel, phasing program, imputation program and QC population in its established pipeline, respectively. After imputation, variants were excluded when (1) the MAF was less than 0.05, (2) the missing rate was larger than 0.03, (3) the P-value of the HWE test was < 0.001, or (4) the imputation quality score (INFO) was smaller than 0.3. We also noticed that 3 subjects in the control group had an intermediate GC state, and 51 had a previous history of other cancers; thus we excluded them from the study analysis. Variant-wise logistic regression was performed by adjusting for sex, age, smoking status, and the top four principal component (PC) scores. PC scores were calculated using pruned variants extracted by using the option --indep-pairwise 50 5 0.2 in plink (v1.90b3.44) [18]. 
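The post-imputation exclusion rules listed above can be written down as a small filter. The following is only an illustrative sketch: the function names and the dictionary layout of the per-variant statistics are hypothetical and are not part of plink or of the study's pipeline; only the four thresholds (MAF 0.05, missing rate 0.03, HWE P-value 0.001, INFO 0.3) are taken from the text.

```python
# Post-imputation thresholds quoted in the text.
MIN_MAF = 0.05           # exclude variants with MAF below this value
MAX_MISSING_RATE = 0.03  # exclude variants with a missing rate above this value
MIN_HWE_P = 0.001        # exclude variants with an HWE test P-value below this value
MIN_INFO = 0.3           # exclude variants with imputation INFO below this value


def passes_post_imputation_qc(maf, missing_rate, hwe_p, info):
    """Return True when a variant survives all four exclusion rules."""
    return (
        maf >= MIN_MAF
        and missing_rate <= MAX_MISSING_RATE
        and hwe_p >= MIN_HWE_P
        and info >= MIN_INFO
    )


def filter_variants(variants):
    """Keep the variants that pass QC, preserving input order.

    `variants` is a list of dicts with the hypothetical keys
    'id', 'maf', 'missing_rate', 'hwe_p' and 'info'.
    """
    return [
        v for v in variants
        if passes_post_imputation_qc(v["maf"], v["missing_rate"], v["hwe_p"], v["info"])
    ]


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the study.
    toy = [
        {"id": "var_ok", "maf": 0.30, "missing_rate": 0.00, "hwe_p": 0.50, "info": 0.95},
        {"id": "var_rare", "maf": 0.01, "missing_rate": 0.00, "hwe_p": 0.50, "info": 0.95},
        {"id": "var_low_info", "maf": 0.30, "missing_rate": 0.00, "hwe_p": 0.50, "info": 0.20},
    ]
    print([v["id"] for v in filter_variants(toy)])
```

In practice these filters would normally be applied by the genotype toolchain itself; the sketch only makes the direction of each inequality explicit.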
Among the participants, 15 had no information on smoking status, and GC status was unidentified in 1 participant. Finally, 968 participants (527 GC cases and 441 non-cancer controls) and 4,962,361 variants were used for the analysis. There were 761 H. pylori-positive participants (454 cases and 307 controls) and 207 H. pylori-negative participants (73 cases and 134 controls; Suppl. Fig. S1). Among the 527 GC cases, 323 had intestinal type cancer (271 H. pylori-positives and 52 negatives), and 197 had diffuse type cancer (176 H. pylori-positives and 21 negatives). For genomic data analysis, plink (v1.90b3.44), ONETOOL (v1.0) [19], and R (v3.6.3) were used. The H-PEACE cohort consisted of participants who were enrolled in Seoul National University Hospital Gangnam Center for health check-ups from 2003 to 2017 [20]. Among them, 8000 and 2349 participants were genotyped with the Affymetrix Axiom Korean Chip (v1.0) at two different times. After combining both genotype datasets, we followed the same QC and imputation steps as those stated above and additionally excluded participants who had no information on H. pylori infection or were older than 65 years with GC. After QC, 7812 participants (21 GC cases and 7791 controls) remained. Among them, 3975 individuals were H. pylori-positive (13 GC cases and 3,962 controls). Logistic regression for the variant that was significant in the SNUBH cohort was conducted with the same covariates as the GWAS of the SNUBH cohort, except smoking status. The UK Biobank (UKBB) is a well-known collection of paired genotype and phenotype data from half a million participants (https://www.ukbiobank.ac.uk/). We extracted self-reported cancer status (field id: 20001; f.20001), age (f.21022), smoking status (f.20116), and genotype data from the data collection. Participants who requested withdrawal or had cancer other than GC were excluded. 
Thereafter, 182 GC and 487,097 non-cancer participants were used to validate the association between GC risk and the variant that was significant in the GWAS of SNUBH, by using the same logistic model as the GWAS. In all subjects, ten biopsy specimens were obtained during endoscopy for histological analysis, a Campylobacter-like organism test (CLOtest), and culture to determine the presence of current H. pylori infection. This methodology has been presented previously [21]. In brief, two biopsy specimens from the greater curvature side of the antrum and two from the corpus were fixed in formalin to assess the presence of H. pylori by modified Giemsa staining and the degree of inflammatory cell infiltration, atrophy and intestinal metaplasia by hematoxylin and eosin staining. These histologic features of the gastric mucosa were recorded using the updated Sydney scoring system ("0," none; "1," mild; "2," moderate; and "3," marked) [22]. Another specimen from each of the lesser curvature of the antrum and the corpus was used for rapid urease testing (CLOtest, Delta West, Bentley, Australia), and four specimens (two from the antrum and two from the corpus) were used for the culture. The organisms present were identified as H. pylori by Gram staining; colony morphology; and positive oxidase, catalase, and urease reactions [23]. The remaining biopsy specimens of non-cancerous tissue from the antrum and corpus and of GC tissues were immediately frozen at −70 °C. Fasting serum samples were collected from the study participants at baseline. For H. pylori serology testing, specific immunoglobulin G for H. pylori was identified by an enzyme-linked immunosorbent assay in each subject's serum (Genedia H. pylori ELISA; Green Cross Medical Science Corp., Eumsung, Korea); the Korean strain was used as an antigen in this H. pylori antibody test [24]. 
In addition, the serum concentrations of pepsinogen (PG) I and II were measured using a latex-enhanced turbidimetric immunoassay (Shima Laboratories, Tokyo, Japan). In this study, no atrophy was defined as PG I > 70 and PG I/II ratio > 4.0, and definite atrophy was defined as PG I/II ratio < 2.5, which are stricter cutoff values than those in a previous report [24]. The H. pylori-positive group includes both current (or active) and past infections. In this study, H. pylori serology and previous history of H. pylori eradication, as well as CLOtest and histology, were comprehensively checked to confirm H. pylori infection. Subjects with a history of H. pylori eradication without current evidence of H. pylori infection were considered to have been exposed to H. pylori. Total RNA was extracted from body specimens of the gastric mucosa using Trizol Reagent (Invitrogen, Carlsbad, CA, USA). RNA samples were diluted to a final concentration of 0.5 mg/mL in RNase-free water and stored at −80 °C until use. Total RNA (1000 ng) was reverse transcribed into complementary DNA (cDNA) using the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Foster City, CA, USA) according to the manufacturer's instructions. Quantitative PCR (qPCR) was performed in 96-well reaction plates by using 2 μL of cDNA in a 20-μL reaction mixture containing 2 × SYBR Premix Ex Taq (Takara Bio, Otsu, Japan). Samples were amplified using the StepOnePlus Real-Time PCR System (Applied Biosystems). The thermal profile consisted of an initial denaturation at 95 °C for 3 min followed by 40 cycles at 95 °C for 5 s and at 60 °C for 30 s. The cycle threshold (Ct) value of the target genes was normalized to that of the housekeeping gene to obtain the delta Ct (ΔCt). The mRNA expression levels of the target genes were compared with those of the endogenous control β-actin by using the 2^(−ΔΔCt) method. 
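As a worked illustration of the 2^(−ΔΔCt) calculation named above, the sketch below computes a relative expression value from four Ct readings. The function name and the example Ct values are hypothetical; which sample serves as the calibrator is an assumption of the example, since the text does not specify it.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression of a target gene by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed per sample
    ddCt = dCt(sample) - dCt(calibrator)
    fold = 2 ** (-ddCt)
    """
    dct_sample = ct_target_sample - ct_ref_sample
    dct_calibrator = ct_target_calibrator - ct_ref_calibrator
    ddct = dct_sample - dct_calibrator
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical Ct values: target vs. a reference gene such as beta-actin.
    # Sample:     dCt = 24 - 18 = 6
    # Calibrator: dCt = 26 - 19 = 7
    # ddCt = 6 - 7 = -1, so the fold change is 2 ** 1 = 2.0
    print(relative_expression(24.0, 18.0, 26.0, 19.0))
```

A lower ΔCt means the target amplifies earlier relative to the reference gene, so a negative ΔΔCt corresponds to higher expression in the sample than in the calibrator.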
Real-time qPCR was conducted using custom-made primers as follows: β-actin, forward 5′-TTC GAG CAA GAG ATG GCC AC-3′ and reverse 5′-CGG ATG TCC ACG TCA CAC TT-3′; prohibitin (PHB1), forward 5′-CGG AGA GGA CTA TGA TGA G-3′ and reverse 5′-GGT CAG ATG TGT CAA GGA-3′; zinc finger protein 652 (ZNF652), forward 5′-GTT TCA GTA CAA GTA CCA GC-3′ and reverse 5′-AGA TAA AGG GTT TCT CTC CAG-3′; and speckle-type POZ protein (SPOP), forward 5′-TGA CCA CCA GGT AGA CAG CG-3′ and reverse 5′-CCC GTT TCC CCC AAG TTA-3′ [25]. The association between the estimated expression levels of genes and variants or GC risk was estimated using linear regression with adjustments for the effects of the variables used for GWAS analysis. The paraffin sections of the 16 controls (10 without H. pylori infection and 6 with active H. pylori infection) and 17 cases (9 without H. pylori infection and 8 with H. pylori infection) were subjected to immunohistochemical (IHC) staining for prohibitin and ZNF652. Tumor tissue sections (3 or 4 mm thick) were deparaffinized in xylene and rehydrated in a graded ethanol series. Epitope retrieval was performed in a citrate buffer (pH 6.0) in humid heat in a pressure cooker. Thereafter, the tissue sections were incubated with a primary mouse monoclonal antibody against prohibitin (MA5-12855, Thermo Fisher Scientific, 3747 N. Meridian Road, Rockford, IL 61105, USA) and ZNF652 (ZNF652 antibody, NBP1-97753, Novus Biologicals, 10730 E. Briarwood Avenue, Centennial, CO 80112, USA). Sites of immunoreactivity were visualized using a SuperPicture Polymer Detection Kit (Invitrogen, USA). The slides were viewed by light microscopy by using a Nikon Eclipse E600 microscope (Nikon, USA) equipped with a digital Nikon DSM1200F camera (Nikon, USA). IHC results were classified as absent (0), mild (1), moderate (2), and marked (3) according to the intensity of immunoreactivity (Suppl. Fig. S2). 
We used a publicly available transcriptome dataset, namely GSE79973, to target genes associated with GC [26]. It is an mRNA expression microarray dataset of paired GC tissue and adjacent non-tumor tissue from ten patients. The limma (v3.42.2) R package was used to analyze the data, with the duplicateCorrelation function used to adjust for sample correlation [27, 28]. Among the genes regulated by the GH17J049387 enhancer, where a significant variant is located, HSALNG0117178, ENSG00000248714, ENSG00000207127, ENSG00000250948, and ENSG00000262039 were not included in the data analysis. If there were more than two probe IDs corresponding to one gene ID, the most significant probe ID was chosen. The mediation effect of a gene between a variant and GC risk was evaluated using the Sobel test with bootstrapping [29]. To calculate the effect, the following two regression models were examined: a model of the association between the expression level of a gene and a variant, and a model of the association between GC risk and the expression level of a gene with the variant as a covariate. When using all samples, H. pylori infection status was added as a covariate to both regression models. The standard error of the mediation effect was estimated by resampling 100,000 times with replacement. Genome-wide complex trait analysis (v1.92.0; GCTA) was used to calculate the polygenic risk score (PRS) of the 968 participants [30]. We excluded rs2671655 and its highly correlated variants (r² > 0.2) within the 1 Mb region. Thereafter, the genomic relationship matrix (GRM) was estimated using only pruned variants with MAF larger than 0.1. The pruned variants were extracted using the --indep-pairwise 50 10 0.2 option in plink. With the GRM, the PRS was calculated without rs2671655 and its highly correlated variants. To adjust for ascertainment bias, 0.00298 was used as the prevalence value [31], and sex, age, smoking status and the top four PC scores were used as covariates. 
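Returning to the mediation analysis described above, the sketch below shows one way a bootstrapped Sobel-type test can be assembled: path a is the effect of the variant on expression, path b is the effect of expression on the outcome with the variant as a covariate, the indirect effect is a·b, and its standard error is taken from resampling participants with replacement. This is a simplified illustration under stated assumptions, not the authors' implementation: both paths are fitted by ordinary least squares (the study's outcome model would be logistic and included further covariates), and the function names are hypothetical.

```python
import math
import random


def _cross(x, y):
    """Centered sum of cross-products of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))


def indirect_effect(genotype, mediator, outcome):
    """Indirect effect a*b from two least-squares fits.

    a: slope of mediator on genotype.
    b: partial slope of outcome on mediator, adjusting for genotype.
    """
    s_gg = _cross(genotype, genotype)
    s_gm = _cross(genotype, mediator)
    s_mm = _cross(mediator, mediator)
    s_gy = _cross(genotype, outcome)
    s_my = _cross(mediator, outcome)
    a = s_gm / s_gg
    b = (s_my * s_gg - s_gm * s_gy) / (s_mm * s_gg - s_gm ** 2)
    return a * b


def bootstrap_mediation(genotype, mediator, outcome, n_resamples=2000, seed=0):
    """Return (estimate, bootstrap SE, z, two-sided P) for the indirect effect."""
    rng = random.Random(seed)
    n = len(genotype)
    estimate = indirect_effect(genotype, mediator, outcome)
    draws = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        g = [genotype[i] for i in idx]
        m = [mediator[i] for i in idx]
        y = [outcome[i] for i in idx]
        try:
            draws.append(indirect_effect(g, m, y))
        except ZeroDivisionError:
            continue  # skip degenerate resamples (e.g. a constant genotype)
    if len(draws) < 2:
        raise ValueError("too few usable bootstrap resamples")
    mean = sum(draws) / len(draws)
    se = math.sqrt(sum((d - mean) ** 2 for d in draws) / (len(draws) - 1))
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P-value
    return estimate, se, z, p
```

The study resampled 100,000 times; the default of 2,000 here is chosen only to keep a pure-Python example fast.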
For genes located close to previously identified variants, gene set analyses were conducted with the SNUBH cohort [6] by using the optimal sequence kernel association test [32]. The effects of the variables used for GWAS were adjusted for, and H. pylori infection status was included as a covariate when using all samples. Variants located in each gene's genomic location and within its 0.2 Mb flanking region were selected. We also summarized the variants that had the lowest P-value in the region for each gene from the GWAS, and Holm-Bonferroni-adjusted P values were calculated [33]. Continuous variables were presented as mean ± standard deviation, and categorical variables were presented as numbers with proportions. For categorical variables, the χ² test was used for analysis. The differences between the groups were analyzed using Student's t test when there were only two groups or one-way ANOVA when there were more than two groups. All statistical analyses were performed using R version 3.2.3 (The R Foundation for Statistical Computing, Vienna, Austria; http://www.r-project.org). Statistical significance was set at P values < 0.05. Table 1 presents the baseline characteristics of the SNUBH cohort with 968 subjects (527 GC cases and 441 controls). GC patients were older and predominantly male, and they had a higher proportion of H. pylori positivity, smoking, alcohol drinking, and high-salt intake than the control group (P < 0.001). Education level was significantly different between the cases and controls (P < 0.001). By using the SNUBH cohort, a GWAS was conducted to elucidate the genomic loci associated with GC risk. For a GWAS with all participants, there was no significant variant at a significance level of P < 5 × 10⁻⁸ (Suppl. Fig. S3). The most significant variant was rs2484529 (chr10:13,338,730; hg19) and its P value was 2.00 × 10⁻⁷ (Suppl. Table S1). We divided the cohort into two groups according to H. 
pylori infection status and conducted a GWAS separately for both datasets. The H. pylori-positive group consisted of 454 GC cases and 307 controls. GWAS with H. pylori-positive participants identified a significant locus where rs2671655 had the lowest P-value (P = 4.08 × 10⁻⁸, OR = 2.03, 95% CI = 1.57-2.61, Fig. 1A and Suppl. Table S2). Type-1 error was well controlled in this analysis (genomic inflation factor = 1.015), i.e., the P-value was reliable (Fig. 1B). The variant is located on chromosome 17q21.33 (chr17:47,468,020; hg19) (Fig. 1C), and its risk and protective alleles are T and C, respectively. The risk allele frequency of the variant was 0.708 and 0.599 in the cases and controls, respectively, and the risk allele frequency in the controls was similar to that in the Korean population (0.601; Table 2). The variant was not significantly associated with GC risk in the H. pylori-negative group (P = 0.929, OR = 1.02, 95% CI = 0.64-1.64; Table 2) and did not reach genome-wide significance in the entire cohort (P = 1.96 × 10⁻⁶, OR = 1.70, 95% CI = 1.37-2.11; Table 2). The H. pylori-negative group consisted of 73 GC cases and 134 controls, and a GWAS with these participants was conducted. There was no significant variant from the GWAS of the H. pylori-negative group (n = 207, Suppl. Fig. S4). The variant with the lowest P-value was rs169356 (chr9:34,782,550; hg19), and its P-value was 5.35 × 10⁻⁶ (Suppl. Table S3). Then, we conducted GWAS taking into consideration Lauren classification and H. pylori infection (Suppl. Fig. S5, Suppl. Tables S4-S8). In the case of intestinal type GC, no variant reached significance (P < 5 × 10⁻⁸). However, a significant association of rs2671655 was identified (P = 4.09 × 10⁻⁸, OR = 2.53, 95% CI = 1.82-3.52) when diffuse type GC patients were compared with controls in the H. pylori-positive group (176 cases and 307 controls, Suppl. Table S8). The association between rs2671655 and GC risk was replicated in two independent cohorts, namely H-PEACE and UKBB. 
In the H-PEACE cohort after QC, the number of participants infected by H. pylori was 3,975, of which 13 participants had GC. Although the number of cases was small, the risk allele frequency of the variant in cases (0.731) was larger than that in controls (0.651; one-tail P = 0.150, OR = 1.59, 95% CI = 0.66-3.83, Suppl. Table S9). The UKBB cohort had H. pylori infection information for only approximately 1% of the total subjects. Thus, we compared GC cases (n = 182) and non-GC subjects (n = 444,614) regardless of H. pylori infection. Logistic analysis showed a significant association between rs2671655 and GC risk at a nominal P-value of 0.05 (one-tail P = 0.041, OR = 1.61, 95% CI = 0.94-2.74), and the allele frequencies of the variant were 0.961 and 0.940 in the cases and controls, respectively (Suppl. Table S9). rs2671655 is located in the GH17J049387 enhancer, which targets several genes such as PHB1, ZNF652, SPOP, and lysine acetyltransferase 7 (KAT7). PHB1 is physically closest to rs2671655, and 256 cases and 159 controls were randomly selected from the SNUBH cohort to measure the expression level of PHB1 by using non-cancerous gastric mucosae with reverse transcription qPCR (RT-qPCR). As the number of risk alleles for rs2671655 increased, PHB1 was significantly more expressed both in the H. pylori-positive group (β = 0.530, P = 1.34 × 10⁻³) and in the H. pylori-negative group (β = 0.804, P = 1.35 × 10⁻³) (Fig. 2A). Figure 2B shows that PHB1 was significantly more expressed in participants with GC than in those without GC regardless of H. pylori infection (H. pylori-positive group: β = 0.604, P = 1.82 × 10⁻²; H. pylori-negative group: β = 0.530, P = 3.78 × 10⁻²). We analyzed GSE79973, which is a published expression microarray dataset, to target other differentially expressed genes between GC and normal tissues. 
The results showed that among the genes regulated by GH17J049387, ZNF652, SPOP and KAT7 were significantly associated with GC at an FDR level of 0.1 (Suppl. Table S10). RT-qPCR experiments were conducted to measure the expression levels of KAT7, ZNF652, and SPOP using non-cancerous gastric mucosae from the same subjects prepared for PHB1 expression levels. However, the expression level of KAT7 was not estimated by RT-qPCR. ZNF652 was significantly more expressed as the number of risk alleles of rs2671655 increased in the H. pylori-positive group (β = 0.300, P = 4.08 × 10⁻²), but its expression level did not depend on the number of risk alleles in the H. pylori-negative group (β = 0.027, P = 0.888; Fig. 3A). There was no difference in ZNF652 expression levels between the cases and controls in the H. pylori-positive group (β = 0.206, P = 0.357), but a significant difference was observed in the H. pylori-negative group (β = 0.773, P = 6.56 × 10⁻³; Fig. 3B). We also evaluated the association between SPOP expression levels and rs2671655. As the number of risk alleles increased, SPOP was more expressed in both the H. pylori-positive group (β = 0.649, P = 2.07 × 10⁻²) and the H. pylori-negative group (β = 0.729, P = 6.04 × 10⁻⁵) (Fig. 4A). In addition, an increase in SPOP in GC was observed only in the H. pylori-negative group (β = 1.348, P = 1.46 × 10⁻³; Fig. 4B). The expression levels of PHB1, ZNF652 and SPOP were correlated (P < 0.05) in both the H. pylori-positive group and the H. pylori-negative group (Suppl. Tables S11 and S12). PCA was conducted to take their correlations into account. By using the expression levels of PHB1, ZNF652, and SPOP, PC scores were estimated for the H. pylori-positive group and the H. pylori-negative group. Thereafter, the association between GC risk and the top three PC scores was tested after adjusting for the effect of rs2671655 and the variables used in GWAS. In the H. 
pylori-positive group, GC risk significantly increased as the second PC (PC2) score increased (β = 0.236, P = 3.42 × 10⁻²; Suppl. Fig. S6A, S6B and S6C). PC2 increased as PHB1 became more expressed and the other two genes became less expressed (Suppl. Fig. S6D, S6E and S6F). However, in the H. pylori-negative group, GC risk was significantly associated with only the first PC (PC1) score (β = 0.349, P = 1.55 × 10⁻³; Suppl. Fig. S7A, S7B and S7C), and PC1 increased as the expression of the three genes increased (Suppl. Fig. S7D, S7E and S7F). Immunohistochemistry (IHC) was performed for PHB1 and ZNF652 on the non-cancerous gastric mucosae of the controls (n = 16) and GC cases (n = 17). PHB1 was more expressed in the gastric mucosae of GC cases than in those of the controls, regardless of H. pylori infection (P < 0.05; Suppl. Fig. S8A). There was no difference in the ZNF652 expression scores between the controls and GC cases in the H. pylori-positive group, but ZNF652 expression was significantly increased in the GC cases compared with the controls in the H. pylori-negative group (P < 0.05; Suppl. Fig. S8B). These findings were comparable with the RT-qPCR findings. However, with the small number of samples, no significant association between IHC scores and the number of rs2671655 T alleles could be observed for PHB1 (Suppl. Fig. S8C) or ZNF652 (Suppl. Fig. S8D), regardless of H. pylori infection status. Mediation analysis was conducted by using the Sobel test with bootstrapping to investigate whether the effect of rs2671655 on GC is mediated by gene expression. When using all samples, significant mediation effects of PHB1 and SPOP were found (mediation effect [m] = 2.261 and P = 2.38 × 10⁻² for PHB1, and m = 2.135 and P = 3.28 × 10⁻² for SPOP). ZNF652 did not have a significant mediation effect (m = 1.110, P = 0.267; Suppl. Table S8). According to the subgroup analysis stratified by H. 
pylori infection status, similar mediation effects were found only in PHB1 (1.529 for the H. pylori-positive group and 1.774 for the H. pylori-negative group; Suppl. Table S13). We calculated the PRS of participants by using pruned variants, excluding rs2671655 and its highly correlated variants (r² > 0.2). The PRS of H. pylori-positive GC cases increased as the number of risk alleles of rs2671655 increased (β = 2.35 × 10⁻⁹, P = 1.12 × 10⁻²), but there was no difference in the PRS of H. pylori-negative GC cases according to the number of risk alleles (β = −1.59 × 10⁻⁹, P = 0.392) (Fig. 5). The PRS of H. pylori-positive GC cases was significantly larger than that of H. pylori-negative GC cases if participants had two risk T alleles of rs2671655 (β = 5.39 × 10⁻⁹, P = 5.44 × 10⁻³; Fig. 5). However, there were no significant differences between the H. pylori-positive and H. pylori-negative groups when the number of risk alleles was one or zero. On the basis of the GWAS results in the SNUBH cohort, gene set analysis was performed for the nine previously identified genes (MUC1, ZBTB20, PRKAA1, LINC02161, UNC5CL, LRFN2, PSCA, PLCE1, and ATM). For all participants, the associations of MUC1 and PRKAA1 were replicated (Suppl. Table S14). In the H. pylori-positive group, only MUC1 was significantly associated with GC risk (Suppl. Table S15). On the other hand, PRKAA1 and PSCA were significant in the H. pylori-negative group (Suppl. Table S16). There was no significant variant associated with GC in the genomic location of the genes at the significance level of FWER 0.05. Although H. pylori infection plays an important role in the pathogenesis of GC, only 2%-3% of infected individuals develop GC, thus suggesting the possible importance of host genetics in the development of GC. Moreover, the interaction between H. pylori and host genetic factors, rather than H. pylori itself, may further increase GC risk. However, there have been no GWAS that consider the interactions between host genetics and H. 
pylori in gastric carcinogenesis so far. Our results can help explain the mechanism by which GC occurs only in some H. pylori-infected individuals and can help identify a high-risk population for GC among H. pylori-infected individuals. In this study, we presumed that the effect of SNPs differs by H. pylori infection status, and a GWAS was conducted with Korean participants stratified by H. pylori infection status. We found that rs2671655 was associated with GC risk and that its effect was modified by H. pylori (Fig. 1 and Table 2). rs2671655 is located on chromosome 17, which is one of the chromosomes that most commonly exhibit numerical aberrations in GC [34]. When GWAS was conducted according to Lauren classification, rs2671655 modified the risk of diffuse type cancer in the H. pylori-positive group (P = 4.09 × 10⁻⁸, OR = 2.53, 95% CI = 1.82-3.52, Suppl. Table S8). However, the rs2671655 T allele also increased the risk of intestinal type cancer in the H. pylori-positive group, although it did not reach a genome-wide significance level (P = 2.87 × 10⁻⁴, OR = 1.75, 95% CI = 1.29-2.36). Therefore, it is necessary to confirm this through a larger-scale population-based GWAS. We tried several approaches to determine how the rs2671655 polymorphism can affect the development of GC in H. pylori-infected individuals. First, rs2671655 is located in an enhancer region, i.e., GH17J049387; thus, we targeted PHB1, SPOP, and ZNF652 among the genes it regulates. RT-qPCR experiments showed that PHB1 was expressed more in the GC cases than in the controls, and SPOP and ZNF652 showed the same pattern only in H. pylori-negative individuals, thus implying that SPOP and ZNF652 might not function properly in the presence of H. pylori (Figs. 2, 3, and 4). When adjusting for their correlation, the patterns did not change according to the PCA (Figures S5 and S6). IHC experiments also showed similar results (Suppl. Fig. S8). Figure 6 shows the pathway that can be proposed from the results of this study. PHB1 seems to act as an oncogene, and SPOP and ZNF652 act as tumor suppressors. H. pylori cannot hinder the function of PHB1 but can hinder the function of SPOP and ZNF652. For H. pylori-infected individuals, all three genes become more expressed as the number of rs2671655 risk alleles increases, and the function of PHB1 may be uniquely active. For H. pylori-negative individuals, however, the effects of the oncogene and tumor suppressors are offset; therefore, the number of rs2671655 risk alleles appears not to be associated with GC risk.
Fig. 5 Polygenic risk score (PRS) comparison. PRS estimates were compared within each of the two groups and between the H. pylori-positive and -negative groups, depending on the number of T alleles of rs2671655. Black lines represent association results within the H. pylori-positive or -negative subgroup, and colored lines represent comparisons between H. pylori-positive and -negative participants who have the same number of rs2671655 risk alleles. β, estimated association; P, P value; NS, not significant.
SPOP is known as a tumor suppressor gene in GC [35]. However, the effect might be attenuated by H. pylori infection. In fact, H. pylori infection promotes both the development and metastasis of GC by increasing miRNA-543 expression, and miRNA-543 downregulates SPOP expression [25, 36]. Furthermore, in gastric carcinogenesis, the Hedgehog (Hh) signaling pathway is regulated by SPOP, which suppresses the translocation of the transcription factor Gli2 from the cytoplasm to the nucleus [37]; H. pylori increases the expression of one of the Hh ligands, Sonic Hedgehog [38], which activates the binding of Hh ligands to PTCH1 and moves the Gli family from the cytoplasm to the nucleus [39]. Although the role of ZNF652 has not been established in gastric carcinogenesis, H. pylori may reduce the effects of ZNF652 via miR-155. H. 
pylori has been reported to facilitate tumor growth via the induction of miR-155, and ZNF652 is upregulated following miR-155 inhibition in malignant T cells [40, 41] . By contrast, the role of PHB1 in GC is controversial [42] . Some previous studies have described an increase in PHB1 expression in GC samples [43] [44] [45] , but other studies have reported PHB1 downregulation in GC [46, 47] . If PHB1 is not an oncogene, rs2671655 might increase GC risk via other pathways excluding the three genes, as shown in the PRS analysis (Fig. 5) . Therefore, it is difficult to fully elucidate the mechanism of how rs2671655 modifies GC risk in H. pylori-infected individuals with the results of this study alone. Further studies are necessary to clarify these findings. We performed a gene set analysis for the previously reported genes including MUC1, ZBTB20, PRKAA1, LINC02161, UNC5CL, LRFN2, PSCA, PLCE1, and ATM (Suppl. Tables S14-S16). We found that GC risk differed according to H. pylori infection even for reported genes. That is, only MUC1 was significantly associated with GC risk in the H. pylori-positive group (Suppl. Table S15), whereas PRKAA1 and PSCA were significantly associated with GC in the H. pylori-negative group (Suppl . Table S16 ). There was no significant variant associated with GC in the genomic location of the genes at the significance level of FWER 0.05. This was probably attributed to a relatively small sample size of this study population. Also, previous studies did not consider the interaction between H. pylori infection and host genetics, which might explain why the results from these studies were inconsistent. In addition to PHB1, SPOP, and ZNF652, GH17J049387 enhancer can also regulate gene expression of KAT7 and G protein subunit γ transducin 2 (GNGT2) (Suppl . Table S10 ). KAT7 may affect gastric carcinogenesis by engaging in histone modification via circMRPS35/KAT7/FOXO1/3a pathway [48] . 
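As a rough consistency check on the Lauren-stratified results reported earlier in this discussion, the odds ratios and 95% confidence intervals for rs2671655 in the H. pylori-positive group can be converted back into approximate Wald z statistics and two-sided P values. The sketch below assumes the intervals are symmetric Wald intervals on the log-odds scale, which the text does not state explicitly; the function name `wald_p_from_or_ci` is illustrative and is not part of the study's analysis pipeline. Small differences from the reported P values are expected because the published OR and CI values are rounded.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # threshold quoted in the text (P < 5 x 10^-8)


def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z and two-sided P value from an OR and its 95% CI.

    Assumption: the CI is a Wald interval on the log-odds scale,
    i.e. log(OR) +/- z_crit * SE.
    """
    beta = math.log(odds_ratio)
    # Half-width of the CI on the log scale equals z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided tail probability of a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# OR and 95% CI values reported in the text (H. pylori-positive group).
reported = {
    "diffuse": (2.53, 1.82, 3.52),
    "intestinal": (1.75, 1.29, 2.36),
}

results = {name: wald_p_from_or_ci(*vals) for name, vals in reported.items()}

for name, (z, p) in results.items():
    side = "below" if p < GENOME_WIDE_ALPHA else "above"
    print(f"{name}: z = {z:.2f}, approx. P = {p:.2e} ({side} 5e-8)")
```

Under these assumptions, the diffuse-type estimate gives an approximate P value on the order of 10⁻⁸, just under the genome-wide threshold, whereas the intestinal-type estimate gives one on the order of 10⁻⁴, in line with the values reported in the text.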
Also, candidate SNPs other than rs2671655 were identified in this study, although they did not reach the genome-wide significance level (P < 5 × 10⁻⁸). The gene closest to rs12889548, the second most significant SNP identified in the GWAS performed in the H. pylori-positive group (Suppl. Table S2), is TNFα-induced protein 2 (TNFAIP2), which is related to H. pylori-induced inflammation and GC risk [49, 50]. A recent study reported that downregulation of TNFAIP2 inhibited cancer cell proliferation and metastasis through activation of the Wnt/β-catenin signaling pathway [51].
Our study has the following limitations. First, H. pylori-infected participants might have been more heterogeneous in our study than in previous studies because we defined H. pylori infection in a broad sense. However, we could not account for this heterogeneity, such as differences in the duration or intensity of H. pylori exposure, which makes it difficult to estimate the true effect of H. pylori infection on GC. Second, it was not possible to conduct a GWAS according to the location of GC (cardia vs. non-cardia cancer), because only 31 of 527 (5.9%) cases were cardia cancer. Finally, although we hypothesized a mechanism linking rs2671655 to GC risk through PHB1, SPOP, and ZNF652 (Fig. 6), this hypothesis needs to be tested in other comprehensive studies.
In conclusion, the rs2671655 polymorphism may play a role in H. pylori-associated gastric carcinogenesis, especially in diffuse-type GC. The variant may regulate PHB1, SPOP, and ZNF652, which affect susceptibility to GC, and their effects differ depending on H. pylori infection. Moreover, rs2671655 might affect other genes that cause GC in H. pylori-infected individuals. These findings provide new insights into the pathogenesis of H. pylori and might help in the search for novel therapeutics for the treatment of GC.
Fig. 6 Hypothetical pathway
The online version contains supplementary material available at https://doi.org/10.1007/s10120-022-01285-x.
[1] Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012
[2] Gastric cancer: descriptive epidemiology, risk factors, screening, and prevention
[3] Global prevalence of Helicobacter pylori infection: systematic review and meta-analysis
[4] What is the quantitative risk of gastric cancer in the first-degree relatives of patients? A meta-analysis
[5] Environmental and heritable factors in the causation of cancer-analyses of cohorts of twins from Sweden, Denmark, and Finland
[6] Resolving gastric cancer aetiology: an update in genetic predisposition
[7] Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer
[8] Identification of new susceptibility loci for gastric non-cardia adenocarcinoma: pooled results from two Chinese genome-wide association studies
[9] Individual and joint contribution of family history and Helicobacter pylori infection to the risk of gastric carcinoma
[10] Stomach cancer risk in gastric cancer relatives: interaction between Helicobacter pylori infection and family history of gastric cancer for the risk of stomach cancer
[11] Environmental factors, seven GWAS-identified susceptibility loci, and risk of gastric cancer and its precursors in a Chinese population
[12] Helicobacter pylori infection synergizes with three inflammation-related genetic variants in the GWASs to increase risk of gastric cancer in a Chinese population
[13] Interleukin-1B 31 C>T polymorphism combined with Helicobacter pylori-modified gastric cancer susceptibility: evidence from 37 studies
[14] The Korea biobank array: design and identification of coding variants associated with blood biochemical traits
[15] SNP genotype calling and quality control for multi-batch-based studies
[16] Data quality control in genetic case-control association studies
[17] Next-generation genotype imputation service and methods
[18] PLINK: a tool set for whole-genome association and population-based linkage analyses
[19] ONE-TOOL for the analysis of family-based big data
[20] Health and Prevention Enhancement (H-PEACE): a retrospective, population-based cohort study conducted at the Seoul National University Hospital Gangnam Center
[21] Changes in aberrant DNA methylation after Helicobacter pylori eradication: a long-term follow-up study
[22] Classification and grading of gastritis. The updated Sydney system. International workshop on the histopathology of gastritis
[23] Favorable outcomes of rescue second- or third-line culture-based Helicobacter pylori eradication treatment in areas of high antimicrobial resistance
[24] Helicobacter pylori and molecular markers as prognostic indicators for gastric cancer in Korea
[25] miRNA-543 promotes cell migration and invasion by targeting SPOP in gastric cancer
[26] Overexpression of HS6ST2 is associated with poor prognosis in patients with gastric cancer
[27] Limma powers differential expression analyses for RNA-sequencing and microarray studies
[28] Use of within-array replicate spots for assessing differential expression in microarray experiments
[29] Asymptotic confidence intervals for indirect effects in structural equation models
[30] GCTA: a tool for genome-wide complex trait analysis
[31] Cancer statistics in Korea: incidence, mortality, survival, and prevalence in 2016
[32] Optimal tests for rare variant effects in sequencing association studies
[33] A simple sequentially rejective multiple test procedure
[34] Cytogenetic and molecular aspects of gastric cancer: clinical implications
[35] The emerging role of SPOP protein in tumorigenesis and cancer therapy
[36] SIRT1-targeted miR-543 autophagy inhibition and epithelial-mesenchymal transition promotion in Helicobacter pylori CagA-associated gastric cancer
[37] SPOP suppresses tumorigenesis by regulating Hedgehog/Gli2 signaling pathway in gastric cancer
[38] Hedgehog signaling in the stomach
[39] The role of hedgehog signaling in gastric cancer: molecular mechanisms, clinical potential, and perspective
[40] MicroRNAs in the pathogenesis, diagnosis, prognosis and targeted treatment of cutaneous T-cell lymphomas. Cancers (Basel)
[41] Induction of microRNA-155 during Helicobacter pylori infection and its negative regulatory role in the inflammatory response
[42] Prohibitin expression deregulation in gastric cancer is associated with the 3' untranslated region 1630 C>T polymorphism and copy number variation
[43] The differential proteome profile of stomach cancer: identification of the biomarker candidates
[44] Diverse proteomic alterations in gastric adenocarcinoma
[45] The proteomics approach to find biomarkers in gastric cancer
[46] MicroRNA-27a functions as an oncogene in gastric adenocarcinoma by targeting prohibitin
[47] Identification of tumor markers using two-dimensional electrophoresis in gastric carcinoma
[48] CircMRPS35 suppresses gastric cancer progression via recruiting KAT7 to govern histone modification
[49] Correlation between TNFAIP2 gene polymorphism and prediction/prognosis for gastric cancer and its effect on TNFAIP2 protein expression
[50] The miR-184 binding-site rs8126 T>C polymorphism in TNFAIP2 is associated with risk of gastric cancer
[51] Downregulation of TNFAIP2 suppresses proliferation and metastasis in esophageal squamous cell carcinoma through activation of the Wnt/β-catenin signaling pathway
Data availability statement: Raw data were generated at SNUBH. Data that support the findings of this study are available from the corresponding author upon reasonable request.
The authors have no conflicts of interest to declare.