key: cord-0586987-oymszplb
authors: Zhao, Bangyao; Zhong, Yuan; Kang, Jian; Zhao, Lili
title: Bayesian learning of COVID-19 Vaccine safety while incorporating Adverse Events ontology
date: 2022-02-10
journal: nan
DOI: nan
sha: 16b4d5110c2d1237bbe4965389eea29393078ede
doc_id: 586987
cord_uid: oymszplb

While vaccines are crucial to end the COVID-19 pandemic, public confidence in vaccine safety has always been vulnerable. Many statistical methods have been applied to VAERS (Vaccine Adverse Event Reporting System) database to study the safety of COVID-19 vaccines. However, all these methods ignored the adverse event (AE) ontology. AEs are naturally related; for example, events of retching, dysphagia, and reflux are all related to an abnormal digestive system. Explicitly bringing AE relationships into the model can aid in the detection of true AE signals amid the noise while reducing false positives. We propose a Bayesian graphical model to estimate all AEs while incorporating the AE ontology simultaneously. We proposed strategies to construct conjugate forms leading to an efficient Gibbs sampler. Built upon the posterior distributions, we proposed a negative control approach to mitigate reporting bias and an enrichment approach to detect AE groups of concern. The proposed methods were evaluated using simulation studies and were further illustrated on studying the safety of COVID-19 vaccines. The proposed methods were implemented in R package textit{BGrass} and source code are available at https://github.com/BangyaoZhao/BGrass.

As of February 1st, 2022, coronavirus disease 2019 (COVID-19) has infected over 376 million people and caused over 5.67 million deaths world-wide. The pandemic caused both high public attention and serious public health risk, and vaccines are crucial to end the pandemic. The US Food and Drug Administration (FDA) approved two mRNA-based COVID-19 vaccines, the BNT162b2 (Pfizer-BioNTech) and the mRNA-1273 (Moderna) and the virus-based COVID-19 vaccine Ad26.COV2.S (Johnson & Johnson-Janssen) is still under emergency use authorization (EUA).

Despite the high safety standard of the FDA on COVID-19 vaccines, the public confidence in vaccines is always fragile. Due to the limited sample size, restrictive inclusion criteria, and limited duration of follow-up in the phase-3 trials, rare and serious outcomes associated with COVID-19 vaccines may not be identified. Besides, although theoretically proven to be safe, the two mRNA-based vaccines (BNT162b2 and mRNA-1273) are brought to the market for the first time, and their safety can only be testified by long-term public surveillance.

Vaccine Adverse Event Reporting System (VAERS), accepting spontaneous reports of suspected vaccine adverse events after administration of any vaccine licensed in the United States from 1990 to present, is an important data source to study vaccine safety. VAERS reports were received primarily from vaccine manufacturers, state and local health departments, and health-care providers, with fewer reports filed directly by patients and parents or others and are publicly available on the VAERS website www.vaers.hhs.gov/data/index.

VAERS is a key component in providing early warnings of vaccine safety problems. This large database (748,779 reports for the year 2021 alone) provides a valuable resource to detect safety problems (called "signals") related to COVID-19 vaccines. A signal is considered evidence that an adverse event (AE) might be caused by vaccination and warrants further investigation or action. Several AE signals have been identified after COVID-19 vaccines using VAERS data, including anaphylaxis and myocarditis after receipt of mRNA-based COVID-19 vaccines , Oster et al., 2022 , and Guillain-Barré syndrome and Cerebral Venous Sinus Thrombosis (CVST) after receipt of the Ad26.COV2.S vaccine .

As of December 24, 2021, the VAERS database included a total of 710295 AE case reports for all FDA-authorized or approved COVID-19 vaccines, including 319082 reports for the Pfizer COVID-19 vaccine, 326403 for the Moderna COVID-19 vaccine, 63224 for the Janssen COVID-19 vaccine, and 1586 for COVID-19 vaccines for which the manufacturer is unknown. VAERS data is commonly described by a large contingency table. In this table, each row represents a vaccine and each column represents an AE. Each cell in the table contains the number of reports that mention both that vaccine and that AE for a pre-defined period. Such a table can be very large, with hundreds of rows for vaccines and thousands of columns for AEs. Since we are interested in studying safety of the COVID-19 vaccines, the large table can be collapsed into a 2 × 2 table for each AE with rows corresponding to COVID-19 vaccines (Yes/No) and two columns for the AE (Yes/No). The two rows can also be COVID-19 vs influenza vaccines, or mRNA-based vaccines vs Janssen COVID-19 vaccine. The strength of the vaccine-AE association is commonly expressed in terms of a ratio, such as relative risk or Odds Ratio (OR) [Zhao et al., 2020] . A large OR (> 1) indicates a safety signal. For example, if the OR = 3 when comparing COVID-19 versus influenza vaccines, then the odds of the AE occurred after COVID-19 vaccines is three times more likely than the odds of AE after influenza vaccines.

Statistical methods used to test the vaccine-AE association include the chi-squared test, likelihood ratio test [Huang et al., 2011] , Gamma-Poisson Shrinker (GPS) model [DuMouchel, 1999] , Multi-item GPS model [DuMouchel and Pregibon, 2001] , and Bayesian confidence propagation neural network (BCPNN) [Bate et al., 1998 , Lansner et al., 2000 , Nóren et al., 2006 . However, none of the existing methods considered the inherent connections among AEs. AEs are naturally related. For example, events of retching, dysphagia and reflux are all related to an abnormal digestive system. Explicitly bringing AE relationships into the model will aid in the detection of true signals amid the noise while reducing false positives.

Several public resources are available to be used as ontology both to describe AEs and to form the basis of a statistical model. The largest resource for AE ontology is MedDRA (Medical Dictionary for Regulatory Activities), which describes disease relationships using a multi-level hierarchy. The second lowest term, "Preferred Terms" (PT), is a distinct descriptor for a symptom, sign and disease. Related PTs are grouped into "High Level Terms" (HLT), "High Level Group Terms" (HLGT) and the largest category "System Organ Classes" (SOC). Higher layers represent biologically and clinically meaningful categories for the disease observed on the lower levels. Multiple AEs may individually be rare enough to be undetected, but if they are related, they can borrow strength from each other to increase the chance for each individual AE to be detected. In this paper, we model data on the PT level and use the higher HLGT level to build connections between AEs on the PT level. Specifically, we create an interconnected network by linking two AE terms on the PT level if they belong to the same AE group on the HLGT level. This network naturally considers the complexity that each PT AE term might belong to multiple groups.

In this article, we use logistic regression to model the binary AE outcome as a function of the binary vaccine type and confounders and then obtain the adjusted OR. We may have hundreds or even thousands of adjusted ORs, and one for each AE. We propose to build connections between these adjusted ORs by converting the above AE relationship network into the prior distribution of these parameters. To perform simple and fast Bayesian inference, we used Pólya-gamma augmentation and re-parameterized the covariance to construct conjugate forms for efficient Gibbs sampling. Moreover, we proposed a negative control approach to mitigate the reporting bias in VAERS data and propose an approach to identify enriched AE groups where signaled AEs in the group occur more frequently than expected. The proposed methods were implemented in R package BGrass and source code are available at https://github.com/BangyaoZhao/BGrass.

The rest of the article is organized as follows. In section 2 we introduce notation and develop our proposed methods. In section 3 we present simulation studies. In section 4 we illustrate our methods in studying safety of COVID-19 vaccines using VAERS database. Concluding remarks are given in section 5.

Suppose there are n reports in VAERS that mentioned the vaccines of interests. For example, the report mentioned COVID-19 (target) or influenza (control) vaccines if we are interested in comparing these two types of vaccines. Of these reports, a total of J AEs were mentioned. Let A ij be the indicator of mentioning AE j on report i, i.e., A ij = 1 if AE j is mentioned on report i and 0 otherwise. We used a series of logistic regression models (one for each AE) to assess the associations between AEs and vaccines:

where δ j is the indicator variable for selecting the j th AE as a safety signal (δ j = 1 if logOR of j th AE is different from zero and δ j = 0 otherwise). Equation (2)-(5) show prior distributions. Bern(π) denotes the Bernoulli distribution with probability π, N (0, σ 2 α,l ) denotes normal distribution with variance σ 2 α,l , and IG(a α , b α ) denotes the inverse gamma distribution with shape parameter, a α , and scale parameter, b α .

We introduce correlation between logORs by assuming that β = (β 1 , ...β J ) follows a multivariate normal distribution with means zero and covariance matrix W Ω W . This covariance has the correlation component, Ω , and variance component W W, where W = diag(σ β,1 , ..., σ β,J ) is the diagonal matrix with its (j, j)-th element being the standard deviation of β * j . Correlation matrix, Ω , consists of the relation network described by the AE ontology (such as MedDRA) and a parameter, , controlling the degrees of information borrowing from the network. This parametrization of the covariance matrix greatly facilitates the posterior sampling (see details in section 2.2.1).

This proposed model (1)-(5) allows us to directly incorporate AE ontology into estimation of logORs while selecting AE signals. We call this model "Bayesian Graphical Model for Signal Selection" (BGrass). If we assign independent normal priors for β * j 's, BGrass reduces to a collection of separate logistic regression models with the selection feature, abbreviated as Bss.

Determination of Ω . The AE relation network can be represented by a undirected graph G(V, E), where V is the set of vertices that correspond to the J AEs, and E = {j ∼ k} is the set of edges where j ∼ k indicates AE j and k belong to the same AE group. Let d j represent the degree of vertex j, i.e. d j = J k=1 I(j ∼ k). Mathematically, we summarize the graph G by a normalized graph Laplacian matrix L.

We then add a small positive number to the diagonal of L to guarantee the positive definiteness (i.e., L + I J ), which is commonly used as the precision matrix [Chung, 1974 , Li and Li, 2008 , Sun and Li, 2010 . Finally, we convert this precision matrix to the correlation matrix Ω , and parameter controls the strength of information borrowing between similar AEs. When is small, the correlation matrix relies more on the AE relation network, L, thereby, inducing a stronger degrees of information borrowing. If goes to ∞, BGrass reduces to Bss, where logORs of AEs are assumed to be independent. In this paper, we consider as a tuning parameter and determine its value based the deviance information criterion (DIC). For example, we ran models with a wide range of , say {10 −3 , 10 −2 , 10 −1 , 10 1 , 1, 10, 100, ∞} and selected the model with the smallest DIC value. Once Ω is determined, we just need to estimate the vector of standard deviation parameters (σ β,1 , ..., σ β,J ) in W .

2.2.1 Construction of efficient Gibbs sampling for W It is extremely challenging to sampling W = diag(σ β,1 , ..., σ β,J ) as Ω is not a diagonal matrix. To perform efficient sampling, we rewrite β * j as β * j = β * * j σ β,j in equation (1) and β * * ∼ N (0, Ω ) in equation (2), such that equation (1) and (2) become

This reparametrization is mathematically equivalent to the original equation (1)- (2), but provides an efficient framework for posterior sampling. We assign σ βj ∼ F N (0, τ 2 ) and τ −2 ∼ IG(k/2, k/2), where F N represents a folded normal distribution. This prior distribution is equivalent to a half-t prior distribution for σ β,j (j = 1, ...J) with k degrees of freedom [Gelman, 2006] . Under this reparameterization, we have simple conjugate forms to sample σ β,j 's, which greatly simplifies the computation.

The likelihood function for the j th AE on report i can be written as

This likelihood is analytically inconvenient and has long been considered as a challenging problem for Bayesian inference. We apply a novel data-augmentation strategy proposed by Polson et al. [2013] for posterior sampling; detailed definition of the Polya-Gamma distribution, denoted by P G(b, c), can be found in Polson et al. [2013] . Let ψ ij = α j X i + V i δ j σ β,j β * * j denote the linear function of predictors. A latent variable ω ij ∼ P G(1, 0) is introduced for each A ij and (A ij , ω ij ) pairs are independently distributed with the following joint density function,

where f P G(1,0) (·) denotes the probability density function of a P G(1, 0)-distributed random variable. Integrating out the ω ij in the augmented likelihood function in (8) gives the likelihood function in (7). The equation (8) provides exponential function forms with respect to α j and β * * j , thus achieving the conjugacy. The auxiliary random variable ω ij 's also has conjugate forms, (ω ij |A ij ) ∼ P G(1, ψ ij ).

Through reparametrization of the covariance matrix and data augmentation for the binomial likelihood, all posterior sampling have conjugate forms, leading to a simple and efficient Gibbs sampler. The full posterior distributions can be found in Supplementary document B.

In our model, logOR parameter β j represents strength of the signal for AE j, and the posterior probability δ j indicates the likelihood of β j being different from zero (i.e. statistical significance). Therefore, we use both β j and δ j to determine if AE j is a safety signal. In general, we identify AE j as a safety signal for the target vaccine if the posterior mean of β j (β j ) is large (sayβ j > log(2)) and the posterior selection probability,δ j , is large. To determine the threshold forδ j , we applied the FDR control methods [Morris et al., 2008 ] toδ j 's to restrict the expected Bayesian FDR at a pre-specified level α.

Negative control approach. Like all spontaneous reporting database, VAERS data also has limitations. One of the limitations is reporting bias. Therefore, some AEs can be falsely recognized as safety signals as they are more likely to be reported compared to other vaccines. For example, COVID-19 vaccines get more public attention compared to influenza vaccines, therefore, AEs after COVID-19 vaccination are more likely to be reported than AEs after influenza vaccines. In such situations, negative control methods can be a powerful tool to reduce reporting bias. [Shi et al., 2020 , Schuemie et al., 2014 . Negative controls (NCs) are AEs that are known not to be causally related to any vaccine based on clinical knowledge. The list of NCs create an empirical null distribution of logOR estimates representing no vaccine-AE association. We expect the logOR of an AE signal is larger than the logOR estimates of NCs (i.e., taking an extremely large value in the null distribution). Assuming that the null distribution is normal and we quantify extremity of an logOR if β j > µ ng + 2σ ng , where µ ng and σ ng is the mean and the standard deviation of logOR estimates for NCs at each MCMC iteration. The posterior probability, p(β j > µ ng + 2σ ng ), is the NC-adjusted selection probability (NCprob) of AE j being an signal. We can also use the FDR method described earlier to determine the threshold for AE signal selection.

Identifying AE groups of concern. In addition to identifying individual AE signals, we can also identify AE groups of concern based on the posterior distributions. Motivated by gene enrichment methods in the omic-analysis [Subramanian et al., 2005 , Maleki et al., 2020 , we define an AE group G as enriched if it contains more signaled AEs than the remaining AE groups. To this end, we create a 2 × 2 table at each MCMC interaction with two rows for AEs to be signaled and unsignaled and two columns for AEs to be in and not in group G. We then calculate the logOR for this table, denoted by γ G . A large positive γ G indicates that group G is likely enriched. Averaged over all MCMC iterations, the posterior probability, p(γ G > log(2)), is used to identify if group G is enriched.

We first set up a small simulation study to illustrate our key idea on the information borrowing based on the graphical prior. Specifically, we simulated 70 AEs, 30 of them in group 1, 15 of them in group 2, and the remaining 25 AEs were not in any group (named as isolated AEs). We set logORs to be 1, −1, and 0 for AEs in group 1, group 2, and isolated AEs, respectively. We simulated 50 datasets and applied the BGrass and Bss models to each simulated dataset.

In both BGrass and Bss models, the shape and scale parameter, a α and b α , in the IG distribution in equation 5 were both set to 0.5. The degrees of freedom was set to be one (i.e.,k = 1) in the half-t prior distributions for σ β,j 's. The probability in Bernoulli prior in 3 was set to π = 0.5 for all AEs. In the BGrass model, we used DIC criteria to select the best in {10 −3 , 10 −2 , 10 −1 , 10 1 , 1, 10, 100, ∞} (representing high to low degrees of information borrowing). Gelman-Rubin diagnostic test was used to check the model convergence.

We compared our BGrass model with the Bss model (not including correlation between AEs) using the averaged mean square error (MMSE). The MMSE for the jth AE was calculated by averaging the squared difference between the posterior meanβ j and true β j over the 50 simulated datasets. We visualized the MMSE ratios (BGrass over Bss) for all 70 AEs in Figure 1 . If the MMSE ratio for a particular AE is smaller than one, the BGrass model provides more accurate logOR estimate than the Bss model for that AE. As Figure 1 shows, our BGrass model provides more accurate logOR estimates for vast majority of the AEs in group 1 and 2, indicating that the information borrowing between similar AEs help the parameter estimation. Surprisingly, isolated AEs also had more accurate logOR estimates in the BGrass model compared to the Bss model. One explanation is that the information borrowing improved the overall accuracy for the AE selection, thereby, increasing the accuracy for the logOR estimates. 

To mimic the real data, we randomly sampled 50,000 reports from the VAERS database to perform a larger simulation study. Those reports mentioned a total of 346 AEs, which were connected based on 78 AE terms on the higher HLGT level in MedDRA (the relation network is shown in Figure 2 ).

We generated data with different degrees of correlations between logOR. We first simulated β * from N(0, 0.1Ω ) with = ∞ (no correlation), = 0.02 (small correlation), and = 0.001 (large correlation). Then, we simulated δ j ∼ Bern(0.5), and defined the true logORs to be β j = δ j β * j . As a result, around 50% of AEs had β j = 0, called noise AEs, and 25% of AEs had β j > 0, called safety signals for the target vaccine. Under this setup, the averaged correlation between two connected non-noise AEs (β j = 0) was 0 (no correlation), 0.12 (week correlation), and 0.56 (strong correlation). The regression coefficients α j 's (intercept, gender, and age) were chosen to mimic the real data. We also conducted another set of simulations with 80 % of noise AEs and around 10% of AE signals associated with the target vaccine.

We simulated 50 datasets for each scenario (3 correlation structures × 2 signal proportions). We applied both BGrass and Bss to each simulated data with hyperparameters the same as specified in the first simulation study. In each model, we ran three independent MCMC chains with 20,000 iterations and a burn-in of 10,000 and thinning of 2. We evaluated the model performance with regarding to the accuracy of the logOR estimates and the ability of correctly selecting AE signals. The accuracy was measured by the square root of sum squared error (RSSE), RSSE = J j=1 (β j −β j ) 2 . For the signal selection ability, we computed posterior probability P (β j > 0) for AE j (j = 1, ..., J) and then used these values to construct the receiver operating characteristic (ROC) curve and calculated the area under the curve (AUC). We presented the results of RMSE and AUC in Figure 3 . This Figure demonstrates the clear advantage of incorporating the AE relations in the estimation. BGrass had improved performance compared to Bss, as evidenced by smaller RSSEs and larger AUCs, when AEs are correlated. As expected, they had similar performance when AEs are not correlated. Moreover, BGrass's performance was further improved when correlations are higher and signal proportions are larger.

In the first study, we compared COVID-19 vaccines (Pfizer-BioNTech, Moderna and Johnson & Johnson-Janssen) to influenza vaccines (denoted as "FLU", including inactivated influenza vaccines, labeled as"FLU3" and "FLU4" and live attenuated influenza vaccine, labeled as"FLUN3" or "FLUN4" in VAERS). In the second study, we compared two mRNA-based COVID-19 vaccines (Pfizer-BioNTech and Moderna) to the virus-based COVID-19 vaccine (Johnson & Figure 3 : RMSE and AUC Results from Simulation II Johnson-Janssen). In all studies, we modelled AE data on the PT level and used the HLGT level of MedDRA to define AE groups.

There are a total of 970,542 VAERS reports from January 1, 2016 to December 24, 2021. In each study, we removed reports with age under 18, mentioning both vaccines of interests (i.e, a report with both COVID-19 and FLU vaccines was removed in the first study). We also removed AEs with a total report number less than 25. We divided the age into four groups, [18,30), [30,50), [50,65) , and ≥ 65. With all categorical covariates, we counted the AE reports by gender, age group and vaccine type, aggregating a large number binary data into binomial data, which significantly improved the computation efficiency.

We applied our BGrass model to both studies with the same hyperparameters as specified in the simulation studies. In BGrass, we ran 3 independent MCMC chains. For each chain, after a burn-in of 10,000 iterations, we collected one MCMC sample in every 10 iterations until we got 20,000 MCMC samples. The MCMC chains in all studies have passed the Gelman-Rubin convergence diagnostic test with the maximum Gelman-Rubin statistic R c < 1.1.

The final dataset contains 618,132 reports (585,760 for COVID-19 vaccines and 32,372 for FLU vaccines), and a total of 845 AEs, belonging to 118 AE groups. Gender distributions are similar between two vaccine groups and both groups was majority females (71.0% in . FLU vaccine recipients are older (12.6%, 34.6%, 27.1%, and 25.8% for the pre-defined age groups in the COVID-19 group and 10. 1%, 22.4%, 25.9%, and 41 .6% in the FLU group). We set = 1 based on the DIC criteria.

To mitigate the reporting bias due to the high public attention for the COVID-19 vaccines relative to the FLU vaccines, we used negative control approach to select AE signals. We identified 35 AE terms in VAERS as negative controls (NCs) (see the list of NCs in Supplementary document C). All of the NCs are known not to be causally related to vaccines and were reported more than 300 times in our dataset. The FDR for signal detection was controlled at 0.01.

We also applied the enrichment method to identify safety problem for AE groups containing more than 20 AEs, with FDR controlled at 0.01.

We identified 8 AE signals in this study; see Table 1 for the estimated logORs, NC-adjusted selection probability (NCprob). Figure 4 displays these AE signals in the relation network. Deep vein thrombosis, thrombosis and pulmonary embolism have been reported to be associated with COVID-19 vaccines See et al. [2021] , Hippisley-Cox et al. [2021] , Jabagi et al. [2022] . Menstruation delay has also been reported to be associated with COVID-19 vaccines in a recent paper Edelman et al. [2022] . Based on Lechien et al. [2021] , Ageusia (loss of sense of taste) and Anosmia (loss of sense of smell) are rare AEs associated with COVID-19 vaccines. Since both of them are common symptoms of COVID-19 infection, they might be false positive signals. VAERS takes any adverse event that occurs after the administration of a vaccine, whether it is or is not caused by the vaccine. Therefore, Ageusia and Anosmia might be related to the underlying disease rather than COVID-19 vaccines. Large epidemiological studies are needed to further study these signals. We didn't identify any enriched AE groups as there were only 8 safety signals using the negative control approach.

Figure 4: Study II (COVID-19 v.s. FLU): AEs with logOR > log(2) (blue nodes) and signaled AEs identified using negative control (red nodes). The numbers on nodes map nodes to AEs in Table 1 Table 1 : AE signals associated with COVID-19 vaccines compared to FLU vaccines. Visualization of these AEs in the relation network can be found in Figure 4 .

The final dataset contains 588,487 reports (539,390 on mRNA-based vaccines and 49,097 for the Johnson & Johnson-Janssen vaccine), and a total of 845 AEs, belonging to 132 AE groups. There are majority females in both groups (71.9% females in mRNA-based vaccine and 61.9% in Johnson & Johnson-Janssen). mRNA-based vaccine recipients are older than Johnson & Johnson-Janssen recipients (12%, 34.4%, 26.8%, and 26.8% for the pre-defined age groups in mRNA-based vaccine recipients versus 18.8%, 38.3%, 30.2%, and 12.7% in Johnson & Johnson-Janssen recipients). In this study, we are interested in AE signals associated with both mRNA-based vaccines and Johnson & Johnson-Janssen vaccines, represented by positive and negative logORs, respectively. We set = 0.1 based on the DIC criteria.

When FDR is controlled at 0.01, we identified 29 AEs associated with mRNA-based vaccines (represented by yellow in Figure 5 ). Of them, for example, thrombosis, myocarditis, cellulitis, rash erythematous, and throat irritation have been reported to be associated with mRNA-based vaccines Biykem et al. [2021] , Ramalingam et al. [2021] , Papamanoli et al. [2021] , Kadali et al. [2021] . We also identified 108 AEs associated with Johnson & Johnson-Janssen (represented by green in Figure 5 ). Of them, for example, Guillain-Barre syndrome have been reported to be associated with Johnson & Johnson-Janssen vaccines as mentioned in CDC warning and Cerebral venous sinus thrombosis (CVST) caused pause of the Johnson & Johnson-Janssen vaccine .

For mRNA-based vaccines, we identified one enriched AE group "administration site reactions". This group mainly contains mild and common symptoms, including 11 AE signals, such as injection site swelling, pain, and rash. For the Johnson & Johnson-Janssen vaccine, we identified two enriched AE group, "embolism and thrombosis" (including 20 AE signals) and "central nervous system vascular disorders" (including 16 AE signals). see Supplementary material D for the complete list of AE signals. In this study, we also applied the Gamma Poisson Shrinker (GPS) model in DuMouchel [1999] using the R package openEBGM. BGrass and GPS are not directly comparable, since they estimate different parameters (logOR in BGrass and Relative Risk in GPS). GPS also have no clear way to select AE signals. Therefore, we used both the relative rates and confidence intervals to select the AE signals; see details and comparsion results with BGrass in Supplementary material E.

Like all spontaneous reporting database, VAERS data also has limitations, such as incompleteness, inaccuracy, coincidence, difficulty in verification, and reporting bias. Nevertheless, due to its national scope, VAERS still serves as an important early warning system of vaccine safety issues.

In this paper, we developed a graphical Bayesian model for detecting adverse events of COVID-19 vaccines while incorporating the AE ontology. The proposed BGrass model simultaneously estimates the strength of all AEs while allowing information borrowing between similar AEs. We developed novel methods to construct conjugate forms leading to an efficient Gibbs sampler for posterior computation. Simulation studies have demonstrated that BGrass has better performance compared to the model without considering the correlation between AEs. The analysis of VAERS data also identified important AE signals.

This article focus on COVID-19 vaccines, but the method is broadly applicable to other vaccine safety studies.

A. R-package for Gibbs sampling This R package has passed the R CMD check and contains a vignette document to demonstrate its reproducibility. This package can be directly installed from https://github.com/ BangyaoZhao/BGrass. 

Reports of anaphylaxis after receipt of mrna covid-19 vaccines in the us

Myocarditis cases reported after mrna-based covid-19 vaccination in the us from

Association of receipt of the ad26. cov2. s covid-19 vaccine with presumptive guillain-barré syndrome

Improvement in the analysis of vaccine adverse event reporting system database

A likelihood ratio test based method for signal detection with application to fda's drug safety data

Bayesian data mining in large frequency tables, with an application to the fda spontaneous reporting system

Empirical bayes screening for multi-item associations

A Bayesian neural network method for adverse drug reaction signal generation

Bayesian neural networks with confidence estimations applied to data mining

Extending the methods used to screen the WHO drug safety database towards analysis of complex associations and improved accuracy for rare events

Spectral graph theory. CBMS Reginal Conferences Series 92

Network-constrained regularization and variable selection for analysis of genomic data

A bayesian approach for graph-constrained estimation for high-dimensional regression

Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper)

Bayesian inference for logistic models using pólya-gamma latent variables

Bayesian analysis of mass spectrometry proteomic data using wavelet-based functional mixed models

A selective review of negative control methods in epidemiology

Interpreting observational studies: why empirical calibration is needed to correct p-values

Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles

Gene set analysis: Challenges, opportunities, and future research

Us case reports of cerebral venous sinus thrombosis with thrombocytopenia after ad26. cov2. s vaccination

Risk of thrombocytopenia and thromboembolism after covid-19 vaccination and sars-cov-2 positive testing: self-controlled case series study

Myocardial Infarction, Stroke, and Pulmonary Embolism After BNT162b2 mRNA COVID-19 Vaccine in People Aged 75 Years or Older

Association between menstrual cycle length and coronavirus disease 2019 (covid-19) vaccination. Obstetrics & Gynecology

Covid-19: Post-vaccine smell and taste disorders: Report of 6 cases. Ear

Myocarditis with covid-19 mrna vaccines

Covid-19 vaccine-induced cellulitis and myositis

Delayed skin rash after receiving sars-cov-2 mrna moderna vaccine

Side effects of bnt162b2 mrna covid-19 vaccine: A randomized, cross-sectional study with detailed self-reported symptoms from healthcare workers