key: cord-1040720-omvhnr61
authors: Lin, D.-Y.; Zeng, D.; Mehrotra, D. V.; Corey, L.; Gilbert, P. B.
title: Evaluating the Efficacy of COVID-19 Vaccines
date: 2020-10-05
journal: nan
DOI: 10.1101/2020.10.02.20205906
sha: 8e2f085457b3ef7cb60bcc1fc427974a72b1ed96
doc_id: 1040720
cord_uid: omvhnr61
A large number of studies are being conducted to evaluate the efficacy and safety of candidate vaccines against novel coronavirus disease-2019 (COVID-19). Most Phase 3 trials have adopted virologically confirmed symptomatic COVID-19 disease as the primary efficacy endpoint, although laboratory-confirmed SARS-CoV-2 is also of interest. In addition, it is important to evaluate the effect of vaccination on disease severity. To provide a full picture of vaccine efficacy and make efficient use of available data, we propose using SARS-CoV-2 infection, COVID-19, and severe COVID-19 as dual or triple primary endpoints. We demonstrate the advantages of this strategy through realistic simulation studies. Finally, we show how this approach can provide rigorous interim monitoring of the trials and efficient assessment of the durability of vaccine efficacy.
The vaccine regimens have generally protected against COVID-19 disease endpoints in animal models 5 and have induced binding and neutralizing antibody responses to vaccine-insert Spike proteins in most vaccine recipients, exceeding response levels seen in convalescent sera.
[...]ness as the primary efficacy endpoint, although laboratory-confirmed SARS-CoV-2 is also acceptable. It is possible that a vaccine is much more effective in preventing severe than mild COVID-19. Thus, we should also evaluate the effect of vaccination on severe COVID-19. 10 However, it would be difficult to power a trial using a severe COVID-19 endpoint.
We propose using SARS-CoV-2 infection, COVID-19, and severe COVID-19 as triple primary endpoints, or using either SARS-CoV-2 infection and COVID-19 or COVID-19 and severe COVID-19 as dual primary endpoints; the specific choice depends on the expected incidence of the three events and on the targeted vaccine efficacy for the three endpoints. This approach incorporates more evidence on vaccine efficacy into decision making than using a single primary endpoint.
medRxiv preprint doi: https://doi.org/10.1101/2020.10.02.20205906; this version posted October 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
The criteria for claiming that a vaccine is successful should be strict enough to ensure worthwhile efficacy. A vaccine whose efficacy is higher than 50% can markedly reduce the incidence of COVID-19 among vaccinated individuals and help to build herd immunity. An advisory panel convened by the World Health Organization (WHO) recommended 50% vaccine efficacy for at least 6 months post vaccination as a minimal criterion to define an efficacious vaccine. 11 The US Food and Drug Administration (FDA) guidance defines vaccine success criteria as a point estimate of vaccine efficacy of at least 50% and an interim-monitoring-adjusted lower bound of the 95% confidence interval exceeding 30%. The FDA guidance criteria do not specify a minimum period of follow-up. However, given the intent of current vaccine development to identify efficacious vaccines within several months of trial initiation, the expectation seems to be reliable evidence for vaccine efficacy over approximately 6 months, consistent with the WHO recommendation. Many Phase 3 trials specify assessment of vaccine efficacy over longer-term follow-up as an important study objective.
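As an illustration of the FDA success criteria described above, the following minimal sketch estimates vaccine efficacy as one minus the ratio of incidence rates and checks the two thresholds. It is not the authors' procedure: the function names are hypothetical, the interval is a plain Wald interval on the log rate ratio, and no interim-monitoring adjustment is applied.

```python
import math
from statistics import NormalDist


def vaccine_efficacy_ci(events_vax, time_vax, events_pbo, time_pbo, level=0.95):
    """Return (VE estimate, lower bound, upper bound) for a two-arm trial.

    VE = 1 - (vaccine incidence rate / placebo incidence rate).
    Assumption: Wald interval on the log rate ratio, unadjusted for interim looks.
    """
    rate_ratio = (events_vax / time_vax) / (events_pbo / time_pbo)
    se_log_rr = math.sqrt(1.0 / events_vax + 1.0 / events_pbo)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    rr_low = rate_ratio * math.exp(-z * se_log_rr)
    rr_high = rate_ratio * math.exp(z * se_log_rr)
    # A high rate ratio means low efficacy, so the bounds swap.
    return 1.0 - rate_ratio, 1.0 - rr_high, 1.0 - rr_low


def meets_fda_criteria(ve, ve_lower):
    """Point estimate of at least 50% and lower confidence bound above 30%."""
    return ve >= 0.50 and ve_lower > 0.30


if __name__ == "__main__":
    # Hypothetical counts: 40 vaccine-arm and 100 placebo-arm cases, equal person-time.
    ve, lower, upper = vaccine_efficacy_ci(40, 6000.0, 100, 6000.0)
    print(f"VE = {ve:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
    print("meets criteria:", meets_fda_criteria(ve, lower))
```

Note that a point estimate of exactly 50% can still fail the criteria when the number of events is small, because the lower confidence bound then falls below 30%.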
The FDA guidance document states that "A lower bound ≤ 30% but > 0% may be acceptable as a statistical success criterion for a secondary efficacy endpoint, provided that secondary endpoint hypothesis testing is dependent on the success on the primary endpoint." This statement refers to earlier FDA guidance on a fixed-sequence testing method, 12 under which vaccine efficacy is tested against a sequence of secondary endpoints in a pre-defined order, where tests of each endpoint are performed at the same significance level (one-sided type I error of 2.5%), moving to the next endpoint only after a success on the previous endpoint. The WHO Solidarity Trial protocol 13 specifies COVID-19 through longer-term follow-up (ideally 12 months or more) and severe COVID-19 over the same time frame as secondary endpoints. Following these guidelines and precedents, we consider hypothesis testing of vaccine efficacy over 12 months as a secondary analysis, using a null hypothesis that is less stringent than the 30% null hypothesis value used for the primary analysis, recognizing that it is more difficult for a vaccine to provide 12-month than 6-month protection and that even moderate vaccine efficacy through 12 months could be an important characteristic of a COVID-19 vaccine. In sum, we consider both the assessment of vaccine efficacy against primary endpoints over 6 months, using a 30% null hypothesis, and the assessment of vaccine efficacy against the same endpoints over 12 months, using a 0% or 15% null hypothesis.
For each of the three endpoints, we obtain the maximum likelihood estimator for the vaccine efficacy under the Poisson model.
In addition, we calculate the score statistic for testing the null hypothesis that the vaccine efficacy is less than a certain lower limit, say 30%, against the alternative hypothesis that the vaccine efficacy is greater than the lower limit; we divide the score statistic by its standard error to create a standard-normal test statistic. We propose to test all three null hypotheses, adjusting the significance threshold for the three test statistics to control the overall type I error at the desired level. We consider a vaccine to be successful if any of the three null hypotheses is rejected. We describe this multiple testing method in greater detail in Supplementary Appendix 1, where we also describe a sequential testing procedure to determine which of the three null hypotheses should be rejected. In the sequential testing procedure, we order the three hypotheses according to the order of the three observed test statistics, from the most extreme observed value to the least extreme. We test the first null hypothesis using the significance threshold from the aforementioned multiple testing procedure. If the first null hypothesis is rejected, we test the second null hypothesis by applying the multiple testing procedure to the remaining two test statistics. If the second null hypothesis is rejected, we test the last null hypothesis by using the unadjusted significance threshold. Clearly, this sequential testing procedure is more powerful than the multiple testing procedure in identifying which endpoints the vaccine is efficacious against. Both the proposed multiple testing and sequential testing methods properly account for the correlations of the test statistics and thus are more powerful than the conventional Bonferroni correction and related multiplicity adjustments that assume independence of tests. 
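The testing procedures described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the formulas of Supplementary Appendix 1: the score statistic is the standard one for a two-arm Poisson model with the baseline rate profiled out, the adjusted threshold is obtained by Monte Carlo simulation from a multivariate normal distribution, and the correlation matrix in the usage example is hypothetical (in practice it would be estimated from the data). All function names are illustrative.

```python
import math
import random
from statistics import NormalDist


def score_z(events_vax, events_pbo, time_vax, time_pbo, ve_null=0.30):
    """Standardized score statistic; large values support VE > ve_null."""
    theta0 = 1.0 - ve_null                      # rate ratio at the null boundary
    n = events_vax + events_pbo
    p0 = theta0 * time_vax / (theta0 * time_vax + time_pbo)
    return (n * p0 - events_vax) / math.sqrt(n * p0 * (1.0 - p0))


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    size = len(matrix)
    lower = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def max_z_critical_value(corr, alpha=0.025, draws=100_000, seed=1):
    """Monte Carlo estimate of c such that P(max_k Z_k > c) = alpha under the null."""
    rng = random.Random(seed)
    lower = cholesky(corr)
    size = len(corr)
    maxima = []
    for _ in range(draws):
        normals = [rng.gauss(0.0, 1.0) for _ in range(size)]
        maxima.append(max(sum(lower[i][k] * normals[k] for k in range(i + 1))
                          for i in range(size)))
    maxima.sort()
    return maxima[int((1.0 - alpha) * draws)]


def step_down_rejections(z_values, corr, alpha=0.025):
    """Indices of rejected hypotheses, tested from most to least extreme statistic."""
    remaining = sorted(range(len(z_values)), key=lambda k: z_values[k], reverse=True)
    rejected = []
    while remaining:
        if len(remaining) == 1:
            threshold = NormalDist().inv_cdf(1.0 - alpha)   # unadjusted threshold
        else:
            sub_corr = [[corr[i][j] for j in remaining] for i in remaining]
            threshold = max_z_critical_value(sub_corr, alpha)
        if z_values[remaining[0]] <= threshold:
            break
        rejected.append(remaining.pop(0))
    return rejected


if __name__ == "__main__":
    # Hypothetical correlations among the infection, disease, and severe-disease tests.
    corr = [[1.0, 0.7, 0.4], [0.7, 1.0, 0.6], [0.4, 0.6, 1.0]]
    z = [2.1, score_z(40, 100, 6000.0, 6000.0), 2.6]
    print("adjusted threshold:", round(max_z_critical_value(corr), 3))
    print("rejected endpoints (by index):", step_down_rejections(z, corr))
```

Because the simulated threshold uses the correlations among the statistics, it lies between the unadjusted value of about 1.96 and the Bonferroni value for three tests, which is the source of the power gain described above.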
If the effects of a vaccine are expected to be similar among the three endpoints, then we can enhance statistical power by combining the evidence of the vaccine effects on the three endpoints and performing a single test of overall vaccine efficacy. Specifically, we propose taking the sum of the three score statistics and dividing the sum by its standard error to create a standard-normal test statistic. We refer to this method as the combined test (Supplementary Appendix 1); this is in the same vein as combining estimators for a common effect in meta-analysis. 14
Instead of the triple primary endpoints, we may consider the dual primary endpoints of infection and disease if severe disease is very rare or the dual primary endpoints of disease and severe disease if the vaccine is expected to be only weakly effective against infection. Clearly, the above methods can be modified to test only two of the three endpoints.
It is desirable to periodically examine the accumulating data from a Phase 3 trial, so that the trial can be terminated if sufficient evidence emerges for a highly effective vaccine or a weakly effective candidate. In order to obtain rigorous stopping boundaries for a trial, we need to derive the joint distribution of the test statistics over interim looks. In Supplementary Appendix 2, we show that the proposed test statistics over interim looks are jointly normal with the independent increment structure, such that standard methods for interim analyses 15-18 can be applied.
We assigned 27,000 subjects to vaccine or placebo at a ratio of 1:1.
We assumed that subjects were enrolled at a constant rate over a 2-month period and vaccine efficacy was evaluated 6 months after the first subject was enrolled. We let 1% of the placebo subjects acquire [...] In each data set, we tested the null hypothesis that the vaccine efficacy is at most 30% against the alternative hypothesis that the vaccine efficacy is greater than 30% at the one-sided nominal significance level of 2.5%.
In order to investigate the ability of the proposed methods to detect long-term vaccine efficacy, we extended the follow-up time in the above simulation studies from a maximum of 6 months to a maximum of 12 months. We assumed that the event proportions for infection, disease, and severe disease in the placebo group over the 12-month period were double those of the 6-month period. We reduced all values of vaccine efficacy by 30% to reflect the waning of vaccine efficacy against each endpoint over time. We tested the null hypothesis that the vaccine efficacy is 0% versus the alternative hypothesis that the vaccine efficacy is greater than 0% at the nominal significance level of 2.5%. The results are summarized in Table 2. Again, the proposed methods can substantially improve statistical power.
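To give a rough sense of how the power of a single-endpoint test depends on the design parameters above, the following sketch uses a normal approximation in place of the simulation studies. It is an assumption-laden illustration rather than a reproduction of the reported results: it assumes equal follow-up in both arms, ignores staggered enrollment, and the placebo event proportions and the 42% efficacy in the usage example are hypothetical values (the latter reads the 30% reduction as relative, which is only one possible interpretation).

```python
import math
from statistics import NormalDist


def approx_power(n_per_arm, placebo_prop, ve_true, ve_null=0.30, alpha=0.025):
    """Normal-approximation power of the one-sided test of H0: VE <= ve_null.

    Assumptions: 1:1 randomization, equal follow-up in both arms, and event
    counts that are approximately Poisson, so that, given the total number
    of events, the vaccine-arm count is binomial.
    """
    std_normal = NormalDist()
    theta0 = 1.0 - ve_null                 # rate ratio at the null boundary
    theta1 = 1.0 - ve_true                 # rate ratio under the true efficacy
    p0 = theta0 / (1.0 + theta0)           # vaccine share of events under the null
    p1 = theta1 / (1.0 + theta1)           # vaccine share of events under the truth
    total_events = n_per_arm * placebo_prop * (1.0 + theta1)   # expected total
    z_alpha = std_normal.inv_cdf(1.0 - alpha)
    shift = math.sqrt(total_events) * (p0 - p1) - z_alpha * math.sqrt(p0 * (1.0 - p0))
    return std_normal.cdf(shift / math.sqrt(p1 * (1.0 - p1)))


if __name__ == "__main__":
    # 13,500 subjects per arm; the placebo event proportions are hypothetical.
    print("6 months, 30% null:", round(approx_power(13_500, 0.006, 0.60), 3))
    print("12 months, 0% null:", round(approx_power(13_500, 0.012, 0.42, ve_null=0.0), 3))
```

The approximation makes explicit that power is driven by the expected total number of events, which is why a rare endpoint such as severe disease is difficult to power on its own.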
In the likely scenario that a vaccine is more effective in preventing severe than mild COVID-19, using COVID-19 and severe COVID-19 as dual primary endpoints is more powerful than using either of the two events as a single primary endpoint. If the vaccine efficacy for infection is nearly as high as that for disease, then using infection, COVID-19, and severe COVID-19 as triple primary endpoints will be the most powerful.
Most Phase 3 trials have targeted 90% power for detecting 60% (short-term) vaccine efficacy against COVID-19 disease. The actual power may be lower if the vaccine is less effective, the disease incidence is lower than anticipated, or it is an interim analysis. In our simulation studies, using disease as a single primary endpoint had only 80% power. However, the proposed methods could boost the power to 90%.
We have focused on vaccine trials for populations enriched with high-risk individuals (e.g., front-line health-care personnel, factory workers, older adults, people with underlying health conditions), in which the risks for infection, disease, and severe disease are all appreciable. In generally healthy populations, such as college students, the majority of infections are asymptomatic, and severe disease is rare. For such settings, power can be maximized by using the dual primary endpoints of infection and disease.
We have used Poisson models instead of Cox proportional hazards models for several reasons. First, there are considerable inaccuracies in determining the event times, especially the infection time; the Poisson modeling approach requires only the knowledge of whether or not the event has occurred by the end of follow-up.
Second, Poisson models are simpler than Cox models, both conceptually and computationally. Because the event rates are relatively low, the two modeling approaches should provide similar results. We fitted both Poisson and Cox models in our simulation studies, and the power of the two approaches was nearly identical.
We have emphasized hypothesis testing based on score statistics. In Supplementary Appendix 4, we extend our work to general Poisson regression, which can be used to estimate vaccine efficacy, construct confidence intervals, compare multiple vaccines, and accommodate baseline risk factors (e.g., age, gender, race, occupation, co-morbidity). Baseline risk factors can have a major impact on the occurrences of SARS-CoV-2 infection, COVID-19, and severe COVID-19. In addition, COVID-19 vaccine efficacy trials may become unblinded partway through follow-up, either because an interim analysis demonstrates that the study vaccine is efficacious, which leads to offering the vaccine to placebo recipients, or because a different COVID-19 vaccine is approved and becomes available, which leads some participants to elect to be unblinded to help decide whether or not to receive the approved vaccine. Covariate adjustment in the analysis of vaccine efficacy against endpoints during post-unblinding follow-up is important for minimizing bias due to potential differences in exposure to SARS-CoV-2 between the vaccine and placebo arms.
We have developed our methods in order to accelerate the discovery and licensure of effective COVID-19 vaccines. An important function of the Phase 3 trials is to continue the follow-up of the vaccine and placebo groups after definite evidence of short-term efficacy has emerged, so as to assess duration of protection and improve precision for assessment of prevention of severe disease, as well as for assessment of safety.
Duration of vaccine efficacy is an influential parameter in models of population impact of deployed vaccines, and understanding of how vaccine efficacy wanes over time is essential to deciding whether or not booster vaccinations may be required and to estimating the optimal timing of the boosts.
The ability of our framework to provide more precise confidence intervals around the three vaccine efficacy parameters than existing methods that do not account for the correlation of endpoints is advantageous regardless of whether one, two, or three endpoints are selected as primary.
Table 1. Statistical Power (%) for Testing the Null Hypothesis of At Most 30% Vaccine Efficacy Against Infection (I), Disease (D), and Severe Disease (S) Over 6 Months
[Table body not recoverable. Surviving fragments: column groups I-D, D-S, I-D-S (repeated three times); row values 40%, 60%, 60%, 21, 80, 27.]
Table 2. Statistical Power (%) for Testing the Null Hypothesis of No Vaccine Efficacy Against Infection (I), Disease (D), and Severe Disease (S) Over 1[...]
[Table body not recoverable.]
Effect of an inactivated vaccine against SARS-CoV-2 on safety and immunogenicity outcomes: Interim analysis of 2 randomized clinical trials
An mRNA vaccine against SARS-CoV-2-preliminary report
Phase 1/2 study of COVID-19 RNA vaccine BNT162b1 in adults
Safety and immunogenicity of the ChAdOx1 nCoV-19 vaccine against SARS-CoV-2: a preliminary report of a phase 1/2, single-blind, randomised controlled trial
Single-shot Ad26 vaccine protects against SARS-CoV-2 in rhesus macaques
Phase 1-2 Trial of a SARS-CoV-2 Recombinant Spike Protein Nanoparticle Vaccine
WHO target product profiles for COVID-19 vaccines
Multiple Endpoints in Clinical Trials: Guidance for Industry
An international randomised trial of candidate vaccines against COVID-19
On the relative efficiency of using summary statistics versus individual-level data in meta-analysis
A multiple testing procedure for clinical trials