key: cord-0654200-cnvexi4b
authors: Tomer, Anirudh; Biccler, Jorne
title: Temporal Properties of Vaccine Effectiveness Measures in Presence of Multiple Pathogen Variants and Multiple Vaccines
date: 2022-02-14
journal: nan
DOI: nan
sha: eaf7e4a2682cac5399ee0205d16d5ea721a83846
doc_id: 654200
cord_uid: cnvexi4b

Vaccine effectiveness (VE) is typically defined in terms of an incidence rate ratio, a cumulative-risk ratio, or an odds ratio. The VE based on the incidence rate ratio is known to be time-invariant over the study period for leaky action vaccines, and the VE based on the cumulative-risk ratio is time-invariant for all-or-none action vaccines. Consequently, these VE measures are recommended as appropriate measures of VE for leaky and all-or-none vaccines, respectively. However, in diseases with multiple pathogen variants and multiple vaccines, investigators may also be interested in the variant-specific VE of a vaccine, the relative VE of a vaccine against two variants, or the relative VE of different vaccines against a given variant. In this multi-variant and multi-vaccine scenario, the temporal properties of the aforementioned VE measures have not yet been fully studied. Furthermore, no general-purpose sample size calculator is available for studies that intend to estimate either variant-specific VE or relative VE. As a solution, we define variant-specific and relative VE measures while accounting for multiple competing pathogen variants. We then propose a generic mode of action for all-or-none vaccines in a multi-variant setting. Subsequently, we evaluate the conditions under which, and the extent to which, various VE measures can be time-varying. We show that every VE measure is time-varying for all-or-none action vaccines in a multi-variant and multi-vaccine scenario. For leaky vaccines, we show that all measures other than those based on incidence rate ratios are time-varying. We discuss the practical implications of these results for VE studies in the context of the commonly used cohort, cumulative case-control, and test-negative study designs. Lastly, for the multi-variant and multi-vaccine scenario, we implement sample size calculations for both variant-specific and relative VE in an R package and an online calculator.

A key quantity of interest in vaccine studies is vaccine effectiveness for susceptibility to infection (VE). It pertains to the protective effects of vaccination against a pathogen. When a pathogen is genotypically and/or phenotypically diverse, differential VE against its variants is of interest too (Guy et al., 2010; Lopez Bernal et al., 2021). An augmentation of this situation occurs when there is more than one vaccine for a pathogen and vaccines are compared against each other, for example, as in the COVID-19 pandemic (Gupta, 2021). In such scenarios, both variant-specific VE and relative VE of vaccines may need to be calculated. The VE is typically estimated as the percentage decrease in relative risk of infection among the vaccinated subjects over the unvaccinated subjects (Halloran et al., 2010). In most real-world situations, this relative risk is defined as the incidence rate (or incidence density) ratio (irr), the hazard rate ratio (hr), or the cumulative-risk (or cumulative-incidence) ratio (crr). Using these, different measures of VE can be defined as VE irr = 1 − irr, VE hr = 1 − hr, and VE crr = 1 − crr, each having a different interpretation.

The choice of an appropriate VE measure may depend upon the goal of the study and also on the action mechanism (leaky or all-or-none) of the vaccine (Halloran et al., 2010, page 132). In this regard, Smith et al. (1984) proposed using VE irr over VE crr for leaky action vaccines because, for leaky vaccines, VE irr remains constant over the study period whereas VE crr does not. The opposite holds for all-or-none action vaccines; therefore, VE crr is recommended as the VE measure there. A vaccine's action mechanism and the temporal properties of a VE measure are also linked with study designs and VE estimation. For example, in case-control (Breslow, 1996) and test-negative design (TND) studies, either VE irr or VE crr can be estimated by utilizing 1 − ôr, where ôr is the exposure odds ratio in the sample. Here, whether the temporal properties of VE irr or those of VE crr apply depends on the approach for sampling controls (Dean, 2019; Lewnard et al., 2018). We next describe why the various VE measures and their temporal properties matter from both the vaccine manufacturers' and the public health perspective, and then state the motivation and goals of this work.

The motivation for this work comes from two sources. The first is the requirements of the Development of Robust and Innovative Vaccine Effectiveness (DRIVE, https://www.drive-eu.org/) project of the Innovative Medicines Initiative, IMI (Goldman, 2012), which provided funding for this work. The DRIVE project aims to set up a platform, bringing together all stakeholders to study brand-specific and variant-specific influenza vaccine effectiveness in the European Union. Within DRIVE, both cohort and TND studies are used. For these designs, in the context of DRIVE's multi-variant and multi-vaccine scenario, two challenges arise. First, are the temporal properties of variant-specific VE measures comparable to the temporal properties of VE in a single variant scenario (Smith et al., 1984)? This may affect conclusions regarding the effectiveness of a vaccine. The time-dependency of a VE measure is not a drawback per se in other situations. For example, VE crr is time-dependent in leaky vaccines, but it can still be useful for assessing the population level benefit of a vaccine rollout. This brings us to the second challenge: no sample size calculator that handles multiple variants and vaccines is available for study planners. While the DRIVE project focuses on influenza, the multi-variant situation is also present in other pathogens such as dengue (Guy et al., 2010) and human immunodeficiency virus (Gilbert et al., 1998). Our second motivation comes from the current COVID-19 pandemic (World Health Organization, 2020). Recently, several studies that estimate variant-specific VE for COVID-19 vaccines (Zimmer et al., 2020) have been published (Shinde et al., 2021; Lopez Bernal et al., 2021; Abu-Raddad et al., 2021; Madhi et al., 2021). Since these studies focus on the change of VE over time, it is crucial to know whether the variation of VE is, in fact, the biological effect of the vaccine or whether it could be an artifact due to the use of a time-dependent VE measure. Hence, these studies show that knowledge of the temporal properties of VE irr, VE crr, and VE or is of importance, and they substantiate the need for a sample size calculator.

The aim of this work is two-fold. First, we extend the work of Smith et al. (1984) to study the temporal properties of VE measures in the presence of multiple variants and vaccines, for both leaky and all-or-none action vaccines, in cohort and TND study designs. For this, we not only focus on VE irr and VE crr but also check whether VE or = 1 − or is time-invariant. The VE or is rarely the target measure; however, we study it because when the rare events assumption is not met in case-control studies based on the inclusive sampling scheme, the VE estimator 1 − ôr based on the sample exposure odds ratio does not estimate VE crr. Instead, it estimates VE or, which can be seen as a measure of VE based on the odds ratio. We also compare the use of 1 − irr, 1 − crr, and 1 − or as measures of the relative VE of vaccines. Our second goal is to implement calculations for obtaining power and precision (Kelley et al., 2003) for VE in the multi-variant and multi-vaccine scenario, in an R package and an online calculator.

Previously, for cohort studies, Gilbert (2000) studied the temporal properties and proposed estimators for the VE of a single leaky vaccine against multiple variants. However, to our knowledge, no such work exists for all-or-none action vaccines. In TND studies, Lewnard et al. (2018) and Dean (2019) have studied the temporal properties of the estimator 1 − ôr with two different control sampling techniques, namely inclusive sampling and incidence density sampling. Both corroborated the findings of Smith et al. (1984), but only for a single vaccine and a single pathogen variant. Within TND studies, there is therefore still scope for discussing the scenario of multiple competing variants and the relative VE of vaccines. Regarding our sample size calculation goal for VE, currently available tools such as the R package epiR (Stevenson et al., 2021) and general-purpose calculators such as PASS (https://www.ncss.com/software/pass/) have two drawbacks. Specifically, they only handle the single variant and single vaccine scenario, and for sample size calculations aiming at obtaining a certain precision (Kelley et al., 2003) for VE, the calculators ignore the randomness of the data generating process.

The rest of this article is organized as follows. In Section 2, we define VE irr, VE crr, and VE or for a generic scenario of M vaccines and I variants in a cohort study. In Section 3, we define the probability components of these measures for subjects vaccinated with a leaky vaccine, an all-or-none action vaccine, or placebo. In Section 4, we study the temporal properties of the VE measures for leaky and all-or-none action vaccines, and in Section 5, we discuss them further for TND studies. In Section 6, we present a sample size calculator for the multi-variant and multi-vaccine scenario. Lastly, Section 7 contains the discussion of this work.

Consider a set M = {1, ..., M} of vaccines of interest whose VE we intend to calculate against a set I = {1, ..., I} of variants of a pathogen circulating in the source population. The VE measures described in this manuscript focus on describing the vaccine effect at the end of a follow-up period of length t in a population for which no immunity was present at the beginning of the study. In what follows, we will rely on a simplification of the assumptions mentioned by Gilbert et al. (1998) and Greenwood and Yule (1915). More specifically, we assume that the vaccine assignment was randomized, that the different strains can be considered to be competing events, and that for all subjects the incidence rate of infection from any strain remains constant throughout the follow-up period.

Suppose that among unvaccinated subjects, the incidence rate of infection due to a variant i ∈ I is given by λ i > 0. We assume λ i to be constant over the study period for ease of exposition. Subsequently, let the overall rate of infection due to all variants be Λ = Σ i∈I λ i. At the end of the study period, let C = i indicate getting infected (case) by variant i and C = 0 indicate remaining uninfected (control); V = m denotes being vaccinated by vaccine m ∈ M and V = 0 denotes receiving placebo. Let Y represent the person-time contribution of a subject in the study. It is defined as the minimum of the time to infection and the study end time t. Then, in a cohort study, the data available at the end of the study are obtained with the probabilities shown in Table 1, and the expected person-time contribution is shown in the last row of Table 1.

Table 1: Cell probabilities of the data available from a cohort study with multiple variants and multiple vaccines. Here, C = i indicates getting infected (case) by variant i ∈ I and C = 0 indicates remaining uninfected (control), V = m denotes being vaccinated by vaccine m ∈ M and V = 0 denotes receiving placebo. Let Y represent the person-time contribution of a subject in the study. It is defined as the minimum of the time to infection and the study end time t. The probabilities P(C = i | V = ·) and P(C = 0 | V = ·) are the cumulative-risks of getting infected and not getting infected, respectively, by the variant i in subjects vaccinated with the vaccine m (or placebo). The E(Y | V = ·) denotes the expected person-time a subject vaccinated with the vaccine m (or placebo) will spend in the study.

Our first aim is to define the VE of vaccine m against the variant i compared to the placebo group. We call this VE the variant-specific VE of the vaccine and utilize three different measures for it. These are namely, VE irr i,m based on incidence rate ratio, VE crr i,m based on ratio of cumulative-risks, and VE or i,m based on exposure odds ratio. Using Table 1 , these three are defined as:
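Written out from the quantities in Table 1, these definitions can be sketched as follows. This is a reconstruction from the verbal descriptions in the text, so the exact layout of (1) is an assumption:

```latex
\begin{align}
\mathrm{VE}^{irr}_{i,m} &= 1 - \frac{P(C = i \mid V = m)/E(Y \mid V = m)}{P(C = i \mid V = 0)/E(Y \mid V = 0)}, \notag\\
\mathrm{VE}^{crr}_{i,m} &= 1 - \frac{P(C = i \mid V = m)}{P(C = i \mid V = 0)}, \notag\\
\mathrm{VE}^{or}_{i,m} &= 1 - \frac{P(V = m \mid C = i)/P(V = 0 \mid C = i)}{P(V = m \mid C = 0)/P(V = 0 \mid C = 0)}
  = 1 - \frac{P(C = i \mid V = m)/P(C = 0 \mid V = m)}{P(C = i \mid V = 0)/P(C = 0 \mid V = 0)}. \tag{1}
\end{align}
```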

where the probabilities P(C = i | V = ·) and P(C = 0 | V = ·) are the cumulative-risks of getting infected and not getting infected, respectively, by the variant i in subjects vaccinated with the vaccine m (or placebo). The E(Y | V = ·) denotes the expected person-time a subject vaccinated with the vaccine m (or placebo) will spend in the study. In addition, P(C = i | V = ·)/E(Y | V = ·) denotes the incidence rate of infection with variant i among subjects with the given vaccination status. It is important to note that for VE or i,m in (1), we expressed the odds ratio (or) retrospectively by conditioning on the infection status C. Specifically, we first used the probabilities P(V = m | C = ·) and then converted them to P(C = i | V = ·) using Bayes' theorem. We did so because the odds ratio is typically employed in case-control studies, wherein data are collected retrospectively on vaccination status V for a known infection status C.

Our next aim is to compare the relative VE of a vaccine m against two variants, i and j. To this end, three measures of the relative VE, namely VE irr ij,m based on incidence rate ratio, VE crr ij,m based on ratio of cumulative-risks, and VE or ij,m based on exposure odds ratio are defined as:
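A sketch of these definitions, reconstructed on the assumption that each relative measure is one minus the ratio of the corresponding variant-specific ratios from (1) for variants i and j:

```latex
\begin{align}
\mathrm{VE}^{irr}_{ij,m} = \mathrm{VE}^{crr}_{ij,m} = \mathrm{VE}^{or}_{ij,m}
  = 1 - \frac{P(C = i \mid V = m)/P(C = j \mid V = m)}{P(C = i \mid V = 0)/P(C = j \mid V = 0)}. \tag{2}
\end{align}
```

Under this reading, the person-time terms E(Y | V = ·) cancel in the ratio of incidence rate ratios, and the control probabilities P(C = 0 | V = ·) cancel in the ratio of odds ratios, which is why the three measures coincide.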

All three definitions lead to the same result, that is, a ratio of the odds of being a case of variant i versus the variant j in vaccinated subjects and the same odds in placebo subjects.

Last, we define the relative VE of two vaccines m and n against the same variant i. For this purpose, three measures of the relative VE, namely VE irr i,mn based on incidence rate ratio, VE crr i,mn based on ratio of cumulative-risks, and VE or i,mn based on exposure odds ratio are defined as:
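A sketch of these definitions, reconstructed on the assumption that they mirror (1) with vaccine n taking the place of placebo as the reference group:

```latex
\begin{align}
\mathrm{VE}^{irr}_{i,mn} &= 1 - \frac{P(C = i \mid V = m)/E(Y \mid V = m)}{P(C = i \mid V = n)/E(Y \mid V = n)}, \notag\\
\mathrm{VE}^{crr}_{i,mn} &= 1 - \frac{P(C = i \mid V = m)}{P(C = i \mid V = n)}, \notag\\
\mathrm{VE}^{or}_{i,mn} &= 1 - \frac{P(C = i \mid V = m)/P(C = 0 \mid V = m)}{P(C = i \mid V = n)/P(C = 0 \mid V = n)}. \tag{3}
\end{align}
```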

We next define the component probabilities of the measures in (1), (2), and (3) among the placebo and vaccinated subjects, for both leaky and all-or-none action vaccines.

Since we assumed that infection can happen with only one variant over the study period, infections from all variants are competing events (Putter et al., 2007). Under the competing events framework, among placebo subjects (V = 0) the cumulative-risk of being a control (C = 0), the cumulative-risk of being a case of the variant i (C = i), and the expected person-time contribution of a subject E(Y | V = 0) over the study period [0, t] are given by:
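With constant variant-specific rates λ i and overall rate Λ, these quantities take the usual exponential form. The following is a reconstruction under that constant-rate assumption; the layout of (4) is assumed, and the derivation itself is the one referred to in Appendix A:

```latex
\begin{align}
P(C = 0 \mid V = 0) &= \exp(-\Lambda t), \notag\\
P(C = i \mid V = 0) &= \frac{\lambda_i}{\Lambda}\,\{1 - \exp(-\Lambda t)\}, \notag\\
E(Y \mid V = 0) &= \int_0^t \exp(-\Lambda s)\,ds = \frac{1 - \exp(-\Lambda t)}{\Lambda}. \tag{4}
\end{align}
```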

The detailed mathematical derivation for (4) is given in the Appendix A.

Let the vaccine m be a leaky vaccine (Halloran et al., 2010, page 132) whose effectiveness against variant i is 1 − θ i,m and whose overall effectiveness against all variants is 1 − Θ m. In leaky vaccines, vaccination reduces the incidence rate of infection due to variant i in the study period to θ i,m λ i, compared to λ i among placebo subjects. The overall incidence rate of infection by any variant after receiving the vaccine m becomes Θ m Λ, compared to Λ in placebo subjects.

Under the competing events framework, among subjects vaccinated with the vaccine m (V = m), the cumulative-risk of being a control or a case of the variant i, and the expected person-time contribution E(Y | V = m) are given by:
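These expressions follow from the placebo case by replacing each rate λ i with θ i,m λ i. The sketch below is a reconstruction; it assumes that Θ m is the rate-weighted average of the θ i,m, which is what the statement that the overall rate becomes Θ m Λ implies:

```latex
\begin{align}
\Theta_m \Lambda &= \sum_{i \in I} \theta_{i,m} \lambda_i, \notag\\
P(C = 0 \mid V = m) &= \exp(-\Theta_m \Lambda t), \notag\\
P(C = i \mid V = m) &= \frac{\theta_{i,m} \lambda_i}{\Theta_m \Lambda}\,\{1 - \exp(-\Theta_m \Lambda t)\}, \notag\\
E(Y \mid V = m) &= \frac{1 - \exp(-\Theta_m \Lambda t)}{\Theta_m \Lambda}. \tag{5}
\end{align}
```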

The detailed mathematical derivation for these follows the equations in Appendix A.

To understand our assumed action of all-or-none vaccines (Halloran et al., 2010, page 132 ) in a multi-variant scenario, consider that there are only three variants i, j and k circulating. Among the subjects vaccinated with the vaccine m, let θ i,m be the proportion of subjects who become immune to the variant i but not to variant j or k. Among these subjects infections occur with the combined force of infection of variants j and k given by λ j + λ k . We can define θ j,m and θ k,m similarly. Next, let θ ij,m be the proportion of subjects who become immune to both variants i and j but not to variant k. Among these subjects infection with variant k happens with the force of infection λ k . We can define θ ik,m and θ jk,m similarly. Then, let θ ijk,m be the proportion of subjects who become immune to all three variants. The remaining proportion of subjects are the ones who do not become immune to any variant despite vaccination. For them infections happen with the combined force of infection λ i + λ j + λ k of all three variants. Overall in our three variant example we have seven VE components whose sum is 0 ≤ (θ i,m + θ j,m + θ k,m + θ ij,m + θ ik,m + θ jk,m + θ ijk,m ) ≤ 1.

For an all-or-none vaccine, as the number of variants increases, the number of VE components θ ·,m increases as well. Consequently, to denote the VE calculations succinctly, we first define P(I) as the power set of the set I of all circulating variants. For example, in the scenario above with three variants i, j, k, the power set is P(I) = {{}, {i}, {j}, {k}, {i, j}, {i, k}, {j, k}, {i, j, k}}. Then, among the subjects vaccinated with vaccine m, let θ g,m denote the proportion of subjects who are immune to a subset combination g ∈ P(I) of variants. Here, when g = {}, i.e., no variants, θ g,m = θ {},m corresponds to the proportion of subjects who are not immune to any variant despite vaccination. On the other hand, when g = I, i.e., the set of all variants, θ g,m = θ I,m corresponds to the proportion of subjects who are immune to all variants. Subsequently, among subjects vaccinated with the vaccine m, the cumulative-risk of being a control or a case of the variant i, and the expected person-time contribution E(Y | V = m) are given by:
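A sketch of these expressions, reconstructed from the mode of action described above. The symbol Λ −g, the combined rate of the variants to which a subject in group g is not immune, is notation introduced here for illustration and is not taken from the original:

```latex
\begin{align}
P(C = 0 \mid V = m) &= \sum_{g \in \mathcal{P}(I)} P(C = 0 \mid V = m, G = g)\,P(G = g)
  = \sum_{g \in \mathcal{P}(I)} \theta_{g,m} \exp(-\Lambda_{-g} t), \notag\\
P(C = i \mid V = m) &= \sum_{g \in \mathcal{P}(I \setminus i)} \theta_{g,m}\,\frac{\lambda_i}{\Lambda_{-g}}\,\{1 - \exp(-\Lambda_{-g} t)\}, \notag\\
E(Y \mid V = m) &= \theta_{I,m}\,t + \sum_{g \in \mathcal{P}(I),\, g \neq I} \theta_{g,m}\,\frac{1 - \exp(-\Lambda_{-g} t)}{\Lambda_{-g}}, \tag{6}
\end{align}
% assumed notation: \Lambda_{-g} = \sum_{k \in I \setminus g} \lambda_k
```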

where, the expansions for P(C = 0 | V = m) and P(C = i | V = m) are obtained using the law of total probability. Specifically, the conditional probability P(C = 0 | V = m, G = g) denotes the probability of being a control among subjects who are vaccinated with vaccine m and are also immune to variants in the subset combination g ∈ P(I). The proportion of these subjects is P(G = g) = θ g,m . Among these subjects the probabilities of being a control or a case of variant i can be derived in the competing events framework following the equations in Appendix A.

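The set P(I \ i) in (6) is the power set of the variants other than variant i, that is, the subset combinations g that do not include immunity to variant i.

It is important to note that it is not possible to estimate each of the proportions θ g,m of subjects who are immune to a subset combination g ∈ P(I) of variants. This is because, given a total of I variants, the combination g can be any of the 2^I combinations from the power set P(I). However, the equations in (6) and the corresponding data (Table 1) are of size I + 2. That is, the number of equations is less than the number of parameters, although this is only the case when there are more than two variants (I > 2).

For the variant-specific VE of a leaky vaccine m against variant i, which is given by 1 − θ i,m, we compare the three measures VE irr i,m, VE crr i,m, and VE or i,m by combining (1), (4), and (5), to obtain: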
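For a leaky vaccine, substituting the exponential cell probabilities of the placebo and vaccinated groups into the three variant-specific definitions gives the following sketch. It is a reconstruction, assuming rate θ i,m λ i for variant i and overall rate Θ m Λ among the vaccinated:

```latex
\begin{align*}
\mathrm{VE}^{irr}_{i,m} &= 1 - \theta_{i,m}, \\
\mathrm{VE}^{crr}_{i,m} &= 1 - \frac{\theta_{i,m}}{\Theta_m}\,\frac{1 - \exp(-\Theta_m \Lambda t)}{1 - \exp(-\Lambda t)}, \\
\mathrm{VE}^{or}_{i,m} &= 1 - \frac{\theta_{i,m}}{\Theta_m}\,\frac{\exp(\Theta_m \Lambda t) - 1}{\exp(\Lambda t) - 1}.
\end{align*}
```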

These results show that only the measure VE irr i,m based on incidence rate ratio is time-invariant. This falls in line with the findings of Smith et al. (1984) , albeit theirs was a single variant scenario. Gilbert (2000) also reached a similar conclusion using a time-to-event model in which VE was expressed in terms of the ratio of cause-specific hazards in vaccinated and placebo groups.

The measure VE crr i,m, based on the ratio of cumulative-risks, varies over the study period. In general, VE crr i,m is always smaller than the actual VE 1 − θ i,m. This is because VE crr i,m can be equivalently rewritten as 1 − θ i,m E(Y | V = m)/E(Y | V = 0), where the ratio E(Y | V = m)/E(Y | V = 0) of expected person-time is always greater than 1 because vaccinated subjects should get infected at a slower rate than placebo arm subjects, therefore contributing more person-time. When the study length t is very short (t → 0) and/or the overall incidence rate of infection Λ is low (Λ → 0), it may happen that their product Λt → 0 as well. In such a situation VE crr i,m becomes time-invariant, because lim Λt→0 VE crr i,m = 1 − θ i,m. This result is also in line with the findings of Smith et al. (1984) in a single variant scenario. The other situation is Λt → ∞, which is plausible when the study period is long and/or the pathogen prevalence is high (e.g., an epidemic). In this case lim Λt→∞ VE crr i,m = 1 − θ i,m /Θ m, which depends on the overall effectiveness of the vaccine.

Lastly, the measure VE or i,m is also time-dependent and always overestimates 1−θ i,m . Although, when Λt → 0 it becomes time-invariant. This is because lim Λt→0 VE or i,m = 1 − θ i,m . Conversely, when Λt → ∞ then lim Λt→∞ VE or i,m = 1. This is an interesting result because when Λt → ∞, then VE or i,m will show a large effectiveness and VE crr i,m may show a small effectiveness.
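The behaviour of the three variant-specific measures for a leaky vaccine can be checked numerically. The sketch below assumes constant rates and the exponential cell probabilities of the competing events framework; the function names and the example rates are hypothetical and are not taken from the authors' R package.

```python
import math


def leaky_cells(lams, thetas, t):
    """Cell probabilities and expected person-time for one study arm.

    lams:   placebo incidence rates per variant (lambda_i > 0)
    thetas: rate multipliers per variant (theta_i; all 1.0 for placebo)
    """
    rates = [th * lam for th, lam in zip(thetas, lams)]
    total = sum(rates)                        # Theta_m * Lambda
    p_control = math.exp(-total * t)          # P(C = 0 | V)
    p_case = [r / total * (1.0 - p_control) for r in rates]  # P(C = i | V)
    person_time = (1.0 - p_control) / total   # E(Y | V)
    return p_control, p_case, person_time


def ve_measures(lams, thetas, i, t):
    """Return (VE_irr, VE_crr, VE_or) against variant index i at study end t."""
    c0_v, ci_v, y_v = leaky_cells(lams, thetas, t)
    c0_p, ci_p, y_p = leaky_cells(lams, [1.0] * len(lams), t)
    irr = (ci_v[i] / y_v) / (ci_p[i] / y_p)
    crr = ci_v[i] / ci_p[i]
    odds_ratio = (ci_v[i] / c0_v) / (ci_p[i] / c0_p)
    return 1 - irr, 1 - crr, 1 - odds_ratio


lams = [0.02, 0.05]    # hypothetical rates per unit time for two variants
thetas = [0.4, 0.7]    # hypothetical: true VE of 0.6 and 0.3
for t in (1, 10, 50):
    print(t, [round(v, 4) for v in ve_measures(lams, thetas, 0, t)])
```

In this setup the first value stays at 1 − θ for every t, while the second lies below it and the third above it, with both gaps widening as Λt grows.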

The relative effectiveness of a leaky vaccine m against two variants i and j is given by 1 − θ i,m /θ j,m (Section 2 and Section 3.2), where 1 − θ j,m is the effectiveness of vaccine m against variant j. To compare it against the three measures VE irr ij,m , VE crr ij,m , VE or ij,m , we combine (2), (4), and (5), to obtain:
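As a sketch of this step (a reconstruction, assuming the leaky rates θ i,m λ i and θ j,m λ j among the vaccinated), the ratios of variant-specific cumulative risks within each arm do not depend on t, so all three measures reduce to the same constant:

```latex
\begin{align*}
\frac{P(C = i \mid V = m)}{P(C = j \mid V = m)} = \frac{\theta_{i,m} \lambda_i}{\theta_{j,m} \lambda_j},
\qquad
\frac{P(C = i \mid V = 0)}{P(C = j \mid V = 0)} = \frac{\lambda_i}{\lambda_j},
\qquad
\mathrm{VE}^{irr}_{ij,m} = \mathrm{VE}^{crr}_{ij,m} = \mathrm{VE}^{or}_{ij,m} = 1 - \frac{\theta_{i,m}}{\theta_{j,m}}.
\end{align*}
```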

Thus, all three VE measures are time-invariant. An advantage of this situation is that even if the only data available is counts of infected and not-infected by vaccine and variant, whether via a cohort design or a cumulative case-control design, time-invariant VE can be obtained. Furthermore, another interesting property of these relative VE measures is that even if the incidence rates of infection of the two variants change over the study period, time-invariant VE can be obtained as long as the ratio of the incidence rates of infection of the two variants remains constant (Appendix B).

The relative effectiveness of two vaccines m and n against the same variant i is given by 1 − θ i,m /θ i,n (Section 2 and Section 3.2), where 1 − θ i,n is the effectiveness of vaccine n against variant i. To compare it against the three measures VE irr i,mn, VE crr i,mn, and VE or i,mn, we combine (3), (4), and (5), to obtain:
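A sketch of the resulting expressions, reconstructed by applying the leaky cell probabilities to both vaccine arms (overall rates Θ m Λ and Θ n Λ are assumed for vaccines m and n, respectively):

```latex
\begin{align*}
\mathrm{VE}^{irr}_{i,mn} &= 1 - \frac{\theta_{i,m}}{\theta_{i,n}}, \\
\mathrm{VE}^{crr}_{i,mn} &= 1 - \frac{\theta_{i,m} \Theta_n}{\theta_{i,n} \Theta_m}\,\frac{1 - \exp(-\Theta_m \Lambda t)}{1 - \exp(-\Theta_n \Lambda t)}, \\
\mathrm{VE}^{or}_{i,mn} &= 1 - \frac{\theta_{i,m} \Theta_n}{\theta_{i,n} \Theta_m}\,\frac{\exp(\Theta_m \Lambda t) - 1}{\exp(\Theta_n \Lambda t) - 1},
\end{align*}
```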

where 0 ≤ 1 − Θ n ≤ 1 is the overall effectiveness of vaccine n against all variants and Θ n is defined similarly to Θ m (Section 3.2). The relative VE in this case is time-dependent when either VE crr i,mn or VE or i,mn is used. For both of these, whether the VE increases or decreases over time depends on the relative overall effectiveness Θ n /Θ m of the two vaccines. A special scenario is when vaccines m and n have different variant-specific effectiveness for variant i (θ i,m ≠ θ i,n), but the overall effectiveness of the two vaccines is equal (Θ m = Θ n). In such a situation, both VE crr i,mn and VE or i,mn are time-invariant and equal to 1 − θ i,m /θ i,n. Furthermore, both VE crr i,mn and VE or i,mn are also time-invariant when Λt → 0 (short study period and/or low infection rate). This is because lim Λt→0 VE crr i,mn = lim Λt→0 VE or i,mn = 1 − θ i,m /θ i,n. The other scenario is when the study period is long and/or the infection rate is high, leading to Λt → ∞. Then, lim Λt→∞ VE crr i,mn = 1 − (θ i,m Θ n )/(θ i,n Θ m ), whereas the odds ratio underlying VE or i,mn grows in proportion to exp{(Θ m − Θ n )Λt}. Hence, VE or i,mn tends to 1 if the overall effectiveness of vaccine m is greater than that of vaccine n (Θ m < Θ n), and to −∞ if the opposite is true.

For an all-or-none vaccine m, combining (1), (4), and (6), the different measures of variant-specific VE at the end of the study period are given by:

These measures of effectiveness for an all-or-none vaccine are time-dependent irrespective of the measure utilized. This is unlike the single variant situation, wherein VE crr is known to be time-invariant (Smith et al., 1984). In the multi-variant situation VE crr i,m becomes time-invariant only when all λ k → 0, k ∈ I \ i, that is, when all variants other than the variant i have a negligible incidence rate of infection. The VE irr i,m is time-invariant only when both λ k → 0, k ∈ I \ i, and the cumulative incidence λ i t of the variant i over the study period tends to zero (λ i t → 0). If, along with these conditions, λ k t → 0, k ∈ I \ i, then time-invariant estimates can also be obtained using VE or i,m. In general, these are very strict conditions.
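The time-dependence for all-or-none vaccines can be illustrated numerically. The sketch below assumes the mode of action described for all-or-none vaccines in the multi-variant setting: a proportion θ g of vaccinated subjects is fully immune to the variants in g and is exposed to the remaining variants at the placebo rates. Function names and numerical values are hypothetical.

```python
import math
from itertools import combinations


def power_set(variants):
    """All subsets of the variant indices, as frozensets (2**I of them)."""
    return [frozenset(c) for r in range(len(variants) + 1)
            for c in combinations(variants, r)]


def all_or_none_cells(lams, theta, t):
    """theta maps each immune set g (a frozenset) to its proportion theta_g."""
    p_control, person_time = 0.0, 0.0
    p_case = [0.0] * len(lams)
    for g, share in theta.items():
        # Combined rate of the variants this group is NOT immune to.
        rate = sum(lam for k, lam in enumerate(lams) if k not in g)
        surv = math.exp(-rate * t)
        p_control += share * surv
        # A fully immune group (rate == 0) contributes the whole period t.
        person_time += share * (t if rate == 0 else (1 - surv) / rate)
        for k, lam in enumerate(lams):
            if k not in g:
                p_case[k] += share * lam / rate * (1 - surv)
    return p_control, p_case, person_time


def ve_measures(lams, theta, i, t):
    """Return (VE_irr, VE_crr, VE_or) against variant index i at study end t."""
    c0_v, ci_v, y_v = all_or_none_cells(lams, theta, t)
    c0_p, ci_p, y_p = all_or_none_cells(lams, {frozenset(): 1.0}, t)
    irr = (ci_v[i] / y_v) / (ci_p[i] / y_p)
    crr = ci_v[i] / ci_p[i]
    odds_ratio = (ci_v[i] / c0_v) / (ci_p[i] / c0_p)
    return 1 - irr, 1 - crr, 1 - odds_ratio


# Hypothetical two-variant example: 60% of vaccinees are immune to variant 0.
theta = {frozenset(): 0.3, frozenset({0}): 0.3,
         frozenset({1}): 0.1, frozenset({0, 1}): 0.3}
for t in (1, 10, 50):
    print(t, [round(v, 4) for v in ve_measures([0.02, 0.05], theta, 0, t)])
```

With a single circulating variant the same functions return a constant VE crr, which matches the single-variant result of Smith et al. (1984); the helper `power_set` also makes the 2^I growth of the parameter count explicit.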

The relative VE of an all-or-none vaccine m against any two variants i and j can be obtained from Section 2 and Section 3.3 as 1 − {1 − Σ g∈P(I)\P(I\i) θ g,m }/{1 − Σ g∈P(I)\P(I\j) θ g,m }. Here, Σ g∈P(I)\P(I\j) θ g,m corresponds to the variant-specific VE of vaccine m against variant j. Using information from (2), we know that our relative VE measures of interest are all equal, i.e., VE irr ij,m = VE crr ij,m = VE or ij,m. For brevity they are defined in (12) in Appendix C. The expression in (12) is time-dependent. However, when the incidence rates of infection of all variants are negligible, i.e., λ i → 0, i ∈ I, this relative VE becomes time-invariant as well.

The relative VE of two all-or-none vaccines m and n against the same variant i can be obtained from Section 2 and Section 3.3 as 1 − {1 − Σ g∈P(I)\P(I\i) θ g,m }/{1 − Σ g∈P(I)\P(I\i) θ g,n }. Here, Σ g∈P(I)\P(I\i) θ g,n is the effectiveness of vaccine n against variant i. To obtain our relative VE measures VE irr i,mn, VE crr i,mn, and VE or i,mn, we combine (3), (4), and (6). For brevity, the resulting VE expression is given in (13) in Appendix C. Both VE irr i,mn and VE crr i,mn are time-invariant when the incidence rates of infection of all variants are negligible, i.e., λ i → 0, i ∈ I. If, in addition, every λ i t → 0, then VE or i,mn becomes time-invariant as well.

In this section, we expand upon the results we obtained so far, for the widely used TND study design (Lewnard et al., 2018) . In TND studies, the aim is to bring in cases and controls through routine surveillance systems. Subjects meeting a clinical case definition are then tested for the disease of interest. For example, in TND studies aiming to estimate effectiveness against influenza, it is common to test and enroll subjects presenting with influenza-like illness (Stuurman et al., 2020; Rondy et al., 2017) . While cases are subjects infected with the pathogen of interest, controls are subjects who may be infected with other non-target pathogens that elicit similar symptoms. Overall, the premise of the TND studies is that by sampling subjects who seek medical care, healthcare-seeking behavior bias can be limited. Depending on the exact schedule to sample controls, TND design imitates case-control design with incidence density sampling (Dean, 2019) or inclusive sampling (Vandenbroucke and Pearce, 2012) .

Consider that the size of our population of interest is n, wherein an infection can occur either from a variant i ∈ I of the pathogen of interest (C = i) or from a non-target pathogen (C = Ω). We assume that infection with the target pathogen or one of the non-target pathogens does not lead to immunity against other non-target pathogens (Dean et al., 2020). Following Lewnard et al. (2018), we consider that an infected subject may be recorded as a case or as a control only if symptoms appear and the subject subsequently seeks care and gets tested. We denote symptom appearance by S = 1 (or 0) and seeking care and getting tested by Z = 1 (or 0). Suppose that, after getting tested, the subject is found to be infected with a variant i ∈ I of the target pathogen. In that case, the subject is counted as a case. In contrast, if the subject is infected with a non-target pathogen, the subject is counted as a control. We next present equations for the expected number of variant-specific cases and the expected number of controls. These expressions will later be employed in the expression for VE in TND studies.

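We first consider the expected number of cases of variant i among subjects vaccinated with vaccine m and among unvaccinated subjects, given in (7).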
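A sketch of the expected case counts, reconstructed from the factorization discussed in the following paragraphs. The symbols N i,m and N i,0 are assumed here by analogy with the control counts N Ω,m and N Ω,0, and P(V = m) denotes the proportion of the population vaccinated with vaccine m:

```latex
\begin{align}
E(N_{i,m}) &= n\,P(Z = 1, S = 1, C = i, V = m)
  = n\,P(Z = 1 \mid S = 1, V = m)\,P(S = 1 \mid C = i)\,P(C = i \mid V = m)\,P(V = m), \notag\\
E(N_{i,0}) &= n\,P(Z = 1, S = 1, C = i, V = 0)
  = n\,P(Z = 1 \mid S = 1, V = 0)\,P(S = 1 \mid C = i)\,P(C = i \mid V = 0)\,P(V = 0). \tag{7}
\end{align}
```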

Here, P(Z = 1, S = 1, C = i, V = m) is the joint probability that over the study period, a subject seeks care (Z = 1) after having symptoms (S = 1) due to infection with variant i, and received vaccine m before the study began. The probability P(Z = 1, S = 1, C = i, V = 0) has a similar interpretation and is applicable for unvaccinated subjects (V = 0).

In (7), we have split the joint probability P(Z = 1, S = 1, C = i, V = m) into conditional probabilities, and especially into P(Z = 1 | S = 1, V = m) and P(S = 1 | C = i). The importance of P(Z = 1 | S = 1, V = m) is that it means that seeking care (Z = 1) is conditionally independent of the type of infection given the vaccination status and symptoms. That is, P(Z = 1 | S = 1, V = m) = P(Z = 1 | S = 1, C = i, V = m).

This also reflects what happens in practice in a TND study. Specifically, a subject may base their decision to seek care knowing their vaccination status and symptoms even though they may not know their infection type until getting tested. The probability P(S = 1 | C = i) indicates that the occurrence of symptoms depends on the type of infection and is conditionally independent of the vaccination status. As we show later in this section, these conditional probabilities cancel out in VE calculations.

Consider that the instantaneous rate of being infected with any non-target pathogen (C = Ω) is a constant λ Ω . This rate λ Ω does not depend upon getting vaccinated with vaccine m for the target pathogen. Consequently, even if a subject becomes a case of a variant i of the pathogen of interest, the subject can also later become a case of non-target pathogens multiple times over the study period. Whether such a subsequent infection with the non-target pathogen qualifies to be counted as a control, depends upon the sampling design. In this regard, two designs of interest for us are inclusive sampling and incidence density sampling. We present them next.

In inclusive sampling a subject is counted as control as many times as they get infected with the non-target pathogen, seek care and get tested. Then, over the study period [0, t] the expected number of controls E(N Ω,m ) vaccinated with vaccine m and expected number of unvaccinated controls E(N Ω,0 ) are given by (Lewnard et al., 2018) :
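A sketch of these expected control counts, reconstructed on the assumption that non-target infections occur at the constant rate λ Ω throughout [0, t] regardless of vaccination status:

```latex
\begin{align}
E(N_{\Omega,m}) &= n\,\lambda_\Omega\,t\,P(Z = 1, S = 1, V = m \mid C = \Omega), \notag\\
E(N_{\Omega,0}) &= n\,\lambda_\Omega\,t\,P(Z = 1, S = 1, V = 0 \mid C = \Omega). \tag{8}
\end{align}
```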

Here, P(Z = 1, S = 1, V = m | C = Ω) is the probability that a subject known to be infected with a non-target pathogen seeks care after having symptoms and received vaccine m before the study began. The interpretation of the probability P(Z = 1, S = 1, V = 0 | C = Ω) is similar, but it is applicable to unvaccinated subjects.

In incidence density sampling, after a subject becomes a case of a variant i of the pathogen of interest (C = i), they leave the risk set. Thus any subsequent infection with a non-target pathogen is ignored. Before this censoring time, however, the subject can be counted as a control as many times as they get infected with the non-target pathogen, seek care, and get tested. We have previously defined the expected time until infection with the pathogen of interest, truncated at the study end time t, as E(Y | V = m) and E(Y | V = 0) among subjects vaccinated with vaccine m and unvaccinated subjects, respectively. By replacing t with E(Y | V = ·) in (8), we obtain E(N Ω,m ) and E(N Ω,0 ) in an incidence density sampling approach. They are given by (Dean, 2019):
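Following the substitution of t by E(Y | V = ·) just described, the expected control counts can be sketched as (a reconstruction, with the same rate λ Ω for non-target infections):

```latex
\begin{align}
E(N_{\Omega,m}) &= n\,\lambda_\Omega\,E(Y \mid V = m)\,P(Z = 1, S = 1, V = m \mid C = \Omega), \notag\\
E(N_{\Omega,0}) &= n\,\lambda_\Omega\,E(Y \mid V = 0)\,P(Z = 1, S = 1, V = 0 \mid C = \Omega). \tag{9}
\end{align}
```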

A problem in the incidence density sampling is that ascertaining if a subject is indeed a control, i.e., they have not had a previous infection with the target pathogen, can be challenging (Lewnard et al., 2019) .

Let VE TND i,m be the variant-specific effectiveness of vaccine m against variant i of the pathogen of interest in a TND study. It is defined in terms of the ratio of the odds of being vaccinated with vaccine m among the subjects infected with variant i, and the same odds in subjects infected with a non-target pathogen. Specifically,
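In terms of expected counts, this odds ratio definition can be sketched as follows. It is a reconstruction; N i,m and N i,0 are assumed symbols for the numbers of vaccinated and unvaccinated cases of variant i, by analogy with the control counts N Ω,m and N Ω,0:

```latex
\begin{align}
\mathrm{VE}^{TND}_{i,m} = 1 - \frac{E(N_{i,m})/E(N_{i,0})}{E(N_{\Omega,m})/E(N_{\Omega,0})}. \tag{10}
\end{align}
```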

This expression can be further expanded for both inclusive and incidence density sampling designs using (8) and (9), respectively. We then obtain,
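Substituting the expected counts and cancelling the care-seeking and symptom probabilities that are common to cases and controls, the result can be sketched as below. This is a reconstruction that assumes vaccination status is independent of non-target infection, so that P(Z = 1, S = 1, V = m | C = Ω) factorizes as P(Z = 1 | S = 1, V = m) P(S = 1 | C = Ω) P(V = m):

```latex
\begin{align*}
\mathrm{VE}^{TND}_{i,m} =
\begin{cases}
1 - \dfrac{P(C = i \mid V = m)}{P(C = i \mid V = 0)} = \mathrm{VE}^{crr}_{i,m}, & \text{inclusive sampling}, \\[2ex]
1 - \dfrac{P(C = i \mid V = m)/E(Y \mid V = m)}{P(C = i \mid V = 0)/E(Y \mid V = 0)} = \mathrm{VE}^{irr}_{i,m}, & \text{incidence density sampling}.
\end{cases}
\end{align*}
```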

Thus, the temporal properties of the VE measures VE crr i,m and VE irr i,m described in Section 4 also hold in TND studies. Among these, a peculiar one is that when a vaccine has an all-or-none action, then irrespective of the sampling design VE TND i,m will always be time-dependent (Section 4.2). This is in contrast to the findings of Lewnard et al. (2018) in TND studies with a single variant. The only scenario in which VE TND i,m is time-invariant is when the vaccine has a leaky action and an incidence density sampling design is used.
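The correspondence between the sampling design and the VE measure can be checked numerically for a leaky vaccine. The sketch below assumes constant rates, care seeking that depends only on vaccination status given symptoms, and vaccination status that is independent of non-target infection. All function names and parameter values (care-seeking probabilities, coverage, rates) are hypothetical.

```python
import math


def leaky_cells(lams, thetas, t):
    """P(C = 0 | V), [P(C = i | V)], and E(Y | V) for one arm with constant rates."""
    rates = [th * lam for th, lam in zip(thetas, lams)]
    total = sum(rates)
    p_control = math.exp(-total * t)
    p_case = [r / total * (1.0 - p_control) for r in rates]
    return p_control, p_case, (1.0 - p_control) / total


def tnd_ve(lams, thetas, i, t, sampling, lam_omega=0.1,
           seek_vacc=0.8, seek_unvacc=0.5, p_vacc=0.6, n=100_000):
    """1 - (vaccination odds in variant-i cases) / (odds in test-negative controls)."""
    _, ci_v, y_v = leaky_cells(lams, thetas, t)
    _, ci_p, y_p = leaky_cells(lams, [1.0] * len(lams), t)
    # Expected cases; P(S = 1 | C = i) is shared by both arms and cancels,
    # so it is left out of this sketch.
    cases_v = n * seek_vacc * ci_v[i] * p_vacc
    cases_u = n * seek_unvacc * ci_p[i] * (1 - p_vacc)
    # Time during which a non-target infection can be sampled as a control.
    if sampling == "inclusive":
        time_v = time_u = t
    elif sampling == "density":
        time_v, time_u = y_v, y_p
    else:
        raise ValueError("sampling must be 'inclusive' or 'density'")
    controls_v = n * lam_omega * time_v * seek_vacc * p_vacc
    controls_u = n * lam_omega * time_u * seek_unvacc * (1 - p_vacc)
    return 1 - (cases_v / cases_u) / (controls_v / controls_u)


lams, thetas = [0.02, 0.05], [0.4, 0.7]   # hypothetical; true VE of 0.6 for variant 0
for t in (1, 10, 50):
    print(t, round(tnd_ve(lams, thetas, 0, t, "inclusive"), 4),
          round(tnd_ve(lams, thetas, 0, t, "density"), 4))
```

The care-seeking probabilities and the vaccination coverage drop out of the ratio, so changing them leaves both columns unchanged; only the choice of control sampling determines whether the cumulative-risk or the incidence-rate measure is recovered.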

Let the relative VE of a vaccine m against two variants i and j be denoted by VE TND ij,m . Using (10) and the relative VE definitions from Section 2 we can show that VE TND ij,m = VE crr ij,m for inclusive sampling and VE TND ij,m = VE irr ij,m for incidence density sampling. In Section 4 we have shown that both VE crr ij,m and VE irr ij,m are time-invariant for leaky vaccines and time-dependent for all-or-none vaccines. Consequently, irrespective of the sampling design, VE TND ij,m is time-invariant for leaky vaccines but not for all-or-none vaccines.

Let the relative VE of two vaccines m and n against a variant i be VE TND i,mn . Using (10) and the relative VE definitions from Section 2 we can show that VE TND i,mn = VE crr i,mn for inclusive sampling and VE TND i,mn = VE irr i,mn for incidence density sampling. Based on the properties of VE crr i,mn and VE irr i,mn , VE TND i,mn is time-invariant only for leaky vaccines, and only if incidence density sampling is used.

An important problem in effectiveness studies is sample size calculation during the study planning phase. In this regard, investigators may face two types of challenges. The first is having enough subjects to show a non-zero VE, for which an appropriate sample size can be obtained using the hypothesis testing framework. The second is having enough subjects to obtain precise estimates of VE to further assist in decisions pertaining to the various phases of the development of a vaccine. In statistical terms, precision (Kelley et al., 2003) refers to the width of the confidence interval of VE. In general, narrower confidence intervals can be obtained with larger sample sizes. While the actual confidence limits depend upon the actual data, confidence intervals can also be simulated using Monte Carlo simulations, and a range for both the upper and lower confidence limits can be obtained. To resolve the aforementioned challenges and to meet the requirements of the DRIVE project (Section 1), we developed a sample size calculator. It is available at https://apps.p-95.com/drivesamplesize/, and the underlying calculations are also implemented in an R package available at https://github.com/anirudhtomer/vess. It is important to note that for VE based on incidence rate ratio, our R package currently supports only leaky-action vaccines.

When the aim is to find a minimum detectable VE for a given sample size, we utilize the framework of hypothesis testing. The null hypothesis is that VE = 0 and the alternative hypothesis is that VE ≠ 0. In cohort studies with a given sample size n, when the interest is in variant-specific or relative VE measures based on the ratio of cumulative-risks, we utilize the methodology proposed by Woodward (2013) to find the minimum detectable VE. When the interest is in variant-specific or relative VE measures based on the incidence rate ratio, we use the methodology proposed by Lwanga et al. (1991). Both methodologies require vaccine coverage and the number of study subjects. However, these methodologies are only suitable for a single vaccine and single variant scenario. To use them in our multi-variant and multi-vaccine scenario, we recalculate vaccine coverage in terms of only the subjects that are used in a particular calculation. For example, for variant-specific VE of vaccine m against variant i, we define coverage as {P(C = i, V = m) + P(C = 0, V = m)}/{P(C = i, V = m) + P(C = 0, V = m) + P(C = i, V = 0) + P(C = 0, V = 0)}. The total subjects are calculated as n{P(C = i, V = m) + P(C = 0, V = m) + P(C = i, V = 0) + P(C = 0, V = 0)}. That is, we ignore all subjects that are vaccinated with vaccines other than vaccine m or are infected with any variant other than variant i. Similarly, for relative VE of two vaccines m and n against a variant i, we ignore all subjects that are unvaccinated, are vaccinated with vaccines other than vaccines m and n, or are infected with any variant other than variant i.
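The coverage and total-subject recalculation described above for variant-specific VE in a cohort study can be sketched as follows. This is a minimal illustration, not the code of the R package: the function name `restricted_cohort_inputs`, the dictionary layout, and the example probabilities are all hypothetical.

```python
def restricted_cohort_inputs(n, joint, variant, vaccine):
    """Recompute coverage and subject count for one variant-vaccine pair.

    joint maps (c, v) to P(C = c, V = v), where c = 0 means uninfected
    and v = 0 means unvaccinated. Subjects with any other variant or any
    other vaccine are ignored, as described in the text.
    """
    kept_vaccinated = joint[(variant, vaccine)] + joint[(0, vaccine)]
    kept_unvaccinated = joint[(variant, 0)] + joint[(0, 0)]
    kept = kept_vaccinated + kept_unvaccinated
    coverage = kept_vaccinated / kept   # coverage among retained subjects
    subjects = n * kept                 # expected number of retained subjects
    return coverage, subjects


# Hypothetical joint distribution: 2 variants (1, 2) and 2 vaccines (1, 2).
joint = {
    (0, 0): 0.30, (0, 1): 0.35, (0, 2): 0.20,
    (1, 0): 0.05, (1, 1): 0.02, (1, 2): 0.02,
    (2, 0): 0.03, (2, 1): 0.02, (2, 2): 0.01,
}

coverage, subjects = restricted_cohort_inputs(1000, joint, variant=1, vaccine=1)
print(round(coverage, 4), round(subjects, 1))
```

The recalculated coverage and subject count would then be passed to a single-vaccine, single-variant method such as those cited above.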

For a classic case-control study and a TND study (with inclusive sampling), let the total cases be x, and the controls per case be r. Thus, the total controls are given by rx. The measures of interest are those based on the ratio of cumulative-risks, namely variant-specific VE crr i,m or the relative VE crr i,mn . These measures are estimated using the odds ratio (Section 5). Here, to calculate the minimum detectable VE we employ the methodology of Dupont (1988). However, this methodology is only suitable for a single vaccine and single variant scenario. To use it in our multi-variant and multi-vaccine scenario, we recalculate controls per case in terms of only the subjects that are used in a particular calculation. For example, for variant-specific VE of vaccine m against variant i, we first define total cases for the variant-specific VE as x{P(V = m, C = i | C ≠ 0) + P(V = 0, C = i | C ≠ 0)}. Here, conditioning on C ≠ 0 indicates that these probabilities are relevant only for cases. The total controls are then given by x × r{P(V = m | C = 0) + P(V = 0 | C = 0)}. Subsequently, we recalculate the controls per case using the aforementioned new totals for controls and cases.
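The recalculation of controls per case described above can be sketched in the same spirit. The function name `restricted_case_control_inputs` and the example probabilities are hypothetical, and the case probabilities are taken to be conditional on being a case, while the control probabilities are conditional on being a control.

```python
def restricted_case_control_inputs(x, r, case_probs, control_probs,
                                   variant, vaccine):
    """Recompute cases, controls, and controls per case for one pair.

    case_probs maps (i, v) to the probability of being a case of variant i
    with vaccination status v, among cases.
    control_probs maps v to the probability of vaccination status v,
    among controls. v = 0 means unvaccinated.
    """
    cases = x * (case_probs[(variant, vaccine)] + case_probs[(variant, 0)])
    controls = x * r * (control_probs[vaccine] + control_probs[0])
    return cases, controls, controls / cases


# Hypothetical inputs: 2 variants, 2 vaccines, 500 cases, 2 controls per case.
case_probs = {
    (1, 0): 0.30, (1, 1): 0.15, (1, 2): 0.15,
    (2, 0): 0.20, (2, 1): 0.10, (2, 2): 0.10,
}
control_probs = {0: 0.4, 1: 0.4, 2: 0.2}

cases, controls, controls_per_case = restricted_case_control_inputs(
    500, 2, case_probs, control_probs, variant=1, vaccine=1
)
print(cases, controls, round(controls_per_case, 3))
```

Note that the recalculated controls per case can differ substantially from the nominal r, because the cases are restricted by both variant and vaccine while the controls are restricted by vaccine only.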

When the aim is to find the expected precision of VE, we use a simulation-based approach. Specifically, in cohort studies, in each simulation round we first sample the counts n i,m of subjects infected with variant i and vaccinated with vaccine m. These counts are sampled from a multinomial distribution with cell probabilities P(C = i, V = m), i.e., the probabilities of being infected by variant i (or being uninfected, C = 0) and being vaccinated with vaccine m (or placebo, V = 0). Subsequently, the overall sample size n is given by the sum of all cell counts,

\[ n = \sum_{i} \sum_{m} n_{i,m}, \]

where the sums run over all infection statuses (including C = 0) and all vaccination statuses (including V = 0).

When the VE measures based on incidence rate ratio, namely variant-specific VE irr i,m and the relative VE irr i,mn , are of interest, then in each simulation round we also sample the person-time contributions Y | V = m for each vaccine and Y | V = 0 for placebo subjects from an exponential distribution. Among the placebo subjects, the rate parameter of this exponential distribution is simply the incidence rate of infection λ i , whereas in subjects vaccinated with vaccine m the corresponding rate parameter is θ i,m λ i . In this regard, both θ i,m and λ i are needed as input parameters. To subsequently estimate variant-specific and relative VE we use the sampled data n i,m and Y | V = m to estimate incidence rate ratios. The standard errors and confidence intervals are estimated using the normal approximation to the distribution of the log incidence rate ratio (Halloran et al., 2010). For VE measures based on the ratio of cumulative-risks we only use the sampled data n i,m , and for estimating the standard error and confidence intervals we use the methodology of Morris and Gardner (1988). In each simulation round, we adjust standard errors for potential confounders using the approach of Hsieh and Lavori (2000). The confidence intervals obtained over multiple such simulations are then averaged to find the expected upper and expected lower limits of the confidence interval. The simulation approach to calculate precision is not currently used by existing calculators, which instead provide the confidence interval from a single cross table of infection and vaccination status, with cell values E{n i,m } = nP(C = i, V = m). Such a confidence interval does not represent the expected precision but rather represents precision for expected cell counts, thus ignoring the randomness of the data generating process.
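The simulation idea for cohort studies can be illustrated with a small sketch for a cumulative-risk-based, variant-specific VE. Several choices here are assumptions made for illustration and are not taken from the R package: the function name `simulate_ve_crr_precision`, the use of all subjects with a given vaccination status as the risk denominator, the standard log-scale normal approximation for the risk ratio, and the omission of the confounder adjustment.

```python
import math
import random
from collections import Counter


def simulate_ve_crr_precision(n, joint, variant, vaccine,
                              rounds=200, z=1.96, seed=1):
    """Average lower and upper confidence limits of VE over simulated cohorts.

    joint maps (c, v) to P(C = c, V = v); c = 0 is uninfected, v = 0 is
    unvaccinated. Each round draws n subjects, i.e. one multinomial sample.
    """
    rng = random.Random(seed)
    cells = list(joint)
    weights = [joint[cell] for cell in cells]
    lowers, uppers = [], []
    for _ in range(rounds):
        counts = Counter(rng.choices(cells, weights=weights, k=n))
        a = counts[(variant, vaccine)]   # vaccinated cases of the variant
        c = counts[(variant, 0)]         # unvaccinated cases of the variant
        n1 = sum(k for (_, v), k in counts.items() if v == vaccine)
        n0 = sum(k for (_, v), k in counts.items() if v == 0)
        if a == 0 or c == 0:
            continue  # log risk ratio undefined; skip this round
        log_rr = math.log((a / n1) / (c / n0))
        se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
        # VE = 1 - risk ratio, so the limits swap when transformed.
        lowers.append(1 - math.exp(log_rr + z * se))
        uppers.append(1 - math.exp(log_rr - z * se))
    return sum(lowers) / len(lowers), sum(uppers) / len(uppers)


# Hypothetical joint distribution: 2 variants (1, 2) and 2 vaccines (1, 2).
joint = {
    (0, 0): 0.30, (0, 1): 0.35, (0, 2): 0.20,
    (1, 0): 0.05, (1, 1): 0.02, (1, 2): 0.02,
    (2, 0): 0.03, (2, 1): 0.02, (2, 2): 0.01,
}
true_ve = 1 - (0.02 / 0.39) / (0.05 / 0.38)  # VE implied by the inputs

lower, upper = simulate_ve_crr_precision(5000, joint, variant=1, vaccine=1)
print(round(lower, 3), round(true_ve, 3), round(upper, 3))
```

Averaging the limits over rounds, rather than computing one interval from expected cell counts, is what lets the result reflect the randomness of the data generating process.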

For case-control and TND studies (with inclusive sampling), the total number of cases x and the controls per case r are required as input parameters. In each simulation round we then sample the total cases as

\[ x = \sum_{i} \sum_{m} x_{i,m}. \]

Here, x i,m are the cases sampled from the multinomial distribution P(V = m, C = i | C ≠ 0) of the probabilities of being vaccinated by vaccine m and being a case of variant i among the cases. Similarly, a total of rx controls are also sampled from the multinomial distribution P(V = m | C = 0). Subsequently, we utilize the sample odds ratio as the estimator of VE (Section 5). The standard error and confidence interval are obtained using the approach of Morris and Gardner (1988). The rest of the simulation approach remains similar to that for cohort studies.
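A corresponding sketch for case-control and TND studies with inclusive sampling is given below. As before, this is an illustration under stated assumptions rather than the package implementation: the function name `simulate_ve_or_precision` and the inputs are hypothetical, the standard error uses the common log odds ratio approximation (square root of the sum of reciprocal cell counts), and no confounder adjustment is applied.

```python
import math
import random
from collections import Counter


def simulate_ve_or_precision(x, r, case_probs, control_probs,
                             variant, vaccine, rounds=200, z=1.96, seed=1):
    """Average confidence limits of VE = 1 - odds ratio over simulations.

    case_probs maps (i, v) to the probability of being a case of variant i
    with vaccination status v, among cases. control_probs maps v to the
    probability of vaccination status v among controls.
    """
    rng = random.Random(seed)
    case_cells = list(case_probs)
    case_w = [case_probs[cell] for cell in case_cells]
    ctrl_cells = list(control_probs)
    ctrl_w = [control_probs[cell] for cell in ctrl_cells]
    lowers, uppers = [], []
    for _ in range(rounds):
        cases = Counter(rng.choices(case_cells, weights=case_w, k=x))
        ctrls = Counter(rng.choices(ctrl_cells, weights=ctrl_w, k=r * x))
        a, b = cases[(variant, vaccine)], cases[(variant, 0)]
        c, d = ctrls[vaccine], ctrls[0]
        if min(a, b, c, d) == 0:
            continue  # odds ratio undefined; skip this round
        log_or = math.log((a / b) / (c / d))
        se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        lowers.append(1 - math.exp(log_or + z * se))
        uppers.append(1 - math.exp(log_or - z * se))
    return sum(lowers) / len(lowers), sum(uppers) / len(uppers)


# Hypothetical inputs: 2 variants, 2 vaccines, 1000 cases, 2 controls per case.
case_probs = {
    (1, 0): 0.30, (1, 1): 0.15, (1, 2): 0.15,
    (2, 0): 0.20, (2, 1): 0.10, (2, 2): 0.10,
}
control_probs = {0: 0.4, 1: 0.4, 2: 0.2}
true_ve = 1 - (0.15 / 0.30) / (0.4 / 0.4)  # VE implied by the inputs

lower, upper = simulate_ve_or_precision(
    1000, 2, case_probs, control_probs, variant=1, vaccine=1
)
print(round(lower, 3), round(true_ve, 3), round(upper, 3))
```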

In Section 4 we observed that various VE measures can be time-invariant or time-dependent. However, the degree of time-dependency varies with the vaccine action, the length of the study period t, and the force of infection of the variants λ i . The temporal properties of relative VE may also depend on the relative overall effectiveness of brands Θ m /Θ n . To assist practitioners in quantitatively assessing the amount of time-dependency they should expect, we implemented the variant-specific and relative VE definitions for both leaky and all-or-none vaccines in our R package available at https://github.com/anirudhtomer/vess.
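To give a feel for the degree of time-dependency, the following sketch evaluates VE measures for a leaky vaccine under constant incidence rates and competing variants. It is not the package code. It assumes the leaky model used in this paper (rate θ i,m λ i among the vaccinated), the standard constant-rate competing-risks formula for the cumulative risk, and a relative VE defined as one minus the ratio of the two variant-specific risk ratios; the function names and parameter values are hypothetical.

```python
import math


def cumulative_risk(rates, variant, t):
    """P(infected by `variant` within [0, t]) under constant competing rates."""
    total = sum(rates.values())
    return rates[variant] / total * (1 - math.exp(-total * t))


def leaky_ve(lam, theta, variant, t):
    """Return (VE based on rate ratio, VE based on cumulative-risk ratio)."""
    vaccinated = {i: theta[i] * lam[i] for i in lam}
    ve_irr = 1 - theta[variant]  # ratio of constant rates, free of t
    ve_crr = 1 - (cumulative_risk(vaccinated, variant, t)
                  / cumulative_risk(lam, variant, t))
    return ve_irr, ve_crr


def leaky_relative_ve_crr(lam, theta, i, j, t):
    """Relative VE of one vaccine against variants i and j (assumed form)."""
    vaccinated = {k: theta[k] * lam[k] for k in lam}
    rr_i = cumulative_risk(vaccinated, i, t) / cumulative_risk(lam, i, t)
    rr_j = cumulative_risk(vaccinated, j, t) / cumulative_risk(lam, j, t)
    return 1 - rr_i / rr_j


lam = {1: 0.2, 2: 0.1}      # hypothetical incidence rates per unit time
theta = {1: 0.3, 2: 0.6}    # hypothetical leaky rate multipliers

for t in (0.1, 1.0, 5.0):
    ve_irr, ve_crr = leaky_ve(lam, theta, 1, t)
    rel = leaky_relative_ve_crr(lam, theta, 1, 2, t)
    print(t, round(ve_irr, 4), round(ve_crr, 4), round(rel, 4))
```

In this setup the cumulative-risk-based VE drifts with the length of the study period, while the rate-based VE and the relative VE against two variants stay fixed, which matches the qualitative statements made for leaky vaccines.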

In this paper, we defined and evaluated the temporal properties of measures of the effectiveness of vaccines in the presence of multiple pathogen variants and multiple vaccines. We focused on cohort and test-negative design (TND) studies. The multi-variant and multi-vaccine scenario poses unique challenges because the variants act as competing events (Putter et al., 2007). We worked on two challenges. The first was defining variant-specific VE measures and measures of relative VE of vaccines, and evaluating the conditions and the extent to which a VE measure can be time-varying. The second was conducting sample size calculations in a multi-variant and multi-vaccine scenario. We concentrated on VE measures based on incidence rate ratio (VE irr ), cumulative-risk ratio (VE crr ), and odds ratio (VE or ).

The issue that some measures of VE can be time-varying over the study period depending on the vaccine's mode of action has been discussed by Smith et al. (1984), albeit only in a single variant and single vaccine scenario. In the presence of multiple variants and vaccines, the variant-specific and relative VE of vaccines can both be time-dependent. In this regard, we show that when a vaccine has an all-or-none action, all measures (VE irr , VE crr , VE or ) of variant-specific VE of a vaccine are time-dependent. This is in contrast to the findings of Smith et al. (1984) and Lewnard et al. (2018), who showed that VE crr is time-invariant for all-or-none vaccines when there is only one variant and one vaccine. Also, for all-or-none vaccines, every measure of relative VE of a vaccine against any two variants, and every measure of relative VE of two vaccines against a particular variant, is time-dependent. However, our findings for leaky vaccines are similar to those of Smith et al. (1984). Specifically, VE measures based on the incidence rate ratio are time-invariant for leaky vaccines. In addition, all measures of relative VE of a vaccine against any two variants are also time-invariant. We evaluated the temporal properties of vaccine effectiveness for both cohort and TND studies. For TND studies, extending the work of Lewnard et al. (2018) and Dean (2019) to variant-specific VE and relative VE, we showed that the odds ratio from TND studies estimates VE crr when inclusive sampling is used and VE irr when incidence density sampling is used. Thus, the temporal properties of VE crr and VE irr also hold in TND studies.

Our findings are relevant for practitioners. First, practitioners will benefit from a shorter study period for vaccines known to have an all-or-none action. This is because, for all-or-none vaccines, the variant-specific VE measures can be approximately time-invariant when the study period is short. In general, for both leaky and all-or-none vaccines, when the disease incidence rate is low, a larger cohort of subjects followed up for a shorter period may provide estimates that are closer to time-invariant than a smaller cohort followed up for a more extended period. In practice, VE may vary due to a varying incidence rate of infection, waning VE, or time-dependence of the chosen VE measure. Consequently, discerning the cause can be difficult when VE varies over time. In this regard, the time-dependence of a VE measure also depends on the vaccine action. Practitioners may utilize the measure for the relative VE of a vaccine against any two variants to identify the vaccine action. This is because this relative VE remains time-invariant for leaky vaccines but not for all-or-none action vaccines. Besides, the relative VE of a vaccine against two variants is time-invariant for leaky vaccines even when the incidence rate of infection changes over time.

To assist practitioners in evaluating the extent to which a chosen measure of VE can be time-dependent and to help them in finding an appropriate sample size, we created an R package. The package allows sample size calculations for two situations: hypothesis testing and aiming for a certain precision of the VE estimate. Our package has four key features not available in current calculators. First, we support the multi-variant and multi-vaccine scenario, which is highly relevant for influenza and the current COVID-19 pandemic. The package supports both the cohort design and TND studies. Second, along with sample size calculations for variant-specific VE, we also implemented calculations for relative VE. Third, we adjust the sample size calculations for potential confounders using the methodology proposed by Hsieh and Lavori (2000). Last, for the precision of VE, we utilize simulations rather than the expected value of cell counts.

A limitation of our work is that we assumed constant incidence rates of infection, which may be difficult to justify in practice, especially over more extended study periods. However, our motivation for choosing constant rates of infection was to show that some measures of VE vary over time even when the incidence rates of infection are constant. Our work also does not include any data-based illustrations, although practitioners may use our R package to evaluate the extent to which their selected VE measures are time-varying. We also did not focus on VE measures based on the hazard ratio. Given the competing-event situation with multiple variants, discussing competing risk models and the cause-specific hazard ratio as a measure of VE would have been interesting. We also assumed that infection with a particular variant inhibits infection from other pathogen variants over the study period. Working under a general framework wherein reinfections are allowed will be a natural extension of this work.

Suppose that, given a certain vaccination status V, a subject is prone to infection by a set of variants I = {1, . . . , I}. The combined force of infection by all variants is given by Λ = Σ_{i∈I} λ_i. In this situation the subjects who do not get infected by any variant over the study period [0, t] are control subjects. For these subjects the probability of being a control is the same as in Putter et al. (2007, Equation 11). With constant rates, it is given by,

\[ P(C = 0 \mid V) = \exp(-\Lambda t). \]

Following standard survival analysis methods, the expected person-time contribution of a subject E(Y | V ) is equal to the restricted mean survival time (Zhao et al., 2016). This restricted mean survival time is defined as the area under the probability of being a control over the study period [0, t] (Zhao et al., 2016). Thus, under the constant rates assumed in this work, E(Y | V ) can be defined as,

\[ E(Y \mid V) = \int_0^t \exp(-\Lambda s)\, ds = \frac{1 - \exp(-\Lambda t)}{\Lambda}. \]

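Lastly, the probability of being a case of a certain variant i is the same as in Putter et al. (2007, Equation 12). Under the constant rates assumed in this work, it is given by,

\[ P(C = i \mid V) = \int_0^t \lambda_i \exp(-\Lambda s)\, ds = \frac{\lambda_i}{\Lambda}\{1 - \exp(-\Lambda t)\}. \]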
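The three quantities described in this appendix (probability of being a control, expected person-time, and probability of being a case of a given variant) can be checked numerically under constant rates. The sketch below uses the standard closed forms for constant competing hazards; the function names and rate values are hypothetical, and the trapezoidal rule is used only as an independent check of the restricted mean survival time.

```python
import math


def control_probability(rates, t):
    """P(C = 0 | V): no infection by any variant during [0, t]."""
    return math.exp(-sum(rates.values()) * t)


def expected_person_time(rates, t):
    """E(Y | V): area under the control probability over [0, t]."""
    total = sum(rates.values())
    return (1 - math.exp(-total * t)) / total


def case_probability(rates, variant, t):
    """P(C = i | V): infection by `variant` before any competing variant."""
    total = sum(rates.values())
    return rates[variant] / total * (1 - math.exp(-total * t))


def trapezoid(f, a, b, steps=2000):
    """Plain trapezoidal rule for a numerical cross-check."""
    h = (b - a) / steps
    inner = sum(f(a + k * h) for k in range(1, steps))
    return h * (0.5 * f(a) + inner + 0.5 * f(b))


rates = {1: 0.2, 2: 0.1}   # hypothetical variant-specific rates, given V
t = 2.0

p_control = control_probability(rates, t)
p_cases = {i: case_probability(rates, i, t) for i in rates}
person_time = expected_person_time(rates, t)
numeric_person_time = trapezoid(lambda s: control_probability(rates, s), 0.0, t)

print(round(p_control, 4), {i: round(p, 4) for i, p in p_cases.items()})
print(round(person_time, 4), round(numeric_person_time, 4))
```

Two sanity checks follow from the construction: the control and case probabilities sum to one, and the closed-form person-time agrees with the numerical area under the control probability.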

Here, S m (s) = P(C = 0 | V = m) denotes the probability of being a control at time s among subjects vaccinated with vaccine m, and S 0 (s) = P(C = 0 | V = 0) denotes the probability of being a control at time s among placebo subjects.

Using (4) and (6), the relative VE of vaccine m against any two variants i and j is obtained as,

Effectiveness of the BNT162b2 Covid-19 vaccine against the B.1.1.7 and B.1.351 variants

Statistics in epidemiology: the case-control study

Re: "Measurement of vaccine direct effects under the test-negative design"

Temporal confounding in the test-negative design

Power calculations for matched case-control studies

Comparison of competing risks failure time methods and time-independent methods for assessing strain variations in vaccine protection

Statistical methods for assessing differential vaccine protection against human immunodeficiency virus types

The innovative medicines initiative: a european response to the innovation challenge

The statistics of anti-typhoid and anti-cholera inoculations, and the interpretation of such statistics in general

Will sars-cov-2 variants of concern affect the promise of vaccines?

Development of sanofi pasteur tetravalent dengue vaccine

Design and analysis of vaccine studies

Sample-size calculations for the cox proportional hazards regression model with nonbinary covariates

Obtaining power or obtaining precision: Delineating methods of sample-size planning. Evaluation & the health professions

Measurement of vaccine direct effects under the test-negative design

The authors reply

Effectiveness of Covid-19 vaccines against the B.1.617.2 (Delta) variant

Sample size determination in health studies: a practical manual

Efficacy of the ChAdOx1 nCoV-19 Covid-19 vaccine against the B.1.351 variant

Statistics in medicine: Calculating confidence intervals for relative risks (odds ratios) and standardised ratios and rates

Tutorial in biostatistics: competing risks and multi-state models

2015/16 seasonal vaccine effectiveness against hospitalisation with influenza a (h1n1) pdm09 and b among elderly people in europe: results from the i-move+ project

Efficacy of NVX-CoV2373 Covid-19 vaccine against the B.1.351 variant

Assessment of the protective efficacy of vaccines against common diseases using case-control and cohort studies

Vaccine effectiveness against laboratory-confirmed influenza in europe-results from the drive network during season

Case-control studies: basic concepts

Epidemiology: study design and data analysis

Coronavirus disease 2019 (covid-19): situation report, 73

On the restricted mean survival time curve in survival analysis

Coronavirus vaccine tracker

We would like to express our gratitude to Paddy Farrington for sharing ideas during the initial development of the manuscript, for the subsequent review, and for discussing limitations of this work. We are grateful to Jos Nauta from Abbott Laboratories for reviewing our manuscript and recommending important changes in some of the terminology that we used. Lastly, we would like to thank Kaatje Bollaerts and Anke Stuurman from P95 for coordinating and discussing the epidemiological aspects of this project.

The DRIVE project has received funding from the Innovative Medicines Initiative (IMI) 2 Joint Undertaking under grant agreement No 777363. This Joint Undertaking receives support from the European Union's Horizon 2020 research and innovation programme and the European Federation of Pharmaceutical Industries and Associations (EFPIA). The IMI is a joint initiative (public-private partnership) of the European Commission and EFPIA to improve the competitive situation of the European Union in the field of pharmaceutical research. The IMI provided support in the form of salaries for Anirudh Tomer and Jorne Biccler but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.