key: cord-0507858-zkka374w
authors: Ishihara, Takuya; Kitagawa, Toru
title: Evidence Aggregation for Treatment Choice
date: 2021-08-14
journal: nan
DOI: nan
sha: 7099908e94f2617dd8a2887af9205768d7b3218a
doc_id: 507858
cord_uid: zkka374w

Consider a planner who has to decide whether or not to introduce a new policy to a certain local population. The planner has only limited knowledge of the policy's causal impact on this population due to a lack of data but does have access to the publicized results of intervention studies performed for similar policies on different populations. How should the planner make use of and aggregate this existing evidence to make her policy decision? Building upon the paradigm of 'patient-centered meta-analysis' proposed by Manski (2020; Towards Credible Patient-Centered Meta-Analysis, Epidemiology), we formulate the planner's problem as a statistical decision problem with a social welfare objective pertaining to the local population, and solve for an optimal aggregation rule under the minimax-regret criterion. We investigate the analytical properties, computational feasibility, and welfare regret performance of this rule. We also compare the minimax regret decision rule with plug-in decision rules based upon a hierarchical Bayes meta-regression or stylized mean-squared-error optimal prediction. We apply the minimax regret decision rule to two settings: whether to enact an active labor market policy given evidence from 14 randomized control trial studies; and whether to approve a drug (Remdesivir) for COVID-19 treatment using a meta-database of clinical trials.

An increasing number of policy-making authorities are interested in making their policy decisions evidence-based. In evidence-based decision-making, it is crucial for a planner to acquire credible evidence of a policy's causal impact on the affected population. Obtaining credible evidence for better policy decision-making is, however, challenging in many contexts. For instance, although randomized control trials (RCTs) are considered to be ideal for obtaining evidence of the causal impact of a policy, conducting an RCT can be costly in terms of budget, time, or administrative resources. Moreover, ethical or legal constraints can prevent the use of an RCT in certain institutional environments. In contrast, observational data can be both more accessible and easier to collect, but the credibility of any resulting causal estimates is limited if the validity of these estimates relies on restrictive identifying assumptions. In scenarios where the planner faces difficulties in collecting direct evidence, a practical alternative is to analyze the publicized results of intervention studies performed for similar policies on different populations. With this approach in mind, how should the planner make use of and aggregate existing evidence to reach her policy decision?

Statistical methodologies to aggregate evidence from multiple studies have been considered in the literature of meta-analysis and research synthesis. See, for instance, Hedges and Olkin (1985) and a recent handbook volume, Cooper, Hedges, and Valentine (2019). Since the seminal works of Rubin (1981) and DerSimonian and Laird (1986), a common approach to aggregation of evidence is the hierarchical Bayesian approach, in which the typical objective of analysis is to infer hyper-parameters indexing the population of studies. This framework of meta-analysis is useful for "summarizing what has been learned and quantifying how results differ across the studies beyond the sampling error" (DerSimonian and Laird, 2015). Its use is, however, limited when it comes to the planner's policy choice because the output of meta-analysis mainly concerns the population of studies rather than the particular population that is of interest to the planner. This point is made in Manski (2020):

"Clinicians need to assess risks and choose treatments for populations of patients, not population of studies. To express this distinction succinctly, I will say that clinicians should want meta-analysis to be patient-centered rather than study-centered."

We pursue this paradigm of 'patient-centered meta-analysis' to develop a method to aggregate existing studies for the purpose of making an optimal treatment decision on the local population that is of interest to the planner (hereafter, the target population). Building on the framework of statistical treatment choice proposed by Manski (2000, 2004), we formulate the planner's problem as a statistical decision problem à la Wald (1950). The basic formulation of the decision problem analyzed in this paper is as follows. Let $\tau_0$ be the average welfare effect of introducing a new policy to the target population. There is no data from which the planner can directly infer $\tau_0$, but she does have access to the results of existing intervention or observational studies that are indexed by $k = 1, 2, \dots, K$, $K \ge 1$. Each study $k$ reports a point estimate $\hat\tau_k$ for the average welfare effect $\tau_k$ in the study population, and an associated estimate $\hat\sigma_k$ of the standard error. We allow the study population to be different from the target population, so that the average welfare effects can differ, i.e., $\tau_k \neq \tau_{k'}$ for $k \neq k'$, $0 \le k, k' \le K$. The planner's decision problem, which we solve in this paper, is whether or not to adopt the new policy for the target population upon observing a meta-sample, $(\hat\tau_k, \hat\sigma_k)$, $k = 1, \dots, K$. That is, the statistical treatment choice rule we consider in this paper is a function $\hat\delta$ that maps the meta-sample to the binary choice of whether to adopt the policy or not.

Following Manski (2004, 2007), Stoye (2009, 2012), and Tetenov (2012), we apply the minimax regret criterion of Savage (1951) to obtain a minimax-regret treatment choice rule for the planner. We assume that the planner's objective function (social welfare function) is linear in $\tau_0$ and consider the class of non-randomized statistical treatment choice rules that select the treatment based on the sign of a linear aggregation of $(\hat\tau_k : k = 1, \dots, K)$:

$$\hat\delta_w(D) = 1\Big\{ \sum_{k=1}^{K} w_k \hat\tau_k \ge 0 \Big\},$$

where $w = (w_1, \dots, w_K)'$ is a vector of weights assigned to each estimate in the pool of studies, which does not depend on the data. Assuming a Gaussian sampling distribution for $(\hat\tau_1, \dots, \hat\tau_K)$ with known variances and imposing certain symmetry and invariance conditions on the parameter space for $(\tau_0, \tau_1, \dots, \tau_K)$, we derive the aggregation weights $w_{minimax}$ leading to a minimax-regret treatment choice rule. Analytical characterization and computation of the exact minimax regret rule often become challenging in the context of statistical treatment choice. Our approach to the planner's minimax regret aggregation rule, in contrast, overcomes these challenges by showing that some mild restrictions on the parameter space and the class of decision rules deliver analytically and computationally tractable minimax regret rules.

We assert that the perspective and tools of statistical decision theory are particularly appealing in the meta-analysis setting for the following reasons. First, if each study in the pool reports a consistent estimate using a sample of moderate to large size (e.g., the difference-in-means estimator for the average treatment effect) then, by its asymptotic normality, it is plausible to assume that $\hat\tau_k$ follows a Gaussian distribution centered at $\tau_k$. Hence, the standard and well-studied framework of Gaussian experiments fits well to the current meta-analysis setting. Second, it is common for the meta-sample to consist of only a small number of studies. In such instances, asymptotic analysis with $K \to \infty$ can be misleading, and deriving finite-$K$ optimal procedures, which statistical decision theory is particularly suitable for, is desirable.

As an alternative to the minimax regret treatment choice rule, one could consider using a plug-in rule that chooses the treatment according to the sign of an estimate of $\tau_0$. The plug-in rule that uses a minimax mean squared error (MSE) optimal estimate of $\tau_0$ is an example. Minimax-MSE estimation for finite-dimensional Gaussian mean models is well-studied and the minimax-MSE weights $w_{MSE}$ are simple to compute, although the resulting plug-in decision rule does not generally possess decision theoretic optimality in terms of the planner's objective function. To quantify the welfare cost of $\hat\delta_{w_{MSE}}$, we compare the worst-case regrets of $\hat\delta_{w_{minimax}}$ and $\hat\delta_{w_{MSE}}$, and show that the worst-case regret of $\hat\delta_{w_{MSE}}$ exceeds the minimax regret by at most a constant factor of 5.88, independent of the number of studies $K$ and the parameter space.

Our framework can accommodate a vector of observable characteristics $x_k$, $k = 0, 1, \dots, K$, where $x_k$ includes the characteristics of the treatment and demographics of the population featured in study $k$. Under a linear functional form specification, $\tau_k = \beta_0 + x_k'\beta$, $\beta \in B$, common to standard meta-regression analysis (see, e.g., Stanley and Jarrell (1989)), we discuss those restrictions on $B$ under which we can apply our minimax regret decision rule. For minimax regret to be bounded, an important constraint is boundedness of $B$, and the bounds of $B$ have to be explicitly specified to obtain the minimax regret rule. In reality, the planner may not be able to come up with reasonable bounds for $B$. To offer a practical solution to this difficulty, we consider a data-driven way to specify the parameter space based on confidence sets for $(\tau_1, \dots, \tau_K)$.

We illustrate the use of our minimax regret treatment rule by using two empirical examples.

In the first application, we analyze whether an active labor market program should be adopted using the meta-database appearing in Card, Kluve, and Weber (2017) . We consider a pool of 14 RCT studies of job training programs, covering 8 different countries (Argentina, Brazil, Colombia, Dominica, Jordan, Nicaragua, Sri Lanka, Turkey, and the United States) . Based on the average treatment effect and standard error estimates in each of these studies, and the demographic characteristics of the studied populations, we calculate the minimax regret adoption decisions for several countries (Japan, the United Kingdom, and Peru) for which the corresponding experimental estimates are not available in the meta-database.

In the second application, we consider the drug approval decision for a COVID-19 medication called Remdesivir. Remdesivir is an antiviral medication that is known to be effective against Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS), while its effectiveness against COVID-19 remains unknown due to conflicting evidence. Using the meta-database of randomized clinical trials for COVID-19 treatments provided by Juul, Nielsen, Feinberg, Siddiqui, Jørgensen, Barot, Nielsen, Bentzer, Veroniki, Thabane et al. (2020) , we calculate the minimax regret treatment choice for Remdesivir for some specified demographic groups.

The remainder of the paper is organized as follows. The next subsection reviews the related literature. Section 2 formulates the minimax regret decision problem and shows the main analytical result of the paper. In Section 3, we compare the minimax regret with the maximum regret of the decision rule based on the minimax-MSE aggregation rule. Section 3 also discusses a data-driven construction of the parameter space. Section 4 performs numerical analysis to compare the minimax-regret aggregation rules with the minimax-MSE and meta-OLS rules.

This paper contributes to the growing literature on statistical treatment choice and individualized treatment assignment rules initiated by Manski (2000, 2004). Contributions to the current literature include Dehejia (2005), Hirano and Porter (2009), Stoye (2009, 2012), Chamberlain (2011), Bhattacharya and Dupas (2012), Tetenov (2012), Kasy (2016, 2018), Tetenov (2018, 2021), Kitagawa and Wang (2020), Russell (2020), Kitagawa, Sakaguchi, and Tetenov (2021), Mbakop and Tabord-Meehan (2021), Athey and Wager (2021), Sakaguchi (2021), and Viviano (2021), among others. The problem of individualized treatment assignment rules has also been an area of active research in the fields of medical statistics and machine learning; see, for instance, Zadrozny (2003), Beygelzimer and Langford (2009), Qian and Murphy (2011), Zhao, Zeng, Rush, and Kosorok (2012), Swaminathan and Joachims (2015), and Kallus (2020), to list but a few papers. The standard setting in the existing literature considers an optimal treatment assignment policy for the population from which a sample was drawn, rather than combining the pool of estimates from multiple studies performed on different populations.

There is a growing literature on how to inform policy using multiple pieces of evidence or extrapolation from one or multiple reference populations. Dehejia, Pop-Eleches, and Samii (2021) considers the use of (quasi-)experimental evidence to study the decision of whether to experiment or to extrapolate, and, if applicable, where to conduct a new experiment. Manski (2018) analyzes decision-making for personalized risk assessment under the ecological inference setting where (partial) identification of a long regression is obtained by combining information on a short regression and the joint distribution among the regressors. The meta-analysis setting considered in this paper differs from the ecological inference setting in terms of the object to identify and the type of information provided by the available studies. Focusing on conditional cash transfer programs, Gechter, Samii, Dehejia, and Pop-Eleches (2019) runs multiple program evaluation methods on data obtained from Mexico to inform treatment assignment policies for Morocco, and empirically compares the welfare performances of these policies. Hotz, Imbens, and Mortimer (2005) and Dehejia et al. (2021) analyze how to predict the effects of future programs from past experimental evaluations by adjusting for differences in the distributions of observable characteristics. Andrews and Oster (2019) propose a method to conduct sensitivity analysis and to approximate external validity bias when the trial and target populations differ in the distribution of unobservables. Gechter (2016) considers bounding causal effects in a target population by restricting the dependence between the treated and control outcomes.

Meta-analysis for research synthesis has been actively studied in statistics and the resulting literature is vast; see, e.g., Borenstein, Hedges, Higgins, and Rothstein (2009) for a textbook and Cooper et al. (2019) for a handbook volume. In economics, existing applications of meta-analysis and meta-regression include Card and Krueger (1995), Dehejia (2003), Bandiera, Fischer, Prat, and Ytsma (2017), Card et al. (2017), Meager (2019, 2020), Imai, Rutter, and Camerer (2020), and Vivalt (2020). See Stanley (2001) for a review. The common framework of meta-analysis introduces the population of studies and draws inference for the parameters thereof. As we argue in the Introduction via the quote from Manski (2020), the usefulness of the conventional framework of meta-analysis is not obvious for informing the planner's policy decision. This paper follows and pushes forward the perspective of patient-centered meta-analysis.

The methodological proposals in Manski (2020) concern predicting treatment effects for the target population by intersecting the population identified sets for τ 0 formed by extrapolation from each study, rather than explicitly taking into account sampling uncertainty due to finite sample size when considering the treatment choice decision.

In terms of the analytical and computational challenges for obtaining minimax regret rules, this paper is most closely related to Stoye (2012). In one of his baseline settings, Stoye (2012) considers Gaussian experiments for conditional average treatment effects with a scalar covariate $x \in \mathcal{X}$, and analyzes the properties of the minimax regret treatment rule under a restriction that the conditional average treatment effects depend on $x$ with bounded variation. Similar to Stoye (2012), our framework allows (study-specific) covariates to constrain the parameter space for $(\tau_0, \tau_1, \dots, \tau_K)$, but there are two aspects in which our framework differs. First, the treatment assignment rules considered in Stoye (2012) are functions $\delta : \mathcal{X} \to \{0, 1\}$, while we concern ourselves with the treatment choice at a particular covariate value $x_0$ in $\mathcal{X}$ that corresponds to the covariate value of the target population. This reduction of the treatment choice rule from a function of $x$ to a point significantly simplifies the analysis and computation of the minimax-regret rule. Second, the conditions we impose on the parameter space for feasible computation of the minimax regret rule are general and include the bounded variation restriction considered in Stoye (2012) as a special case.

Viewing τ 0 as the value of the regression equation at x 0 in a Gaussian regression model and considering standard estimation risk such as the mean squared errors for τ 0 , the problem is reduced to an interpolation or extrapolation exercise based on the Gaussian signals. As such, the minimax estimation and inference problem for τ 0 is similar to the extrapolation issue in the regression discontinuity setting analyzed in Kolesár and Rothe (2018) . Recent contributions regarding estimation and inference for Gaussian sequence models are made by Johnstone (2017) and Armstrong and Kolesár (2018, 2020) . These papers consider statistical losses for estimation and inference, but do not consider the welfare criterion for statistical treatment choice.

2 Minimax regret treatment rule

2.1 Setting

Suppose we have access to the publicized results of $K$ studies indexed by $k = 1, \dots, K$. Each of the studies estimates the causal effect of a particular binary policy or treatment. We allow the details of the policy (implementation protocol, dosage, program contents, etc.) to differ across the studies. For $k = 1, \dots, K$, let $\hat\tau_k$ denote the estimate of the policy effect reported in study $k$ and $\sigma_k$ denote the standard error of $\hat\tau_k$. For simplicity, we assume $\sigma_k$ is known, although, in practice, we can only construct a consistent estimator for $\sigma_k$. We solve for the finite sample minimax regret rule with known $(\sigma_k : k = 1, \dots, K)$, recommending that in practice the rule is implemented with the true standard errors replaced by their consistent estimates (see footnote 1). We assume

$$\hat\tau_k \sim N(\tau_k, \sigma_k^2), \quad \text{independently across } k = 1, \dots, K, \tag{1}$$

where $\tau_k$ is the true policy effect of the population featured in study $k$. We allow $\tau_k$ to vary across the studies. The assumption that $\hat\tau_k$ follows a Gaussian sampling distribution is reasonable if the reported estimator $\hat\tau_k$ is consistent and asymptotically normal and each study has a moderate to large sample.

Footnote 1: Solving the decision problem with Gaussian signals with known variances and obtaining a feasible decision rule by plugging in consistent estimators for the variances are similar to the construction of an asymptotically optimal decision rule within the framework of Gaussian limit experiments. See Hirano and Porter (2020) for a recent review.

Throughout this paper, we consider a planner who, upon observing data $D \equiv \{\hat\tau_k\}_{k=1}^{K}$, must determine whether or not to adopt the policy in the target population given that its policy effect $\tau_0$ is unknown. Following Manski (2004, 2007), Stoye (2009, 2012), and Tetenov (2012), we focus on the minimax regret criterion to solve this decision problem. To this end, we assume that the true parameters $\tau \equiv (\tau_0, \tau_1, \dots, \tau_K)'$ are ex ante known to belong to the parameter space $\mathcal{T}$.

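We impose the following restrictions on the parameter space $\mathcal{T}$:

Assumption 1. The parameter space $\mathcal{T}$ satisfies

1. Symmetry: $\tau \in \mathcal{T} \Rightarrow -\tau \in \mathcal{T}$, and

2. Invariance to common constant addition: $\tau \in \mathcal{T} \Rightarrow \tau + c \cdot (1, \dots, 1)' \in \mathcal{T}$ for any $c \in \mathbb{R}$.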

The symmetry assumption rules out the imposition of a sign restriction on the causal effect parameters, i.e., τ k ≥ 0 for some k. The condition of invariance to common constant addition (hereafter, shortened to invariance) implies that {(τ k − τ 0 ) : τ ∈ T , τ 0 = t} does not depend on t. We use this result to simplify derivation of a minimax regret treatment rule. It is worth noting that the invariance condition rules out the case in which the parameter space for some τ k , k ∈ {0, 1, 2, . . . , K}, is bounded. For instance, if the outcome is binary, the treatment effect on the outcome is bounded by [−1, 1] for all τ k , k = 0, 1, . . . , K. However, if the standard errors (σ 2 k : k = 1, 2, . . . , K) and the variations among (τ 0 , τ 1 , . . . , τ K ) imposed on T (e.g., Lipschitz constants C kl , 0 ≤ k, l ≤ K, in Example 2 below) are sufficiently small relative to the size of the supports, bounded support of (τ 0 , τ 1 , . . . , τ K ) is less of an issue because extreme values of τ beyond its logical support are unlikely to correspond to a worst-case in terms of the regret.

The following two examples satisfy the parameter space constraints of Assumption 1.

Example 1 (The space of $\tau$ spanned by the meta-regressions). Suppose that for each study in the pool, $k = 1, \dots, K$, we can construct a vector of study-specific observable characteristics $x_k \in \mathbb{R}^{d_x}$. For example, $x_k$ can contain the average characteristics of the individuals in the sample studied in the $k$-th study. It can also include the socioeconomic or demographic characteristics of the country that the sample was drawn from and the characteristics of the treatment studied.

The target population has known covariate value x 0 ∈ R dx , which shares the same interpretation and dimension as x k , k = 1, . . . , K.

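In meta-regression analysis, $\tau_k$ is often specified as

$$\tau_k = \beta_0 + x_k'\beta, \quad k = 0, 1, \dots, K.$$

Accordingly, we assume that the parameter space can be written as

$$\mathcal{T}_{meta} \equiv \big\{ \tau : \tau_k = \beta_0 + x_k'\beta \text{ for } k = 0, 1, \dots, K, \ \beta_0 \in \mathbb{R}, \ \beta \in B \big\},$$

where $B$ is a compact subset of $\mathbb{R}^{d_x}$. As seen in Theorem 2, compactness of $B$ implies that minimax regret is finite. If $B$ satisfies $\beta \in B \Rightarrow -\beta \in B$, then $\mathcal{T}_{meta}$ satisfies Assumption 1. We can allow study-specific intercepts without violating Assumption 1 by viewing them as study-specific fixed dummy variables added to the covariate vector.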

Example 2 (The class of constant variations). Consider the following parameter space:

$$\mathcal{T}_C \equiv \big\{ \tau : |\tau_k - \tau_l| \le C_{kl} \text{ for all } 0 \le k, l \le K \big\},$$

where $\{C_{kl} : k, l = 0, 1, \dots, K\}$ is a set of known positive constants. With study-specific covariates as introduced in Example 1, setting $C_{kl} = C \|x_k - x_l\|$, $C > 0$, yields the class of Lipschitz vectors of $\tau$. Clearly, for any $\{C_{kl} : k, l = 0, 1, \dots, K\}$, the parameter space $\mathcal{T}_C$ satisfies Assumption 1. Assumption 1 in Stoye (2012) corresponds to the case where $C_{kl}$ is a common constant for any $0 \le k, l \le K$.

When Assumption 1 does not hold in a given application, we can still formulate the optimization problem to derive the minimax regret treatment rule, although solving for this is accompanied by a substantial increase in computational complexity. In Remark 4 below, we discuss a derivation of the minimax regret treatment rule that does not rely upon Assumption 1.

Given a non-randomized treatment choice action $\delta \in \{0, 1\}$, define the welfare attained by $\delta$ as

$$W(\delta) \equiv \mu_0 + (\tau_0 - c_0) \cdot \delta,$$

where $c_0$ is the per-person cost of the policy and $\mu_0$ is the average outcome that would be realized in the absence of the policy. An optimal treatment choice action given knowledge of $\tau_0$ and $c_0$ is

$$\delta^* \equiv 1\{\tau_0 \ge c_0\}.$$

Let $\hat\delta(D) \in \{0, 1\}$ be a non-randomized statistical treatment rule that maps the meta-sample $D$ to the binary decision of treatment choice in the target population. The welfare regret of $\hat\delta$ is

$$R(\tau, \hat\delta) \equiv E_\tau\big[ W(\delta^*) - W(\hat\delta(D)) \big] = (\tau_0 - c_0) \big[ 1\{\tau_0 \ge c_0\} - E_\tau(\hat\delta(D)) \big], \tag{6}$$

where $E_\tau(\cdot)$ is the expectation with respect to the sampling distribution of $D$ given the parameters $\tau$. Hereafter, we normalize the cost of treatment to $c_0 = 0$, i.e., interpret $\tau_k$, $k = 0, \dots, K$, as the average treatment effect net of the per-person treatment cost in the target population.

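The minimax regret criterion selects a statistical treatment rule that minimizes maximum regret:

$$\hat\delta_{minimax} \in \arg\min_{\hat\delta \in \mathcal{D}} \max_{\tau \in \mathcal{T}} R(\tau, \hat\delta),$$

where $\mathcal{D}$ is a class of statistical treatment rules. We refer to $\hat\delta_{minimax}$ as a minimax regret rule.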

In the next subsection, we derive a minimax regret rule under the class of statistical treatment rules spanned by linear aggregation rules.

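Remark 1. A common approach to the aggregation of evidence is hierarchical Bayesian inference that yields a posterior distribution for each parameter in $\tau$ and its hyperparameters. Once uncertainty for $\tau$ is summarized by the posterior distribution, we can show that the Bayes optimal decision rule is determined by the posterior mean of $\tau_0$. For any prior $\pi$, the Bayes optimal decision rule $\hat\delta_\pi$ is defined as

$$\hat\delta_\pi \in \arg\min_{\hat\delta} \int R(\tau, \hat\delta) \, d\pi(\tau).$$

We observe that

$$\int R(\tau, \hat\delta) \, d\pi(\tau) = E_D\Big[ E_\pi\big( \tau_0 \cdot 1\{\tau_0 \ge 0\} \,\big|\, D \big) - \hat\delta(D) \cdot E_\pi(\tau_0 \,|\, D) \Big],$$

where $E_D(\cdot)$ denotes the expectation with respect to the marginal distribution of $D$ and $E_\pi(\cdot \,|\, D)$ denotes the posterior mean. This implies that

$$\hat\delta_\pi(D) = 1\{ E_\pi(\tau_0 \,|\, D) \ge 0 \}.$$

In Example 1, typical hierarchical Bayes approaches assume $\tilde\beta \equiv (\beta_0, \beta')' \sim N(0, \Sigma_{\tilde\beta})$. Then, the posterior mean of $\tau_0 \equiv \beta_0 + x_0'\beta$ is a linear function of $(\hat\tau_1, \dots, \hat\tau_K)$ with no constant term.

Hence, in this case, the Bayes optimal decision rule is a linear aggregation rule in the sense of Definition 1 below. The weights of the Bayes optimal rule depend on the hyperparameters of the Gaussian prior for $\tilde\beta$ and the matrix of study characteristics, $X$. If the mean of the prior distribution is not zero, then the posterior mean of $\tau_0$ involves a constant. In such a case, the Bayes optimal decision rule belongs to the class of extended linear aggregation rules considered in Remark 3 below.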
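To make the linearity of the Bayes rule concrete, the following minimal sketch works through an intercept-only special case that is assumed here for illustration and is not taken from the paper: every study shares the same effect $\tau_k = \beta_0$, the estimates are independent Gaussians with known standard errors, and $\beta_0 \sim N(0, \text{prior\_sd}^2)$. Under these assumptions the posterior mean of $\tau_0$ is a precision-weighted sum of the estimates, so the Bayes rule is a sign rule on a linear aggregate. The function names and the numbers are hypothetical.

```python
def posterior_mean_weights(sigmas, prior_sd):
    """Weights w_k with E[tau_0 | D] = sum_k w_k * tau_hat_k.

    Assumed model (illustrative special case, not the paper's general setup):
    tau_hat_k ~ N(beta_0, sigma_k^2) independently, beta_0 ~ N(0, prior_sd^2),
    and tau_0 = beta_0.
    """
    precisions = [1.0 / s ** 2 for s in sigmas]
    total_precision = 1.0 / prior_sd ** 2 + sum(precisions)
    return [p / total_precision for p in precisions]


def bayes_rule(tau_hats, sigmas, prior_sd):
    """Adopt the policy (return 1) iff the posterior mean of tau_0 is >= 0."""
    weights = posterior_mean_weights(sigmas, prior_sd)
    post_mean = sum(w * t for w, t in zip(weights, tau_hats))
    return 1 if post_mean >= 0 else 0


# Hypothetical meta-sample with two studies.
weights = posterior_mean_weights([0.1, 0.2], prior_sd=1.0)
decision = bayes_rule([0.2, -0.1], [0.1, 0.2], prior_sd=1.0)
```

In this special case the weights sum to less than one because the zero-mean prior shrinks the aggregate toward zero; the decision only depends on the sign of the aggregate, so the shrinkage does not change the choice.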

To gain analytical and computational tractability, we focus on the class of linear aggregation rules.

Definition 1 (Linear aggregation rules). The class of linear aggregation rules consists of non-randomized treatment choice rules, each of which chooses a treatment according to the sign of a linear aggregation of $(\hat\tau_1, \dots, \hat\tau_K)$:

$$\mathcal{D}_{lin} \equiv \big\{ \hat\delta_w : w \in \mathbb{R}^K \big\}, \quad \hat\delta_w(D) \equiv 1\Big\{ \sum_{k=1}^{K} w_k \hat\tau_k \ge 0 \Big\}, \tag{7}$$

where $w = (w_1, \dots, w_K)'$ does not depend on the data $D$.
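A linear aggregation rule in the sense of Definition 1 can be sketched in a few lines. The function name and the example numbers below are hypothetical; the only ingredient taken from the text is that the rule adopts the policy when the weighted sum of the reported estimates is non-negative, with weights fixed before seeing the data.

```python
def linear_aggregation_rule(tau_hats, weights):
    """Return 1 (adopt) if sum_k w_k * tau_hat_k >= 0, else 0 (do not adopt)."""
    if len(tau_hats) != len(weights):
        raise ValueError("tau_hats and weights must have the same length")
    aggregate = sum(w * t for w, t in zip(weights, tau_hats))
    return 1 if aggregate >= 0 else 0


# Hypothetical meta-sample of three study estimates and fixed weights.
tau_hats = [0.3, -0.1, 0.05]
weights = [0.5, 0.3, 0.2]
decision = linear_aggregation_rule(tau_hats, weights)

# Multiplying all weights by a positive constant leaves the decision unchanged,
# because only the sign of the aggregate matters.
same_decision = linear_aggregation_rule(tau_hats, [10 * w for w in weights])
```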

This class rules out nonlinear treatment rules that plug in an aggregation of $(\hat\tau_1, \dots, \hat\tau_K)$ in the manner of James-Stein shrinkage or empirical Bayes. Nevertheless, the class of linear aggregation rules contains many reasonable treatment rules. For example, plug-in rules $1\{\hat\tau_0(D) \ge 0\}$ based on linear estimators $\hat\tau_0(D)$ for $\tau_0$ belong to $\mathcal{D}_{lin}$. When study characteristics are included in the available covariates as in Example 1, $\mathcal{D}_{lin}$ includes those rules that plug in fitted values based on parametric linear regression or nonparametric kernel regression. Furthermore, as shown in Remark 1 above, hierarchical Bayes decision rules under the linear meta-regression specification with Gaussian priors yield linear aggregation rules.

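We consider the minimax regret rule among $\mathcal{D}_{lin}$ whose corresponding weight vector solves

$$w_{minimax} \in \arg\min_{w} \max_{\tau \in \mathcal{T}} R(\tau, \hat\delta_w).$$

To develop a computation method for $w_{minimax}$, note from (1) that

$$\sum_{k=1}^{K} w_k \hat\tau_k \sim N\Big( \sum_{k=1}^{K} w_k \tau_k, \ \sum_{k=1}^{K} w_k^2 \sigma_k^2 \Big).$$

Hence, from (6), the regret of $\hat\delta_w$ can be written as

$$R(\tau, \hat\delta_w) = \tau_0 \left[ 1\{\tau_0 \ge 0\} - \left( 1 - \Phi\left( -\frac{\sum_{k=1}^{K} w_k \tau_k}{\sqrt{\sum_{k=1}^{K} w_k^2 \sigma_k^2}} \right) \right) \right] = \tau_0 \left[ 1\{\tau_0 \ge 0\} - \Phi\left( \frac{\sum_{k=1}^{K} w_k \tau_k}{\sqrt{\sum_{k=1}^{K} w_k^2 \sigma_k^2}} \right) \right],$$

where the first equality follows from the normality of $\sum_{k=1}^{K} w_k \hat\tau_k$ and the second equality follows from $1 - \Phi(a) = \Phi(-a)$. We then obtain $w_{minimax}$ by minimizing the maximum regret $\max_{\tau \in \mathcal{T}} R(\tau, \hat\delta_w)$.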
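The regret of a linear aggregation rule at a given parameter vector can be evaluated in closed form. The sketch below assumes independent Gaussian estimates with known standard errors and the normalization $c_0 = 0$, so that the adoption probability is $\Phi(\sum_k w_k \tau_k / \sqrt{\sum_k w_k^2 \sigma_k^2})$ and the regret is $|\tau_0|$ times the probability of the wrong decision. The helper names are hypothetical, and the weights are assumed not to be all zero.

```python
import math


def norm_cdf(x):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def regret(tau0, taus, weights, sigmas):
    """Welfare regret of the rule 1{sum_k w_k * tau_hat_k >= 0} at (tau0, taus).

    Assumes tau_hat_k ~ N(tau_k, sigma_k^2) independently and c_0 = 0.
    """
    mean = sum(w * t for w, t in zip(weights, taus))
    sd = math.sqrt(sum((w * s) ** 2 for w, s in zip(weights, sigmas)))
    prob_adopt = norm_cdf(mean / sd)
    if tau0 >= 0:
        # Adopting is optimal; regret is incurred when the rule does not adopt.
        return tau0 * (1.0 - prob_adopt)
    # Not adopting is optimal; regret is incurred when the rule adopts.
    return -tau0 * prob_adopt


# Single study drawn from the target population itself (tau_1 = tau_0).
r = regret(tau0=0.75, taus=[0.75], weights=[1.0], sigmas=[1.0])
```

With a single unbiased study and unit standard error, evaluating the regret at $\tau_0 = 0.75$ gives a value close to 0.17, in line with the value of $\eta(0)$ quoted in the text.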

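If the parameter space $\mathcal{T}$ satisfies Assumption 1, we can simplify the derivation of $w_{minimax}$. Noting the symmetry of $\mathcal{T}$ from Assumption 1, we obtain

$$\max_{\tau \in \mathcal{T}} R(\tau, \hat\delta_w) = \max_{\tau \in \mathcal{T}, \, \tau_0 \ge 0} \tau_0 \cdot \Phi\left( -\frac{\sum_{k=1}^{K} w_k \tau_k}{s(w)} \right) = \max_{t \ge 0} \, t \cdot \Phi\left( \frac{-t \sum_{k=1}^{K} w_k + b(w)}{s(w)} \right),$$

where the last equality follows from the invariance condition in Assumption 1, with

$$b(w) \equiv \max_{\tau \in \mathcal{T}} \sum_{k=1}^{K} w_k (\tau_k - \tau_0), \qquad s(w) \equiv \sqrt{\sum_{k=1}^{K} w_k^2 \sigma_k^2}.$$

Viewing $\sum_{k=1}^{K} w_k \hat\tau_k$ with $\sum_{k=1}^{K} w_k = 1$ as an estimator for $\tau_0$, $b(w)$ and $s(w)$ can be interpreted as the maximum bias and the standard deviation of the linear estimator $\sum_{k=1}^{K} w_k \hat\tau_k$, respectively. Because the rule $\hat\delta_w$ is unchanged when $w$ is multiplied by a positive constant, we normalize $\sum_{k=1}^{K} w_k = 1$. Using these terms, we can express the maximum regret as

$$\max_{\tau \in \mathcal{T}} R(\tau, \hat\delta_w) = s(w) \cdot \eta\left( \frac{b(w)}{s(w)} \right), \qquad \eta(a) \equiv \max_{t \ge 0} \, t \cdot \Phi(-t + a).$$

Hence, we obtain the following theorem: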

Theorem 1. Suppose that the parameter space $\mathcal{T}$ satisfies Assumption 1. Then, the minimax regret rule among $\mathcal{D}_{lin}$ is obtained via the following optimization:

$$w_{minimax} \in \arg\min_{w : \sum_{k=1}^{K} w_k = 1} s(w) \cdot \eta\left( \frac{b(w)}{s(w)} \right).$$

In view of Theorem 1, we can compute $w_{minimax}$ using the following algorithm:

1. Fix $w$ such that $\sum_{k=1}^{K} w_k = 1$. Calculate $s(w)$ and $b(w)$, and obtain the maximum regret $s(w) \cdot \eta(b(w)/s(w))$, where we approximate $\eta(\cdot)$ by a piecewise linear function.

2. Minimize the maximum regret $s(w) \cdot \eta(b(w)/s(w))$ subject to $\sum_{k=1}^{K} w_k = 1$.

In step 1, we compute $b(w)$ by solving the optimization in $\tau$. As shown below, there are many examples in which we can calculate $b(w)$ using linear programming. If so, $b(w)$ can be solved quickly and reliably even when $K$ is large. Furthermore, because $t \mapsto t \cdot \Phi(-t + a)$ is a smooth unimodal function, $\eta(a)$ is easy to compute. Figure 1 displays the shape of $\eta(a)$ as a function of $a$. From the proof of Lemma 2 below, we find that $\eta(a)$ is strictly increasing and convex. In numerical simulations given in Section 4 below, we compute the optimization of step 2 using the R package "Rsolnp". We find that this optimization step is quick and stable even when $K$ exceeds 100.

[Figure 1: the shape of $\eta(a)$ as a function of $a$. $\eta(0)$ is approximately equal to 0.17.]
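The function $\eta(a) = \max_{t \ge 0} t \cdot \Phi(-t + a)$ used in step 1 can be evaluated numerically with a one-dimensional search. The sketch below is an illustration only: it uses a ternary search, which relies on the unimodality noted in the text, whereas the paper approximates $\eta$ by a piecewise linear function. The search bracket $[0, a + 10]$ and the function names are assumptions of this sketch.

```python
import math


def norm_cdf(x):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def eta(a, iters=200):
    """Approximate eta(a) = max_{t >= 0} t * Phi(-t + a) for a >= 0.

    Ternary search relies on t -> t * Phi(-t + a) being unimodal.
    The bracket [0, a + 10] is an assumption of this sketch.
    Returns (approximate maximum value, approximate maximizer).
    """
    lo, hi = 0.0, a + 10.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if m1 * norm_cdf(a - m1) < m2 * norm_cdf(a - m2):
            lo = m1  # the maximizer lies to the right of m1
        else:
            hi = m2  # the maximizer lies to the left of m2
    t_star = 0.5 * (lo + hi)
    return t_star * norm_cdf(a - t_star), t_star


eta0, t0 = eta(0.0)
values = [eta(a)[0] for a in (0.0, 0.5, 1.0)]
```

Evaluating `eta` on a grid of `a` values is one way to tabulate the function before interpolating it piecewise linearly, as step 1 of the algorithm suggests.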

Remark 2. There are some important cases where we can calculate $b(w)$ using linear programming. For example, consider the parameter space $\mathcal{T}_{meta}$ of Example 1, where we have

$$b(w) = \max_{\beta \in B} \sum_{k=1}^{K} w_k (x_k - x_0)'\beta.$$

Hence, if $B$ is a polyhedron, we can calculate $b(w)$ using linear programming.
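As an illustration of the two-step algorithm for the meta-regression parameter space, the sketch below assumes that $B$ is a symmetric box $\prod_j [-c_j, c_j]$ and that the weights sum to one. Under these assumptions the maximum bias has the closed form $b(w) = \sum_j c_j \, |\sum_k w_k x_{kj} - x_{0j}|$, so no linear programming solver is needed. The example uses $K = 2$ studies and a plain grid search over $w_1$ in place of a constrained optimizer such as the one the paper uses; all data values, bounds, and function names are hypothetical.

```python
import math


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def eta(a, iters=200):
    """Approximate eta(a) = max_{t >= 0} t * Phi(-t + a) by ternary search."""
    lo, hi = 0.0, a + 10.0  # bracket is an assumption of this sketch
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if m1 * norm_cdf(a - m1) < m2 * norm_cdf(a - m2):
            lo = m1
        else:
            hi = m2
    t = 0.5 * (lo + hi)
    return t * norm_cdf(a - t)


def max_bias_box(w, xs, x0, c):
    """b(w) for a box B = prod_j [-c_j, c_j], assuming sum(w) == 1."""
    gaps = [sum(wk * xk[j] for wk, xk in zip(w, xs)) - x0[j]
            for j in range(len(x0))]
    return sum(cj * abs(g) for cj, g in zip(c, gaps))


def max_regret(w, sigmas, xs, x0, c):
    """Maximum regret s(w) * eta(b(w) / s(w)) of the rule with weights w."""
    s = math.sqrt(sum((wk * sk) ** 2 for wk, sk in zip(w, sigmas)))
    return s * eta(max_bias_box(w, xs, x0, c) / s)


def minimax_weights_two_studies(sigmas, xs, x0, c, n_grid=1201):
    """Grid search over w = (w1, 1 - w1), w1 in [-1, 2] (illustrative only)."""
    best_val, best_w = None, None
    for i in range(n_grid):
        w1 = -1.0 + 3.0 * i / (n_grid - 1)
        w = (w1, 1.0 - w1)
        val = max_regret(w, sigmas, xs, x0, c)
        if best_val is None or val < best_val:
            best_val, best_w = val, w
    return best_val, best_w


# Hypothetical inputs: one scalar covariate, target value between the studies.
sigmas = (0.2, 0.1)
xs = [(0.0,), (2.0,)]
x0 = (0.5,)
c = (0.1,)
best_val, best_w = minimax_weights_two_studies(sigmas, xs, x0, c)
```

The grid search is only practical for very small $K$; for larger pools a constrained solver over the simplex-type constraint $\sum_k w_k = 1$ would take its place.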

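If the parameter space is $\mathcal{T}_C$ as in Example 2, we have

$$b(w) = \max_{\tau} \sum_{k=1}^{K} w_k (\tau_k - \tau_0) \quad \text{subject to} \quad |\tau_k - \tau_l| \le C_{kl}, \ 0 \le k, l \le K,$$

where this maximization can again be solved using linear programming.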

Even if the parameter space $\mathcal{T}$ satisfies Assumption 1, the minimax regret can be unbounded. For example, $\mathcal{T} = \mathbb{R}^{K+1}$ satisfies Assumption 1 but the maximum regret is unbounded, since $b(w) = +\infty$ for any $w$ and $\lim_{a \to \infty} \eta(a) = +\infty$. To have the maximum regret bounded, we need to impose a restriction that the difference between $\tau_k$ and $\tau_0$ is bounded for some $k$.

Assumption 2. There exists M < ∞ such that τ ∈ T implies |τ k − τ 0 | < M for some k.

Theorem 2. Suppose that the parameter space T satisfies Assumptions 1 and 2. Then, the minimax regret is finite.

Assumption 2 means that there exists some study in the pool that provides some (partially) identifying information about τ 0 . This condition holds for the parameter space T meta of Example 1 if B is compact. Similarly, T C satisfies Assumption 2. Theorem 2 then applies to these cases and guarantees that the minimax regret is bounded.

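Remark 3. We can add a constant term to equation (7), i.e., consider the following treatment rule:

$$\hat\delta_{v,w}(D) \equiv 1\Big\{ v + \sum_{k=1}^{K} w_k \hat\tau_k \ge 0 \Big\}, \tag{10}$$

where $v \in \mathbb{R}$. Then, for $w$ with $\sum_{k=1}^{K} w_k = 1$, the maximum regret of $\hat\delta_{v,w}$ can be written as

$$\max_{\tau \in \mathcal{T}} R(\tau, \hat\delta_{v,w}) = s(w) \cdot \eta\left( \frac{b(w) + |v|}{s(w)} \right).$$

Since $\eta(a)$ is strictly increasing in $a$, for $v \neq 0$, the maximum regret of $\hat\delta_{v,w}$ is greater than that of $\hat\delta_w$. Hence, the minimax regret rule sets $v = 0$, so that it is not necessary to consider treatment rules with an intercept as in (10).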

Remark 4. If Assumption 1 is relaxed, the optimization problem for maximum regret can be expressed as

where S ≡ {τ 0 : τ ∈ T }. Hence, the maximum regret can be written as

whereb

The weights of the minimax regret rule therefore solve

Since the parameter space T does not satisfy the second condition in Assumption 1, the maximum biasb(t, w) may depend on t. This complicates computation of w minimax .

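In this section, we compare $w_{minimax}$ with other ways of forming the weights. First, we consider a minimax linear estimator of $\tau_0$ in terms of the mean squared error (MSE). It is well known that the maximum MSE of $\sum_{k=1}^{K} w_k \hat\tau_k$ with $\sum_{k=1}^{K} w_k = 1$ can be decomposed into the variance and the squared maximum bias:

$$\max_{\tau \in \mathcal{T}} E_\tau\Big[ \Big( \sum_{k=1}^{K} w_k \hat\tau_k - \tau_0 \Big)^2 \Big] = s^2(w) + b^2(w).$$

Hence, the weights of the minimax MSE estimator are

$$w_{MSE} \in \arg\min_{w : \sum_{k=1}^{K} w_k = 1} \big\{ b^2(w) + s^2(w) \big\}.$$

We refer to $\hat\delta_{w_{MSE}}$ as the minimax MSE rule.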

To compare the minimax regret and MSE rules, we focus on the analytical properties of $\eta(a)$. Tetenov (2012) shows that $\eta(a)$ is a continuous, strictly increasing function and $\eta(0) \approx 0.17$. Furthermore, in the proof of the following lemmas, we show that $\eta(a)$ is convex. We accordingly obtain the following upper and lower bounds on $\eta(a)$:

Lemma 2. For any a ≥ 0, we have

Relying on Theorem 1 and Lemmas 1-2, the next theorem bounds the maximum regret.

Theorem 3. Suppose that the parameter space T satisfies Assumptions 1 and 2. Then, for any w and v ≥ 0, we obtain

In addition, we obtain

Theorem 3 provides lower and upper bounds on the maximum regret. These bounds show that the maximum regret is bounded from above and from below by $b(w) + s(w)$ and $\sqrt{b^2(w) + s^2(w)}$ up to some proportional factors, independently of the number of studies, $K$, and the dimension of $x_k$, $d_x$. The second set of inequalities implies that the minimax regret is equivalent to the minimax RMSE (root-MSE) up to a constant factor. In other words, minimax RMSE enables us to bound the minimax regret.

Furthermore, Theorem 1 and Lemma 2 lead to the following comparison of the maximum regret between the minimax regret rule $\hat{\delta}_{w_{\mathrm{minimax}}}$ and the minimax MSE rule $\hat{\delta}_{w_{\mathrm{MSE}}}$.

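Theorem 4. Suppose that the parameter space T satisfies Assumptions 1 and 2. Then, we have

$$\max_{\tau \in T} R(\tau, \hat{\delta}_{w_{\mathrm{minimax}}}) \;\le\; \max_{\tau \in T} R(\tau, \hat{\delta}_{w_{\mathrm{MSE}}}) \;\le\; \frac{1}{\eta(0)} \max_{\tau \in T} R(\tau, \hat{\delta}_{w_{\mathrm{minimax}}}). \tag{15}$$

Theorem 4 shows that the maximum regret of the minimax MSE rule is the same as the minimax regret up to a constant factor, independently of K and $d_x$. Numerical simulations in Section 4 suggest that the maximum regret of $\hat{\delta}_{w_{\mathrm{MSE}}}$ can be about 40 percent greater than the minimax regret.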

The proof of Lemma 2 given in the Appendix shows that the minimax regret criterion places greater emphasis on the bias than on the variance compared with the minimax MSE criterion. To see this, consider the directional derivatives of the maximum regret and MSE.

We fix $\theta = (\theta_1, \cdots, \theta_K)$ with $\sum_{k=1}^{K} \theta_k = 1$ and assume that b(w) and s(w) are directionally differentiable. We define

where Q θ (w) is the directional derivative of the maximum regret. Let t * (a) be the maximizer of t · Φ(−t + a). Then, by the proof of Lemma 2, we have

Here, the sign of Q θ (w) is determined by

where t * (a) − a is decreasing in a as shown in the proof of Lemma 2. Similarly, the sign of the directional derivative of the maximum MSE, b 2 (w) + s 2 (w), is determined by

Suppose that $b_\theta(w) < 0$ and $s_\theta(w) > 0$, that is, we face the bias-variance tradeoff. Then, because numerical evaluation implies $t^*(a) - a < a^{-1}$ for a ≥ 0, we obtain

When w = w MSE , the right-hand side must be zero. Hence, if b θ (w MSE ) < 0 and s θ (w MSE ) > 0, we conclude

That is, at the minimax MSE weights w = w MSE , locally perturbing the weight vector in the direction that reduces the bias and increases the variance improves the welfare regret. This implies that the minimax regret criterion places greater emphasis on the bias than on the variance compared with the minimax MSE criterion. In the numerical analysis of Section 4, we plot w minimax and w MSE to illustrate the difference in their bias-variance balancing properties.
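To make this comparison concrete, the following sketch considers a hypothetical two-study problem and finds the weight on the closer study under each criterion by grid search. The forms b(w) = C·(w·d₁ + (1 − w)·d₂) and s(w) = √(w²σ₁² + (1 − w)²σ₂²), the distances, the standard errors, and C = 1 are all assumptions of this example (a Lipschitz-type bias with nonnegative weights), not values taken from the paper.

```python
import math

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def eta(a, tol=1e-10):
    """eta(a) = max_{t >= 0} t * Phi(a - t), by golden-section search."""
    f = lambda t: t * norm_cdf(a - t)
    lo, hi = 0.0, a + 5.0   # assumed bracket for the maximizer
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        m1, m2 = hi - g * (hi - lo), lo + g * (hi - lo)
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

# Hypothetical two-study setting; every number here is illustrative.
C = 1.0              # assumed Lipschitz constant
dist = (0.1, 0.5)    # assumed distances |x_k - x_0| of studies 1 and 2
sigma = (1.0, 1.0)   # assumed standard errors

def bias(w):
    """Assumed maximum bias when study 1 gets weight w and study 2 gets 1 - w."""
    return C * (w * dist[0] + (1.0 - w) * dist[1])

def sd(w):
    """Standard deviation of the weighted estimate."""
    return math.sqrt((w * sigma[0]) ** 2 + ((1.0 - w) * sigma[1]) ** 2)

def max_regret(w):
    return sd(w) * eta(bias(w) / sd(w))

def max_mse(w):
    return bias(w) ** 2 + sd(w) ** 2

grid = [i / 1000 for i in range(1001)]
w_regret = min(grid, key=max_regret)   # minimax regret weight on the closer study
w_mse = min(grid, key=max_mse)         # minimax MSE weight on the closer study
print(w_regret, w_mse)
```

In this toy setting the regret-optimal weight on the closer study exceeds the MSE-optimal one, so the ratio b(w)/s(w) is smaller under the minimax regret criterion, in line with the local argument above.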

If we have perfect knowledge of $(\tau_1, \ldots, \tau_K)$, i.e., $\sigma_k = 0$ for all 1 ≤ k ≤ K, we can obtain the (true) identified set of $\tau_0$ based on the constraints on the parameter space T. For instance, we can construct the identified set of $\tau_0$ by intersecting multiple bounds for $\tau_0$, each of which is constructed by extrapolating from $\tau_k$, as considered in Manski (2020). We can then consider finding the minimax regret treatment rule given the true identified set of $\tau_0$ without any sampling uncertainty. We denote by $\delta^*_{IS}$ such a (non-randomized) minimax regret rule. As the sample size of each study increases, that is, as $\sigma_k \to 0$, should we expect the minimax regret rule $\hat{\delta}_{w_{\mathrm{minimax}}}$ that we constructed in the previous section to converge to $\delta^*_{IS}$? In what follows, we compare $\delta^*_{IS}$ with the limiting version of $\hat{\delta}_{w_{\mathrm{minimax}}}$, and show that $\hat{\delta}_{w_{\mathrm{minimax}}}$ does not necessarily converge to $\delta^*_{IS}$ as $\sigma_k \to 0$. We then consider an alternative class of treatment choice rules that converge to $\delta^*_{IS}$ as $\sigma_k \to 0$. These alternative treatment rules solve the minimax regret problem with a data-driven parameter space built upon confidence regions for τ. These rules, therefore, do not belong to the linear aggregation rules of Definition 1. Moreover, their computation is not as simple as that of the linear minimax regret rule $\hat{\delta}_{w_{\mathrm{minimax}}}$ obtained in the previous section, and we do not know whether they coincide with any exact minimax regret (nonlinear) rule obtained for a data-independent parameter space. Nevertheless, we can show that such modified treatment rules converge to $\delta^*_{IS}$ as $\sigma_k \to 0$, which could be of theoretical interest.

First, we consider the minimax regret rule when $\sigma_k = 0$ for all k = 1, . . . , K. Then, $\tau_k = \hat{\tau}_k$ for all k = 1, . . . , K and the identified set of $\tau_0$ is

$$IS_0 \equiv \{\tau_0 : \tau \in T \text{ and } \tau_k = \hat{\tau}_k \text{ for all } k = 1, \cdots, K\}. \tag{16}$$

In this case, the parameter space T projected for τ 0 yields the identified set IS 0 . Because there is no randomness in this problem, for a treatment rule δ, the welfare regret of δ becomes

Hence, the minimax regret rule over IS 0 can be written as

where $\underline{\tau}_0 \equiv \inf\{\tau_0 : \tau_0 \in IS_0\}$ and $\overline{\tau}_0 \equiv \sup\{\tau_0 : \tau_0 \in IS_0\}$ are the smallest and largest values of the identified set of $\tau_0$, respectively. The rule $\delta^*_{IS}$ becomes 1 (or 0) when we have $\underline{\tau}_0 > 0$ (or $\overline{\tau}_0 < 0$), that is, when all values of the identified set of $\tau_0$ are positive (or negative). When the identified set of $\tau_0$ contains both positive and negative values, $\delta^*_{IS}$ becomes 1 (or 0) if the absolute value of $\overline{\tau}_0$ is larger (or smaller) than that of $\underline{\tau}_0$.²

Next, we consider the large sample properties of our minimax regret rule $\hat{\delta}_{w_{\mathrm{minimax}}}$, i.e., the case $\sigma_k \to 0$. In this case, we can show that the minimax regret criterion yields the treatment rule that minimizes the maximum bias. The proof of Lemma 2 shows that η(a) is strictly increasing and convex with its slope bounded from above by one. Hence, the slope of η(a) converges to a positive constant c ∈ (0, 1] as a → +∞. This implies that when a is large, η(a) can be approximated by d + c · a for some d. As $\sigma_k \to 0$ for all k, we have s(w) → 0 for any w. From Theorem 1, as s(w) → 0, we can approximate the maximum regret of $\hat{\delta}_w$ by c · b(w). This implies that in large samples, the minimax regret rule becomes a treatment rule that minimizes the maximum bias b(w).
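The decision logic of the identified-set rule can be sketched in a few lines. The function below compares the worst-case regret of treating with that of not treating over an interval [lower, upper]; the function name, the interval representation of the identified set, and the tie-breaking in favor of treatment are assumptions of this illustration.

```python
def delta_is(lower, upper):
    """Minimax regret decision when tau_0 is only known to lie in [lower, upper].

    Treating incurs regret max(0, -tau_0) and not treating incurs max(0, tau_0),
    so the worst cases over the interval are max(0, -lower) and max(0, upper).
    Ties are broken in favor of treatment (an assumption of this sketch).
    """
    if lower > upper:
        raise ValueError("empty identified set")
    regret_treat = max(0.0, -lower)
    regret_no_treat = max(0.0, upper)
    return 1 if regret_treat <= regret_no_treat else 0

# Hypothetical identified sets.
print(delta_is(0.1, 0.5))    # all values positive
print(delta_is(-0.5, -0.1))  # all values negative
print(delta_is(-0.2, 0.5))   # upper end larger in absolute value
print(delta_is(-0.5, 0.2))   # lower end larger in absolute value
```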

To be specific, consider the case in which the parameter space is the class of Lipschitz vectors given in Example 2. Since we have

as $\sigma_k \to 0$ for all k, the minimax regret rule converges to the rule that depends only on the closest study in terms of the metric on the covariate space, i.e., the weight of the closest study $w_{k^*}$ converges to 1, where $k^*$ satisfies $\|x_{k^*} - x_0\| \le \|x_k - x_0\|$ for all k = 1, · · · , K. Hence, the minimax regret rule converges to $\hat{\delta}_{w_{\mathrm{minimax}}}(D) = 1\{\hat{\tau}_{k^*} \ge 0\}$, and the decision of whether or not to introduce the policy is solely based on the closest study.

In this case, we can show that $\hat{\delta}_{w_{\mathrm{minimax}}}$ does not converge to $\delta^*_{IS}$ as $\sigma_k \to 0$. If the observed covariates are scalar, then the identified set of $\tau_0$ can be written as

Because our minimax regret rule $\hat{\delta}_{w_{\mathrm{minimax}}}$ uses only the closest study, it does not necessarily agree with $\delta^*_{IS}$ from (17). In fact, it is possible that $\hat{\tau}_{k^*}$ is positive but that the absolute value of $\underline{\tau}_0$ is larger than that of $\overline{\tau}_0$.
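A small numerical example may help to see how such a disagreement can arise. The sketch below assumes scalar covariates and builds the identified set by intersecting the Lipschitz extrapolation bounds $[\tau_k - C|x_k - x_0|,\ \tau_k + C|x_k - x_0|]$ across studies; this intersection form, the function names, and all numbers are assumptions for illustration rather than the paper's exact expression.

```python
def lipschitz_identified_set(tau, x, x0, C):
    """Intersect the bounds tau_k -/+ C * |x_k - x0| over all studies (assumed form)."""
    lower = max(t - C * abs(xk - x0) for t, xk in zip(tau, x))
    upper = min(t + C * abs(xk - x0) for t, xk in zip(tau, x))
    return lower, upper

def delta_is(lower, upper):
    """Treat iff the worst-case regret of treating does not exceed that of not treating."""
    return 1 if max(0.0, -lower) <= max(0.0, upper) else 0

def closest_study_rule(tau, x, x0):
    """Limiting linear rule: follow the sign of the estimate from the closest study."""
    k_star = min(range(len(x)), key=lambda k: abs(x[k] - x0))
    return 1 if tau[k_star] >= 0 else 0

# Hypothetical noiseless effects in two studies located on either side of x0 = 0.
tau = [0.05, -0.2]
x = [0.1, -0.2]
x0, C = 0.0, 1.0

lower, upper = lipschitz_identified_set(tau, x, x0, C)
d_closest = closest_study_rule(tau, x, x0)
d_is = delta_is(lower, upper)
print((lower, upper), d_closest, d_is)
```

Here the closest study has a positive effect, so the limiting linear rule treats, while the identified set lies in the non-positive half-line and the identified-set rule does not treat.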

To resolve such a disagreement, we propose a minimax treatment rule refined by a confidence region of τ . Data D provide some information about the parameter space T . If there is not an a priori assumption available to constrain T , we may want to exploit such in-sample information to refine the minimax regret rule.

For α ∈ (0, 1), consider a subset $\hat{T}(\alpha) \subset T$ that depends on the data D and satisfies

where $P_\tau$ is the sampling probability distribution of the data when the true parameter value is τ. That is, $\hat{T}(\alpha)$ is a confidence set for τ with a coverage probability of at least 1 − α. For example, the following hyper-rectangle satisfies condition (18):

where $z_{\alpha,K}$ is the value such that $P(|Z| \le z_{\alpha,K}) = (1 - \alpha)^{1/K}$ for a standard normal variable Z.
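The critical value $z_{\alpha,K}$ follows directly from this definition, since $P(|Z| \le z) = 2\Phi(z) - 1$. The sketch below computes it and forms the intervals $\hat{\tau}_k \pm z_{\alpha,K}\sigma_k$; treating the hyper-rectangle as the product of these intervals (before any intersection with T) is an assumption of this example, as are the function names and the input numbers.

```python
from statistics import NormalDist

def z_alpha_K(alpha, K):
    """Solve P(|Z| <= z) = (1 - alpha)**(1/K) for a standard normal Z."""
    p = (1.0 - alpha) ** (1.0 / K)
    return NormalDist().inv_cdf(0.5 * (1.0 + p))

def hyper_rectangle(tau_hat, sigma, alpha):
    """Intervals tau_hat_k -/+ z * sigma_k (assumed form of the hyper-rectangle)."""
    z = z_alpha_K(alpha, len(tau_hat))
    return [(t - z * s, t + z * s) for t, s in zip(tau_hat, sigma)]

# Hypothetical estimates and standard errors for K = 3 studies.
tau_hat = [0.10, -0.02, 0.05]
sigma = [0.03, 0.05, 0.02]
box = hyper_rectangle(tau_hat, sigma, alpha=0.05)
print(z_alpha_K(0.05, 1), z_alpha_K(0.05, 14))
print(box)
```

With independent estimates, each interval covers its own $\tau_k$ with probability $(1 - \alpha)^{1/K}$, so the product covers τ with probability 1 − α.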

When the parameter space is $T_{\mathrm{meta}}$ and $d_x < K$, we can construct another confidence region of τ. Let $\hat{\beta}$ be the OLS estimator of β. Then, the following set satisfies condition (18):

where $S(\hat{\beta})$ is the variance matrix of $\hat{\beta}$ and $\chi(\alpha, d_x)$ is the (1 − α)-quantile of the chi-square distribution with $d_x$ degrees of freedom.

By replacing the parameter space T with $\hat{T}(\alpha)$, we can compute the refined minimax regret rule. We define

For the class of linear aggregation rules considered in the previous sections, $w_{\mathrm{minimax}}$ cannot depend on the data D. In contrast, $\hat{w}_{\mathrm{minimax}}(\alpha)$ depends on the data through $\hat{T}(\alpha)$. Hence, the refined minimax regret rule $\hat{\delta}_{\hat{w}_{\mathrm{minimax}}(\alpha)}$ becomes a non-linear aggregation rule. Because $\hat{T}(\alpha)$ is contained in the parameter space T, the refined minimax regret rule is less conservative than $\hat{\delta}_{w_{\mathrm{minimax}}}$. From (18), for any w, we obtain

Hence, $\hat{w}_{\mathrm{minimax}}(\alpha)$ minimizes the worst-case regret over $\hat{T}(\alpha)$, which is a valid upper bound on the true regret with probability at least 1 − α.

Since $\hat{T}(\alpha)$ may not satisfy Assumption 1, we cannot derive $\hat{w}_{\mathrm{minimax}}(\alpha)$ using Theorem 1. However, even if $\hat{T}(\alpha)$ does not satisfy Assumption 1, we can calculate $\hat{w}_{\mathrm{minimax}}(\alpha)$ using (11) in Remark 4. When the parameter space is $T_{\mathrm{meta}}$, we can easily calculate the refined minimax regret rule using $\hat{T}_{\mathrm{meta}}(\alpha)$. If $(\hat{\beta} - \beta)' S(\hat{\beta})^{-1} (\hat{\beta} - \beta) \le \chi(\alpha, d_x)$ implies β ∈ B, then we have

Then, for t > 0, we obtain

Let β * be a maximizer of the above problem. Using the method of Lagrange multipliers, we find that β * satisfies

where Then, equations (20) imply that

For t < 0, we can obtain similar results. Hence, $\tilde{b}(t, w)$ has the following closed-form expression:

This result makes the computation of $\hat{w}_{\mathrm{minimax}}(\alpha)$ easier. In this case, because S = R, we

Hence, in this case, it is not difficult to compute $\hat{w}_{\mathrm{minimax}}(\alpha)$.

As shown above, when $\sigma_k \to 0$ for all k, $\hat{\delta}_{w_{\mathrm{minimax}}}$ does not necessarily converge to $\delta^*_{IS}$. However, we can show that the refined minimax regret rule using $\hat{T}_{HR}(\alpha)$ converges to $\delta^*_{IS}$. When $\sigma_k \to 0$ for all k, the hyper-rectangle confidence region $\hat{T}_{HR}(\alpha)$ projected onto $\tau_0$ converges to the identified set $IS_0$ defined in (16). Hence, in this case, if $\{\sum_{k=1}^{K} w_k \hat{\tau}_k : \sum_{k=1}^{K} w_k = 1\}$ includes both positive and negative values, that is, $\hat{\tau}_k$ is not the same for all k, then the refined minimax regret rule $\hat{\delta}_{\hat{w}_{\mathrm{minimax}}(\alpha)}$ converges to $\delta^*_{IS}$.

To illustrate the results we obtain in the previous sections, we present some numerical analysis.

Throughout this section, we set the study-specific covariates as equidistant grid points on [0, 1]:

x k = (k − 1)/(K − 1) and σ k = 1 for k = 1, · · · , K.

We consider the following two parameter spaces:

where C is a positive constant. For these two parameter spaces, we derive the minimax regret and minimax MSE rules and compare the maximum regrets of these two treatment rules.

In the case of the linear parameter space $T_1$, one natural treatment rule is based on plugging in the OLS estimator,

where $\tilde{x}_0 \equiv (1, x_0)'$ and $\hat{b}$ is the OLS estimator of $(\beta_0, \beta)'$. Because $\hat{b}$ is linear with respect to $(\hat{\tau}_1, \ldots, \hat{\tau}_K)'$, this rule can be expressed as a linear aggregation rule, $\sum_{k=1}^{K} w_{\mathrm{OLS},k} \hat{\tau}_k$. Another natural treatment rule is based on plugging in the hierarchical Bayes (HB) estimator defined in Remark 1,

where w HB ≡ (w HB,1 , . . . , w HB,K ) ≡w HB / K k=1w HB,k andw HB ≡ (w HB,1 , . . . ,w HB,

0 . We set the prior variance matrix as Because the state space T 1 does not restrict β 0 , we specify a diffuse prior for β 0 . In contrast, because T 1 assumes that β ∈ [−C, C], we assume that β is contained in [−C, C] with prior probability 0.95.

We calculate w OLS , w HB , w MSE , and w minimax for K = 30 and x 0 = 0.1. Table 1 contains the results of this experiment for C = 0.1, 1.0, and 2.0. Table 1 shows the ratios of b(w) and s(w), and the ratios of the maximum regrets, that is,

Because the OLS estimator is unbiased, the maximum bias of the OLS estimator is exactly zero. Hence, the ratio $b(w_{\mathrm{OLS}})/s(w_{\mathrm{OLS}})$ is exactly zero in all settings. For the HB rule, $b(w_{\mathrm{HB}})/s(w_{\mathrm{HB}})$ increases as C increases. The ratio of $b(w_{\mathrm{minimax}})$ and $s(w_{\mathrm{minimax}})$ is smaller than that of $w_{\mathrm{MSE}}$ in all settings. This implies that the minimax regret criterion places more emphasis on the bias than on the variance compared with the minimax MSE criterion. Table 1 shows that the maximum regret of the minimax MSE rule is about 40 percent greater than the minimax regret when C = 1.0. When C = 0.1, the maximum regrets of the rules based on $w_{\mathrm{MSE}}$ and $w_{\mathrm{HB}}$ are close to the minimax regret. If C is sufficiently large, $w_{\mathrm{minimax}}$ is almost the same as $w_{\mathrm{OLS}}$. Hence, when C = 1.0 or 2.0, the maximum regret of $\hat{\delta}_{w_{\mathrm{OLS}}}$ is almost identical to the minimax regret. In contrast, when C is small, $w_{\mathrm{minimax}}$ is quite different from $w_{\mathrm{OLS}}$ and the maximum regret of $\hat{\delta}_{w_{\mathrm{OLS}}}$ is about 30 percent greater than the minimax regret.
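The zero-bias property of the OLS weights can be verified directly in the design of this section (K = 30, equidistant $x_k$ on [0, 1], $\sigma_k = 1$, $x_0 = 0.1$). The sketch below uses the textbook closed form of the simple-regression prediction weights, $w_k = 1/K + (x_0 - \bar{x})(x_k - \bar{x})/S_{xx}$; the variable names and the particular linear vector used in the check are hypothetical.

```python
import math

K, x0 = 30, 0.1
x = [(k - 1) / (K - 1) for k in range(1, K + 1)]   # equidistant grid on [0, 1]

# Weights of the OLS prediction at x0 from a regression of tau_hat on (1, x),
# with equal variances sigma_k = 1.
x_bar = sum(x) / K
s_xx = sum((xk - x_bar) ** 2 for xk in x)
w_ols = [1.0 / K + (x0 - x_bar) * (xk - x_bar) / s_xx for xk in x]

# For any linear tau_k = beta0 + beta * x_k, the weighted sum reproduces tau_0.
beta0, beta = 0.3, -1.7                             # hypothetical coefficients
tau = [beta0 + beta * xk for xk in x]
tau_0 = beta0 + beta * x0
bias_at_tau = sum(w * t for w, t in zip(w_ols, tau)) - tau_0

s_ols = math.sqrt(sum(w * w for w in w_ols))        # std. dev. of the OLS prediction
print(sum(w_ols), bias_at_tau, s_ols)
```

Because the weights sum to one and satisfy $\sum_k w_k x_k = x_0$, the bias vanishes on the whole linear parameter space regardless of C.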

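[Table 1. Columns include ratio (OLS), ratio (HB), and ratio (MSE).]

Note: The ratios shown in the final three columns are the ratios of the maximum regrets, $\max_{\tau \in T} R(\tau, \hat{\delta}_w) / \max_{\tau \in T} R(\tau, \hat{\delta}_{w_{\mathrm{minimax}}})$, for $w = w_{\mathrm{OLS}}$, $w_{\mathrm{HB}}$, and $w_{\mathrm{MSE}}$.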

Next, we consider the Lipschitz parameter space T 2 . We calculate w minimax and w MSE for K = 30 and x 0 = 0.5. Similar to T 1 , we consider the following hierarchical Bayesian model with the prior distribution,

where τ −0 ≡ (τ 1 , . . . , τ K ) and

We set the prior variance of $\tau_k$ as 10 and the prior covariance of $\tau_k$ and $\tau_l$ as $10 \cdot \exp(-|x_k - x_l|/a)$ for some positive constant a > 0. We choose a positive constant a that satisfies $\frac{1}{K(K+1)/2} \sum_{k<l} P(|\tau_k - \tau_l| > C|x_k - x_l|) = 0.05$. Then, the posterior mean of $\tau_0$ is written as

which pins down the weights of the Bayes optimal decision rule, $w_{\mathrm{HB}}$. Table 2 shows the ratios of b(w) and s(w) and the ratios of the maximum regrets for C = 0.1, 1.0, and 2.0. The ratio of $b(w_{\mathrm{minimax}})$ and $s(w_{\mathrm{minimax}})$ is smaller than the ratios of $w_{\mathrm{HB}}$ and $w_{\mathrm{MSE}}$ in all settings. When C is small, the maximum regret of the minimax MSE rule nearly attains the minimax regret. In contrast, when C is large, the maximum regret of the minimax MSE rule is about 17 percent greater than the minimax regret. Similar to the minimax MSE rule, the maximum regret of the hierarchical Bayes rule nearly attains the minimax regret when C is small. In addition, it is about 30 percent greater than the minimax regret when C is large.

Note (Table 2): The ratios shown in the final two columns are the ratios of the maximum regrets, $\max_{\tau \in T} R(\tau, \hat{\delta}_w) / \max_{\tau \in T} R(\tau, \hat{\delta}_{w_{\mathrm{minimax}}})$, for $w = w_{\mathrm{HB}}$ and $w_{\mathrm{MSE}}$.
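The calibration of the prior length-scale a described above can be sketched as a one-dimensional root-finding problem. Under the stated prior, $\tau_k - \tau_l$ is normal with mean zero and variance $20\,(1 - \exp(-|x_k - x_l|/a))$, so each violation probability has a closed form. The sketch assumes that the average runs over all pairs among the K + 1 points $x_0, x_1, \ldots, x_K$ (which gives K(K+1)/2 pairs), sets C = 1 for illustration, and uses bisection on a logarithmic scale; these choices and the function names are assumptions, not the paper's procedure.

```python
import math
from itertools import combinations

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def avg_violation_prob(a, points, C, prior_var=10.0):
    """Average prior probability that |tau_k - tau_l| > C * |x_k - x_l| over all pairs."""
    probs = []
    for xk, xl in combinations(points, 2):
        d = abs(xk - xl)
        var_diff = 2.0 * prior_var * (-math.expm1(-d / a))   # Var(tau_k - tau_l)
        probs.append(2.0 * (1.0 - norm_cdf(C * d / math.sqrt(var_diff))))
    return sum(probs) / len(probs)

def calibrate_a(points, C, target=0.05, lo=1e-6, hi=1e6, iters=200):
    """Bisection on a log scale; the average probability is decreasing in a."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if avg_violation_prob(mid, points, C) > target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

K, x0, C = 30, 0.5, 1.0   # C = 1.0 is an illustrative choice
points = [x0] + [(k - 1) / (K - 1) for k in range(1, K + 1)]
a_hat = calibrate_a(points, C)
print(a_hat, avg_violation_prob(a_hat, points, C))
```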

We illustrate the use of our methods by means of two applications. The first application considers whether an active labor market policy should be adopted, and the second application considers whether a COVID-19 treatment should be approved.

We use the meta-database of Card et al. (2017) , which contains the estimates from over 200 recent studies of active labor market programs including training, subsidized employment, and job search assistance. We focus on papers that analyze RCT data to assess the impact of job training on the employment rate. This criterion reduces the meta-sample to 14 RCT estimates (K = 14) collected from 8 different countries: Argentina, Brazil, Colombia, Dominica, Jordan, Nicaragua, Turkey, and the United States. Table 3 lists the papers included in the meta-sample of this application. To form a vector of study characteristics x k , k = 0, 1, 2, . . . , K, we use five covariates that characterize the country and the sub-population on which the RCT study was performed. These are a gender dummy (male only = 0, female only = 0, mixed = 0.5), an age dummy (age < 25 only = 1, age ≥ 25 only = 0, both = 0.5), an OECD dummy, the (standardized) GDP growth rate, and the (standardized) unemployment rate in 2010. Table 4 shows the estimates, standard errors, and study characteristics in this meta-sample. We derive the minimax regret and minimax MSE rules with the following parameter space:

with a prespecified Lipschitz constant C ≥ 0. To determine C, we perform leave-one-out cross-validation with the study-average welfare criterion to obtain C = 0.025.

We consider whether the training program should be adopted in the following three target populations:

• Japan (

Figures 5-13 plot $w_{\mathrm{minimax}}$, $w_{\mathrm{MSE}}$, and $w_{\mathrm{HB}}$ for the three different target populations. Similar to Section 4, the hierarchical Bayes rule uses the following prior:

where $\Sigma_\tau[k, l] = \exp(-\|x_{k-1} - x_{l-1}\|/a)$ and we choose a satisfying $\frac{1}{K(K+1)/2} \sum_{k<l} P(|\tau_k - \tau_l| > C\|x_k - x_l\|) = 0.05$.

In each figure, the horizontal axis measures the Euclidean distance between $x_k$ and $x_0$. The size of the plotted circle is proportional to the precision of the estimates, i.e., a smaller $\sigma_k$ corresponds to a larger circle. The figures show that, overall, both $w_{\mathrm{minimax}}$ and $w_{\mathrm{MSE}}$ tend to put greater weight on those studies that are similar to the target population in terms of their population characteristics. This tendency is more evident for the minimax regret weights $w_{\mathrm{minimax}}$ than for the minimax MSE weights $w_{\mathrm{MSE}}$.

We note that $w_{\mathrm{minimax}}$ differs from $w_{\mathrm{MSE}}$ for every target population. In all cases, the minimax regret criterion puts the most weight on the closest study. In contrast, the minimax MSE criterion can put the largest weight on a study that is not the closest, provided that it has a small standard error. For instance, in the case of Japan, the minimax regret weight of the closest study is more than 0.6 but the minimax MSE weight is about 0.3. These results reflect the different degrees of bias-variance trade-off that the minimax regret and minimax MSE weights aim to balance, as discussed in Section 3.1.

[Caption fragment from a preceding figure: Estimates 1a, 1b, and 6b receive positive weights.]

Figure 13: The HB weights, $w_{\mathrm{HB}}$, for Peru. The horizontal axis measures the Euclidean distance between $x_k$ and $x_0$, $\|x_k - x_0\|$. The size of the plotted circle is proportional to the precision of the estimates, i.e., a smaller $\sigma_k$ corresponds to a larger circle.

Table 5 lists $\hat{\tau}_0(w_{\mathrm{minimax}})$, $\hat{\tau}_0(w_{\mathrm{MSE}})$, $\hat{\tau}_0(w_{\mathrm{HB}})$, the ratio of the maximum regrets, and the countries that were awarded minimax regret weights larger than 1/K = 1/14. The table also shows that the minimax regret and minimax MSE rules select different decisions in some cases. For example, the average annual salary amongst Japanese women aged 25-29 years is approximately \$30,000; if the cost per person of adopting the policy is \$1,500 and individuals who start a new job work for one year, we could set $c_0 = 0.05$.³ Then, the recommendation of the minimax regret criterion is to introduce the policy in Japan. However, the minimax MSE criterion does not recommend the introduction of the policy in Japan.
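The arithmetic behind the threshold $c_0$ in the Japan example can be written out as follows. The function names are hypothetical, and the rule "adopt if the aggregated estimate is at least $c_0$" is an assumption about how the threshold enters the decision; the aggregated estimates passed to `decide` are made-up values, not entries of Table 5.

```python
def cost_threshold(cost_per_person, annual_salary, years_worked=1.0):
    """Employment gain needed for the program to pay for itself (assumed accounting)."""
    return cost_per_person / (annual_salary * years_worked)

def decide(tau_hat_0, c0):
    """Adopt the policy iff the aggregated effect estimate reaches the threshold."""
    return 1 if tau_hat_0 >= c0 else 0

c0 = cost_threshold(1500, 30000)   # figures quoted in the text
print(c0)

# Hypothetical aggregated estimates on either side of the threshold.
print(decide(0.06, c0), decide(0.04, c0))
```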

For all of the target populations, the maximum regret of the minimax MSE rule is more than 10 percent greater than that of the minimax regret. For Japan and the UK, the minimax regret aggregation rule puts the most weight on the estimates of the US. In contrast, for Peru, the minimax regret criterion puts most of the weight on one estimate obtained from Argentina. 

We consider a drug approval decision for a COVID-19 treatment using the meta-database of randomized clinical trials provided by Juul et al. (2020). There is an urgent global need for

To form a vector of study characteristics, we include two covariates summarizing the average patient characteristics in each study. They are the (standardized) mean (or median) age and the (standardized) proportion of female patients. Table 6 lists the estimates, standard errors, and study characteristics.⁴ We use the following parameter space:

$$T_C \equiv \{\tau : |\tau_k - \tau_l| \le C\|x_k - x_l\| \text{ for } k, l = 0, 1, \cdots, K\},$$

with a prespecified Lipschitz constant C ≥ 0. Because K is small, leave-one-out cross-validation does not seem sensible. Hence, in this application, we set C = 0.01 based on the WLS estimates.

⁴ Studies 2a-2c report the subgroup treatment effect estimates for three age subgroups (<50, 50-69, ≥70). Because we do not have detailed information about the ages within these subgroups, we suppose that the mean ages of these subgroups are 45, 60, and 75, respectively.

We consider hypothetical populations of interest whose characteristics range over 40-80 in terms of average age and 0.34-0.41 for the fraction of female patients. Dark red areas correspond to regions of $x_0$ such that $\hat{\tau}_0(w_{\mathrm{minimax}})$ is positive and large. The white area corresponds to the region of $x_0$ such that $\hat{\tau}_0(w_{\mathrm{minimax}})$ is negative or near zero. The grey line is the boundary that separates the regions of positive and negative $\hat{\tau}_0(w_{\mathrm{minimax}})$. We also plot the covariate values of the meta-sample, with the sizes of the plotted circles being proportional to the precision of the estimates.

Motivated by the recently proposed paradigm of 'patient-centered meta-analysis' (Manski (2020) ), this paper develops a method to aggregate available evidence and inform optimal treatment choice for a target population that is of interest to the planner. Building upon the framework of statistical decision theory and adopting the minimax regret criterion, we obtain a minimax regret treatment choice rule that is simple to implement in practice. The key steps of our analysis that deliver analytical and computational tractability are to constrain decision rules to the class of linear aggregation rules and to restrict the parameter space to a symmetric and invariant one (Assumption 1). These conditions for the parameter space are mild and hold in numerous contexts.

Several questions remain unanswered. First, when τ is constrained to Lipschitz vectors while the Lipschitz constant C is unknown, we do not know of a theoretically justified data-driven way to select C. In the presented empirical applications, we selected C by cross-validation and WLS estimation without any analytical justification for this choice. Second, our framework assumes away any publication bias in published estimates despite a growing concern in the scientific community about this, and increasing interest in how to both detect and correct for any such bias (see, e.g., Andrews and Kasy (2019)). Third, other than the standard errors of the estimates, our framework does not offer any way to incorporate a measure of the credibility of reported estimates. Depending on how the data were sampled and what identifying assumptions the estimate relies on, the credibility of studies can vary greatly. How to incorporate a measure of the credibility of reported estimates beyond their standard errors remains an interesting open question.

Proof of Theorem 2. From Theorem 1, if b(w) is bounded for some w, then the minimax regret is bounded. Without loss of generality, we assume that $|\tau_k - \tau_0|$ is bounded. Setting $w_k = 1$, we have $b(w) = \max_{\tau \in T, \tau_0 = 0} \sum_{k=1}^{K} w_k (\tau_k - \tau_0) = \max_{\tau \in T, \tau_0 = 0} (\tau_k - \tau_0) < M$. Hence, the minimax regret is finite.

Proof of Lemma 1. We observe that

$$\eta(a) = \max_{t \ge 0} \{(t - a)\Phi(-(t - a)) + a \cdot \Phi(-(t - a))\}.$$

Because $(t - a)\Phi(-(t - a)) \le \max_{t' \ge 0} \{t' \cdot \Phi(-t')\} = \eta(0)$ and $\Phi(\cdot) \le 1$, we have η(a) ≤ η(0) + a.

The lower bound is obtained by substituting t = a + v for t · Φ(−t + a).

Proof of Lemma 2. First, we show that the derivative of η(a) is bounded from below by 0 and from above by 1. For a ≥ 0, we define $t^*(a) \equiv \arg\max_{t \ge 0} t \cdot \Phi(-t + a)$.

Next, we show the right-most inequality of (14). For 0 ≤ a ≤ 2, we have $\eta(a) \le 0.17 + a \le \sqrt{1 + a^2}$. Because $\eta'(a)$ is bounded from below by 0 and from above by 1, we have η(a) ≤ (a − 2) + η(2) for a ≥ 2. From numerical evaluation, we obtain η(2) ≈ 1.051, and hence $\eta(a) \le a - 0.5 \le \sqrt{1 + a^2}$ for a ≥ 2.

Finally, we show the left-most inequality of (14). From (A.1), t * (a) is a solution to the following equation:

where Φ(−x)/φ(x) is the Mills ratio of the standard normal distribution. Because we know that Φ(−x)/φ(x) is a strictly decreasing function, we find that t → t · Φ(−t + a) is a uni-modal function. For any d > 0, we observe that where the second equality follows from (A.1). Because t → t · Φ(−t + a) is uni-modal, we obtain t * (a + d) < t * (a) + d for any d > 0. This implies that −t * (a) + a is strictly increasing in a.

Moreover, since $\eta'(a) = \Phi(-t^*(a) + a)$ is strictly increasing, η(a) is convex. From numerical evaluation, we have $t^*(0) \approx 0.75$, and hence $\eta'(0) = \Phi(-t^*(0)) \approx 0.23 > \eta(0)$. Because we have $\frac{d}{da}\left[\eta(0)\sqrt{1 + a^2}\right] \le \eta(0)$, we obtain the left inequality of (14).

Proof of Theorem 3. From Theorem 1 and Lemma 2, we have

$$\max_{\tau \in T} R(\tau, \hat{\delta}_w) = s(w)\,\eta\big(b(w)/s(w)\big) \le s(w)\sqrt{1 + b^2(w)/s^2(w)} = \sqrt{b^2(w) + s^2(w)}.$$

Similarly, using the lower bound of Lemma 2, we obtain

$$\eta(0) \cdot \sqrt{b^2(w) + s^2(w)} \le \max_{\tau \in T} R(\tau, \hat{\delta}_w).$$

This completes the proof.

Proof of Theorem 4. Because $w_{\mathrm{minimax}}$ minimizes the maximum regret, we have

$$\frac{\max_{\tau \in T} R(\tau, \hat{\delta}_{w_{\mathrm{minimax}}})}{\max_{\tau \in T} R(\tau, \hat{\delta}_{w_{\mathrm{MSE}}})} \le 1.$$

Next, we show the right-most inequality of (15). We observe that where this inequality follows from the upper bound of Lemma 2. Because $w_{\mathrm{MSE}}$ minimizes the maximum mean squared error, we obtain

$$\sqrt{b^2(w_{\mathrm{MSE}}) + s^2(w_{\mathrm{MSE}})} \le \sqrt{b^2(w_{\mathrm{minimax}}) + s^2(w_{\mathrm{minimax}})}.$$

Using the lower bound of Lemma 2, we have

$$\sqrt{b^2(w_{\mathrm{minimax}}) + s^2(w_{\mathrm{minimax}})} = s(w_{\mathrm{minimax}})\sqrt{1 + b^2(w_{\mathrm{minimax}})/s^2(w_{\mathrm{minimax}})} \le \frac{1}{\eta(0)}\, s(w_{\mathrm{minimax}})\,\eta\big(b(w_{\mathrm{minimax}})/s(w_{\mathrm{minimax}})\big) = \frac{1}{\eta(0)} \max_{\tau \in T} R(\tau, \hat{\delta}_{w_{\mathrm{minimax}}}).$$

Hence, we obtain the right-most inequality of (15).

Long-run effects of youth training programs: Experimental evidence from Argentina

Identification of and correction for publication bias

A simple approximation for evaluating external validity bias

Optimal inference in a class of regression models

Finite-sample optimal estimation and inference on average treatment effects under unconfoundedness

Efficient policy learning with observational data

Subsidizing vocational training for disadvantaged youth in Colombia: Evidence from a randomized trial

Do women respond less to performance pay? Building evidence from multiple experiments

Remdesivir for the treatment of Covid-19-Final report

The offset tree for learning with partial labels

Inferring welfare maximizing treatment assignment under budget constraints

Introduction to Meta-Analysis

Can arts-based interventions enhance labor market outcomes among youth? Evidence from a randomized trial in Rio de Janeiro

What works? A meta analysis of recent active labor market program evaluations

Time-series minimum-wage studies: a metaanalysis

Bayesian aspects of treatment choice

The Handbook of Research Synthesis and Meta-Analysis

Program evaluation as a decision problem

Was there a Riverside miracle? A hierarchical framework for evaluating programs with grouped data

From local to global: External validity in a fertility natural experiment

Meta-analysis in clinical trials

Meta-analysis in clinical trials revisited

Behind the GATE experiment: Evidence on effects of and rationales for subsidized entrepreneurship training

Generalizing the results from social experiments: theory and evidence from Mexico and India

Evaluating ex ante counterfactual predictions using ex post causal Inference

Soft skills or hard cash? The impact of training and wage subsidy programs on female youth employment in Jordan

Statistical methods for meta-analysis

Asymptotics for statistical treatment rules

Asymptotic analysis of statistical decision rules in econometrics, Handbook of Econometrics

The impact of vocational training for the unemployed: experimental evidence from Turkey

Predicting the efficacy of future training programs using past experiences at other locations

Life skills, employability and training for disadvantaged youth: Evidence from a randomized evaluation design

Meta-analysis of present-bias estimation using convex time budgets

Gaussian Estimation: Sequence and Wavelet Models

Interventions for treatment of COVID-19: A living systematic review with meta-analyses and trial sequential analyses (The LIVING Project)

More efficient policy learning via optimal retargeting

Partial identification, distributional preferences, and the welfare ranking of policies

Optimal taxation and insurance using machine learning -Sufficient statistics and beyond

Constrained classification and policy learning

Who should be treated? empirical welfare maximization methods for treatment choice

Equality-Minded Treatment Choice

Who should get vaccinated? Individualized allocation of vaccines over SIR network

Inference in regression discontinuity designs with a discrete running variable

Demand versus returns? pro-poor targeting of business grants and vocational skills training

Identification problems and decisions under ambiguity: empirical analysis of treatment response and normative analysis of treatment choice

Statistical treatment rules for heterogeneous populations

Minimax-regret treatment choice with missing outcome data

Choosing treatment policies under ambiguity

Credible Ecological Inference for Medical Decisions with Personalized Risk Assessment

Towards credible patient-centered meta-analysis

Model selection for treatment choice: Penalized welfare maximization

Understanding the Average Impact of Microcredit Expansions: A Bayesian Hierarchical Analysis of Seven Randomized Experiments

Aggregating distributional treatment effects: a Bayesian hierarchical analysis of the microcredit literature

Repurposed antiviral drugs for COVID-19-interim WHO SOLIDARITY trial results

Performance Guarantees for Individualized Treatment Rules

Estimation in parallel randomized experiments

Policy transforms and learning optimal policies

Estimation of optimal dynamic treatment assignment rules under policy constraint

The theory of statistical decision

Effect of remdesivir vs standard care on clinical status at 11 days in patients with moderate COVID-19: a randomized clinical trial

Meta-regression analysis: a quantitative method of literature surveys

Wheat from chaff: Meta-analysis as quantitative literature review

Minimax regret treatment choice with finite samples

Minimax regret treatment choice with covariates or with limited validity of experiments

Counterfactual risk minimization: Learning from logged bandit feedback

Statistical treatment choice based on asymmetric minimax regret criteria

How much can we generalize from impact evaluations?

Policy targeting under network interference

Statistical Decision Functions

Remdesivir in adults with severe COVID-19: a randomised, doubleblind, placebo-controlled, multicentre trial

Policy mining: Learning decision policies from fixed sets of data

Estimating individualized treatment rules using outcome weighted learning