key: cord-0198697-1ui6urv8
authors: Manshadi, Vahideh; Niazadeh, Rad; Rodilitz, Scott
title: Fair Dynamic Rationing
date: 2021-02-02
journal: nan
DOI: nan
sha: a4355ade0317d030e35539acbef265a8fabf1eff
doc_id: 198697
cord_uid: 1ui6urv8

We study the allocative challenges that governmental and nonprofit organizations face when tasked with equitable and efficient rationing of a social good among agents whose needs (demands) realize sequentially and are possibly correlated. As one example, early in the COVID-19 pandemic, the Federal Emergency Management Agency faced overwhelming, temporally scattered, a priori uncertain, and correlated demands for medical supplies from different states. In such contexts, social planners aim to maximize the minimum fill rate across sequentially arriving agents, where each agent's fill rate is determined by an irrevocable, one-time allocation. For an arbitrarily correlated sequence of demands, we establish upper bounds on the expected minimum fill rate (ex-post fairness) and the minimum expected fill rate (ex-ante fairness) achievable by any policy. Our upper bounds are parameterized by the number of agents and the expected demand-to-supply ratio, yet we design a simple adaptive policy called projected proportional allocation (PPA) that simultaneously achieves matching lower bounds for both objectives (ex-post and ex-ante fairness), for any set of parameters. Our PPA policy is transparent and easy to implement, as it does not rely on distributional information beyond the first conditional moments. Despite its simplicity, we demonstrate that the PPA policy provides significant improvement over the canonical class of non-adaptive target-fill-rate policies. We complement our theoretical developments with a numerical study motivated by the rationing of COVID-19 medical supplies based on a standard SEIR modeling approach that is commonly used to forecast pandemic trajectories. In such a setting, our PPA policy significantly outperforms its theoretical guarantee as well as the optimal target-fill-rate policy.

In Spring 2020, with the COVID-19 pandemic surging across the US, states were relying on the Federal Emergency Management Agency (FEMA) to provide urgently needed medical equipment from the Strategic National Stockpile. Unequipped for such a widespread emergency, FEMA aimed to ration its limited supplies in order to address states' current needs while also retaining some of the stockpile in anticipation of future needs. However, the allocation decisions made by FEMA were inconsistent and lacked transparency, which frustrated state officials (Washington Post 2020a).

Because having access to medical equipment can be a matter of life or death for a COVID-19 patient, making allocation decisions that are efficient and equitable is of paramount importance (Emanuel et al. 2020). Achieving efficiency alone is easy: a first-come, first-served policy allocates all of the supply to meet early-arriving needs. However, such a policy can be unfair to patients in states where needs materialize later.

The above is just one example of a fundamental sequential allocation problem that social planners face when aiming to allocate divisible goods as efficiently and equitably as possible to demanding agents that arrive over time.

In this paper, we take the first step toward theoretically studying the aforementioned class of problems. We develop a framework for fair dynamic rationing where agents' one-time needs (demands) for a divisible good realize sequentially and can be arbitrarily correlated. In particular, upon arrival of each agent's demand, the planner makes an irrevocable decision about their fill rate (FR), i.e., the fraction of the agent's demand that is satisfied by a one-time allocation. Toward jointly achieving efficiency and equity, the planner aims to maximize the minimum FR, either ex post or ex ante.

To assess the performance of sequential allocation policies, we introduce measures of ex-post and ex-ante fairness guarantees. For this general setting:

(i) We establish upper bounds on the ex-post and ex-ante fairness guarantees achievable by any policy. These bounds are parameterized by the supply scarcity (i.e., the expected demand-to-supply ratio) and the number of agents.

(ii) Remarkably, we show that a simple, adaptive, and transparent policy called projected proportional allocation (PPA) simultaneously achieves our upper bounds on the ex-post and ex-ante fairness guarantees for any set of parameters.

(iii) We illustrate the power of adaptivity by characterizing the ex-post guarantee of the optimal target-fill-rate policy and showing that such a non-adaptive policy cannot achieve our upper bounds.

(iv) Finally, we demonstrate the effectiveness of our policy through an illustrative case study motivated by the allocation of COVID-19 medical supplies based on a model of demand which was used by the White House.

Introducing a framework for fair dynamic rationing: We study the allocation of a divisible good to agents arriving over time with varying levels of demand. We assume the demand sequence is drawn from an arbitrary but known joint distribution across all agents. To account for heterogeneity in the demand level of different agents, we set each agent's utility to be its FR. In our base model, we focus on the objective of maximizing the minimum FR across all agents. Such an objective, which is in the spirit of Rawlsian justice, maximizes the utility of the worst-off agent. As such, it takes fairness into consideration along with efficiency. Due to the stochasticity of the demand sequence, we consider two versions of this objective function: the expected minimum FR and the minimum expected FR (see eq. (ex-post) and eq. (ex-ante), respectively, as well as the subsequent discussion).

Like other online stochastic optimization problems, our sequential allocation problem can be formulated as a dynamic program (DP), and it similarly suffers from the curse of dimensionality as well as other practical limitations such as a lack of interpretability. (We provide further discussion of the DP in Remark 2, and in Appendix B we formally present the DP, illustrate its exponential size, and discuss other practical drawbacks.) Consequently, we aim to design sequential allocation policies that perform well while being practically appealing and computable in polynomial time.

We assess the performance of a policy by computing its ex-post and ex-ante fairness guarantees for any given supply scarcity and number of agents. In defining our notions of such guarantees, we use the minimum FR achievable under deterministic demand as a normalization factor (see Definitions 1, 2, and 3 and their related discussion in Section 2) to separate the impact of demand stochasticity from the impact of supply scarcity. The ex-post (resp. ex-ante) fairness guarantee of a policy serves as a lower bound on the expected minimum (resp. minimum expected) FR that the policy achieves relative to our normalization factor under all possible joint demand distributions.

Establishing upper bounds: In order to gain insight into the difficulty of achieving equity and efficiency in sequential allocation, we develop upper bounds on the achievable fairness guarantees of any policy, even policies which cannot be computed in polynomial time. For intuition, consider the following example with two agents. The first agent has demand of $B_1$, where $B_1$ is a Bernoulli random variable with success probability 2/3. The second agent has demand $B_1 \times B_2$, where $B_2$ is an independent Bernoulli random variable with success probability 1/2. In other words, the demand sequence is equally likely to be (0, 0), (1, 0), or (1, 1). For such an instance, no sequential policy can distinguish between the latter two scenarios after observing the first demand, which leads to a sub-optimal decision. Building on the above intuition, in Sections 3.1 and 3.5, we establish upper bounds on the ex-post and ex-ante guarantees of any policy (see Theorems 1 and 4).
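The two-agent instance above can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes that an agent with zero demand counts as having a fill rate of 1, that the supply is normalized to 1, and that the second agent simply receives whatever supply remains. The function names and the grid over first-period allocations are hypothetical choices made for illustration.

```python
from fractions import Fraction

# The three equally likely demand sequences from the example.
SCENARIOS = [(0, 0), (1, 0), (1, 1)]
PROB = Fraction(1, 3)


def fill_rate(x, d):
    # Assumption: an agent with zero demand counts as fully served.
    return Fraction(1) if d == 0 else Fraction(x) / d


def sequential_value(x1):
    """Expected minimum FR when agent 1 receives x1 whenever d1 = 1.

    After seeing d1 = 1 the policy cannot tell (1, 0) from (1, 1), so it
    must use the same x1 in both; agent 2 then gets whatever is left.
    """
    total = Fraction(0)
    for d1, d2 in SCENARIOS:
        a1 = min(x1, d1)        # cannot allocate more than the demand
        a2 = min(1 - a1, d2)    # remaining supply (s = 1) goes to agent 2
        total += PROB * min(fill_rate(a1, d1), fill_rate(a2, d2))
    return total


def hindsight_value():
    """Expected minimum FR if the full sequence were known in advance."""
    total = Fraction(0)
    for d1, d2 in SCENARIOS:
        demand = d1 + d2
        total += PROB * (Fraction(1) if demand <= 1 else Fraction(1, demand))
    return total


grid = [Fraction(k, 100) for k in range(101)]
best_sequential = max(sequential_value(x1) for x1 in grid)
```

Under these assumptions, the best first-period allocation on the grid yields an expected minimum FR of 2/3, whereas equalizing fill rates with full knowledge of the sequence would yield 5/6. The gap comes entirely from not knowing the second demand when the first allocation is made.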

As we later show, these bounds are indeed tight. Thus, conducting comparative statics with respect to the supply scarcity and the number of agents reveals several insights (see Figure 1): when demand is small relative to supply, the bounds on both fairness guarantees deteriorate with increased demand. However, in the over-demanded regime, the bounds are independent of the supply scarcity. Further, in both the under-demanded and over-demanded regimes, the ex-post fairness guarantee worsens with more agents. On the other hand, the ex-ante fairness guarantee is independent of the number of agents. This highlights the fundamental difference in our notions of fairness: the objective corresponding to ex-post fairness is concerned with fairness along all sample paths, whereas the objective corresponding to ex-ante fairness is only concerned with marginal fairness (see the related discussion in Section 2).

Achieving upper bounds: Since our upper bounds apply to all sequential policies including the optimal online policy (namely, the exponential-sized DP), it would be reasonable if no policy could achieve these upper bounds in polynomial-time. However, we show that not only are these upper bounds achievable, but they can be achieved by our PPA policy. To motivate our policy, let us consider a hypothetical situation where the demand sequence is known a priori. In that case, the optimal allocation under both objectives is to equalize the FRs and then maximize that FR (see eq. (1) and its related discussion). Alternatively, this can be written as a deterministic DP with a simple solution: at each time period, proportionally allocate the remaining supply based on the current demand and the total future demand (see Section 3.2 and Appendix C). When demand is stochastic, our PPA policy simply replaces all the future random demands by their projected values, namely, their conditional expectations (see eq. (4)).
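The allocation rule described above can be sketched as follows. This is one plausible reading of the verbal description (allocate the remaining supply in proportion to current demand versus current plus projected future demand), not a reproduction of the paper's eq. (4). The callback `expected_future` is a hypothetical stand-in for the conditional expectation of total future demand given the history.

```python
from fractions import Fraction
from typing import Callable, Sequence


def ppa_allocate(
    demands: Sequence[Fraction],
    expected_future: Callable[[Sequence[Fraction]], Fraction],
    supply: Fraction = Fraction(1),
) -> list:
    """Sketch of a projected proportional allocation.

    expected_future(history) is a hypothetical callback returning the
    conditional expectation of total demand still to come, given the
    demands observed so far (including the current one).
    """
    allocations = []
    remaining = Fraction(supply)
    for i, d in enumerate(demands):
        history = demands[: i + 1]
        projected_total = d + expected_future(history)
        if projected_total > 0:
            # Split the remaining supply in proportion to current demand
            # versus projected total demand, never exceeding the demand.
            rate = min(Fraction(1), remaining / projected_total)
        else:
            rate = Fraction(1)  # nothing demanded now or later
        x = rate * d
        allocations.append(x)
        remaining -= x
    return allocations


# Known demand sequence: the projection is just the true remaining demand.
known = [Fraction(1), Fraction(1), Fraction(2)]
perfect_foresight = lambda history: sum(known[len(history):], Fraction(0))
allocs = ppa_allocate(known, perfect_foresight)
rates = [x / d for x, d in zip(allocs, known)]
```

With a known demand sequence the sketch equalizes all fill rates (here each agent receives one quarter of its demand), which matches the deterministic benchmark described in the text. With stochastic demand, only the callback changes.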

In Sections 3.3 and 3.5, we analyze the ex-post and ex-ante fairness guarantees of the PPA policy and show that it achieves the best of both worlds: our lower bounds on the PPA policy's guarantees match the corresponding upper bounds for any supply scarcity and any number of agents. These two analyses rely on delicate inductive arguments. For ex-post fairness, we establish a lower bound on the value-to-go function of our PPA policy by analyzing the evolution of the minimum FR and progressively constructing a worst-case joint distribution for demand (see Lemma 1 in Section 3.3).

For ex-ante fairness, we demonstrate that the expected demand-to-supply ratio before the arrival of each agent is non-increasing when following the PPA policy, which enables us to bound the marginal expected FR for each agent (see Appendix F).

We highlight that beyond enjoying the best possible guarantees, our PPA policy is practically appealing: it is computationally efficient, interpretable, and transparent. In addition, it does not require full distributional knowledge, as it only relies on the first conditional moments of the joint distribution for demand. Policies which rely on detailed distributional knowledge can be prone to errors or perturbations (see Remark 2 and Appendix B.4).

Establishing sub-optimality of target-fill-rate policies: In addition to showing that our PPA policy achieves the best possible guarantees, we extend our work to studying the subclass of target-fill-rate (TFR) policies. A TFR policy commits upfront to a fill rate τ, and upon arrival of each agent, it allocates a fraction τ of that agent's demand until it exhausts the supply. Our study of TFR policies is motivated by two reasons: (i) since such policies are transparent and easy-to-communicate, they are frequently used in practice, including at the outset of the COVID-19 pandemic when an initial formula allocated a fixed percentage of states' estimated needs (Washington Post 2020a), and (ii) TFR policies are a natural yet powerful class of non-adaptive policies (see Section 3.4). Consequently, comparing the performance of TFR policies with that of our adaptive PPA policy sheds light on the limitations of making non-adaptive decisions.
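A TFR policy as described above can be sketched in a few lines. This is an illustrative sketch only: the function name is hypothetical, the supply is normalized to 1, and the target τ is assumed to lie in [0, 1] so that no allocation exceeds the corresponding demand.

```python
from fractions import Fraction


def tfr_allocate(demands, tau, supply=Fraction(1)):
    """Allocate a fraction tau of each demand until the supply runs out.

    Assumes 0 <= tau <= 1, so an allocation never exceeds its demand.
    """
    allocations = []
    remaining = Fraction(supply)
    for d in demands:
        x = min(Fraction(tau) * d, remaining)  # capped by what is left
        allocations.append(x)
        remaining -= x
    return allocations


demands = [Fraction(1), Fraction(1)]
conservative = tfr_allocate(demands, Fraction(1, 2))
aggressive = tfr_allocate(demands, Fraction(3, 4))
```

In this toy instance, a target of 1/2 serves both agents equally, while a target of 3/4 exhausts the supply early and leaves the second agent with a fill rate of only 1/4. The target is fixed before any demand is observed, which is the non-adaptivity the text refers to.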

Intuitively, a TFR policy can perform poorly because it does not take advantage of information that reduces future uncertainty. For instance, consider a setting with two agents where the second agent's demand is perfectly correlated with the first agent's demand. A simple adaptive policy, such as our PPA policy, will perform optimally in such a setting because demand is deterministic upon the first agent's arrival. The PPA policy achieves such performance by crucially leveraging information about the second agent's demand when determining the first agent's fill rate. In contrast, a TFR policy targets the same fill rate regardless of the first agent's demand, and consequently cannot ensure that sufficient supply remains for the second agent. Based on this intuition, in Section 3.4, we provide a tight bound on the ex-post fairness guarantee of the optimal TFR policy (see Theorem 3), which can be considerably lower than the corresponding guarantee of our adaptive PPA policy (see Figure 4(a)). On the other hand, we show that if the coefficient of variation of total demand is low, then the optimal TFR policy can provide a stronger guarantee than our PPA policy (see Proposition 2 and Figure 4(b)).

To characterize the ex-post fairness guarantee of the optimal TFR policy, we construct the worst-case total demand distribution against such a policy. In the proof, we establish a rather surprising connection to the literature on monopoly pricing and Bayesian mechanism design (see Hartline (2013) for more details on this literature). In particular, upon mapping the problem of finding the worst-case instance into the quantile space, our problem reduces to a constrained version of the (single-item) monopoly pricing problem (see Remark 4). We identify two key properties of the worst-case distribution in this constrained monopoly pricing problem, and by exploiting the connection to our original problem, we end up with the desired characterization of the worst-case total demand distribution against the optimal TFR policy. Due to this connection, our proof technique and corresponding results can be of independent interest (e.g., see Alaei et al. (2019) for proof techniques and results in the same spirit).

Illustrative case study: To demonstrate the effectiveness of our policy, in Section 4 we conduct a numerical case study motivated by the allocative challenges that FEMA faced at the beginning of the COVID-19 pandemic (as discussed at the beginning of this section). Borrowing from the epidemiology literature around the COVID-19 pandemic, we develop a simple compartmental SEIR model that governs the need for medical supplies in different inter-connected locations. Using such a model, we demonstrate that the demand is highly variable and has a complex correlation structure across locations (see Figure 6). Our simulation results illustrate the superior performance of our PPA policy compared to both its ex-post fairness guarantee and the optimal TFR policy. Further, the results suggest that our PPA policy performs nearly as well as the DP solution (which, as we discuss, suffers from many practical limitations). Additionally, our simulation results demonstrate the efficiency of our policy as well as its robustness to model mis-specification (see Table 2).

Allocating medical supplies in a pandemic is just one motivating example of the challenges that arise when a governmental or nonprofit organization aims to ration supply among agents whose (a priori uncertain and correlated) needs realize sequentially. Other examples include the allocation of emergency aid when a natural disaster such as a hurricane or wildfire impacts multiple locations over time, as well as the distribution of food donations by mobile pantries that sequentially visit agencies (Lien et al. 2014). 3 Our proposed policy can effectively guide transparent allocation decisions in such contexts while also providing a guarantee on the fairness level of the process. Finally, as discussed in Section 5, our framework can be enriched to account for other practical considerations, such as (i) generalized objective functions that enable the social planner to balance equity and efficiency to varying degrees, and (ii) rationing multiple types of resources (see Corollaries 1, 2, and 3 in Section 5.2).

We conclude this section by discussing how our work relates to and contributes to several streams of literature.

Fairness in static resource allocation: Considerations of fairness and its trade-off with efficiency have frequently arisen in the resource allocation literature in operations research and computer science. We begin by discussing papers which study fairness in static (one-shot) allocation settings. The seminal work of Bertsimas et al. (2011) considers a general setting where a central decision-maker allocates m divisible resources to n agents, each with a different utility function. Focusing on two commonly used notions of fairness in allocation, max-min and proportional fairness, the authors characterize the efficiency loss due to maximizing fairness (see also Bertsimas et al. 2012 and Bertsimas et al. 2013). If demand were deterministic in our setting, the optimal allocation would coincide with that of the max-min objective in Bertsimas et al. (2011). Namely, for both objectives, the optimal allocation consists of maximized equal FRs.

3 As explained in detail in Lien et al. (2014), even though the daily demands for food donations from different agencies are not temporally scattered, they are only observed by the operators upon their arrival at the sites.

Focusing on indivisible goods, Donahue and Kleinberg (2020) considers the trade-off between fairness and utilization when demand is distributed across different agents. A priori, only demand distributions are known. However, after a one-shot allocation decision, all demand values realize. The fairness notion considered in this line of work is in the same spirit as our notion of ex-ante fairness: they require that an individual's chance of receiving the resource should not significantly depend on the group to which the individual belongs. Similarly, by maximizing the minimum expected FR, we aim to reduce the impact of an agent's place in the sequence of arrivals. Sharing similar motivation to our paper, Pathak et al. (2020) and Grigoryan (2021) consider equitable COVID-19 vaccine allocation. However, the settings (e.g., offline and deterministic), models, and techniques in both papers differ drastically from those in this work.

Also falling within the category of static allocation of indivisible goods, a stream of papers in computer science considers allocation problems when agents' valuations are deterministically known. For deterministic algorithms, recent research has centered on the existence of allocations which satisfy certain fairness properties, such as envy-freeness up to any good (see, e.g., Chaudhury et al. (2020) and references therein). For randomized algorithms, the closest to our work is the recent work of Freeman et al. (2020), which uses notions of ex-post and ex-ante fairness and explores whether both can be achieved simultaneously. They develop a randomized algorithm that is approximately fair ex post and precisely fair ex ante. We ask a similar question, albeit in a dynamic divisible-good setting with random and correlated demand, and we affirmatively answer it: our PPA policy exactly achieves the best possible fairness guarantee ex post as well as ex ante (see Theorems 2 and 4).

Sinclair et al. (2020) considers a similar model; however, it focuses on a multi-criteria objective which is based on an allocation's distance from the optimal offline Nash Social Welfare solution. We note that their notion of fairness is also different in nature from ours.

The algorithmic aspects of both Lien et al. (2014) and Sinclair et al. (2020) consist of designing novel heuristics and numerically evaluating them against a relevant benchmark (the intractable DP solution and the Nash Social Welfare solution, respectively). On the other hand, we take a theoretical approach and analyze fairness guarantees for the policies we design. Further, we provide upper bounds on the performance of any policy (including the DP solution), which serves a dual purpose: (i) it establishes that our policy is the best possible one if we aim to achieve both ex-ante and ex-post fairness guarantees, and (ii) it highlights the fundamental limits of achieving equity in a dynamic setting.

A related stream of papers in computer science studies fair division problems in dynamic settings (Walsh 2011, Kash et al. 2014, Aleksandrov et al. 2015), albeit based on different motivating applications and with different objectives. Consequently, the dynamic aspects of these papers differ from our modeling approach. Walsh (2011) studies a setting where agents arrive over time but allocation decisions are not required to be immediate. In a similar direction, Kash et al. (2014) and Aleksandrov et al. (2015) consider models where agents remain in the system and can receive multiple allocations. Beyond the model dynamics, these papers allow for an arbitrary sequence of arriving agents (e.g., an adversarial arrival model). Moreover, their results pertain to obtaining envy-free allocations, and hence do not rely on a direct comparison of agents' utilities. In contrast, in our setting agents' demands are drawn from a known (arbitrarily correlated) joint distribution, and our results center on direct comparisons based on agents' fill rates (which play the role of the utilities in our model).

In settings with multiple types of resources, Azar et al. (2010) and Bateni et al. (2016) study online versions of Fisher markets and develop policies with fairness guarantees under two different arrival models. The former assumes an adversarial model whereas the latter considers demand that belongs to a general class of stochastic processes. There are fundamental differences between our work and the aforementioned papers. Just to name one, the settings of Azar et al. (2010) and Bateni et al. (2016) are motivated by online advertising, where demanding agents (advertisers) are offline and items (impressions) arrive in an online fashion. Demanding agents have a large budget compared to the price of each arriving item, and they derive item-specific utilities. Consequently, the fairness notion is concerned with the total utility of each agent, which is a function of all items allocated to it during the horizon. In contrast to such a setting, demanding agents in our work arrive in an online fashion while the supply side is offline, and each demanding agent receives a single allocation. The recent works of Ma and Xu (2020) and Nanda et al. (2020) are closer to our setting in that the demanding agents arrive online; however, they differ in several aspects: (i) the underlying arrival process is known i.i.d. where arriving demand belongs to various groups, (ii) they focus on group-level fairness, and (iii) they consider a matching setting, i.e., allocating indivisible goods.

The objectives of ex-post and ex-ante fairness which we study in our problem bear some resemblance to the objective in the online contention resolution scheme (OCRS) problem, although the two problems are not directly comparable. An OCRS is essentially a rounding algorithm that aims to uniformly preserve the marginals induced by a fractional solution while obtaining feasibility of the final allocation. This technique has found application in many settings such as Bayesian online selection, oblivious posted pricing mechanisms, and stochastic probing models (see, e.g., Alaei 2014, Feldman et al. 2016, and Lee and Singla 2018). The OCRS problem diverges from ours because that setting focuses on designing randomized policies for allocating indivisible goods, while our focus is on divisible goods (consequently, restricting to deterministic policies is without loss).

Dynamic allocation of social goods: On a broader level, our paper is related to the literature on dynamic allocation of social goods and services, such as public housing, donated organs, and emergency care. Examples of centralized allocation policies include Kaplan (1984), Ashlagi et al. (2013), and Agarwal et al. (2019); examples of decentralized mechanisms are Leshno (2019), Anunrojwong et al. (2020), and Arnosti and Shi (2020). For the most part, the aforementioned papers focus on the analysis of social welfare in steady-state models where both demand and supply dynamically arrive. We complement this literature by focusing on equitable allocation in a non-stationary framework where a fixed amount of supply must be rationed across demand that arrives over time.

Further, our work is broadly related to the growing literature on dynamic mechanism design without money (see, e.g., Gorokh et al. 2019). These papers study settings with repeated interaction between a principal and agents, and they assume that agents' valuations are drawn independently across individuals and across time. Our framework differs from such settings in several key aspects: (i) each agent only interacts once with the social planner, (ii) agents' demands can be arbitrarily correlated, and (iii) as explained below, agents are non-strategic.

In our work, we abstract away from strategic behavior and assume that agents neither control the timing of their demand (i.e., the order of their arrival) nor misrepresent their demand, either individually or as a coalition. Several papers in the supply chain and social choice literature (see, e.g., Sprumont 1991, Lee et al. 1997, Cachon and Lariviere 1999) study incentive issues that arise when demand for a resource exceeds its capacity in various other contexts with no access to monetary mechanisms (as is the case in our setting). In particular, Lee et al. (1997) shows that proportional allocation (which is the static version of our proposed PPA policy) can induce strategic behavior as agents may benefit from over-stating their demand. However, such strategic considerations are often inapplicable in our motivating applications. In contexts such as a pandemic or a natural disaster, the sequence of realized demand is exogenous. Furthermore, demand is verifiable in these settings, and false reporting can be severely punished under the Disaster Fraud Act.

Online resource allocation: From a technical point of view, our work is related to the rich literature on online resource allocation and prophet inequalities, which started from the seminal work of Krengel and Sucheston (1978) and Samuel-Cahn et al. (1984). For an informative survey, we refer the interested reader to Lucier (2017). We highlight that in terms of modeling demand, our work departs from the prevailing approaches in this literature, namely adversarial, i.i.d., or random permutation arrival models. In our work, we assume that the sequence of demands can be arbitrarily correlated and the joint distribution is known in advance. In terms of modeling demand, our work is closest to a few papers that consider prophet inequalities with correlated demand (Rinott and Samuel-Cahn 1992, Truong and Wang 2019, Immorlica et al. 2020). However, the nature of the online decisions is different; in our model, a fraction of a divisible good is allocated to each arriving demand, whereas in prophet inequality settings, an indivisible good is allocated to a single agent.

Problem setup: Consider a planner that is using a sequential allocation policy (also referred to as an online policy) to allocate a divisible resource of supply $s$ among $n$ agents. Without loss of generality, we normalize the total supply so that $s = 1$. Agents arrive sequentially over time periods $1, 2, \ldots, n$, and we index agents according to the period in which they arrive. Once agent $i$ arrives, their demand $d_i \in \mathbb{R}_{\geq 0}$ is realized and observed by the planner. Based on the observed demand $d_i$ and the history up to time period $i$, the sequential policy makes an irrevocable decision by allocating an amount $x_i$ of the resource to this agent. The allocated amount $x_i$ cannot exceed the agent's realized demand $d_i$ nor can it exceed the remaining supply before agent $i$'s arrival, which we denote by $s_i$. Thus, $x_i$ is a feasible allocation if $x_i \in [0, \min\{s_i, d_i\}]$. Given the feasible allocation $x_i$, the FR of agent $i$ is $x_i / d_i$.

After allocating $x_i$ to agent $i$, the remaining supply before the arrival of agent $i + 1$ is

$$s_{i+1} = s_i - x_i.$$

To model the uncertainty about future demands, we consider a Bayesian setting where the $d_i$'s are stochastic and arbitrarily correlated such that $\mathbf{d} = (d_1, d_2, \ldots, d_n)$ is drawn from a joint distribution $F \in \Delta(\mathbb{R}^n_{\geq 0})$ known by the planner. A key characteristic of a demand sequence is the expected demand-to-supply ratio, which we call the supply scarcity as defined formally below.

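Definition 1 (Supply Scarcity). The supply scarcity of a demand sequence $\mathbf{d}$ drawn from a joint distribution $F \in \Delta(\mathbb{R}^n_{\geq 0})$ is given by:

$$\mu \triangleq \frac{\mathbb{E}_{\mathbf{d} \sim F}\left[\sum_{i \in [n]} d_i\right]}{s}.$$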

Since we normalize the supply to be 1, the supply scarcity is equal to the total expected demand.

For simplicity of presenting our results, we consider joint distributions that assign non-zero probability to at least one sample path of demands with $d_n \neq 0$. Equivalently, we assume $d_n$ is not deterministically equal to zero.

As detailed earlier, our setup is motivated by the distributional operations of a governmental or nonprofit organization. Consequently, we focus on an egalitarian planner that intends to balance the equity and efficiency of the allocation. To this end, the planner's objective is to maximize the minimum achieved FR among the agents, i.e., $\min_{i \in [n]} x_i / d_i$, given the uncertainty in the demands. Maximizing such an objective has its roots in the classic literature on welfare economics (e.g., Arrow 1963) and has been studied more recently in similar contexts in operations research (e.g., Lien et al. 2014). It provides equity through its focus on the worst FR across all agents (in contrast to the sum of FRs) and provides efficiency by aiming to maximize this FR (in contrast to allocating an equally minimal amount of the resource to all agents).

Before introducing our objective function, we comment on our assumptions that the supply is fixed a priori and that we make a one-time allocation decision for each agent. In some applications, supply can get replenished over time, and there can be multiple allocations to the same agent. However, in our motivating applications there is a time urgency that we aim to incorporate into our model: as one example, during a pandemic it may take months to replenish the national stockpile of medical resources such as ventilators. 13 In the meantime, a "pandemic wave" may last only a few weeks, which makes any potential second shipment less valuable or even unnecessary. (For instance, the need for ventilators may greatly reduce a few weeks after the peak.) The same features exist in response to a natural disaster, as providing relief is most valuable in the immediate aftermath.

Objectives & fairness guarantees: Since demands are a priori uncertain in the setup described above, the planner should consider appropriate metrics to aggregate over uncertain outcomes.

We now formally define the planner's objectives by considering two different metrics: the ex-post minimum FR and the ex-ante minimum FR. For any sequential allocation policy $\pi$, the ex-post minimum FR of policy $\pi$ is its expected minimum FR, i.e.,

$$W_p^F(\pi) \triangleq \mathbb{E}_{\mathbf{d} \sim F}\left[\min_{i \in [n]} \frac{x_i}{d_i}\right], \tag{ex-post}$$

where $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ is the sequence of allocations generated by $\pi$. On the other hand, the ex-ante minimum FR of policy $\pi$ is its minimum expected FR, i.e.,

$$W_a^F(\pi) \triangleq \min_{i \in [n]} \mathbb{E}_{\mathbf{d} \sim F}\left[\frac{x_i}{d_i}\right]. \tag{ex-ante}$$

For a randomized policy $\pi$, we abuse notation and again use $W_p^F(\pi)$ and $W_a^F(\pi)$ to denote the expectation of the above two quantities over the policy $\pi$'s internal randomness. 14 These two objectives represent two different notions of fairness: eq. (ex-post) aims for equity in outcomes, whereas eq. (ex-ante) aims for equity in expected outcomes. We largely focus on the ex-post minimum FR for two main reasons. First, when allocating supplies in response to a rare event like a pandemic or natural disaster, agents only observe one realized outcome. Because the ex-ante minimum FR is only concerned with marginal fairness, it can have unfair outcomes for every sample path, i.e., every realized demand sequence. In contrast, the ex-post minimum FR considers each full sample path; every sample path with positive probability which results in an unfair outcome reduces $W_p^F(\pi)$. Second, by Jensen's inequality, the ex-post minimum FR serves as a lower bound on the ex-ante minimum FR, i.e., for any policy $\pi$,

$$W_p^F(\pi) \leq W_a^F(\pi).$$

13 During the first peak of the COVID-19 pandemic in the US, it took a few months to produce ventilators, and there were concerns that those ventilators would be ready too late (Washington Post 2020b).

14 In principle, we allow randomization of our policies in this paper; however, as will be clear later, all of our proposed policies are deterministic and no randomization is needed to obtain our targeted performance guarantees.

However, for a fixed ex-post minimum FR, achieving a higher ex-ante minimum FR is desirable because it reduces systematic biases against a particular agent, e.g., the last-arriving agent. In the extreme case where W F a (π) = W F p (π), one particular agent receives the smallest FR, regardless of the sample path. On the other hand, W F a (π) > W F p (π) implies that the worst-off agent varies across different sample paths.
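To make the two metrics concrete, the following is a minimal sketch that evaluates both on a hypothetical two-agent instance. The instance, the allocations, and the helper names (`fill_rate`, `ex_post_min_fr`, `ex_ante_min_fr`) are illustrative assumptions and not part of the model; in particular, treating an agent with zero demand as fully served is an assumption made here for convenience.

```python
from typing import List, Sequence, Tuple

# A sample path: (probability, demands, allocations). All names are illustrative.
SamplePath = Tuple[float, Sequence[float], Sequence[float]]


def fill_rate(x: float, d: float) -> float:
    """Fill rate x/d; zero demand is treated as fully served (assumption)."""
    return 1.0 if d == 0 else x / d


def ex_post_min_fr(paths: List[SamplePath]) -> float:
    """Expected minimum fill rate: E[min_i x_i / d_i]."""
    return sum(
        p * min(fill_rate(x, d) for x, d in zip(xs, ds)) for p, ds, xs in paths
    )


def ex_ante_min_fr(paths: List[SamplePath]) -> float:
    """Minimum expected fill rate: min_i E[x_i / d_i]."""
    n = len(paths[0][1])
    return min(
        sum(p * fill_rate(xs[i], ds[i]) for p, ds, xs in paths) for i in range(n)
    )


# Agent 1 always demands 0.5 and receives 0.4. Agent 2 demands 0.5 or 1.5 with
# equal probability and receives as much of the remaining 0.6 as it needs.
paths = [
    (0.5, (0.5, 0.5), (0.4, 0.5)),
    (0.5, (0.5, 1.5), (0.4, 0.6)),
]
print(ex_post_min_fr(paths), ex_ante_min_fr(paths))
```

On this instance the ex-post value is roughly 0.6 and the ex-ante value is roughly 0.7. The gap arises because the worst-off agent differs across the two sample paths (agent 1 on the first, agent 2 on the second).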

Having defined our notions of fairness, we first observe that if the sequence of demand is deterministic, then the policy that maximizes both the ex-post and the ex-ante minimum FR is simply equalizing all FRs. Namely, this policy achieves a FR equal to:

$$\min\Big\{1, \frac{1}{\sum_{i \in [n]} d_i}\Big\}. \qquad (1)$$

Inspired by the above observation, we define a normalization factor that will help us with providing more informative performance guarantees.

Definition 2 (Normalization Factor). Given a joint demand distribution $F \in \Delta(\mathbb{R}^n_{\ge 0})$ with total expected demand of µ, the normalization factor $W \triangleq \min\{1, 1/\mu\}$ represents the optimum ex-post and ex-ante minimum FR if F is replaced by a deterministic distribution (over demand sequences) with an identical total expected demand µ.

To see why we introduce the normalization factor W, note that the above observation in eq. (1) highlights that even without stochasticity in the demand sequence, when total demand exceeds supply we cannot guarantee a minimum FR better than 1/µ. This is simply due to the "scarcity of supply." However, if the sequence of demands is stochastic (and possibly correlated), an ex-post or ex-ante minimum FR of W may not always be achievable by an online allocation decision maker due to the "scarcity of information," i.e., the unknown realizations of future demands. Consequently, we use W to decouple the effect of supply scarcity from the effect of stochasticity in the demand sequence on the quality of online allocation decisions.

We emphasize that W is not a benchmark in the sense that it does not serve as an upper bound on the achievable ex-post (or ex-ante) minimum FR for all joint demand distributions with a total expected demand of µ. Due to the stochastic nature of demand and convexity in our objectives, it is possible that when taking expectation over the realized demand sequence, we can achieve ex-post and ex-ante minimum FRs that are higher than W. However, as explained above, W serves as an informative normalization factor to decouple the effects of supply scarcity and information scarcity. 15
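The point that W is not an upper bound can be checked on a small example. The sketch below uses a hypothetical single-agent instance with unit supply (the demand values and the function name `normalization_factor` are assumptions chosen for illustration); with one agent, allocating min{d, 1} is optimal, so the expected FR can be computed directly.

```python
def normalization_factor(mu: float) -> float:
    """W = min{1, 1/mu} for total expected demand mu (supply normalized to 1)."""
    return min(1.0, 1.0 / mu)


# Hypothetical single-agent instance: (probability, demand).
scenarios = [(0.5, 0.5), (0.5, 3.5)]

mu = sum(p * d for p, d in scenarios)  # total expected demand
W = normalization_factor(mu)

# With a single agent, serving min{d, 1} gives a fill rate of min{1, 1/d}.
expected_fr = sum(p * min(1.0, 1.0 / d) for p, d in scenarios)
print(mu, W, expected_fr)
```

Here µ = 2 and W = 1/2, while the expected FR is 1/2 + 1/7, which exceeds W because 1/d is convex in d.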

Consequently, we evaluate policies based on how they perform relative to W . For a policy π and a joint demand distribution F, we say that the policy achieves ex-post fairness (resp. ex-ante fairness) of W F p (π)/W (resp. W F a (π)/W ). We aim to design a policy with guarantees on both ex-post and ex-ante fairness that hold universally for all joint demand distributions F with n agents and supply scarcity µ. We refer to the universal lower bounds of a policy π as its fairness guarantees, which we formally define below.

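Definition 3 (Ex-post/Ex-ante Fairness Guarantee). A sequential allocation policy π achieves an ex-post fairness guarantee (resp. ex-ante fairness guarantee) of κ p (µ, n) (resp. κ a (µ, n)) if

$$\inf_{F \in \Delta(\mathbb{R}^n_{\ge 0};\, \mu)} \frac{W^F_p(\pi)}{W} \ge \kappa_p(\mu, n) \qquad \Big(\text{resp. } \inf_{F \in \Delta(\mathbb{R}^n_{\ge 0};\, \mu)} \frac{W^F_a(\pi)}{W} \ge \kappa_a(\mu, n)\Big),$$

where $\Delta(\mathbb{R}^n_{\ge 0}; \mu)$ denotes the domain of joint demand distributions with n agents and total expected demand of µ.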

Our goals are (i) to understand the limits of achieving fairness in sequential allocation by computing upper bounds on the achievable guarantees, and (ii) to obtain tight lower bounds by designing policies with strong ex-post guarantees as well as ex-ante guarantees. We show in Section 3 that no gap exists between the achievable upper and lower bounds under both ex-post and ex-ante notions. More specifically, we show how to obtain exactly matching upper and lower bounds for both notions of fairness using a single adaptive policy.

In this section, we present our main results for the setting introduced in Section 2. First, we focus on ex-post fairness in Section 3.1 and establish parameterized upper bounds on the ex-post fairness guarantee achievable by any sequential allocation policy, whether adaptive or non-adaptive, computationally efficient (i.e., with polynomial running time) or not. Then, somewhat surprisingly, we show that such upper bounds can be achieved by our policy, which is introduced and analyzed in Sections 3.2 and 3.3. Next, to illustrate the power of our simple adaptive algorithm, in Section 3.4 we characterize the ex-post fairness guarantee of the best policy which non-adaptively aims for a particular target fill rate, and we show that our policy performs favorably compared to such a policy. Finally, in Section 3.5 we turn our attention to the notion of ex-ante fairness, and we show that our policy also achieves the best possible ex-ante fairness guarantee.

Figure 1: The upper bound on the ex-post fairness guarantee of any policy, as a function of the supply scarcity and the number of agents.

We begin this section by establishing a fundamental limit on ex-post fairness for any allocation policy when faced with stochastic and sequential demands. The main result of this subsection is the following theorem:

Theorem 1 (Upper Bound on Ex-post Fairness Guarantee). Given a fixed number of agents n ∈ N and supply scarcity µ ∈ R ≥0 , no sequential allocation policy obtains an ex-post fairness guarantee (see Definition 3) greater than κ p (µ, n), defined as

See Figure 1 for an illustration of this upper bound as a function of the supply scarcity µ and the number of agents n. Per Definition 3, the ex-post fairness guarantee is relative to the achievable minimum FR when demands are deterministic, namely W = min{1, 1/µ}. Consequently, this upper bound provides insight into the unavoidable loss in efficiency and equity when demands are a priori uncertain and realize sequentially. In particular, we remark that the achievable fairness guarantee crucially depends on the supply scarcity. In the regime where µ < 1 + 1/n, which we refer to as the under-demanded regime, κ p (µ, n) initially worsens as µ increases before hitting its minimum (for any fixed n) when expected demand equals supply, i.e., at µ = 1. This suggests that the stochastic nature of demand is most harmful when expected demand exactly equals supply. On the other hand, in the over-demanded regime where µ ≥ 1 + 1/n, the achievable fairness guarantee is independent of µ. Given that we are usually in the over-demanded regime in our motivating applications, Theorem 1 ensures that supply scarcity does not contribute to the loss in fairness due to uncertain, correlated, and sequential demand. Fixing µ, the upper bound always decreases with n, implying that achieving fairness can be more challenging for a larger population of agents with stochastic demands, even if the total expected demand of the population remains the same.

Finally, we highlight that the bound is always at least 1/2 regardless of the supply scarcity and the number of agents, and it attains its minimum when µ = 1 and n → +∞.

The proof of Theorem 1 relies on establishing two hard instances with similar structures, one for µ < 1 + 1/n and one for µ ≥ 1 + 1/n. The details of the proof are presented in Appendix A.1.

Here, we present the instance for the over-demanded regime along with a sketch of our analysis. In this instance, there are n possible equally-likely scenarios, i.e., scenario σ happens with probability 1/n for σ ∈ [n]. In scenario σ, the first σ agents have equal demand of 2µ/(n+1) and the rest have no demand. We illustrate this instance in Figure 2 (a). 16 First, note that the total supply scarcity for the above hard instance is µ (as shown in Appendix A.1). Next, consider any sequential policy that faces a non-zero demand from agent i. The policy cannot distinguish among possible scenarios i, i + 1, . . . , n. Consequently, its allocation decision for agent i will be independent of the scenario. In light of this observation, any policy can be sufficiently described by a set of (possibly random) allocations with expected values y = (y 1 , y 2 , . . . , y n ), such that if agent i has non-zero demand, then they receive an expected allocation y i . Given y, the minimum FR for scenario σ is

where the first inequality is due to the expectation of a minimum being less than the minimum over expectations (Jensen's inequality).
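The hard instance for the over-demanded regime is easy to build explicitly. The following sketch (function names are hypothetical, and exact rational arithmetic via `fractions` is an implementation choice) constructs the n scenarios, checks that the total expected demand equals µ, and lists the scenarios that remain possible once agent i is seen to have non-zero demand.

```python
from fractions import Fraction
from typing import List, Tuple

Scenario = Tuple[Fraction, List[Fraction]]  # (probability, demand sequence)


def hard_instance(n: int, mu: Fraction) -> List[Scenario]:
    """Scenario s in 1..n: the first s agents demand 2*mu/(n+1), the rest 0."""
    unit = Fraction(2) * mu / (n + 1)
    return [
        (Fraction(1, n), [unit] * s + [Fraction(0)] * (n - s))
        for s in range(1, n + 1)
    ]


def expected_total_demand(instance: List[Scenario]) -> Fraction:
    """Sum over scenarios of probability times total demand."""
    return sum(p * sum(d) for p, d in instance)


def consistent_scenarios(instance: List[Scenario], i: int) -> List[int]:
    """Scenarios (1-indexed) in which agent i (1-indexed) has non-zero demand."""
    return [s for s, (_, d) in enumerate(instance, start=1) if d[i - 1] > 0]


instance = hard_instance(n=4, mu=Fraction(3, 2))
print(expected_total_demand(instance))
print(consistent_scenarios(instance, 3))
```

With n = 4 and µ = 3/2, the expected total demand evaluates to 3/2, and an agent 3 with positive demand is consistent with scenarios 3 and 4 only, which is why the allocation to agent 3 cannot depend on which of those scenarios is realized.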

In order to establish our upper bound, we set up a factor-revealing linear program as presented in Figure 2 (b). The LP uses variables r (fill rate) and y (allocation), and it maximizes the expected minimum FR subject to three sets of natural constraints that must hold for any sequential policy:

• The minimum FR in scenario σ cannot exceed the FR for agent σ, as shown in eq. (3).

• The minimum FR in scenario σ is at most the minimum FR in scenario σ − 1.

• The total amount of expected allocations cannot exceed the available supply of 1.

In Appendix A.1, we provide an upper bound on the optimal value of this LP by presenting a feasible solution to its dual. To complete the proof of Theorem 1, we must scale by W = min{1, 1/µ} to translate this upper bound on the expected minimum FR into an upper bound on ex-post fairness (see Definition 3).

We finish this subsection with two important remarks regarding (i) the use of the offline solution as a benchmark and (ii) the shortcomings of the optimum online solution.

Remark 1 (Comparison with Offline Solution). In sequential decision making problems, it is common to evaluate the performance of a policy by comparing it to the offline solution, i.e., the optimum solution that observes the entire demand sequence before making any decisions (see, e.g., Mehta 2012 and references therein). However, in our setting, such an offline solution proves to be too powerful, in the sense that it is impossible to achieve a constant-factor guarantee compared to such a solution. In the following proposition (proven in Appendix A.2), we use the same example as shown in Figure 2 (a) to establish this impossibility result.

Proposition 1 (Comparison with Offline Solution). Given a fixed number of agents n ∈ N, there exists a supply scarcity µ ∈ R ≥0 such that no sequential allocation policy can guarantee more than a 1/log(n+1) fraction of the expected minimum fill rate achieved by the offline solution.

In light of the above proposition, we focus on establishing absolute guarantees on the expected minimum fill rate. However, as explained in Section 2 (see Definition 2 and its related discussion), we use the normalization factor W to disentangle the effects of supply scarcity from the effects of sequential decision-making when faced with a stochastic demand sequence.

Remark 2 (Optimum Online Solution). A natural candidate for a policy that may achieve the upper bound on the ex-post fairness guarantee, given by eq. (2), is the optimal online policy, which can be found via a DP. In Appendix B, we formally present the underlying DP. However, we also illustrate that there are significant limitations and drawbacks to a DP approach for maximizing the expected minimum FR in this setting. First, (i) as we show in Appendix B.2, the state space of such a DP is exponentially large for arbitrarily correlated demands, which makes the DP intractable (in particular, the state space is exponential in the number of agents n). Nevertheless, in Appendix B.3 we present an FPTAS for the special case of independent demand. Even beyond computational challenges, (ii) solving the DP requires full distributional knowledge, (iii) such a DP solution does not necessarily perform well for our second objective function, i.e., maximizing the ex-ante minimum FR, and (iv) the DP decisions may lack transparency and interpretability, which are highly desirable properties in our motivating applications. (For an illustration of points (iii) and (iv), see Example 1 in Appendix B.4; for a summary of the drawbacks of the DP, see Table 3.)

Remarkably, in the following subsection, we design a simple adaptive policy that not only achieves the best possible ex-post fairness guarantee of κ p (µ, n), but also offers several corresponding advantages over a DP solution: (i) it can be computed efficiently, (ii) it only requires knowledge of the conditional first moments of agents' demands, and (iii) its decisions can be clearly explained.

Additionally, as shown in Section 3.5, it simultaneously attains the best-possible ex-ante fairness guarantee.

We introduce our policy, referred to as the projected proportional allocation (PPA) policy, through the following simple intuition. Consider a planner that (magically) has access to all the demand realizations d. As already discussed in Section 2, to maximize the minimum FR when the demand realizations are known a priori, the planner should equalize the FR of all agents by allocating

$$x_i = d_i \cdot \min\Big\{1, \frac{1}{\sum_{j \in [n]} d_j}\Big\}$$

to each agent i. If $\sum_{j \in [n]} d_j$ is at most the initial supply (which we normalize to 1), then each agent i obtains a full allocation of $x_i = d_i$ in such a solution. This results in the maximum equal FR of 1. Otherwise, all the agents will have an equal FR of $1/\sum_{j \in [n]} d_j$, which is 1/µ when each demand is equal to its expected value.
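As a small illustration of this full-information solution, the sketch below computes the equalizing allocation for a known demand vector. The function name `offline_equal_fr` and the example demands are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def offline_equal_fr(
    demands: Sequence[float], supply: float = 1.0
) -> Tuple[List[float], float]:
    """Equalize fill rates when all demands are known in advance.

    Returns the allocation vector and the common fill rate
    min{1, supply / total demand}.
    """
    total = sum(demands)
    fr = 1.0 if total <= supply else supply / total
    return [fr * d for d in demands], fr


allocations, fr = offline_equal_fr([0.5, 1.0, 0.5])
print(allocations, fr)
```

For these demands the total is 2, so every agent receives half of its demand; when total demand is at most the supply, the function returns the demands themselves with a fill rate of 1.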

This solution can alternatively be obtained by solving a DP that returns allocations $x^*_n, x^*_{n-1}, \ldots, x^*_1$ maximizing the minimum FR. By a simple induction argument, given the remaining supply $s_i$ at period i, this DP maintains the following invariant at each period i (refer to Appendix C for details):

$$x^*_i = \min\Big\{d_i,\ \frac{d_i}{d_i + \sum_{j \in [i+1:n]} d_j}\, s_i\Big\}.$$

Notably, the above invariant suggests a sequential implementation of the optimal solution at each period i that only uses the knowledge of $d_i$ (i.e., the current demand at period i) and $\sum_{j \in [i+1:n]} d_j$ (i.e., the total future demand from period i + 1 to n). Now consider a setting with incomplete information, namely, with only knowledge of the current sample path of the observed demands up to period i, which we denote by $d_{[1:i]} \triangleq (d_1, d_2, \ldots, d_i)$. Our PPA policy implements a version of the above policy by replacing the exact realization of total future demand with the conditional first moment of this random variable given the current sample path. More precisely:

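• Given the remaining supply $s_i$, the PPA policy allocates an amount

$$x_i = \min\Big\{d_i,\ \frac{d_i}{d_i + \mu_{i+1}}\, s_i\Big\},$$

where $\mu_{i+1} \triangleq \mathbb{E}\big[\sum_{j \in [i+1:n]} d_j \mid d_{[1:i]}\big]$ is the conditional expected future demand.

Note that the conditional expected future demand $\mu_{i+1}$ given all previously-realized demands is a function of the sample path $d_{[1:i]}$; however, for ease of notation, we use $\mu_{i+1}$ without any input arguments.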

We highlight that the PPA policy is simple, computationally efficient, and solely uses first-moment knowledge about the future demands. Consequently, the PPA policy does not need to know the order of future arrivals. Further, because the allocation decisions of the PPA policy depend smoothly on the first moment of future demand, these decisions are robust to small changes in the scale of any marginal distribution. Yet, as we show in Sections 3.3 and 3.5, this simple policy remarkably achieves the best possible guarantee for both notions of fairness (ex-post and ex-ante), even though these two notions are quantitatively different whenever n > 1.
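The following is a minimal sketch of how the PPA rule could be simulated, assuming the joint demand distribution has finite support and is given as an explicit list of sample paths. The representation, the helper names, and the convention that agents with zero demand are skipped when taking the minimum fill rate are all assumptions made for illustration; the policy itself only needs the conditional mean of future demand, which is computed here by brute force.

```python
from typing import List, Sequence, Tuple

Path = Tuple[float, Tuple[float, ...]]  # (probability, demand sequence)


def conditional_future_mean(paths: List[Path], prefix: Sequence[float]) -> float:
    """E[sum of demands after the prefix | observed prefix], finite support.

    Assumes the prefix has positive probability under `paths`.
    """
    i = len(prefix)
    matching = [(p, d) for p, d in paths if tuple(d[:i]) == tuple(prefix)]
    mass = sum(p for p, _ in matching)
    return sum(p * sum(d[i:]) for p, d in matching) / mass


def ppa_allocations(
    paths: List[Path], demands: Sequence[float], supply: float = 1.0
) -> List[float]:
    """Run the proportional rule x_i = d_i * min{1, s_i / (d_i + mu_{i+1})}."""
    s = supply
    xs = []
    for i, d in enumerate(demands):
        mu_next = conditional_future_mean(paths, demands[: i + 1])
        x = 0.0 if d == 0 else d * min(1.0, s / (d + mu_next))
        xs.append(x)
        s -= x
    return xs


def expected_min_fill_rate(paths: List[Path], supply: float = 1.0) -> float:
    """Expected minimum fill rate over agents with positive demand."""
    total = 0.0
    for p, d in paths:
        xs = ppa_allocations(paths, d, supply)
        rates = [x / dj for x, dj in zip(xs, d) if dj > 0]
        total += p * (min(rates) if rates else 1.0)
    return total


# Three equally likely scenarios: the first 1, 2, or 3 agents each demand 1.
scenarios = [
    (1 / 3, (1.0, 0.0, 0.0)),
    (1 / 3, (1.0, 1.0, 0.0)),
    (1 / 3, (1.0, 1.0, 1.0)),
]
print(expected_min_fill_rate(scenarios))
```

When the distribution is a single deterministic path, the conditional mean equals the realized future demand and the rule reproduces the equalizing full-information allocation.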

We now point out an important technical property of the PPA policy that will help us in establishing these key results. 

To conclude this section, we note that one can consider a policy similar to our PPA policy but with a slightly different updating rule that is monotone non-increasing in the fill rate. In particular, suppose the allocation at any time i is given by

is the minimum FR before the arrival of agent i. In words, this alternative policy ensures that in each time period i we do not have a fill rate larger than the current minimum FR. While such an alternative policy achieves weakly larger ex-post fairness than our PPA policy, the two policies provide an identical ex-post fairness guarantee (see Section 3.3). Furthermore, this alternative policy suffers from a significant drawback: by definition, it systematically disfavors late-arriving agents. In fact, the minimum FR under this policy is always attained by the last agent (agent n) as long as that agent's demand is non-zero, which leads to a sub-optimal ex-ante fairness guarantee. In sharp contrast, our PPA policy avoids systematically disfavoring late-arriving agents; consequently, in Section 3.5 we show that our PPA policy achieves the best possible ex-ante fairness guarantee.

In this section, we analyze the ex-post fairness guarantee of our PPA policy. In the following theorem, we show that this simple policy indeed achieves the best possible ex-post fairness guarantee.

Theorem 2 (Ex-post Fairness Guarantee of PPA Policy). Given a fixed number of agents n ∈ N and supply scarcity µ ∈ R ≥0 , the PPA policy achieves an ex-post fairness guarantee (see Definition 3) of at least κ p (µ, n) (defined in eq. (2)).

In order to prove the above theorem, we would have liked to analyze the evolution of the minimum FR, which we denote with $f_i$ at the end of period i − 1, i.e.,

$$f_i \triangleq \min_{j \in [i-1]} \frac{x_j}{d_j}.$$

Instead, we consider the evolution of a closely related stochastic process, which makes the analysis simpler. We define this surrogate stochastic process as follows:

First, we note that β i = min{f i , (n+1)/(nµ)} for all i ∈ [n + 1]. Next, recall that s i denotes the remaining supply after agent i − 1 arrives and receives an allocation. We observe that under the PPA policy, s i evolves according to

With the above observations, the main step of the proof is carefully analyzing the evolution of (β k , s k ) under the PPA policy, which enables us to lower bound the final expected minimum FR in the following lemma.

Lemma 1 (Lower Bound on Expected Minimum FR). Under the PPA policy, for all i ∈ [n + 1] and any subsequence of demand realizations d [1:i−1] ,

where β i is defined in eq. (5). 18

Since the objective of our dynamic decision-making problem has no per-stage rewards and consists only of a terminal reward (i.e., the minimum FR), Lemma 1 can be thought of as establishing a lower bound on the value-to-go function of the PPA policy. Before providing the proof for this key lemma, we lay out the two remaining steps that finish the proof of Theorem 2: (i) plugging i = 1 into inequality (7) to obtain a lower bound on E d∼ F [f n+1 ], and (ii) scaling the obtained lower bound result by our normalization factor, namely W = min{1, 1/µ}, which provides an ex-post fairness guarantee (see Definition 3).

Proof of Lemma 1: We will show that inequality (7) holds via backwards induction. The base case of i = n + 1 is trivial as it follows from the observation we made earlier: β i = min{f i , (n+1)/(nµ)}. Now let us consider i = k < n + 1. Instead of proving inequality (7), we prove a stronger result:

Establishing inequality (8) means that the inequality in (7) holds for any realization of agent k's demand. Consequently, it will hold when we take an expectation over agent k's demand. In order to prove inequality (8), we consider two different cases that can arise depending on the remaining supply s k , agent k's demand d k , and the future expected demand µ k+1 . In the following, we introduce and analyze these cases separately.

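(i) Sufficient supply ($s_k \ge \beta_k (d_k + \mu_{k+1})$): Recall that according to the PPA policy, $x_k = \min\big\{d_k, \frac{d_k}{d_k + \mu_{k+1}} s_k\big\}$. Therefore, in this case, either $x_k = d_k$, i.e., the PPA policy meets the entire demand, or

$$\frac{x_k}{d_k} = \frac{s_k}{d_k + \mu_{k+1}} \ge \beta_k,$$

i.e., the PPA policy attains an FR of at least $\beta_k$.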

According to the dynamics specified in (5) and (6), this implies

Using our inductive hypothesis when i = k + 1,

The lower bound given by RHS (1) is a linear function of d k + µ k+1 , as illustrated by the dotted red lines in all panels of Figure 3 (in the regime where d k + µ k+1 ∈ [0, s k /β k ]). This linear function has a non-positive slope and an intercept of β k . We can further lower bound this function for any d k + µ k+1 ∈ [0, s k /β k ] by another linear function with the same intercept of β k and a smaller (more negative) slope. In particular, since (n−k)/(n+1−k) ≤ (n+1−k)/(n+2−k), we have:

which proves inequality (8) in the sufficient supply case (see the blue lines in all panels of Figure 3 ).

(ii) Insufficient supply ($s_k < \beta_k (d_k + \mu_{k+1})$): In this case, the allocation of the PPA policy is

$$x_k = \frac{d_k}{d_k + \mu_{k+1}}\, s_k.$$

According to the dynamics specified in (5) and (6), this implies

Using our inductive hypothesis when i = k + 1,

The lower bound given by RHS (2) is a convex homographic function of d k + µ k+1 , as illustrated by the dashed red lines in all panels of Figure 3 (in the regime where d k + µ k+1 ∈ [s k /β k , +∞)). To further lower bound this function by a linear function, note that for any variable z the following inequality holds:

The proof of the above inequality is purely algebraic and we omit it for brevity. Substituting z = d k + µ k+1 in this inequality, we have:

which proves inequality (8) in the insufficient supply case (again, see the blue lines in all panels of Figure 3 ).

Combining the above cases proves inequality (8) everywhere, which immediately implies the inductive hypothesis, i.e., inequality (7), for i = k, thus finishing the proof of the lemma.

As discussed in the previous sections, our PPA policy is adaptive, that is, the FR for agent i (and its corresponding allocation decision) can depend not only on the observed demand d i but also on the exact sample path up to time i as well as the remaining supply s i . In contrast to an adaptive policy, a non-adaptive policy commits to a sequence of feasible allocation maps 

Figure 3: Lower bounds on the expected minimum FR given by eq. (8) (blue solid lines), eq. (9) (red dotted lines), and eq. (11) (red dashed lines) when n = 4 for k ∈ [4].

For settings that we consider, adaptivity can indeed help with improving the expected minimum FR of a policy. As an example, compare running our PPA policy versus the best non-adaptive policy on an instance with three agents. In this instance, the demands $d = (d_1, d_2, d_3)$ follow one of the two possible sample paths $(\epsilon_1, 1, 1)$ or $(\epsilon_2, 1, 0)$ with equal probabilities 1/2, where $\epsilon_1, \epsilon_2 \ge 0$ and $\epsilon_1 \neq \epsilon_2$. After agent 1's demand is realized, the PPA policy knows exactly which sample path is happening. By calculating the exact total demand of agents 2 and 3, it obtains the optimal expected minimum FR of (1/2) × 1 + (1/2) × (1/2) = 3/4 for small $\epsilon_1, \epsilon_2$. However, a non-adaptive policy cannot distinguish between the two possible sample paths after agent 1's demand is realized. Therefore, without loss of generality, it targets a FR of τ for agent 2 and obtains an expected minimum FR of (1/2)τ + (1/2) min{τ, 1 − τ} for small $\epsilon_1, \epsilon_2$, which attains its maximum equal to 1/2 at any τ ∈ [1/2, 1]. In applications that allow for adaptivity, our PPA policy obtains the optimal ex-post fairness guarantee while also having the desirable properties of transparency and interpretability. However, adaptivity is not admissible in some practical scenarios, e.g., when the social planner should commit to an allocation plan in advance for even more transparency or due to legal restrictions. 20
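The comparison in this three-agent example can be checked numerically. The sketch below evaluates the limiting non-adaptive objective from the text on a grid of targets, and compares it with the value obtained when fill rates are equalized at the inverse of the realized total demand on each path; that second expression is a hand derivation stated here as an assumption, and the specific values of `eps1` and `eps2` are arbitrary small numbers.

```python
def non_adaptive_value(tau: float) -> float:
    """Limiting expected minimum FR when agent 2 is targeted at fill rate tau."""
    return 0.5 * tau + 0.5 * min(tau, 1.0 - tau)


# Grid search over the target fill rate in [0, 1].
grid = [k / 1000 for k in range(1001)]
best_non_adaptive = max(non_adaptive_value(t) for t in grid)

# Adaptive value: once agent 1 reveals the sample path, future demand is known,
# so fill rates are equalized at 1 / (total demand) on each path (assumption:
# this is derived by hand from the proportional rule, not quoted from the text).
eps1, eps2 = 1e-6, 2e-6
adaptive_value = 0.5 / (1.0 + eps2) + 0.5 / (2.0 + eps1)

print(best_non_adaptive, adaptive_value)
```

The grid search returns a value of about 1/2 for the non-adaptive policy, while the adaptive value is within a small multiple of the chosen epsilons of 3/4.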

Motivated by such scenarios, we study two simple and natural canonical classes of non-adaptive policies: those that fix the sequence of allocation decisions a priori, namely they specify one allocation vector x, and "smarter" policies which fix the sequence of fill rates $\tau \triangleq (\tau_1, \tau_2, \ldots, \tau_n)$ a priori.

In Appendix D, we show that the ex-post fairness guarantee for the former subclass is vanishing as n gets large. Therefore, we focus on the latter subclass, which is formally defined as follows.

Definition 4 (Target-fill-rate Policies). A target-fill-rate (TFR) policy is any policy π which pre-determines a target fill rate τ ∈ [0, 1]. Then, for every arriving agent i, the policy π must either allocate sufficient supply to meet the target or allocate all remaining supply, i.e.,

$$x_i = \min\{\tau d_i,\ s_i\}.$$

In the following theorem, we provide a tight bound on the ex-post fairness guarantee (Definition 3) achievable by the optimal TFR policy, defined as the one that maximizes ex-post fairness for the given joint demand distribution. We remark that setting one threshold is without loss of generality because the ex-post fairness guarantee of a policy which pre-determines a sequence of target fill rates $\{\tau_i\}_{i \in [n]}$ is upper bounded by that of a TFR policy with the same target fill rate $\tau = \min_{i \in [n]} \{\tau_i\}$ for all agents. We also highlight that in addition to achieving a lower ex-post fairness guarantee compared to our adaptive policy, finding the best TFR policy requires full knowledge of the total demand distribution, in contrast to our PPA policy which only requires knowing the first conditional moments of the future total demand at each time.
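A TFR policy as described in Definition 4 can be sketched in a few lines. The function names and the example demands below are hypothetical, and skipping zero-demand agents when taking the minimum fill rate is an assumption made for illustration.

```python
from typing import List, Sequence


def tfr_allocations(
    demands: Sequence[float], tau: float, supply: float = 1.0
) -> List[float]:
    """Allocate the target fraction tau of each demand, or all remaining supply."""
    s = supply
    xs = []
    for d in demands:
        x = min(tau * d, s)
        xs.append(x)
        s -= x
    return xs


def min_fill_rate(demands: Sequence[float], allocations: Sequence[float]) -> float:
    """Minimum fill rate over agents with positive demand."""
    rates = [x / d for x, d in zip(allocations, demands) if d > 0]
    return min(rates) if rates else 1.0


demands = (0.5, 1.0, 0.5)
for tau in (0.5, 0.8):
    xs = tfr_allocations(demands, tau)
    print(tau, xs, min_fill_rate(demands, xs))
```

With total demand 2 and unit supply, the target 0.5 is met for every agent, whereas the more ambitious target 0.8 exhausts the supply before the last agent arrives and drives the minimum fill rate to zero. This is the trade-off that choosing τ must balance.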

Theorem 3 (Ex-post Fairness Guarantee of Optimal TFR Policy). Given any number of agents n ∈ N \ {1} and supply scarcity µ ∈ R ≥0 , the optimal TFR policy achieves an ex-post fairness guarantee (see Definition 3) of max{1,µ}

In Figure 4 (a), we compare the guarantee of the optimal TFR policy against our PPA policy for different model primitives, µ and n. First, we note that when n is not too large, our PPA policy achieves a considerably higher guarantee. Next, we highlight that the ex-post fairness guarantee for the optimal TFR policy does not depend on the number of agents n. 21 This is in contrast to the ex-post guarantee for the PPA policy κ p (µ, n), which worsens as the number of agents increases.

Furthermore, the guarantee in Theorem 3 has a unique minimum of 1/(1+√2) ≈ 0.41 when µ = 1. This once again suggests that the stochastic nature of demand is most harmful when expected demand exactly equals supply. Before presenting the proof of Theorem 3 (which we defer to Section 3.4.2), we first discuss a special case where TFR policies can provide a substantially stronger fairness guarantee.

20 As discussed in the introduction, the initial strategy for allocating medical supplies at the beginning of COVID-19 pandemic had the form of a target-fill-rate policy, which is a canonical non-adaptive strategy as we will discuss soon.

21 We elaborate on the intuition behind this behavior when we present the hard instance for establishing the upper bound of Theorem 3.

We provide the proof of Proposition 2 in Appendix E.2. We do not have a closed-form expression for this improved lower bound, but the single-variable concave maximization problem in (13) is easy to solve numerically. We note that the lower bound on the guarantee of the optimal TFR policy approaches 1 as the CV approaches 0, regardless of the supply scarcity µ.

Even though the lower bound in Proposition 2 is not necessarily tight, as long as the CV is sufficiently small, this lower bound improves upon the bound in Theorem 3 (which holds for an arbitrary coefficient of variation). 22 Furthermore, this bound establishes that in settings where the CV is below a threshold, the optimal TFR policy can provide an improved guarantee relative to the PPA policy.

In Figure 4 (b), we illustrate the lower bound on the optimal TFR policy when the supply scarcity Thus, in cases where total demand is known to be well-concentrated, such as in instances with a large number of i.i.d. demands, TFR policies can perform quite well. In contrast, the PPA policy is particularly valuable when demand is correlated and highly variable. We emphasize that such demand sequences may arise in our motivating applications, such as when responding to a pandemic or natural disaster. In our case study presented in Section 4, we present a simple example for the COVID-19 pandemic based on commonly-used epidemiology models which exhibits correlation across demands and a high coefficient of variation for total demand.

We now provide a proof of Theorem 3, with some details deferred until Appendix E.1. We begin by placing a lower bound on the performance of the optimal TFR policy, and we then demonstrate the existence of a matching upper bound.

Proof of lower bound: For any target fill rate τ, a TFR policy will achieve that fill rate if $\tau \sum_{i \in [n]} d_i \le 1$. Let us define G as the cumulative distribution function (CDF) of the random variable $v \triangleq 1/\sum_{i \in [n]} d_i$, i.e., the inverse total demand, which satisfies $\mathbb{E}_{v \sim G}[1/v] = \mu$.

For ease of notation, we use ∆(R ≥0 ; µ) to denote the domain of all such CDFs. Given a CDF G, a TFR policy with target fill rate τ achieves an expected minimum fill rate of at least τ(1 − G(τ)), which implies that the optimal TFR policy attains an expected minimum fill rate of at least max τ∈[0,1] τ(1 − G(τ)). In the following lemma, we establish a lower bound on max τ∈[0,1] τ(1 − G(τ)), which enables us to lower-bound the ex-post fairness guarantee that the optimal TFR policy achieves.

22 The value for c at which the two bounds cross depends on the supply scarcity µ. Numerically, we find that if c ≤ 0.3, the lower bound in Proposition 2 provides a better guarantee for any µ.
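For a total-demand distribution with finite support, the quantity maximized above can be computed exactly. The sketch below is an illustration under assumptions: the support, the function names, and the use of P(τ · D ≤ 1) in place of 1 − G(τ) (to handle atoms) are choices made here. It relies on the observation that, between consecutive breakpoints 1/D, the set of demand levels that can be served is fixed and the objective increases linearly in τ, so only the breakpoints (capped at 1) need to be checked.

```python
from fractions import Fraction
from typing import List, Tuple

Support = List[Tuple[Fraction, Fraction]]  # (probability, total demand D)


def tfr_bound(support: Support, tau: Fraction) -> Fraction:
    """tau times the probability that the target can be met: tau * P(tau * D <= 1)."""
    return tau * sum(p for p, total in support if tau * total <= 1)


def best_target(support: Support) -> Tuple[Fraction, Fraction]:
    """Maximize the bound over the breakpoints min{1, 1/D} of the support."""
    candidates = {min(Fraction(1), 1 / total) for _, total in support if total > 0}
    candidates.add(Fraction(1))
    best_tau = max(candidates, key=lambda t: tfr_bound(support, t))
    return best_tau, tfr_bound(support, best_tau)


# Hypothetical total-demand distribution: D is 1/2, 2, or 4.
support = [
    (Fraction(1, 4), Fraction(1, 2)),
    (Fraction(1, 2), Fraction(2)),
    (Fraction(1, 4), Fraction(4)),
]
print(best_target(support))
```

On this support the best target is 1/2, which can be met with probability 3/4 and yields a bound of 3/8; targeting 1 or 1/4 each yields only 1/4.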

Lemma 2 (Tight Lower Bound for Optimal TFR Policy). Given a fixed number of agents n ∈ N and supply scarcity µ ∈ R ≥0 , the following holds:

This infimum is attained by the following CDF

.

Before presenting the proof of the above lemma in Section 3.4.3, which is the key step of the proof of Theorem 3, we establish a matching upper bound and complete the proof of the theorem.

Proof of upper bound: We show a matching upper bound by considering a two-agent instance. In this instance, only the first agent has stochastic demand. In particular, $d_1 = (1 - \epsilon)/v$ for $v \sim \hat{G}$ (defined in eq. (15)) and $d_2 = \epsilon \mu$ deterministically, for a small $\epsilon > 0$. Note that $\mathbb{E}[d_1 + d_2] = \mu$. For any target fill rate $\tau'$, let $\tau \triangleq (1 - \epsilon)\tau'$; supply will be exhausted before the arrival of agent 2 with probability $\hat{G}(\tau)$, in which case the minimum FR will be 0. Therefore, the expected minimum fill rate of the optimal TFR policy in this instance is at most

where the equality follows from Lemma 2.

By allowing $\epsilon \to 0$, we conclude that there exists an instance where the expected minimum fill rate of the optimal TFR policy is

, which matches the lower bound from above. We remark that the construction of the above two-agent example clarifies why our upper bound does not depend on the number of agents: we can modify the example to an n-agent one where the first n − 1 agents have correlated demands whose total equals $(1 - \epsilon)/v$ for $v \sim \hat{G}$ and the last agent has a deterministic demand of $\epsilon \mu$.

With the above (matching) bounds, we complete the proof of Theorem 3 by scaling this tight bound by our normalization factor, namely W = min{1, 1/µ}, to arrive at the guarantee stated in Theorem 3.

Having laid out the proof steps of Theorem 3, we now provide a constructive proof of the key lemma, i.e., Lemma 2.

We do so by identifying properties of the worst-case distribution against the optimal TFR policy, which enables us to exactly characterize that distribution. To aid in this proof, we introduce a one-to-one mapping of each target fill rate τ into the quantile space, such that quantile q corresponds to TFR τ if and only if there is sufficient supply to meet a fraction τ of demand with probability exactly q. We start by describing notation for this transformation, along with some basic properties, in the following definition. For simplicity of exposition, we assume all the distributions playing the role of G are non-atomic. 24

Definition 5 (TFR in Quantile Space). Given a (non-atomic) CDF G : R ≥0 → [0, 1] and inverse total demand v ∼ G, we define the following mappings.

• TFR-to-quantile map $Q_G$: The quantile corresponding to TFR $\tau \in [0, 1]$ is $Q_G(\tau) \triangleq 1 - G(\tau)$. In words, the probability of being able to meet a fraction $\tau$ of total demand is $Q_G(\tau)$. This map is monotone non-increasing.

• Quantile-to-TFR map $T_G$: The TFR corresponding to quantile $q \in [0, 1]$ is $T_G(q) \triangleq G^{-1}(1-q)$. In words, $T_G(q)$ is the TFR for which the probability of being able to meet a fraction $T_G(q)$ of total demand is $q$. This map is monotone non-increasing and is the inverse of the TFR-to-quantile map, i.e., $T_G = (Q_G)^{-1}$.

• The expected achievable fill rate (EAFR) curve $R_G$: For $q \in [0, 1]$, $R_G(q) \triangleq q \cdot G^{-1}(1-q)$ is the EAFR when the probability of meeting demand (given the TFR) is exactly equal to $q$, i.e., the EAFR obtained by targeting a fill rate $T_G(q)$.
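To make the quantile-space maps of Definition 5 concrete, the following minimal sketch evaluates them for one specific non-atomic distribution. The choice $v \sim \mathrm{Uniform}[0.5, 1.5]$ and all function names are assumptions made purely for illustration; they do not come from the paper.

```python
def make_uniform(a: float, b: float):
    """CDF and inverse CDF of v ~ Uniform[a, b] (an illustrative choice only)."""
    def cdf(t: float) -> float:
        return min(1.0, max(0.0, (t - a) / (b - a)))

    def inv_cdf(p: float) -> float:
        return a + p * (b - a)

    return cdf, inv_cdf


def tfr_to_quantile(cdf, tau: float) -> float:
    """Q_G(tau) = 1 - G(tau): probability that a fraction tau of demand can be met."""
    return 1.0 - cdf(tau)


def quantile_to_tfr(inv_cdf, q: float) -> float:
    """T_G(q) = G^{-1}(1 - q): the target fill rate that is met with probability q."""
    return inv_cdf(1.0 - q)


def eafr(inv_cdf, q: float) -> float:
    """R_G(q) = q * G^{-1}(1 - q): expected achievable fill rate at quantile q."""
    return q * inv_cdf(1.0 - q)


cdf, inv_cdf = make_uniform(0.5, 1.5)
tau = 0.8                                  # a target fill rate inside the support
q = tfr_to_quantile(cdf, tau)              # quantile associated with tau
print(q, quantile_to_tfr(inv_cdf, q), eafr(inv_cdf, q))
```

For a target fill rate inside the support of $v$, mapping to the quantile space and back recovers the same target, and the EAFR at that quantile coincides with $\tau(1 - G(\tau))$.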

Remark 4. In light of the above transformation, we remark that there is a reduction from our setup to a single-parameter Bayesian mechanism design problem in which a monopolistic seller has an item to sell to a single buyer with private valuation v ∼ G, where G is the common prior valuation distribution. See Alaei et al. (2019) for an example of such a setting; also refer to Hartline (2013) for more details on monopoly pricing. In this reduction, target fill rates correspond to prices and the EAFR corresponds to the expected revenue in monopoly pricing (accordingly, the EAFR curve also corresponds to the revenue curve). The problem in this parallel monopoly pricing setting is identifying the worst-case distribution G satisfying E v∼G [1/v] = µ, so that we minimize the maximum revenue obtained from selling the item at prices constrained to be in the interval [0, 1].

According to Definition 5, τ (1 − G(τ )) is equivalent to R G (Q G (τ )) for any TFR τ ∈ [0, 1]. Based on this insight, to prove Lemma 2, it is sufficient to show that

Consider all cumulative distribution functions G ∈ ∆ (R ≥0 ; µ). We first identify two additional constraints on G that do not change the infimum in eq. (16). These constraints enable us to find the worst-case distribution that achieves the infimum value which establishes the desired result.

Before proceeding, we develop intuition using an illustrative example of the EAFR curve shown in Figure 5 (a). In general, if one draws $R_G(q)$ as a function of $q \in [0, 1]$ (i.e., in the quantile space), then the slope of the line connecting the point $(0, 0)$ to $(q, R_G(q))$ is equal to $T_G(q) = R_G(q)/q$. This slope is monotone non-increasing in $q$ for any CDF $G$ according to Definition 5. Hence, given the EAFR curve $R_G(q)$, the support of the feasible fill rates is equal to $[L, \min\{1, H\}]$, where $L \triangleq R_G(1)$ and $H \triangleq \liminf_{q \to 0} R_G(q)/q$. The two constraints that we will add below, as stated in Claims 1 and 2, imply that the outer optimization problem in eq. (16) will remain unchanged if we require the EAFR curve to be (i) flat over quantiles corresponding to target fill rates in $[L, 1]$, i.e., quantiles in the interval $[Q_G(1), 1]$, and (ii) a straight line with slope 1 for quantiles in the interval $[0, Q_G(1)]$.

With these two additional constraints, in Claim 3 we find the worst-case CDF, which has an EAFR curve as shown in Figure 5 (b).

Claim 1 (Equal EAFR). Adding the constraint $R_G(q) = R_G(q')$ for all $q, q' \in [Q_G(1), 1]$ to the outer optimization in eq. (16) does not change its infimum value.

We prove Claim 1 by contradiction: we show that for any CDF $G \in \Delta(\mathbb{R}_{\ge 0}; \mu)$, if the above condition does not hold, we can slightly modify $G$ to design a new distribution $\tilde{G} \in \Delta(\mathbb{R}_{\ge 0}; \mu)$ which has an EAFR curve with a lower maximum value. The details are presented in Appendix E.1.1. The above claim readily implies that we can focus on distributions for which the EAFR curve is flat in the interval $[Q_G(1), 1]$.

Next, we claim that we can restrict our attention to distributions where there is no probability mass for v ∈ (1, +∞). Said differently, the support of inverse demand is (0, 1] ∪ {+∞}.

Claim 2 (Restricted Support for Inverse Demand). Adding the constraint G(v) = G (1) for all v ∈ [1, +∞) and lim v→+∞ G(v) = 1 to the outer optimization in eq. (16) does not change its infimum value.

We also prove Claim 2 by contradiction: we show that for any CDF $G \in \Delta(\mathbb{R}_{\ge 0}; \mu)$, if there is probability mass on $v \in (1, +\infty)$, we can construct a CDF $\tilde{G} \in \Delta(\mathbb{R}_{\ge 0}; \mu)$ which has an EAFR curve with a lower maximum value by shifting that mass to $+\infty$. The details are presented in Appendix E.1.2. Again, note that this claim implies that we can focus on distributions for which the EAFR curve starts with a straight line up to quantile $Q_G(1)$.

Given the two claims above, the distribution that attains the infimum in eq. (16) must satisfy the two constraints introduced. Figure 5 (b) summarizes the effect of these two restrictions on R G (q).

Claim 3 (Worst-case CDF). For any $\mu \in \mathbb{R}_{\ge 0}$, the distribution $\hat{G}$ given in eq. (15) is the unique distribution in $\Delta(\mathbb{R}_{\ge 0}; \mu)$ satisfying the constraints introduced in Claims 1 and 2. Therefore, this distribution attains the infimum in eq. (16).

We prove Claim 3 in Appendix E.1.3. Since the EAFR curve RĜ(q) has a maximum value of

, we have shown that the optimal TFR policy always achieves an EAFR of at leastq.

This completes the proof of Lemma 2.

In this section, we study our second notion of fairness, namely, ex-ante fairness. As we did for ex-post fairness, we first establish an upper bound on the ex-ante fairness guarantee achievable by any policy. More importantly, we then show that our PPA policy achieves this worst-case ex-ante fairness bound. The following theorem establishes our matching upper and lower bounds on the ex-ante fairness guarantee.

Theorem 4 (Ex-ante Fairness Guarantee of PPA Achieves Upper Bound). Given a fixed number of agents n ∈ N and supply scarcity µ ∈ R ≥0 , no sequential allocation policy obtains an ex-ante fairness guarantee (see Definition 3) greater than κ a (µ, n), defined as

Further, the PPA policy achieves an ex-ante fairness guarantee of at least κ a (µ, n).

Like its counterpart for ex-post fairness, κ a (µ, n) depends on the supply scarcity, µ, and is at its lowest when expected demand equals supply, which highlights the loss due to stochasticity when trying to achieve efficiency and equity ex ante. However, unlike the bound for ex-post fairness, the ex-ante fairness bound is independent of the number of agents. In fact, this bound is identical to the ex-post fairness bound in the single-agent case, i.e. κ a (µ, n) = κ p (µ, 1) (which is shown by the dotted line in Figure 4(a) ).

For intuition about this relationship, note that one feasible policy is to allocate supply to each agent proportional to their expected demand. Since the ex-ante problem only depends on marginal FRs, this reduces ex-ante fairness to the minimum ex-ante fairness across n single-agent instances (where in each instance, the supply scarcity is µ). In a single-agent instance, ex-ante fairness is equal to ex-post fairness, which implies that any lower bound on single-agent ex-post fairness also serves as a lower bound on ex-ante fairness with n agents. Furthermore, since demands can be perfectly correlated, any single-agent instance can be expressed as an instance with n agents for any n ∈ N. This implies that any upper bound on single-agent ex-post fairness also serves as an upper bound on ex-ante fairness with n agents.

To prove the upper bound in Theorem 4, we build on the hard instances from the proof of Theorem 1. To prove the lower bound, we use ideas similar to the proof of Theorem 2. We show that when following the PPA policy, the expected FR for each agent is a decreasing and convex function of the ratio of expected remaining demand to remaining supply upon their arrival. We inductively place an upper bound on the ex-ante expected value of that ratio for each agent, which enables us to provide a lower bound on ex-ante fairness. See Appendix F for a detailed proof.

We complement our theoretical developments with an illustrative case study motivated by the allocative challenges of rationing medical supplies in the midst of a pandemic. First, we describe a simple compartmental model that governs the need for medical supplies experienced by different locations, which is based on models used to forecast the COVID-19 pandemic. We illustrate that in such a setting, the demand is sequential, highly variable, and has complex correlation structure due to network effects. Then, we study this dynamic rationing problem within our framework and illustrate the effectiveness of our PPA policy by comparing it to (i) its theoretical guarantee (presented in Theorem 2), (ii) the optimal TFR policy (defined in Section 3.4), and (iii) a DP approach (similar to the one described in Appendix B). We conclude this section by demonstrating the efficiency of our policy as well as its robustness to mis-specification of model parameters.

We model the spread of a pandemic across inter-connected locations using a standard SEIR (susceptible, exposed, infectious, recovered) model, with different compartments for different locations.

Such compartmental models are commonly used and frequently influence practice (see, e.g., Morozova et al. 2021). Each location represents an agent in our allocation framework, and we use the peak number of infected individuals in a location as a proxy for that agent's need.

In the following, we briefly overview the SEIR model that we borrow from the epidemiology literature (Anderson and May 1992, Diekmann and Heesterbeek 2000). In this model, there are $L$ locations where location $i$ has population $p_i$. Individuals interact with each other according to a time-varying rate $\gamma_t$ (which we will specify later). Individuals in location $i$ predominantly interact with members of their own location, but a small fraction $\alpha_i$ of their interactions occur with a member of an adjacent location $j$, where $j$ is equally likely to be any of location $i$'s neighbors (denoted by the set $N_i$). When a susceptible individual interacts with an infectious individual, the susceptible individual becomes exposed. Exposed individuals become infectious after a random time drawn from an exponential distribution with mean $1/\delta$, and infectious individuals become recovered after a random time drawn from an exponential distribution with mean $1/\lambda$. For large networks, this system approaches the deterministic dynamics presented in (18), with $S_i(t)$, $E_i(t)$, $I_i(t)$, $R_i(t)$ representing the fraction of the population that is susceptible, exposed, infected, and recovered (respectively) in location $i$ at time $t$. We will use these dynamics to simulate pandemic trajectories.

We calibrate the parameters of our numerical studies in accordance with epidemiological estimates specific to the COVID-19 pandemic (Aleta et al. 2020, Park et al. 2020, Walsh et al. 2020). In particular, the state transition rates $\delta$ and $\lambda$ are set to be deterministic, while the initial interaction rate $\gamma_0$ is drawn from a (truncated) Normal distribution. This distribution is parameterized such that the range for the basic reproductive number in our model ($R_0 = \gamma_0/\lambda$) reflects the wide range of estimates that appeared in the early stages of the pandemic. The interaction rate $\gamma_t$ then varies over time, according to the dynamics

where each $X_\tau$ is an independently drawn Normal random variable with mean $\xi_r$ and standard deviation $\sigma_r$. The random-walk term captures the impact of factors such as seasonality, disease mutation, government interventions, and changes in individual behavior. The mean and standard deviation of the random walk that governs the time-varying interaction rate are drawn from uniform distributions which allow for a wide range of plausible trajectories. Finally, we fix the population size of all locations to be 1000. We summarize the instance parameters and their associated distributions in Table 1.

Our simulation study focuses on a simple four-location setting with equal populations where adjacency is given by a line graph. 26 (Location 1 is adjacent only to Location 2, Location 2 is adjacent to Locations 1 and 3, etc.) Initially, 0.01% of the population of Location 1 is assumed to be exposed, and all other individuals are susceptible.

In each simulation, we first independently draw instance parameters according to distributions in Table 1, and we then draw realizations for the random walk governing the time-varying interaction rate. Using those parameters, we then compute the demand trajectory based on the dynamics of the SEIR model described by the system of equations in (18). As mentioned before, we use the peak number of infected individuals in each location as a proxy for its need, i.e., $d_i = \max_t I_i(t)$.
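The simulation pipeline described above can be sketched roughly as follows. Since the system of equations (18) is not reproduced in this section, the specific dynamics below (a standard multi-location SEIR model in which a fraction `alpha` of interactions is spread evenly over neighboring locations) are an assumption, as are the forward-Euler integration, the function name `simulate_seir`, and all parameter values.

```python
from typing import Callable, List


def simulate_seir(gamma: Callable[[float], float], delta: float, lam: float,
                  alpha: float, neighbors: List[List[int]], days: float,
                  dt: float = 0.1, initial_exposed: float = 1e-4) -> List[float]:
    """Forward-Euler simulation; returns the peak infectious fraction per location.

    Assumed dynamics (not taken from the paper's eq. (18)):
      force_i = gamma(t) * ((1 - alpha) * I_i + alpha * mean of I_j over neighbors j)
      dS_i = -force_i * S_i,  dE_i = force_i * S_i - delta * E_i,
      dI_i = delta * E_i - lam * I_i,  dR_i = lam * I_i.
    """
    n = len(neighbors)
    S, E, I, R = [1.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
    S[0] -= initial_exposed          # seed the outbreak in the first location
    E[0] = initial_exposed
    peak = [0.0] * n
    for step in range(int(days / dt)):
        g = gamma(step * dt)
        # Compute every location's force of infection from the current state.
        force = []
        for i in range(n):
            nb = sum(I[j] for j in neighbors[i]) / len(neighbors[i])
            force.append(g * ((1.0 - alpha) * I[i] + alpha * nb))
        for i in range(n):
            new_exposed = force[i] * S[i] * dt
            new_infectious = delta * E[i] * dt
            new_recovered = lam * I[i] * dt
            S[i] -= new_exposed
            E[i] += new_exposed - new_infectious
            I[i] += new_infectious - new_recovered
            R[i] += new_recovered
            peak[i] = max(peak[i], I[i])
    return peak


# Four locations on a line graph; hypothetical parameter values for illustration.
line_graph = [[1], [0, 2], [1, 3], [2]]
peaks = simulate_seir(lambda t: 0.5, delta=0.2, lam=0.1, alpha=0.05,
                      neighbors=line_graph, days=365)
demands = [1000 * p for p in peaks]  # peak infected count as the demand proxy d_i
print(demands)
```

A time-varying interaction rate can be supplied through the `gamma` callable, for example by interpolating a pre-drawn random-walk path.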

Having described our simulation setting, we next illustrate that even in this simple setting, the demand is highly variable and there is a complex correlation structure across locations. In the left panel of Figure 6, we present a histogram of total demand across 1,000 simulations of the model. We highlight that the standard deviation is large relative to expected demand: the coefficient of variation in the simulations is 0.662. Furthermore, in the right three panels of Figure 6, we present a scatter plot which highlights the strong correlation between locations' demands that results from our SEIR model. We remark that the complex correlation structure goes beyond adjacency-based network effects and further depends on the stochastic and time-varying nature of the interaction rate. Due to the linear nature of adjacency in this setting, the sequence of peak demands across the 1,000 simulations is consistent (first, location 1 reaches its peak demand, then location 2, etc.). 27 The average temporal gap between peak demands is around three weeks in our simulations.

Figure 6. Left: Histogram of total demand. Middle and Right: Scatter plot of $(d_1, d_i)$ for $i \in \{2, 3, 4\}$.

26 Focusing on this simple setting is partly necessitated by the challenges of computing the DP solutions for larger problems.

For each of the 1,000 simulations of the setting described above, we numerically evaluate the performance of three policies: (i) our PPA policy, (ii) the optimal TFR policy, and (iii) a DP approach. We assume that initial supply is equal to the average demand across the simulations, which corresponds to a supply scarcity of µ = 1.

In order to compute the allocation decision of our PPA policy, we need to have access to conditional first moments of individual demand. Similarly, to compute the optimal TFR policy and the DP, we need access to the distribution of total demand and the full joint distribution, respectively. Given that the joint distribution $F$ does not have an explicit representation, we approximate these quantities by their empirical averages. In particular, in order to estimate $\mathbb{E}_{d \sim F}\big[\sum_{j \in [i+1:n]} d_j \mid d_{[1:i]}\big]$, we first generate 1,000 sample paths for the setting described above. Then, for a given $d_{[1:i]}$, we use a k-nearest neighbors approach (where k = 10) to estimate conditional first moments of future demands. 28 To compute the optimal TFR policy, we approximate $\sum_{i \in [n]} d_i$ by its empirical average, and we then numerically find the target threshold that performs best, which in this case is τ = 1.
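One way to realize the k-nearest-neighbors estimate of the conditional first moment is sketched below. The Euclidean distance on demand prefixes, the function name, and the toy sample paths are assumptions for illustration; the text above only specifies that a k-nearest neighbors approach with k = 10 is used.

```python
import math
from typing import List, Sequence


def knn_conditional_future_demand(samples: List[Sequence[float]],
                                  prefix: Sequence[float], k: int = 10) -> float:
    """Estimate E[sum of future demands | observed prefix] from sample paths.

    Finds the k sample paths whose first len(prefix) demands are closest
    (in Euclidean distance, an assumed metric) to the observed prefix and
    averages their total demand over the remaining agents.
    """
    i = len(prefix)
    scored = []
    for path in samples:
        distance = math.dist(path[:i], prefix)
        future_total = sum(path[i:])
        scored.append((distance, future_total))
    scored.sort(key=lambda pair: pair[0])   # stable sort keeps ties in sample order
    nearest = scored[:k]
    return sum(total for _, total in nearest) / len(nearest)


# Toy sample paths (hypothetical numbers) for three agents.
paths = [[1.0, 1.0, 1.0], [1.0, 2.0, 3.0], [10.0, 10.0, 10.0]]
print(knn_conditional_future_demand(paths, prefix=[1.0], k=2))
```

Only the conditional mean of remaining demand is estimated here, which is the sole distributional input the PPA policy is said to require.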

Since computing the DP requires the complete joint distribution of future demand, we increase the number of samples to 1,000,000 to provide a more accurate empirical average of the distribution. We then follow the DP formulation presented in Appendix B.1 along with the discretization approach described in Appendix B.3. We stress that unlike the setting of Appendix B.3, here we have correlated demand. As such, we must augment the state space to include the entire demand history. Said differently, the Bellman equation we solve is a discrete version of (22). However, given the small size of the problem, we are able to compute the DP solution when allowing for 50 different discrete values for demand, which corresponds to ≈ 0.01 in the approach described in Appendix B.3.

In the first column of Table 2, we present the average ex-post fairness for the aforementioned policies. 29 In addition, we compute the average ex-post fairness achieved by the optimum offline policy. (For a given realization of demands $d$, the optimum offline policy achieves a minimum FR of $\min\{1, s/\sum_{i \in [n]} d_i\}$.) We make the following observations:

• PPA vs. Guarantee: The PPA policy performs 30% better than its theoretical guarantee of 0.60 (from Theorem 2 when µ = 1 and n = 4) in this non-adversarial setting.

• PPA vs. Optimal TFR: The PPA policy outperforms the optimal TFR policy by 44%, and we highlight that the optimal TFR policy performs even worse than the PPA guarantee. Intuitively, in this pandemic-based setting where correlation is strong and the total demand has a relatively large coefficient of variation, adaptivity is particularly valuable.

• PPA vs. DP: The solution to the DP, despite being challenging to compute even in this simple setting, exhibits nearly identical performance to the PPA policy. This further implies that using additional information beyond first conditional moments does not provide significant benefit in our numerical simulations.

• PPA vs. Optimum Offline: The PPA policy even achieves 94% of the optimal performance attainable when the demand sequence is known in advance. (Recall that Proposition 1 established that, in general, no online policy can provide a constant-factor guarantee relative to this benchmark. In this practical instance, however, the PPA policy performs strongly relative to the optimum offline.)

Table 2. Ex-post fairness and waste of three policies across 1,000 simulations when the sample paths are accurately drawn (columns 1-2) and for two different types of mis-specification (columns 3-6).
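The two simplest benchmarks in this comparison can be evaluated per sample path as in the sketch below. The offline benchmark follows the formula stated above. The TFR rule (give each arriving agent a fraction τ of demand until supply runs out) is our reading of the policy from the proof of Theorem 3, since Section 3.4 is not reproduced here; the function names and the grid search are assumptions.

```python
from typing import List, Sequence


def offline_min_fill_rate(demands: Sequence[float], supply: float = 1.0) -> float:
    """Minimum fill rate of the optimum offline policy: min{1, s / total demand}."""
    total = sum(demands)
    return 1.0 if total <= supply else supply / total


def tfr_min_fill_rate(demands: Sequence[float], tau: float,
                      supply: float = 1.0) -> float:
    """Minimum fill rate when each agent is offered a fraction tau of demand."""
    remaining = supply
    min_fr = 1.0
    for d in demands:
        if d <= 0:
            continue                      # agents without demand do not affect the minimum
        allocation = min(tau * d, remaining)
        remaining -= allocation
        min_fr = min(min_fr, allocation / d)
    return min_fr


def best_target(samples: List[Sequence[float]], grid: Sequence[float],
                supply: float = 1.0) -> float:
    """Grid search for the target with the highest average minimum fill rate."""
    def average(tau: float) -> float:
        return sum(tfr_min_fill_rate(d, tau, supply) for d in samples) / len(samples)
    return max(grid, key=average)


print(offline_min_fill_rate([1.0, 1.0]))        # total demand is twice the supply
print(tfr_min_fill_rate([1.0, 1.0], tau=1.0))   # supply is exhausted by the first agent
print(tfr_min_fill_rate([1.0, 1.0], tau=0.5))   # both agents receive half of their demand
```

The second call illustrates why a non-adaptive target can be costly: once supply is exhausted, a later agent receives nothing and the minimum fill rate collapses to zero.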

Moving beyond the comparison in terms of ex-post fairness, we also examine the allocative efficiency of our policy by reporting the average fraction of wasted supply in the second column of Table 2 . Supply is wasted whenever there is remaining supply that could have been allocated to an agent without exceeding their demand, which is undesirable in our motivating applications. Mathematically, the fraction of wasted supply is given by

By definition, the optimum offline policy never wastes supply. Similarly, the optimal TFR policy in this setting (where the optimal target fill rate is one) also does not waste supply. The PPA policy does waste some supply in expectation, but this waste is minimal (less than 1%) and slightly less than the amount of waste under the DP solution.

Furthermore, in practice, various model parameters may be mis-specified. Therefore, it is practically important to rely on policies that are not too sensitive to such model mis-specifications.

In the following, we examine the robustness of the three aforementioned policies and show that our PPA policy outperforms both the TFR policy and the DP in terms of robustness. Specifically, we consider two different scenarios of mis-specification in the parameters of the SEIR model: as discussed below, demand is over-estimated in scenario (i) and under-estimated in scenario (ii). In the middle (resp. right) two columns of Table 2, we present the ex-post fairness and the waste of all three policies on the same 1,000 simulations, but when the policies are calibrated on sample paths generated by the mis-specified model from scenario (i) (resp. scenario (ii)).

We observe that the performance of the PPA policy remains relatively stable, despite the substantial levels of mis-specification. In contrast, the optimal TFR policy's performance changes somewhat dramatically in scenario (i). In this scenario, where demand is over-estimated, the optimal target fill rate drops precipitously from 1.0 to 0.492 due to a greater perceived risk of exhausting supply. This new target fill rate impacts both the ex-post fairness and the average fraction of supply that is wasted. Similarly, the performance of the DP solution is substantially impacted in one of the two scenarios, this time in scenario (ii). In this scenario, when demand is under-estimated, the DP can be overly aggressive in its allocation decisions and end up with a near-0 minimum FR in certain simulations. The heavier left-tail for the distribution of the DP's minimum FR drives the average performance down, ultimately leading to a decrease in ex-post fairness by more than 5%.

We conclude this section by highlighting that we are only able to compute the DP solution due to limiting the setting to four agents. Yet despite this simplified setting, the solution to the DP also suffers from the additional shortcomings described in Remark 2: it requires full distributional knowledge (which may be difficult to acquire and more prone to mis-specification), it achieves worse ex-ante fairness than the PPA policy (by nearly 5% in these simulations), and its allocation is not transparent. Transparency is particularly important in this setting, as numerous states questioned the allocation procedures implemented by the federal government (see Footnote 1 for one such example). In contrast, both the PPA policy and the optimal TFR policy follow strategies that can be easily explained to stakeholders.

We conclude the paper by summarizing our main findings, discussing a few extensions of our base framework, and listing a few future directions.

In this paper, we initiate the theoretical study of fair dynamic rationing by introducing a simple yet fundamental and well-motivated framework. In a nutshell, we design sequential policies for allocating limited supply to a sequence of arbitrarily correlated demands given an objective which encompasses the dual goals of efficiency and equity. Based on our formalized notions of ex-post and ex-ante fairness, we establish upper bounds on the fairness guarantees achievable by any sequential allocation policy which depend on the supply's scarcity level and the number of demanding agents.

More importantly, we show that our simple PPA policy achieves the "best of both worlds" by attaining the upper bound on both the ex-post and ex-ante fairness guarantees. In addition to enjoying optimal fairness guarantees, our PPA policy is practically appealing: it is interpretable as well as computationally efficient since it does not rely on distributional knowledge beyond the conditional first moments.

Our framework lends itself to extensions such as considering generalized objectives and rationing multiple types of resources. More broadly, it serves as a base model for theoretically studying sequential allocation problems with an objective beyond utility maximization, which in turn opens several new research directions. In the rest of this section, we first discuss the aforementioned extensions of our base model and then finish the paper by discussing future directions.

Throughout the paper, we have focused on the minimum FR as a social welfare objective that combines elements of equity and efficiency. However, this social welfare function (also known as the Rawlsian social welfare function, thanks to the philosophical work of John Rawls (1973)) is only a special case of a more general class of social welfare functions that we call the weighted power mean (WPM) social welfare family of functions. More precisely, this family is parameterized by α ∈ [0, +∞) and defined as

Note that the above is a weighted version of the celebrated power mean functions, introduced in Atkinson et al. (1970) , that provides a broad class of social welfare functions which balance equity and efficiency to varying degrees. Having weights proportional to the demands in eq. (20) ensures that equity is measured relative to demand and not simply based on the absolute allocation. 32 By varying the parameter α from 0 to +∞, the focus of the planner is shifted from extreme efficiency towards more equitable allocations. When α = 0, a utilitarian allocation (i.e., any allocation without waste) maximizes social welfare. In the limit as α → 1, proportional fairness (i.e., a generalization of the Nash bargaining solution) maximizes social welfare. Finally, in the limit as α → +∞, maximizing the minimum FR maximizes social welfare. In fact, we highlight that the value of this social welfare function exactly approaches our objective in the base model (i.e., the minimum FR, or equivalently, the Rawlsian social welfare function).
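As a hedged illustration of how such a family interpolates between efficiency and equity, the sketch below uses one standard parameterization of a demand-weighted power mean of fill rates, $U_\alpha = \big(\sum_i w_i (x_i/d_i)^{1-\alpha}\big)^{1/(1-\alpha)}$ with $w_i = d_i/\sum_j d_j$. Because eq. (20) is not reproduced in this section, this exact form is an assumption chosen to match the limiting cases described above, and the function name is hypothetical.

```python
import math
from typing import Sequence


def weighted_power_mean(alloc: Sequence[float], demand: Sequence[float],
                        alpha: float) -> float:
    """Demand-weighted power mean of fill rates (assumed form, see lead-in)."""
    total = sum(demand)
    # (weight, fill rate) pairs for agents with positive demand.
    pairs = [(d / total, x / d) for x, d in zip(alloc, demand) if d > 0]
    has_zero = any(fr == 0 for _, fr in pairs)
    if alpha == 1.0:
        # Limit alpha -> 1: weighted geometric mean (proportional fairness).
        if has_zero:
            return 0.0
        return math.exp(sum(w * math.log(fr) for w, fr in pairs))
    if alpha > 1.0 and has_zero:
        return 0.0                         # a zero fill rate dominates for alpha > 1
    p = 1.0 - alpha
    return sum(w * fr ** p for w, fr in pairs) ** (1.0 / p)


x, d = [0.5, 0.25], [1.0, 1.0]
for a in (0.0, 1.0, 50.0):
    print(a, weighted_power_mean(x, d, a))
```

Under this assumed form, α = 0 returns total allocation divided by total demand, while large α drives the value toward the minimum fill rate.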

For any parameter α (including when α → +∞, which corresponds to our base model), the optimal policy when demands are deterministic is to allocate the supply proportionally. Such a policy achieves the optimal social welfare of W (defined in eq. (1)). 33 However, the value of the parameter α impacts the optimal policy when demands are stochastic. To study this impact, we naturally generalize our notion of ex-post fairness to WPM social welfare functions, i.e., ex-post fairness is given by $\mathbb{E}_{d \sim F}[U_\alpha]/W$. We remark that in the limit as α → +∞, this is equivalent to the notion of ex-post fairness introduced in Section 2. In the following corollary of Theorem 2, we establish that our PPA policy achieves an ex-post fairness guarantee of at least $\kappa_p(\mu, n)$ for any α ∈ [0, +∞).

32 Further, this family of functions has a one-to-one relationship with the α-fairness social welfare functions introduced in Mo and Walrand (2000). In fact, the two families of functions are the same up to a transformation via a one-to-one, increasing function, which means that the maximizing vectors are identical for a given α.

33 Note that in a deterministic setting, maximizing any social welfare objective function as in eq. (20) is a concave maximization. By writing KKT conditions it is not hard to see that any such function attains its maximum at a feasible proportional allocation, i.e., $x_i = \min\{d_i, d_i/\sum_{j \in [n]} d_j\}$, under deterministic demands. The maximum is then equal to $W = \min\{1, 1/\mu\}$.

Corollary 1 (PPA's Guarantee for WPM Objectives). Given a fixed number of agents n ∈ N, supply scarcity µ ∈ R ≥0 , and any α ∈ [0, +∞), the PPA policy achieves an ex-post fairness guarantee of at least κ p (µ, n) (defined in eq. (2)) when social welfare is measured by a WPM function (defined in eq. (20)) with parameter α.

We prove Corollary 1 in Appendix G.1.

In our base model, we assume that agents have demand for only a single type of resource. However, in many of the motivating applications that we consider, agents may have concurrent demand for multiple types of resources. For example, states may need many different types of medical supplies during the peak of a pandemic.

Our setup readily extends to the sequential allocation of m different resource types where an arriving agent simultaneously demands m types of supply. We allow the demands to be correlated across agents as well as resource types. For the sake of brevity, we refrain from repeating the setup and we simply augment our notation for various quantities (e.g., supply, demand, and allocation) by adding a superscript j ∈ [m]. In this generalized model, we define agent i's utility to be their weighted FR, defined as:

where we normalize the weights $\lambda^j$ to satisfy $\sum_{j \in [m]} \lambda^j = 1$. A simple corollary of Theorem 2, as stated below, ensures that independently following our PPA policy for each resource achieves a lower bound on the expected minimum weighted FR which is a weighted sum of the expected minimum FR guaranteed by the PPA for one resource, i.e., $\kappa_p(\mu^j, n)/\max\{1, \mu^j\}$ for resource j.

Corollary 2 (PPA's Guarantee on Expected Minimum Weighted FR). Consider any instance with n ∈ N agents and m ∈ N resources, where the initial supply for resource j is $s^j \in \mathbb{R}_{\ge 0}$. For any joint demand distribution over all agents and resources $F \in \Delta(\mathbb{R}^{n \times m}_{\ge 0})$, independently following the PPA policy for each resource achieves an expected minimum weighted FR (as defined in eq. (21)) of at least $\sum_{j \in [m]} \lambda^j \frac{\kappa_p(\mu^j/s^j,\, n)}{\max\{1,\, \mu^j/s^j\}}$, where $\mu^j \in \mathbb{R}_{\ge 0}$ is the expected total demand for resource j.

In addition, since demand can be correlated across agents, we can re-use the hard instances of Theorem 1 to construct a joint distribution which establishes an upper-bound on the performance of any policy matching the lower bound given in Corollary 2. We state this upper bound as a corollary of Theorem 1 below.

Corollary 3 (Upper Bound on Expected Minimum Weighted FR). For any n ∈ N agents, m ∈ N resources, and any initial supply for resource j of $s^j \in \mathbb{R}_{\ge 0}$, there exists a joint demand distribution over all agents and resources $F \in \Delta(\mathbb{R}^{n \times m}_{\ge 0})$ for which no policy can achieve an expected minimum weighted FR greater than $\sum_{j \in [m]} \lambda^j \frac{\kappa_p(\mu^j/s^j,\, n)}{\max\{1,\, \mu^j/s^j\}}$, where $\mu^j \in \mathbb{R}_{\ge 0}$ is the expected total demand for resource j.

Together, these two corollaries (which we prove in Appendix G) establish that independently following the PPA policy for each resource j provides the best possible guarantee on the expected minimum weighted FR. Consequently, we can use our PPA policy to shed light on how the social planner can prepare for demand across multiple types of resources. If the initial endowment of different resource types is not exogenously set, then the social planner can solve an outer endowment optimization problem to maximize the guarantee on the expected minimum weighted FR subject to a budget constraint.

We remark that such an endowment optimization problem is a max-max-min problem where the social planner first optimizes the initial endowment across resource types subject to a budget constraint (the outer problem). Then, given the initial endowment, the social planner maximizes over policies the minimum over demand distributions of our objective, i.e., the expected minimum weighted FR among agents. We solve the outer maximization of the multiple resource-type problem by determining the optimal initial endowment across resource types when the social planner independently follows the PPA policy for each resource. To be concrete, suppose the social planner has a fixed budget B that can be used to procure an initial endowment s = (s 1 , s 2 , . . . , s m ). Further, suppose the per unit cost of resource j ∈ [m] is c j . Then, the outer endowment optimization problem can be formulated as follows:

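$$\max_{s \in \mathbb{R}^m_{\ge 0}} \; \sum_{j \in [m]} \lambda^j \, \frac{\kappa_p(\mu^j/s^j,\, n)}{\max\{1,\, \mu^j/s^j\}} \quad \text{subject to} \quad \sum_{j \in [m]} c^j s^j \le B.$$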

We highlight that to formulate this optimization problem, we crucially use the parameterized characterization of the ex-post fairness guarantee in Theorem 2, as opposed to the worst-case guarantee for any set of parameters. Further, we remark that the above maximization problem is linearly separable and concave in the decision variables s, meaning that it can be solved efficiently. 34
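Because the objective is separable and concave, a simple marginal-gain heuristic already gives a reasonable approximate solution. The sketch below is not the paper's method: the greedy discretization, the function name, and the `guarantee` callable (which stands for the per-resource map from endowment to guaranteed fill rate) are all assumptions. Since the closed form of $\kappa_p$ in eq. (2) is not reproduced here, the usage example substitutes the deterministic benchmark $\min\{1, s/\mu\}$ purely as a stand-in.

```python
from typing import Callable, List, Sequence


def allocate_budget(guarantee: Callable[[float, float], float],
                    weights: Sequence[float], mus: Sequence[float],
                    costs: Sequence[float], budget: float,
                    step: float = 0.01) -> List[float]:
    """Greedy marginal allocation for a separable concave objective.

    Repeatedly adds `step` units of the resource with the largest gain in
    weights[j] * guarantee(mus[j], s_j) per unit of budget spent.
    """
    m = len(weights)
    s = [0.0] * m
    spent = 0.0
    while True:
        best_j, best_gain = None, 0.0
        for j in range(m):
            if spent + costs[j] * step > budget + 1e-12:
                continue                   # this increment is not affordable
            before = weights[j] * guarantee(mus[j], s[j])
            after = weights[j] * guarantee(mus[j], s[j] + step)
            gain = (after - before) / (costs[j] * step)
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:
            break                          # budget exhausted or no further gain
        s[best_j] += step
        spent += costs[best_j] * step
    return s


# Stand-in guarantee (hypothetical): deterministic benchmark min{1, s / mu}.
stand_in = lambda mu, s: min(1.0, s / mu)
print(allocate_budget(stand_in, weights=[0.5, 0.5], mus=[1.0, 2.0],
                      costs=[1.0, 1.0], budget=2.0))
```

In this toy usage the first resource is funded until its demand is covered, after which the remaining budget flows to the second resource. Replacing `stand_in` with the actual guarantee from Theorem 2 would be needed to reproduce the planner's problem.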

Based on Corollaries 2 and 3, the optimal solution s * , combined with independently implementing the PPA policy for each resource, is indeed the optimal solution of the max-max-min multiple resource-type problem.

34 It is not difficult to check that the function $\frac{\kappa_p(\mu^j/s^j,\, n)}{\max\{1,\, \mu^j/s^j\}}$ is concave in $s^j$ for any choice of $\mu^j$ and $n$; check eq. (2) for a definition of $\kappa_p(\cdot, \cdot)$. We omit this purely algebraic proof for brevity.

Our paper can be viewed as an analog of the classic prophet inequality problem (Krengel and Sucheston 1978, Samuel-Cahn et al. 1984) for equitably allocating divisible goods. As such, similar to prophet inequalities, many interesting variants of our setting arise. We discussed two such variants above, and for both, we established an achievable lower bound by employing our PPA policy.

However, in the former variant, we do not establish a matching upper bound. Establishing tight bounds on the achievable performance in such a setting-which may require the use of a different policy-is an interesting direction for future research. Further, understanding the inefficiency (unused supply) which may occur in sequential allocation due to our focus on an egalitarian objective is a fruitful research direction that we plan to pursue. Finally, here we made no assumption about the correlation structure underlying the demand sequence. It would be compelling to investigate whether including a (well-motivated) correlation structure can result in improved fairness guarantees.

A. Missing Proofs of Section 3.1

A.1. Proof of Theorem 1 (Section 3.1)

We prove the theorem by considering two separate cases corresponding to the over-demanded regime ($\mu \ge 1 + \frac{1}{n}$) and the under-demanded regime ($\mu < 1 + \frac{1}{n}$). For each regime, we provide an instance of the problem under which no sequential allocation policy obtains ex-post fairness larger than $\kappa_p(\mu, n)$ restricted to that regime.

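Over-demanded regime ($\mu \ge 1 + \frac{1}{n}$): Consider an instance with $n$ equally likely scenarios, where in scenario $\sigma$ all agents $i \in [\sigma]$ have demand $d_i = \frac{2\mu}{n+1}$ and the remaining agents have no demand. This instance is depicted in Figure 2(a). We remark that the total expected demand is equal to $\mu$, simply because

$$\frac{1}{n} \sum_{\sigma \in [n]} \sigma \cdot \frac{2\mu}{n+1} = \frac{1}{n} \cdot \frac{n(n+1)}{2} \cdot \frac{2\mu}{n+1} = \mu.$$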

In such a setting, whenever agent $i$ has non-zero demand, every agent $j$ with $j < i$ must also have had non-zero demand. Since the policy cannot distinguish among scenarios $i, i+1, \ldots, n$, its allocation decision must be independent of the scenario. Therefore, any policy can be described by a set of allocations $y = (y_1, y_2, \ldots, y_n)$, such that if agent $i$ has non-zero demand, then they receive an expected allocation $y_i$. Furthermore, when making the allocation decision for agent $n$, there is only one possible history: every other agent also had non-zero demand. Thus, any feasible sequential allocation policy must respect the constraint $\sum_{i \in [n]} y_i \le 1$.

Let us define $r_\sigma$ as the expected minimum FR of the given policy in scenario $\sigma$ (i.e., if only the first $\sigma$ agents have non-zero demand). By convention, we set $r_0 = 1$, and we must have $r_\sigma \le r_{\sigma-1}$ by definition. In addition, $r_\sigma$ can be no larger than the expected FR of agent $\sigma$, so $r_\sigma \le \frac{n+1}{2\mu} y_\sigma$. Given $r$, the expected minimum FR in this instance is equal to $\frac{1}{n} \sum_{\sigma \in [n]} r_\sigma$. Based on these constraints and the objective, we can formulate a linear program whose optimal solution is an upper bound on the expected minimum FR achievable by any feasible sequential allocation policy. This linear program was originally presented in Section 3.1, but we replicate it here (Primal-LP1) along with its dual program (Dual-LP1).

To upper-bound the value of the program Primal-LP1, we find a feasible assignment for the dual program Dual-LP1. Consider the assignment where $\delta_i = \frac{1}{n}$ and $\gamma_i = 0$ for all $i \in [n]$, and where $\omega = \frac{n+1}{2n\mu}$. Under this assignment, all dual variables are non-negative and all constraints are satisfied (in fact, all are tight). Thus, this assignment is feasible in Dual-LP1. It also attains an objective value of $\frac{n+1}{2n\mu}$. By weak duality, this represents an upper bound on the optimal value of Primal-LP1, and hence an upper bound on the expected minimum FR of any policy in the over-demanded regime.

Figure 7: The instance which establishes an upper bound of $\kappa_p(\mu, n)$ when $\mu < 1 + \frac{1}{n}$ (which we refer to as the under-demanded regime).
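The weak-duality argument for the over-demanded regime can be checked with exact arithmetic. The sketch below is illustrative only: the LP layout (objective $\frac{1}{n}\sum_\sigma r_\sigma$, constraints $r_\sigma \le r_{\sigma-1}$, $r_\sigma \le \frac{n+1}{2\mu} y_\sigma$, $\sum_i y_i \le 1$) is reconstructed from the constraints described in the text, the pairing of dual variables with constraints is an assumption, and the function name `check_over_demanded_bound` is hypothetical.

```python
from fractions import Fraction

def check_over_demanded_bound(n, mu):
    """Check the quoted dual assignment and a matching primal point exactly.

    Assumed LP (reconstructed from the prose, not copied from the paper):
      max (1/n) * sum_s r_s
      s.t. r_1 <= 1, r_s <= r_{s-1}   (duals gamma_s)
           r_s <= coef * y_s          (duals delta_s)
           sum_s y_s <= 1             (dual omega)
    """
    coef = Fraction(n + 1) / (2 * mu)
    # Dual assignment quoted in the proof.
    delta = [Fraction(1, n)] * n
    gamma = [Fraction(0)] * (n + 1)   # last entry is padding for gamma_{n+1} = 0
    omega = Fraction(n + 1) / (2 * n * mu)
    dual_feasible = all(
        omega >= coef * delta[s]                                  # column of y_s
        and delta[s] + gamma[s] - gamma[s + 1] >= Fraction(1, n)  # column of r_s
        for s in range(n)
    )
    dual_value = omega + gamma[0]     # gamma_1 multiplies the bound r_1 <= r_0 = 1
    # Candidate primal point: split the unit supply evenly across agents.
    y = [Fraction(1, n)] * n
    r = [coef * y_s for y_s in y]
    primal_feasible = (
        sum(y) <= 1 and r[0] <= 1
        and all(r[s] <= r[s - 1] for s in range(1, n))
    )
    primal_value = sum(r) / n
    return dual_feasible, primal_feasible, dual_value, primal_value

print(check_over_demanded_bound(4, Fraction(3, 2)))
```

Under these assumptions the primal and dual values coincide, so the bound $\frac{n+1}{2n\mu}$ is the optimal value of the reconstructed program and not merely an upper bound on it.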

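Under-demanded regime ($\mu < 1 + \frac{1}{n}$): Consider an instance with $n + 1$ scenarios, where the first $n$ scenarios each occur with equal probability $\frac{\mu}{n+1}$ and scenario $n + 1$ occurs with probability $1 - \frac{n\mu}{n+1}$. In scenario $n + 1$, there is no demand. In scenario $\sigma$ for $\sigma \in [n]$, all agents $i \in [\sigma]$ have demand $d_i = \frac{2}{n}$. This instance is depicted in Figure 7. We remark that the total expected demand is equal to $\mu$, simply because

$$\frac{\mu}{n+1} \sum_{\sigma \in [n]} \sigma \cdot \frac{2}{n} = \frac{\mu}{n+1} \cdot \frac{n(n+1)}{2} \cdot \frac{2}{n} = \mu.$$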

As was the case in the over-demanded regime, any sequential allocation policy can be described by a set of allocation decisions such that if agent $i$ has non-zero demand, then they receive an expected allocation $y_i$. We again define $r_\sigma$ as the expected minimum FR in scenario $\sigma$, and we note that $r_{n+1} = 1$. Thus, the expected minimum FR in this instance is equal to $\frac{\mu}{n+1} \sum_{\sigma \in [n]} r_\sigma + 1 - \frac{n\mu}{n+1}$. By imposing the constraints described for Primal-LP1, we can formulate a slightly different linear program whose optimal solution is an upper bound on the expected minimum FR of any feasible sequential allocation policy in the under-demanded regime.

This linear program (Primal-LP2), along with its dual program (Dual-LP2), is presented below.

To upper-bound the value of the program Primal-LP2, we find a feasible assignment for its dual. Consider $\delta_i = \frac{\mu}{n+1}$ and $\gamma_i = 0$ for all $i \in [n]$, and $\omega = \frac{n\mu}{2(n+1)}$. Under this dual assignment, all dual variables are non-negative and all constraints are satisfied (and again tight). Thus, this assignment is feasible in Dual-LP2.

It also attains an objective value of $\frac{n\mu}{2(n+1)} + \left(1 - \frac{n\mu}{n+1}\right) = 1 - \frac{n\mu}{2(n+1)}$. By weak duality, this represents an upper bound on the optimal value of Primal-LP2, and hence an upper bound on the expected minimum FR attainable by any sequential allocation policy when $\mu < 1 + \frac{1}{n}$. We conclude the proof by scaling the obtained upper bounds on the expected minimum FR by our normalization factor, namely $W = \min\{1, 1/\mu\}$. This establishes an upper bound of $\kappa_p(\mu, n)$ on the ex-post fairness guarantee (see Definition 3) achievable by any sequential allocation policy.
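As a quick sanity check on the two regimes, the following sketch evaluates the two upper bounds on the expected minimum FR derived in this proof and confirms that they agree at the boundary $\mu = 1 + \frac{1}{n}$. It also recomputes the expected total demand of the under-demanded instance. The function names are illustrative and not taken from the paper.

```python
from fractions import Fraction

def expected_total_demand_under(n, mu):
    """Expected total demand of the under-demanded hard instance."""
    mu = Fraction(mu)
    p = mu / (n + 1)              # probability of each scenario sigma in [n]
    per_agent = Fraction(2, n)    # demand of each active agent
    return sum(p * sigma * per_agent for sigma in range(1, n + 1))

def upper_bound_expected_min_fr(n, mu):
    """Upper bound on the expected minimum FR before scaling by W."""
    mu = Fraction(mu)
    if mu >= 1 + Fraction(1, n):                  # over-demanded regime
        return Fraction(n + 1) / (2 * n * mu)
    return 1 - n * mu / (2 * (n + 1))             # under-demanded regime

n = 5
boundary = 1 + Fraction(1, n)
print(expected_total_demand_under(n, Fraction(3, 4)))
print(upper_bound_expected_min_fr(n, boundary))
```

Both expressions evaluate to $\frac{1}{2}$ at the boundary, so the piecewise bound is continuous in $\mu$.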

To establish this result, we make use of the hard instance when $\mu \ge 1 + 1/n$ which was presented in the proof of Theorem 1 and illustrated in Figure 2(a). For ease of reference, we describe the instance below.

In this instance, there are $n$ equally likely scenarios, i.e., scenario $\sigma$ happens with probability $1/n$ for $\sigma \in [n]$. In scenario $\sigma$, the first $\sigma$ agents have equal demand of $\frac{2\mu}{n+1}$ and the rest have no demand. As established in the proof of Theorem 1, the expected demand in this instance is $\mu$, and no online policy can achieve an expected minimum FR greater than $\frac{n+1}{2n\mu}$. The optimal offline solution is to divide supply proportionally across all realized demand. This solution achieves a minimum FR of either 1 or $\frac{1}{\sum_{i \in [n]} d_i}$. For this example, we will assume that $\frac{2\mu}{n+1} \ge 1$, which means that each non-zero demand is at least as large as the supply. This ensures that the minimum FR achieved by the optimal offline solution is exactly $\frac{1}{\sum_{i \in [n]} d_i}$. Hence, the expected minimum FR achieved by the optimal offline solution is simply equal to

$$\frac{1}{n} \sum_{\sigma \in [n]} \frac{n+1}{2\mu\sigma} = \frac{n+1}{2n\mu} \sum_{\sigma \in [n]} \frac{1}{\sigma} \ge \frac{n+1}{2n\mu} \int_{1}^{n+1} \frac{dx}{x} = \frac{n+1}{2n\mu} \log(n+1).$$

The inequality holds because the summation is a left Riemann sum of a decreasing function. As noted above, the upper bound on the expected minimum FR of any online policy is $\frac{n+1}{2n\mu}$. Thus, the ratio between the two is at least $\log(n + 1)$, which completes the proof of Proposition 1.
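The gap between the offline and online benchmarks on this instance can be computed exactly. The sketch below is a hedged illustration (the function name `offline_vs_online` is hypothetical): it evaluates the offline expected minimum FR, the online upper bound $\frac{n+1}{2n\mu}$, and their ratio, which equals the $n$-th harmonic number.

```python
import math
from fractions import Fraction

def offline_vs_online(n, mu):
    """Offline value, online upper bound, and their ratio on the hard instance."""
    mu = Fraction(mu)
    d = 2 * mu / (n + 1)                      # demand of each active agent
    if d < 1:
        raise ValueError("the argument assumes 2*mu/(n+1) >= 1")
    # In scenario sigma the offline min FR is 1 / (sigma * d).
    offline = sum(Fraction(1, n) / (sigma * d) for sigma in range(1, n + 1))
    online_ub = Fraction(n + 1) / (2 * n * mu)
    return offline, online_ub, offline / online_ub

offline, online_ub, ratio = offline_vs_online(10, 6)
print(float(ratio), math.log(11))
```

The ratio is independent of $\mu$ and exceeds $\log(n+1)$ for every $n$, consistent with the Riemann-sum argument.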

In this section we describe the dynamic programming for computing the optimum online policy for maximizing the expected minimum FR. Given a joint demand distribution F, the optimum online policy π *

denote the allocation decisions of policy π * when it is run on the stochastic demand vector d ∼ F. In Appendix B.1 we describe this DP in its most general form, including for instances with a continuous state space. However, even in the case of a discrete state space, such a DP suffers from the curse of dimensionality when demand is arbitrarily correlated. We elaborate more on this issue in Appendix B.2 by providing additional details on the computational model (as well as the input model) that we consider. In Appendix B.3, we focus on the special case of independent demands, where the curse of dimensionality does not occur; we show that by appropriate discretization, we can obtain a fully polynomial-time (near-optimal) approximation scheme (FPTAS) for this DP. Finally, in Appendix B.4, we present examples that illustrate the drawbacks of the DP approach beyond computational challenges. By convention, in the remainder of this section we assume that demands and supply are normalized such that the total supply s = 1.

We begin our description of the dynamic program with its state space. Upon the arrival of agent $i$, the state captures the index of the current agent $i \in [n]$, the sample path of all previous agents' realized demands $d_{[1:i-1]}$, the remaining supply $s_i$, and the minimum FR $f_i$ up to (but not including) the arrival of agent $i$. Note that it is necessary to include the entire sample path of previous demands in our state, as the demand is arbitrarily correlated. Thus, the conditional distribution of the vector of future demands, and hence the performance of the optimum online policy, depends on the realized sample path so far. As we discuss in more detail in Appendix B.2, the size of such a state space can be exponential in $n$ (even after discretization), as there might be exponentially many sample paths $d$ that have non-zero probability of happening. Now, we define V F (i, d

Given the filled DP 

Having described the Bellman equation for the DP, we next discuss its computation. For the sake of this discussion, let us assume that all demands are discrete and can take $\Delta$ different values. Even in this special case, the support of the joint distribution $F$ can have size $\Delta^n$ because we do not impose any restrictions on the structure of the joint distribution. Given that the DP

Finally, we remark that having access to such an oracle is practical and well-motivated. In fact, demand forecasting models in contexts such as a pandemic rely on parametric models whose parameters are drawn from appropriate distributions (or uncertainty sets). As such, even though such distributions do not have succinct representations, it is easy to simulate them. We provide one such example in our numerical study in Section 4, where we present a simple SEIR model to forecast the number of infections for the COVID-19 pandemic. As is common in the epidemiology literature, we assume that some of the parameters of the model are uncertain and drawn from given distributions. The resulting continuous joint demand distribution does not have a compact representation and induces demands that are highly correlated; however, it is easy to generate independent sample paths by drawing parameters and then following the dynamics of the SEIR model.

35 We would like to highlight that if the support of $F$ has size equal to $L \in \mathbb{N}$, i.e., if this distribution only assigns non-zero probabilities to $L$ different demand sequences, then the size of the DP table will be polynomial in $L$. Hence, under a naive unary computational model where any algorithm should first read the input distribution as an $L$-dimensional vector in $[0, 1]^n$, the DP's running time can be thought of as a polynomial in its input size $L$. However, in principle, $L$ can be as large as $\Delta^n$ (hence exponentially large in $n$), which makes reading the input distribution computationally inefficient and impractical. As we explain in the next paragraph, we consider a different computational model to circumvent the issue of reading the input distribution. (We emphasize that this computational model does not reduce the exponential size of the DP table's state space.)

We start this subsection by considering a special case of the dynamic programming in Appendix B.1 when demands are independent. However, we still allow for a continuous state space (i.e., the demand, the remaining supply, the minimum FR, and the allocation decision at each period i can be any non-negative real numbers).

Later we show how to discretize this DP to obtain a fully polynomial-time (near-optimal) approximation scheme.

Continuous DP: When the demands $d_1, d_2, \ldots, d_n$ are independent, the dynamic program for the optimum online policy (discussed in Appendix B.1) does not need to consider the sample path of all previous agents' realized demands $d_{[1:i-1]}$ as part of the state, simply because future demand does not depend on this demand history. As a result, we can define $V^F_{\mathrm{ind}}(i, s_i, f_i)$ to be the expected minimum FR of the optimum online policy starting from time $i$, when it is initialized with a remaining supply of $s_i$ and a current minimum FR of $f_i$. For this special case, the Bellman update equation in eq. (22) can be simplified as follows:

$$V^F_{\mathrm{ind}}(i, s_i, f_i) = \mathbb{E}_{d_i \sim F_i}\left[ \max_{x_i \in [0, \min\{d_i, s_i\}]} V^F_{\mathrm{ind}}\left(i+1,\; s_i - x_i,\; \min\left\{f_i, \tfrac{x_i}{d_i}\right\}\right) \right].$$

Similar to the general setting, we add a dummy agent $i = n + 1$ at the end of the time horizon, who deterministically has 0 demand, and as the base case of the dynamic program, we set $V^F_{\mathrm{ind}}(n + 1, s_{n+1}, f_{n+1}) = f_{n+1}$ for all possible values of $s_{n+1}, f_{n+1} \in [0, 1]$. Given the filled DP table for all possible values of the state space and given a state $s_i$ and $f_i$ ahead of the arrival of agent $i$, upon realization of $d_i$ the optimum online policy picks the allocation decision $x^*_i(d_i, s_i, f_i)$ such that:

$$x^*_i(d_i, s_i, f_i) \in \arg\max_{x_i \in [0, \min\{d_i, s_i\}]} V^F_{\mathrm{ind}}\left(i+1,\; s_i - x_i,\; \min\left\{f_i, \tfrac{x_i}{d_i}\right\}\right).$$

Note that the above DP does not suffer from the curse of dimensionality. However, its state space is still continuous. In the following, we show that through proper discretization of the state space and the space of allocation decisions, we can develop a fully polynomial-time (near-optimal) approximation scheme for the continuous DP (denoted $\mathrm{DP}_{\mathrm{CONT}}$) by solving the discretized DP (denoted $\mathrm{DP}_{\mathrm{DISC}}$).

Discretizing the DP: Suppose $F = F_1 \times F_2 \times \ldots \times F_n$. We reiterate our assumption (without loss of generality) that supply and demands are normalized such that $s = 1$, and we further assume that the (normalized) demands are bounded by $d_H$. Fixing a parameter $\epsilon > 0$ such that $\epsilon^{-1} \in \mathbb{N}$, we then create an $\epsilon$-grid bounded between 0 and $\max\{1, d_H\}$, i.e., $G_\epsilon \triangleq \{k\epsilon : k = 0, 1, \ldots, \lceil \max\{1, d_H\}/\epsilon \rceil\}$, which we will use to discretize our state space.

Given $d_i \sim F_i$, we define the random variable $\hat{d}_i$ to be $d_i$ rounded up to the closest multiple of $\epsilon$, i.e., $\hat{d}_i$ is the smallest member of the set $G_\epsilon$ such that $\hat{d}_i \ge d_i$. Let $\hat{F}_i$ denote the CDF of $\hat{d}_i$. We also discretize the space of allowable values of remaining supply using the same $\epsilon$-grid $G_\epsilon$. Based on this restriction, along with the assumption that $\epsilon$ evenly divides 1, the allowable allocations are likewise contained within the set $G_\epsilon$. (Of course, the allowable allocation remains bounded by the current remaining supply and current demand.) Furthermore, the number of possible minimum FRs is bounded by $|G_\epsilon|^2$, since an FR is determined by the allocated supply and the realized demand. Thus, the size of the state space of $\mathrm{DP}_{\mathrm{DISC}}$ is at most $n|G_\epsilon|^3$. For each state and each possible realized demand (of which there are at most $|G_\epsilon|$), determining the optimal allocation requires a search over at most $|G_\epsilon|$ feasible allocations. As a consequence, filling the entire table requires on the order of $n|G_\epsilon|^5$ operations.
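To make the discretized recursion concrete, here is a minimal sketch of a grid-based DP for independent demands. It is not the authors' implementation: demands, supply, and allocations are stored as integer multiples of the grid step (so `eps_inv` plays the role of $\epsilon^{-1}$), the function name `solve_discretized_dp` and the dictionary-based input format are assumptions made for illustration, and memoization replaces an explicit table.

```python
from fractions import Fraction
from functools import lru_cache

def solve_discretized_dp(demand_dists, eps_inv):
    """Expected minimum FR of the optimal policy on an epsilon-grid.

    demand_dists[i] maps a grid index k (demand = k / eps_inv) to its
    probability; the initial supply is 1, i.e. grid index eps_inv.
    """
    n = len(demand_dists)

    @lru_cache(maxsize=None)
    def value(i, supply_k, min_fr):
        if i == n:                       # dummy agent n + 1: return the min FR so far
            return min_fr
        total = Fraction(0)
        for d_k, prob in demand_dists[i].items():
            if d_k == 0:                 # no demand: FR is 1, state unchanged
                total += prob * value(i + 1, supply_k, min_fr)
                continue
            # Search over all grid allocations bounded by demand and supply.
            best = max(
                value(i + 1, supply_k - x_k, min(min_fr, Fraction(x_k, d_k)))
                for x_k in range(min(d_k, supply_k) + 1)
            )
            total += prob * best
        return total

    return value(0, eps_inv, Fraction(1))

half = Fraction(1, 2)
# Two agents on a grid with step 1/12: d_1 = 4/3 surely, d_2 is 0 or 4/3.
dists = [{16: Fraction(1)}, {0: half, 16: half}]
print(solve_discretized_dp(dists, 12))
```

The state visited by the recursion is (agent index, remaining supply, running minimum FR), mirroring the state-space count above. The small instance is the $\epsilon \to 0$ limit of the two-agent example in Appendix B.4.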

Providing a near-optimal approximation: To implement $\mathrm{DP}_{\mathrm{DISC}}$, whenever a demand $d_i$ is realized, we round it up to $\hat{d}_i$. We use the solution of $\mathrm{DP}_{\mathrm{DISC}}$, denoted $x^{\mathrm{DISC}}_i$, to guide our allocation decision. We also update the state of $\mathrm{DP}_{\mathrm{DISC}}$, in particular the remaining supply, based on $x^{\mathrm{DISC}}_i$. However, to ensure consistency with the discretization scheme, we only allocate the portion of $x^{\mathrm{DISC}}_i$ necessary to ensure that the actual minimum FR precisely matches the discretized minimum FR, and we discard the remaining supply. 36 To be

. This convention ensures that the expected minimum FR of DP DISC on the actual (continuous) instance is identical to its expected minimum FR on the discretized instance. We now show that this expected minimum FR provides a near-optimal approximation of the expected minimum FR obtained by DP CONT .

Proposition 3. For any $\epsilon > 0$ such that $\epsilon^{-1} \in \mathbb{N}$, the expected minimum FR obtained by $\mathrm{DP}_{\mathrm{DISC}}$ is at least $1 - 2n\epsilon$ times the expected minimum FR obtained by $\mathrm{DP}_{\mathrm{CONT}}$.

Proof of Proposition 3: The proof relies on two key claims. For ease of presentation, we use the name of each DP to also denote the expected minimum FR it obtains. Furthermore, we let this expected minimum FR depend on the initial supply (which, we recall, is normalized to 1). First, we will establish a connection between the expected minimum FR obtained by the discretized DP, i.e., $\mathrm{DP}_{\mathrm{DISC}}(1)$, and the expected minimum FR obtained by the continuous DP when initialized with less supply. Consider the solution $\hat{x}^{\mathrm{DISC}}_i = \left(\lceil x^{\mathrm{CONT}}_i / \epsilon \rceil + 1\right)\epsilon$. We will show that allocating $\min\{\hat{d}_i, \hat{x}^{\mathrm{DISC}}_i\}$ at each time $i$ is a feasible allocation in the discretized DP, i.e., it never allocates more than the initial supply. We first note that

Furthermore, in order to ensure feasibility, the total allocation $\sum_{i \in [n]} x^{\mathrm{CONT}}_i$ can never exceed $1 - 2n\epsilon$ regardless of the sample path. As a consequence of these two observations, we must

The first inequality comes from the facts that under $\hat{x}^{\mathrm{DISC}}_i$, (i) we add at least an amount $\epsilon$ to the allocation decision of the continuous DP, and (ii) we round realized demand up by at most $\epsilon$. Thus, we are lower-bounding the numerator and upper-bounding the denominator. The second inequality comes from the fact that for any $a, b, c$ such that $b \ge a \ge 0$ and $c \ge 0$, $\frac{a+c}{b+c} \ge \frac{a}{b}$. As a result, the FR for agent $i$ under the proposed feasible policy in $\mathrm{DP}_{\mathrm{DISC}}(1)$ is lower-bounded by the FR for agent $i$ in $\mathrm{DP}_{\mathrm{CONT}}(1 - 2n\epsilon)$, simply because

This comparison holds for the FR of each agent, which means that the minimum FR under the proposed feasible solution in $\mathrm{DP}_{\mathrm{DISC}}(1)$ is at least the minimum FR in $\mathrm{DP}_{\mathrm{CONT}}(1 - 2n\epsilon)$ under any realized sample path $d_{[1:n]}$. This is a sufficient condition to prove Claim 4, as the same comparison holds after taking an expectation over the stochasticity in the demand sequence.

Next, we establish a lower bound on the value of the continuous DP when initialized with less supply. decision under this alternate policy is reduced by a factor 1 − δ, the total allocation under this alternative policy will never exceed 1 − δ. Thus, this alternative policy is feasible for DP CONT (1 − δ).

Furthermore, the minimum FR under this alternative policy will always be within a factor 1 − δ of the minimum FR under DP CONT (1), since the demands are the same and the allocation decisions are off by at most a factor of 1 − δ. As a result, there is a feasible solution to DP CONT (1 − δ) that obtains an expected minimum FR of at least (1 − δ)DP CONT (1). This is a sufficient condition to prove Claim 5.

Together, Claims 4 and 5 establish that $\mathrm{DP}_{\mathrm{DISC}}(1) \ge (1 - 2n\epsilon)\,\mathrm{DP}_{\mathrm{CONT}}(1)$, which completes the proof of Proposition 3.

We end this subsection by noting that if we set $\epsilon = \frac{\epsilon'}{2n}$ for a target accuracy $\epsilon' > 0$ when designing our discrete grid, then $\mathrm{DP}_{\mathrm{DISC}}(1) \ge (1 - \epsilon')\,\mathrm{DP}_{\mathrm{CONT}}(1)$. Moreover, filling the DP

In Appendix B.2, we illustrated the challenges of computing the DP solution when faced with an arbitrarily correlated demand sequence. However, even in the special case of independent demands where the DP solution can be efficiently approximated (see Appendix B.3), the DP solution suffers from a number of additional drawbacks. We highlighted these limitations in Remark 2 of Section 3.1, and we summarize these points in Table 3. While we have already discussed the first two rows of Table 3, in the following, we illustrate the third limitation using Example 1. Finally, note that because the DP allocation decision is based on solving a complex stochastic optimization problem, it inherently lacks transparency and interpretability. To highlight this lack of interpretability even further, we use Example 1 to show how a small change in the demand of one agent can drastically change its DP allocation decision. Unlike the DP solution, the allocation decisions of the PPA policy are minimally impacted by small changes in demand. 38

Example 1. Consider an instance with two agents ($n = 2$) with independent demand where the total expected demand is roughly twice the amount of supply ($\mu = 2 + \epsilon$ for small $\epsilon > 0$). There are two possible demand sequences that occur with equal probability, either $d = (\frac{4}{3} + \epsilon, \frac{4}{3})$ or $d = (\frac{4}{3} + \epsilon, 0)$. In this example, the first agent has deterministic demand $d_1 = \frac{4}{3} + \epsilon$. The second agent either has no demand (with probability 0.5) or has demand $d_2 = \frac{4}{3}$ (also with probability 0.5). Suppose the first allocation is deterministically given by $x_1$. (We note that neither the optimum online policy nor the PPA policy is a randomized policy, and thus the first agent will receive a deterministic allocation under each policy.) Then, the expected minimum FR is equal to

$$0.5 \cdot \frac{3x_1}{4 + 3\epsilon} + 0.5 \cdot \min\left\{\frac{3x_1}{4 + 3\epsilon}, \frac{3(1 - x_1)}{4}\right\}.$$

It is easy to verify numerically that this is maximized at $x^*_1(d_1) = \frac{4+3\epsilon}{8+3\epsilon}$, which implies that if the second demand is realized, $x^*_2(d_2) = \frac{4}{8+3\epsilon}$. This achieves an expected minimum FR of $\frac{3}{8+3\epsilon}$. In contrast, the PPA policy allocates $x^{\mathrm{PPA}}_1(d_1) = \frac{4/3+\epsilon}{4/3+\epsilon+(0.5)4/3} = \frac{4+3\epsilon}{6+3\epsilon}$, which implies that if the second demand is realized, $x^{\mathrm{PPA}}_2(d_2) = \frac{2}{6+3\epsilon}$. Based on our formula above, this achieves an expected minimum FR of $0.5 \cdot \frac{3}{6+3\epsilon} + 0.5 \cdot \frac{1.5}{6+3\epsilon} = \frac{3}{8+4\epsilon}$, which is quite close to the expected minimum FR achieved by the optimal online policy.

We now make the following two observations based on this example:

(i) The DP solution achieves sub-optimal ex-ante fairness. The FR of the first agent under the optimum online solution is deterministically equal to $\frac{3}{8+3\epsilon}$, while the expected FR of the second agent is $0.5 + 0.5 \cdot \frac{3}{8+3\epsilon}$, which is larger. Thus, the ex-ante minimum FR that results from the DP solution is $\frac{3}{8+3\epsilon}$. In contrast, under the PPA policy, the FR of the first agent is deterministically equal to $\frac{3}{6+3\epsilon} = \frac{1}{2+\epsilon}$, while the expected FR of the second agent is $0.5 + 0.5 \cdot \frac{1.5}{6+3\epsilon}$, which is larger. Thus, the ex-ante minimum FR that results from the PPA policy is $\frac{1}{2+\epsilon}$. We highlight that not only is the ex-ante minimum FR of the DP solution less than that of our PPA policy, but the corresponding ex-ante fairness for the DP solution is also less than the guarantee on ex-ante fairness provided by our policy (as given in Theorem 4).

38 Here we assume that the small change in demand does not impact the future demand distribution.

(ii) The DP solution lacks interpretability and is sensitive to small perturbations. To see this, we show that the above DP solution can vary significantly if the demand distribution is slightly different. Suppose that we perturb this instance by decreasing the first agent's demand by $2\epsilon$. In this case, the demand sequence is equally likely to be $d = (\frac{4}{3} - \epsilon, \frac{4}{3})$ or $d = (\frac{4}{3} - \epsilon, 0)$, and the ex-post minimum FR for an initial allocation decision $x_1$ is given by

$$0.5 \cdot \frac{3x_1}{4 - 3\epsilon} + 0.5 \cdot \min\left\{\frac{3x_1}{4 - 3\epsilon}, \frac{3(1 - x_1)}{4}\right\}.$$

Despite the small change in the demand sequence, the DP solution changes dramatically, as this expression is maximized at $x^*_1(d_1) = 1$. This new expected minimum FR of $\frac{3}{8-6\epsilon}$ is essentially unchanged from the original setting, but the first agent (deterministically) receives almost twice as much supply as before. Not only does this demonstrate that the DP solution is highly sensitive, but it also highlights that the DP solution can suffer from a lack of interpretability: the first agent receives more supply in this case, even though the only change was a deterministic and negligible decrease in that agent's demand. In sharp contrast, the allocation decision of our PPA policy barely changes, as the first agent deterministically receives $x^{\mathrm{PPA}}_1(d_1) = \frac{4/3-\epsilon}{4/3-\epsilon+(0.5)4/3} = \frac{4-3\epsilon}{6-3\epsilon}$.
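The sensitivity described in Example 1 is easy to reproduce numerically. The sketch below is an illustration under stated assumptions: it uses a plain grid search instead of solving the DP, it takes the PPA allocation to be $\min\{d_1, s_1 d_1 / (d_1 + \mu_2)\}$ with $s_1 = 1$ and $\mu_2 = 0.5 \cdot \frac{4}{3}$, and all function names are hypothetical.

```python
def expected_min_fr(x1, d1, d2=4 / 3):
    """Expected minimum FR when agent 1 receives x1 and agent 2 gets the rest."""
    fr1 = min(1.0, x1 / d1)
    fr2 = min(1.0, (1.0 - x1) / d2)        # only relevant if agent 2 shows up
    return 0.5 * fr1 + 0.5 * min(fr1, fr2)

def best_first_allocation(d1, steps=100_000):
    """Grid search for the first allocation maximizing the expected min FR."""
    cap = min(1.0, d1)
    grid = (cap * k / steps for k in range(steps + 1))
    return max(grid, key=lambda x: expected_min_fr(x, d1))

def ppa_first_allocation(d1, expected_future_demand=0.5 * 4 / 3):
    """First PPA allocation with unit supply (assumed form of the policy)."""
    return min(d1, d1 / (d1 + expected_future_demand))

eps = 0.03
for d1 in (4 / 3 + eps, 4 / 3 - eps):
    print(d1, best_first_allocation(d1), ppa_first_allocation(d1))
```

With these inputs the optimal first allocation moves from roughly one half of the supply to all of it, while the PPA allocation shifts by about one percentage point.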

Suppose the planner has access to all the demand realizations $d$, and let $f_i$ be the minimum FR of the policy at the end of period $i - 1$, i.e., $f_1 = 1$, $f_i = \min\{f_{i-1}, \frac{x_{i-1}}{d_{i-1}}\}$. Clearly, this is true when $i = n$, as the optimal policy is to allocate as much supply as possible, i.e., $x^*_n = \min\{d_n, s_n\}$. This policy achieves a minimum FR of

$$f_{n+1} = \min\left\{f_n, \frac{x_n}{d_n}\right\} = \min\left\{f_n, \frac{\min\{d_n, s_n\}}{d_n}\right\} = \min\left\{f_n, \frac{s_n}{d_n}\right\}.$$

We now assume this is true for $i > k$. In that case, given an allocation to agent $k$ of $x_k$, the minimum FR at the end of period $k$ is given by $f_{k+1} = \min\{f_k, \frac{x_k}{d_k}\}$.

In this appendix section, we consider another class of non-adaptive policies, which we call fixed-allocation policies. A fixed-allocation policy is one which pre-determines an allocation $x_i$ for each agent $i \in [n]$. 39 The optimal fixed-allocation policy is the policy which, given a joint demand distribution $F$, pre-determines a vector of allocations $x = (x_1, x_2, \ldots, x_n)$ which maximizes $\mathbb{E}_{d \sim F}\left[\min_{i \in [n]} \frac{x_i}{d_i}\right]$.

Proposition 4 (Ex-post Fairness Guarantee of the Optimal Fixed-allocation Policy). Given a fixed number of agents n ∈ N and supply scarcity µ ∈ R ≥0 , the optimal fixed-allocation policy achieves an ex-post fairness guarantee of

We remark that the ex-post fairness guarantee κ fa (µ, n) tends to 0 as the number of agents n gets large.

This is in stark contrast to the guarantees provided by the PPA policy and the optimal TFR policy, which are lower-bounded by a constant regardless of the number of agents.

We prove this proposition by first showing that there exists a distribution where no fixed-allocation policy achieves ex-post fairness greater than $\kappa_{fa}(\mu, n)$, which thus serves as an upper bound on the ex-post fairness guarantee of the optimal fixed-allocation policy. We then show that there exists a fixed-allocation policy that achieves ex-post fairness of at least $\kappa_{fa}(\mu, n)$ for any demand distribution, which means the bound $\kappa_{fa}(\mu, n)$ is tight.

Upper bound: We prove the hardness result by considering two separate cases corresponding to nµ < 2 and nµ ≥ 2. For each case, we provide an instance of the problem under which any fixed-allocation policy obtains ex-post fairness no larger than κ fa (µ, n).

(i) If $n\mu < 2$, consider a joint demand distribution such that with probability $1 - \frac{n\mu}{2}$ there is no demand, and with probability $\frac{n\mu}{2}$ one agent chosen uniformly at random has demand $\frac{2}{n}$. In this case, with probability $1 - \frac{n\mu}{2}$, the minimum FR is 1, and with probability $\frac{n\mu}{2}$, the minimum FR is equal to the allocation of a randomly selected agent (which is at most $1/n$ in expectation) divided by the total demand $2/n$. Therefore, the expected minimum FR for this instance is upper-bounded by $1 - \frac{n\mu}{2} + \frac{n\mu}{2} \cdot \frac{1}{2} = 1 - \frac{n\mu}{4}$.

(ii) If $n\mu \ge 2$, consider a joint demand distribution where one agent chosen uniformly at random has demand equal to the expected total demand $\mu$. In this instance, the expected minimum FR is upper-bounded by the allocation of a randomly selected agent (which is at most $1/n$ in expectation) divided by the total demand $\mu$, i.e., by $\frac{1}{n\mu}$.

Taken together, these instances provide an upper bound on the expected minimum FR one can hope to achieve with a fixed-allocation policy. We then scale each instance by our normalization factor, namely $W = \min\{1, 1/\mu\}$, which provides an upper bound of $\kappa_{fa}(\mu, n)$ on the ex-post fairness guarantee (see Definition 3) of any fixed-allocation policy.

39 As explained in Footnote 19, we focus on deterministic policies without loss of generality.

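Lower bound: Consider a policy which allocates an equal amount of supply to each agent, i.e., $x_i = \frac{1}{n}$ for all $i \in [n]$. In that case, the minimum FR is lower-bounded by

$$\min\left\{1, \frac{1}{n \sum_{i \in [n]} d_i}\right\} \ge \begin{cases} 1 - \frac{n \sum_{i \in [n]} d_i}{4}, & n \sum_{i \in [n]} d_i \in [0, 2) \\ \frac{1}{n \sum_{i \in [n]} d_i}, & n \sum_{i \in [n]} d_i \in [2, +\infty) \end{cases}$$

We note that the right-hand side of the above inequality is convex in $\sum_{i \in [n]} d_i$. Therefore, using Jensen's inequality, the expected minimum FR must be at least

$$\begin{cases} 1 - \frac{n\mu}{4}, & n\mu \in [0, 2) \\ \frac{1}{n\mu}, & n\mu \in [2, +\infty) \end{cases}$$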

We then scale this lower bound on the expected minimum FR by our normalization factor, namely $W = \min\{1, 1/\mu\}$. This provides a lower bound on the ex-post fairness guarantee (see Definition 3) that is equal to $\kappa_{fa}(\mu, n)$. Thus, we have shown that $\kappa_{fa}(\mu, n)$ is a tight bound on the ex-post fairness guarantee of the optimal fixed-allocation policy.
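The tightness claim can be illustrated by evaluating the equal-split policy on the two hard instances from the upper-bound argument. The sketch below assumes the piecewise lower bound on the expected minimum FR stated in this proof (before scaling by $W$); the function names are illustrative.

```python
from fractions import Fraction

def jensen_lower_bound(n, mu):
    """Piecewise lower bound on the expected min FR of the equal-split policy."""
    z = n * Fraction(mu)
    return 1 - z / 4 if z < 2 else 1 / z

def equal_split_on_hard_instance(n, mu):
    """Exact expected min FR of x_i = 1/n on the matching hard instance."""
    mu = Fraction(mu)
    x = Fraction(1, n)
    if n * mu < 2:
        # With prob. n*mu/2 a single random agent demands 2/n; otherwise no demand.
        p, d = n * mu / 2, Fraction(2, n)
        return (1 - p) + p * min(Fraction(1), x / d)
    # A single random agent demands the whole expected demand mu.
    return min(Fraction(1), x / mu)

for n, mu in [(5, Fraction(1, 5)), (5, 2), (3, Fraction(1, 2))]:
    print(n, mu, equal_split_on_hard_instance(n, mu), jensen_lower_bound(n, mu))
```

On both instances the equal split attains exactly the lower bound, which is why the upper and lower bounds coincide.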

E. Missing Proofs of Section 3.4

E.1. Proofs of Claims in Lemma 2 (Section 3.4)

In this subsection, we present the proofs of the three claims which appear in the proof of Lemma 2.

E.1.1. Proof of Claim 1 (Section 3.4). Suppose we have a distribution $G$ such that $R_G(q)$ is not flat in $(Q_G(1), 1]$, that is, $\exists\, q_1, q_2 \in (Q_G(1), 1]$ such that $R_G(q_1) \ne R_G(q_2)$. Let $R_G(q)$ attain its maximum in $[Q_G(1), 1]$ at the quantile $\hat{q}$. Now consider a quantile $q' \in (Q_G(1), 1]$ such that $R_G(q') < R_G(\hat{q})$. Let $\delta \triangleq q' - Q_G(T_G(q') + \epsilon)$ be the total probability mass in the TFR interval $(T_G(q'), T_G(q') + \epsilon]$. Then pick a small enough $\epsilon > 0$ such that:

(i) $\epsilon + \delta + \epsilon \cdot \delta < R_G(\hat{q}) - R_G(q')$,

Now consider a distribution $\bar{G}$ that is generated from $G$ by moving all the $\delta$ probability mass in $(T_G(q'), T_G(q') + \epsilon]$ to the point $T_G(q') + \epsilon$. With this modification in the distribution, the EAFR of every TFR $\tau \in [0, T_G(q')] \cup (T_G(q') + \epsilon, 1]$ remains the same. Moreover, the maximum EAFR in the interval $(T_G(q'), T_G(q') + \epsilon]$ is achieved at $T_G(q') + \epsilon$. Given this target fill rate, the EAFR of the distribution $\bar{G}$ is equal to

$$R_{\bar{G}}(T_G(q') + \epsilon) = (T_G(q') + \epsilon)\left(Q_G(T_G(q') + \epsilon) + \delta\right) < T_G(q') \cdot Q_G(T_G(q') + \epsilon) + \epsilon + \delta + \epsilon \cdot \delta < R_G(q') + \left(R_G(\hat{q}) - R_G(q')\right) = R_G(\hat{q}).$$

Therefore, the maximum EAFR over all possible TFRs in $[0, 1]$ is the same for $G$ and $\bar{G}$, i.e., $\max_{\tau \in [0,1]} \tau(1 - \bar{G}(\tau)) = \max_{\tau \in [0,1]} \tau(1 - G(\tau))$.

Also, $\mathbb{E}_{v \sim \bar{G}}\left[\frac{1}{v}\right] \le \mathbb{E}_{v \sim G}\left[\frac{1}{v}\right] = \mu$. Therefore, dropping such a distribution $G$ from the feasible set in the outer optimization of eq. (16) does not change the infimum value, which proves Claim 1.

E.1.2. Proof of Claim 2 (Section 3.4). Suppose we have a distribution $G$ that has non-zero total mass in the interval $(1, +\infty)$. Now shift all the probability mass in $(1, +\infty)$ to $+\infty$. Let $\bar{G}$ be the resulting distribution. Note that the maximum EAFR among targets in $[0, 1]$ is the same for $G$ and $\bar{G}$, as the EAFR for any target in $[0, 1]$ remains the same. However, $\bar{\mu} = \mathbb{E}_{v \sim \bar{G}}\left[\frac{1}{v}\right] < \mathbb{E}_{v \sim G}\left[\frac{1}{v}\right] = \mu$, as we have moved the probability mass of $v$ towards larger values (equivalently, the probability mass of demand towards lower values).

By using the same trick as above in Appendix E.1.1, we conclude that dropping such a distribution G from the feasible set in the outer optimization of eq. (16) does not change the infimum value, which proves Claim 2.

E.1.3. Proof of Claim 3 (Section 3.4). Consider an inverse demand distribution $\bar{G}$ that satisfies the two constraints given by Claim 1 and Claim 2, namely (i) $R_{\bar{G}}(q) = R_{\bar{G}}(q')$ for all $q, q' \in [Q_{\bar{G}}(1), 1]$, and (ii) $\bar{G}(v) = \bar{G}(1)$ for all $v \in [1, +\infty)$. Let us define $\hat{q}$ such that $Q_{\bar{G}}(1) = \hat{q}$, or equivalently, $T_{\bar{G}}(\hat{q}) = 1$.

The EAFR curve for $\bar{G}$ by definition attains a value of $R_{\bar{G}}(\hat{q}) = \hat{q}$. Since $\bar{G}$ has a constant EAFR curve in the interval $[\hat{q}, 1]$, its EAFR curve must be constantly equal to $\hat{q}$ over that interval. Further, since the EAFR curve can also be expressed as $q \cdot \bar{G}^{-1}(1 - q)$, we must have $\bar{G}(\hat{q}/q) = 1 - q$ for all $q \in [\hat{q}, 1]$. Equivalently, using the change of variable $v = \hat{q}/q$, we must have $\bar{G}(v) = 1 - \hat{q}/v$ for all $v \in [\hat{q}, 1]$ (which implies $\bar{G}(\hat{q}) = 0$).

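Further, since the CDF $\bar{G}$ pushes all the probability mass in the interval $(1, +\infty)$ to $+\infty$, $\bar{G}$ must be constant in the interval $[1, +\infty)$. Thus, we have uniquely described $\bar{G}$, up to a constant $\hat{q}$:

$$\bar{G}(v) = \begin{cases} 0, & v \in [0, \hat{q}) \\ 1 - \frac{\hat{q}}{v}, & v \in [\hat{q}, 1] \\ 1 - \hat{q}, & v \in (1, +\infty) \end{cases}$$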

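For this distribution to have an expected demand of $\mu$, consider the corresponding CDF for demand $F : \mathbb{R}_{\ge 0} \to [0, 1]$ for the random variable $x \triangleq \frac{1}{v}$ where $v \sim \bar{G}$. We have:

$$F(x) = \begin{cases} \hat{q}, & x \in [0, 1) \\ \hat{q}\,x, & x \in [1, \frac{1}{\hat{q}}] \\ 1, & x \in (\frac{1}{\hat{q}}, +\infty) \end{cases}$$

As a result,

$$\mathbb{E}[x] = \int_0^{\infty} (1 - F(x))\,dx = (1 - \hat{q}) + \int_1^{1/\hat{q}} (1 - \hat{q}\,x)\,dx = \frac{1}{2}\left(\frac{1}{\hat{q}} - \hat{q}\right).$$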

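The unique solution to $\frac{1}{2}\left(\frac{1}{\hat{q}} - \hat{q}\right) = \mu$ satisfying $\hat{q} \ge 0$ is $\hat{q} = \frac{1}{\mu + \sqrt{\mu^2 + 1}}$. We highlight that $\bar{G}$ with $\hat{q} = \frac{1}{\mu + \sqrt{\mu^2 + 1}}$ is identical to the distribution $\hat{G}$ defined in eq. (15). This shows that $\hat{G}$ is the unique worst-case distribution.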
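The two defining properties of the worst-case distribution (a flat EAFR curve on $[\hat{q}, 1]$ and mean demand $\mu$) can be verified numerically. The sketch below assumes the piecewise form $\bar{G}(v) = 1 - \hat{q}/v$ on $[\hat{q}, 1]$ with the remaining mass $\hat{q}$ at $+\infty$, as derived in the proof; the function names and the midpoint integration rule are illustrative choices.

```python
import math

def q_hat(mu):
    """Solution of (1/2) * (1/q - q) = mu with q >= 0."""
    return 1.0 / (mu + math.sqrt(mu * mu + 1.0))

def g_bar(v, q):
    """CDF of the inverse demand v under the worst-case distribution."""
    if v < q:
        return 0.0
    if v <= 1.0:
        return 1.0 - q / v
    return 1.0 - q          # the remaining mass q sits at +infinity (zero demand)

def eafr(tau, q):
    """tau * (1 - G(tau)) for a target fill rate tau."""
    return tau * (1.0 - g_bar(tau, q))

def mean_demand_numeric(q, steps=200_000):
    """Midpoint rule for E[1/v]; the density on [q, 1] is q / v**2."""
    h = (1.0 - q) / steps
    total = 0.0
    for k in range(steps):
        v = q + (k + 0.5) * h
        total += (1.0 / v) * (q / (v * v)) * h
    return total            # the atom at +infinity contributes 1/v = 0

mu = 1.5
q = q_hat(mu)
print(q, mean_demand_numeric(q), [eafr(t, q) for t in (q, (q + 1) / 2, 1.0)])
```

Every target in $[\hat{q}, 1]$ yields the same value $\hat{q}$, and the numerically integrated mean demand agrees with $\mu$.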

If we target a fill rate of $\tau \in [0, 1]$, the expected minimum fill rate is at least

$$\tau \cdot \Pr\left[\sum_{i \in [n]} d_i \le \frac{1}{\tau}\right].$$

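By Cantelli's inequality (a one-sided version of Chebyshev's inequality), if the variance of $\sum_{i \in [n]} d_i$ is given by $\sigma^2$, then for any $\delta \ge 0$, $\Pr\left[\sum_{i \in [n]} d_i - \mu \ge \delta\right] \le \frac{\sigma^2}{\sigma^2 + \delta^2}$. Taking $\delta = \frac{1 - \tau\mu}{\tau}$ and letting $\tau = x/\mu$, we can lower bound the expected minimum fill rate with

$$\frac{x}{\mu}\left(1 - \frac{\sigma^2}{\sigma^2 + \mu^2 (1-x)^2 / x^2}\right) = \frac{x}{\mu} \cdot \frac{(1-x)^2}{(1-x)^2 + (\sigma/\mu)^2 x^2} \ge \frac{x}{\mu} \cdot \frac{(1-x)^2}{(1-x)^2 + c^2 x^2},$$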

where the inequality comes from the assumption that the coefficient of variation is at most c.

We now optimize over feasible thresholds (i.e., over the domain $x \in [0, \min\{1, \mu\}]$, which ensures $\delta \ge 0$ and $\tau \in [0, 1]$). The ex-post minimum fill rate of the optimal TFR policy is thus lower-bounded by

$$\max_{x \in [0, \min\{1, \mu\}]} \frac{x}{\mu} \cdot \frac{(1-x)^2}{(1-x)^2 + c^2 x^2}.$$

Scaling by the normalization factor of W = min{1, 1/µ} completes the proof of Proposition 2.
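The optimization over thresholds has no obvious closed form, but it is straightforward to evaluate on a grid. The sketch below assumes the Cantelli-based bound $\frac{x}{\mu} \cdot \frac{(1-x)^2}{(1-x)^2 + c^2 x^2}$, which is reconstructed here from the steps of the proof and not quoted from the paper; the function names and the grid resolution are illustrative.

```python
def tfr_lower_bound(x, mu, c):
    """Cantelli-based lower bound on the expected min FR for tau = x / mu."""
    tail = (1.0 - x) ** 2
    denom = tail + (c * x) ** 2
    if denom == 0.0:            # x = 1 with c = 0: fall back to the trivial bound
        return 0.0
    return (x / mu) * tail / denom

def best_tfr_bound(mu, c, steps=10_000):
    """Grid search over x in [0, min(1, mu)]; returns (x, tau, bound)."""
    hi = min(1.0, mu)
    grid = (hi * k / steps for k in range(steps + 1))
    x = max(grid, key=lambda v: tfr_lower_bound(v, mu, c))
    return x, x / mu, tfr_lower_bound(x, mu, c)

for c in (0.25, 0.5, 1.0):
    print(c, best_tfr_bound(2.0, c))
```

As expected, the bound weakens as the coefficient of variation $c$ grows, and for deterministic under-demanded instances ($c = 0$, $\mu < 1$) it recovers a fill rate of 1.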

We prove this theorem by first providing two instances which together show that no policy achieves ex-ante fairness greater than $\kappa_a(\mu, n)$. We then show that our PPA policy achieves ex-ante fairness of at least $\kappa_a(\mu, n)$ for any demand distribution, which means the bound is tight.

Upper bound: We prove the hardness result by considering two separate cases corresponding to µ < 2 and µ ≥ 2. For each case, we provide an instance of the problem under which no policy can obtain ex-ante fairness larger than κ a (µ, n).

(i) If $\mu < 2$, consider a joint demand distribution for an arbitrary number of agents such that with probability $1 - \frac{\mu}{2}$ there is no demand, and with probability $\frac{\mu}{2}$ the total demand is 2 (arbitrarily and deterministically split among agents). In this case, with probability $1 - \frac{\mu}{2}$, each agent achieves an FR of 1, and with probability $\frac{\mu}{2}$, the expected FR cannot exceed $\frac{1}{2}$ for at least one agent. Therefore, in this instance the minimum expected FR is upper-bounded by $1 - \frac{\mu}{2} + \frac{\mu}{2} \cdot \frac{1}{2} = 1 - \frac{\mu}{4}$.

(ii) If $\mu \ge 2$, consider a deterministic demand distribution where total demand is equal to its expectation $\mu$ (arbitrarily split among $n$ agents). In this case, the minimum expected FR is clearly upper-bounded by $\frac{1}{\mu}$.

Taken together, these instances provide an upper bound on the minimum expected FR one can hope to achieve with any policy. We then scale each instance by our normalization factor, namely $W = \min\{1, 1/\mu\}$, which provides an upper bound of $\kappa_a(\mu, n)$ on the ex-ante fairness guarantee (see Definition 3) of any policy.

Lower bound: We now show that the PPA policy achieves an ex-ante fairness guarantee of $\kappa_a(\mu, n)$.

First, we show via induction that after the arrival of any agent $i \in [n]$, the ratio $\frac{d_i+\mu_{i+1}}{s_i}$, i.e., expected total remaining demand to remaining supply, is in expectation at most $\mu$. We then place a lower bound on the expected FR of agent $i$ when following the PPA policy which is a decreasing and convex function of the ratio $\frac{d_i+\mu_{i+1}}{s_i}$. We conclude by applying Jensen's inequality to lower bound the expected FR of agent $i$. Since this lower bound is the same for each agent, it constitutes a lower bound on the minimum expected FR.

Claim 6 (Upper Bound on Demand-to-Supply Ratio). When following the PPA policy, for all $i \in [n]$, $\mathbb{E}_{\mathbf{d}\sim\mathbf{F}}\left[\frac{d_i+\mu_{i+1}}{s_i}\right] \le \mu$.

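Proof: We proceed by induction. Clearly, when $i = 1$, $\mathbb{E}_{\mathbf{d}\sim\mathbf{F}}\left[\frac{d_1+\mu_2}{s_1}\right] = \mathbb{E}_{\mathbf{d}\sim\mathbf{F}}\left[\frac{\mu_1}{s_1}\right] = \mu$. We now assume that the claim holds for $i = k$ and prove it for $i = k + 1$. According to the PPA policy, $x_k \le s_k \frac{d_k}{d_k+\mu_{k+1}}$. Thus, $s_{k+1} = s_k - x_k$ is at least $s_k \frac{\mu_{k+1}}{d_k+\mu_{k+1}}$. Consequently,

$$\mathbb{E}_{\mathbf{d}\sim\mathbf{F}}\left[\frac{d_{k+1}+\mu_{k+2}}{s_{k+1}}\right] \le \mathbb{E}_{\mathbf{d}\sim\mathbf{F}}\left[\frac{d_{k+1}+\mu_{k+2}}{\mu_{k+1}}\cdot\frac{d_k+\mu_{k+1}}{s_k}\right] = \mathbb{E}_{\mathbf{d}\sim\mathbf{F}}\left[\frac{d_k+\mu_{k+1}}{s_k}\right] \le \mu.$$

The equality holds because, conditional on the first $k$ demands, the quantities $d_k$, $s_k$, and $\mu_{k+1}$ are fixed and the expectation of $d_{k+1}+\mu_{k+2}$ equals $\mu_{k+1}$. The final inequality comes from our inductive hypothesis, which completes the proof by induction.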
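Claim 6 can be checked by exact enumeration on a small example. The sketch below assumes independent and identically distributed demands on a finite support, so that the conditional expected future demand $\mu_{i+1}$ is simply the per-agent mean times the number of agents still to arrive. This independence assumption and the helper name are illustrative only; the claim itself covers arbitrarily correlated demands.

```python
from itertools import product
from typing import List, Sequence


def expected_ratios(support: Sequence[float], probs: Sequence[float],
                    n: int, supply: float = 1.0) -> List[float]:
    """Return E[(d_i + mu_{i+1}) / s_i] for i = 1..n under the PPA policy,
    enumerating every demand path of n i.i.d. agents."""
    mean = sum(v * p for v, p in zip(support, probs))
    totals = [0.0] * n
    for path in product(range(len(support)), repeat=n):
        weight = 1.0
        for idx in path:
            weight *= probs[idx]
        remaining = supply
        for i, idx in enumerate(path):
            d = support[idx]
            mu_next = mean * (n - i - 1)  # expected demand of later agents
            totals[i] += weight * (d + mu_next) / remaining
            if d > 0:
                # PPA fill rate: min{1, s_i / (d_i + mu_{i+1})}
                remaining -= d * min(1.0, remaining / (d + mu_next))
    return totals
```

For three agents whose demands are 0 or 1 with equal probability (so $\mu = 1.5$), every entry of `expected_ratios([0.0, 1.0], [0.5, 0.5], 3)` stays at or below 1.5, in line with the claim.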

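Given a current demand $d_i$ and expected future demand $\mu_{i+1}$, the FR of agent $i$ is $\min\left\{1, \frac{s_i}{d_i+\mu_{i+1}}\right\}$. It is straightforward to show that this FR is lower-bounded by the following function of the ratio $r_i = \frac{d_i+\mu_{i+1}}{s_i}$:

$$h(r) = \begin{cases} 1 - \frac{r}{4}, & r < 2, \\ \frac{1}{r}, & r \ge 2. \end{cases}$$

We remark that $h$ is a decreasing and convex function of its argument. Hence, by Jensen's inequality and Claim 6,

$$\mathbb{E}_{\mathbf{d}\sim\mathbf{F}}\left[\min\left\{1, \frac{s_i}{d_i+\mu_{i+1}}\right\}\right] \ge \mathbb{E}_{\mathbf{d}\sim\mathbf{F}}\left[h(r_i)\right] \ge h\left(\mathbb{E}_{\mathbf{d}\sim\mathbf{F}}[r_i]\right) \ge h(\mu).$$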

Since this lower bound on the expected FR holds for each agent $i$, we have shown a lower bound on the minimum expected FR when following the PPA policy. When scaled by our normalization factor, namely $W = \min\{1, 1/\mu\}$, this lower bound exactly matches the upper bound of $\kappa_a(\mu, n)$ established above, and thus completes the proof of Theorem 4.
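To illustrate the matching bounds, the sketch below runs the PPA allocation rule on hard instance (i) and compares the result with a convex, decreasing minorant of $\min\{1, 1/r\}$. The piecewise function `h` is inferred from the two upper-bound instances ($1 - r/4$ below 2, $1/r$ above) and should be read as an assumption, as should the convention that an agent with zero demand has an FR of 1 and the helper names.

```python
from typing import List, Sequence


def h(ratio: float) -> float:
    """Assumed convex, decreasing lower bound on min{1, 1/ratio}."""
    return 1.0 - ratio / 4.0 if ratio < 2.0 else 1.0 / ratio


def ppa_fill_rates(demands: Sequence[float], future_means: Sequence[float],
                   supply: float = 1.0) -> List[float]:
    """Fill rates under PPA. future_means[i] is the conditional expected
    demand of the agents arriving after agent i (mu_{i+1} in the text)."""
    remaining = supply
    rates = []
    for d, mu_next in zip(demands, future_means):
        if d <= 0:
            rates.append(1.0)  # zero demand is treated as fully served
            continue
        fr = min(1.0, remaining / (d + mu_next))
        remaining -= fr * d
        rates.append(fr)
    return rates


def expected_fill_rates_two_point(mu: float, split: Sequence[float]) -> List[float]:
    """Expected FR per agent on instance (i): no demand w.p. 1 - mu/2,
    demands equal to `split` (summing to 2) w.p. mu/2. Requires mu < 2."""
    p_high = mu / 2.0
    # Once a positive demand is observed the scenario is known, so the
    # conditional future mean is the exact remaining demand.
    future = [sum(split[i + 1:]) for i in range(len(split))]
    high = ppa_fill_rates(split, future)
    return [(1.0 - p_high) * 1.0 + p_high * fr for fr in high]
```

For example, with `mu = 1.0` and `split = [0.5, 1.0, 0.5]`, every agent receives an FR of one half in the high-demand scenario, so each expected FR coincides with `h(1.0)`.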

G. Missing Proofs of Section 5.2

G.1. Proof of Corollary 1

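Suppose the allocation given by the PPA policy is $\mathbf{x}$. By Theorem 2, the PPA policy achieves ex-post fairness of $\kappa_p(\mu, n)$ when the social welfare function is the minimum FR, which implies $\mathbb{E}_{\mathbf{d}\sim\mathbf{F}}\left[U_{+\infty}(\mathbf{x})\right] \ge \kappa_p(\mu, n)W$, where $U_{+\infty}(\mathbf{x}) = \min_{i\in[n]}\left\{\frac{x_i}{d_i}\right\}$. Now consider a new allocation vector $\mathbf{x}'$ such that for all $i \in [n]$,

$$x'_i = d_i \min_{j\in[n]}\left\{\frac{x_j}{d_j}\right\}.$$

We remark that $x_i \ge x'_i$ for all $i \in [n]$. Further, it is easy to verify that $U_\alpha(\mathbf{x}') = U_{+\infty}(\mathbf{x})$ for any $\alpha \in [0, +\infty)$. Since $U_\alpha(\mathbf{x})$ is non-decreasing in each $x_i$, $U_\alpha(\mathbf{x}) \ge U_\alpha(\mathbf{x}') = U_{+\infty}(\mathbf{x})$.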

Therefore, the expected WPM social welfare when following the PPA policy is weakly greater than the expected minimum FR (regardless of the fairness parameter $\alpha$). Scaling by the achievable social welfare when demand is deterministic, we have shown that ex-post fairness, i.e., $\mathbb{E}_{\mathbf{d}\sim\mathbf{F}}\left[U_\alpha(\mathbf{x})\right]/W$, must be at least $\kappa_p(\mu, n)$.
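The comparison step in this proof can be illustrated numerically. The sketch below assumes, purely for illustration, that $U_\alpha$ is an equal-weight power mean of strictly positive fill rates with exponent $1 - \alpha$ (the geometric mean at $\alpha = 1$). This section does not restate the definition of the WPM welfare function, so `power_mean_welfare` is a hypothetical stand-in; the argument only needs a function that is non-decreasing in each fill rate and returns the common value when all fill rates are equal.

```python
import math
from typing import List, Sequence


def power_mean_welfare(fill_rates: Sequence[float], alpha: float) -> float:
    """Equal-weight power mean with exponent 1 - alpha (assumed form of U_alpha).
    Requires strictly positive fill rates."""
    n = len(fill_rates)
    if math.isclose(alpha, 1.0):
        # Limit of the power mean as the exponent tends to zero.
        return math.exp(sum(math.log(f) for f in fill_rates) / n)
    p = 1.0 - alpha
    return (sum(f ** p for f in fill_rates) / n) ** (1.0 / p)


def equalized(fill_rates: Sequence[float]) -> List[float]:
    """Fill rates of the auxiliary allocation in which every agent
    receives the minimum fill rate."""
    worst = min(fill_rates)
    return [worst] * len(fill_rates)
```

For any `alpha`, `power_mean_welfare(equalized(frs), alpha)` returns `min(frs)` up to rounding, while `power_mean_welfare(frs, alpha)` is at least as large, mirroring the chain of inequalities above.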

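Theorem 2 establishes that the PPA policy guarantees an expected minimum FR for resource $j$ of at least $\frac{\kappa_p\left(\mu_j/s_j,\, n\right)}{\max\{1,\, \mu_j/s_j\}}$. Because the minimum of a weighted sum is at least the weighted sum of the minima, we can place a lower bound on the expected minimum weighted FR:

$$\sum_{j\in[m]} \lambda_j \frac{\kappa_p\left(\mu_j/s_j,\, n\right)}{\max\{1,\, \mu_j/s_j\}}.$$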

Consider a correlated distribution for demands (correlated across agents and resource types) where the marginal distribution of demands for each resource type matches the worst-case joint distribution of the single-type problem described in the proof of Theorem 1. These marginal distributions are then coupled such that the last-arriving agent with non-zero demand for resource $j$ is the same as the last-arriving agent with non-zero demand for resource $j'$, for any resources $j, j' \in [m]$ for which at least one agent has non-zero demand. Given the marginal distributions, each agent is equally likely to be this last-arriving agent. For any sample path drawn from this distribution, it is without loss of generality to only consider policies where the allocation is decreasing (i.e., where this last-arriving agent has the worst FR) for every resource.$^{40}$

If supply of resource $j$ is $s_j$, Theorem 1 establishes an upper bound of $\frac{\kappa_p\left(\mu_j/s_j,\, n\right)}{\max\{1,\, \mu_j/s_j\}}$ on the expected minimum FR of any policy in the single-type problem corresponding to resource $j$. Since for every sample path the agent with the worst FR (i.e., the last-arriving agent) is the same across resources, aggregating these bounds establishes an upper bound of $\sum_{j\in[m]} \lambda_j \frac{\kappa_p\left(\mu_j/s_j,\, n\right)}{\max\{1,\, \mu_j/s_j\}}$ on the expected minimum weighted FR.

40 For intuition as to why this is without loss of generality, note (i) each agent with non-zero demand for resource $j \in [m]$ has identical demand for that resource, (ii) each agent linearly aggregates their FRs by the same weights $\{\lambda_j\}_{j\in[m]}$, and (iii) each agent is equally likely to be the last-arriving agent. Consequently, if agent $i'$ has a strictly larger allocation than agent $i$ (where $i < i'$) for any resource $j$, a policy which switches their allocations for that resource will have a weakly greater expected minimum weighted FR.

The authors would like to thank Itai Ashlagi, Amin Saberi, and Ed Kaplan for helpful comments and insights at early stages of this work.