key: cord-0474970-gsq6rngl
authors: Carmona, Rene; Dayanikli, Gokce; Lauriere, Mathieu
title: Mean Field Models to Regulate Carbon Emissions in Electricity Production
date: 2021-02-18
journal: nan
DOI: nan
sha: 385042e56e3f6b449bc7ec032677f3b385366b00
doc_id: 474970
cord_uid: gsq6rngl

The most serious threat to ecosystems is the global climate change fueled by the uncontrolled increase in carbon emissions. In this project, we use mean field control and mean field game models to analyze and inform the decisions of electricity producers on how much renewable sources of production ought to be used in the presence of a carbon tax. The trade-off between higher revenues from production and the negative externality of carbon emissions is quantified for each producer who needs to balance in real time reliance on reliable but polluting (fossil fuel) thermal power stations versus investing in and depending upon clean production from uncertain wind and solar technologies. We compare the impacts of these decisions in two different scenarios: 1) the producers are competitive and hopefully reach a Nash Equilibrium; 2) they cooperate and reach a Social Optimum. We first prove that both problems have a unique solution using forward-backward systems of stochastic differential equations. We then illustrate with numerical experiments the producers' behavior in each scenario. We further introduce and analyze the impact of a regulator in control of the carbon tax policy, and we study the resulting Stackelberg equilibrium with the field of producers.

Nowadays, it is widely accepted that the most serious threat to ecosystems is the global warming fueled by the uncontrolled increase in carbon emissions, and for the last twenty-some years, starting with the Kyoto Protocol in 1997, international treaties have sprung up in the hope of addressing this negative externality. The most recent of these treaties is the Paris Agreement, with 196 signatories, which aims at keeping the increase in temperature below 2°C. Throughout the world, local and federal governments try to disincentivize reliance on polluting means of production by introducing carbon taxes or cap-and-trade programs. In the latter case, regulators put a limit on the allowable quantity of Greenhouse Gas (GHG) emissions, any quantity above this limit having to be covered by emission certificates (allowances) or the payment of a penalty. In the former case, whether they are levied upstream or downstream, carbon taxes aim at penalizing the use of fossil fuels for their carbon content. The interested reader is referred to [12, 13, 11] for a review of the state of affairs in the early days of the European Union Emission Trading System, and mathematical treatments of thorough partial equilibrium models for the comparison of realistic implementations of these policies in the electricity sector.

According to the Environmental Protection Agency, electricity production claims the lion's share (25%) of the total Greenhouse Gas emissions in the US. So here, we concentrate on the electricity sector and we propose a model for the analysis of the impact of investments in clean means of production (e.g. solar and wind). While a model of the electricity sector should comprise at least three types of agents (electricity producers, resellers / retailers, and the end-users), we shall concentrate our modeling effort on the producers. Until the challenging technological problem of electricity storage is resolved at a larger scale, the demand for this commodity remains inelastic, and we shall penalize the producers for not matching the demand, which forces the Independent System Operator (ISO) to rely on costly reserves. In the following, we shall use the term renewable to mean electricity produced from wind turbines or solar panels. Alternatively, we shall use the term non-renewable to mean electricity produced by burning fossil fuels like coal, crude oil or natural gas. We chose this convention for convenience, even if this literary license is not completely accurate.

Individual producers control over time their usage of fossil fuels, and hence, the amount of CO$_2$ emissions they are responsible for. They also control their possible investment in solar or wind production, should they decide to go that route. Notice that while the decision to use fossil fuels changes over time, the investment in solar panels or wind turbines is a one-time decision made at the beginning of the time period under consideration. In our model, producing electricity from renewable sources involves an initial investment and no extra cost over time since the marginal cost of running these production assets is practically zero (apart from maintenance costs and possible subsidies, which we ignore here). While the zero cost of production is an attractive feature, it comes with very high risks due to the difficulty of predicting the weather and the uncertainty associated with the high volatility of these predictions. On the other hand, production from traditional power plants is more predictable, the costs depending upon the prices of the fuels and the price put on the CO$_2$ emissions by the regulator. Each producer has to find the right balance between the pros and the cons of the two major means of production we single out in our stylized model. The overarching goal is to decarbonize so as to meet emission targets, harnessing demand-side policies through the establishment of a tax, as well as supply-side resources including wind and solar production technologies.

Our economic model is based on the premises that the individual producers and the regulator have only access to aggregate quantities. Basically, they only have access to the statistical distributions of the productions, emissions, investments, etc of the individual producers. As a result, we propose two separate frameworks for the individual producers to optimize the mix of renewable and nonrenewable production they should include in their portfolios. We compute and compare the optimal centralized strategies by solving mean field control problems, and the optimal decentralized strategies by solving mean field game problems. Our theoretical analysis relies on the probabilistic approach to construct forward-backward stochastic differential equation (FBSDE) systems for which we show, in both settings, existence and uniqueness of the solutions. Further, we propose a numerical approach to monitor the effect of a carbon tax on the optimal and equilibrium decisions in both cases. Quantifying the differences between the two approaches is reminiscent of what is known as the Price of Anarchy (PoA).

Among the conclusions drawn from the analysis of our model, we confirm that a carbon tax is an effective incentive for the use of renewables. Also intuitive is the fact that in the absence of a carbon tax, the overall pollution is greater when producers compete than when they cooperate. Less obvious is the fact that cooperating producers will pollute less than when they compete, even if the carbon tax is significant. We also show that stricter regulations tend to reduce the differences between competitive and cooperative equilibria. Further, we argue that the best way for the regulator to encourage producers to match the demand is to incentivize competition over cooperation among the producers.

Mean Field Game (MFG) models appeared simultaneously and independently in the original works of [9] and [23]. The thrust of these works was to propose a paradigm to overcome the challenges of the search for Nash equilibria in large games by considering models for which the interactions between the players were of a mean field type, and deriving effective equations in the limit when the number of players goes to infinity. Models in which a single player plays a different role from the field of remaining players were introduced and studied under the name of MFGs with major and minor players. In their Stackelberg version, they had a significant impact on problems in economic contract theory. See for example [7], [26], [18], [14], [17] or [5]. Notice that in these models, the major player uses a time dependent control, while in this paper, we shall assume that the regulator uses time independent controls.

Using mean field models for energy applications is very natural. Competition in the oil industry and the impact of the renewable energy competition were analyzed in [19] and [15]. The early work [19] was extended with the addition of a regulator in [1]. In [3], optimal entry and exit times for two types of agents, electricity producers using either renewable or nonrenewable energy resources, are analyzed using MFGs. Competition among electricity producers is analyzed in [16] by using a Mean Field Type Game where the mean field interactions come through the conditional expectation of the electricity price, and in [4] by using a model where the interactions enter the electricity spot price. In [2] and [17], electricity consumers constitute the mean field population and a single electricity producer plays the role of the principal, in contrast to our model where we take the electricity producers as the mean field population and the regulator as the principal.

Mean field models have also been used to model environmental impacts. In [6], an MFG model is proposed to model climate change negotiations among countries interacting through a CO$_2$ emission permit market. Emission certificate markets are also studied in [27] and [28], again without the presence of a regulator.

The paper is structured as follows. In Section 2, we introduce the minor players' model and the various equilibrium notions used in the sequel. In Section 3 (resp. 4), the main theoretical (resp. numerical) results for the minor players' model are given. In Section 5, we introduce the regulator and define the relevant notions of equilibrium. Finally, we provide numerical results for the combined model with minor players and the regulator in Section 6 and we summarize our findings in a short Section 7.

Although we will focus on mean field limits involving an infinite number of players, we start with the description of what the $N$-player version of the game would be. For symmetry reasons, we assume that the total electricity demand is split equally between all the agents, and each agent faces the same demand, say $D_t$ at time $t$. The state of producer $i$ is five-dimensional: instantaneous electricity production $Q^i_t \in \mathbb{R}_+$, instantaneous irradiance $S^i_t \in \mathbb{R}_+$, instantaneous emission level $E^i_t \in \mathbb{R}_+$, cumulative pollution $P^i_t \in \mathbb{R}_+$, and instantaneous nonrenewable energy production $\tilde N^i_t \in \mathbb{R}_+$. Producer $i$ controls their state by choosing at time $t = 0$ their initial investment $R^i_e \in \mathbb{R}$ in renewable production assets (e.g. the number of solar panels they purchase), and at each subsequent time $t$, by choosing the rate of change $N^i_t \in \mathbb{R}$ in nonrenewable energy production. Notice that $N^i_t$ is time dependent while $R^i_e$ is time independent. This will be a challenging feature of the mathematical analysis of our model.

Remark 2.1. For the sake of definiteness, we use the terminology of solar power production. However, other types of renewable energy can be modelled in a similar way. For example, for wind power, $S^i_t$ would stand for the instantaneous output of a wind farm and $R_e$ would be the corresponding units of initial investment.

So with these provisos out of the way, we define the time evolution of the state of producer $i$ as:

The instantaneous electricity production changes depend on the instantaneous nonrenewable energy usage (given by term 1) and the instantaneous yield from the renewable energy investment (given by term 2). This second term includes a seasonality component (sinusoidal term) and a random shock for the variability of the sun irradiance. The form of the seasonality component was chosen for the sake of simplicity. It can easily be extended to several harmonics to include nightly and daily, monthly and yearly effects. In any case, we have $Q^i_t = Q^i_0 + \kappa_1 \tilde N^i_t + \kappa_2 R^i_e (\sin(\alpha t) + S^i_t)$, where $\kappa_1, \kappa_2 > 0$ are constants that give the efficiency of the production from nonrenewable and renewable energy, respectively. The constant $\alpha > 0$ gives the period of the seasonality of the renewable energy.

We model the idiosyncratic noise terms $S^i_t$ in the renewable productions as independent stationary processes. For the sake of definiteness, we assume that they are Ornstein-Uhlenbeck processes with the same mean $\theta > 0$ and volatility $\sigma_0 > 0$, the $\widetilde W^i$ being independent Wiener processes. The dynamics of the instantaneous emissions $E^i_t$ have two components: the contribution from the production from nonrenewable energy power plants, and idiosyncratic random shocks with constant volatility $\sigma_1 > 0$ given by independent Wiener processes $W^i$, also independent of the $\widetilde W^i$'s. The choice of the constant $\delta$ could include the effects of some abatement measures such as carbon capture, sequestration and the use of filters.
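To make the verbal description of the dynamics concrete, here is a minimal Euler-Maruyama sketch of a single producer's state. Several choices are assumptions made only for illustration and are not taken from the paper: a unit mean-reversion speed for the Ornstein-Uhlenbeck irradiance, an emission drift equal to $\delta \tilde N_t$, cumulative pollution obtained by integrating emissions, a control $N_t$ held at zero, and all numerical parameter values. The production $Q_t$ uses the closed-form expression quoted in the text.

```python
import numpy as np

def simulate_producer(T=10.0, n_steps=2000, R_e=2.0, kappa1=1.0, kappa2=0.1,
                      alpha=2.0 * np.pi, theta=5.0, sigma0=0.5, sigma1=0.2,
                      delta=0.15, Q0=0.0, N_tilde0=1.0, seed=0):
    """Euler-Maruyama sketch of one producer's state (illustrative values only)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    N = np.zeros(n_steps)  # ramping control N_t, held at zero in this sketch

    S = np.empty(n_steps + 1)        # irradiance (Ornstein-Uhlenbeck)
    E = np.empty(n_steps + 1)        # instantaneous emissions
    P = np.empty(n_steps + 1)        # cumulative pollution
    N_tilde = np.empty(n_steps + 1)  # nonrenewable production level
    S[0], E[0], P[0], N_tilde[0] = theta, 0.0, 0.0, N_tilde0

    for k in range(n_steps):
        dW_tilde, dW = rng.normal(0.0, np.sqrt(dt), size=2)
        # ASSUMPTION: mean-reversion speed equal to 1
        S[k + 1] = S[k] + (theta - S[k]) * dt + sigma0 * dW_tilde
        N_tilde[k + 1] = N_tilde[k] + N[k] * dt
        # ASSUMPTION: emission drift proportional to nonrenewable production
        E[k + 1] = E[k] + delta * N_tilde[k] * dt + sigma1 * dW
        # ASSUMPTION: pollution accumulates the emissions
        P[k + 1] = P[k] + E[k] * dt

    # Closed-form production from the text
    Q = Q0 + kappa1 * N_tilde + kappa2 * R_e * (np.sin(alpha * t) + S)
    return {"t": t, "Q": Q, "S": S, "E": E, "P": P, "N_tilde": N_tilde}

paths = simulate_producer()
```

With the control held at zero, the only variability in `paths["Q"]` comes from the seasonal term and the irradiance shocks, both scaled by the renewable investment `R_e`.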

Using the notation $\tilde N^i_t$ for the instantaneous nonrenewable energy production given by $\tilde N^i_t = \tilde N^i_0 + \int_0^t N^i_s \, ds$, the expected cost to producer $i$ over the whole period is: (2.1)

where $\bar Q = \sum_{j=1}^N Q^j / N$ and $p : \mathbb{R}_+ \to \mathbb{R}_+$ is the price function for the investment in renewable energy. Term 1, with $c_1 > 0$, is a penalty (i.e., a delay cost) for attempting to ramp nonrenewable energy power plants up and down too quickly. Term 2 represents the costs of the fossil fuels used in nonrenewable power plants. The constant $p_1 > 0$ can be understood as the average cost of one unit of fossil fuel. In lieu of storage, which is not included in our models because of its scarcity, Term 3, with $c_2 > 0$, imposes a penalty on producers for not matching the demand and forcing the system operator to use costly reserves. Term 4 represents the revenues from electricity production, $\rho_0 + \rho_1 (D_t - \bar Q_t)$ being the inverse demand function, which is assumed to be linear in excess demand or supply. Here $\rho_0$ and $\rho_1$ are strictly positive constants. The inverse demand function captures the fact that the price increases if there is excess demand, and decreases if there is excess supply. We assume that the producers sell what they produce. This term introduces the mean field interactions into the model. Term 5 gives the carbon tax levied by the regulator. We emphasize its role by assuming it is proportional to the square of the terminal pollution. Term 6 is the total cost related to the initial investment in renewable electricity production, including the price of the solar panels and the cost of the land used.
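As a reading aid for the terms described above, the following sketch evaluates the running cost and the terminal carbon tax at a single time. The exact functional forms are assumptions inferred from the verbal description (a quadratic ramping penalty, a fuel cost linear in the nonrenewable production level, a quadratic mismatch penalty, and a tax equal to $\tau$ times the squared terminal pollution); the function names and the sample values are hypothetical.

```python
def running_cost(N, N_tilde, Q, Q_bar, D, c1, p1, c2, rho0, rho1):
    """Running cost at one time instant (assumed functional forms)."""
    ramping = c1 * N ** 2                # Term 1: penalty for fast ramping
    fuel = p1 * N_tilde                  # Term 2: fossil fuel cost
    mismatch = c2 * (D - Q) ** 2         # Term 3: penalty for missing the demand
    price = rho0 + rho1 * (D - Q_bar)    # linear inverse demand function
    revenue = price * Q                  # Term 4: revenue (enters with a minus sign)
    return ramping + fuel + mismatch - revenue

def terminal_tax(P_T, tau):
    """Term 5: carbon tax, assumed equal to tau times squared terminal pollution."""
    return tau * P_T ** 2

# Hypothetical values: production slightly below demand, field above own output
example_cost = running_cost(N=0.0, N_tilde=100.0, Q=90.0, Q_bar=95.0, D=100.0,
                            c1=1.0, p1=7.0, c2=0.5, rho0=40.0, rho1=0.1)
example_tax = terminal_tax(P_T=2.0, tau=10.0)
```

The mean field enters only through `Q_bar` in the price: a producer's own output `Q` does not move the price in the game setting, which is what separates the competitive and cooperative problems discussed later.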

In discussing the mean field regime of the model, we focus on a representative producer interacting with the field of the other producers, so we drop the superscript i and the dynamics equations become:

where $W$ and $\widetilde W$ are independent Wiener processes. Accordingly, the total expected cost becomes:

where $\bar Q_t = \mathbb{E}[Q_t]$. We shall sometimes use the notation $\bar Q_t(N, R_e)$ to emphasize the fact that the expectation is computed under the state dynamics controlled by the admissible control $(N, R_e)$.

We consider two different models: mean field game (MFG) and mean field control (MFC). In the mean field game model, producers behave competitively and minimize their total expected costs (search for their best responses) given the other players' decisions. A Nash equilibrium is then characterized as a fixed point of the best response map so defined. In the sequel, we restrict our attention to admissible strategies $(N, R_e)$ such that $\mathbb{E}\big[\int_0^T |N_t|^2 \, dt\big] < +\infty$ and $R_e \in \mathbb{R}_+$.

Definition 2.2 (MFG Nash Equilibrium). An admissible strategy and mean field flow tuple, $(\hat N, \hat R_e, \bar Q)$, is called an MFG Nash equilibrium if for any admissible $(N, R_e)$, we have:

$$C\big((N, R_e); \bar Q\big) \geq C\big((\hat N, \hat R_e); \bar Q\big), \quad \text{and} \quad \bar Q = \bar Q(\hat N, \hat R_e).$$

In the mean field control case, we assume that the producers cooperate and leave the choice of the control to a social planner minimizing the total expected cost as defined in (2.3). In a realistic setup, the producers can be thought of as the production facilities of a monopolistic electricity production firm, and the social planner's decisions refer to the decisions taken by the headquarters. In this case, if one player changes their behavior, every player changes in the same way, and the mean field is affected. The problem is now an optimal control problem.

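Definition (MFC Optimum). An admissible strategy and mean field flow tuple, $(\hat N, \hat R_e, \bar Q)$, is called an MFC optimum if for any admissible $(N, R_e)$, we have:

$$C\big((N, R_e); \bar Q(N, R_e)\big) \geq C\big((\hat N, \hat R_e); \bar Q\big), \quad \text{and} \quad \bar Q = \bar Q(\hat N, \hat R_e).$$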
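The difference between the two definitions can be seen in a deliberately simplified static example, which is not the paper's model: a single quantity $q$, no dynamics, no carbon tax, and a cost made of the mismatch penalty minus the revenue. In the game, the first-order condition is taken with the mean field $\bar q$ frozen and the consistency condition $\bar q = q$ is imposed afterwards; in the control problem, $\bar q = q$ is substituted before optimizing. All names and parameter values below are hypothetical.

```python
def cost(q, q_bar, D, c2, rho0, rho1):
    """Static toy cost: mismatch penalty minus revenue at the mean field price."""
    price = rho0 + rho1 * (D - q_bar)
    return c2 * (D - q) ** 2 - price * q

def mfg_quantity(D, c2, rho0, rho1):
    # First-order condition in q with q_bar frozen, then impose q_bar = q
    return (2 * c2 * D + rho0 + rho1 * D) / (2 * c2 + rho1)

def mfc_quantity(D, c2, rho0, rho1):
    # Substitute q_bar = q first, then take the first-order condition
    return (2 * c2 * D + rho0 + rho1 * D) / (2 * c2 + 2 * rho1)

params = dict(D=100.0, c2=0.5, rho0=40.0, rho1=1.0)
q_mfg = mfg_quantity(**params)           # 120.0
q_mfc = mfc_quantity(**params)           # 80.0
cost_mfg = cost(q_mfg, q_mfg, **params)  # -2200.0
cost_mfc = cost(q_mfc, q_mfc, **params)  # -4600.0
```

In this toy instance the planner restricts output below the demand to keep the price high and reaches a lower cost than the competitive equilibrium, while no producer can gain by deviating unilaterally from `q_mfg`.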

In this section, the following forward backward stochastic differential equation system (FBSDE) is going to be of interest:

$(\hat N, \hat R_e, \bar Q)$ is a Nash equilibrium if and only if $(\hat N, \hat R_e)$ is given by:

) is a solution to the FBSDE given in (3.1), where the equation for $(Y^1_t)_t$ is replaced by

Theorem 3.5. Assume Condition 3.2 holds, then there exists a unique mean field control optimum flow Q.

For numerical purposes, given the technical challenges posed by the solution of the large FBSDE in (3.1), which involves both time dependent and time independent controls, we implement an analytic approach for which we give the details below. To this end, we first notice that:

and we assume that $R_e$ is fixed in a first analysis. Next, we rewrite the model in matrix form using $X_t := [Q_t, S_t, E_t, P_t, \tilde N_t]^\top$ as the 5-dimensional state process at time $t$, and rewrite the optimization problem as:

where $R$ and $J_t$ are the scalars given by $R = 2c_1$ and $J_t = c_2 D_t^2$ and:

Furthermore, we define Ă W t and a as:

, and the value function $u(t, X)$ as:

For $R_e$ fixed, if there exists a function $t \mapsto (\eta_t, r_t, \bar X_t)$ solving the following system of Ordinary Differential Equations (ODEs):

and if $s_0$ is given by:

then $\hat N_t(R_e) = -R^{-1} B^\top (\eta_t X_t + r_t)$ is the MFG equilibrium control for $R_e$ fixed, and the expected cost to the representative producer in this equilibrium is:

For $R_e$ fixed, if $T$ is small enough, there exists a unique MFG equilibrium.

Given $R_e$, if there exists a function $t \mapsto (\eta_t, r_t, \bar X_t)$ solving the ODE system (4.3) with the equation (4.3b) replaced by:

and the same $s_0$ given by (4.4), then $N^*_t(R_e) = -R^{-1} B^\top (\eta_t X_t + r_t)$ is an optimum for the MFC problem given $R_e$, and the minimal expected cost is given by (4.7).

Numerically, we search for the $R_e$ and the corresponding equilibrium $N = (N_t)_t$ that minimize the cost of the minor players by using the ODE systems given in (4.3) and (4.6).

As emphasized earlier, the main difference between MFC and MFG is whether the mean field is affected by the decision of the representative producer (MFC), or taken to be fixed (MFG). This difference translates into the addition of a fixed point argument in the MFG case. For pedagogical reasons, we first discuss the MFC case, then the MFG. After solving the Riccati equation which is the same in both cases, we solve the coupled ODE system directly in the MFC case in order to find the mean field; on the other hand, notice that in the MFG case, the ODEs are decoupled since the mean field is assumed to be fixed in each iteration of the fixed point algorithm.

In order to solve the system of MFC coupled ODEs for $(\bar X_t)_t$ and $(r_t)_t$ given by equations (4.3c) and (4.6), we discretize time with a uniform step size $\Delta t$ and solve the following linear equation:

where $\bar X = [\bar X_0, \bar X_{\Delta t}, \bar X_{2\Delta t}, \ldots, \bar X_T]^\top$ and $r = [r_0, r_{\Delta t}, r_{2\Delta t}, \ldots, r_T]^\top$.
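The idea of stacking all time steps of a coupled forward-backward ODE pair into one linear system can be sketched on a scalar toy problem. The coefficients below are hypothetical and are not those of (4.3c) and (4.6); the sketch only assumes a forward equation $x' = a x + b r$ with $x(0) = x_0$, a backward equation $r' = c x + d r$ with $r(T) = r_T$, and an explicit Euler discretization.

```python
import numpy as np

def solve_coupled_odes(a, b, c, d, x0, rT, T=1.0, n=400):
    """Solve x' = a x + b r, x(0) = x0 and r' = c x + d r, r(T) = rT
    by stacking all Euler steps into a single linear system."""
    dt = T / n
    m = n + 1                      # number of grid points
    A = np.zeros((2 * m, 2 * m))   # unknowns: [x_0..x_n, r_0..r_n]
    rhs = np.zeros(2 * m)

    A[0, 0] = 1.0                  # initial condition on x
    rhs[0] = x0
    for k in range(n):
        row = 1 + k                # x_{k+1} - (1 + a dt) x_k - b dt r_k = 0
        A[row, k + 1] = 1.0
        A[row, k] = -(1.0 + dt * a)
        A[row, m + k] = -dt * b
        row = m + k                # r_{k+1} - (1 + d dt) r_k - c dt x_k = 0
        A[row, m + k + 1] = 1.0
        A[row, m + k] = -(1.0 + dt * d)
        A[row, k] = -dt * c
    A[2 * m - 1, 2 * m - 1] = 1.0  # terminal condition on r
    rhs[2 * m - 1] = rT

    z = np.linalg.solve(A, rhs)
    return z[:m], z[m:]

# Toy check: x' = r, r' = x, x(0) = 1, r(1) = 0 has x(1) = 1 / cosh(1)
x, r = solve_coupled_odes(a=0.0, b=1.0, c=1.0, d=0.0, x0=1.0, rT=0.0)
```

Solving one system handles the initial condition of the forward equation and the terminal condition of the backward equation simultaneously, which is why no shooting or iteration is needed in the cooperative case.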

Algorithm 1 Computation of the Mean Field Control Cost over $(N_t)_t$ given $R_e$

  Calculate $(\eta_t)_t$ by solving the Riccati equation in (4.3a)
  Solve the coupled linear system (4.8) for $(\bar X_t)_t$ and $(r_t)_t$ given $(\eta_t)_t$ and $R_e$
  Calculate $s_0$ given $R_e$, $(r_t)_t$ and $(\eta_t)_t$ using the equation in (4.4)
  Calculate the expected cost associated with $R_e$, $\hat c := \inf_N C^{MFC}(N; R_e, \bar X)$, using (4.7)
  return $(\hat c, \bar X)$

Algorithm 2

  Search for the optimal $\hat R_e$, where the optimal cost $R_e \mapsto c(R_e)$ and the optimal mean field $R_e \mapsto \bar X(R_e)$ are computed by Optim-MFC-N
  Let $\hat c = c(\hat R_e)$ and $\bar X = \bar X(\hat R_e)$
  return $(\hat c, \hat R_e, \bar X)$
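The first step of the algorithms above is a backward integration of a matrix Riccati equation. The sketch below uses the generic linear-quadratic form $-\dot\eta_t = A^\top \eta_t + \eta_t A - \eta_t B R^{-1} B^\top \eta_t + Q_c$ with $\eta_T = G$, which is an assumption: the matrices of (4.3a) are not reproduced here, and `A`, `B`, `R`, `Qc`, `G` are hypothetical placeholders. A classical fourth-order Runge-Kutta scheme is used in the reversed time variable $s = T - t$.

```python
import numpy as np

def solve_riccati(A, B, R, Qc, G, T, n_steps):
    """Backward RK4 integration of a generic LQ Riccati equation.
    Returns an array eta with eta[k] approximating eta at time k * T / n_steps."""
    dt = T / n_steps
    R_inv = np.linalg.inv(R)

    def rhs(eta):
        # d(eta)/ds in reversed time s = T - t
        return A.T @ eta + eta @ A - eta @ B @ R_inv @ B.T @ eta + Qc

    eta = np.array(G, dtype=float)
    history = [eta]
    for _ in range(n_steps):
        k1 = rhs(eta)
        k2 = rhs(eta + 0.5 * dt * k1)
        k3 = rhs(eta + 0.5 * dt * k2)
        k4 = rhs(eta + dt * k3)
        eta = eta + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        history.append(eta)
    return np.array(history[::-1])  # reorder so that index 0 is t = 0

# Scalar sanity check: A = 0, B = R = Qc = 1, G = 0 gives eta_t = tanh(T - t)
one = np.array([[1.0]])
zero = np.array([[0.0]])
eta = solve_riccati(A=zero, B=one, R=one, Qc=one, G=zero, T=1.0, n_steps=100)
```

Because the Riccati equation does not involve the mean field, it can be solved once and reused, both for every candidate $R_e$ and for every iteration of a fixed point loop.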

In the Mean Field Game case, since in each iteration it is assumed that $(\bar X_t)_t$ is fixed, the ODE for $(r_t)_t$ in equation (4.3b) can be solved directly by using the following linear equation after we discretize time:

where $r = [r_0, r_{\Delta t}, r_{2\Delta t}, \ldots, r_T]^\top$. Then with this $(r_t)_t$, the time discretization of $(\bar X_t)_t$ with dynamics given by the equation (4.3c) can be written as:

where $\bar X = [\bar X_0, \bar X_{\Delta t}, \bar X_{2\Delta t}, \ldots, \bar X_T]^\top$. The numerical algorithms to find the Mean Field Control and Game Equilibria are given in detail in the following sections.

Algorithm 3 Computation of the Expected Cost over $(N_t)_t$ given $R_e$, $(\bar X_t)_t$

  Calculate $(\eta_t)_t$ by solving the Riccati equation in (4.3a)
  Solve the linear system for $(r_t)_t$ in (4.9) given $(\bar X_t)_t$, $R_e$ and $(\eta_t)_t$
  Calculate $s_0$ given $R_e$, $(r_t)_t$ and $(\eta_t)_t$ using the equation in (4.4)
  Calculate the cost associated with $R_e$ and $(\bar X_t)_t$, $\hat c := \inf_N C^{MFG}(N; R_e, \bar X)$, using (4.5)

Algorithm 4

  Initialize $(\bar X^0_t)_t$
  while $\|\bar X^k - \bar X^{k-1}\| > \epsilon$ do
    Search for the optimal $\hat R_e$ given $\bar X^k$, where the optimal cost $(R_e, \bar X^k) \mapsto c^k(R_e, \bar X^k)$ is computed by Optim-MFG-N
    Let $\hat R^k_e = \arg\min_{R_e} c^k(R_e, \bar X^k)$
    Compute $(\bar X^{k+1}_t)_t$ given $\hat R^k_e$, $(\bar X^k_t)_t$ by solving the linear equation (4.10)
  end while
  Let $\hat R_e = \hat R^k_e$, $\hat c = c^k(\hat R_e)$ and $\bar X = \bar X^k$
  return $(\hat c, \hat R_e, \bar X)$
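The structure of the fixed point loop (freeze the mean field, search for the best renewable investment, update the mean field, repeat until it stops moving) can be sketched generically. Everything below is a hypothetical stand-in: the callables `cost` and `mean_field_update` replace the cost computation and the linear solve (4.10), the mean field is reduced to a scalar, and the search over $R_e$ is a plain grid search.

```python
import numpy as np

def nash_fixed_point(cost, mean_field_update, R_grid, x_init,
                     eps=1e-8, max_iter=200):
    """Iterate best response and mean field update until the field stabilizes."""
    x_bar = x_init
    for _ in range(max_iter):
        # Best response in R_e for the frozen mean field (grid search)
        costs = np.array([cost(R, x_bar) for R in R_grid])
        R_hat = R_grid[int(np.argmin(costs))]
        # Mean field induced by the best response
        x_new = mean_field_update(R_hat, x_bar)
        if abs(x_new - x_bar) <= eps:
            return float(costs.min()), float(R_hat), float(x_new)
        x_bar = x_new
    raise RuntimeError("fixed point iteration did not converge")

# Toy instance (hypothetical): quadratic cost and affine mean field response
toy_cost = lambda R, x_bar: (R - 0.5 * x_bar) ** 2 + 0.1 * R
toy_update = lambda R, x_bar: 1.0 + 0.5 * R
R_grid = np.linspace(0.0, 2.0, 201)

c_hat, R_hat, x_hat = nash_fixed_point(toy_cost, toy_update, R_grid, x_init=0.0)
```

Convergence of such a Picard iteration is not guaranteed in general; in this toy instance the composed map is a contraction, which is in the spirit of the small-horizon uniqueness result stated earlier.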

In the numerical experiments reported below, we use the following parameter values:

Furthermore, we assume that $p(R_e) = p_2 R_e - p_3 \sqrt{R_e (R_e^{\max} - R_e)} + \epsilon$, where $p_2, p_3$ are positive constants and $\epsilon > 0$ is a small constant that ensures the nonnegativity of the price of the units of the renewable energy investment. We focus on natural gas as the source of nonrenewable energy. In our numerical experiments, we ignore the effect of the COVID-19 pandemic, and we run simulations for 20 years starting from March 2020. For the cost of solar power, we use the current assumption that a 1 MW solar farm needs a \$1M investment, and use daily peak sun hours data to compute the daily average production from solar panels. We assume that on average, peak hours last approximately 5 hours per day in the US, and we infer that a solar farm built with an initial investment of \$10,000 generates on average 0.5 MWh in 10 days. We choose $\alpha$ to take into account the seasonality, since sun exposure levels are maximum during summers and minimum during winters. Therefore, one unit of investment $R_e$ corresponds to approximately \$10,000, and on average it generates $\kappa_2 (\theta \pm 1) = 0.1\,(5 \pm 1)$ MWh of electricity in 10 days.
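The solar calibration in this paragraph is a short chain of unit conversions, which can be checked as follows. The variable names are hypothetical; the numbers are those quoted in the text, and the 10-day period is the time unit used there.

```python
def solar_energy_per_unit(unit_investment_usd=10_000.0,
                          capex_usd_per_mw=1_000_000.0,
                          peak_hours_per_day=5.0,
                          period_days=10.0):
    """Average energy (MWh) produced per period by one unit of investment."""
    capacity_mw = unit_investment_usd / capex_usd_per_mw  # 0.01 MW
    return capacity_mw * peak_hours_per_day * period_days

energy_mwh = solar_energy_per_unit()  # 0.5 MWh per 10 days

# Same figure through the model parameters: kappa_2 * theta, with a seasonal
# swing of +/- 1 coming from the sinusoidal term
kappa2, theta = 0.1, 5.0
seasonal_low = kappa2 * (theta - 1.0)   # 0.4 MWh
seasonal_high = kappa2 * (theta + 1.0)  # 0.6 MWh
```

The average of the two seasonal bounds equals `energy_mwh`, which is how the choice $\kappa_2 = 0.1$ and $\theta = 5$ reproduces the 0.5 MWh figure.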

According to the data provided by the U.S. Energy Information Administration (EIA), in 2018, 1,365,822 million kWh were produced by the electric sector by using 10,215 billion cubic feet of natural gas. Therefore, we assume that 1000 cubic feet of natural gas produce approximately 0.13 MWh. Again, according to the data provided by EIA, typical natural gas power plants produce 0.91 pound of carbon emission per kWh of electricity generation. Since 1000 cubic feet of natural gas produce around 0.13 MWh, we conclude that around 122 pounds of carbon are emitted per 1000 cubic feet of natural gas used. Moreover, if coal or other fossil fuels were used, this carbon pollution would have to be much more than doubled. Taking this into account and converting to metric tons, we end up with $\delta \approx 0.15$.
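The natural gas figures can be reproduced with a few lines of arithmetic. The variable names and the pound-to-metric-ton conversion factor are the only additions; the remaining numbers are those quoted in the paragraph.

```python
LB_PER_METRIC_TON = 2204.62  # standard conversion factor

generation_kwh = 1_365_822e6     # 1,365,822 million kWh produced in 2018
gas_used_mcf = 10_215e9 / 1e3    # 10,215 billion cubic feet, in units of 1000 cu ft

# Electricity produced per 1000 cubic feet of natural gas, in MWh
mwh_per_mcf = generation_kwh / 1e3 / gas_used_mcf

# Emissions per 1000 cubic feet at 0.91 lb per kWh
lb_per_mcf = 0.91 * mwh_per_mcf * 1e3
tonnes_per_mcf = lb_per_mcf / LB_PER_METRIC_TON

# Ratio between the calibrated delta and the pure natural gas figure
delta = 0.15
uplift = delta / tonnes_per_mcf
```

Here `mwh_per_mcf` is close to 0.134 and `lb_per_mcf` close to 122, so natural gas alone gives roughly 0.055 metric tons per 1000 cubic feet; the calibrated $\delta \approx 0.15$ is a bit less than three times that value, in line with the statement that dirtier fuels more than double the figure.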

We assume that the average demand of electricity for each plant is around the capacity of the plants. According to EIA, the average daily capacity of a natural gas plant was around 2166.5 MWh in 2018. Furthermore, the monthly seasonal component is found by using the monthly residential electricity consumption data for 2018 given by EIA. Therefore, the 10-day demand is taken to be sinusoidal, capturing the seasonality around 20,000 MWh. According to the data provided by EIA, in 2018 the operation and maintenance costs of nonrenewable energy amounted to 40% of the fuel cost, on top of the fuel cost itself, and the price of 1000 cubic feet of natural gas can be assumed to be \$5. Therefore, we take $p_1 = \$7$. Finally, by using the data given by EIA, we see that the average price of wholesale electricity is around \$40 per MWh, and therefore we take $\rho_0 = 40$.

From the heat maps in Figure 1, we see that the expected cost of the representative producer is increasing with the carbon tax $\tau$ and the penalty $c_2$. The second observation is that, as expected, for any given couple $(\tau, c_2)$, the expected cost is higher in the Nash equilibrium than for the Social Optimum. Next, we quantify how inefficient the Nash equilibrium is, and the effect of $\tau$ and $c_2$ on this inefficiency. In other words, we quantify the adverse effect of the non-cooperative behavior of the producers by computing the Price of Anarchy (PoA) defined in (4.11) for different values of $\tau$ and $c_2$.

The results are given in the bottom subfigure in Figure 1. Since for any given $(\tau, c_2)$ the expected cost in an MFG equilibrium is higher, the PoA is expected to be greater than 1, and the higher it gets, the less efficient the Nash equilibrium is. It can be seen that the PoA gets smaller as we increase $\tau$ and $c_2$. This means that for higher levels of $\tau$ and $c_2$, the expected costs to the producers in the two settings become closer. In other words, the impact of the social planner diminishes and the advantages of cooperation lessen as the regulator imposes stricter regulations.
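As an illustration of how such a table of PoA values can be assembled, the sketch below assumes that the PoA is the ratio of the MFG equilibrium cost to the MFC optimal cost, which is consistent with the statement that it exceeds 1 but is not confirmed by the definition (4.11), not reproduced here. A ratio is only meaningful for strictly positive costs, so the helper rejects other inputs. The cost values are hypothetical and are not the paper's results.

```python
import numpy as np

def price_of_anarchy(cost_mfg, cost_mfc):
    """Elementwise ratio of equilibrium cost to social optimum cost (assumed form)."""
    cost_mfg = np.asarray(cost_mfg, dtype=float)
    cost_mfc = np.asarray(cost_mfc, dtype=float)
    if np.any(cost_mfc <= 0.0):
        raise ValueError("ratio-based PoA needs strictly positive MFC costs")
    return cost_mfg / cost_mfc

# Hypothetical costs on a 2 x 2 grid: rows index tau, columns index c2
cost_mfc = np.array([[100.0, 150.0], [180.0, 240.0]])
cost_mfg = np.array([[130.0, 180.0], [198.0, 252.0]])
poa = price_of_anarchy(cost_mfg, cost_mfc)
```

In this made-up grid the ratio decreases along both axes, mimicking the qualitative pattern described in the text.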

Here, we analyze the effect of the penalty $c_2$ for not matching the demand and of the carbon tax $\tau$ on the optimal energy production portfolio in both MFC and MFG models. Figure 2 shows the total production and the decomposition of this production over a 20 year period, together with a zoomed-in view of the behavior between years 1 and 3. The left subfigure in Figure 2 shows that the demand is not matched by the producers in the MFC case. This is because the penalty coefficient $c_2$ is low and the increased revenue from scarce supply is more advantageous. Here, we see that in the control setting, producers behave as a big monopoly when not matching the demand is inexpensive. When the penalty is increased, the middle subfigure in Figure 2 shows that producers try to match the demand and their behaviors in the MFC and MFG cases are similar. In both of these figures there is no carbon tax; therefore the producers do not have incentives to invest in renewable energy, and as a result, all the production is exclusively from nonrenewable sources. In the right subfigure in Figure 2, when the carbon tax is increased, we see that the producers have an incentive to invest in renewable energy.

We also analyze the effect of the planning horizon by comparing the cases in which the producers are planning for the next 2 years vs. planning for the next 20 years. As can be seen in the left and middle subfigures in Figure 3, when the planning horizon is short, the fixed costs of renewable energy outweigh its advantages. Short-sighted producers do not have an incentive to invest in renewable energy production.

The right subfigure in Figure 3 shows that whatever the level of the carbon tax, the terminal pollution levels are higher when the producers are competitive (MFG). Further, pollution levels can be decreased even without implementing a carbon tax if the producers cooperate and follow a social planner.

In this section, we describe how the previous models can be extended to include a major player in charge of choosing the tax level $\tau$ on behalf of a policy maker, and the penalty $c_2$ for not matching the demand on behalf of the system operator. We shall treat this major player as a regulator, and we shall often speak of minor players when we talk about the producers. We extend the "minor player only" model used previously by offering the producers the option to withdraw their entire production, de facto walking away from the contract imposed by the regulator. This decision is made when the expected cost to the producer is higher than a fixed level above which producing at such a level of loss does not make sense. If we refer to the plots in Figure 1, we can see that the cost of the minor player increases with the tax and with the penalty for not matching the demand. Therefore, the regulator should be careful not to enact policies with very high values of $\tau$ and $c_2$. In the new model, the regulator does not have a private state per se. It only has 2 controls, which are the carbon tax level ($\tau \in \mathbb{R}_+$) and the penalty ($c_2 \in \mathbb{R}_+$). Both controls are assumed to be time independent. This assumption is especially realistic when the period $[0, T]$ is too short for changes in regulation to make sense. The cost function of the regulator is given as:

(5.1)

The first term is the cost for exceeding the pollution target $P_T \geq P_0$ announced at $t = 0$. Since we use the notation $x^+ = \max(0, x)$, there is no penalty if the terminal pollution level is below the target. The constant $\alpha_1 > 0$ quantifies the size of the penalty. The second term is the revenue from the carbon tax. To prevent the regulator from choosing an abusively high tax to increase their revenue, Term 3 is added to represent a reputation cost ($\alpha_2, \alpha_3 > 0$). The joint role of Term 4 and Term 5 is to ensure that the responsibility of matching the demand is not only incumbent on the producers, but also on the regulator, influencing the choice of $\alpha_4 > 0$. This is consistent with our characterization of our major player / regulator as a policy maker as well as a system operator bearing the brunt of managing the ancillary services to avoid disruptions like system black-outs.

We analyze two types of equilibria in the models with a regulator. In both cases, we consider that the regulator announces their policy first, and the producers react accordingly. This is in the realm of Stackelberg games. We call the first equilibrium the Stackelberg MFC equilibrium. In this case, the regulator assumes that a social planner chooses the controls used by the electricity producers. The latter behave like one big monopolistic firm. Therefore, the regulator chooses the tax level $\tau$ and the penalty coefficient $c_2$, assuming that the producers will settle in an MFC optimum. Note that in this interpretation the regulator and the social planner are two different entities. Formally, writing $(N^*(\tau, c_2), R_e^*(\tau, c_2))$ for the producers' MFC optimum given $(\tau, c_2)$, the strategy profile $(\tau^*, c_2^*)$ is a Stackelberg MFC equilibrium if, for any admissible $(\tau, c_2)$, we have:

$$J\big((\tau, c_2); \bar X(N^*(\tau, c_2), R_e^*(\tau, c_2))\big) \geq J\big((\tau^*, c_2^*); \bar X(N^*(\tau^*, c_2^*), R_e^*(\tau^*, c_2^*))\big).$$

The second equilibrium is called the Stackelberg MFG equilibrium. In this one, the regulator assumes that the electricity producers are competitive, and it chooses the levels of $\tau$ and $c_2$ by assuming that the minor player population is at a Nash equilibrium. We define this equilibrium formally as follows.

Definition 5.2 (Stackelberg MFG Equilibrium). For every $(\tau, c_2)$, let $(\hat N(\tau, c_2), \hat R_e(\tau, c_2))$ be the producers' MFG Nash equilibrium given the tax level $\tau$ and the demand satisfaction coefficient $c_2$. In other words, for any admissible $(N, R_e)$, we have:

$$C\big((N, R_e); \bar X(\hat N(\tau, c_2), \hat R_e(\tau, c_2)), (\tau, c_2)\big) \geq C\big((\hat N(\tau, c_2), \hat R_e(\tau, c_2)); \bar X(\hat N(\tau, c_2), \hat R_e(\tau, c_2)), (\tau, c_2)\big).$$

Then the strategy profile $(\hat\tau, \hat c_2)$ is a Stackelberg MFG equilibrium with a regulator if, for any admissible $(\tau, c_2)$, we have:

$$J\big((\tau, c_2); \bar X(\hat N(\tau, c_2), \hat R_e(\tau, c_2))\big) \geq J\big((\hat\tau, \hat c_2); \bar X(\hat N(\hat\tau, \hat c_2), \hat R_e(\hat\tau, \hat c_2))\big).$$

6 Numerical Results in the Presence of a Regulator

To implement the walk-away option of the producers, we modify the SocialOpt and NashEq algorithms by adding an IF condition that assigns Accept = 1 if the cost of the minor player is lower than the threshold and Accept = 0 otherwise. We call the modified algorithms "ModifiedSocialOpt" and "ModifiedNashEq", respectively. We then implement the Stackelberg Equilibrium Algorithm, in which we assume that if the producers reject the contract (Accept = 0), the regulator's cost is equal to infinity.

RegulatorCost:

    if Accept = 1 then
        Compute the regulator's cost $J = J(\tau, c_2; \bar X)$ by using (5.1)
    else
        Set $J = \infty$

Stackelberg Equilibrium Algorithm:

    if Type = MFC then
        Search for the optimal couple $(\hat\tau, \hat c_2)$, where the optimal mean field $(\tau, c_2) \mapsto \bar X(\tau, c_2)$, the investment in renewables $(\tau, c_2) \mapsto R_e(\tau, c_2)$ and the minor player's cost $(\tau, c_2) \mapsto c(\tau, c_2)$ are computed by the ModifiedSocialOpt algorithm, and the cost of the regulator $(\tau, c_2) \mapsto J(\tau, c_2; \bar X(\tau, c_2), \text{Accept})$ is found by using RegulatorCost
    else if Type = MFG then
        Search for the optimal couple $(\hat\tau, \hat c_2)$, where the optimal mean field $(\tau, c_2) \mapsto \bar X(\tau, c_2)$, the investment in renewables $(\tau, c_2) \mapsto R_e(\tau, c_2)$ and the minor player's cost $(\tau, c_2) \mapsto c(\tau, c_2)$ are computed by the ModifiedNashEq algorithm, and the cost of the regulator $(\tau, c_2) \mapsto J(\tau, c_2; \bar X(\tau, c_2), \text{Accept})$ is found by using RegulatorCost
    Let $(\hat\tau, \hat c_2) = \arg\min_{\tau, c_2} J(\tau, c_2; \bar X(\tau, c_2))$, $\hat{\bar X} = \bar X(\hat\tau, \hat c_2)$, $\hat R_e = R_e(\hat\tau, \hat c_2)$, $\hat c = c(\hat\tau, \hat c_2)$ and $\hat J = J(\hat\tau, \hat c_2; \hat{\bar X})$
    return $(\hat\tau, \hat c_2, \hat{\bar X}, \hat R_e, \hat c, \hat J)$

Remark 6.1. In the two Stackelberg equilibria, the numerical algorithms differ only in the solution of the producers' problem. For the experiments of this section, we used the same parameters as for the producers' model in the previous section. For the regulator we used: $\alpha_1 = 1$, $\alpha_2 = 3.3$, $\alpha_3 = 500$, $\alpha_4 = 0.01$, $\alpha_5 = 0.25$, $\tau \in \{0, 10, 15, 20, 25, 30, 40, 50, 75, 100\}$ and $c_2 \in \{50, 100, 250, 500, 750, 1000, 1500, 2000, 2500, 3000, 4000, 5000\}$.
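The search over the regulator's controls described above can be sketched as an exhaustive grid search. This is only a minimal illustration: the grids are those reported in the text, but `toy_producer_solver`, `toy_regulator_cost` and the threshold values are hypothetical placeholders standing in for ModifiedSocialOpt or ModifiedNashEq and for the cost (5.1), which are not reproduced here.

```python
import math
from itertools import product

# Grids reported in the text for the regulator's controls.
TAU_GRID = [0, 10, 15, 20, 25, 30, 40, 50, 75, 100]
C2_GRID = [50, 100, 250, 500, 750, 1000, 1500, 2000, 2500, 3000, 4000, 5000]


def toy_producer_solver(tau, c2):
    """Hypothetical stand-in for ModifiedSocialOpt / ModifiedNashEq.

    Returns (mean_field, renewable_investment, producer_cost). The formulas
    are arbitrary placeholders, not the model of the paper.
    """
    mean_field = 100.0 / (1.0 + 0.01 * tau)
    renewable = 0.1 * tau
    producer_cost = 0.5 * tau + 0.001 * c2
    return mean_field, renewable, producer_cost


def toy_regulator_cost(tau, c2, mean_field):
    """Hypothetical stand-in for the regulator's cost (5.1).

    This toy cost ignores mean_field; its minimizer is chosen arbitrarily.
    """
    return (tau - 25.0) ** 2 + 1e-4 * (c2 - 500.0) ** 2


def stackelberg_grid_search(solver, regulator_cost, threshold,
                            tau_grid=TAU_GRID, c2_grid=C2_GRID):
    """Return the grid point minimizing the regulator's cost.

    A rejected contract (Accept = 0) is assigned an infinite regulator cost.
    """
    best = None
    for tau, c2 in product(tau_grid, c2_grid):
        mean_field, renewable, cost = solver(tau, c2)
        accept = cost < threshold  # walk-away option of the producers
        J = regulator_cost(tau, c2, mean_field) if accept else math.inf
        if best is None or J < best["J"]:
            best = {"tau": tau, "c2": c2, "mean_field": mean_field,
                    "renewable": renewable, "producer_cost": cost, "J": J}
    return best
```

Passing a different `solver` is the only change needed to switch between the MFC and MFG variants, which mirrors Remark 6.1. Lowering `threshold` shrinks the set of accepted contracts and can move the minimizer.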

First, we analyze the regulator's expected cost for different values of the carbon tax given a fixed penalty for not matching the demand. Then we switch the roles of the two controls of the regulator. The plots in Figure 4 show that the cost of the regulator is convex as a function of the carbon tax or of the penalty when the other control is fixed. We also analyze the effect of the coefficients in the regulator's cost. We start with the importance given to demand matching by the regulator, by tracking the effect of $\alpha_4$ on the regulator's cost. The left subfigure in Figure 5 shows that when the tax is fixed, the regulator's minimum cost is attained at higher values of $c_2$ when $\alpha_4$ is higher. This shows that the regulator should impose higher penalties on producers for not matching the demand when demand matching is more important to the regulator. The middle subfigure shows that when the penalty for not matching the demand is fixed, the optimal tax is lower when $\alpha_4$ is higher. The reasoning is that production from renewable energy is more unpredictable, so a regulator who cares about demand matching is less opposed to nonrenewable energy usage, which makes demand matching more stable. Finally, the right subfigure shows the effect of the importance given by the regulator to minimizing the excess pollution, by tracking the effect of $\alpha_1$ on the regulator's cost. Here, it can be seen that when the penalty for not matching the demand is fixed, the optimal carbon tax is higher when the regulator wants to keep pollution at a lower level. These subfigures are for the MFC case, but similar results hold in the MFG case as well. Finally, Figure 6 gives 3D plots of the regulator's cost as a function of their controls $\tau$ and $c_2$. Here, the minimum is attained at $(\tau^*, c_2^*) = (50, 1000)$ in the MFC case and at $(\hat\tau, \hat c_2) = (75, 1000)$ in the MFG case.
Also, we see that for any given tax and penalty, the expected cost of the regulator is higher if the producers are cooperative instead of competitive when the regulator gives more importance to demand matching. This is because, for any given couple $(\tau, c_2)$, producers in the cooperative setting behave like one big monopolistic firm and care less about matching the demand than in the competitive setting, in order to maximize their revenues by keeping prices higher. When demand matching is important for the regulator, the regulator benefits from the competition among the electricity producers, even if this competition has adverse effects on the producers themselves.

In this paper, we investigate the behavior of rational electricity producers in the presence of a carbon tax. We analyze how they manage the trade-off between reliance on traditional and predictable fossil fuel power production assets which emit greenhouse gas and hence cost revenues because of the carbon tax, and the temptation to invest in clean energy production assets which will not be the source of emissions but which make matching the demand problematic because of the volatility of their output.

We study a large population of producers in two different models: a first one in which they compete and hopefully reach a Nash equilibrium, and a second one in which they cooperate and rely on the solution of a centralized optimization problem. In a second set of models, we introduce a regulator who chooses the level of the carbon tax, in the hope of controlling the overall emissions in the economy, and a penalty imposed on producers failing to meet the demand, in the hope of avoiding power outages and reputation costs. Our models are based on recent progress in the theory and the numerical analysis of mean field games and mean field control problems. We showed that when the producers cooperate, they are better off by behaving like a single monopolistic firm. However, if the regulator raises the penalty for not matching the demand excessively, the regulator can take advantage of the competitive behavior of the producers. While our models remain stylized, they open the door to more complex models, e.g. involving time-dependent policies that the regulator could base on the response of the producers. Furthermore, our models could be extended to include more features of the energy markets, such as storage and the interactions between neighboring states or countries.

Proof of Theorem 3.1. Assume that the strategy couple $(\hat N, \hat R_e)$ is optimal in the mean field game and that the corresponding mean field flow is given as $\bar Q = \bar Q(\hat N, \hat R_e)$. Now, assume that the representative player deviates from the optimal strategy and uses $(\hat N + \epsilon \check N, \hat R_e + \epsilon \check R_e)$. Then:

(A.1)

Furthermore we have the following dynamics:

with initial conditions $\check Q_0 = \check E_0 = \check P_0 = \check{\tilde N}_0 = 0$. We can introduce the adjoint variables with the following dynamics:

T " 0. Plugging these dynamics in the perturbed cost function (A.1) and applying integration by parts:

By optimality, the above expression should be equal to 0 for any given $\check N_t$ and $\check R_e$; therefore:

For the proof of the FBSDE system in the mean field control setting, assume that the strategy couple $(\hat N, \hat R_e)$ is optimal in the mean field control problem and that the corresponding mean field is given as $\hat Q$. Now assume that the representative player deviates from the optimal strategy and uses $(\hat N + \epsilon \check N, \hat R_e + \epsilon \check R_e)$. Since in the mean field control version every player deviates, the mean field term also changes to $\hat Q + \epsilon \check Q$. Then:

For the sufficiency, let us assume that $(N_t, R_e)$ is the Nash equilibrium. Then, using the FBSDE result, we want to show that for all $(\check N_t, \check R_e)$ we have:

$$C(N, R_e; \bar Q) - C(\check N, \check R_e; \bar Q) \le 0.$$

Therefore we have:

Now we apply integration by parts:

By using the optimality conditions we have:

Therefore we have:

$= p(R_e) - p(\check R_e) - (R_e - \check R_e)\, p'(R_e) \le 0$, by using the convexity of the function $p(\cdot)$.

Proof of Theorem 3.3. In order to show the existence of the MFG equilibrium mean field flow, we show the existence of a solution of the FBSDE given in (3.1). We first fix an $R_e$, then solve the FBSDE and calculate a new $R_e$ by using the optimality condition in (3.2). In other words, we need to show that there exists a fixed point of a function $f$, $f(R_e) = R_e$, by using the Brouwer fixed point theorem. Therefore, we need to show that $f : [0, R_e^{\max}] \to [0, R_e^{\max}]$ is continuous in $R_e$. In order to simplify the notation we define $X_t := [Q_t, S_t, E_t, P_t, \tilde N_t]$, $Y_t := [Y^1_t, \dots]$.

Further, we define: 

When R e is fixed, we can write the FBSDE system as

For the proof of continuity, we first focus on the mean processes:

(A.5)

By introducing the ansatz $\bar Y_t = \bar A_t \bar X_t + \bar B_t$, we can decouple the forward-backward ODE and end up with

where $\bar A_t$ is the solution of the following matrix Riccati differential equation and $\bar B_t$ is the solution of the linear ODE system:

(A.7)
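To illustrate how a Riccati equation of this kind is integrated backward from its terminal condition, here is a minimal scalar sketch. It assumes the generic linear-quadratic form $-\dot\eta = 2a\eta - (b^2/r)\eta^2 + q$, $\eta(T) = \eta_T$, which is a standard textbook form and not the specific matrix equation of the paper; the function name and parameters are hypothetical.

```python
import math


def solve_scalar_riccati(a, b2_over_r, q, terminal, T, n_steps=1000):
    """Integrate -d(eta)/dt = 2*a*eta - b2_over_r*eta**2 + q backward in time.

    Uses the classical RK4 scheme with step -T/n_steps, starting from
    eta(T) = terminal. Returns the list [eta(t_0), ..., eta(t_n)] on the
    uniform grid t_k = k*T/n_steps.
    """
    def rhs(eta):
        # d(eta)/dt written explicitly.
        return -(2.0 * a * eta - b2_over_r * eta ** 2 + q)

    h = T / n_steps
    eta = [0.0] * (n_steps + 1)
    eta[n_steps] = terminal
    for k in range(n_steps, 0, -1):
        y = eta[k]
        k1 = rhs(y)
        k2 = rhs(y - 0.5 * h * k1)
        k3 = rhs(y - 0.5 * h * k2)
        k4 = rhs(y - h * k3)
        eta[k - 1] = y - (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return eta
```

With $a = 0$, $b^2/r = 1$, $q = 1$ and $\eta(T) = 0$, the exact solution is $\eta(t) = \tanh(T - t)$, which stays bounded on $[0, T]$ and gives a convenient check of the integrator.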

By using general results on the continuity of ODE solutions with respect to parameters, we can analyze the continuity of $\bar A_t$ with respect to $R_e$. Since the solution of the matrix Riccati equation is bounded (conditions that give the boundedness of the solution of a matrix Riccati differential equation can be found in [21]), the necessary Lipschitz assumption holds and we can conclude that $\bar A_t$ is continuous in $R_e$. Further, again by using the general ODE continuity results and the fact that $\bar A_t$ is continuous in $R_e$, we can conclude that $\bar B_t$ is continuous in $R_e$. In this way, we have shown that $\bar X_t$ is continuous in $R_e$. Now assume that $\bar X$ is exogenous and define $U_t := \kappa_2\big(\alpha \cos(\alpha t) + \theta - S_t\big)$, $\Delta Y$

where the first inequality comes from the convexity of $p$, the second inequality comes from the Cauchy-Schwarz inequality and the last inequality is the result of [20, Theorem 5.4]. By using the continuity of $\bar X$ in $R_e$ and the continuity of $(p')^{-1}(\cdot)$, the upper bound goes to 0 as the difference between the two values of $R_e$ goes to 0. Therefore, the difference between the corresponding values of $f$ also goes to 0, which gives continuity. Since we assume that $(p')^{-1} : \mathbb{R} \to [0, R_e^{\max}]$, we also have $f : [0, R_e^{\max}] \to [0, R_e^{\max}]$. We conclude the existence proof by using the Brouwer fixed point theorem.
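In one dimension, the Brouwer fixed point theorem reduces to the intermediate value theorem applied to $g(R_e) = f(R_e) - R_e$, so a fixed point of a continuous self-map of $[0, R_e^{\max}]$ can be located by bisection. The sketch below illustrates this under stated assumptions: `toy_map` is a hypothetical continuous map, clipped to the interval in the spirit of the assumption on $(p')^{-1}$, and is not the map $f$ obtained from the FBSDE.

```python
import math


def find_fixed_point(f, r_max, tol=1e-10, max_iter=200):
    """Locate a fixed point of a continuous map f: [0, r_max] -> [0, r_max].

    Bisection on g(r) = f(r) - r, which satisfies g(0) >= 0 and
    g(r_max) <= 0 whenever f maps the interval into itself.
    """
    def g(r):
        return f(r) - r

    lo, hi = 0.0, r_max
    if g(lo) < 0.0 or g(hi) > 0.0:
        raise ValueError("f does not map [0, r_max] into itself")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid  # keeps g(lo) >= 0
        else:
            hi = mid  # keeps g(hi) <= 0
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def toy_map(r):
    """Hypothetical continuous map, clipped to [0, 2]."""
    return min(2.0, max(0.0, 2.0 * math.cos(r)))
```

Bisection only uses continuity, which matches the existence argument; it gives no information about uniqueness, which is treated separately.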

Uniqueness can be concluded as follows. Assume there exist two mean field game equilibria $(N, R_e, \bar Q) = (N_t, R_e, \bar Q_t)_{t \in [0,T]}$ and $(N', R_e', \bar Q') = (N'_t, R_e', \bar Q'_t)_{t \in [0,T]}$ such that $\bar Q \ne \bar Q'$. Then the control processes $(N, R_e)$ and $(N', R_e')$ must differ, since otherwise the state processes, and hence their distributions, would be the same. By the definition of a minimizer of a cost functional, we have:

$$C(N, R_e; \bar Q) \le C(N', R_e'; \bar Q), \qquad C(N', R_e'; \bar Q') \le C(N, R_e; \bar Q').$$

By adding the two inequalities, we get:

$$\Big(C(N, R_e; \bar Q) - C(N, R_e; \bar Q')\Big) - \Big(C(N', R_e'; \bar Q) - C(N', R_e'; \bar Q')\Big) \le 0. \tag{A.9}$$

Now we use the fact that the drift and the volatility terms are independent of the state distribution, $\mathcal{L}(X) = \mu$. Therefore, in environment $\mu$, the controlled path driven by $(N', R_e')$ is $\bar Q'$ and in environment $\mu'$, the controlled path driven by $(N, R_e)$ is $\bar Q$. By using this, we write:

$$C(N, R_e; \bar Q) - C(N, R_e; \bar Q') = c_3 \rho_1 \int_0^T \bar Q_t (\bar Q_t - \bar Q'_t)\, dt.$$

In the same way, we have: By contradiction, we conclude the uniqueness.
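The step leading to the contradiction can plausibly be written out as follows. This is a sketch rather than the authors' display: it assumes that the identity stated above for $(N, R_e)$ holds symmetrically for $(N', R_e')$ and that $c_3 \rho_1 > 0$.

```latex
\begin{align*}
C(N', R_e'; \bar Q) - C(N', R_e'; \bar Q')
  &= c_3 \rho_1 \int_0^T \bar Q'_t \,(\bar Q_t - \bar Q'_t)\, dt, \\
\text{so that (A.9) becomes}\quad
c_3 \rho_1 \int_0^T (\bar Q_t - \bar Q'_t)^2 \, dt &\le 0 .
\end{align*}
```

Under the assumption $c_3 \rho_1 > 0$, the last inequality forces $\bar Q_t = \bar Q'_t$ for almost every $t$, which contradicts $\bar Q \ne \bar Q'$.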

Proof of Theorem 3.4. We introduce the same adjoint variables as for the MFG FBSDE except that for $Y^1_t$ we take:

By plugging the adjoint variable dynamics in the perturbed cost (A.3) and applying integration by parts, we end up with the same optimality conditions as in (A.2). The sufficiency condition is proved by following the same ideas as in the proof of Theorem 3.1: we have

$$\begin{aligned}
C(N_t, R_e; \bar Q_t) - C(\check N_t, \check R_e; \check{\bar Q}_t)
&= p(R_e) - p(\check R_e) - p'(R_e)(R_e - \check R_e) + \mathbb{E}\Big[\int_0^T \big[c_3 \rho_1 (Q_t \bar Q_t - \check Q_t \check{\bar Q}_t) - 2 c_3 \rho_1 \bar Q_t (Q_t - \check Q_t)\big]\, dt\Big] \\
&= p(R_e) - p(\check R_e) - p'(R_e)(R_e - \check R_e) - \int_0^T c_3 \rho_1 (\bar Q_t - \check{\bar Q}_t)^2\, dt \le 0,
\end{aligned}$$

which is obtained by using the convexity of the function $p(\cdot)$.
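The passage from the expectation to the squared difference in the last display can be checked directly. The computation below is a sketch that assumes $\mathbb{E}[Q_t] = \bar Q_t$ and $\mathbb{E}[\check Q_t] = \check{\bar Q}_t$, i.e. that the barred quantities are the means of the corresponding state processes.

```latex
\begin{align*}
\mathbb{E}\big[ c_3\rho_1 (Q_t \bar Q_t - \check Q_t \check{\bar Q}_t)
   - 2 c_3 \rho_1 \bar Q_t (Q_t - \check Q_t) \big]
&= c_3\rho_1 \big( \bar Q_t^2 - \check{\bar Q}_t^2 \big)
   - 2 c_3\rho_1 \bar Q_t \big( \bar Q_t - \check{\bar Q}_t \big) \\
&= - c_3\rho_1 \big( \bar Q_t^2 - 2 \bar Q_t \check{\bar Q}_t + \check{\bar Q}_t^2 \big) \\
&= - c_3\rho_1 \big( \bar Q_t - \check{\bar Q}_t \big)^2 .
\end{align*}
```

The integrand is therefore nonpositive whenever $c_3 \rho_1 \ge 0$, and the remaining terms are nonpositive by the convexity of $p$.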

Proof of Theorem 3.5. The proof of the existence of the solution in the MFC case follows the same ideas as the proof of Theorem 3.3 and, for the sake of space, it is omitted. To prove uniqueness of the MFC optimal mean field term, we introduce an auxiliary MFG which has the same FBSDE as the MFC problem and for which we prove uniqueness. To wit, we first focus on the mean field game problem that has the same dynamics as in (2.2) and that has the following cost functional for an infinitesimal agent given a mean field flow $(\bar Q_t)_t$:

Following the idea given in the proof of Theorem 3.1, the FBSDE system that characterizes the solution of this new game is found to be the same FBSDE that characterizes the solution of the mean field control. Uniqueness of the mean field flow of the new mean field game can be proved by using the approach given in the proof of Theorem 3.3 and it is omitted for the sake of space. This in turn concludes the uniqueness for the mean field control problem.

Therefore we have defined a mapping $r \mapsto r'$. We now show that it is a contraction mapping to be able to use the Banach fixed point theorem.

Step 1. Using Itô's formula, we write the dynamics for $\|\tilde X_t\|^2$:

By using these dynamics we can find a bound for || Ă X t || 2 : where the third to last inequalities stem from the Gronwall's inequality, and we define C p1q " exp´T`2||A||2

p||BR´1B J ||q||η|| T`| |BR´1B J ||˘¯||BR´1B J || with ||η|| T :" sup 0ďtďT ||η t ||.

Step 2. We write the dynamics for ||r ď exp´T p||A||`p||BR´1B J ||q||η|| T`| |F J ||qż

A class of short-term models for the oil industry addressing speculative storage

A McKean-Vlasov approach to distributed electricity generation development

The entry and exit game in the electricity markets: a mean-field game approach

An extended mean field game for storage in smart grids

Optimal incentives to mitigate epidemics: a Stackelberg mean field game approach

Limit Game Models for Climate Change Negotiations

Mean field games with a dominating player

Mean field games and mean field type control theory

Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle

FBSDEs and the Solution of MFGs without common noise

The clean development mechanism and CER price formation in the carbon emissions markets

Optimal stochastic control and carbon price formation

Market design for emissions markets trading schemes

Finite-state contract theory with a principal and a field of agents

Fracking, renewables, and mean field games

Electricity price dynamics in the smart grid : A mean-field-type game perspective

Mean-field moral hazard for optimal energy demand response management

A tale of a principal and many, many agents, Mathematics of Operations Research

Mean Field Games and oil production

The wellposedness of path-dependent multidimensional forward-backward SDE

New conditions for boundedness of the solution of a matrix Riccati differential equation

Riccati equations and its solutions

Mean field games

On numerical methods for mean field games and mean field type control

Forward-Backward Stochastic Differential Equations and their Applications

A dynamic collective choice model with an advertiser

Optimal generation and trading in solar renewable energy certificate (SREC) markets

Modeling and computation of mean field equilibria in producers' game with emission permits trading

Proof of Lemma 4.1. Following [10, ch. 3], we write the Hamiltonian:

$$H(t, N, X, \bar X, q) = (AX)^\top q + (BN)^\top q + C_t^\top q + \frac{R}{2}|N|^2 + H_t^\top X + X^\top F \bar X + X^\top G X + J_t, \tag{A.10}$$

where $q$ is the adjoint process. Therefore, the optimal $N$ to optimize $H$ can be expressed as:

$$\hat N(q) = -R^{-1} B^\top q. \tag{A.11}$$

By plugging $\hat N(q)$ in the Hamiltonian (A.10), the optimal Hamiltonian can be written as:

$$\hat H(t, X, \bar X, q) = -\frac{1}{2}\, q^\top B R^{-1} B^\top q + (AX)^\top q + C_t^\top q + H_t^\top X + X^\top F \bar X + X^\top G X + J_t. \tag{A.12}$$

Then the Hamilton-Jacobi-Bellman (HJB) equation can be written as:

$$\frac{\partial u(t, X)}{\partial t} - \mathrm{tr}\big(a D^2 u(t, X)\big) = \hat H\big(t, X, \bar X_t, Du(t, X)\big), \qquad u(T, X) = X^\top S_T X + p_2 R_e, \tag{A.13}$$

where $\bar X_t = \int_{\mathbb{R}^5} X\, m(t, X)\, dX$. The Kolmogorov-Fokker-Planck (KFP) equation for our MFG problem can be written as:

$$\frac{\partial m(t, X)}{\partial t} - \mathrm{tr}\big(a D^2 m(t, X)\big) + \nabla_x \Big( m(t, X)\big(AX - B R^{-1} B^\top Du(t, X) + C_t\big) \Big) = 0, \qquad m(0, X) = m_0(X). \tag{A.14}$$

Introduce the following ansatz for the value function:

We have $Du(t, X) = \eta_t X + r_t$ and $D^2 u(t, X) = \eta_t$. By plugging (A.15) into the HJB equation in (A.13), we obtain that $\eta_t$ is the solution of the following symmetric matrix Riccati equation:

This Riccati equation has a unique positive symmetric solution, see [22, ch. 14.3]. By plugging (A.15) into the HJB equation (A.13), we also obtain the differential equation for $r_t$ that is coupled with $\bar X_t$:

The differential equation for $\bar X_t$ can be found by plugging the ansatz in the KFP equation:

Finally, from the HJB equation where the ansatz is plugged in, we find that:

Therefore we have:

The expected cost of the representative minor player given a fixed mean field and $R_e$ can be calculated by using:

Proof of Theorem 4.2. For the existence and uniqueness proof, we make use of the Banach fixed point theorem. We follow the line of proof used for a stochastic system in [25, Thm 5.1]. First we fix $(r^1_t)_t$ and $(r^2_t)_t$; then the corresponding $(\bar X^i_t)_t$ can be found by solving the following ODE:

Further, let $\tilde X_t = \bar X^1_t - \bar X^2_t$ and $\tilde r_t = r^1_t - r^2_t$; then we have:

Now we introduce $(r^{i\prime}_t)_t$ that solves:

Now define $\|r\|_T := \sup_{0 \le t \le T} \|r_t\|$; then we have:

With small $T$ we have $c_T < 1$, which concludes the proof.

Proof of Lemma 4.3. For mean field control problems we have the following HJB and FP systems. The detailed proof of the derivation can be found in [8, ch. 6].

$$\frac{\partial u(t, X)}{\partial t} - \mathrm{tr}\big(a D^2 u(t, X)\big) = \hat H\big(t, X, m(t), Du(t, X)\big) + \int_{\mathbb{R}^n} \frac{\partial \hat H}{\partial m}\big(t, \xi, m(t), Du(t, \xi)\big)(x)\, m(t, \xi)\, d\xi,$$

where $\frac{\partial \hat H}{\partial m}$ denotes the Gâteaux differential of $\hat H$ on $L^2(\mathbb{R}^5)$. As can be seen, the KFP equation stays the same as in the MFG case, but the HJB equation changes. By rewriting (A.12) as:

We find that

$$\int_{\mathbb{R}^n} \frac{\partial \hat H(t, X, m, q)}{\partial m}(x)\, m(\xi)\, d\xi = \bar X^\top F X.$$

Therefore, the HJB equation becomes:

$$\frac{\partial u}{\partial t} - \mathrm{tr}\big(a D^2 u\big) = \hat H(t, X, \bar X, Du) + \bar X^\top F X, \qquad u(X_T, T) = X_T^\top S_T X_T + p_2 R_e. \tag{A.23}$$

We introduce the same ansatz as in (A.15). By plugging this ansatz in the HJB equation given in (A.23), we end up with the same Riccati equation and the same equation for $s_0$. Only the differential equation of $r_t$ changes as follows:

Since the Kolmogorov-Fokker-Planck equation stayed the same, we have the same expression for the differential equation of $\bar X_t$ as in (A.18). We obtain the MFC cost given any fixed $R_e$ by using the ansatz (see e.g. [24] for more details):

Proof of Theorem 4.4. For the sake of space, we omit the proof of existence and uniqueness, which follows the same steps as in the proof of Theorem 4.2.
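The Banach fixed point argument invoked in the proofs of Theorems 4.2 and 4.4 is constructive: iterating a contraction converges geometrically to its unique fixed point. The sketch below illustrates this on a scalar toy problem. The map $\varphi_T(r) = T \cos(r)$ is a hypothetical example whose Lipschitz constant is at most $T$, chosen only to echo the role of the small-horizon condition $c_T < 1$; it is not the mapping $r \mapsto r'$ of the proof.

```python
import math


def picard_iterate(phi, r0, tol=1e-12, max_iter=1000):
    """Iterate r <- phi(r) until successive iterates differ by less than tol.

    Returns (approximate_fixed_point, number_of_iterations). Convergence is
    guaranteed when phi is a contraction on a complete space.
    """
    r = r0
    for k in range(1, max_iter + 1):
        r_next = phi(r)
        if abs(r_next - r) < tol:
            return r_next, k
        r = r_next
    raise RuntimeError("no convergence: phi may not be a contraction")


def make_toy_map(T):
    """Hypothetical map r -> T*cos(r); a contraction whenever T < 1."""
    return lambda r: T * math.cos(r)
```

For example, `picard_iterate(make_toy_map(0.5), 0.0)` converges in a few dozen iterations from any starting point. For $T \ge 1$ the Lipschitz bound no longer guarantees contraction, which is the reason such proofs restrict to a small horizon.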