Smart Representations: Rationality and Evolution in a Richer Environment

Paolo Galeazzi and Michael Franke

Abstract. Standard applications of evolutionary game theory look at a single, fixed game and focus on the evolution of behavior for that game alone. Instead, this paper uses tools from evolutionary game theory to study the evolutionary competition between choice mechanisms in a rich and variable multi-game environment. A choice mechanism is a way of subjectively representing a decision situation, paired with a method for choosing an act based on this subjective representation. We demonstrate the usefulness of this approach by a case study that shows how subjective representations in terms of regret that differ from the actual fitness can be evolutionarily advantageous.

1 Introduction

If agents deal with a rich and variable environment, they have to face many different choice situations. Standard evolutionary game models frequently simplify reality in at least two ways. Firstly, the environment is represented as a fixed stage game; secondly, the focus of evolutionary selection is behavior for that stage game alone. In contrast, some argue for studying the evolutionary competition of general choice mechanisms in a rich and variable environment (e.g., Fawcett, Hamblin, and Giraldeau, 2013; Hammerstein and Stevens, 2012; McNamara, 2013). In response to this and adding to recent like-minded approaches, this paper introduces a general meta-game model that conservatively extends the scope of evolutionary game theory to deal with evolutionary selection of choice mechanisms in variable environments (see also Bednar and Page, 2007; Harley, 1981; O'Connor, forthcoming; Rayo and Becker, 2007; Skyrms and Zollman, 2010; Smead and Zollman, 2013; Zollman, 2008; Zollman and Smead, 2010).[1]

[1] Some of these contributions are very closely related to ours. Bednar and Page (2007) use a multi-game framework, composed of a fixed selection of six possible games, to study the emergence of different cultural behaviors, and model agents as finite-state automata playing games from the fixed selection. Zollman (2008) explains seemingly "irrational" fair behavior in social dilemmas (like the Ultimatum game) by means of a model where agents have to play the Ultimatum game together with the Nash bargaining game, but they are constrained to choose the same strategy for both games. Finally, Rayo and Becker (2007) consider, in a more decision-theoretic setting, what subjective utility function a cognitively limited agent should be endowed with in order to maximize her evolutionary fitness. Our framework can then be viewed as a generalization of those models, mainly in that here players do not necessarily have any specific cognitive limitations, and we allow for larger and possibly variable classes of games.

A choice mechanism associates decision situations with action choices. A crucial part of a choice mechanism is the subjective representation of the decision situation, in particular the manner of forming preferences and beliefs about a possibly uncertain world. To show the usefulness of the meta-game approach, this paper asks: which preference and belief representations are ecologically valuable and lead to high fitness? The evolution of preferences has been the subject of recent interest in theoretical economics (e.g., Alger and Weibull, 2013; Dekel, Ely, and Yilankaya, 2007; Robson and Samuelson, 2011). Here, we argue that questions of preference evolution should take variability in uncertainty representation into account as well.
We demonstrate that if agents have imprecise probabilistic beliefs (e.g., Gärdenfors and Sahlin, 1982; Levi, 1974; Walley, 1996), faithful and objective representations in terms of true evolutionary fitness can be outperformed by subjective (e.g., regret-based) preference representations that deviate from the true fitness that natural selection operates on.

The paper is organized as follows. Section 2 sets the scene by reviewing different perspectives on rational choice. Section 3 introduces the meta-game approach. In doing so, it covers key notions such as choice mechanisms, decision rules and subjective representations, all with an eye towards the evolutionary application of Section 4. Section 5 contains the main results for that application, and Section 6 discusses some interesting extensions. Finally, Section 7 concludes.

2 Rationality and Subjective Representations

The standard textbook definition of rationality in economics and decision theory traces back to the seminal work by de Finetti (1937), von Neumann and Morgenstern (1944) and Savage (1954). It says that a choice is rational only if it maximizes (subjective) expected utility. Expected utility is subjective in the sense that it is a function of subjective beliefs and subjective preferences of the decision maker (DM). To wit, a choice can be rational, i.e., the best choice from the DM's point of view, even if based on peculiar beliefs and/or aberrant preferences.

If beliefs and preferences are subjective, there is room for rationalization or redescriptionism of observable behavior. For example, in the case of social decision making, including considerations of fairness allows us to describe as rational empirically observed behavior, such as in experimental Prisoner's Dilemmas or public goods games, that might otherwise appear irrational (e.g., Charness and Rabin, 2002; Fehr and Schmidt, 1999).

The main objection to redescriptionism is that, without additional constraints, the notion of rationality is likely to collapse, as it seems possible to deem rational almost everything that is observed, given the freedom to adjust beliefs and preferences at will. Normativism therefore emphasizes that there are many ways in which ascriptions of beliefs and preferences should be constrained by normative considerations of rationality as well: e.g., subjective beliefs should reflect objective chance where possible; subjective preferences should be oriented towards tracking objective fitness. For instance, profit maximization seems a necessary requirement for evolution in a competitive market because only firms behaving according to profit maximization will survive in the long run (e.g., Alchian, 1950; Friedman, 1953).

An alternative view on the rationality of choice is adaptationism (e.g., Anderson, 1991; Chater and Oaksford, 2000; Hagen et al., 2012). Adaptationism aims to explain rational behavior by appealing to evolutionary considerations: DMs have acquired choice mechanisms that have proved to be adaptive with respect to the variable environment where they have evolved. A choice mechanism can be a set of distinct heuristics (the DM's adaptive toolbox) that have little in common (e.g., Gigerenzer and Goldstein, 1996; Scheibehenne, Rieskamp, and Wagenmakers, 2013; Tversky and Kahneman, 1981).
But to relate closely to the literature on the evolution of preferences and to the philosophical debate about the nature of rational choice, we here suggest thinking of a choice mechanism as a map from choice situations to action choices which includes an explicit level of subjective representation of the situation. Specifically, a subjective representation is a general way of forming preferences and beliefs about the choice situation. We are most interested in the question of which subjective representations, and which choice mechanisms in general, are better than others from an evolutionary point of view.

3 Choice Mechanisms and Meta-Games

We view a choice mechanism as the combination of three different things: a subjective utility (or preference), a subjective belief, and a decision rule. In general, the agent's action choice will depend both on the agent's utility at different possible outcomes of the choice situation and on the agent's beliefs about the realization of these outcomes. The decision rule then combines the agent's subjective utility and belief, and dictates how the agent should act: a decision rule is a function that associates an action choice with the agent's utility and beliefs:

    Decision Rule: Utility × Beliefs → Actions.

The subjective utility of an agent can be formally expressed by a function u : W × A → R, where A stands for a (finite) set of actions available to the agent and W is a (finite) set of possible states of the world. There are many different ways to describe beliefs, but for concreteness of later applications we here assume that the agent's beliefs are represented in terms of a (possibly singleton) convex compact set of probability functions Γ ⊆ ∆(W) over the possible states of the world. Given a utility u and a belief Γ, examples of well-known decision rules from the literature that we will encounter later are:

1. Maxmin: a*(u, Γ) = argmax_{a ∈ A} min_{µ ∈ Γ} ∑_{w ∈ W} u(w, a) µ(w)

2. Maximax: a*(u, Γ) = argmax_{a ∈ A} max_{µ ∈ Γ} ∑_{w ∈ W} u(w, a) µ(w)

3. Laplace rule: a*(u, Γ) = argmax_{a ∈ A} ∑_{w ∈ W} (1/|W|) u(w, a)

4. Expected utility maximization (for Γ = {µ} a singleton): a*(u, Γ) = argmax_{a ∈ A} ∑_{w ∈ W} u(w, a) µ(w)

It is worth noticing that both maxmin and maximax boil down to expected utility maximization when the set Γ is a singleton, and in turn expected utility maximization reduces to the Laplace rule when the belief µ is a uniform probability over the states. As mentioned previously, for a choice mechanism to prescribe an action, the decision rule needs to be given a specific utility u and belief Γ as input. We call the pair (u, Γ) a subjective representation of the decision situation. In the following, we investigate the evolutionary fitness of general and systematic ways of forming such subjective representations across many different decision situations.
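To fix ideas, the four rules can be spelled out in a few lines of code. The following is a minimal Python sketch (ours, not part of the paper; all function names are our own), in which an imprecise belief Γ is approximated by a finite set of probability vectors over W:

    # Decision rules for finite W (states) and A (acts).
    # u: dict mapping (w, a) to a real; Gamma: list of probability vectors over W.

    def eu(u, mu, a):
        # Expected utility of act a under a single probability vector mu.
        return sum(mu[w] * u[(w, a)] for w in range(len(mu)))

    def maxmin(u, Gamma, A):
        # argmax_a min_{mu in Gamma} sum_w u(w, a) mu(w)
        return max(A, key=lambda a: min(eu(u, mu, a) for mu in Gamma))

    def maximax(u, Gamma, A):
        # argmax_a max_{mu in Gamma} sum_w u(w, a) mu(w)
        return max(A, key=lambda a: max(eu(u, mu, a) for mu in Gamma))

    def laplace(u, A, n_states):
        # argmax_a sum_w (1/|W|) u(w, a)
        uniform = [1 / n_states] * n_states
        return max(A, key=lambda a: eu(u, uniform, a))

    def eu_max(u, mu, A):
        # Expected utility maximization: maxmin/maximax with Gamma = {mu}.
        return max(A, key=lambda a: eu(u, mu, a))

Note that Python's max returns a single maximizer; ties, which the paper resolves by uniform randomization (see footnote 3 below), would need to be collected explicitly, as in the later sketches.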
A fitness game is an interactive decision situation. For a given fitness game G = ⟨N, (A_i, π^G_i)_{i∈N}⟩, let us denote the evolutionary payoff, or fitness, of player i by the function π^G_i : Π_{i∈N} A_i → R, where A_i is player i's (finite) set of actions. For simplicity of exposition we assume that all games that are played are symmetric two-player games where N := {1, 2}, A_1 = A_2 and π^G_1(a, a′) = π^G_2(a′, a) =: π^G(a, a′).[2] The fitness of a choice mechanism c with decision rule a*_c and subjective representation (u_c, Γ_c) is measured in terms of the expected evolutionary payoff of c. Formally, the fitness of choice mechanism c against choice mechanism c′ in a symmetric two-player game G = ⟨{1, 2}, A, π^G⟩ is given by:

    F_G(c, c′) = π^G(a*_c(u^G_c, Γ_c), a*_{c′}(u^G_{c′}, Γ_{c′})).[3]

[2] Since payoff functions are symmetric, we simply write π^G(a, a′) for π^G_1(a, a′) and A := A_1 = A_2, as usual. However, notice that all definitions and results can be extended to more general cases.

[3] Whenever a choice mechanism would not select a unique action, we assume that the player chooses one of the equally optimal actions at random, i.e., F_G(c, c′) = ∑_{a ∈ a*_c(u^G_c, Γ_c)} ∑_{a′ ∈ a*_{c′}(u^G_{c′}, Γ_{c′})} (1/|a*_c(u^G_c, Γ_c)|) (1/|a*_{c′}(u^G_{c′}, Γ_{c′})|) π^G(a, a′).

Given the game-theoretic setting, the subjective utility u^G_c is now a function u^G_c : A × A → R, and the subjective belief Γ_c is a set of probability functions over the co-player's actions, Γ_c ⊆ ∆(A).

Going beyond a single fixed fitness game, we consider a class of possible games. For concreteness, let G be a class of two-player symmetric games, together with a probability measure P_G(G) for the occurrence probability of game G ∈ G. Intuitively, the probability P_G encodes the statistical properties of the environment. A meta-game is then a tuple M_G = ⟨CM, G, P_G, F⟩, where CM is a set of choice mechanisms, G is a class of possible games, P_G(G) is the probability of game G to occur, and F : CM × CM → R is the (meta-)fitness function, defined as:

    F(c, c′) = ∫ P_G(G) F_G(c, c′) dG.    (1)

Hence, F(c, c′) determines the evolutionary payoff of choice mechanism c against c′ in the meta-game. The set CM can be thought of as the set of choice mechanisms that are present within a given population playing the games from the class G. Consequently, it is possible to compute the average fitness of c against the population, which is given by:

    F(c) = ∫ P_M(c′) F(c, c′) dc′ = ∫∫ P_G(G) P_M(c′) F_G(c, c′) dc′ dG,    (2)

where P_M(c′) is the probability of encountering a co-player with choice mechanism c′.

Meta-games are then abstract models for the evolutionary competition between choice mechanisms in interactive decision making contexts. Standard notions of evolutionary game theory apply to meta-games as well. For example, a choice mechanism c is a strict Nash equilibrium if F(c, c) > F(c′, c) for all c′ ≠ c; it is evolutionarily stable if for all c′ ≠ c either (i) F(c, c) > F(c′, c) or (ii) F(c, c) = F(c′, c) and F(c, c′) > F(c′, c′); it is neutrally stable if for all c′ ≠ c either (i) F(c, c) > F(c′, c) or (ii) F(c, c) = F(c′, c) and F(c, c′) ≥ F(c′, c′) (Maynard Smith, 1982). Similarly, evolutionary dynamics can be applied to meta-games. Later we will also turn towards a dynamical analysis in terms of replicator dynamics (Taylor and Jonker, 1978) and replicator mutator dynamics (e.g., Nowak, 2006).

4 Evolution of Preferences

To demonstrate the usefulness of a meta-game approach, we compare a selection of general ways of forming belief and preference representations against each other. As for subjective preferences, consider initially:

1. the objective utility, defined by: for all G ∈ G, obj_G(a, a′) = π^G(a, a′);

2. the regret, defined by: for all G ∈ G, reg_G(a, a′) = π^G(a, a′) − max_{a″ ∈ A} π^G(a″, a′).

As motivation for this comparison, it is to be stressed that regret minimization is one of the main alternatives to utility (or value) maximization in the literature on decision criteria (see also Bleichrodt and Wakker, 2015). For a start, the subjective beliefs that we take into consideration are also two:

1. prc: a precise uniform belief µ such that µ(a) = 1/|A| for all a ∈ A;

2. imp: a maximally imprecise belief Γ = ∆(A).
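In code, and under our own encoding of a symmetric 2×2 fitness game as the row player's payoff tuple (a, b, c, d), the two utilities and the two beliefs might look as follows (a sketch; names and encoding are our own assumptions):

    # A symmetric 2x2 game is encoded as (a, b, c, d): the row player's
    # fitness for the act pairs (I,I), (I,II), (II,I), (II,II).

    def obj_utility(game):
        # Objective utility: identical to the fitness payoffs.
        return list(game)

    def reg_utility(game):
        # Regret: own payoff minus the best payoff against the same co-player act.
        a, b, c, d = game
        return [a - max(a, c), b - max(b, d), c - max(a, c), d - max(b, d)]

    # Beliefs about the probability of the co-player choosing act II.
    # For the maxmin rule, a convex set of beliefs can be represented by its
    # extreme points, since expected utility is linear in the belief.
    PRC = [0.5]        # precise uniform belief (singleton)
    IMP = [0.0, 1.0]   # maximally imprecise belief, Gamma = Delta(A)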
Although a thorough discussion of this issue goes beyond the scope of this work, let us say that these two kinds of belief underlie two different and alternative views on uncertainty. Faced with uncertain events, a strict Bayesian will always form a precise belief, specified by a single probability µ. In the absence of any information about future uncertain events, the Bayesian would mostly invoke the principle of insufficient reason, and accordingly choose a uniform probability over the possible outcomes. In contrast, others have argued against the obligation of representing a belief by means of a single probability measure, in opposition to the Bayesian paradigm (e.g., Gilboa and Marinacci, 2013). They argue instead in favor of a more encompassing account, according to which uncertainty can be unmeasurable, and represented by a (convex and compact) set of probabilities (e.g., Gilboa and Schmeidler, 1989). This line of thought has its origin in decision theory, motivated by Ellsberg's famous paradoxes (Ellsberg, 1961), and appears extremely relevant in game-theoretic contexts too. Indeed, in a recent paper Battigalli et al. (2015) write:

    Such [unmeasurable] uncertainty is inherent in situations of strategic interaction. This is quite obvious when such situations have been faced only a few times. (p. 646)

In evolutionary game theory, for instance, players obviously face uncertainty about the composition of the population that they are part of, and consequently about the (type of) co-player that they are randomly paired with at each round and about the co-player's action. In case of complete lack of information about the composition of the population, a non-Bayesian player would thus entertain maximal unmeasurable uncertainty, i.e., a maximally imprecise belief.[4] As already anticipated, we will see that the way agents form beliefs, and the possibility of holding imprecise beliefs in particular, can have a fundamental impact on their evolutionary success.

[4] Such a radical uncertainty could ensue, for example, if agents have no conception of their co-player or her preferences. Unsophisticated agents, as considered in evolutionary game theory, might be entirely unaware of the fact that they are engaged in social decision making (see Heifetz, Meier, and Schipper, 2013, for game-theoretic models of unawareness). It is therefore not ludicrous to consider radical uncertainty first and tend to more sophisticated ways of forming beliefs later (more on this below).

As for the decision rule, we assume that players use the maxmin rule. This is in line with many representation results of decision making under unmeasurable uncertainty (e.g., Ghirardato and Marinacci, 2002; Gilboa and Schmeidler, 1989), and seems corroborated by empirical findings too. Ellsberg's paradoxes are prominent examples (Ellsberg, 1961), and evidence from the experimental literature suggests that agents are generally averse to unmeasurable uncertainty (e.g., Trautmann and van de Kuilen, 2016). Finally, note that when the maxmin rule acts on subjective representations of type (obj, imp), i.e., objective preferences and imprecise beliefs, the generated behavior corresponds to the classic maxmin strategy (von Neumann and Morgenstern, 1944). When the maxmin rule acts on subjective representation (reg, imp), the agent's behavior is known as regret minimization.[5]

[5] The notion of regret in decision theory dates back at least to the work by Savage (1951), and has later been developed by Bell (1982), Fishburn (1982) and Loomes and Sugden (1982) independently. Recently, Halpern and Pass (2012) showed how the use of regret minimization can give solutions to game-theoretic puzzles (like the Traveller's dilemma and the Centipede game) in a way that is closer to everyday intuition and empirical data. In this paper the notion of regret defined earlier is the same as in Halpern and Pass (2012).
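Continuing our sketch, the maxmin rule over these representations can be written as below; applied to (obj, IMP) it generates the classic maxmin strategy, applied to (reg, IMP) regret minimization. The game used for illustration is the coordination game shown as figure 1 below:

    def maxmin_acts(u, belief):
        # All acts (0 = I, 1 = II) whose worst-case expected utility is maximal;
        # `belief` lists the extreme probabilities of the co-player playing II.
        def worst(act):
            return min((1 - q) * u[2 * act] + q * u[2 * act + 1] for q in belief)
        vals = [worst(0), worst(1)]
        return [act for act in (0, 1) if vals[act] == max(vals)]

    game = (1, 0, 0, 2)
    print(maxmin_acts(obj_utility(game), IMP))  # [0, 1]: classic maxmin is indifferent
    print(maxmin_acts(reg_utility(game), IMP))  # [1]: regret minimization selects II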
Two facts follow from these observations. The first is related to our focus on different types of uncertainty that players may entertain.

Fact 1. For any precise (Bayesian) belief µ, maximization of expected (objective) utility based on µ and minimization of expected regret based on µ are behaviorally equivalent.

The second fact highlights another behavioral equivalence, which we will make use of shortly in the following section.

Fact 2. In the class of 2×2 symmetric games, the acts selected by the Laplace rule are exactly the acts selected by regret minimization.

Here is a simple example that shows these choice mechanisms in action. Consider the coordination fitness game G depicted in figure 1a. Since the game is symmetric, it suffices to specify the evolutionary payoffs for the row player.

          I    II                     I    II
    I     1    0               I     0   -2
    II    0    2               II   -1    0

    (a) Coordination game G.   (b) Regret-based representation of G.

    Figure 1: A coordination game (left) and the associated regret representation (right).

Figure 1a also represents the objective utility obj_G, since obj_G = π^G by definition, whereas figure 1b pictures the representation of G in terms of regret-based utilities. While classic maxmin is indifferent between I and II (figure 1a), regret minimization uniquely selects II (figure 1b).

5 Results

5.1 Simulation Results

Since for now we keep the decision rule fixed to maxmin, a player's choice mechanism will only depend on the player's subjective representation (u, Γ). For brevity, from now on we will refer to the pair (u, Γ), like (reg, imp) or (obj, prc), as the type of the player. Sometimes we will also distinguish types by referring to the subjective utility only; for instance, (reg, imp) and (reg, prc) are regret types.

As observed earlier, meta-games factor in statistical properties of the environment. For particular empirical purposes, one could consult a specific class of games G with an appropriate, maybe empirically informed probability P_G in order to match the natural environment of a given population. For our present purposes, let G be a set of symmetric two-player fitness games with two acts for a start. Each game G ∈ G is then individuated solely by its payoff function, i.e., by a quadruple of numbers G = (a, b, c, d). As for the occurrence probability P_G(G) of game G, we imagine that the values a, b, c, d are i.i.d. random variables sampled from the set {0, . . . , 10} according to a uniform probability P_V. Using Monte Carlo simulations, we can then approximate the values of equation (1) to construct meta-game payoffs.
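The Monte Carlo procedure can be sketched as follows (again our own minimal reimplementation, building on the functions above; `fitness` implements the tie-breaking of footnote 3, and `meta_game` approximates equation (1) for every pair of types):

    import random

    TYPES = {
        "reg,imp": (reg_utility, IMP), "obj,imp": (obj_utility, IMP),
        "reg,prc": (reg_utility, PRC), "obj,prc": (obj_utility, PRC),
    }

    def fitness(game, row_type, col_type):
        # Expected fitness F_G of the row type against the column type,
        # randomizing uniformly over equally optimal acts (footnote 3).
        a, b, c, d = game
        payoff = [[a, b], [c, d]]
        u_r, bel_r = TYPES[row_type]
        u_c, bel_c = TYPES[col_type]
        acts_r = maxmin_acts(u_r(game), bel_r)
        acts_c = maxmin_acts(u_c(game), bel_c)
        return (sum(payoff[i][j] for i in acts_r for j in acts_c)
                / (len(acts_r) * len(acts_c)))

    def meta_game(n=100_000):
        # Approximate F(c, c') of equation (1) by averaging over n i.i.d. games.
        F = {(r, c): 0.0 for r in TYPES for c in TYPES}
        for _ in range(n):
            game = tuple(random.randint(0, 10) for _ in range(4))
            for pair in F:
                F[pair] += fitness(game, *pair) / n
        return F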
Results based on 100,000 randomly sampled games are given in table 1.[6]

                 (reg,imp)  (obj,imp)  (reg,prc)  (obj,prc)
    (reg,imp)    6.663      6.662      6.663      6.663
    (obj,imp)    6.486      6.484      6.486      6.486
    (reg,prc)    6.663      6.662      6.663      6.663
    (obj,prc)    6.663      6.662      6.663      6.663

    Table 1: Average evolutionary fitness from Monte Carlo simulations of 100,000 symmetric 2×2 games.

[6] Concretely, 100,000 games were sampled repeatedly by choosing independently four integers between 0 and 10 uniformly at random. For each game, the action choices of all four choice mechanisms were determined and payoffs from all pairwise encounters recorded. The number in each cell of table 1 is the average payoff for the choice mechanism listed in the row when matched with the choice mechanism in the column.

Simulation results obviously reflect Fact 2 in that all encounters in which types (reg, imp), (reg, prc) or (obj, prc) are substituted for one another yield identical results. More interestingly, table 1 shows that (obj, imp), the maxmin strategy, is strictly dominated by the three other types: in each column (i.e., for each type of co-player), the maxmin strategy is strictly worse than any of the three competitors. This has a number of interesting consequences.

If we restrict attention to subjective representations with imprecise beliefs only, then a monomorphic state in which every agent has regret-based preferences is the only evolutionarily stable state. More strongly, since (obj, imp) is strictly dominated by (reg, imp), we expect selection that is driven by (expected) fitness to invariably weed out maxmin players (obj, imp) in favor of (reg, imp), regret minimization. In terms of choice rules, this means that regret minimization is evolutionarily better than maxmin over the class of games considered. In terms of subjective preferences, it shows that players using the objective representation that directly looks at fitness (possibly money, or profit) are outperformed by non-veridical (regret) representations, when players' beliefs are imprecise.

Next, if we look at the competition between all four types represented in table 1, (reg, imp) is no longer evolutionarily stable. Given behavioral equivalence (Fact 2), types (reg, imp), (reg, prc), and (obj, prc) are all neutrally stable (Maynard Smith, 1982). But since (obj, imp) is strictly dominated and so disfavored by fitness-based selection, we are still drawn to conclude that maxmin behavior is weeded out in favor of a population with a random distribution of the remaining three types.

Simulation results of the (discrete time) replicator dynamics (Taylor and Jonker, 1978) indeed show that random initial population configurations are attracted to states with only three player types: (reg, imp), (reg, prc) and (obj, prc). The relative proportions of these depend on the initial shares in the population. This variability fully disappears if we add a small mutation rate to the dynamics. Take a fixed, small mutation rate ε for the probability that a player's subjective utility or her subjective belief changes to another utility or belief. The probability that a player's subjective representation randomly mutates into a completely different representation with altogether different utility and belief would then be ε². With these assumptions about "component-wise mutations," numerical simulations of the (discrete time) replicator mutator dynamics (Nowak, 2006) show that already for very small mutation rates almost all initial population states converge to a single fixed point in which the majority of players have regret-based utility. For instance, with ε = 0.001, almost all initial populations are attracted to a final distribution with proportions:

    (reg,imp)  (obj,imp)  (reg,prc)  (obj,prc)
    0.289      0.021      0.398      0.289
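A single step of the discrete-time replicator mutator dynamics of the kind we simulated can be sketched as follows (our own formulation; Q is the component-wise mutation kernel built from ε as just described, and F must be the 4×4 fitness matrix in TYPE_LIST order):

    TYPE_LIST = ["reg,imp", "obj,imp", "reg,prc", "obj,prc"]

    def mutation_matrix(eps):
        # Utility and belief each mutate independently with probability eps,
        # so flipping both components has probability eps**2.
        Q = []
        for t in TYPE_LIST:
            u1, b1 = t.split(",")
            row = []
            for s in TYPE_LIST:
                u2, b2 = s.split(",")
                row.append(((1 - eps) if u1 == u2 else eps)
                           * ((1 - eps) if b1 == b2 else eps))
            Q.append(row)
        return Q

    def replicator_mutator_step(x, F, Q):
        # x: population shares; F: meta-game fitness matrix (row vs. column).
        n = len(x)
        f = [sum(F[i][j] * x[j] for j in range(n)) for i in range(n)]
        phi = sum(x[i] * f[i] for i in range(n))  # average population fitness
        return [sum(x[i] * f[i] * Q[i][k] for i in range(n)) / phi
                for k in range(n)]

    # Example use: x = [0.25] * 4; Q = mutation_matrix(0.001); then iterate
    # x = replicator_mutator_step(x, F, Q) until x is (approximately) fixed.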
What this suggests is that, if biological evolution selects behavior-generating mechanisms, not behavior as such, it need not be the case that behaviorally equivalent mechanisms are treated equally all the while. If mutation probabilities are a function of individual components, it can be the case that certain components of such behavior-generating mechanisms are more strongly favored by a process of random mutation and selection. This is exactly the case with regret-based preferences. Since regret-based preferences are much better in connection with imprecise beliefs than veridical preferences are, the proportion of expected regret minimizers, (reg, prc), in the attracting state is substantially higher than that of expected utility maximizers, (obj, prc), even though these types are behaviorally equivalent.

5.2 Analytical Results

Results based on the single meta-game in table 1 are not fully general and possibly spoiled by random fluctuations in the sampling procedure. Fortunately, for the case of 2×2 symmetric games, the main result that maxmin types (obj, imp) are strictly dominated by regret minimizers can also be shown analytically under considerably general conditions.

Proposition 1. Let G be the class of 2×2 symmetric games G = (a, b, c, d) generated by i.i.d. sampling of a, b, c, d from a set of values with at least three elements in the support. Then (reg, imp) strictly dominates (obj, imp) in the resulting meta-game.

Proof. All proofs are in Appendix A. □

Corollary 1. Let G be as in Proposition 1. If we only consider imprecise belief types, (obj, imp) and (reg, imp), then the unique evolutionarily stable state is a monomorphic population of (reg, imp) players.

This result supports the main conceptual point that we wanted to make: objective preference representations are not necessarily favored by natural selection; objective preferences are outperformed by non-veridical regret preferences if agents have imprecise beliefs. It tells us that the main conclusions drawn in the previous section based on the approximated meta-game of table 1 hold more generally for arbitrary 2×2 symmetric games with i.i.d. sampled payoffs.

This result presupposes at least occasional imprecise beliefs. The assumed imprecise beliefs do not need to be maximally uncertain, however. Let the uncertainty held by a player be defined by a convex compact set of probabilities [s, t] ⊆ ∆(A) over the co-player's actions, where s is the lower probability and t is the upper probability of action II. We can then prove the following proposition, which is the analogue of Proposition 1 for any possible (not necessarily maximal) degree of uncertainty [s, t], with s ≠ t. There is only one difference: we now require i.i.d. draws of a continuous random variable. This is due to the fact that, for arbitrarily small intervals [s, t], objective players ([s, t], obj) and regret players ([s, t], reg) can behave as if holding a unique probability measure (precise belief) if the underlying payoff space is not dense. The reason for this technical requirement will become clearer from the proof.
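For 2×2 games, an interval belief [s, t] enters the maxmin rule only through its endpoints, so the earlier sketch already covers it. Here is an illustrative game (our own choice of numbers) in which moderate imprecision separates the two types:

    # maxmin_acts from above takes any list of extreme points, so an interval
    # belief [s, t] over the co-player's act II is simply the list [s, t].
    game = (10, 0, 3, 4)  # a coordination game: a > c and d > b
    print(maxmin_acts(reg_utility(game), [0.3, 0.7]))  # [0]: regret type plays I
    print(maxmin_acts(obj_utility(game), [0.3, 0.7]))  # [1]: objective type plays II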
Proposition 2. In the class of 2×2 symmetric games generated by i.i.d. draws of a continuous random variable from a distribution with density P_V and support (r̲, r̄) ⊆ R, for any imprecise belief [s, t], the only evolutionarily stable state of a population with regret players ([s, t], reg) and objective players ([s, t], obj) is a monomorphic state of ([s, t], reg) players.

This tells us that regret-based preferences can outperform objective preference representations when agents are also capable of learning or otherwise restricting their assumptions about the co-player's behavior, as long as there is, at least on occasion, some imprecision in their beliefs. We will enlarge on the issue of belief formation after having covered some more relevant extensions in the next section.

6 Extensions

How do the basic results from the previous section carry over to richer models? Section 6.1 first introduces further conceptually interesting subjective representations that have been considered in the literature. Section 6.2 then addresses the case of symmetric two-player n×n games for n ≥ 2. Finally, Section 6.3 ends with a brief comparison to the case of solitary decision making.

6.1 More Preference Types

The space of possible preference types is enormous, and we have only compared regret and objective types so far. Let us now look at two other types of subjective preferences that have been investigated, especially in behavioral economics and in evolutionary game theory. A famous example is the altruistic preference (e.g., Becker, 1976; Bester and Güth, 1998), summoned to explain the possibility of altruistic behavior. At the other end of the spectrum lies the competitive preference. The two subjective utilities are defined as follows:

1. altruistic utility: for all G ∈ G, alt_G(a, a′) = π^G(a, a′) + π^G(a′, a);[7]

2. competitive utility: for all G ∈ G, com_G(a, a′) = π^G(a, a′) − π^G(a′, a).

[7] A more general formulation would be to define α-altruistic utility, for α ∈ [0, 1], as u^G_α(a, a′) = π^G(a, a′) + α π^G(a′, a). Since we are not interested in the evolution of degrees of altruism, here we simply fix α = 1. Analogously for α-competitive utilities too.

Table 2 shows the results of Monte Carlo simulations that approximate the expected fitness in the relevant meta-game with all the subjective representations considered so far.

                 (reg,imp) (obj,imp) (com,imp) (alt,imp) (reg,prc) (obj,prc) (com,prc) (alt,prc)
    (reg,imp)    6.663     6.662     5.829     7.105     6.663     6.663     5.829     7.489
    (obj,imp)    6.486     6.484     6.088     6.703     6.486     6.486     6.088     6.875
    (com,imp)    6.323     6.758     5.496     6.977     6.323     6.323     5.496     7.149
    (alt,imp)    5.949     5.722     5.326     6.396     5.949     5.949     5.326     6.568
    (reg,prc)    6.663     6.662     5.829     7.105     6.663     6.663     5.829     7.489
    (obj,prc)    6.663     6.662     5.829     7.105     6.663     6.663     5.829     7.489
    (com,prc)    6.323     6.758     5.496     6.977     6.323     6.323     5.496     7.149
    (alt,prc)    6.331     5.893     5.497     6.566     6.331     6.331     5.497     7.152

    Table 2: Average evolutionary fitness from Monte Carlo simulations of 100,000 symmetric 2×2 games.

These results confirm basic intuitions about altruistic and competitive types: everybody would like to have an altruistic co-player, and nobody would like to play against a competitive player. Perhaps more surprisingly, (alt, imp) comes up strictly dominated by (com, imp), but competitive types themselves are worse off against all types except maxmin players (obj, imp) than any of the behaviorally equivalent types (reg, imp), (obj, prc), and (reg, prc). It is thus easy to see that the previous results still obtain for the larger meta-game in table 2: (reg, imp), (obj, prc), and (reg, prc) are still neutrally stable; simulation runs of the (discrete-time) replicator dynamics on the 8×8 meta-game from table 2 end up in population states consisting of only these three types in variable proportions.
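For reference, the two additional utility transforms are straightforward to add to our earlier sketch (the encoding and names are again our own; the entries follow the definitions above):

    def alt_utility(game):
        # Altruistic: own fitness plus the co-player's fitness.
        a, b, c, d = game
        return [2 * a, b + c, c + b, 2 * d]

    def com_utility(game):
        # Competitive: own fitness minus the co-player's fitness.
        a, b, c, d = game
        return [0, b - c, c - b, 0]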
In sum, the presence of other subjective representations, such as those based on altruistic or competitive utilities, does not undermine, but rather strengthens, our previous results.

6.2 More Actions

The results from Section 5 relied heavily on Fact 2, which is no longer true when we look at arbitrary n×n games. Table 3 gives approximations of expected fitness in the class of n×n symmetric games. Concretely, the numbers in table 3 are averages of evolutionary payoffs obtained in 100,000 randomly sampled symmetric games, where each fitness game G was sampled by first picking a number of acts n_G ∈ {2, . . . , 10} uniformly at random, and then filling the necessary n_G × n_G payoff matrix with i.i.d. sampled numbers, as before.

                 (reg,imp) (obj,imp) (com,imp) (alt,imp) (reg,prc) (obj,prc) (com,prc) (alt,prc)
    (reg,imp)    6.567     6.570     5.650     6.992     6.564     6.564     5.593     7.409
    (obj,imp)    6.476     6.483     5.896     6.818     6.484     6.484     5.850     7.124
    (com,imp)    6.468     6.647     5.512     7.169     6.578     6.578     5.577     7.354
    (alt,imp)    5.968     5.923     5.363     6.685     5.975     5.975     5.086     6.973
    (reg,prc)    6.908     6.918     5.988     7.456     6.929     6.929     5.934     7.783
    (obj,prc)    6.908     6.918     5.988     7.456     6.929     6.929     5.934     7.783
    (com,prc)    6.529     6.680     5.445     7.276     6.542     6.542     5.521     7.440
    (alt,prc)    6.450     6.337     5.772     6.978     6.457     6.457     5.479     7.500

    Table 3: Average evolutionary fitness for 100,000 randomly generated n×n symmetric games with n randomly drawn from {2, . . . , 10}.

The most important result is that the regret minimizing type (reg, imp) is strictly dominated by (reg, prc) and by (obj, prc) in the meta-game from table 3. This means that while simple regret minimization can thrive in some evolutionary contexts, there are also contexts where it is demonstrably worse off. While this may be bad news for regret minimizing types (reg, imp), it is not the case that regret types as such are weeded out by selection. Since, by Fact 1, (reg, prc) and (obj, prc) are behaviorally equivalent in general, it remains that selection based on meta-games constructed from n×n games will still not eradicate regret preferences.
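For the n×n case, the sampling procedure and the two relevant rules generalize as follows (a sketch under the same assumptions as before; with maximally imprecise beliefs the worst case is taken over the co-player's pure acts, and with a precise uniform belief the rule reduces to averaging):

    import random

    def sample_symmetric_game(max_n=10, values=range(11)):
        # M[i][j]: the row player's fitness for own act i against act j.
        n = random.randint(2, max_n)
        return [[random.choice(values) for _ in range(n)] for _ in range(n)]

    def regret_of(M):
        # Subtract from each entry the best payoff against that co-player act.
        n = len(M)
        col_max = [max(M[i][j] for i in range(n)) for j in range(n)]
        return [[M[i][j] - col_max[j] for j in range(n)] for i in range(n)]

    def maxmin_acts_n(U):
        # Maximally imprecise belief: worst case over the co-player's pure acts.
        worst = [min(row) for row in U]
        return [i for i, w in enumerate(worst) if w == max(worst)]

    def laplace_acts_n(U):
        # Precise uniform belief: average payoff across the co-player's acts.
        avg = [sum(row) / len(row) for row in U]
        return [i for i, v in enumerate(avg) if v == max(avg)]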
On the other hand, there are plenty of ways in which the basic insights from Propositions 1 and 2 can make for situations in which evolution would favor regret types, even in n×n games. If, for example, the belief of a player is a trait that biological evolution has no bite on, but rather something that the particular choice situation exogenously provides (possibly because of the different amount of information available in different choice situations), then regret-based preferences can again drive out veridical preferences altogether. For example, suppose that only preference representations compete and that agents' beliefs are exogenously given, in such a way that both players hold precise (Bayesian) uniform beliefs with probability p and maximally imprecise beliefs otherwise. This transforms the meta-game from table 3 into a simpler 4×4 meta-game in which the payoff obtained by a subjective preference is the weighted average over the payoffs of the subjective representations including that preference in table 3. Setting p = 0.98 for illustration, we get the meta-game in table 4.

           reg    obj    com    alt
    reg    6.926  6.926  5.942  7.757
    obj    6.924  6.924  5.948  7.751
    com    6.566  6.570  5.481  7.434
    alt    6.463  6.461  5.478  7.469

    Table 4: Meta-game for the evolutionary competition between subjective utilities when beliefs are exogenously given (see main text).

The only evolutionarily stable state of this meta-game is again a monomorphic population of regret types. Accordingly, all our simulation runs of the (discrete-time) replicator dynamics converge to monomorphic regret-type populations. The reason why regret-based utilities prosper is that they have a substantial fitness advantage when paired with imprecise beliefs (Propositions 1 and 2). If unmeasurable uncertainty is exogenously given as something that happens to agents because of the information available in some choice situations, and even if that happens only very infrequently (i.e., even for p close to 1), regret preferences will outperform objective preferences, as well as competitive and altruistic preferences.

6.3 Solitary Decisions

To see how different choice mechanisms behave in evolutionary competition based on solitary decision making, we approximated, much in the spirit of meta-games, average accumulated fitness obtained in randomly generated solitary decision problems. For our purposes, a decision problem D = ⟨W, A, π^D⟩ consists of a set of states of the world W, a set of acts A, and a payoff function π^D : W × A → R. We generate arbitrary decision problems by selecting, uniformly at random, numbers of states and acts n^D_w, n^D_a ∈ {2, . . . , 10} and then filling the payoff table, so to speak, by i.i.d. samples for each π^D(w, a) ∈ {0, . . . , 10}. Unlike with two-player games, we also need to sample the actual state of the world, which we selected uniformly at random from the available states in the current decision problem. Accordingly, the fitness of choice mechanism c in decision problem D is given by:

    F_D(c) = ∑_{w ∈ W} π^D(w, a*_c(u^D_c, Γ_c)) µ(w), with µ(w) = 1/n^D_w for all w.

As subjective representations, we considered the original cast of four from table 1, since altruistic and competitive types are meaningless in solitary decision situations. As before, the relevant fitness measure,

    F(c) = ∫ P_D(D) F_D(c) dD,    (3)

was approximated by Monte Carlo simulations, the results of which are given in table 5.

    (reg,imp)  (obj,imp)  (reg,prc)  (obj,prc)
    6.318      6.237      6.661      6.661

    Table 5: Expected fitness of choice mechanisms approximated from 100,000 simulated solitary decision problems (see main text).

Facts 1 and 2 still apply: (reg, prc) and (obj, prc) are behaviorally equivalent in general, and (reg, imp) is behaviorally equivalent to the former two in decision problems with two states and two acts. This shows in the results from table 5 in that the averages for (reg, prc) and (obj, prc) are identical. But since we included decision problems with more acts and more states as well, the average for regret minimizers (reg, imp) is not identical to that of (reg, prc) and (obj, prc). It is, in fact, lower, but again not as low as that of (obj, imp).

This means that every relevant result we have seen about game situations is also borne out for solitary decisions. Evolutionary selection based on objective fitness will not select against regret preferences, as these are indistinguishable from veridical preferences when paired with precise beliefs. But when paired with imprecise beliefs, regret-based utilities outperform objective utilities. Consequently, if there is a chance, however small, that agents fall back on imprecise beliefs, evolution will actually positively select for non-veridical regret-based preferences.
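The solitary-decision simulation admits an analogous sketch (ours; the four types of table 1 correspond to the four combinations of the two flags):

    def solitary_fitness(P, use_regret, imprecise):
        # P[w][a]: fitness of act a in state w; the true state is uniform.
        n_w, n_a = len(P), len(P[0])
        U = [[P[w][a] - (max(P[w]) if use_regret else 0) for a in range(n_a)]
             for w in range(n_w)]
        if imprecise:   # maxmin: worst case over states
            score = [min(U[w][a] for w in range(n_w)) for a in range(n_a)]
        else:           # precise uniform belief over states
            score = [sum(U[w][a] for w in range(n_w)) / n_w for a in range(n_a)]
        acts = [a for a in range(n_a) if score[a] == max(score)]
        # Average fitness over the uniformly drawn state and over tied acts:
        return (sum(P[w][a] for w in range(n_w) for a in acts)
                / (n_w * len(acts)))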
Conse- quently, if there is a chance, however small, that agents fall back on imprecise beliefs, evolution will actually positively select for non-veridical regret-based preferences. 6.4 Sophisticated Beliefs Since one of our main purposes was to illustrate the usefulness of a meta-game approach by the case study of objective and regret preferences, we have partially neglected an important and interesting issue, namely the evolution of ways of forming beliefs about co-players’ behavior or the actual state of the world. For reasons of space we must, unfortunately, leave a deeper exploration of belief type evolution to another occasion. Two remarks are in order nonetheless. Firstly, belief type evolution can be studied without conceptual hurdles in the meta-game framework, so that there is no principled argument against the main methodological contribution of this paper. Secondly, our results regarding the comparison between regret and objective types remain to be informative, even if we allow agents to learn or reason strategically.8 This is because we know from Fact 1 that regret and objective preferences come up behaviorally equivalent when paired with precise probabilistic beliefs (given identical decision rule). This holds no matter what the content of that belief is. So, if learning, reasoning or statistical knowledge about a recurrent situation can be brought to bear, this will not make evolution select against regret-based preferences. If, on the other hand, agents resort to imprecise beliefs at least occasionally (e.g., when they are unaware of the co-player or her utilities, or when strategic reasoning cannot reduce all uncertainty about the co-player’s choice), then regret-based preferences can be favored by natural selection over objective preferences. 7 Conclusion The assumption that players and decision makers maximize their (subjective) utility is central through in the economics literature, and the maximization of actual (objective) payoffs is often justified by appealing to evolutionary arguments and natural selection. In contrast to the standard view, we showed the existence of player types with subjective utilities different from the actual evolutionary payoffs that can outperform types whose subjective utilities coincide with the evolutionary payoffs. The claim is not that regret preferences are the best on the market, but rather that utilities that perfectly mirror evolutionary fitness can be outclassed by subjective utilities that differ from the objective fitness. While the literature on evolution of preferences has focused on fixed games, we have adopted a more general approach here. We suggested that attention to “meta-games” is cru- cial, because what may be a good subjective representation in one type of game (e.g., cooperative 8Some research has recently been done along these lines. See in particular Mengel, 2012; Mohlin, 2012; Robalino and Robson, 2016. 14 preferences in the Prisoner’s Dilemma class) need not be generally beneficial. Taken together, we presented a variety of plausible circumstances in which evolutionary competition between choice mechanisms on a larger class of games can favor non-veridical preference representations focusing on regret. A Proofs The proof of Proposition 1 relies on a partition of G, and on some lemmas. For brevity, let us denote the regret minimizer (reg, imp) by R and the maximinimizer (obj, imp) by M. 
Proof of Proposition 1. By definition of strict dominance, we have to show that in the class G of symmetric 2×2 games with payoffs sampled from a set of i.i.d. values with at least 3 elements in the support, it holds that:

(i) F_G(R, R) > F_G(M, R);
(ii) F_G(M, M) < F_G(R, M).

To show this we use the following partition of G, based on payoffs parametrized as follows:

          I    II
    I     a    b
    II    c    d

1. Coordination games C: a > c and d > b;
2. Anti-coordination games A: a < c and d < b;
3. Strong dominance games S: either (a > c and b > d) or (a < c and b < d);
4. Weak dominance games W: either a = c or b = d (but not both);
5. Boring games B: a = c and b = d.

Before proving the lemmas, it is convenient to fix some notation. Let us call x, y, z the 3 elements in the support, and without loss of generality suppose that x > y > z. We denote by C a coordination game in C with payoffs a_C, b_C, c_C, and d_C; similarly for games A ∈ A, S ∈ S, W ∈ W, and B ∈ B. Let us denote by I_{R,C} the event that an R-player plays action I in the game C; and similarly for action II, for player M, and for games A, S, W and B. We first consider the case of i.i.d. sampling with finite support.

Lemma 1. R and M perform equally well in S and in B.

Proof. By definition of regret minimization and maxmin, it is easy to check that whenever a game has a strongly dominant action, that action is both the maxmin action and the regret minimizing action. Then, for all the games in S, R chooses action a if and only if M chooses action a. Consequently, R and M always perform equally (well) in S. In the case of B it is trivial to see that all players perform equally. □

Lemma 2. In W, R strictly dominates M.

Proof. Assume without loss of generality that b = d, and that a > c. There are two cases to check: (i) c < b = d and (ii) c ≥ b = d. In the first case it is easy to see that R and M perform equally: act I is the choice of both R and M. In case (ii) instead, I is the regret minimizing action, whereas both actions have the same minimum and M plays (½ I; ½ II), since both I and II maximize the minimal payoff. Consider now a population of R and M playing games from the class W. Whenever (i) is the case, R and M perform equally well. But suppose W ∈ W and (ii) is the case. Then π_W(R, R) = a > ½a + ½c = π_W(M, R), whereas π_W(M, M) = ¼a + ¼b + ¼c + ¼d < ½a + ½b = π_W(R, M). Hence, we have that in general F_W(R, R) > F_W(M, R), and F_W(M, M) < F_W(R, M). □

Since it is not difficult to see that both (R, R) and (M, M) are strict Nash equilibria in C, and that (R, R) and (M, M) are not Nash equilibria in A, the main part of the proof will be to show that R strictly dominates M in the class C∪A, that is:

(i′) F_{C∪A}(R, R) > F_{C∪A}(M, R);
(ii′) F_{C∪A}(M, M) < F_{C∪A}(R, M).

This part needs a few more lemmas, but first we introduce the following bijective function φ between coordination and anti-coordination games.

Definition 3 (φ). The permutation φ(a, b, c, d) = (c, d, a, b) defines a bijective function φ : C → A that for each coordination game C ∈ C with payoffs (a_C, b_C, c_C, d_C) gives the anti-coordination game A ∈ A with payoffs (a_A, b_A, c_A, d_A) = (c_C, d_C, a_C, b_C). Essentially, φ swaps rows in the payoff matrix.
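In code, φ and the partition predicates are immediate, and the bijection claim can be checked by enumeration (again our own sanity check, not part of the proof):

    from itertools import product

    def phi(game):
        # Swap the rows of the payoff matrix: (a, b, c, d) -> (c, d, a, b).
        a, b, c, d = game
        return (c, d, a, b)

    def is_coordination(g):
        return g[0] > g[2] and g[3] > g[1]      # a > c and d > b

    def is_anti_coordination(g):
        return g[0] < g[2] and g[3] < g[1]      # a < c and d < b

    # phi maps coordination games onto anti-coordination games and back:
    assert all(is_anti_coordination(phi(g)) and phi(phi(g)) == g
               for g in product(range(3), repeat=4) if is_coordination(g))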
Lemma 4. The occurrence probability of C equals that of φ(C): P(φ(C)) = P(C).

Proof. By definition, each game C ≡ (a_C, b_C, c_C, d_C) is such that a_C > c_C and d_C > b_C, and each game A ≡ (a_A, b_A, c_A, d_A) is such that a_A < c_A and d_A < b_A. Given that a, b, c, d are i.i.d. random variables and that a sequence of i.i.d. random variables is exchangeable, it is clear that the probability of (a_C, b_C, c_C, d_C) equals the probability of (c_C, d_C, a_C, b_C). Hence, P(φ(C)) = P(C). □

Lemma 5. Let P(E) be the probability of event E; e.g., P(I_{R,C}) is the probability that a random R-player plays act I in coordination game C, which is either 0, 0.5 or 1. It then holds that:

· P(I_{R,C}) = P(II_{R,φ(C)}), and P(II_{R,C}) = P(I_{R,φ(C)});
· P(I_{M,C}) = P(II_{M,φ(C)}), and P(II_{M,C}) = P(I_{M,φ(C)}).

Proof. It is easy to check that if b_C − d_C > c_C − a_C, an R-player plays action I in C; that if b_C − d_C < c_C − a_C, R plays II; and that if b_C − d_C = c_C − a_C, an R-player is indifferent between I and II in C, and so randomizes with (½ I; ½ II). Similarly, if a_A − c_A > d_A − b_A, an R-player plays action I in A; if a_A − c_A < d_A − b_A, R plays II; and if a_A − c_A = d_A − b_A, an R-player is indifferent between I and II in A, and randomizes with (½ I; ½ II). Consequently, if b_C − d_C > c_C − a_C, then P(I_{R,C}) = 1, and by definition of φ we have P(II_{R,φ(C)}) = 1. Likewise, if b_C − d_C < c_C − a_C, then P(II_{R,C}) = 1 = P(I_{R,φ(C)}); and if b_C − d_C = c_C − a_C, then P(I_{R,C}) = P(II_{R,C}) = ½ = P(II_{R,φ(C)}) = P(I_{R,φ(C)}). In the same way, in coordination games we have that if b_C > c_C, an M-player plays I; if c_C > b_C, an M-player plays II; and if b_C = c_C, M is indifferent between I and II, and plays (½ I; ½ II). In anti-coordination games instead, if a_A > d_A, M plays I; if a_A < d_A, M plays II; if a_A = d_A, M plays (½ I; ½ II). By definition of φ: P(I_{M,C}) = 1 = P(II_{M,φ(C)}) if b_C > c_C; P(II_{M,C}) = 1 = P(I_{M,φ(C)}) if c_C > b_C; and P(I_{M,C}) = P(II_{M,C}) = ½ = P(II_{M,φ(C)}) = P(I_{M,φ(C)}) if b_C = c_C. □

Lemma 6. It holds that:

· a_C > d_C → (I_{M,C} ⊆ I_{R,C});
· a_C = d_C → I_{M,C} = I_{R,C};
· a_C < d_C → (II_{M,C} ⊆ II_{R,C}).

Proof. The event I_{R,C}, that R plays action I with positive probability, is the event that b_C − d_C ≥ c_C − a_C: if b_C − d_C > c_C − a_C, R plays I, and if b_C − d_C = c_C − a_C, R plays (½ I; ½ II). Similarly, the event I_{M,C} has positive occurrence iff b_C ≥ c_C: if b_C > c_C, M plays I, and if b_C = c_C, M plays (½ I; ½ II). Then, I_{R,C} implies that b_C − d_C ≥ c_C − a_C, and I_{M,C} implies that b_C ≥ c_C. Moreover, on the assumption that a_C > d_C, it is easy to check that b_C ≥ c_C implies b_C − d_C > c_C − a_C. Hence, in any C with a_C > d_C it holds that I_{M,C} implies I_{R,C}, i.e., a_C > d_C → (I_{M,C} ⊆ I_{R,C}). Instead, it is possible that a_C > d_C, b_C − d_C > c_C − a_C and b_C < c_C hold simultaneously, so that I_{M,C} ⊉ I_{R,C}. By a symmetric argument it can be shown that a_C < d_C → (II_{M,C} ⊆ II_{R,C}) too. Finally, when a_C = d_C it holds that: b_C − d_C > c_C − a_C iff b_C > c_C; b_C − d_C < c_C − a_C iff b_C < c_C; and b_C − d_C = c_C − a_C iff b_C = c_C. Hence, a_C = d_C → I_{M,C} = I_{R,C}. □

We are now ready to prove that F_{C∪A}(R, R) > F_{C∪A}(M, R).
With notation like P(I_{R,C} ∩ I_{R,C}) denoting the probability that a random R-player plays I and another R-player plays I as well in game C, rewrite the inequality as:

    ∑_{C∈C} P(C)[P(I_{R,C} ∩ I_{R,C})·a_C + P(II_{R,C} ∩ II_{R,C})·d_C + P(I_{R,C} ∩ II_{R,C})·b_C + P(II_{R,C} ∩ I_{R,C})·c_C]
    + ∑_{A∈A} P(A)[P(I_{R,A} ∩ I_{R,A})·a_A + P(II_{R,A} ∩ II_{R,A})·d_A + P(I_{R,A} ∩ II_{R,A})·b_A + P(II_{R,A} ∩ I_{R,A})·c_A]
    >
    ∑_{C∈C} P(C)[P(I_{R,C} ∩ I_{M,C})·a_C + P(II_{R,C} ∩ II_{M,C})·d_C + P(I_{R,C} ∩ II_{M,C})·c_C + P(II_{R,C} ∩ I_{M,C})·b_C]
    + ∑_{A∈A} P(A)[P(I_{R,A} ∩ I_{M,A})·a_A + P(II_{R,A} ∩ II_{M,A})·d_A + P(I_{R,A} ∩ II_{M,A})·c_A + P(II_{R,A} ∩ I_{M,A})·b_A]

By Lemma 4 and Lemma 5, we can express everything in terms of C only:

    ∑_C P(C)[P(I_{R,C} ∩ I_{R,C})·a_C + P(II_{R,C} ∩ II_{R,C})·d_C + P(I_{R,C} ∩ II_{R,C})·b_C + P(II_{R,C} ∩ I_{R,C})·c_C
             + P(II_{R,C} ∩ II_{R,C})·c_C + P(I_{R,C} ∩ I_{R,C})·b_C + P(II_{R,C} ∩ I_{R,C})·d_C + P(I_{R,C} ∩ II_{R,C})·a_C]
    >
    ∑_C P(C)[P(I_{R,C} ∩ I_{M,C})·a_C + P(II_{R,C} ∩ II_{M,C})·d_C + P(I_{R,C} ∩ II_{M,C})·c_C + P(II_{R,C} ∩ I_{M,C})·b_C
             + P(II_{R,C} ∩ II_{M,C})·c_C + P(I_{R,C} ∩ I_{M,C})·b_C + P(II_{R,C} ∩ I_{M,C})·a_C + P(I_{R,C} ∩ II_{M,C})·d_C]

This simplifies to:

    ∑_C P(C)[a_C·(P(I_{R,C} ∩ I_{R,C}) + P(I_{R,C} ∩ II_{R,C})) + b_C·(P(I_{R,C} ∩ II_{R,C}) + P(I_{R,C} ∩ I_{R,C}))
             + c_C·(P(II_{R,C} ∩ I_{R,C}) + P(II_{R,C} ∩ II_{R,C})) + d_C·(P(II_{R,C} ∩ II_{R,C}) + P(II_{R,C} ∩ I_{R,C}))]
    >
    ∑_C P(C)[a_C·(P(I_{R,C} ∩ I_{M,C}) + P(II_{R,C} ∩ I_{M,C})) + b_C·(P(II_{R,C} ∩ I_{M,C}) + P(I_{R,C} ∩ I_{M,C}))
             + c_C·(P(I_{R,C} ∩ II_{M,C}) + P(II_{R,C} ∩ II_{M,C})) + d_C·(P(II_{R,C} ∩ II_{M,C}) + P(I_{R,C} ∩ II_{M,C}))]

Now let us split into a > d and a < d, and consider a > d first. Notice that, by Lemma 6, the case a = d is irrelevant in order to discriminate between R and M. If a > d, by Lemma 6 we can eliminate the cases where R plays II and M plays I:

    ∑_{C: a_C>d_C} P(C)[a_C·(P(I_{R,C} ∩ I_{R,C}) + P(I_{R,C} ∩ II_{R,C})) + b_C·(P(I_{R,C} ∩ II_{R,C}) + P(I_{R,C} ∩ I_{R,C}))
             + c_C·(P(II_{R,C} ∩ I_{R,C}) + P(II_{R,C} ∩ II_{R,C})) + d_C·(P(II_{R,C} ∩ II_{R,C}) + P(II_{R,C} ∩ I_{R,C}))]
    >
    ∑_{C: a_C>d_C} P(C)[a_C·P(I_{R,C} ∩ I_{M,C}) + b_C·P(I_{R,C} ∩ I_{M,C})
             + c_C·(P(I_{R,C} ∩ II_{M,C}) + P(II_{R,C} ∩ II_{M,C})) + d_C·(P(II_{R,C} ∩ II_{M,C}) + P(I_{R,C} ∩ II_{M,C}))]

Rewrite:

    ∑_{C: a_C>d_C} P(C)[a_C·(P(I_{R,C} ∩ I_{R,C}) + P(I_{R,C} ∩ II_{R,C}) − P(I_{R,C} ∩ I_{M,C}))
             + b_C·(P(I_{R,C} ∩ II_{R,C}) + P(I_{R,C} ∩ I_{R,C}) − P(I_{R,C} ∩ I_{M,C}))
             + c_C·(P(II_{R,C} ∩ I_{R,C}) + P(II_{R,C} ∩ II_{R,C}) − P(I_{R,C} ∩ II_{M,C}) − P(II_{R,C} ∩ II_{M,C}))
             + d_C·(P(II_{R,C} ∩ II_{R,C}) + P(II_{R,C} ∩ I_{R,C}) − P(II_{R,C} ∩ II_{M,C}) − P(I_{R,C} ∩ II_{M,C}))] > 0

We now distinguish between two cases: (1) a − c = d − b and (2) a − c ≠ d − b. Notice that P(I_{R,C} ∩ II_{R,C}) ≠ 0 if and only if case (1) obtains, and that a > d and (1) imply II_{M,C}. Then, from (1) we have[9]:

    ∑_{C: a_C>d_C} P(C)[a_C·(P(I_{R,C} ∩ I_{R,C}) + P(I_{R,C} ∩ II_{R,C})) + b_C·(P(I_{R,C} ∩ II_{R,C}) + P(I_{R,C} ∩ I_{R,C}))
             + c_C·(P(II_{R,C} ∩ I_{R,C}) + P(II_{R,C} ∩ II_{R,C}) − P(I_{R,C} ∩ II_{M,C}) − P(II_{R,C} ∩ II_{M,C}))
             + d_C·(P(II_{R,C} ∩ II_{R,C}) + P(II_{R,C} ∩ I_{R,C}) − P(II_{R,C} ∩ II_{M,C}) − P(I_{R,C} ∩ II_{M,C}))] > 0

that is

    ∑_{C: a_C>d_C} P(C)[a_C·(¼ + ¼) + b_C·(¼ + ¼) + c_C·(¼ + ¼ − ½ − ½) + d_C·(¼ + ¼ − ½ − ½)] > 0

Since we have assumed a − c = d − b, the last inequality is not satisfied. We have instead:

    ∑_{C: a_C>d_C} P(C)[½ a_C + ½ b_C − ½ c_C − ½ d_C] = 0

This means that where a_C > d_C and where (1) is the case, R and M are equally fit. This changes when we turn to (2). In that case, since a_C > d_C → (I_{M,C} ⊆ I_{R,C}) by Lemma 6, we have that P(I_{R,C} ∩ I_{R,C}) − P(I_{R,C} ∩ I_{M,C}) = P(I_{R,C} ∩ II_{M,C}). Moreover, when a_C > d_C, b_C ≥ c_C implies b_C − d_C > c_C − a_C (see Lemma 6). Consequently, when M plays either I or (½ I; ½ II), R always plays I. Hence, whenever a_C > d_C and (2) obtain, it also holds that P(II_{R,C} ∩ II_{M,C}) = P(II_{R,C} ∩ II_{R,C}).
In this case we can simplify:

    ∑_{C: a_C>d_C} P(C)[a_C·(P(I_{R,C} ∩ I_{R,C}) − P(I_{R,C} ∩ I_{M,C})) + b_C·(P(I_{R,C} ∩ I_{R,C}) − P(I_{R,C} ∩ I_{M,C}))
             + c_C·(P(II_{R,C} ∩ II_{R,C}) − P(I_{R,C} ∩ II_{M,C}) − P(II_{R,C} ∩ II_{M,C}))
             + d_C·(P(II_{R,C} ∩ II_{R,C}) − P(II_{R,C} ∩ II_{M,C}) − P(I_{R,C} ∩ II_{M,C}))] > 0

    ∑_{C: a_C>d_C} P(C)[a_C·P(I_{R,C} ∩ II_{M,C}) + b_C·P(I_{R,C} ∩ II_{M,C}) − c_C·P(I_{R,C} ∩ II_{M,C}) − d_C·P(I_{R,C} ∩ II_{M,C})] > 0

    ∑_{C: a_C>d_C} P(C)[P(I_{R,C} ∩ II_{M,C})·(a_C + b_C − c_C − d_C)] > 0

[9] Note that when we have only 3 elements in the support it is not guaranteed that case (1), together with a > d, may arise in a coordination game, whereas it is guaranteed that case (2), together with a > d, occurs with some positive probability. If we take for instance x = 5, y = 2, z = 1, then case (1) cannot obtain, whereas if we take x = 3, y = 2, z = 1, both (1) and (2) may obtain (a = 3, b = 1, c = 2, d = 2 for case (1), and a = 3, b = 1, c = 1, d = 2 for case (2)). Moreover, under the assumption that a > d, having 3 elements in the support is a necessary and sufficient condition for case (2) to have positive occurrence in a coordination game. As will be clear in the following, a positive occurrence of case (2) alone is enough for the theorem to hold.

We know that I_{R,C} implies that a_C − c_C ≥ d_C − b_C. Since we have assumed that a_C − c_C ≠ d_C − b_C, we have that a_C − c_C > d_C − b_C. Hence, the inequality

    ∑_{C: a_C>d_C} P(C)[P(I_{R,C} ∩ II_{M,C})·(a_C + b_C − c_C − d_C)] > 0

is satisfied. So, when a_C > d_C, R strictly dominates M. Symmetrically, from a < d and by distinguishing between the two cases (1) and (2) as before, in the end we get:

    (1) ∑_{C: a_C<d_C} P(C)[½ c_C + ½ d_C − ½ a_C − ½ b_C] = 0;
    (2) ∑_{C: a_C<d_C} P(C)[P(II_{R,C} ∩ I_{M,C})·(c_C + d_C − a_C − b_C)] > 0.

Hence, we can conclude that R strictly dominates M in the class C∪A. Notice that in case of i.i.d. sampling with continuous support, games in B and W never arise, but the proof is the same for the remaining games in S, C and A.

It remains to be shown that F_{C∪A}(M, M) < F_{C∪A}(R, M). As before, spell this out as:

    ∑_{C∈C} P(C)[P(I_{M,C} ∩ I_{M,C})·a_C + P(II_{M,C} ∩ II_{M,C})·d_C + P(I_{M,C} ∩ II_{M,C})·b_C + P(II_{M,C} ∩ I_{M,C})·c_C]
    + ∑_{A∈A} P(A)[P(I_{M,A} ∩ I_{M,A})·a_A + P(II_{M,A} ∩ II_{M,A})·d_A + P(I_{M,A} ∩ II_{M,A})·b_A + P(II_{M,A} ∩ I_{M,A})·c_A]
    <
    ∑_{C∈C} P(C)[P(I_{R,C} ∩ I_{M,C})·a_C + P(II_{R,C} ∩ II_{M,C})·d_C + P(I_{R,C} ∩ II_{M,C})·b_C + P(II_{R,C} ∩ I_{M,C})·c_C]
    + ∑_{A∈A} P(A)[P(I_{R,A} ∩ I_{M,A})·a_A + P(II_{R,A} ∩ II_{M,A})·d_A + P(I_{R,A} ∩ II_{M,A})·b_A + P(II_{R,A} ∩ I_{M,A})·c_A]

When a > d, similarly to the above derivation, we get:

    ∑_{C: a_C>d_C} P(C)[a_C·(P(I_{M,C} ∩ I_{M,C}) + P(I_{M,C} ∩ II_{M,C}) − P(I_{R,C} ∩ I_{M,C}) − P(I_{R,C} ∩ II_{M,C}))
             + b_C·(P(I_{M,C} ∩ I_{M,C}) + P(I_{M,C} ∩ II_{M,C}) − P(I_{R,C} ∩ I_{M,C}) − P(I_{R,C} ∩ II_{M,C}))
             + c_C·(P(II_{M,C} ∩ I_{M,C}) + P(II_{M,C} ∩ II_{M,C}) − P(II_{R,C} ∩ II_{M,C}))
             + d_C·(P(II_{M,C} ∩ II_{M,C}) + P(II_{M,C} ∩ I_{M,C}) − P(II_{R,C} ∩ II_{M,C}))] < 0

We now distinguish between (1) b = c, (2) b > c, and (3) b < c. Notice that either (1) or (2), together with a > d, implies I_{R,C}. Then we obtain:[10]

    (1) ∑_{C: a_C>d_C} P(C)[−½ a_C − ½ b_C + ½ c_C + ½ d_C] < 0;
    (2) ∑_{C: a_C>d_C} P(C)[a_C·(P(I_{M,C} ∩ I_{M,C}) − P(I_{R,C} ∩ I_{M,C})) + b_C·(P(I_{M,C} ∩ I_{M,C}) − P(I_{R,C} ∩ I_{M,C}))] = 0;
    (3) ∑_{C: a_C>d_C} P(C)[a_C·(−P(I_{R,C} ∩ II_{M,C})) + b_C·(−P(I_{R,C} ∩ II_{M,C}))
             + c_C·(P(II_{M,C} ∩ II_{M,C}) − P(II_{R,C} ∩ II_{M,C})) + d_C·(P(II_{M,C} ∩ II_{M,C}) − P(II_{R,C} ∩ II_{M,C}))] ≤ 0.

[10] Note that here, when we only have 3 elements in the support, case (2) is impossible, but cases (1) and (3) can occur with positive probability, and this is enough for our purpose.

[Figure 2: Example of a strictly informative coordination game.]

When a < d, the derivation proceeds symmetrically and we get:

    (1) ∑_{C: a_C<d_C} …