Smart Representations: Rationality and Evolution in a Richer Environment

Paolo Galeazzi and Michael Franke

Abstract. Standard applications of evolutionary game theory look at a single, fixed game and focus on the evolution of behavior for that game alone. Instead, this paper uses tools from evolutionary game theory to study the evolutionary competition between choice mechanisms in a rich and variable multi-game environment. A choice mechanism is a way of subjectively representing a decision situation, paired with a method for choosing an act based on this subjective representation. We demonstrate the usefulness of this approach by a case study that shows how subjective representations in terms of regret that differ from the actual fitness can be evolutionarily advantageous.

1 Introduction

If agents deal with a rich and variable environment, they have to face many different choice situations. Standard evolutionary game models frequently simplify reality in at least two ways. Firstly, the environment is represented as a fixed stage game; secondly, the focus of evolutionary selection is behavior for that stage game alone. In contrast, some argue for studying the evolutionary competition of general choice mechanisms in a rich and variable environment (e.g., Fawcett, Hamblin, and Giraldeau, 2013; Hammerstein and Stevens, 2012; McNamara, 2013). In response to this and adding to recent like-minded approaches, this paper introduces a general meta-game model that conservatively extends the scope of evolutionary game theory to deal with evolutionary selection of choice mechanisms in variable environments (see also Bednar and Page, 2007; Harley, 1981; O'Connor, forthcoming; Rayo and Becker, 2007; Skyrms and Zollman, 2010; Smead and Zollman, 2013; Zollman, 2008; Zollman and Smead, 2010).[1]

[1] Some of these contributions are very closely related to ours. Bednar and Page (2007) use a multi-game framework, composed of a fixed selection of six possible games, to study the emergence of different cultural behaviors, and model agents as finite-state automata playing games from the fixed selection. Zollman (2008) explains seemingly "irrational" fair behavior in social dilemmas (like the Ultimatum game) by means of a model where agents have to play the Ultimatum game together with the Nash bargaining game, but they are constrained to choose the same strategy for both games. Finally, Rayo and Becker (2007) consider, in a more decision-theoretic setting, what subjective utility function a cognitively limited agent should be endowed with in order to maximize her evolutionary fitness. Our framework can then be viewed as a generalization of those models, mainly in that here players do not necessarily have any specific cognitive limitations, and we allow for larger and possibly variable classes of games.

A choice mechanism associates decision situations with action choices. A crucial part of a choice mechanism is the subjective representation of the decision situation, in particular the manner of forming preferences and beliefs about a possibly uncertain world. To show the usefulness of the meta-game approach, this paper asks: which preference and belief representations are ecologically valuable and lead to high fitness? The evolution of preferences has been the subject of recent interest in theoretical economics (e.g., Alger and Weibull, 2013; Dekel, Ely, and Yilankaya, 2007; Robson and Samuelson, 2011). Here, we argue that questions of preference evolution should take variability in uncertainty representation into account as well.
We demonstrate that if agents have imprecise probabilistic beliefs (e.g., Gärdenfors and Sahlin, 1982; Levi, 1974; Walley, 1996), faithful and objective representations in terms of true evolutionary fitness can be outperformed by subjective (e.g., regret-based) preference representations that deviate from the true fitness that natural selection operates on.

The paper is organized as follows. Section 2 sets the scene by reviewing different perspectives on rational choice. Section 3 introduces the meta-game approach. In doing so, it covers key notions such as choice mechanisms, decision rules and subjective representations, all with an eye towards the evolutionary application of Section 4. Section 5 contains the main results for that application, and Section 6 discusses some interesting extensions. Finally, Section 7 concludes.

2 Rationality and Subjective Representations

The standard textbook definition of rationality in economics and decision theory traces back to the seminal work by de Finetti (1937), von Neumann and Morgenstern (1944) and Savage (1954). It says that a choice is rational only if it maximizes (subjective) expected utility. Expected utility is subjective in the sense that it is a function of subjective beliefs and subjective preferences of the decision maker (DM). To wit, a choice can be rational, i.e., the best choice from the DM's point of view, even if based on peculiar beliefs and/or aberrant preferences.

If beliefs and preferences are subjective, there is room for rationalization or redescriptionism of observable behavior. For example, in the case of social decision making, including considerations of fairness allows us to describe as rational empirically observed behavior, such as in experimental Prisoner's Dilemmas or public goods games, that might otherwise appear irrational (e.g., Charness and Rabin, 2002; Fehr and Schmidt, 1999).

The main objection to redescriptionism is that, without additional constraints, the notion of rationality is likely to collapse, as it seems possible to deem rational almost everything that is observed, given the freedom to adjust beliefs and preferences at will. Normativism therefore emphasizes that there are many ways in which ascriptions of beliefs and preferences should be constrained by normative considerations of rationality as well: e.g., subjective beliefs should reflect objective chance where possible; subjective preferences should be oriented towards tracking objective fitness. For instance, profit maximization seems a necessary requirement for evolution in a competitive market because only firms behaving according to profit maximization will survive in the long run (e.g., Alchian, 1950; Friedman, 1953).

An alternative view on the rationality of choice is adaptationism (e.g., Anderson, 1991; Chater and Oaksford, 2000; Hagen et al., 2012). Adaptationism aims to explain rational behavior by appealing to evolutionary considerations: DMs have acquired choice mechanisms that have proved to be adaptive with respect to the variable environment where they have evolved. A choice mechanism can be a set of distinct heuristics (the DM's adaptive toolbox) that have little in common (e.g., Gigerenzer and Goldstein, 1996; Scheibehenne, Rieskamp, and Wagenmakers, 2013; Tversky and Kahneman, 1981).
But to relate closely to the literature on the evolution of preferences and to the philosophical debate about the nature of rational choice, we here suggest thinking of a choice mechanism as a map from choice situations to action choices which includes an explicit level of subjective representation of the situation. Specifically, a subjective representation is a general way of forming preferences and beliefs about the choice situation. We are most interested in the question of which subjective representations, and which choice mechanisms in general, are better than others from an evolutionary point of view.

3 Choice Mechanisms and Meta-Games

We view a choice mechanism as the combination of three different things: a subjective utility (or preference), a subjective belief, and a decision rule. In general, the agent's action choice will depend both on the agent's utility at different possible outcomes of the choice situation and on the agent's beliefs about the realization of these outcomes. The decision rule then combines the agent's subjective utility and belief, and dictates how the agent should act: a decision rule is a function that associates an action choice with the agent's utility and beliefs:

    Decision Rule: Utility × Beliefs → Actions.

The subjective utility of an agent can be formally expressed by a function u : W × A → R, where A stands for a (finite) set of actions available to the agent and W is a (finite) set of possible states of the world. There are many different ways to describe beliefs, but for concreteness of later applications we here assume that the agent's beliefs are represented in terms of a (possibly singleton) convex compact set of probability functions Γ ⊆ ∆(W) over the possible states of the world. Given a utility u and a belief Γ, examples of well-known decision rules from the literature that we will encounter later are:

1. Maxmin: a*(u, Γ) = argmax_{a ∈ A} min_{µ ∈ Γ} ∑_{w ∈ W} u(w, a) µ(w)

2. Maximax: a*(u, Γ) = argmax_{a ∈ A} max_{µ ∈ Γ} ∑_{w ∈ W} u(w, a) µ(w)

3. Laplace rule: a*(u, Γ) = argmax_{a ∈ A} ∑_{w ∈ W} (1/|W|) u(w, a)

4. Expected utility maximization (for Γ = {µ} a singleton): a*(u, Γ) = argmax_{a ∈ A} ∑_{w ∈ W} u(w, a) µ(w)

It is worth noticing that both maxmin and maximax boil down to expected utility maximization when the set Γ is a singleton, and in turn expected utility maximization reduces to the Laplace rule when the belief µ is a uniform probability over the states. As mentioned previously, for a choice mechanism to prescribe an action, the decision rule needs to be given a specific utility u and belief Γ as input. We call the pair (u, Γ) a subjective representation of the decision situation. In the following, we investigate the evolutionary fitness of general and systematic ways of forming such subjective representations across many different decision situations.
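To fix ideas, the four rules can be spelled out in a few lines of code. The following is a minimal Python sketch (ours, not part of the paper; all function names are our own), in which an imprecise belief Γ is approximated by a finite set of probability vectors over W:

    # Decision rules for finite W (states) and A (acts).
    # u: dict mapping (w, a) to a real; Gamma: list of probability vectors over W.

    def eu(u, mu, a):
        # Expected utility of act a under a single probability vector mu.
        return sum(mu[w] * u[(w, a)] for w in range(len(mu)))

    def maxmin(u, Gamma, A):
        # argmax_a min_{mu in Gamma} sum_w u(w, a) mu(w)
        return max(A, key=lambda a: min(eu(u, mu, a) for mu in Gamma))

    def maximax(u, Gamma, A):
        # argmax_a max_{mu in Gamma} sum_w u(w, a) mu(w)
        return max(A, key=lambda a: max(eu(u, mu, a) for mu in Gamma))

    def laplace(u, A, n_states):
        # argmax_a sum_w (1/|W|) u(w, a)
        uniform = [1 / n_states] * n_states
        return max(A, key=lambda a: eu(u, uniform, a))

    def eu_max(u, mu, A):
        # Expected utility maximization: maxmin/maximax with Gamma = {mu}.
        return max(A, key=lambda a: eu(u, mu, a))

Note that Python's max returns a single maximizer; ties, which the paper resolves by uniform randomization (see footnote 3 below), would need to be collected explicitly, as in the later sketches.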
A fitness game is an interactive decision situation. For a given fitness game G = ⟨N, (A_i, π^G_i)_{i∈N}⟩, let us denote the evolutionary payoff, or fitness, of player i by the function π^G_i : Π_{i∈N} A_i → R, where A_i is player i's (finite) set of actions. For simplicity of exposition we assume that all games that are played are symmetric two-player games where N := {1, 2}, A_1 = A_2 and π^G_1(a, a′) = π^G_2(a′, a) =: π^G(a, a′).[2] The fitness of a choice mechanism c with decision rule a*_c and subjective representation (u_c, Γ_c) is measured in terms of the expected evolutionary payoff of c. Formally, the fitness of choice mechanism c against choice mechanism c′ in a symmetric two-player game G = ⟨{1, 2}, A, π^G⟩ is given by:

    F_G(c, c′) = π^G(a*_c(u^G_c, Γ_c), a*_{c′}(u^G_{c′}, Γ_{c′})).[3]

[2] Since payoff functions are symmetric, we simply write π^G(a, a′) for π^G_1(a, a′) and A := A_1 = A_2, as usual. However, notice that all definitions and results can be extended to more general cases.

[3] Whenever a choice mechanism would not select a unique action, we assume that the player chooses one of the equally optimal actions at random, i.e., F_G(c, c′) = ∑_{a ∈ a*_c(u^G_c, Γ_c)} ∑_{a′ ∈ a*_{c′}(u^G_{c′}, Γ_{c′})} (1/|a*_c(u^G_c, Γ_c)|) (1/|a*_{c′}(u^G_{c′}, Γ_{c′})|) π^G(a, a′).

Given the game-theoretic setting, the subjective utility u^G_c is now a function u^G_c : A × A → R, and the subjective belief Γ_c is a set of probability functions over the co-player's actions, Γ_c ⊆ ∆(A).

Going beyond a single fixed fitness game, we consider a class of possible games. For concreteness, let G be a class of two-player symmetric games, together with a probability measure P_G(G) for the occurrence probability of game G ∈ G. Intuitively, the probability P_G encodes the statistical properties of the environment. A meta-game is then a tuple M_G = ⟨CM, G, P_G, F⟩, where CM is a set of choice mechanisms, G is a class of possible games, P_G(G) is the probability of game G to occur, and F : CM × CM → R is the (meta-)fitness function, defined as:

    F(c, c′) = ∫ P_G(G) F_G(c, c′) dG.    (1)

Hence, F(c, c′) determines the evolutionary payoff of choice mechanism c against c′ in the meta-game. The set CM can be thought of as the set of choice mechanisms that are present within a given population playing the games from the class G. Consequently, it is possible to compute the average fitness of c against the population, which is given by:

    F(c) = ∫ P_M(c′) F(c, c′) dc′ = ∫∫ P_G(G) P_M(c′) F_G(c, c′) dc′ dG,    (2)

where P_M(c′) is the probability of encountering a co-player with choice mechanism c′.

Meta-games are then abstract models for the evolutionary competition between choice mechanisms in interactive decision making contexts. Standard notions of evolutionary game theory apply to meta-games as well. For example, a choice mechanism c is a strict Nash equilibrium if F(c, c) > F(c′, c) for all c′ ≠ c; it is evolutionarily stable if for all c′ ≠ c either (i) F(c, c) > F(c′, c) or (ii) F(c, c) = F(c′, c) and F(c, c′) > F(c′, c′); it is neutrally stable if for all c′ ≠ c either (i) F(c, c) > F(c′, c) or (ii) F(c, c) = F(c′, c) and F(c, c′) ≥ F(c′, c′) (Maynard Smith, 1982). Similarly, evolutionary dynamics can be applied to meta-games. Later we will also turn towards a dynamical analysis in terms of replicator dynamics (Taylor and Jonker, 1978) and replicator mutator dynamics (e.g., Nowak, 2006).

4 Evolution of Preferences

To demonstrate the usefulness of a meta-game approach, we compare a selection of general ways of forming belief and preference representations against each other. As for subjective preferences, consider initially:

1. the objective utility, defined by: for all G ∈ G, obj_G(a, a′) = π^G(a, a′);

2. the regret, defined by: for all G ∈ G, reg_G(a, a′) = π^G(a, a′) − max_{a″ ∈ A} π^G(a″, a′).

As motivation for this comparison, it is to be stressed that regret minimization is one of the main alternatives to utility (or value) maximization in the literature on decision criteria (see also Bleichrodt and Wakker, 2015). For a start, the subjective beliefs that we take into consideration are also two:

1. prc: a precise uniform belief µ such that µ(a) = 1/|A| for all a ∈ A;

2. imp: a maximally imprecise belief Γ = ∆(A).
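In code, and under our own encoding of a symmetric 2×2 fitness game as the row player's payoff tuple (a, b, c, d), the two utilities and the two beliefs might look as follows (a sketch; names and encoding are our own assumptions):

    # A symmetric 2x2 game is encoded as (a, b, c, d): the row player's
    # fitness for the act pairs (I,I), (I,II), (II,I), (II,II).

    def obj_utility(game):
        # Objective utility: identical to the fitness payoffs.
        return list(game)

    def reg_utility(game):
        # Regret: own payoff minus the best payoff against the same co-player act.
        a, b, c, d = game
        return [a - max(a, c), b - max(b, d), c - max(a, c), d - max(b, d)]

    # Beliefs about the probability of the co-player choosing act II.
    # For the maxmin rule, a convex set of beliefs can be represented by its
    # extreme points, since expected utility is linear in the belief.
    PRC = [0.5]        # precise uniform belief (singleton)
    IMP = [0.0, 1.0]   # maximally imprecise belief, Gamma = Delta(A)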
Although a thorough discussion of this issue goes beyond the scope of this work, let us say that these two kinds of belief underlie two different and alternative views on uncertainty. Faced with uncertain events, a strict Bayesian will always form a precise belief, specified by a single probability µ. In the absence of any information about future uncertain events, the Bayesian would mostly invoke the principle of insufficient reason, and accordingly choose a uniform probability over the possible outcomes. In contrast, others have argued against the obligation of representing a belief by means of a single probability measure, in opposition to the Bayesian paradigm (e.g., Gilboa and Marinacci, 2013). They argue instead in favor of a more encompassing account, according to which uncertainty can be unmeasurable, and represented by a (convex and compact) set of probabilities (e.g., Gilboa and Schmeidler, 1989). This line of thought has its origin in decision theory, motivated by Ellsberg's famous paradoxes (Ellsberg, 1961), and appears extremely relevant in game-theoretic contexts too. Indeed, in a recent paper Battigalli et al. (2015) write:

    Such [unmeasurable] uncertainty is inherent in situations of strategic interaction. This is quite obvious when such situations have been faced only a few times. (p. 646)

In evolutionary game theory, for instance, players obviously face uncertainty about the composition of the population that they are part of, and consequently about the (type of) co-player that they are randomly paired with at each round and about the co-player's action. In case of complete lack of information about the composition of the population, a non-Bayesian player would thus entertain maximal unmeasurable uncertainty, i.e., a maximally imprecise belief.[4] As already anticipated, we will see that the way agents form beliefs, and the possibility of holding imprecise beliefs in particular, can have a fundamental impact on their evolutionary success.

[4] Such a radical uncertainty could ensue, for example, if agents have no conception of their co-player or her preferences. Unsophisticated agents, as considered in evolutionary game theory, might be entirely unaware of the fact that they are engaged in social decision making (see Heifetz, Meier, and Schipper, 2013, for game-theoretic models of unawareness). It is therefore not ludicrous to consider radical uncertainty first and tend to more sophisticated ways of forming beliefs later (more on this below).

As for the decision rule, we assume that players use the maxmin rule. This is in line with many representation results of decision making under unmeasurable uncertainty (e.g., Ghirardato and Marinacci, 2002; Gilboa and Schmeidler, 1989), and seems corroborated by empirical findings too. Ellsberg's paradoxes are prominent examples (Ellsberg, 1961), and evidence from the experimental literature suggests that agents are generally averse to unmeasurable uncertainty (e.g., Trautmann and van de Kuilen, 2016). Finally, note that when the maxmin rule acts on subjective representations of type (obj, imp), i.e., objective preferences and imprecise beliefs, the generated behavior corresponds to the classic maxmin strategy (von Neumann and Morgenstern, 1944). When the maxmin rule acts on subjective representation (reg, imp), the agent's behavior is known as regret minimization.[5]

[5] The notion of regret in decision theory dates back at least to the work by Savage (1951), and has later been developed by Bell (1982), Fishburn (1982) and Loomes and Sugden (1982) independently. Recently, Halpern and Pass (2012) showed how the use of regret minimization can give solutions to game-theoretic puzzles (like the Traveller's dilemma and the Centipede game) in a way that is closer to everyday intuition and empirical data. In this paper the notion of regret defined earlier is the same as in Halpern and Pass (2012).
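Continuing our sketch, the maxmin rule over these representations can be written as below; applied to (obj, IMP) it generates the classic maxmin strategy, applied to (reg, IMP) regret minimization. The game used for illustration is the coordination game shown as figure 1 below:

    def maxmin_acts(u, belief):
        # All acts (0 = I, 1 = II) whose worst-case expected utility is maximal;
        # `belief` lists the extreme probabilities of the co-player playing II.
        def worst(act):
            return min((1 - q) * u[2 * act] + q * u[2 * act + 1] for q in belief)
        vals = [worst(0), worst(1)]
        return [act for act in (0, 1) if vals[act] == max(vals)]

    game = (1, 0, 0, 2)
    print(maxmin_acts(obj_utility(game), IMP))  # [0, 1]: classic maxmin is indifferent
    print(maxmin_acts(reg_utility(game), IMP))  # [1]: regret minimization selects II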
Two facts follow from these observations. The first is related to our focus on different types of uncertainty that players may entertain.

Fact 1. For any precise (Bayesian) belief µ, maximization of expected (objective) utility based on µ and minimization of expected regret based on µ are behaviorally equivalent.

The second fact highlights another behavioral equivalence, which we will make use of shortly in the following section.

Fact 2. In the class of 2×2 symmetric games, the acts selected by the Laplace rule are exactly the acts selected by regret minimization.

Here is a simple example that shows these choice mechanisms in action. Consider the coordination fitness game G depicted in figure 1a. Since the game is symmetric, it suffices to specify the evolutionary payoffs for the row player.

          I    II                     I    II
    I     1    0               I     0   -2
    II    0    2               II   -1    0

    (a) Coordination game G.   (b) Regret-based representation of G.

    Figure 1: A coordination game (left) and the associated regret representation (right).

Figure 1a also represents the objective utility obj_G, since obj_G = π^G by definition, whereas figure 1b pictures the representation of G in terms of regret-based utilities. While classic maxmin is indifferent between I and II (figure 1a), regret minimization uniquely selects II (figure 1b).

5 Results

5.1 Simulation Results

Since for now we keep the decision rule fixed to maxmin, a player's choice mechanism will only depend on the player's subjective representation (u, Γ). For brevity, from now on we will refer to the pair (u, Γ), like (reg, imp) or (obj, prc), as the type of the player. Sometimes we will also distinguish types by referring to the subjective utility only; for instance, (reg, imp) and (reg, prc) are regret types.

As observed earlier, meta-games factor in statistical properties of the environment. For particular empirical purposes, one could consult a specific class of games G with an appropriate, maybe empirically informed probability P_G in order to match the natural environment of a given population. For our present purposes, let G be a set of symmetric two-player fitness games with two acts for a start. Each game G ∈ G is then individuated solely by its payoff function, i.e., by a quadruple of numbers G = (a, b, c, d). As for the occurrence probability P_G(G) of game G, we imagine that the values a, b, c, d are i.i.d. random variables sampled from the set {0, . . . , 10} according to a uniform probability P_V. Using Monte Carlo simulations, we can then approximate the values of equation (1) to construct meta-game payoffs.
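The Monte Carlo procedure can be sketched as follows (again our own minimal reimplementation, building on the functions above; `fitness` implements the tie-breaking of footnote 3, and `meta_game` approximates equation (1) for every pair of types):

    import random

    TYPES = {
        "reg,imp": (reg_utility, IMP), "obj,imp": (obj_utility, IMP),
        "reg,prc": (reg_utility, PRC), "obj,prc": (obj_utility, PRC),
    }

    def fitness(game, row_type, col_type):
        # Expected fitness F_G of the row type against the column type,
        # randomizing uniformly over equally optimal acts (footnote 3).
        a, b, c, d = game
        payoff = [[a, b], [c, d]]
        u_r, bel_r = TYPES[row_type]
        u_c, bel_c = TYPES[col_type]
        acts_r = maxmin_acts(u_r(game), bel_r)
        acts_c = maxmin_acts(u_c(game), bel_c)
        return (sum(payoff[i][j] for i in acts_r for j in acts_c)
                / (len(acts_r) * len(acts_c)))

    def meta_game(n=100_000):
        # Approximate F(c, c') of equation (1) by averaging over n i.i.d. games.
        F = {(r, c): 0.0 for r in TYPES for c in TYPES}
        for _ in range(n):
            game = tuple(random.randint(0, 10) for _ in range(4))
            for pair in F:
                F[pair] += fitness(game, *pair) / n
        return F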
Results based on 100,000 randomly sampled games are given in table 1.[6]

                 (reg,imp)  (obj,imp)  (reg,prc)  (obj,prc)
    (reg,imp)    6.663      6.662      6.663      6.663
    (obj,imp)    6.486      6.484      6.486      6.486
    (reg,prc)    6.663      6.662      6.663      6.663
    (obj,prc)    6.663      6.662      6.663      6.663

    Table 1: Average evolutionary fitness from Monte Carlo simulations of 100,000 symmetric 2×2 games.

[6] Concretely, 100,000 games were sampled repeatedly by choosing independently four integers between 0 and 10 uniformly at random. For each game, the action choices of all four choice mechanisms were determined and payoffs from all pairwise encounters recorded. The number in each cell of table 1 is the average payoff for the choice mechanism listed in the row when matched with the choice mechanism in the column.

Simulation results obviously reflect Fact 2 in that all encounters in which types (reg, imp), (reg, prc) or (obj, prc) are substituted for one another yield identical results. More interestingly, table 1 shows that (obj, imp), the maxmin strategy, is strictly dominated by the three other types: in each column (i.e., for each type of co-player), the maxmin strategy is strictly worse than any of the three competitors. This has a number of interesting consequences.

If we restrict attention to subjective representations with imprecise beliefs only, then a monomorphic state in which every agent has regret-based preferences is the only evolutionarily stable state. More strongly, since (obj, imp) is strictly dominated by (reg, imp), we expect selection that is driven by (expected) fitness to invariably weed out maxmin players (obj, imp) in favor of (reg, imp), regret minimization. In terms of choice rules, this means that regret minimization is evolutionarily better than maxmin over the class of games considered. In terms of subjective preferences, it shows that players using the objective representation that directly looks at fitness (possibly money, or profit) are outperformed by non-veridical (regret) representations, when players' beliefs are imprecise.

Next, if we look at the competition between all four types represented in table 1, (reg, imp) is no longer evolutionarily stable. Given behavioral equivalence (Fact 2), types (reg, imp), (reg, prc), and (obj, prc) are all neutrally stable (Maynard Smith, 1982). But since (obj, imp) is strictly dominated and so disfavored by fitness-based selection, we are still drawn to conclude that maxmin behavior is weeded out in favor of a population with a random distribution of the remaining three types.

Simulation results of the (discrete time) replicator dynamics (Taylor and Jonker, 1978) indeed show that random initial population configurations are attracted to states with only three player types: (reg, imp), (reg, prc) and (obj, prc). The relative proportions of these depend on the initial shares in the population. This variability fully disappears if we add a small mutation rate to the dynamics. Take a fixed, small mutation rate ε for the probability that a player's subjective utility or her subjective belief changes to another utility or belief. The probability that a player's subjective representation randomly mutates into a completely different representation with altogether different utility and belief would then be ε². With these assumptions about "component-wise mutations," numerical simulations of the (discrete time) replicator mutator dynamics (Nowak, 2006) show that already for very small mutation rates almost all initial population states converge to a single fixed point in which the majority of players have regret-based utility. For instance, with ε = 0.001, almost all initial populations are attracted to a final distribution with proportions:

    (reg,imp)  (obj,imp)  (reg,prc)  (obj,prc)
    0.289      0.021      0.398      0.289
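A single step of the discrete-time replicator mutator dynamics of the kind we simulated can be sketched as follows (our own formulation; Q is the component-wise mutation kernel built from ε as just described, and F must be the 4×4 fitness matrix in TYPE_LIST order):

    TYPE_LIST = ["reg,imp", "obj,imp", "reg,prc", "obj,prc"]

    def mutation_matrix(eps):
        # Utility and belief each mutate independently with probability eps,
        # so flipping both components has probability eps**2.
        Q = []
        for t in TYPE_LIST:
            u1, b1 = t.split(",")
            row = []
            for s in TYPE_LIST:
                u2, b2 = s.split(",")
                row.append(((1 - eps) if u1 == u2 else eps)
                           * ((1 - eps) if b1 == b2 else eps))
            Q.append(row)
        return Q

    def replicator_mutator_step(x, F, Q):
        # x: population shares; F: meta-game fitness matrix (row vs. column).
        n = len(x)
        f = [sum(F[i][j] * x[j] for j in range(n)) for i in range(n)]
        phi = sum(x[i] * f[i] for i in range(n))  # average population fitness
        return [sum(x[i] * f[i] * Q[i][k] for i in range(n)) / phi
                for k in range(n)]

    # Example use: x = [0.25] * 4; Q = mutation_matrix(0.001); then iterate
    # x = replicator_mutator_step(x, F, Q) until x is (approximately) fixed.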
What this suggests is that, if biological evolution selects behavior-generating mechanisms, not behavior as such, it need not be the case that behaviorally equivalent mechanisms are treated equally all the while. If mutation probabilities are a function of individual components, it can be the case that certain components of such behavior-generating mechanisms are more strongly favored by a process of random mutation and selection. This is exactly the case with regret-based preferences. Since regret-based preferences are much better in connection with imprecise beliefs than veridical preferences are, the proportion of expected regret minimizers, (reg, prc), in the attracting state is substantially higher than that of expected utility maximizers, (obj, prc), even though these types are behaviorally equivalent.

5.2 Analytical Results

Results based on the single meta-game in table 1 are not fully general and possibly spoiled by random fluctuations in the sampling procedure. Fortunately, for the case of 2×2 symmetric games, the main result that maxmin types (obj, imp) are strictly dominated by regret minimizers can also be shown analytically under considerably general conditions.

Proposition 1. Let G be the class of 2×2 symmetric games G = (a, b, c, d) generated by i.i.d. sampling of a, b, c, d from a set of values with at least three elements in the support. Then (reg, imp) strictly dominates (obj, imp) in the resulting meta-game.

Proof. All proofs are in Appendix A. □

Corollary 1. Let G be as in Proposition 1. If we only consider imprecise belief types, (obj, imp) and (reg, imp), then the unique evolutionarily stable state is a monomorphic population of (reg, imp) players.

This result supports the main conceptual point that we wanted to make: objective preference representations are not necessarily favored by natural selection; objective preferences are outperformed by non-veridical regret preferences if agents have imprecise beliefs. It tells us that the main conclusions drawn in the previous section based on the approximated meta-game of table 1 hold more generally for arbitrary 2×2 symmetric games with i.i.d. sampled payoffs.

This result presupposes at least occasional imprecise beliefs. The assumed imprecise beliefs do not need to be maximally uncertain, however. Let the uncertainty held by a player be defined by a convex compact set of probabilities [s, t] ⊆ ∆(A) over the co-player's actions, where s is the lower probability and t is the upper probability of action II. We can then prove the following proposition, which is the analogue of Proposition 1 for any possible (not necessarily maximal) degree of uncertainty [s, t], with s ≠ t. There is only one difference: we now require i.i.d. draws of a continuous random variable. This is due to the fact that, for arbitrarily small intervals [s, t], objective players ([s, t], obj) and regret players ([s, t], reg) can behave as if holding a unique probability measure (precise belief) if the underlying payoff space is not dense. The reason for this technical requirement will become clearer from the proof.
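For 2×2 games, an interval belief [s, t] enters the maxmin rule only through its endpoints, so the earlier sketch already covers it. Here is an illustrative game (our own choice of numbers) in which moderate imprecision separates the two types:

    # maxmin_acts from above takes any list of extreme points, so an interval
    # belief [s, t] over the co-player's act II is simply the list [s, t].
    game = (10, 0, 3, 4)  # a coordination game: a > c and d > b
    print(maxmin_acts(reg_utility(game), [0.3, 0.7]))  # [0]: regret type plays I
    print(maxmin_acts(obj_utility(game), [0.3, 0.7]))  # [1]: objective type plays II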
Proposition 2. In the class of 2×2 symmetric games generated by i.i.d. draws of a continuous random variable from a distribution with density P_V and support (r̲, r̄) ⊆ R, for any imprecise belief [s, t], the only evolutionarily stable state of a population with regret players ([s, t], reg) and objective players ([s, t], obj) is a monomorphic state of ([s, t], reg) players.

This tells us that regret-based preferences can outperform objective preference representations when agents are also capable of learning or otherwise restricting their assumptions about the co-player's behavior, as long as there is, at least on occasion, some imprecision in their beliefs. We will enlarge on the issue of belief formation after having covered some more relevant extensions in the next section.

6 Extensions

How do the basic results from the previous section carry over to richer models? Section 6.1 first introduces further conceptually interesting subjective representations that have been considered in the literature. Section 6.2 then addresses the case of symmetric two-player n×n games for n ≥ 2. Finally, Section 6.3 ends with a brief comparison to the case of solitary decision making.

6.1 More Preference Types

The space of possible preference types is enormous, and we have only compared regret and objective types so far. Let us now look at two other types of subjective preferences that have been investigated, especially in behavioral economics and in evolutionary game theory. A famous example is the altruistic preference (e.g., Becker, 1976; Bester and Güth, 1998), summoned to explain the possibility of altruistic behavior. At the other end of the spectrum lies the competitive preference. The two subjective utilities are defined as follows:

1. altruistic utility: for all G ∈ G, alt_G(a, a′) = π^G(a, a′) + π^G(a′, a);[7]

2. competitive utility: for all G ∈ G, com_G(a, a′) = π^G(a, a′) − π^G(a′, a).

[7] A more general formulation would be to define α-altruistic utility, for α ∈ [0, 1], as u^G_α(a, a′) = π^G(a, a′) + α π^G(a′, a). Since we are not interested in the evolution of degrees of altruism, here we simply fix α = 1. Analogously for α-competitive utilities too.

Table 2 shows the results of Monte Carlo simulations that approximate the expected fitness in the relevant meta-game with all the subjective representations considered so far.

                 (reg,imp) (obj,imp) (com,imp) (alt,imp) (reg,prc) (obj,prc) (com,prc) (alt,prc)
    (reg,imp)    6.663     6.662     5.829     7.105     6.663     6.663     5.829     7.489
    (obj,imp)    6.486     6.484     6.088     6.703     6.486     6.486     6.088     6.875
    (com,imp)    6.323     6.758     5.496     6.977     6.323     6.323     5.496     7.149
    (alt,imp)    5.949     5.722     5.326     6.396     5.949     5.949     5.326     6.568
    (reg,prc)    6.663     6.662     5.829     7.105     6.663     6.663     5.829     7.489
    (obj,prc)    6.663     6.662     5.829     7.105     6.663     6.663     5.829     7.489
    (com,prc)    6.323     6.758     5.496     6.977     6.323     6.323     5.496     7.149
    (alt,prc)    6.331     5.893     5.497     6.566     6.331     6.331     5.497     7.152

    Table 2: Average evolutionary fitness from Monte Carlo simulations of 100,000 symmetric 2×2 games.

These results confirm basic intuitions about altruistic and competitive types: everybody would like to have an altruistic co-player, and nobody would like to play against a competitive player. Perhaps more surprisingly, (alt, imp) comes up strictly dominated by (com, imp), but competitive types themselves are worse off against all types except maxmin players (obj, imp) than any of the behaviorally equivalent types (reg, imp), (obj, prc), and (reg, prc). It is thus easy to see that the previous results still obtain for the larger meta-game in table 2: (reg, imp), (obj, prc), and (reg, prc) are still neutrally stable; simulation runs of the (discrete-time) replicator dynamics on the 8×8 meta-game from table 2 end up in population states consisting of only these three types in variable proportions.
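For reference, the two additional utility transforms are straightforward to add to our earlier sketch (the encoding and names are again our own; the entries follow the definitions above):

    def alt_utility(game):
        # Altruistic: own fitness plus the co-player's fitness.
        a, b, c, d = game
        return [2 * a, b + c, c + b, 2 * d]

    def com_utility(game):
        # Competitive: own fitness minus the co-player's fitness.
        a, b, c, d = game
        return [0, b - c, c - b, 0]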
In sum, the presence of other subjective representations, such as those based on altruistic or competitive utilities, does not undermine, but rather strengthens, our previous results.

6.2 More Actions

The results from Section 5 relied heavily on Fact 2, which is no longer true when we look at arbitrary n×n games. Table 3 gives approximations of expected fitness in the class of n×n symmetric games. Concretely, the numbers in table 3 are averages of evolutionary payoffs obtained in 100,000 randomly sampled symmetric games, where each fitness game G was sampled by first picking a number of acts n_G ∈ {2, . . . , 10} uniformly at random, and then filling the necessary n_G × n_G payoff matrix with i.i.d. sampled numbers, as before.

                 (reg,imp) (obj,imp) (com,imp) (alt,imp) (reg,prc) (obj,prc) (com,prc) (alt,prc)
    (reg,imp)    6.567     6.570     5.650     6.992     6.564     6.564     5.593     7.409
    (obj,imp)    6.476     6.483     5.896     6.818     6.484     6.484     5.850     7.124
    (com,imp)    6.468     6.647     5.512     7.169     6.578     6.578     5.577     7.354
    (alt,imp)    5.968     5.923     5.363     6.685     5.975     5.975     5.086     6.973
    (reg,prc)    6.908     6.918     5.988     7.456     6.929     6.929     5.934     7.783
    (obj,prc)    6.908     6.918     5.988     7.456     6.929     6.929     5.934     7.783
    (com,prc)    6.529     6.680     5.445     7.276     6.542     6.542     5.521     7.440
    (alt,prc)    6.450     6.337     5.772     6.978     6.457     6.457     5.479     7.500

    Table 3: Average evolutionary fitness for 100,000 randomly generated n×n symmetric games with n randomly drawn from {2, . . . , 10}.

The most important result is that the regret minimizing type (reg, imp) is strictly dominated by (reg, prc) and by (obj, prc) in the meta-game from table 3. This means that while simple regret minimization can thrive in some evolutionary contexts, there are also contexts where it is demonstrably worse off. While this may be bad news for regret minimizing types (reg, imp), it is not the case that regret types as such are weeded out by selection. Since, by Fact 1, (reg, prc) and (obj, prc) are behaviorally equivalent in general, it remains that selection based on meta-games constructed from n×n games will still not eradicate regret preferences.
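For the n×n case, the sampling procedure and the two relevant rules generalize as follows (a sketch under the same assumptions as before; with maximally imprecise beliefs the worst case is taken over the co-player's pure acts, and with a precise uniform belief the rule reduces to averaging):

    import random

    def sample_symmetric_game(max_n=10, values=range(11)):
        # M[i][j]: the row player's fitness for own act i against act j.
        n = random.randint(2, max_n)
        return [[random.choice(values) for _ in range(n)] for _ in range(n)]

    def regret_of(M):
        # Subtract from each entry the best payoff against that co-player act.
        n = len(M)
        col_max = [max(M[i][j] for i in range(n)) for j in range(n)]
        return [[M[i][j] - col_max[j] for j in range(n)] for i in range(n)]

    def maxmin_acts_n(U):
        # Maximally imprecise belief: worst case over the co-player's pure acts.
        worst = [min(row) for row in U]
        return [i for i, w in enumerate(worst) if w == max(worst)]

    def laplace_acts_n(U):
        # Precise uniform belief: average payoff across the co-player's acts.
        avg = [sum(row) / len(row) for row in U]
        return [i for i, v in enumerate(avg) if v == max(avg)]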
On the other hand, there are plenty of ways in which the basic insights from Propositions 1 and 2 can make for situations in which evolution would favor regret types, even in n×n games. If, for example, the belief of a player is a trait that biological evolution has no bite on, but rather something that the particular choice situation exogenously provides (possibly because of the different amount of information available in different choice situations), then regret-based preferences can again drive out veridical preferences altogether. For example, suppose that only preference representations compete and that agents' beliefs are exogenously given, in such a way that both players hold precise (Bayesian) uniform beliefs with probability p and maximally imprecise beliefs otherwise. This transforms the meta-game from table 3 into a simpler 4×4 meta-game in which the payoff obtained by a subjective preference is the weighted average over the payoffs of the subjective representations including that preference in table 3. Setting p = 0.98 for illustration, we get the meta-game in table 4.

           reg    obj    com    alt
    reg    6.926  6.926  5.942  7.757
    obj    6.924  6.924  5.948  7.751
    com    6.566  6.570  5.481  7.434
    alt    6.463  6.461  5.478  7.469

    Table 4: Meta-game for the evolutionary competition between subjective utilities when beliefs are exogenously given (see main text).

The only evolutionarily stable state of this meta-game is again a monomorphic population of regret types. Accordingly, all our simulation runs of the (discrete-time) replicator dynamics converge to monomorphic regret-type populations. The reason why regret-based utilities prosper is that they have a substantial fitness advantage when paired with imprecise beliefs (Propositions 1 and 2). If unmeasurable uncertainty is exogenously given as something that happens to agents because of the information available in some choice situations, and even if that happens only very infrequently (i.e., even for p close to 1), regret preferences will outperform objective preferences, as well as competitive and altruistic preferences.

6.3 Solitary Decisions

To see how different choice mechanisms behave in evolutionary competition based on solitary decision making, we approximated, much in the spirit of meta-games, average accumulated fitness obtained in randomly generated solitary decision problems. For our purposes, a decision problem D = ⟨W, A, π^D⟩ consists of a set of states of the world W, a set of acts A, and a payoff function π^D : W × A → R. We generate arbitrary decision problems by selecting, uniformly at random, numbers of states and acts n^D_w, n^D_a ∈ {2, . . . , 10} and then filling the payoff table, so to speak, by i.i.d. samples for each π^D(w, a) ∈ {0, . . . , 10}. Unlike with two-player games, we also need to sample the actual state of the world, which we selected uniformly at random from the available states in the current decision problem. Accordingly, the fitness of choice mechanism c in decision problem D is given by:

    F_D(c) = ∑_{w ∈ W} π^D(w, a*_c(u^D_c, Γ_c)) µ(w), with µ(w) = 1/n^D_w for all w.

As subjective representations, we considered the original cast of four from table 1, since altruistic and competitive types are meaningless in solitary decision situations. As before, the relevant fitness measure,

    F(c) = ∫ P_D(D) F_D(c) dD,    (3)

was approximated by Monte Carlo simulations, the results of which are given in table 5.

    (reg,imp)  (obj,imp)  (reg,prc)  (obj,prc)
    6.318      6.237      6.661      6.661

    Table 5: Expected fitness of choice mechanisms approximated from 100,000 simulated solitary decision problems (see main text).

Facts 1 and 2 still apply: (reg, prc) and (obj, prc) are behaviorally equivalent in general, and (reg, imp) is behaviorally equivalent to the former two in decision problems with two states and two acts. This shows in the results from table 5 in that the averages for (reg, prc) and (obj, prc) are identical. But since we included decision problems with more acts and more states as well, the average for regret minimizers (reg, imp) is not identical to that of (reg, prc) and (obj, prc). It is, in fact, lower, but again not as low as that of (obj, imp).

This means that every relevant result we have seen about game situations is also borne out for solitary decisions. Evolutionary selection based on objective fitness will not select against regret preferences, as these are indistinguishable from veridical preferences when paired with precise beliefs. But when paired with imprecise beliefs, regret-based utilities outperform objective utilities. Consequently, if there is a chance, however small, that agents fall back on imprecise beliefs, evolution will actually positively select for non-veridical regret-based preferences.
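The solitary-decision simulation admits an analogous sketch (ours; the four types of table 1 correspond to the four combinations of the two flags):

    def solitary_fitness(P, use_regret, imprecise):
        # P[w][a]: fitness of act a in state w; the true state is uniform.
        n_w, n_a = len(P), len(P[0])
        U = [[P[w][a] - (max(P[w]) if use_regret else 0) for a in range(n_a)]
             for w in range(n_w)]
        if imprecise:   # maxmin: worst case over states
            score = [min(U[w][a] for w in range(n_w)) for a in range(n_a)]
        else:           # precise uniform belief over states
            score = [sum(U[w][a] for w in range(n_w)) / n_w for a in range(n_a)]
        acts = [a for a in range(n_a) if score[a] == max(score)]
        # Average fitness over the uniformly drawn state and over tied acts:
        return (sum(P[w][a] for w in range(n_w) for a in acts)
                / (n_w * len(acts)))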
Conse- quently, if there is a chance, however small, that agents fall back on imprecise beliefs, evolution will actually positively select for non-veridical regret-based preferences. 6.4 Sophisticated Beliefs Since one of our main purposes was to illustrate the usefulness of a meta-game approach by the case study of objective and regret preferences, we have partially neglected an important and interesting issue, namely the evolution of ways of forming beliefs about co-players’ behavior or the actual state of the world. For reasons of space we must, unfortunately, leave a deeper exploration of belief type evolution to another occasion. Two remarks are in order nonetheless. Firstly, belief type evolution can be studied without conceptual hurdles in the meta-game framework, so that there is no principled argument against the main methodological contribution of this paper. Secondly, our results regarding the comparison between regret and objective types remain to be informative, even if we allow agents to learn or reason strategically.8 This is because we know from Fact 1 that regret and objective preferences come up behaviorally equivalent when paired with precise probabilistic beliefs (given identical decision rule). This holds no matter what the content of that belief is. So, if learning, reasoning or statistical knowledge about a recurrent situation can be brought to bear, this will not make evolution select against regret-based preferences. If, on the other hand, agents resort to imprecise beliefs at least occasionally (e.g., when they are unaware of the co-player or her utilities, or when strategic reasoning cannot reduce all uncertainty about the co-player’s choice), then regret-based preferences can be favored by natural selection over objective preferences. 7 Conclusion The assumption that players and decision makers maximize their (subjective) utility is central through in the economics literature, and the maximization of actual (objective) payoffs is often justified by appealing to evolutionary arguments and natural selection. In contrast to the standard view, we showed the existence of player types with subjective utilities different from the actual evolutionary payoffs that can outperform types whose subjective utilities coincide with the evolutionary payoffs. The claim is not that regret preferences are the best on the market, but rather that utilities that perfectly mirror evolutionary fitness can be outclassed by subjective utilities that differ from the objective fitness. While the literature on evolution of preferences has focused on fixed games, we have adopted a more general approach here. We suggested that attention to “meta-games” is cru- cial, because what may be a good subjective representation in one type of game (e.g., cooperative 8Some research has recently been done along these lines. See in particular Mengel, 2012; Mohlin, 2012; Robalino and Robson, 2016. 14 preferences in the Prisoner’s Dilemma class) need not be generally beneficial. Taken together, we presented a variety of plausible circumstances in which evolutionary competition between choice mechanisms on a larger class of games can favor non-veridical preference representations focusing on regret. A Proofs The proof of Proposition 1 relies on a partition of G, and on some lemmas. For brevity, let us denote the regret minimizer (reg, imp) by R and the maximinimizer (obj, imp) by M. 
Proof of Proposition 1. By definition of strict dominance, we have to show that in the class G of symmetric 2×2 games with payoffs sampled from a set of i.i.d. values with at least 3 elements in the support, it holds that:

(i) F_G(R, R) > F_G(M, R);
(ii) F_G(M, M) < F_G(R, M).

To show this we use the following partition of G, based on payoffs parametrized as follows:

          I    II
    I     a    b
    II    c    d

1. Coordination games C: a > c and d > b;
2. Anti-coordination games A: a < c and d < b;
3. Strong dominance games S: either (a > c and b > d) or (a < c and b < d);
4. Weak dominance games W: either a = c or b = d (but not both);
5. Boring games B: a = c and b = d.

Before proving the lemmas, it is convenient to fix some notation. Let us call x, y, z the 3 elements in the support, and without loss of generality suppose that x > y > z. We denote by C a coordination game in C with payoffs a_C, b_C, c_C, and d_C; similarly for games A ∈ A, S ∈ S, W ∈ W, and B ∈ B. Let us denote by I_{R,C} the event that an R-player plays action I in the game C; and similarly for action II, for player M, and for games A, S, W and B. We first consider the case of i.i.d. sampling with finite support.

Lemma 1. R and M perform equally well in S and in B.

Proof. By definition of regret minimization and maxmin, it is easy to check that whenever a game has a strongly dominant action, that action is both the maxmin action and the regret minimizing action. Then, for all the games in S, R chooses action a if and only if M chooses action a. Consequently, R and M always perform equally (well) in S. In the case of B it is trivial to see that all players perform equally. □

Lemma 2. In W, R strictly dominates M.

Proof. Assume without loss of generality that b = d, and that a > c. There are two cases to check: (i) c < b = d and (ii) c ≥ b = d. In the first case it is easy to see that R and M perform equally: act I is the choice of both R and M. In case (ii) instead, I is the regret minimizing action, whereas both actions have the same minimum and M plays (½ I; ½ II), since both I and II maximize the minimal payoff. Consider now a population of R and M playing games from the class W. Whenever (i) is the case, R and M perform equally well. But suppose W ∈ W and (ii) is the case. Then π_W(R, R) = a > ½a + ½c = π_W(M, R), whereas π_W(M, M) = ¼a + ¼b + ¼c + ¼d < ½a + ½b = π_W(R, M). Hence, we have that in general F_W(R, R) > F_W(M, R), and F_W(M, M) < F_W(R, M). □

Since it is not difficult to see that both (R, R) and (M, M) are strict Nash equilibria in C, and that (R, R) and (M, M) are not Nash equilibria in A, the main part of the proof will be to show that R strictly dominates M in the class C∪A, that is:

(i′) F_{C∪A}(R, R) > F_{C∪A}(M, R);
(ii′) F_{C∪A}(M, M) < F_{C∪A}(R, M).

This part needs a few more lemmas, but first we introduce the following bijective function φ between coordination and anti-coordination games.

Definition 3 (φ). The permutation φ(a, b, c, d) = (c, d, a, b) defines a bijective function φ : C → A that for each coordination game C ∈ C with payoffs (a_C, b_C, c_C, d_C) gives the anti-coordination game A ∈ A with payoffs (a_A, b_A, c_A, d_A) = (c_C, d_C, a_C, b_C). Essentially, φ swaps rows in the payoff matrix.
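In code, φ and the partition predicates are immediate, and the bijection claim can be checked by enumeration (again our own sanity check, not part of the proof):

    from itertools import product

    def phi(game):
        # Swap the rows of the payoff matrix: (a, b, c, d) -> (c, d, a, b).
        a, b, c, d = game
        return (c, d, a, b)

    def is_coordination(g):
        return g[0] > g[2] and g[3] > g[1]      # a > c and d > b

    def is_anti_coordination(g):
        return g[0] < g[2] and g[3] < g[1]      # a < c and d < b

    # phi maps coordination games onto anti-coordination games and back:
    assert all(is_anti_coordination(phi(g)) and phi(phi(g)) == g
               for g in product(range(3), repeat=4) if is_coordination(g))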
Lemma 4. The occurrence probability of C equals that of φ(C): P(φ(C)) = P(C).

Proof. By definition, each game C ≡ (a_C, b_C, c_C, d_C) is such that a_C > c_C and d_C > b_C, and each game A ≡ (a_A, b_A, c_A, d_A) is such that a_A < c_A and d_A < b_A. Given that a, b, c, d are i.i.d. random variables and that a sequence of i.i.d. random variables is exchangeable, it is clear that the probability of (a_C, b_C, c_C, d_C) equals the probability of (c_C, d_C, a_C, b_C). Hence, P(φ(C)) = P(C). □

Lemma 5. Let P(E) be the probability of event E; e.g., P(I_{R,C}) is the probability that a random R-player plays act I in coordination game C, which is either 0, 0.5 or 1. It then holds that:

· P(I_{R,C}) = P(II_{R,φ(C)}), and P(II_{R,C}) = P(I_{R,φ(C)});
· P(I_{M,C}) = P(II_{M,φ(C)}), and P(II_{M,C}) = P(I_{M,φ(C)}).

Proof. It is easy to check that if b_C − d_C > c_C − a_C, an R-player plays action I in C; that if b_C − d_C < c_C − a_C, R plays II; and that if b_C − d_C = c_C − a_C, an R-player is indifferent between I and II in C, and so randomizes with (½ I; ½ II). Similarly, if a_A − c_A > d_A − b_A, an R-player plays action I in A; if a_A − c_A < d_A − b_A, R plays II; and if a_A − c_A = d_A − b_A, an R-player is indifferent between I and II in A, and randomizes with (½ I; ½ II). Consequently, if b_C − d_C > c_C − a_C, then P(I_{R,C}) = 1, and by definition of φ we have P(II_{R,φ(C)}) = 1. Likewise, if b_C − d_C < c_C − a_C, then P(II_{R,C}) = 1 = P(I_{R,φ(C)}); and if b_C − d_C = c_C − a_C, then P(I_{R,C}) = P(II_{R,C}) = ½ = P(II_{R,φ(C)}) = P(I_{R,φ(C)}). In the same way, in coordination games we have that if b_C > c_C, an M-player plays I; if c_C > b_C, an M-player plays II; and if b_C = c_C, M is indifferent between I and II, and plays (½ I; ½ II). In anti-coordination games instead, if a_A > d_A, M plays I; if a_A < d_A, M plays II; if a_A = d_A, M plays (½ I; ½ II). By definition of φ: P(I_{M,C}) = 1 = P(II_{M,φ(C)}) if b_C > c_C; P(II_{M,C}) = 1 = P(I_{M,φ(C)}) if c_C > b_C; and P(I_{M,C}) = P(II_{M,C}) = ½ = P(II_{M,φ(C)}) = P(I_{M,φ(C)}) if b_C = c_C. □

Lemma 6. It holds that:

· a_C > d_C → (I_{M,C} ⊆ I_{R,C});
· a_C = d_C → I_{M,C} = I_{R,C};
· a_C < d_C → (II_{M,C} ⊆ II_{R,C}).

Proof. The event I_{R,C}, that R plays action I with positive probability, is the event that b_C − d_C ≥ c_C − a_C: if b_C − d_C > c_C − a_C, R plays I, and if b_C − d_C = c_C − a_C, R plays (½ I; ½ II). Similarly, the event I_{M,C} has positive occurrence iff b_C ≥ c_C: if b_C > c_C, M plays I, and if b_C = c_C, M plays (½ I; ½ II). Then, I_{R,C} implies that b_C − d_C ≥ c_C − a_C, and I_{M,C} implies that b_C ≥ c_C. Moreover, on the assumption that a_C > d_C, it is easy to check that b_C ≥ c_C implies b_C − d_C > c_C − a_C. Hence, in any C with a_C > d_C it holds that I_{M,C} implies I_{R,C}, i.e., a_C > d_C → (I_{M,C} ⊆ I_{R,C}). Instead, it is possible that a_C > d_C, b_C − d_C > c_C − a_C and b_C < c_C hold simultaneously, so that I_{M,C} ⊉ I_{R,C}. By a symmetric argument it can be shown that a_C < d_C → (II_{M,C} ⊆ II_{R,C}) too. Finally, when a_C = d_C it holds that: b_C − d_C > c_C − a_C iff b_C > c_C; b_C − d_C < c_C − a_C iff b_C < c_C; and b_C − d_C = c_C − a_C iff b_C = c_C. Hence, a_C = d_C → I_{M,C} = I_{R,C}. □

We are now ready to prove that F_{C∪A}(R, R) > F_{C∪A}(M, R).
With notation like P(I_{R,C} ∩ I_{R,C}) denoting the probability that a random R-player plays I and another R-player plays I as well in game C, rewrite the inequality as:

    ∑_{C∈C} P(C)[P(I_{R,C} ∩ I_{R,C})·a_C + P(II_{R,C} ∩ II_{R,C})·d_C + P(I_{R,C} ∩ II_{R,C})·b_C + P(II_{R,C} ∩ I_{R,C})·c_C]
    + ∑_{A∈A} P(A)[P(I_{R,A} ∩ I_{R,A})·a_A + P(II_{R,A} ∩ II_{R,A})·d_A + P(I_{R,A} ∩ II_{R,A})·b_A + P(II_{R,A} ∩ I_{R,A})·c_A]
    >
    ∑_{C∈C} P(C)[P(I_{R,C} ∩ I_{M,C})·a_C + P(II_{R,C} ∩ II_{M,C})·d_C + P(I_{R,C} ∩ II_{M,C})·c_C + P(II_{R,C} ∩ I_{M,C})·b_C]
    + ∑_{A∈A} P(A)[P(I_{R,A} ∩ I_{M,A})·a_A + P(II_{R,A} ∩ II_{M,A})·d_A + P(I_{R,A} ∩ II_{M,A})·c_A + P(II_{R,A} ∩ I_{M,A})·b_A]

By Lemma 4 and Lemma 5, we can express everything in terms of C only:

    ∑_C P(C)[P(I_{R,C} ∩ I_{R,C})·a_C + P(II_{R,C} ∩ II_{R,C})·d_C + P(I_{R,C} ∩ II_{R,C})·b_C + P(II_{R,C} ∩ I_{R,C})·c_C
             + P(II_{R,C} ∩ II_{R,C})·c_C + P(I_{R,C} ∩ I_{R,C})·b_C + P(II_{R,C} ∩ I_{R,C})·d_C + P(I_{R,C} ∩ II_{R,C})·a_C]
    >
    ∑_C P(C)[P(I_{R,C} ∩ I_{M,C})·a_C + P(II_{R,C} ∩ II_{M,C})·d_C + P(I_{R,C} ∩ II_{M,C})·c_C + P(II_{R,C} ∩ I_{M,C})·b_C
             + P(II_{R,C} ∩ II_{M,C})·c_C + P(I_{R,C} ∩ I_{M,C})·b_C + P(II_{R,C} ∩ I_{M,C})·a_C + P(I_{R,C} ∩ II_{M,C})·d_C]

This simplifies to:

    ∑_C P(C)[a_C·(P(I_{R,C} ∩ I_{R,C}) + P(I_{R,C} ∩ II_{R,C})) + b_C·(P(I_{R,C} ∩ II_{R,C}) + P(I_{R,C} ∩ I_{R,C}))
             + c_C·(P(II_{R,C} ∩ I_{R,C}) + P(II_{R,C} ∩ II_{R,C})) + d_C·(P(II_{R,C} ∩ II_{R,C}) + P(II_{R,C} ∩ I_{R,C}))]
    >
    ∑_C P(C)[a_C·(P(I_{R,C} ∩ I_{M,C}) + P(II_{R,C} ∩ I_{M,C})) + b_C·(P(II_{R,C} ∩ I_{M,C}) + P(I_{R,C} ∩ I_{M,C}))
             + c_C·(P(I_{R,C} ∩ II_{M,C}) + P(II_{R,C} ∩ II_{M,C})) + d_C·(P(II_{R,C} ∩ II_{M,C}) + P(I_{R,C} ∩ II_{M,C}))]

Now let us split into a > d and a < d, and consider a > d first. Notice that, by Lemma 6, the case a = d is irrelevant in order to discriminate between R and M. If a > d, by Lemma 6 we can eliminate the cases where R plays II and M plays I:

    ∑_{C: a_C>d_C} P(C)[a_C·(P(I_{R,C} ∩ I_{R,C}) + P(I_{R,C} ∩ II_{R,C})) + b_C·(P(I_{R,C} ∩ II_{R,C}) + P(I_{R,C} ∩ I_{R,C}))
             + c_C·(P(II_{R,C} ∩ I_{R,C}) + P(II_{R,C} ∩ II_{R,C})) + d_C·(P(II_{R,C} ∩ II_{R,C}) + P(II_{R,C} ∩ I_{R,C}))]
    >
    ∑_{C: a_C>d_C} P(C)[a_C·P(I_{R,C} ∩ I_{M,C}) + b_C·P(I_{R,C} ∩ I_{M,C})
             + c_C·(P(I_{R,C} ∩ II_{M,C}) + P(II_{R,C} ∩ II_{M,C})) + d_C·(P(II_{R,C} ∩ II_{M,C}) + P(I_{R,C} ∩ II_{M,C}))]

Rewrite:

    ∑_{C: a_C>d_C} P(C)[a_C·(P(I_{R,C} ∩ I_{R,C}) + P(I_{R,C} ∩ II_{R,C}) − P(I_{R,C} ∩ I_{M,C}))
             + b_C·(P(I_{R,C} ∩ II_{R,C}) + P(I_{R,C} ∩ I_{R,C}) − P(I_{R,C} ∩ I_{M,C}))
             + c_C·(P(II_{R,C} ∩ I_{R,C}) + P(II_{R,C} ∩ II_{R,C}) − P(I_{R,C} ∩ II_{M,C}) − P(II_{R,C} ∩ II_{M,C}))
             + d_C·(P(II_{R,C} ∩ II_{R,C}) + P(II_{R,C} ∩ I_{R,C}) − P(II_{R,C} ∩ II_{M,C}) − P(I_{R,C} ∩ II_{M,C}))] > 0

We now distinguish between two cases: (1) a − c = d − b and (2) a − c ≠ d − b. Notice that P(I_{R,C} ∩ II_{R,C}) ≠ 0 if and only if case (1) obtains, and that a > d and (1) imply II_{M,C}. Then, from (1) we have[9]:

    ∑_{C: a_C>d_C} P(C)[a_C·(P(I_{R,C} ∩ I_{R,C}) + P(I_{R,C} ∩ II_{R,C})) + b_C·(P(I_{R,C} ∩ II_{R,C}) + P(I_{R,C} ∩ I_{R,C}))
             + c_C·(P(II_{R,C} ∩ I_{R,C}) + P(II_{R,C} ∩ II_{R,C}) − P(I_{R,C} ∩ II_{M,C}) − P(II_{R,C} ∩ II_{M,C}))
             + d_C·(P(II_{R,C} ∩ II_{R,C}) + P(II_{R,C} ∩ I_{R,C}) − P(II_{R,C} ∩ II_{M,C}) − P(I_{R,C} ∩ II_{M,C}))] > 0

that is

    ∑_{C: a_C>d_C} P(C)[a_C·(¼ + ¼) + b_C·(¼ + ¼) + c_C·(¼ + ¼ − ½ − ½) + d_C·(¼ + ¼ − ½ − ½)] > 0

Since we have assumed a − c = d − b, the last inequality is not satisfied. We have instead:

    ∑_{C: a_C>d_C} P(C)[½ a_C + ½ b_C − ½ c_C − ½ d_C] = 0

This means that where a_C > d_C and where (1) is the case, R and M are equally fit. This changes when we turn to (2). In that case, since a_C > d_C → (I_{M,C} ⊆ I_{R,C}) by Lemma 6, we have that P(I_{R,C} ∩ I_{R,C}) − P(I_{R,C} ∩ I_{M,C}) = P(I_{R,C} ∩ II_{M,C}). Moreover, when a_C > d_C, b_C ≥ c_C implies b_C − d_C > c_C − a_C (see Lemma 6). Consequently, when M plays either I or (½ I; ½ II), R always plays I. Hence, whenever a_C > d_C and (2) obtain, it also holds that P(II_{R,C} ∩ II_{M,C}) = P(II_{R,C} ∩ II_{R,C}).
In this case we can simplify:

    ∑_{C: a_C>d_C} P(C)[a_C·(P(I_{R,C} ∩ I_{R,C}) − P(I_{R,C} ∩ I_{M,C})) + b_C·(P(I_{R,C} ∩ I_{R,C}) − P(I_{R,C} ∩ I_{M,C}))
             + c_C·(P(II_{R,C} ∩ II_{R,C}) − P(I_{R,C} ∩ II_{M,C}) − P(II_{R,C} ∩ II_{M,C}))
             + d_C·(P(II_{R,C} ∩ II_{R,C}) − P(II_{R,C} ∩ II_{M,C}) − P(I_{R,C} ∩ II_{M,C}))] > 0

    ∑_{C: a_C>d_C} P(C)[a_C·P(I_{R,C} ∩ II_{M,C}) + b_C·P(I_{R,C} ∩ II_{M,C}) − c_C·P(I_{R,C} ∩ II_{M,C}) − d_C·P(I_{R,C} ∩ II_{M,C})] > 0

    ∑_{C: a_C>d_C} P(C)[P(I_{R,C} ∩ II_{M,C})·(a_C + b_C − c_C − d_C)] > 0

[9] Note that when we have only 3 elements in the support it is not guaranteed that case (1), together with a > d, may arise in a coordination game, whereas it is guaranteed that case (2), together with a > d, occurs with some positive probability. If we take for instance x = 5, y = 2, z = 1, then case (1) cannot obtain, whereas if we take x = 3, y = 2, z = 1, both (1) and (2) may obtain (a = 3, b = 1, c = 2, d = 2 for case (1), and a = 3, b = 1, c = 1, d = 2 for case (2)). Moreover, under the assumption that a > d, having 3 elements in the support is a necessary and sufficient condition for case (2) to have positive occurrence in a coordination game. As will be clear in the following, a positive occurrence of case (2) alone is enough for the theorem to hold.

We know that I_{R,C} implies that a_C − c_C ≥ d_C − b_C. Since we have assumed that a_C − c_C ≠ d_C − b_C, we have that a_C − c_C > d_C − b_C. Hence, the inequality

    ∑_{C: a_C>d_C} P(C)[P(I_{R,C} ∩ II_{M,C})·(a_C + b_C − c_C − d_C)] > 0

is satisfied. So, when a_C > d_C, R strictly dominates M. Symmetrically, from a < d and by distinguishing between the two cases (1) and (2) as before, in the end we get:

    (1) ∑_{C: a_C<d_C} P(C)[½ c_C + ½ d_C − ½ a_C − ½ b_C] = 0;
    (2) ∑_{C: a_C<d_C} P(C)[P(II_{R,C} ∩ I_{M,C})·(c_C + d_C − a_C − b_C)] > 0.

Hence, we can conclude that R strictly dominates M in the class C∪A. Notice that in case of i.i.d. sampling with continuous support, games in B and W never arise, but the proof is the same for the remaining games in S, C and A.

It remains to be shown that F_{C∪A}(M, M) < F_{C∪A}(R, M). As before, spell this out as:

    ∑_{C∈C} P(C)[P(I_{M,C} ∩ I_{M,C})·a_C + P(II_{M,C} ∩ II_{M,C})·d_C + P(I_{M,C} ∩ II_{M,C})·b_C + P(II_{M,C} ∩ I_{M,C})·c_C]
    + ∑_{A∈A} P(A)[P(I_{M,A} ∩ I_{M,A})·a_A + P(II_{M,A} ∩ II_{M,A})·d_A + P(I_{M,A} ∩ II_{M,A})·b_A + P(II_{M,A} ∩ I_{M,A})·c_A]
    <
    ∑_{C∈C} P(C)[P(I_{R,C} ∩ I_{M,C})·a_C + P(II_{R,C} ∩ II_{M,C})·d_C + P(I_{R,C} ∩ II_{M,C})·b_C + P(II_{R,C} ∩ I_{M,C})·c_C]
    + ∑_{A∈A} P(A)[P(I_{R,A} ∩ I_{M,A})·a_A + P(II_{R,A} ∩ II_{M,A})·d_A + P(I_{R,A} ∩ II_{M,A})·b_A + P(II_{R,A} ∩ I_{M,A})·c_A]

When a > d, similarly to the above derivation, we get:

    ∑_{C: a_C>d_C} P(C)[a_C·(P(I_{M,C} ∩ I_{M,C}) + P(I_{M,C} ∩ II_{M,C}) − P(I_{R,C} ∩ I_{M,C}) − P(I_{R,C} ∩ II_{M,C}))
             + b_C·(P(I_{M,C} ∩ I_{M,C}) + P(I_{M,C} ∩ II_{M,C}) − P(I_{R,C} ∩ I_{M,C}) − P(I_{R,C} ∩ II_{M,C}))
             + c_C·(P(II_{M,C} ∩ I_{M,C}) + P(II_{M,C} ∩ II_{M,C}) − P(II_{R,C} ∩ II_{M,C}))
             + d_C·(P(II_{M,C} ∩ II_{M,C}) + P(II_{M,C} ∩ I_{M,C}) − P(II_{R,C} ∩ II_{M,C}))] < 0

We now distinguish between (1) b = c, (2) b > c, and (3) b < c. Notice that either (1) or (2), together with a > d, implies I_{R,C}. Then we obtain:[10]

    (1) ∑_{C: a_C>d_C} P(C)[−½ a_C − ½ b_C + ½ c_C + ½ d_C] < 0;
    (2) ∑_{C: a_C>d_C} P(C)[a_C·(P(I_{M,C} ∩ I_{M,C}) − P(I_{R,C} ∩ I_{M,C})) + b_C·(P(I_{M,C} ∩ I_{M,C}) − P(I_{R,C} ∩ I_{M,C}))] = 0;
    (3) ∑_{C: a_C>d_C} P(C)[a_C·(−P(I_{R,C} ∩ II_{M,C})) + b_C·(−P(I_{R,C} ∩ II_{M,C}))
             + c_C·(P(II_{M,C} ∩ II_{M,C}) − P(II_{R,C} ∩ II_{M,C})) + d_C·(P(II_{M,C} ∩ II_{M,C}) − P(II_{R,C} ∩ II_{M,C}))] ≤ 0.

[10] Note that here, when we only have 3 elements in the support, case (2) is impossible, but cases (1) and (3) can occur with positive probability, and this is enough for our purpose.

[Figure 2: Example of a strictly informative coordination game.]

When a < d, the derivation proceeds symmetrically and we get:

    (1) ∑_{C: a_C<d_C} …