CONVERGENCE IN A MULTIDIMENSIONAL RANDOMIZED KEYNESIAN BEAUTY CONTEST

MICHAEL GRINFELD,∗ University of Strathclyde
STANISLAV VOLKOV,∗∗ Lund University and University of Bristol
ANDREW R. WADE,∗∗∗ Durham University

∗ Postal address: Department of Mathematics and Statistics, University of Strathclyde, 26 Richmond Street, Glasgow G1 1XH, UK.
∗∗ Postal address: Centre for Mathematical Sciences, Lund University, Box 118, SE-22100 Lund, Sweden, and Department of Mathematics, University of Bristol, University Walk, Bristol BS8 1TW, UK.
∗∗∗ Postal address: Department of Mathematical Sciences, Durham University, South Road, Durham DH1 3LE, UK.

Abstract

We study the asymptotics of a Markovian system of $N \geq 3$ particles in $[0,1]^d$ in which, at each step in discrete time, the particle farthest from the current centre of mass is removed and replaced by an independent $U[0,1]^d$ random particle. We show that the limiting configuration contains $N-1$ coincident particles at a random location $\xi_N \in [0,1]^d$. A key tool in the analysis is a Lyapunov function based on the squared radius of gyration (sum of squared distances) of the points. For $d = 1$ we give additional results on the distribution of the limit $\xi_N$, showing, among other things, that it gives positive probability to any nonempty interval subset of $[0,1]$, and giving a reasonably explicit description in the smallest nontrivial case, $N = 3$.

Keywords: Keynesian beauty contest; radius of gyration; rank-driven process; sum of squared distances.

2010 Mathematics Subject Classification: Primary 60J05; Secondary 60D05; 60F15; 60K35; 82C22; 91A15

1. Introduction, model, and results

In a Keynesian beauty contest, $N$ players each guess a number, the winner being the player whose guess is closest to the mean of all the $N$ guesses; the name marks Keynes's discussion of "those newspaper competitions in which the competitors have to pick out the six prettiest faces from a hundred photographs, the prize being awarded to the competitor whose choice most nearly corresponds to the average preferences of the competitors as a whole" [7, Ch. 12, §V]. Moulin [10, p. 72] formalized a version of the game played on a real interval, the "$p$-beauty contest", in which the target is $p$ ($p > 0$) times the mean value. See e.g. [2] and references therein for some recent work on game-theoretic aspects of such "contests" in economics.
In this paper we study a stochastic process based on an iterated version of the game, in which players randomly choose a value in $[0,1]$, and at each step the worst performer (that is, the player whose guess is farthest from the mean) is replaced by a new player; each player's guess is fixed as soon as they enter the game, so a single new random value enters the system at each step. Analysis of this model was posed as an open problem in [4, p. 390]. The natural setting for our techniques is in fact a generalization in which the values live in $[0,1]^d$ and the target is the barycentre (centre of mass) of the values.

We now formally describe the model and state our main results. Let $d \in \mathbb{N} := \{1, 2, \ldots\}$. We use the notation $X_n = (x_1, x_2, \ldots, x_n)$ for a vector of $n$ points $x_i \in \mathbb{R}^d$. We write $\mu_n(X_n) := n^{-1} \sum_{i=1}^n x_i$ for the barycentre of $X_n$, and $\|\cdot\|$ for the Euclidean norm on $\mathbb{R}^d$. Let $\mathrm{ord}(X_n) = (x_{(1)}, x_{(2)}, \ldots, x_{(n)})$ denote the barycentric order statistics of $x_1, \ldots, x_n$, so that
\[ \|x_{(1)} - \mu_n(X_n)\| \leq \|x_{(2)} - \mu_n(X_n)\| \leq \cdots \leq \|x_{(n)} - \mu_n(X_n)\|; \]
any ties are broken randomly. We call $X_n^* := x_{(n)}$ the extreme point of $X_n$, a point of $x_1, \ldots, x_n$ farthest from the barycentre. We define the core of $X_n$ as $X_n' := (x_{(1)}, \ldots, x_{(n-1)})$, the vector of $x_1, \ldots, x_n$ with the extreme point removed.

The Markovian model that we study is defined as follows. Fix $N \geq 3$. Start with $X_1(0), \ldots, X_N(0)$, distinct points in $[0,1]^d$, and write $X_N(0) := (X_{(1)}(0), \ldots, X_{(N)}(0))$ for the corresponding ordered vector. One possibility is to start with a uniform random initial configuration, by taking $X_1(0), \ldots, X_N(0)$ to be independent $U[0,1]^d$ random variables; here and elsewhere $U[0,1]^d$ denotes the uniform distribution on $[0,1]^d$. In this uniform random initialization, all $N$ points are indeed distinct with probability 1. Given $X_N(t)$, replace $X_N^*(t) = X_{(N)}(t)$ by an independent $U[0,1]^d$ random variable $U_{t+1}$, so that
\[ X_N(t+1) = \mathrm{ord}\big(X_{(1)}(t), \ldots, X_{(N-1)}(t), U_{t+1}\big). \]

The interesting case is when $N \geq 3$: the case $N = 1$ is trivial, and the case $N = 2$ is also uninteresting since at each step either point is replaced with probability $1/2$ by a $U[0,1]^d$ variable, so that, regardless of the initial configuration, after a finite number of steps we will have two independent $U[0,1]^d$ points. Our main result, Theorem 1.1, shows that for $N \geq 3$ all but the most extreme point of the configuration converge to a common limit.

Theorem 1.1. Let $d \in \mathbb{N}$ and $N \geq 3$. Let $X_N(0)$ consist of $N$ distinct points in $[0,1]^d$. There exists a random $\xi_N := \xi_N(X_N(0)) \in [0,1]^d$ such that
\[ X_N'(t) \xrightarrow{\ \mathrm{a.s.}\ } (\xi_N, \xi_N, \ldots, \xi_N), \quad \text{and} \quad X_N^*(t) - U_t \xrightarrow{\ \mathrm{a.s.}\ } 0, \tag{1.1} \]
as $t \to \infty$. In particular, for $U \sim U[0,1]^d$, as $t \to \infty$, $X_N(t) \xrightarrow{\ d\ } (\xi_N, \xi_N, \ldots, \xi_N, U)$.

Remark 1.1. Under the conditions of Theorem 1.1, despite the fact that $X_N^*(t) - U_t \to 0$ a.s., we will see below that $X_N^*(t) \neq U_t$ infinitely often a.s.

Theorem 1.1 is proved in Section 2. Then, Section 3 is devoted to the one-dimensional case, where we obtain various additional results on the limit $\xi_N$. A short simulation sketch of the dynamics follows.
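To make the dynamics concrete, here is a minimal Python sketch (ours, not part of the original analysis; the parameter choices $N = 5$, $d = 2$ and the horizon of 20000 steps are purely illustrative). It iterates the update rule above and reports the core diameter, the value of the Lyapunov function used in Section 2, and the core barycentre, which approximates $\xi_N$ once the core has nearly collapsed.

```python
import numpy as np

rng = np.random.default_rng(1)

def step(points, rng):
    """One update of the chain: replace the point farthest from the
    barycentre of the current configuration by a fresh U[0,1]^d point."""
    mu = points.mean(axis=0)
    far = np.argmax(np.linalg.norm(points - mu, axis=1))  # extreme point (ties: lowest index; a null event)
    new_points = points.copy()
    new_points[far] = rng.random(points.shape[1])
    return new_points

def core_summary(points):
    """Core (configuration with its extreme point removed), its barycentre,
    its diameter, and F = sum of squared distances of the core from the
    core's own barycentre (the Lyapunov function used in Section 2)."""
    mu = points.mean(axis=0)
    order = np.argsort(np.linalg.norm(points - mu, axis=1))
    core = points[order[:-1]]
    mu_core = core.mean(axis=0)
    F = float(((core - mu_core) ** 2).sum())
    diam = max(float(np.linalg.norm(a - b)) for a in core for b in core)
    return core, mu_core, diam, F

# Illustrative parameters (not from the paper): N = 5 points in d = 2, 20000 steps.
N, d, T = 5, 2, 20000
pts = rng.random((N, d))
for _ in range(T):
    pts = step(pts, rng)
core, mu_core, diam, F = core_summary(pts)
print("core diameter:", diam)                      # small: the core has nearly collapsed
print("Lyapunov value F(t):", F)                   # nonincreasing along the dynamics
print("core barycentre (approximates xi_N):", mu_core)
```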
Finally, the Appen- dices, A and B, collect some results on uniform spacings and continuity of distributional fixed-points that we use in parts of the analysis in Section 3. 2. Proof of convergence Intuitively, the evolution of the process is as follows. If, on replacement of the extreme point, the new point is the next extreme point (measured with respect to the Randomized Keynesian beauty contest 3 new centre of mass), then the core is unchanged. However, if the new point is not extreme, it typically penetrates the core significantly, while a more extreme point is thrown out of the core, reducing the size of the core in some sense (we give a precise statement below). Tracking the evolution of the core, by following its centre of mass, one sees increasingly long periods of inactivity, since as the size of the core decreases changes occur less often, and moreover the magnitude of the changes decreases in step with the size of the core. The dynamics are nontrivial, but bear some resemblance to random walks with decreasing steps (see e.g. [3, 8] and references therein) as well as processes with reinforcement such as the Pólya urn (see e.g. [12] for a survey). The process is also reminiscent in some ways of iterated interval division [6] or sequential adsorption with interaction [11]. Our analysis will rest on a ‘Lyapunov function’ for the process, that is, a function of the configuration that possesses pertinent asymptotic properties. One may initially hope, for example, that the diameter of the point set XN(t) would decrease over time, but this cannot be the case because the newly added point can be anywhere in [0, 1]d. What then about the diameter of X ′N(t), for which the extreme point is ignored? We will show later in this section that this quantity is in fact well behaved, but we have to argue somewhat indirectly: the diameter of X ′N(t) can increase (at least for N big enough; see Remark 2.2 below). However, there is a monotone decreasing function associated with the process, based on the sum of squared distances of a configuration, which we will use as our Lyapunov function. To define this function (see (2.9) below) we need some notation. For n ∈ N and Xn = (x1, x2, . . . , xn) ∈ (R d)n, write Gn(Xn) := Gn(x1, . . . , xn) := n −1 n ∑ i=1 i−1 ∑ j=1 ‖xi − xj‖ 2 = n ∑ i=1 ‖xi − µn(Xn)‖ 2; (2.1) a detailed proof of the (elementary) final equality in (2.1) may be found on pp. 95–96 of [5], for example. We remark that 1 n Gn is the squared radius of gyration of x1, . . . , xn: see e.g. [5], p. 95. Note also that calculus verifies the useful variational formula Gn(x1, . . . , xn) = inf y∈Rd n ∑ i=1 ‖xi − y‖ 2. (2.2) For n ≥ 2, define Fn(Xn) := Fn(x1, . . . , xn) := Gn−1(X ′ n) = Gn−1(x(1), . . . , x(n−1)). Lemma 2.1. Let n ≥ 2 and Xn = (x1, x2, . . . , xn) ∈ (R d)n. Then for any x ∈ Rd, Fn(x(1), . . . , x(n−1), x) ≤ Fn(Xn). Proof. For ease of notation, we write simply (x1, . . . , xn) for (x(1), . . . , x(n)), i.e., we relabel so that xj is the jth closest point to µn(Xn). Then X ∗ n = xn, X ′ n = (x1, . . . , xn−1), and Fold := Fn(Xn) = Gn−1(x1, . . . , xn−1) = n−1 ∑ i=1 ‖xi − µ ′ old‖ 2, (2.3) 4 MICHAEL GRINFELD, STANISLAV VOLKOV, ANDREW R. WADE where µ′old := µn−1(X ′ n). We compare Fold to Fn evaluated on the set of points obtained by removing xn and replacing it with some x ∈ R d. Write y := {x1, . . . , xn−1, x} ∗ for the new extreme point. Then Fnew := Fn(x1, . . . 
, xn−1, x) = n−1 ∑ i=1 ‖xi − µ ′ new‖ 2 + ‖x − µ′new‖ 2 − ‖y − µ′new‖ 2, (2.4) where µ′new := 1 n − 1 ( n−1 ∑ i=1 xi + x − y ) = µ′old + x − y n − 1 . (2.5) Denote µnew := µn(x1, . . . , xn−1, x), so µ′new = nµnew n − 1 − y n − 1 . (2.6) From (2.3), (2.4), and (2.5), we obtain Fnew − Fold = n−1 ∑ i=1 ( ‖xi − µ ′ new‖ 2 − ‖xi − µ ′ old‖ 2 ) + ‖x − µ′new‖ 2 − ‖y − µ′new‖ 2. (2.7) For the sum on the right-hand side of (2.7), we have that n−1 ∑ i=1 ( ‖xi − µ ′ new‖ 2 − ‖xi − µ ′ old‖ 2 ) = n−1 ∑ i=1 ( 2xi · (µ ′ old − µ ′ new) + ‖µ ′ new‖ 2 − ‖µ′old‖ 2 ) = (n − 1) ( 2µ′old · (µ ′ old − µ ′ new) + ‖µ ′ new‖ 2 − ‖µ′old‖ 2 ) = (n − 1) ( ‖µ′old‖ 2 − 2(µ′old · µ ′ new) + ‖µ ′ new‖ 2 ) . Simplifying this last expression and substituting back into (2.7) gives Fnew − Fold = (n − 1)‖µ′old − µ ′ new‖ 2 + ‖x − µ′new‖ 2 − ‖y − µ′new‖ 2. Thus, using (2.5) and then (2.6), Fnew − Fold = ‖x − y‖2 n − 1 + ‖x‖2 − ‖y‖2 − 2µ′new · (x − y) = ‖x‖2 + ‖y‖2 − 2x · y n − 1 + ‖x‖2 − ‖y‖2 − 2 ( nµnew n − 1 − y n − 1 ) · (x − y). Hence we conclude that Fnew − Fold = n n − 1 ( ‖x‖2 − ‖y‖2 − 2µnew · (x − y) ) = n n − 1 ( ‖x − µnew‖ 2 − ‖y − µnew‖ 2 ) ≤ 0, (2.8) since y is, by definition, the farthest point from µnew. Now we return to the stochastic model. Define F(t) := FN(XN (t)). (2.9) Lemma 2.1 has the following immediate consequence. Randomized Keynesian beauty contest 5 Corollary 2.1. Let N ≥ 2. Then F(t + 1) ≤ F(t). Corollary 2.1 shows that our Lyapunov function F(t) is nonincreasing; later we show that F(t) → 0 a.s. (see Lemma 2.4 below). First, we need to relate F(t) to the diameter of the point set X ′N(t). For n ≥ 2 and x1, . . . , xn ∈ R d, write Dn(x1, . . . , xn) := max 1≤i,j≤n ‖xi − xj‖. Lemma 2.2. Let n ≥ 2 and x1, . . . , xn ∈ R d. Then 1 2 Dn(x1, . . . , xn) 2 ≤ Gn(x1, . . . , xn) ≤ 1 2 (n − 1)Dn(x1, . . . , xn) 2. Remark 2.1. The lower bound in Lemma 2.2 is sharp, and is attained by collinear configurations with two diametrically opposed points xi, xj and all the other n−2 points at the midpoint µ2(xi, xj) = µn(x1, . . . , xn). The upper bound in Lemma 2.2 is not, in general, sharp; determining the sharp upper bound is a nontrivial problem. The bound Gn(x1, . . . , xn) ≤ n 2 ( d d+1 ) Dn(x1, . . . , xn) 2 from [16] is also not always sharp. Witsenhausen [16] conjectured that the maximum is attained if and only if the points are distributed as evenly as possible among the vertices of a regular d-dimensional simplex of edge-length Dn(x1, . . . , xn); this was proved relatively recently [1,15]. Proof of Lemma 2.2. Fix x1, . . . , xn ∈ R d. For ease of notation, write µ = µn(x1, . . . , xn). First we prove the lower bound. For n ≥ 2, using the second form of Gn in (2.1), Gn(x1, . . . , xn) = n ∑ i=1 ‖xi − µ‖ 2 ≥ ‖xi − µ‖ 2 + ‖xj − µ‖ 2, for (xi, xj) a diameter, i.e., Dn(x1, . . . , xn) = ‖xi − xj‖. By the n = 2 case of (2.2), ‖xi − µ‖ 2 + ‖xj − µ‖ 2 ≥ 2‖xi − µ2(xi, xj)‖ 2 = 1 2 ‖xi − xj‖ 2. This gives the lower bound. For the upper bound, from the first form of Gn in (2.1), Gn(x1, . . . , xn) ≤ 1 n n ∑ i=1 (i − 1)Dn(x1, . . . , xn) 2, by the definition of Dn, which yields the result. Let D(t) := DN−1(X ′ N (t)). Remark 2.2. By Lemma 2.2 (or (2.1)), G2(X ′ 3(t)) = 1 2 D2(X ′ 3(t)) 2, so when N = 3, Lemma 2.1 implies that D(t + 1) ≤ D(t) a.s. as well. If d = 1, it can be shown that D(t) is nonincreasing also when N = 4. In general, however, D(t) can increase. Let Ft := σ(XN (0), XN(1), . . . , XN(t)), the σ-algebra generated by the process up to time t. 
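The two deterministic facts just recorded are easy to spot-check numerically. The following short script (ours; a sanity check on random configurations, not part of the proofs) verifies Lemma 2.1, namely that replacing the extreme point by an arbitrary point cannot increase $F_n$, together with the bounds of Lemma 2.2 relating $G_n$ to the diameter.

```python
import numpy as np

rng = np.random.default_rng(0)

def G(points):
    """G_n: sum of squared distances from the barycentre, as in (2.1)."""
    mu = points.mean(axis=0)
    return float(((points - mu) ** 2).sum())

def F(points):
    """F_n: G_{n-1} evaluated on the core (configuration minus its extreme point)."""
    mu = points.mean(axis=0)
    order = np.argsort(np.linalg.norm(points - mu, axis=1))
    return G(points[order[:-1]])

def diameter(points):
    return max(float(np.linalg.norm(a - b)) for a in points for b in points)

# Random spot checks (not a proof): Lemma 2.1 and the bounds of Lemma 2.2.
for _ in range(10_000):
    n, d = int(rng.integers(3, 8)), int(rng.integers(1, 4))
    X = rng.random((n, d))
    # Lemma 2.1: replacing the extreme point by an arbitrary x cannot increase F_n.
    mu = X.mean(axis=0)
    extreme = np.argmax(np.linalg.norm(X - mu, axis=1))
    Y = X.copy()
    Y[extreme] = rng.random(d)
    assert F(Y) <= F(X) + 1e-12
    # Lemma 2.2: (1/2) D^2 <= G_n <= (1/2) (n-1) D^2.
    Dn = diameter(X)
    assert 0.5 * Dn**2 - 1e-12 <= G(X) <= 0.5 * (n - 1) * Dn**2 + 1e-12
print("all spot checks passed")
```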
Let B(x; r) denote the closed Euclidean d-ball with centre x ∈ Rd and radius r > 0. Define the events At+1 := {Ut+1 ∈ B(µN−1(X ′ N (t)); 3D(t))}, A ′ t+1 := {Ut+1 ∈ B(µN−1(X ′ N (t)); D(t)/4)}. 6 MICHAEL GRINFELD, STANISLAV VOLKOV, ANDREW R. WADE Lemma 2.3. There is an absolute constant γ > 0 for which, for all N ≥ 3 and all t, A′t+1 ⊆ {F(t + 1) − F(t) ≤ −γN −1F(t)} ⊆ {F(t + 1) − F(t) < 0} ⊆ At+1. (2.10) Moreover, there exist constants c > 0 and C < ∞, depending only on d, for which, for all N ≥ 3 and all t, a.s., P [ F(t + 1) − F(t) ≤ −γN−1F(t) | Ft ] ≥ P[A′t+1 | Ft] ≥ cN −d/2(F(t))d/2; (2.11) P [F(t + 1) − F(t) < 0 | Ft] ≤ P[At+1 | Ft] ≤ C(F(t)) d/2. (2.12) Proof. For simplicity we write X1, . . . , XN−1 instead of X(1)(t), . . . , X(N−1)(t) and D instead of D(t) = DN−1(X1, . . . , XN−1). By definition of D, there exists some i ∈ {1, . . . , N − 1} such that ‖µ′old − Xi‖ ≥ D/2, where µ ′ old = µN−1(X1, . . . , XN−1). Given Ft, the event A ′ t+1, that the new point U := Ut+1 falls in B(µ ′ old; D/4), has probability bounded below by θdD d, where θd > 0 depends only on d. Let µnew := µN(X1, . . . , XN−1, U). Suppose that A ′ t+1 occurs. Then, ‖µnew − µ ′ old‖ = 1 N ‖U − µ′old‖ ≤ D 4N ≤ D 12 , (2.13) since N ≥ 3. Hence, by (2.13) and the triangle inequality, ‖U − µnew‖ ≤ ‖U − µ ′ old‖ + ‖µnew − µ ′ old‖ ≤ D 4 + D 12 = 4D 12 . (2.14) On the other hand, by another application of the triangle inequality and (2.13), ‖µnew − Xi‖ ≥ ‖µ ′ old − Xi‖ − ‖µnew − µ ′ old‖ ≥ D 2 − D 12 = 5D 12 . Then, by definition, the extreme point Y := {X1, . . . , XN−1, U} ∗ satisfies ‖Y − µnew‖ ≥ ‖µnew − Xi‖ ≥ 5D 12 . (2.15) Hence from the x = U case of (2.8) with the bounds (2.14) and (2.15), we conclude F(t + 1) − F(t) ≤ N N − 1 ( ( 4D 12 )2 − ( 5D 12 )2 ) 1(A′t+1) ≤ − 9 144 D21(A′t+1), (2.16) for all N ≥ 3; the first inclusion in (2.10) follows (with γ = 9/72) from (2.16) together with the fact that, by the second inequality in Lemma 2.2, D2 ≥ 2N−1F(t). This in turn implies (2.11), using the fact that P[A′t+1 | Ft] ≥ θdD d. Next we consider the event At+1. Using the same notation as above, we have that ‖µnew − U‖ ≥ ‖µ ′ old − U‖ − ‖µnew − µ ′ old‖ = ( 1 − 1 N ) ‖µ′old − U‖, by the equality in (2.13). Also, for any k ∈ {1, . . . , N − 1}, ‖µnew − Xk‖ ≤ ‖µ ′ old − Xk‖ + ‖µ ′ old − µnew‖ ≤ D + 1 N ‖µ′old − U‖, Randomized Keynesian beauty contest 7 by (2.13) again. Combining these estimates we obtain, for any k ∈ {1, . . . , N − 1}, ‖µnew − U‖ − ‖µnew − Xk‖ ≥ ( 1 − 2 N ) ‖µ′old − U‖ − D ≥ 1 3 ‖µ′old − U‖ − D, for N ≥ 3. So in particular, ‖µnew − U‖ > ‖µnew − Xk‖ for all k ∈ {1, . . . , N − 1} provided ‖µ′old − U‖ > 3D, i.e., U /∈ B(µ ′ old; 3D). In this case, U is the extreme point among U, X1, . . . , XN−1, i.e., Act+1 ⊆ {X ∗ N(t + 1) = Ut+1}. (2.17) In particular, on Act+1, F(t + 1) = F(t), and F(t + 1) < F(t) only if At+1 occurs, giving the final inclusion in (2.10). Since P[At+1 | Ft] is bounded above by CdD d for a constant Cd < ∞ depending only on d, (2.12) follows from the first inequality in Lemma 2.2. This completes the proof. Lemma 2.4. Suppose that N ≥ 3. Then, as t → ∞, F(t) → 0 a.s. and in L2. Proof. Let ε > 0 and let σ := min{t ∈ Z+ : F(t) ≤ ε}, where Z+ := {0, 1, 2, . . .}. Then by (2.11), there exists δ > 0 (depending on d, ε, and N) such that, a.s., P [F(t + 1) − F(t) ≤ −δ | Ft] ≥ δ1{t < σ}. Hence, since F(t + 1) − F(t) ≤ 0 a.s. by Corollary 2.1, E [F(t + 1) − F(t) | Ft] ≤ −δ 2 1{t < σ}. (2.18) By Corollary 2.1, F(t) is nonnegative and nonincreasing, and hence F(t) converges a.s. 
as t → ∞ to some nonnegative limit F(∞); the convergence also holds in L2 since F(t) is uniformly bounded. In particular, E[F(t)] → E[F(∞)]. So taking expectations in (2.18) and letting t → ∞ we obtain lim sup t→∞ δ2P[σ > t] ≤ 0, which implies that P[σ > t] → 0 as t → ∞. Thus σ < ∞ a.s., which together with the monotonicity of F(t) (Corollary 2.1) implies that F(t) ≤ ε for all t sufficiently large. Since ε > 0 was arbitrary, the result follows. Recall the definition of At and A ′ t from before Lemma 2.3. Define (Ft) stopping times τ0 := 0 and, for n ∈ N, τn := min{t > τn−1 : At occurs}. Then F(t) < F(t − 1) can only occur if t = τn for some n. Since P[At+1 | Ft] is bounded below by cF(t) d/2 for some c > 0, it is not hard to see that, provided F(0) > 0, At occurs infinitely often, a.s. Indeed, suppose that F(0) > 0 and At occurs only finitely often. Then F(t) has a non-zero limit. On the other hand, ∑ t cF(t) d/2 ≤ ∑ t P[At+1 | Ft] < ∞, by Lévy’s extension of the Borel–Cantelli lemma, so that F(t) → 0 a.s., which is a contradiction. Hence τn < ∞, a.s., for all n. Lemma 2.5. Let N ≥ 3. There exists α > 0 such that, a.s., D(τn) ≤ e −αn for all n sufficiently large. 8 MICHAEL GRINFELD, STANISLAV VOLKOV, ANDREW R. WADE Proof. We have from (2.16) and the second inequality in Lemma 2.2 that F(τn) − F(τn − 1) ≤ −δF(τn − 1)1(A ′ τn ), for some δ > 0. Note also that, by definition of the stopping times τn, F(τn − 1) = F(τn−1). Hence, P[F(τn) − F(τn−1) ≤ −δF(τn−1) | Fτn−1] ≥ P[A ′ τn | Fτn−1] ≥ δ, taking δ > 0 small enough, since, using the fact that 1(Aτn) = 1 a.s., P[A′τn | Fτn−1] = E [ P[A′τn | Fτn]1(Aτn) | Fτn−1 ] = E [ P[A′τn | Aτn] | Fτn−1 ] , where by definition of At and A ′ t, P[A ′ τn | Aτn] is uniformly positive. Since F(t + 1) − F(t) ≤ 0 a.s. (by Corollary 2.1) it follows that E [ F(τn) − F(τn−1) | Fτn−1 ] ≤ −δ2F(τn−1). Taking expectations, we obtain E[F(τn)] ≤ (1 − δ 2)E[F(τn−1)], which implies that E[F(τn)] = O(e −cn), for some c > 0 depending on δ. Then by Markov’s inequality, P[F(τn) ≥ e −cn/2] = O(e−cn/2), which implies that F(τn) = O(e −cn/2), a.s., by the Borel–Cantelli lemma. Then the first inequality in Lemma 2.2 gives the result. Remark 2.3. The proof of Lemma 2.5 shows that P[A′τn | Fτn−1] is uniformly positive, so Lévy’s version of the Borel–Cantelli lemma, with the fact that τn < ∞ a.s. for all n, shows that A′t occurs for infinitely many t, a.s. With the proof of Lemma 2.3, this shows that X ∗N(t) 6= Ut infinitely often, as claimed in Remark 1.1. We are almost ready to complete the proof of Theorem 1.1. We state the main step in the remaining argument as the first part of the the next lemma, while the second part of the lemma we will need in Section 3.3 below. For ε > 0, define the stopping time νε := min{t ∈ N : F(t) < ε 2}; for any ε > 0, νε < ∞ a.s., by Lemma 2.4. Lemma 2.6. Let N ≥ 3. Then there exists ξN ∈ [0, 1] d such that µN−1(X ′ N (t)) → ξN a.s. and in L2 as t → ∞. Moreover, there exists an absolute constant C such that for any ε > 0, and any t0 ∈ N, on {νε ≤ t0}, a.s., E [ max t≥t0 ‖µN−1(X ′ N (t)) − µN−1(X ′ N (t0))‖ | Ft0 ] ≤ Cε. Proof. Let µ′(t) := µN−1(X ′ N (t)). Observe that for N ≥ 3, X ′ N (t) and X ′ N(t − 1) have at least one point in common; choose one such point, and call it Z(t). Then µ′(t) ∈ hull X ′N(t) ⊆ hull XN(t), where hull X denotes the convex hull of the point set X . So ‖Z(t) − µ′(t)‖ ≤ D(t). Similarly ‖Z(t) − µ′(t − 1)‖ ≤ D(t − 1). 
By definition of $\tau_n$, $\mu'(t) = \mu'(t-1)$ and $D(t) = D(t-1)$ unless $t = \tau_n$ for some $n$, in which case $\mu'(\tau_n - 1) = \mu'(\tau_{n-1})$ and $D(\tau_n - 1) = D(\tau_{n-1})$. Hence,
\[ \sum_{t\geq 1} \|\mu'(t) - \mu'(t-1)\| = \sum_{n\geq 1} \|\mu'(\tau_n) - \mu'(\tau_{n-1})\| \leq \sum_{n\geq 1} \big( \|\mu'(\tau_n) - Z(\tau_n)\| + \|\mu'(\tau_{n-1}) - Z(\tau_n)\| \big), \tag{2.19} \]
by the triangle inequality. Then the preceding remarks imply that
\[ \sum_{t\geq 1} \|\mu'(t) - \mu'(t-1)\| \leq \sum_{n\geq 1} \big( D(\tau_n) + D(\tau_{n-1}) \big) < \infty, \quad \text{a.s.}, \]
by Lemma 2.5. Hence there is some (random) $\xi_N \in [0,1]^d$ for which $\mu'(t) \to \xi_N$ a.s. as $t \to \infty$, and $L^2$ convergence follows by the bounded convergence theorem.

For the final statement in the lemma we use a variation of the preceding argument. Let $M := \max\{n \in \mathbb{Z}_+ : \tau_n \leq t_0\}$. Then $F(t_0) = F(\tau_M)$ and $\mu'(\tau_M) = \mu'(t_0)$, so that on $\{\nu_\varepsilon \leq t_0\}$ we have $\{\nu_\varepsilon \leq \tau_M\}$ as well. Hence (by Corollary 2.1) $F(\tau_M) < \varepsilon^2$. A similar argument to that in the proof of Lemma 2.5 shows that, for $m \geq 0$,
\[ \mathbb{E}[F(\tau_{M+m}) \mid \mathcal{F}_{t_0}] \leq e^{-cm}\, \mathbb{E}[F(\tau_M) \mid \mathcal{F}_{t_0}] \leq \varepsilon^2 e^{-cm}, \]
on $\{\nu_\varepsilon \leq t_0\}$, where $c > 0$ depends on $N$ but not on $m$ or $\varepsilon$. Thus by Lemma 2.2, on $\{\nu_\varepsilon \leq t_0\}$, $\mathbb{E}[D(\tau_{M+m})^2 \mid \mathcal{F}_{t_0}] \leq 2\varepsilon^2 e^{-cm}$. Also, similarly to (2.19),
\[ \max_{t\geq\tau_M} \|\mu'(t) - \mu'(\tau_M)\|^2 \leq \sum_{t\geq\tau_M} \|\mu'(t) - \mu'(t-1)\|^2 \leq \sum_{m\geq 1} \big( D(\tau_{M+m}) + D(\tau_{M+m-1}) \big)^2. \]
Taking expectations and using the Cauchy–Schwarz inequality, we obtain, on $\{\nu_\varepsilon \leq t_0\}$,
\[ \mathbb{E}\Big[ \max_{t\geq t_0} \|\mu'(t) - \mu'(t_0)\|^2 \mid \mathcal{F}_{t_0} \Big] = \mathbb{E}\Big[ \max_{t\geq\tau_M} \|\mu'(t) - \mu'(\tau_M)\|^2 \mid \mathcal{F}_{t_0} \Big] \leq 8\varepsilon^2 e^{c} \sum_{m\geq 1} e^{-cm}, \]
a constant times $\varepsilon^2$. Jensen's inequality now gives the result, with $C^2 = \frac{8}{1 - e^{-c}}$.

Proof of Theorem 1.1. Again let $\mu'(t) := \mu_{N-1}(X_N'(t))$. We have from Lemma 2.6 that $\mu'(t) \to \xi_N$ a.s. Now, for any $j \in \{1, \ldots, N-1\}$, by the triangle inequality,
\[ \|X_{(j)}(t) - \xi_N\| \leq \|X_{(j)}(t) - \mu'(t)\| + \|\mu'(t) - \xi_N\| \leq D(t) + \|\mu'(t) - \xi_N\|, \]
which tends to 0 a.s. as $t \to \infty$, since $D(t) \to 0$ a.s. by Lemma 2.5. This establishes the first statement in (1.1). Moreover, by (2.17), $X_N^*(t+1) \neq U_{t+1}$ only if $A_{t+1}$ occurs. On $A_{t+1}$, $X_N^*(t+1)$ is one of the points of $X_N'(t)$, and so $\|X_N^*(t+1) - \mu'(t)\| \leq D(t)$. In addition, on $A_{t+1}$, we have $\|U_{t+1} - \mu'(t)\| \leq 3D(t)$. So by the triangle inequality, $\|X_N^*(t+1) - U_{t+1}\| \leq 4D(t)\mathbf{1}(A_{t+1})$, which tends to 0 a.s., again by Lemma 2.5. This gives the final part of (1.1).

3. The limit distribution in one dimension

3.1. Overview and simulations

Throughout this section we restrict attention to $d = 1$. Of interest is the distribution of the limit $\xi_N$ in (1.1), and its behaviour as $N \to \infty$. Simulations suggest that $\xi_N$ is highly dependent on the initial configuration: Figure 1 shows histogram estimates for $\xi_N$ from repeated simulations with a deterministic initial condition. In more detail, $10^8$ runs of each simulation were performed, each starting from the same initial condition; each run was terminated when $D(t) < 0.0001$ for the first time, and the value of $\mu_{N-1}(X_N'(t))$ was output as an approximation to $\xi_N$ (cf. Theorem 1.1). Note that, by (2.10), in the simulations one may take the new points not $U[0,1]$ but uniform on a typically much smaller interval, which greatly increases the rate of updates to the core configuration; a sketch of this procedure is given below.

Figure 1: Normalized histograms each based on $10^8$ simulations, with $N = 3$ and initial points $\frac14, \frac12, \frac34$ (left) and $N = 7$ and initial points $\frac{k}{8}$, $k \in \{1, \ldots, 7\}$ (right).

Figure 2 shows sample results obtained with an initial condition of $N$ i.i.d. $U[0,1]$ random points.
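The following Python sketch (ours; 2000 repetitions rather than the $10^8$ used for the figures, purely to indicate the procedure) implements the simulation scheme just described for $d = 1$: each run restarts from the same initial configuration, the replacement point is drawn uniformly from $B(\mu_{N-1}(X_N'(t)); 3D(t)) \cap [0,1]$, which is the speed-up permitted by (2.10), and the run stops once $D(t) < 10^{-4}$, returning the core mean as an approximation to $\xi_N$.

```python
import numpy as np

rng = np.random.default_rng(2)

def limit_estimate(x0, tol=1e-4, rng=rng):
    """One run of the d = 1 process started from x0, using the speed-up
    permitted by (2.10): only new points landing in B(mu', 3D) can alter the
    core, so the replacement is sampled uniformly on that ball intersected
    with [0,1].  Stop when the core diameter drops below tol and return the
    core mean as an approximation to xi_N."""
    pts = np.asarray(x0, dtype=float)
    while True:
        mu = pts.mean()
        order = np.argsort(np.abs(pts - mu))
        core = np.sort(pts[order[:-1]])     # drop the extreme point
        D = core[-1] - core[0]
        if D < tol:
            return core.mean()
        mu_core = core.mean()
        lo, hi = max(0.0, mu_core - 3.0 * D), min(1.0, mu_core + 3.0 * D)
        pts = np.append(core, rng.uniform(lo, hi))

# A small version of the experiment behind Figure 1 (left): N = 3, start 1/4, 1/2, 3/4.
# 2000 runs here purely for illustration; the figures use 10**8.
estimates = np.array([limit_estimate([0.25, 0.5, 0.75]) for _ in range(2000)])
print("sample mean of the xi_3 estimates:", estimates.mean())
```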
Now the histograms appear much simpler, although, of course, they can be viewed as mixtures of complicated multi-modal histograms similar to those in Figure 1. In the uniform case, it is natural to ask whether $\xi_N$ converges in distribution to some limit distribution as $N \to \infty$. The form of the histograms in Figure 2 might suggest a Beta distribution (this is one sense in which the randomized beauty contest is "reminiscent of a Pólya urn" [4, p. 390]). An ad hoc Kolmogorov–Smirnov analysis (see Table 1) suggests that the distributions are indeed 'close' to Beta distributions, but different enough for the match to be unconvincing. Simulations for large $N$ are computationally intensive. We remark that it is not unusual for Beta or 'approximate Beta' distributions to appear as limits of schemes that proceed via iterated procedures on intervals: see for instance [6] and references therein.

    N      β       κ(β)
    3      1.256   0.0010
    10     1.392   0.0016
    50     1.509   0.0018
    100    1.539   0.0019

Table 1: κ(β) is the Kolmogorov–Smirnov distance between a Beta(β, β) distribution and the empirical distribution from the samples of size $10^8$ plotted in Figure 2, minimized over β in each case.

Figure 2: Normalized histograms for $10^8$ simulations with random i.i.d. uniform initial conditions, with (top row) $N = 3, 10$ and (bottom row) $N = 50, 100$.

In the rest of this section we study $\xi_N$ and its distribution. Our results on the limit distribution, in particular, leave several interesting open problems, including a precise description of the phenomena displayed by the simulations reported above. In Section 3.2 we give an alternative (one might say 'phenomenological') characterization of the limit $\xi_N$, and contrast this with an appropriate rank-driven process in the sense of [4]. In Section 3.3 we show that the distribution of $\xi_N$ is fully supported on $(0,1)$ and assigns positive probability to any proper interval, using a construction permitting transformations of configurations. Finally, Section 3.4 is devoted to the case $N = 3$, for which some explicit computations for the distribution of $\xi_N$ (in particular, its moments) are carried out.

3.2. A characterization of the limit

Let
\[ \pi_N(t) := \frac{1}{t}\, \#\big\{ s \in \{1, 2, \ldots, t\} : X_N^*(s) < \mu_N(X_N(s)) \big\}, \]
the proportion of times up to time $t$ for which the extreme point was the leftmost point (as opposed to the rightmost). The next result shows that $\pi_N(t)$ converges to the (random) limit $\xi_N$ given by Theorem 1.1; we give the proof after some additional remarks.

Proposition 3.1. Let $d = 1$ and $N \geq 3$. Then $\lim_{t\to\infty} \pi_N(t) = \xi_N$ a.s.

It is instructive to contrast this behaviour with a suitable rank-driven process (cf. [4]). Namely, fix a parameter $\pi \in (0,1)$. Take $N$ points in $[0,1]$, and at each step in discrete time replace either the leftmost point (with probability $\pi$) or else the rightmost point (probability $1 - \pi$), independently at each step; inserted points are independent $U[0,1]$ variables. For this process, results of [4] show that the marginal distribution of a typical point converges (as $t \to \infty$ and then $N \to \infty$) to a unit point mass at $\pi$ (cf. Remark 3.2 in [4]). This leads us to one sense in which the randomized beauty contest is, to a limited extent, "reminiscent of a Pólya urn" [4, p. 390].
Recall that a Pólya urn consists of an increasing number of balls, each of which is either red or blue; at each step in discrete time, a ball is drawn uniformly at random from the urn and put back into the urn together with an extra ball of the same colour. The stochastic process of interest is the proportion of red balls, say; it converges to a random limit π′, which has a Beta distribution. The beauty contest can be viewed as occupying a similar relation to the rank-driven process described above as the Pólya urn process does to the simpler model in which, at each step, independently, either a red ball is added to the urn (with probability π′) or else a blue ball is added (probability 1 − π′). Proof of Proposition 3.1. Given τ0, τ1, τ2, . . ., ξN = limn→∞ µN−1(X ′ N(τn)) is in- dependent of Ut, t /∈ {τ0, τ1, . . .}, since, by (2.17), any such Ut is replaced at time t + 1. Let ε > 0. By Theorem 1.1, there exists a random T < ∞ a.s. for which max1≤i≤N−1 |ξN − X(i)(t)| ≤ ε for all t ≥ T . Since µN(XN(t + 1)) = N−1 N µN−1(X ′ N(t)) + 1 N Ut+1, we have that for t ≥ T , using the triangle inequality, for any i ∈ {1, . . . , N − 1}, |µN(XN (t + 1)) − X(i)(t)| ≤ ε + |µN(XN(t + 1)) − ξN| ≤ ε + N − 1 N |µN−1(X ′ N(t)) − ξN | + 1 N |Ut+1 − ξN |. Hence, for t ≥ T , max 1≤i≤N−1 ∣ ∣X(i)(t) − µN(XN(t + 1)) ∣ ∣ ≤ 1 N |ξN − Ut+1| + 2ε. (3.1) On the other hand, for t ≥ T , µN(XN (t + 1)) ≥ N−1 N (ξN − ε) + 1 N Ut+1, so that for i ∈ {1, . . . , N − 1}, µN(XN (t + 1)) − Ut+1 ≥ N − 1 N (ξN − Ut+1 − ε). (3.2) Randomized Keynesian beauty contest 13 Suppose that Ut+1 < ξN − Kε for some K ∈ (1, ∞). Then, from (3.1) and (3.2), |µN(XN (t + 1)) − Ut+1| − max 1≤i≤N−1 ∣ ∣X(i)(t) − µN(XN(t + 1)) ∣ ∣ ≥ N − 2 N (ξN − Ut+1) − 3N − 1 N ε > ε N ((N − 2)K − 3N + 1) . This last expression is positive provided K ≥ 3N−1 N−2 , which is the case for all N ≥ 3 with the choice K = 8, say. Hence, with this choice of K, {Ut+1 < ξN − 8ε} implies that Ut+1 is farther from µN+1(XN(t + 1)) than is any of the points left over from X ′N(t). Write Lt := {Ut < ξN − 8ε}. Then we have shown that, for t ≥ T , the event Lt implies that Ut = X ∗ N(t), and, moreover, Ut < µN(XN (t)). Hence, for t ≥ T , πN(t) ≥ 1 t t ∑ s=T 1(Ls) ≥ 1 t t ∑ s=T s/∈{τ0,τ1,...} 1(Ls). Given τ0, τ1, . . ., Us, s /∈ {τ0, τ1, . . .} are independent of T and ξN. For such an s, Us is uniform on Is := [0, 1] \ B(µN−1(X ′ N (s)); 3D(s)), and, for s ≥ T , D(s) ≤ 2ε so that Is ⊇ [0, max{ξN − 8ε, 0}] ∪ [min{ξN + 8ε, 1}, 1]. Hence, given s /∈ {τ0, τ1, . . .} and s ≥ T , P[Ls] = P[Us < ξN − 8ε | Us ∈ Is] ≥ ξN − 8ε. Hence, considering separately the cases ξN > 9ε and ξN ≤ 9ε, the strong law of large numbers implies that 1 t t ∑ s=T s/∈{τ0,τ1,...} 1(Ls) ≥ ξN − 9ε, for all t sufficiently large; here we have used the fact that t − T → ∞ a.s. as t → ∞ and #{n ∈ Z+ : τn ≤ t} = o(t) a.s., which follows from (2.12) and Lemma 2.4 with a standard concentration argument using e.g. the Azuma–Hoeffding inequality. Since ε > 0 was arbitrary, it follows that lim inft→∞ t −1πN(t) ≥ ξN a.s. The sym- metrical argument considering events of the form Rt := {Ut > ξN + 8ε} shows that lim inft→∞(1 − t −1πN(t)) ≥ 1 − ξN a.s., so lim supt→∞ t −1πN(t) ≤ ξN a.s. Combining the two bounds gives the result. 3.3. The limit has full support In this section, we prove that ξN is fully supported on (0, 1) in the sense that ess inf ξN = 0, ess sup ξN = 1, and ξN assigns positive probability to any non-null interval. Let mN(0) := min{‖Xi(0) − Xj(0)‖ : i, j ∈ {0, 1, . . . 
, N + 1}, i 6= j}, (3.3) where we use the conventions X0(0) := 0 and XN+1(0) := 1. For ρ > 0 let Sρ denote the F0-event Sρ := {mN(0) ≥ ρ} that no point of XN(0) is closer than distance ρ to any other point of XN(0) or to either of the ends of the unit interval. 14 MICHAEL GRINFELD, STANISLAV VOLKOV, ANDREW R. WADE Proposition 3.2. Let d = 1 and N ≥ 3. Let ρ ∈ (0, 1). For any non-null interval subset I of [0, 1], there exists δ > 0 (depending on N, I, and ρ) for which P[ξN ∈ I | XN(0)] ≥ δ1(Sρ), a.s. (3.4) In particular, in the case where XN(0) consists of N independent U[0, 1] points, P[ξN ∈ I] > 0 for any non-null interval I ⊆ [0, 1]. We suspect, but have not been able to prove, that ξN has a density fN with respect to Lebesgue measure, i.e., ξN is absolutely continuous in the sense that for every ε > 0 there exists δ > 0 such that P[ξN ∈ A] < ε for every A with Lebesgue measure less than δ. Note that P[ξN ∈ A | XN(0)] may be 0 if XN(0) contains non-distinct points: e.g. if N ≥ 3 and XN(0) = (x, x, . . . , x, y), then X ′ N(t) = (x, x, . . . , x) for all t. For a ∈ [0, 1], ε > 0, and t ∈ N, define the event Ea,ε(t) := N ⋂ i=1 {|Xi(t) − a| < ε} . The main new ingredient needed to obtain Proposition 3.2 is the following result. Lemma 3.1. Let N ≥ 3. For any ρ ∈ (0, 1) and ε > 0 there exist t0 ∈ N and δ0 > 0 (depending on N, ρ, and ε) for which, for all a ∈ [0, 1], P[Ea,ε(t0) | XN(0)] > δ01(Sρ), a.s. Proof. Fix a ∈ [0, 1]. Let ρ ∈ (0, 1) and ε > 0. It suffices to suppose that ε ∈ (0, ρ), since Ea,ε(t) ⊆ Ea,ε′(t) for ε ′ ≥ ε. Suppose that Sρ occurs, so that mN(0) ≥ ρ with mN(0) defined at (3.3). For ease of notation we list the points of XN (0) in increasing order as 0 < X1 < X2 < · · · < XN < 1. Let M = ⌊N/2⌋. Let ν = ε/N2. The following argument shows how one can arrive at a configuration at a finite (deterministic) time t0 where all of X1(t0), . . . , XN(t0) lie inside (a−ε, a+ε) with a positive (though possibly very small) probability. Let us call the points which are present at time 0 old points; the points which will gradually replace this set will be called new points. We will first describe an event by which all the old points are removed and replaced by new points arranged approximately equidistantly in the interval [XM, XM+1], and then we will describe an event by which such a configuration can migrate to the target interval. Step 1. Starting from time 0, iterate the following procedure until a new point becomes an extreme point. The construction is such that at each step, the extreme point is one of the old points, either at the extreme left or right of the configuration. At each step, the extreme old point is removed and replaced by a new U[0, 1] point to form the configuration at the next time unit. We describe an event of positive probability by requiring the successive new arrivals to fall in particular intervals, as follows. The first old point removed from the right is replaced by a new point in (XM + ν, XM + ν + δ), where δ ∈ (0, ν) will be specified later. Subsequently, the ith point (i ≥ 2) removed from the right is replaced by a new point in (XM +iν, XM +iν +δ). We call this subset of new points the accumulation on the left. On the other hand, the ith extreme point removed from the left (i ∈ N) is replaced by a new point in (XM+1 −iν, XM+1 −iν +δ). This second subset of new points will be called the accumulation on the right. 
Randomized Keynesian beauty contest 15 During the first M steps of this procedure, the new points are necessarily internal points of the configuration and so are never removed. Therefore, there will be a time t1 ∈ [M, N] at which, for the first time, one of the new points becomes either the leftmost or rightmost point of XN(t1); suppose that it is the rightmost, since the argument in the other case is analogous. If at time t1 the accumulation on the right is non-empty, we continue to perform the procedure described in Step 1, but now allowing ourselves to remove new points from the accumulation on the right. So we continue putting extra points on the accumulation on the left whenever the rightmost point is removed, and similarly putting extra points to the accumulation on the right whenever the leftmost point is removed, as described for Step 1. Eventually we will have either (a) a configuration where all the new points of the left or the right accumulation are completely removed, and there are still some of the old points left, or (b) a configuration where all old points are removed. The next step we describe separately for these two possibilities. Step 2(a). Without loss of generality, suppose that the accumulation on the right is empty, so the configuration consists of k points of the left accumulation and N − k old points remaining to the left of XM (including XM itself). Note that Step 1 produces at least M new points, so M ≤ k ≤ N − 1, since by assumption we have at least one old point remaining. Let us now denote the points of the configuration x1 < x2 < · · · < xN so that xN−k = XM, and by the construction in Step 1, xN−k+i ∈ (XM + iν, XM + iν + δ) for i = 1, 2, . . . , k. Provided that k ≤ N − 2, so that there are at least 2 old points, we will show that x1 is necessarily the extreme point of the configuration. Indeed, writing µ = µN(x1, . . . , xN), using the fact that xN−k+i ≥ xN−k + iν for 1 ≤ i ≤ k and xN ≤ xN−k + kν + δ, we have µ − x1 + xN 2 ≥ x1 + · · · + xN−k + kxN−k + 1 2 νk(k + 1) N − x1 + xN−k + νk + δ 2 = 1 2N (2x1 + · · · + 2xN−k + (2k − N)xN−k − Nx1 + νk(k + 1 − N) − δN) . The old points all have separation at least ρ, so for 1 ≤ i ≤ N − k, xi ≥ x1 + (i − 1)ρ, and hence 2x1 +· · ·+2xN−k +(2k−N)xN−k ≥ Nx1 +ρ(N −k−1)(N −k)+ρ(2k−N)(N −k−1). It follows, after simplification, that µ − x1 + xN 2 ≥ 1 2N (k(N − k − 1)(ρ − ν) − δN) ≥ 1 2N ((N − 2)(ρ − ν) − δN) , provided 1 ≤ k ≤ N − 2. By choice of ν, we have ν ≤ ρ/9 and it follows that the last displayed expression is positive provided δ is small enough compared to ρ (δ < ρ/4, say). Hence |x1 −µ| > |xN −µ|. Thus next we remove x1. We replace it similarly to the procedure in Step 1, but now building up the accumulation on the left. We can thus iterate this step, removing old points from the left and building up the accumulation on the left, while keeping the accumulation on the right empty, until we get just one old point remaining (i.e. until k = N −1); this last old point will be XM . At this stage, 16 MICHAEL GRINFELD, STANISLAV VOLKOV, ANDREW R. WADE after a finite number of steps, we end up with a configuration where the set of points x1 < x2 < · · · < xN satisfies xi ∈ [XM + (i − 1)ν, XM + (i − 1)ν + δ], i = 1, 2, . . . , N. Step 2(b). Suppose that the configuration is such that all old points have been removed but both left and right accumulations are non-empty. 
Repeating the procedure of Step 1, replacing rightmost points by building the left accumulation and leftmost points by building the right accumulation, we will also, in a finite number of steps, obtain a set points xi such that xi ∈ [b + (i − 1)ν, b + (i − 1)ν + δ], i = 1, 2, . . . , N, for some b ∈ [0, 1]. Step 3. Now we will show how one can get to the situation where all points lie inside the interval (a − ε, a + ε) starting from any configuration in which xi ∈ [b + iν, b + iν + δ], i = 1, 2, . . . , N − 1, (3.5) where b ∈ [0, 1] and x1 < · · · < xN−1 are the core points of the configuration (i.e., with the extreme point removed). We have shown in Step 1 and Step 2 how we can achieve such a configuration in a finite time with a positive probability. Suppose that a > b; the argument for the other case is entirely analogous. We describe an event of positive probability by which the entire configuration can be moved to the right. Having just removed the extreme point, we stipulate that the new point y1 belong to (b + Nν − 6δ, b + Nν − 5δ), so y1 > xN−1 is the new rightmost point provided δ < ν/7. Then to ensure that x1, and not y1, is the most extreme point we need x1 + y1 2 − [ b + ν N + 1 2 ] < x1 + · · · + xN−1 + y1 N − [ b + ν N + 1 2 ] . The left-hand side of the last inequality is less than −2δ while the right-hand side is more than −6δ N , so the inequality is indeed satisfied provided N ≥ 3. b + ν ✲✛ δ ✉ b + 2ν ✲✛ δ ✉ · · · b + (N − 1)ν ✲✛ δ ✉ b + Nν ✛ ✲✛✲ δ 5δ ✉ Figure 3: Schematic of a configuration at the start of Step 3. The disks represent the points x1, x2, . . . , xN−1 and, on the extreme right, the new point y1. Hence at the next step x1 is removed. Our new collection of core points is x2 < · · · < xN−1 < y1. We stipulate that the next new point y2 arrive in (b + (N + 1)ν − 18δ, b + (N + 1)ν − 17δ). So again, for δ small enough (δ < ν/13 suffices), y2 > y1 and the newly added point (y1) becomes the rightmost point in the configuration. Again, to ensure that the leftmost point (x2) is now the extreme one, we require x2 + y2 2 − [ b + ν N + 3 2 ] < x2 + · · · + xN−1 + y1 + y2 N − [ b + ν N + 3 2 ] . The left-hand side of the last inequality is less than −8δ, while the right-hand side is more than −24δ/N, and so the displayed inequality is true provided N ≥ 3. We will repeat this process until we remove the rightmost core point present at the start of Step 3, namely xN−1, located in [b + (N − 1)ν, b + (N − 1)ν + δ]. We will Randomized Keynesian beauty contest 17 demonstrate how we can do this, in succession removing points from the left of the configuration and at each step replacing them by points on the right with careful choice of locations for the new points. We consecutively put new points yk at locations in intervals ∆k := (b + (N − 1 + k)ν − 2 · 3 kδ, b + (N − 1 + k)ν − 2 · 3kδ + δ), for k = 1, 2, . . . , N − 1. We have just shown that for k = 1, 2 this procedure will maintain the leftmost point (xk) as the extreme one. Let us show that this is true for all 1 ≤ k ≤ N − 1, by an inductive argument. Indeed, suppose that the original points x1, x2, . . . , xk−1 have been removed, the successive new points yj are located in ∆j, j = 1, 2, . . . , k − 1, and that the replacement for the most recently removed point xk−1 is the new point yk. Place the new point yk in ∆k. Provided δ < ν 4·3k−1+1 , yk > yk−1 and yk is the rightmost point of the new configuration, while the leftmost point is xk ∈ [b + νk, b + νk + δ]. 
Since N ≥ 3 we have xk + yk 2 ≤ b + [ N − 1 2 + k ] ν − [3k − 1]δ ≤ b + [ N − 1 2 + k ] ν − 2(3 + 32 + · · · + 3k)δ N ≤ xk + · · · + xN−1 + y1 + · · · + yk N , thus ensuring that the leftmost point xk, and not yk, is the farthest from the centre of mass. Thus, provided δ < 3−Nν, say, we proceed to remove all the points xk and end up with a new collection of points x′1, . . . , x ′ N−1 satisfying the property x′i ∈ [b ′ + iν, b′ + iν + δ′], i = 1, 2, . . . , N − 1, where b′ = b + (N − 1)ν − δ′ and δ′ := 3Nδ (> 2 · 3N−1δ). Thus the situation is similar to the one in (3.5) but with b replaced by b′ > b + ν, so the whole “grid” is shifted to the right. Hence, provided δ is small enough, and δ′ and its subsequent analogues remain such that δ′ < 3−Nν, we can repeat the above procedure and move points to the right again, etc., a finite number of times (depending on |b−a|/ν) until the moment when all the new points are indeed in (a − ε, a + ε), and the probability of making all those steps is strictly positive. In particular, we can check that taking δ < 3−2N/νν will suffice. All in all, we have performed a finite number of steps, which can be bounded above in terms of N, ρ, and ε but independently of a, and each of which required a U[0, 1] variable to be placed in a small interval (of width less than 3−Nν) and so has positive probability, which can be bounded below in terms of N, ρ, and ε. So overall the desired transformation of the configuration has positive probability depending on N, ρ, and ε, but not on a. This completes the proof of the lemma. Proof of Proposition 3.2. Write µ′(t) := µN−1(X ′ N(t)). Let I ⊆ [0, 1] be a non-null interval. We can (and do) choose a ∈ (0, 1) and ε′ > 0 such that I′ := [a−ε′, a+ε′] ⊆ I. Also take I′′ := [a − ε, a + ε] ⊂ I′ for ε = (4BC)−1N−1/2ε′, where C is the constant in Lemma 2.6 and B ≥ 1 is an absolute constant chosen so that ε < ε′/4 for all N ≥ 3. Fix ρ ∈ (0, 1). It follows from Lemma 3.1 that, for some δ0 > 0 and t0 ∈ N, depending on ε, P[{µ′(t0) ∈ I ′′} ∩ {D(t0) ≤ 2ε} | XN (0)] ≥ δ01(Sρ). 18 MICHAEL GRINFELD, STANISLAV VOLKOV, ANDREW R. WADE By Lemma 2.2, we have that D(t0) ≤ 2ε implies that F(t0) ≤ 2Nε 2 < (ε′/(2BC))2, so that t0 ≥ νε′/(2BC), where ν· is as defined just before Lemma 2.6. Applying Lemma 2.6 with this choice of t0 and with the ε there equal to ε ′/(2BC), we obtain, by Markov’s inequality, P [ max t≥t0 |µ′(t) − µ′(t0)| ≤ 3ε ′/4 | Ft0 ] ≥ 1/3, a.s. It follows that, given XN(0), the event {µ′(t0) ∈ I ′′} ∩ {D(t0) ≤ 2ε} ∩ {|ξN − µ ′(t0)| ≤ 3ε ′/4} has probability at least (δ0/3)1(Sρ), and on this event we have |ξN −a| ≤ ε+(3ε ′/4) < ε′, so ξN ∈ I. Hence (3.4) follows. For the final statement in the proposition, suppose that XN(0) consists of independ- ent U[0, 1] points. In this case mN(0) defined at (3.3) is the minimal spacing in the induced partition of [0, 1] into N +1 segments, which has the same distribution as 1 N+1 times a single spacing, and in particular has density f(x) = N(N +1)(1−(N +1)x)N−1 for x ∈ [0, 1 N+1 ] (cf Section A). Hence for any ρ ∈ [0, 1 N+1 ], we have P[Sρ] = (1 − (N + 1)ρ)N, which is positive for ρ = 1 2N , say. Thus taking expectations in (3.4) yields the final statement in the proposition. 3.4. Explicit calculations for N = 3 For this section we take N = 3, the smallest nontrivial example. In this case we can perform some explicit calculations to obtain information about the distribution of ξ3. 
In fact, we work with a slightly modified version of the model, avoiding certain 'boundary effects', to ease computation. Specifically, we do not use $U[0,1]$ replacements but, given $X_3(t)$, we take $U_{t+1}$ to be uniform on the interval $[\min X_3'(t) - D(t),\ \max X_3'(t) + D(t)]$. If this interval is contained in $[0,1]$ for all $t$, this modification would have no effect on the value of $\xi_3$ realized (only speeding up the convergence), but the fact that now $U_{t+1}$ might be outside $[0,1]$ does change the model. For this modified model, the argument for Theorem 1.1 follows through with minor changes, although we essentially reprove the conclusion of Theorem 1.1 in this case when we prove the following result, which gives an explicit description of the limit distribution. Here and subsequently '$\stackrel{d}{=}$' denotes equality in distribution.

Proposition 3.3. Let $d = 1$ and $N = 3$ and work with the modified version of the process just described. Let $X_3(0)$ consist of 3 distinct points in $[0,1]$. Write $\mu := \mu_2(X_3'(0))$ and $D := D_2(X_3'(0))$. There exists a random $\xi_3 := \xi_3(X_3(0)) \in \mathbb{R}$ such that $X_3'(t) \xrightarrow{\ \mathrm{a.s.}\ } (\xi_3, \xi_3)$ as $t \to \infty$. The distribution of $\xi_3$ can be characterized via $\xi_3 \stackrel{d}{=} \mu + DL$, where $L$ is independent of $(\mu, D)$, $L \stackrel{d}{=} -L$, and the distribution of $L$ is determined by the distributional solution to the fixed-point equation
\[ L \stackrel{d}{=} \begin{cases} -\frac{1+U}{2} + UL & \text{with probability } \frac13 \\[2pt] -\frac{2-U}{4} + \frac{U}{2} L & \text{with probability } \frac16 \\[2pt] \frac{2-U}{4} + \frac{U}{2} L & \text{with probability } \frac16 \\[2pt] \frac{1+U}{2} + UL & \text{with probability } \frac13 , \end{cases} \tag{3.6} \]
where $\mathbb{E}[|L|^k] < \infty$ for all $k$, and $U \sim U[0,1]$. Writing $\theta_k := \mathbb{E}[L^k]$, we have $\theta_k = 0$ for odd $k$, and $\theta_2 = \frac{7}{12}$, $\theta_4 = \frac{375}{368}$, and $\theta_6 = \frac{76693}{22080}$. In particular,
\[ \mathbb{E}[\xi_3] = \mathbb{E}[\mu], \quad \mathbb{E}[\xi_3^2] = \mathbb{E}[\mu^2] + \tfrac{7}{12}\mathbb{E}[D^2], \quad \text{and} \quad \mathbb{E}[\xi_3^3] = \mathbb{E}[\mu^3] + \tfrac{7}{4}\mathbb{E}[\mu D^2]. \tag{3.7} \]
In the case where $X_3(0)$ contains 3 independent $U[0,1]$ points, $\mathbb{E}[\xi_3^k] = \frac12, \frac13, \frac14$ for $k = 1, 2, 3$ respectively. If $X_3(0) = (\frac14, \frac12, \frac34)$, then $\mathbb{E}[\xi_3^k] = \frac12, \frac{29}{96}, \frac{13}{64}, \frac{873}{5888}$ for $k = 1, 2, 3, 4$.

We give the proof of Proposition 3.3 at the end of this section. First we state one consequence of the fixed-point representation (3.6).

Proposition 3.4. $L$ given by (3.6) has an absolutely continuous distribution.

Proof. It follows from (3.6) that
\[ \mathbb{P}\big[L = \tfrac12\big] = \tfrac13 \mathbb{P}\big[U(L - \tfrac12) = 1\big] + \tfrac16 \mathbb{P}\big[U(L + \tfrac12) = 2\big] + \tfrac16 \mathbb{P}\big[U(L - \tfrac12) = 0\big] + \tfrac13 \mathbb{P}\big[U(L + \tfrac12) = 0\big]. \]
The first two terms on the right-hand side of the last display are zero, by an application of the first part of Lemma B.1 with $X = U$, $Y = L \pm \frac12$, and $a = 1, 2$. Also, since $U > 0$ a.s., $\mathbb{P}[U(L \mp \frac12) = 0] = \mathbb{P}[L = \pm\frac12]$, and, by symmetry, $\mathbb{P}[L = \frac12] = \mathbb{P}[L = -\frac12]$. Thus we obtain
\[ \mathbb{P}\big[L = \tfrac12\big] = \tfrac16 \mathbb{P}\big[L = \tfrac12\big] + \tfrac13 \mathbb{P}\big[L = -\tfrac12\big] = \tfrac12 \mathbb{P}\big[L = \tfrac12\big]. \]
Hence $\mathbb{P}[L = \frac12] = \mathbb{P}[L = -\frac12] = 0$. Each term on the right-hand side of (3.6) is of the form $\pm\frac12 + V(L \pm \frac12)$ where $V$ is an absolutely continuous random variable, independent of $L$ (namely $U$ or $U/2$). The final statement in Lemma B.1 with the fact that $\mathbb{P}[L = \pm\frac12] = 0$ shows that each such term is absolutely continuous. Finally, Lemma B.2 completes the proof.

In principle, the characterization (3.6) can be used to recursively determine all the moments $\mathbb{E}[L^k] = \theta_k$, and the moments of $\xi_3$ may then be obtained by expanding $\mathbb{E}[\xi_3^k] = \mathbb{E}[(\mu + DL)^k]$. However, the calculations soon become cumbersome, particularly as $\mu$ and $D$ are, typically, not independent: we give some distributional properties of $(\mu, D)$ in the case of a uniform random initial condition in Appendix A.
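As a quick numerical check on (3.6) and the moment values quoted in Proposition 3.3, one can sample $L$ approximately by iterating the random map on the right-hand side of (3.6): since the coefficient of $L$ in each branch is at most $U < 1$, the iteration forgets its (arbitrary) starting value geometrically fast. The sketch below (ours, with illustrative sample sizes) estimates $\theta_2$ and $\theta_4$ for comparison with $7/12$ and $375/368$.

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_L(rng, iters=60):
    """Approximate draw from the law of L by iterating the random map on the
    right-hand side of (3.6), starting from 0.  Each branch multiplies L by at
    most U < 1, so the starting value is forgotten after a few dozen steps."""
    L = 0.0
    for _ in range(iters):
        U = rng.random()
        r = rng.random()
        if r < 1/3:
            L = -(1 + U) / 2 + U * L
        elif r < 1/2:
            L = -(2 - U) / 4 + (U / 2) * L
        elif r < 2/3:
            L = (2 - U) / 4 + (U / 2) * L
        else:
            L = (1 + U) / 2 + U * L
    return L

n = 100_000   # illustrative sample size
samples = np.array([sample_L(rng) for _ in range(n)])
print("E[L]   estimate:", samples.mean(),       "(exact value 0, by symmetry)")
print("E[L^2] estimate:", np.mean(samples**2),  "(exact value 7/12  =", 7/12, ")")
print("E[L^4] estimate:", np.mean(samples**4),  "(exact value 375/368 =", 375/368, ")")
```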
Before giving the proof of Proposition 3.3, we comment on some simulations. Figure 4 shows histogram estimates for the distribution of $\xi_3$ for two initial distributions (one deterministic and the other uniform random), and Table 2 reports corresponding moment estimates, which may be compared to the theoretical values given in Proposition 3.3. In the uniform case, we only computed the first 3 moments analytically, namely $\frac12, \frac13, \frac14$ as quoted in Proposition 3.3; it is a curiosity that these coincide with the first 3 moments of the $U[0,1]$ distribution.

Figure 4: Normalized histograms based on $10^8$ simulations of the modified $N = 3$ model with fixed $\{-\frac12, \frac12, 100\}$ initial condition (left) and i.i.d. $U[0,1]$ initial condition (right).

    k            1        2        3        4        5         6
    ±1/2 core    0.0001   0.5833   0.0000   1.0192   −0.0005   3.4765
    U[0,1]       0.5000   0.3333   0.2500   0.2029   0.1739    0.1561

Table 2: Empirical $k$th moment values (to 4 decimal places) computed from the simulations in Figure 4.

Proof of Proposition 3.3. Let $\mu'(t) := \frac12(X_{(1)}(t) + X_{(2)}(t))$ and $D(t) := |X_{(1)}(t) - X_{(2)}(t)|$ denote the mean and diameter of the core configuration, repeating our notation from above. Consider separately the events that $U_{t+1}$ falls in each of the four intervals
\[ [\min X_3'(t) - D(t),\ \min X_3'(t)), \quad [\min X_3'(t),\ \min X_3'(t) + \tfrac12 D(t)), \]
\[ [\min X_3'(t) + \tfrac12 D(t),\ \max X_3'(t)), \quad [\max X_3'(t),\ \max X_3'(t) + D(t)], \]
which have probabilities $\frac13, \frac16, \frac16, \frac13$ respectively. Given $(\mu'(t), D(t))$, we see, for $V_{t+1}$ a $U[0,1]$ variable, independent of $(\mu'(t), D(t))$,
\[ (\mu'(t+1), D(t+1)) = \begin{cases} \big(\mu'(t) - \frac{1+V_{t+1}}{2} D(t),\ V_{t+1} D(t)\big) & \text{with prob. } \frac13 \\[2pt] \big(\mu'(t) - \frac{2-V_{t+1}}{4} D(t),\ \frac12 V_{t+1} D(t)\big) & \text{with prob. } \frac16 \\[2pt] \big(\mu'(t) + \frac{2-V_{t+1}}{4} D(t),\ \frac12 V_{t+1} D(t)\big) & \text{with prob. } \frac16 \\[2pt] \big(\mu'(t) + \frac{1+V_{t+1}}{2} D(t),\ V_{t+1} D(t)\big) & \text{with prob. } \frac13 . \end{cases} \tag{3.8} \]
Writing $m_k(t) = \mathbb{E}[D(t)^k \mid X_3(0)]$ we obtain from the second coordinates in (3.8)
\[ m_k(t+1) = \tfrac23 \mathbb{E}[V_{t+1}^k]\, m_k(t) + \tfrac13\, 2^{-k}\, \mathbb{E}[V_{t+1}^k]\, m_k(t), \]
which implies that
\[ m_k(t) = \Big( \frac{2 + 2^{-k}}{3(k+1)} \Big)^{t} D(0)^k. \tag{3.9} \]
For example, $m_1(t) = (5/12)^t D(0)$ and $m_2(t) = (1/4)^t D(0)^2$.

Next we show that $\mu'(t)$ converges. From (3.8), we have that $|\mu'(t+1) - \mu'(t)| \leq D(t)$, a.s., so to show that $\mu'(t)$ converges, it suffices to show that $\sum_{t=0}^\infty D(t) < \infty$ a.s. But this can be seen from essentially the same argument as Lemma 2.5, or directly from the fact that the sum has nonnegative terms and $\mathbb{E}\sum_{t=0}^\infty D(t) = \sum_{t=0}^\infty \mathbb{E}[m_1(t)]$, which is finite. Hence $\mu'(t)$ converges a.s. to some limit, $\xi_3$ say. Extending this argument a little, we have from (3.8) that $|\mu'(t+1)| \leq |\mu'(t)| + D(t)$, a.s., and $D(t+1) \leq V_{t+1} D(t)$, a.s. Hence for $V_1, V_2, \ldots$ i.i.d. $U[0,1]$ random variables, we have $D(t) \leq V_1 \cdots V_t D(0)$ and
\[ |\mu'(t)| - |\mu'(0)| = \sum_{s=0}^{t-1} \big( |\mu'(s+1)| - |\mu'(s)| \big) \leq \Big( 1 + \sum_{s=1}^{\infty} \prod_{r=1}^{s} V_r \Big) D(0) =: (1 + Z) D(0). \]
Here $Z$ has the so-called Dickman distribution (see e.g. [13, §3]), which has finite moments of all orders. Hence $\mathbb{E}[|\mu'(t)|^p \mid X_3(0)]$ is bounded independently of $t$, so, for any $p \geq 1$, $(\mu'(t))^p$ is uniformly integrable, and hence $\lim_{t\to\infty} \mathbb{E}[(\mu'(t))^k \mid X_3(0)] = \mathbb{E}[\xi_3^k \mid X_3(0)]$ for any $k \in \mathbb{N}$. We now want to compute the moments of $\xi_3$; by the previous argument, we can first work with the moments of $\mu'(t)$.
Note that, from (3.8), E[(µ′(t + 1) − µ′(t))k | X3(t)] = 1 + (−1)k 3 D(t)kE [ ( 1 + Vt+1 2 )k ] + 1 + (−1)k 6 D(t)kE [ ( 1 + Vt+1 4 )k ] = 1 + (−1)k 6 (21−k + 2−2k) 2k+1 − 1 k + 1 D(t)k, using the fact that E[(1+Vt+1) k] = 2 k+1−1 k+1 . In particular, E[(µ′(t+1)−µ′(t))k | X3(t)] = 0 for odd k, so E[µ′(t) | X3(0)] = µ ′(0), and hence E[ξ3] = limt→∞ E[µ ′(t)] = E[µ′(0)], giving the first statement in (3.7). In addition, E[(µ′(t + 1))2 − (µ′(t))2 | X3(t)] = 2µ′(t)E[µ′(t + 1) − µ′(t) | X3(t)] + E[(µ ′(t + 1) − µ′(t))2 | X3(t)] = 7 16 D(t)2. Hence E[(µ′(t))2 − (µ′(0))2 | X3(0)] = t−1 ∑ s=0 E[(µ′(s + 1))2 − (µ′(s))2 | X3(0)] = 7 16 t−1 ∑ s=0 m2(s) → 7 16 ∞ ∑ s=0 4−sD(0)2, as t → ∞, and the limit evaluates to 7 12 D(0)2, so that E[ξ23] = limt→∞ E[(µ ′(t))2] = E[(µ′(0))2] + 7 12 E[D(0)2], giving the second statement in (3.7). Write L(µ′(0), D(0)) = ξ3(X3(0)) emphasizing the dependence on the initial config- uration through µ′(0) and D(0). Then by translation and scaling properties L(µ′(0), D(0)) d = µ′(0) + D(0)L(0, 1). (3.10) So we work with L := L(0, 1) (which has the initial core points at ±1 2 ). 22 MICHAEL GRINFELD, STANISLAV VOLKOV, ANDREW R. WADE We will derive a fixed-point equation for L. The argument is closely related to that for (3.8). Conditioning on the first replacement and using the transformation relation (3.10), we obtain (3.6). From (3.6) we see that |L| is stochastically dominated by 1 + U|L|; iterating this, similarly to the argument involving the Dickman distribution above, we obtain that |L| is stochastically dominated by 1 + Z, where Z has the Dickman distribution, which is determined by its moments. Hence (3.6) determines a unique distribution for L with E[|L|k] < ∞ for all k. Writing (3.6) in functional form L d = Ψ(L), we see that by symmetry of the form of Ψ, also Ψ(L) d = − Ψ(−L). Hence −L d = − Ψ(L) d = Ψ(−L), so −L satisfies the same distributional fixed-point equation as does L. Hence L d = − L. Writing θk := E[L k], which we know is finite, we get θk = 1 3 k ∑ j=0 ( k j ) (1 + (−1))jθk−jE [ ( 1 + U 2 )j Uk−j ] + 1 6 k ∑ j=0 ( k j ) (1 + (−1))jθk−jE [ ( 2 − U 4 )j ( U 2 )k−j ] . Here E [ ( 1 + U 2 )j Uk−j ] = 2−j j ∑ ℓ=0 ( j ℓ ) 1 k − ℓ + 1 =: a(k, j); E [ ( 2 − U 4 )j ( U 2 )k−j ] = 2−k j ∑ ℓ=0 ( j ℓ ) (−1/2)j−ℓ k − ℓ + 1 =: b(k, j). So we get θk = 1 3 ∑ j even, j≤k ( k j ) θk−j(2a(k, j) + b(k, j)). (3.11) In particular, as can be seen either directly by symmetry or by an inductive argument using (3.11), θk = 0 for odd k. For even k, one can use (3.11) recursively to find θk, obtaining for example the values quoted in the proposition. Note that, by (3.10), E[ξ33] = E[(µ ′(0) + LD(0))3], which, on expansion, gives the final statement in (3.7). The first 3 moments in the case of the uniform initial condition follow from (3.7) and Lemma A.1. For the initial condition with points 1 4 , 1 2 , 3 4 , we have D(0) = 1 4 and µ′(0) = χ3 8 +(1−χ)5 8 = 5 8 − χ 4 , where χ is the tie-breaker random variable taking values 0 or 1 each with probability 1 2 . It follows that E[µ′(0)k] = 1 2 8−k(3k +5k). Then, using (3.10), E[ξk3 ] = E[(µ ′(0) + (L/4))k] = 1 2 8−k k ∑ j=0 ( k j ) 2jθj(3 k−j + 5k−j). We can now compute the four moments given in the proposition. Randomized Keynesian beauty contest 23 Appendix A. Uniform spacings In this appendix we collect some results about uniform spacings which allow us to obtain distributional results about our uniform initial configurations. 
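Before stating the facts we need, we record a small Monte Carlo sanity check (ours, not part of the appendix) of two of them in the case $n = 3$ used in Section 3.4: the formula $\mathbb{E}[S_{n,1}^k] = n!\,k!/(n+k)!$ given below, and the identity that the core diameter of three uniform points is the middle spacing $\min\{S_2, S_3\} \stackrel{d}{=} S_1/2$, whose first three moments $\frac18, \frac{1}{40}, \frac{1}{160}$ appear in Lemma A.1.

```python
import numpy as np

rng = np.random.default_rng(4)

# Three i.i.d. U[0,1] points induce four spacings S_1, ..., S_4 of [0,1].
reps = 200_000
U = np.sort(rng.random((reps, 3)), axis=1)
S = np.diff(np.concatenate([np.zeros((reps, 1)), U, np.ones((reps, 1))], axis=1), axis=1)

# E[S_{3,1}^k] = 3! k! / (3 + k)!  gives 1/4, 1/10, 1/20 for k = 1, 2, 3.
for k in (1, 2, 3):
    print(f"E[S1^{k}] estimate: {np.mean(S[:, 0] ** k):.4f}")

# The core diameter of three uniform points is the middle spacing min{S_2, S_3},
# equal in distribution to S_1 / 2; Lemma A.1 gives moments 1/8, 1/40, 1/160.
D = np.minimum(S[:, 1], S[:, 2])
for k in (1, 2, 3):
    print(f"E[D^{k}] estimate: {np.mean(D ** k):.5f}")
```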
The basic results that we build on here can be found in Section 4.2 of [14]; see the references therein for a fuller treatment of the theory of spacings. Let U1, U2, . . . , Un be independent U[0, 1] points. Denote the corresponding in- creasing order statistics U[1] ≤ · · · ≤ U[n], and define the induced spacings by Sn,i := U[i] −U[i−1], i = 1, . . . , n+1, with the conventions U[0] := 0 and U[n+1] := 1. We collect some basic facts about the Sn,i. The spacings are exchangeable, and any n-vector, such as (Sn,1, . . . , Sn,n), has the uniform density on the simplex ∆n := {(x1, . . . , xn) ∈ [0, 1]n : ∑n i=1 xi ≤ 1}. We need some joint properties of up to 3 spacings. Any 3 spacings have density f(x1, x2, x3) = n(n − 1)(n − 2)(1 − x1 − x2 − x3) n−3 on ∆3. We note that min{Sn,1, Sn,2} d = 1 2 Sn,1, (n ≥ 1), (A.1) (Sn,1, min{Sn,2, Sn,3}) d = (Sn,1, 1 2 Sn,2), (n ≥ 2); (A.2) see for example Lemma 4.1 of [14]. Finally, for any n ≥ 1 and α ≥ 0, β ≥ 0, E[Sαn,1S β n,2] = Γ(n + 1)Γ(α + 1)Γ(β + 1) Γ(n + 1 + α + β) . (A.3) In particular E[Skn,1] = n!k! (n+k)! for k ∈ N. Our main application in the present paper of the results on spacings collected above is to obtain the following result, which we use in Section 3.4. Lemma A.1. Let d = 1 and N = 3. Suppose that X3(0) consists of 3 independent U[0, 1] points. Then (µ2(X ′ 3(0)), D2(X ′ 3(0))) d = ((S1 + 1 4 S2)ζ + (1 − S1 − 1 4 S2)(1 − ζ), 1 2 S2), where ζ is a Bernoulli random variable with P[ζ = 0] = P[ζ = 1] = 1/2. For k ∈ Z+, E[(D2(X ′ 3(0))) k] = 2−k 6 (k + 1)(k + 2)(k + 3) , (A.4) E[(µ2(X ′ 3(0))) k] = 4(3k − 5 + (3k+3 − 1)4−(k+1)) (k + 1)(k + 2)(k + 3) . (A.5) So, for example, the first 3 moments of D2(X ′ 3(0)) are 1 8 , 1 40 , and 1 160 , while the first 3 moments of µ2(X ′ 3(0)) are 1 2 , 51 160 , and 73 320 . Finally, E[µ2(X ′ 3(0))(D2(X ′ 3(0))) 2] = 1 80 . Proof. The 3 points of X3(0) induce a partition of the interval [0, 1] into uniform spacings S1, S2, S3, S4, enumerated left to right (for this proof we suppress the first index in the notation above). For ease of notation, write D := D2(X ′ 3(0)) and µ := µ2(X ′ 3(0)) for the duration of this proof. Then D = min{S2, S3} d = S1/2, by (A.1). Moreover, min{S2, S3} is equally likely to be either S2 or S3. In the former case, 24 MICHAEL GRINFELD, STANISLAV VOLKOV, ANDREW R. WADE µ = S1 + 1 2 min{S2, S3}, while in the latter case µ = 1 − S4 − 1 2 min{S2, S3}. Using (A.2), we obtain the following characterization of the joint distribution of µ and D. (µ, D) d = { (S1 + 1 4 S2, 1 2 S2) with probability 1 2 (1 − S1 − 1 4 S2, 1 2 S2) with probability 1 2 . (A.6) Hence E[Dk] = 2−kE[Sk1 ], which gives (A.4) by the n = 3, α = k, β = 0 case of (A.3). For the moments of µ, we have from (A.6) that µ has the distribution of W := S1 + 1 4 S2 with probability 1/2 or 1 − W with probability 1/2. So we have E[µk] = 1 2 E[W k] + 1 2 E[(1 − W)k] = 1 2 wk + 1 2 k ∑ j=0 ( k j ) (−1)jwj, where wk := E[W k]. Since wk = E[(S1 + 1 4 S2) k], we compute wk = k ∑ j=0 ( k j ) 4−jE[S k−j 1 S j 2] = 6 k! (k + 3)! k ∑ j=0 4−j, by the n = 3, α = k − j, β = j, case of (A.3). Thus we obtain wk = 8(1 − 4−(k+1)) (k + 1)(k + 2)(k + 3) . It follows that E[µk] = 1 2 wk + 4 k ∑ j=0 k! (j + 3)!(k − j)! (−1)j − k ∑ j=0 k! (j + 3)!(k − j)! (−1/4)j. We deduce (A.5), after simplification, from the claim that, for any z ∈ R, S(z) := k ∑ j=0 k! (j + 3)!(k − j)! (−z)j = k! 2z3(k + 3)! [ z2(k + 2)(k + 3) + 2 − 2z(k + 3) − 2(1 − z)k+3 ] . (A.7) Thus it remains to verify (A.7). To this end, note that S(z) = k! (k + 3)! 
Our main application in the present paper of the results on spacings collected above is to obtain the following result, which we use in Section 3.4.

Lemma A.1. Let $d = 1$ and $N = 3$. Suppose that $X_3(0)$ consists of 3 independent U[0,1] points. Then
\[
\big(\mu_2(X_3'(0)),\, D_2(X_3'(0))\big) \stackrel{d}{=} \Big(\big(S_1 + \tfrac14 S_2\big)\zeta + \big(1 - S_1 - \tfrac14 S_2\big)(1-\zeta),\ \tfrac12 S_2\Big),
\]
where $\zeta$ is a Bernoulli random variable with $P[\zeta = 0] = P[\zeta = 1] = 1/2$. For $k \in \mathbb{Z}_+$,
\[
E\big[(D_2(X_3'(0)))^k\big] = 2^{-k}\,\frac{6}{(k+1)(k+2)(k+3)}, \qquad (A.4)
\]
\[
E\big[(\mu_2(X_3'(0)))^k\big] = \frac{4\big(3k - 5 + (3^{k+3}-1)\,4^{-(k+1)}\big)}{(k+1)(k+2)(k+3)}. \qquad (A.5)
\]
So, for example, the first 3 moments of $D_2(X_3'(0))$ are $\frac18$, $\frac1{40}$, and $\frac1{160}$, while the first 3 moments of $\mu_2(X_3'(0))$ are $\frac12$, $\frac{51}{160}$, and $\frac{73}{320}$. Finally, $E\big[\mu_2(X_3'(0))\,(D_2(X_3'(0)))^2\big] = \frac1{80}$.

Proof. The 3 points of $X_3(0)$ induce a partition of the interval $[0,1]$ into uniform spacings $S_1, S_2, S_3, S_4$, enumerated left to right (for this proof we suppress the first index in the notation above). For ease of notation, write $D := D_2(X_3'(0))$ and $\mu := \mu_2(X_3'(0))$ for the duration of this proof. Then $D = \min\{S_2, S_3\} \stackrel{d}{=} S_1/2$, by (A.1). Moreover, $\min\{S_2, S_3\}$ is equally likely to be either $S_2$ or $S_3$. In the former case, $\mu = S_1 + \frac12\min\{S_2, S_3\}$, while in the latter case $\mu = 1 - S_4 - \frac12\min\{S_2, S_3\}$. Using (A.2), we obtain the following characterization of the joint distribution of $\mu$ and $D$:
\[
(\mu, D) \stackrel{d}{=}
\begin{cases}
\big(S_1 + \tfrac14 S_2,\ \tfrac12 S_2\big) & \text{with probability } \tfrac12, \\[2pt]
\big(1 - S_1 - \tfrac14 S_2,\ \tfrac12 S_2\big) & \text{with probability } \tfrac12.
\end{cases} \qquad (A.6)
\]
Hence $E[D^k] = 2^{-k} E[S_1^k]$, which gives (A.4) by the $n = 3$, $\alpha = k$, $\beta = 0$ case of (A.3).

For the moments of $\mu$, we have from (A.6) that $\mu$ has the distribution of $W := S_1 + \frac14 S_2$ with probability $1/2$, or $1 - W$ with probability $1/2$. So we have
\[
E[\mu^k] = \frac12 E[W^k] + \frac12 E[(1-W)^k] = \frac12 w_k + \frac12\sum_{j=0}^{k}\binom{k}{j}(-1)^j w_j,
\]
where $w_k := E[W^k]$. Since $w_k = E[(S_1 + \frac14 S_2)^k]$, we compute
\[
w_k = \sum_{j=0}^{k}\binom{k}{j} 4^{-j} E\big[S_1^{k-j} S_2^{j}\big] = \frac{6\,k!}{(k+3)!}\sum_{j=0}^{k} 4^{-j},
\]
by the $n = 3$, $\alpha = k-j$, $\beta = j$ case of (A.3). Thus we obtain
\[
w_k = \frac{8\big(1 - 4^{-(k+1)}\big)}{(k+1)(k+2)(k+3)}.
\]
It follows that
\[
E[\mu^k] = \frac12 w_k + 4\sum_{j=0}^{k}\frac{k!}{(j+3)!\,(k-j)!}(-1)^j - \sum_{j=0}^{k}\frac{k!}{(j+3)!\,(k-j)!}(-1/4)^j.
\]
We deduce (A.5), after simplification, from the claim that, for any real $z \neq 0$,
\[
S(z) := \sum_{j=0}^{k}\frac{k!}{(j+3)!\,(k-j)!}(-z)^j = \frac{k!}{2z^3(k+3)!}\Big[z^2(k+2)(k+3) + 2 - 2z(k+3) - 2(1-z)^{k+3}\Big]. \qquad (A.7)
\]
Thus it remains to verify (A.7). To this end, note that
\[
S(z) = \frac{k!}{(k+3)!}\sum_{j=0}^{k}\binom{k+3}{j+3}(-z)^j
= \frac{k!}{(k+3)!}\Big[-z^{-3}\sum_{j=0}^{k+3}\binom{k+3}{j}(-z)^j + z^{-1}\binom{k+3}{2} - z^{-2}\binom{k+3}{1} + z^{-3}\binom{k+3}{0}\Big]
= \frac{k!}{z^3(k+3)!}\Big[-(1-z)^{k+3} + \tfrac12 z^2(k+2)(k+3) - z(k+3) + 1\Big],
\]
which gives the claim (A.7).

For the final statement in the lemma, we have from (A.6) that
\[
E[\mu D^2] = \frac12 E\big[(S_1 + \tfrac14 S_2)\,\tfrac14 S_2^2\big] + \frac12 E\big[(1 - S_1 - \tfrac14 S_2)\,\tfrac14 S_2^2\big] = \frac18 E[S_2^2] = \frac1{80},
\]
by (A.3). This completes the proof of the lemma.

We can also obtain explicit expressions for the densities of $D$ and $\mu$. Since $D \stackrel{d}{=} S_1/2$, the density of $D$ is $f_D(r) = 6(1-2r)^2$ for $r \in [0, 1/2]$. In addition, $\mu$ has density $f_\mu$ given by
\[
f_\mu(r) =
\begin{cases}
4r\big[3(1-r) - 4r\big] & \text{if } r \in [0, 1/4], \\
2 - 4r(1-r) & \text{if } r \in [1/4, 3/4], \\
4(1-r)\big[3r - 4(1-r)\big] & \text{if } r \in [3/4, 1].
\end{cases} \qquad (A.8)
\]
Indeed, with the representation of $\mu$ as either $W$ or $1-W$, each with probability $\frac12$, we have
\[
P[\mu \le r] = \frac12 P[W \le r] + \frac12\big(1 - P[W < 1-r]\big).
\]
Assuming that $W$ has a density $f_W$ (which indeed it has, as we show below), we get
\[
f_\mu(r) = \frac12 f_W(r) + \frac12 f_W(1-r). \qquad (A.9)
\]
Using the fact that $W \stackrel{d}{=} S_1 + \frac14 S_2$, we use the joint distribution of $(S_1, S_2)$ to calculate
\[
P[W \le r] = \int_0^1 dx_1 \int_0^{1-x_1} dx_2\; 6(1 - x_1 - x_2)\,\mathbf{1}\{x_1 + \tfrac14 x_2 \le r\}
= \int_0^{r} dx_1 \int_0^{(4(r-x_1)) \wedge (1-x_1)} dx_2\; 6(1 - x_1 - x_2).
\]
After some routine calculation, we then obtain
\[
P[W \le r] =
\begin{cases}
1 - \frac43(1-r)^3 + \frac13(1-4r)^3 & \text{if } r \in [0, 1/4], \\
1 - \frac43(1-r)^3 & \text{if } r \in [1/4, 1].
\end{cases}
\]
Hence $W$ has density $f_W$ given by
\[
f_W(r) =
\begin{cases}
4(1-r)^2 - 4(1-4r)^2 & \text{if } r \in [0, 1/4], \\
4(1-r)^2 & \text{if } r \in [1/4, 1].
\end{cases}
\]
Then (A.8) follows from (A.9).
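As a further illustrative cross-check (again ours, not part of the appendix), the density (A.8) can be compared with a histogram of $\mu_2(X_3'(0))$ obtained by direct simulation: draw three independent U[0,1] points, remove the point farthest from their mean, and average the remaining two. The helper name core_mean and the bin and trial counts are our choices; ties are ignored since they occur with probability zero.

import random

def f_mu(r):
    # Density of mu_2(X_3'(0)) from (A.8).
    if r <= 0.25:
        return 4 * r * (3 * (1 - r) - 4 * r)
    if r <= 0.75:
        return 2 - 4 * r * (1 - r)
    return 4 * (1 - r) * (3 * r - 4 * (1 - r))

def core_mean():
    # Three U[0,1] points; drop the one farthest from the mean; return the core barycentre.
    x = [random.random() for _ in range(3)]
    m = sum(x) / 3
    x.remove(max(x, key=lambda v: abs(v - m)))
    return sum(x) / 2

trials, bins = 500_000, 20
counts = [0] * bins
for _ in range(trials):
    counts[min(int(core_mean() * bins), bins - 1)] += 1

for i, c in enumerate(counts):
    mid = (i + 0.5) / bins
    print(f"{mid:.3f}  empirical {c * bins / trials:.3f}  (A.8) {f_mu(mid):.3f}")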
Appendix B. Continuity of random variables

In this appendix we give some results that allow us to deduce the absolute continuity of certain distributions specified as solutions to fixed-point equations; specifically, we use these results in the proof of Proposition 3.4. The results in this section may well be known, but we were unable to find a reference for them in a form directly suitable for our application, and so we include the (short) proofs.

Lemma B.1. Let $X$ and $Y$ be independent random variables such that $X$ has an absolutely continuous distribution. Then for any $a \neq 0$ we have $P[XY = a] = 0$. Moreover, if $P[Y = 0] = 0$, then $XY$ is an absolutely continuous random variable.

Proof. For the moment, assume that $P[X < 0]$, $P[X > 0]$, $P[Y < 0]$, and $P[Y > 0]$ are all positive. Take some $0 < c < d$. Then
\[
P[XY \in (c,d)] = P\big[\log X + \log Y \in (\log c, \log d) \mid X > 0,\, Y > 0\big]\, P[X > 0]\, P[Y > 0]
+ P\big[\log(-X) + \log(-Y) \in (\log c, \log d) \mid X < 0,\, Y < 0\big]\, P[X < 0]\, P[Y < 0].
\]
Note that conditioning $X$ on the event $X > 0$ (or $X < 0$) preserves the absolute continuity of $X$ and the independence of $X$ and $Y$. Then, since the sum of two independent random variables, at least one of which is absolutely continuous, is also absolutely continuous (see [9, Theorem 5.9, p. 230]), we have
\[
P\big[\log X + \log Y \in (\log c, \log d) \mid X > 0,\, Y > 0\big] = \int_{\log c}^{\log d} f_+(x)\,dx, \quad\text{and}\quad
P\big[\log(-X) + \log(-Y) \in (\log c, \log d) \mid X < 0,\, Y < 0\big] = \int_{\log c}^{\log d} f_-(x)\,dx,
\]
for suitable probability densities $f_+$ and $f_-$. After the substitution $u = e^x$, this yields
\[
P[XY \in (c,d)] = \int_c^d \frac{P[X>0]\,P[Y>0]\,f_+(\log u) + P[X<0]\,P[Y<0]\,f_-(\log u)}{u}\,du.
\]
This expression is also valid if some of the probabilities for $X$ and $Y$ in the numerator of the integrand are zero. Therefore we have, for any $0 < c < d$,
\[
P[XY \in (c,d)] = \int_c^d f(u)\,du, \qquad (B.1)
\]
for some function $f(u)$ defined for $u > 0$. A similar argument applies to the case $c < d < 0$; then (B.1) is valid for any $c < d < 0$ as well, extending $f(u)$ to strictly negative $u$. In particular, it follows that $P[XY = a] = 0$ for $a \neq 0$. Now if $P[Y = 0] = 0$, then $P[XY \neq 0] = 1$. Then we can set $f(0) := 0$, so that (B.1) holds for all $c, d \in \mathbb{R}$. This completes the proof.

Lemma B.2. Suppose that a random variable $L$ satisfies the distributional equation
\[
L \stackrel{d}{=}
\begin{cases}
Z_1 & \text{with probability } p_1, \\
\ \vdots & \\
Z_n & \text{with probability } p_n,
\end{cases}
\]
where $n \in \mathbb{N}$, $\sum_{i=1}^{n} p_i = 1$, $p_i > 0$, and each $Z_i$ is an absolutely continuous random variable. Then $L$ is absolutely continuous.

Proof. Suppose $Z_i$ has a density $f_i$. Then for any $-\infty \le a < b \le +\infty$ we have
\[
P[L \in (a,b)] = \sum_{i=1}^{n} p_i\, P[Z_i \in (a,b)] = \sum_{i=1}^{n} p_i \int_a^b f_i(x)\,dx = \int_a^b \Big[\sum_{i=1}^{n} p_i f_i(x)\Big] dx,
\]
which yields the statement of the lemma.

Acknowledgements

Parts of this work were done at the University of Strathclyde, where the third author was also employed, during a couple of visits by the second author, who is grateful for the hospitality of that institution.

References

[1] Benassi, C. and Malagoli, F. The sum of squared distances under a diameter constraint, in arbitrary dimension. Arch. Math. 90 (2008) 471–480.
[2] De Giorgi, E. and Reimann, S. The α-beauty contest: Choosing numbers, thinking intervals. Games Econom. Behav. 64 (2008) 470–486.
[3] Erdős, P. On the smoothness properties of a family of Bernoulli convolutions. Amer. J. Math. 62 (1940) 180–186.
[4] Grinfeld, M., Knight, P. A. and Wade, A. R. Rank-driven Markov processes. J. Stat. Phys. 146 (2012) 378–407.
[5] Hughes, B. D. Random Walks and Random Environments, Volume 1: Random Walks. Clarendon Press, Oxford, 1995.
[6] Johnson, N. L. and Kotz, S. Use of moments in studies of limit distributions arising from iterated random subdivisions of an interval. Statist. Probab. Lett. 24 (1995) 111–119.
[7] Keynes, J. M. The General Theory of Employment, Interest and Money. Macmillan, London, 1936.
[8] Krapivsky, P. L. and Redner, S. Random walk with shrinking steps. Amer. J. Phys. 72 (2004) 591–598.
[9] Moran, P. A. P. An Introduction to Probability Theory. Clarendon Press, Oxford, 1968 (paperback ed., with corrections, 2002).
[10] Moulin, H. Game Theory for the Social Sciences, 2nd ed. New York University Press, New York, 1986.
[11] Muratov, A. and Zuyev, S. LISA: Locally interacting sequential adsorption. Stoch. Models 29 (2013) 475–496.
[12] Pemantle, R. A survey of random processes with reinforcement. Probab. Surv. 4 (2007) 1–79.
[13] Penrose, M. D. and Wade, A. R. Random minimal directed spanning trees and Dickman-type distributions. Adv. in Appl. Probab. 36 (2004) 691–714.
[14] Penrose, M. D. and Wade, A. R. Limit theory for the random on-line nearest-neighbor graph. Random Structures Algorithms 32 (2008) 125–156.
[15] Pillichshammer, F. On the sum of squared distances in the Euclidean plane. Arch. Math. 74 (2000) 472–480.
[16] Witsenhausen, H. S. On the maximum of the sum of squared distances under a diameter constraint. Amer. Math. Monthly 81 (1974) 1100–1101.