THEORETICAL NOTES

The Robust Beauty of Ordinary Information

Konstantinos V. Katsikopoulos, Max Planck Institute for Human Development, Berlin, Germany, and Massachusetts Institute of Technology
Lael J. Schooler, Max Planck Institute for Human Development, Berlin, Germany
Ralph Hertwig, University of Basel

Heuristics embodying limited information search and noncompensatory processing of information can yield robust performance relative to computationally more complex models. One criticism raised against heuristics is the argument that complexity is hidden in the calculation of the cue order used to make predictions. We discuss ways to order cues that do not entail individual learning. Then we propose and test the thesis that when orders are learned individually, people's necessarily limited knowledge will curtail computational complexity while also achieving robustness. Using computer simulations, we compare the performance of the take-the-best heuristic—with dichotomized or undichotomized cues—to benchmarks such as the naïve Bayes algorithm across 19 environments. Even with minute sizes of training sets, take-the-best using undichotomized cues excels. For 10 environments, we probe people's intuitions about the direction of the correlation between cues and criterion. On the basis of these intuitions, in most of the environments take-the-best achieves the level of performance that would be expected from learning cue orders from 50% of the objects in the environments. Thus, ordinary information about cues—either gleaned from small training sets or intuited—can support robust performance without requiring Herculean computations.

Keywords: inductive inference, heuristics, take-the-best, robustness, bootstrapping

A psychological regularity touted as a potentially universal law of cognition is the effort–accuracy tradeoff: Only if people invest more cognitive effort do they stand to achieve more accuracy in their choices and judgments. More effort can take the form of searching for information exhaustively, spending plenty of time on the problem, or performing complex computations. Yet, this law may fall short of being universal.

Inspired by the ideas of Herbert Simon, those in the research program on fast and frugal heuristics have developed computational models of heuristics as one precise interpretation of Simon's (1956) notion of bounded rationality. Although computational models of heuristics were proposed before (e.g., Payne, Bettman, & Johnson, 1993; Tversky, 1972), they were built on the premise that people rely on heuristics because they lack the cognitive capacity to perform rational calculations or are willing to sacrifice accuracy by expending less effort. This view has been challenged by demonstrations that processes embodying bounded rationality, such as limited information search and noncompensatory processing, can lead to more accurate inferences than can be achieved by models based on more information and complex computations (Gigerenzer, Todd, & The ABC Research Group, 1999). This challenge to the effort–accuracy tradeoff, however, has not remained undisputed. One important objection suggests that the heuristics' success rides on complex computations.
For example, many heuristics, such as take-the-best (Gigerenzer et al., 1999), do not use all available cues (i.e., variables that correlate, albeit imperfectly, with the criterion variable in a decision problem; see Brunswik, 1955) but instead order them, look them up one by one, and stop searching as soon as a discriminating cue is encountered. Juslin and Persson (2002, p. 575) stressed that much effort is needed to order cues. For example, in Gigerenzer et al.'s (1999) city-population-comparison task (e.g., which of the following two cities has more residents, Heidelberg or Bonn?), the computation of cue order involving 83 German cities and nine cues requires 30,627 comparisons (for details, see Juslin & Persson, 2002). Others have raised similar concerns (Dougherty, Franco-Watkins, & Thomas, 2008; Newell, 2005; Rakow, Hinvest, Jackson, & Palmer, 2004).
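The arithmetic behind this count is easy to verify: each of the nine cues must be checked against every pair of the 83 cities. A minimal check in Python (the language is our choice; no code is published with the article):

```python
from math import comb

# Estimating each cue's validity requires examining every pair of
# the 83 German cities, for each of the nine cues.
n_cities, n_cues = 83, 9
pairs = comb(n_cities, 2)   # 83 * 82 / 2 = 3,403 city pairs
print(pairs * n_cues)       # 30,627 comparisons, as Juslin and Persson report
```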
Do simple heuristics really freeload on the effort hidden in the computation of cue orders? A first response to the "hidden effort" criticism is that a person can conveniently piggyback on the experiences of others, past or present. Evolution, culture, and other individuals can learn cue orders (for details, see Gigerenzer, Hoffrage, & Goldstein, 2008). Of course, despite evolution, culture, and the vicarious experience of others, sometimes a person finds herself with nobody to consult or imitate. This, however, does not dictate that the heuristics this person would use require massive effort for computing cue orders, for two reasons. First, some heuristics do without cue order. For example, the recognition heuristic (Goldstein & Gigerenzer, 2002) and the fluency heuristic (Hertwig, Herzog, Schooler, & Reimer, 2008; Schooler & Hertwig, 2005) need only one cue, benefiting from automatic, albeit potentially computationally complex, memory processes. Other heuristics, such as the priority heuristic, rely on risk preferences to order cues (Brandstätter, Gigerenzer, & Hertwig, 2006), and some heuristics bypass cues by relying on the frugal retrieval of exemplars (Juslin & Persson, 2002). A second reason why heuristics do not demand massive investment in the calculation of cue orders is that minute samples of knowledge can be used, which drastically restricts the computations required. Of course, to paraphrase Alexander Pope, a little cue knowledge can be a dangerous thing. It makes computations workable, but it may lead to ineffective cue orders. Whether severely limited knowledge is a curse or a blessing is the subject of our investigations.

We will test how well simple heuristics perform with ordinary cue information. In the context of this article, ordinary information refers to a person's severely limited knowledge or a person's knowledge of the sign of the correlation between a cue and a criterion, called the cue's direction. Cue directions often follow people's basic conceptions about how the world works: For example, cities with a high unemployment rate (cue) tend to have a high homelessness rate (criterion).

We present two investigations that combine computer simulations and human data. The first analyzes the performance of heuristics that use minute samples. The second analyzes the performance of heuristics that rely only on people's intuitions about cue directions. We next describe the heuristics.

Fast and Frugal Heuristics

We focus on the inference models studied by Gigerenzer et al. (1999): three heuristics (tallying, take-the-best, and minimalist) and two benchmarks (linear regression and naïve Bayes). The task is a paired comparison, in which the goal is to predict which of two objects has the higher value on a criterion. For example, it can be asked which of two cities, Chicago or Los Angeles, has the higher homelessness rate. Lacking direct knowledge of these values, most people have to use cues. For example, the cues associated with a city's homelessness rate could be the average temperature (a continuous cue) and the presence of a university (a binary cue).

These models differ in how they process cues. Linear regression weighs and sums cues, predicting that the object with the higher sum has the higher criterion. Naïve Bayes selects the object with the higher probability of having the higher criterion value, given the objects' values on the cues. In contrast, the three heuristics represent simplifications of weighted linear models. The first one, tallying (Dawes, 1979), dispenses with weighting and simply adds binary cue values. The take-the-best heuristic dispenses with adding and weighs cues simply by ordering them by a measure of cue goodness called validity and looking them up one at a time. As soon as a discriminating cue is found, search is stopped and a decision is made on the basis of that cue alone. Minimalist operates like take-the-best except that it randomly orders cues.

In sum, relative to the heuristics, the benchmarks use more complex computations and sophisticated information. Multiple regression and naïve Bayes use the precise values of regression weights and cue validities, whereas take-the-best uses the order of cue validities. Minimalist and tallying do not require cue validities or orders but only cue directions.
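To make the decision rule concrete, here is a minimal sketch of take-the-best in Python (our construction, not the authors' implementation; the names are ours, and validity estimation is treated separately in the text):

```python
def take_the_best(obj_a, obj_b, cue_values, cue_order, directions):
    """Predict which of two objects has the higher criterion value.

    cue_values[cue][obj] is an object's value on a cue; cue_order lists
    cues from highest to lowest estimated validity; directions[cue] is
    +1 or -1, the sign of the cue-criterion correlation. A sketch of the
    rule described in the text, not the authors' code.
    """
    for cue in cue_order:                 # look cues up one at a time
        a, b = cue_values[cue][obj_a], cue_values[cue][obj_b]
        if a != b:                        # first discriminating cue: stop
            return obj_a if directions[cue] * (a - b) > 0 else obj_b
    return None                           # no cue discriminates: guess

# Hypothetical usage: two cities compared on two cues.
cues = {"university": {"A": 1, "B": 1}, "temperature": {"A": 12.0, "B": 9.5}}
order = ["university", "temperature"]     # ordered by estimated validity
signs = {"university": +1, "temperature": +1}
print(take_the_best("A", "B", cues, order, signs))   # 'A' (temperature decides)
```

Minimalist is the same loop run over a randomly shuffled cue_order, and tallying instead sums direction-coded binary cue values for the two objects and selects the one with the larger sum.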
Czerlinski, Gigerenzer, and Goldstein (1999) applied these inference models to 20 data sets from domains such as biology, economics, and sociology. Each data set was split in half, and the parameters of each model (weights, validities, directions) were estimated on half of the data, the training set. The parameter estimates were used to predict, for each model, the paired comparisons on the other half of the data, the test set. This process was repeated 1,000 times. In the test set, the predictive accuracy of the models was as follows: minimalist, 65%; regression with dichotomized cues, 68%; tallying, 69%; take-the-best with dichotomized cues, 71%; naïve Bayes, 73%; regression with undichotomized cues, 76%; and take-the-best with undichotomized cues, 76%.

Using a city-population-comparison task, Juslin and Persson (2002) and Chater, Oaksford, Nakisa, and Redington (2003) provided some evidence that heuristics have high predictive accuracy when the training set includes fewer than 50% of all objects. For example, Chater et al. tested take-the-best against a three-layer feed-forward connectionist network, two exemplar-based models, and a decision-tree-induction algorithm. With training sets of up to 40% of all objects, take-the-best outperformed or matched its competitors but was outperformed with larger training sets. Our first goal is to find out how robust these initial results are across many environments and with minute training sets of as few as two objects.

Individual Learning From Minute Samples: How Good Could It Be?

Table 1 lists the nine models included in the first simulation: two versions of multiple regression, one in which continuous cues were dichotomized (MRD) and one in which they were not (MRU); tallying (TAL) and minimalist (MIN), which both use dichotomized cues; and three versions of take-the-best, one that uses undichotomized cues (TTBU) and two that use dichotomized cues. The latter two versions differ in their computation of validity, which can be done by a frequentist (TTBF) or a Bayesian (TTBB) approach (in the undichotomized take-the-best version we used frequentist validity). We also included two versions of naïve Bayes: one with frequentist (NBF) and one with Bayesian (NBB) validity. The equation for the frequentist cue validity is v = R/(R + W), where R and W are the number of correct and incorrect comparisons, respectively, based on the cue. The frequentist approach may weight the first observations too much, compromising performance on minute samples.[1] The Bayesian approach overcomes this problem (Lee & Cummins, 2004) by calculating validity as v = (R + 1)/(R + W + 2).

[1] This is a problem with some inference models, such as, for example, naïve Bayes. Naïve Bayes selects the object with the larger product ∏ v_i (1 − v_j) over the cues, where v_i is the validity of any cue c_i that has a higher value on the object, and v_j is the validity of any cue c_j that has a higher value on the other object. If validities equal 0 or 1, the product will equal 0 for both objects (unless all cues with validity 1 point to one object and all cues with validity 0 point to the other), and naïve Bayes will have to guess. In line with this point, naïve Bayes with frequentist validity scored 50% for a training set of two objects (see Figure 1).

Table 1
Summary of the Fast and Frugal Heuristics and Benchmark Models Tested in the First Investigation

Model | Decision rule | Parameters
Multiple linear regression with dichotomized cues (MRD) | Select object with higher weighted sum of cue values | Regression weights
Multiple linear regression with undichotomized cues (MRU) | Select object with higher weighted sum of cue values | Regression weights
Naïve Bayes with dichotomized cues and frequentist validity (NBF) | Select object with higher probability of having higher criterion value | Cue validities
Naïve Bayes with dichotomized cues and Bayesian validity (NBB) | Select object with higher probability of having higher criterion value | Cue validities
Tallying with dichotomized cues (TAL) | Select object with higher sum of cue values | Cue directions
Minimalist with dichotomized cues (MIN) | Select object with higher value on random cue that discriminates | Cue directions
Take-the-best with dichotomized cues and frequentist validity (TTBF) | Select object with higher value on most valid cue that discriminates | Order of cue validities
Take-the-best with dichotomized cues and Bayesian validity (TTBB) | Select object with higher value on most valid cue that discriminates | Order of cue validities
Take-the-best with undichotomized cues and frequentist validity (TTBU) | Select object with higher value on most valid cue that discriminates | Order of cue validities
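The difference between the frequentist and Bayesian validity estimators is easiest to see in code. A minimal sketch (the fallback for cues with no discriminating comparisons, R + W = 0, is our assumption; the article does not specify one):

```python
def frequentist_validity(R, W):
    """v = R/(R + W): proportion of correct inferences among the
    training comparisons on which the cue discriminates."""
    return R / (R + W) if R + W > 0 else 0.5   # 0.5 fallback is our assumption

def bayesian_validity(R, W):
    """v = (R + 1)/(R + W + 2): smoothed estimate that pulls extreme
    small-sample validities toward 0.5 (Lee & Cummins, 2004)."""
    return (R + 1) / (R + W + 2)

# After a single correct comparison, the frequentist estimate is already
# extreme, while the Bayesian estimate remains moderate:
print(frequentist_validity(1, 0))   # 1.0
print(bayesian_validity(1, 0))      # approximately 0.67
```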
Design

The predictive accuracy of the inference models was tested in 19 of the 20 data sets used by Czerlinski et al. (1999; we omitted a data set with fewer than 20 objects). To compare our results to previous work, we investigated a training set that included 50% of all objects. We also investigated training sets ranging in size from two to 10 objects. Averaged across the data sets, training sets of two and 10 objects equal 3% and 15% of all objects, respectively. The models, with parameters (weights, validities, directions) estimated from all objects in the training set, were applied to the test set. When a model required binary cues, continuous cues were dichotomized using the sample median. The simulation was repeated 5,000 times for each model, environment, and training set size.

Results

Figure 1 plots mean predictive accuracy, defined as a model's proportion of correct inferences in the test set, as a function of training set size. TTBU performs best in all cases except when the training set has two objects. In this case, TTBU is slightly behind TAL. For training sets with five to 10 objects, TTBU surpasses the second-best model, NBB, by at least 5%. With seven objects, TTBU reaches 74%, near its maximum performance of 76%, achieved with a training set of 50% of all objects.

Figure 1. Mean predictive accuracy (across 19 environments) of fast and frugal heuristics and benchmark models as a function of the size of the training set (2–10 objects and 50% of the objects in the population). TTBF = take-the-best with dichotomized cues and frequentist validity; TTBB = take-the-best with dichotomized cues and Bayesian validity; TAL = tallying with dichotomized cues; MIN = minimalist with dichotomized cues; NBF = naïve Bayes with dichotomized cues and frequentist validity; NBB = naïve Bayes with dichotomized cues and Bayesian validity; MRD = multiple linear regression with dichotomized cues; MRU = multiple linear regression with undichotomized cues; TTBU = take-the-best with undichotomized cues and frequentist validity.

Bayesian validities guard against the extreme estimates that the frequentist approach may produce for very small training sets. Unlike take-the-best, naïve Bayes depends on point estimates of validity, so Bayesian validities could be important for its performance. With a training set of two objects, NBB performs substantially better than does NBF (58% vs. 50%; see also Footnote 1). This advantage disappears with 10 objects or more.

The other benchmark, linear regression, does not fare well for very small training set sizes. This is not a new finding (Dawes, 1979; Hogarth & Karelaia, 2005). MRD does not outperform the MIN heuristic for any size of the training set. MRU surpasses MIN only when there are six objects in the training set, and TAL only for nine objects in the training set.

In sum, we found that the models are differentially influenced by the information that minute samples afford. The predictive accuracy of the two benchmarks, linear regression and naïve Bayes, is compromised, presumably because training sets with 10 or fewer objects provide unreliable point estimates of regression weights and cue validities. Still, the use of Bayesian estimates provides a decent remedy for naïve Bayes. The heuristics, in contrast, seem capable of making do with very limited information: TAL is the most robust model for training sets with two objects, whereas TTBU is the most robust model for sets including between three and 10 objects, with a difference in accuracy of more than 5 percentage points on average. In addition, with as few as seven objects TTBU reaches a performance level close to the maximum achieved by any model with a 50% training set. Hogarth and Karelaia (2007) conjectured that heuristics such as TTBU "would be less sensitive to sampling errors" (p. 751). This is exactly what we found. Why is it so?

The Robust Beauty of Cue Directions

One possible reason for the robustness of heuristics is that even minute samples identify a few good cues. For example, could it be that at least one of the, say, three most valid cues is ordered correctly? No. For training sets with two to eight objects, it is more likely that all three most valid cues are ranked incorrectly than that at least one of them is ranked correctly.

Interestingly, getting the validity order right does not really matter much, as shown in Figure 2. In the upper panel of Figure 2 we plot, for all training set sizes, the correlation between the predictive accuracy of take-the-best (and MIN) and how closely the validity order estimated from the training set corresponds to the validity order in the population. For each one of the 5,000 repetitions of the simulation and for each training set size, we measured the degree of correspondence between the two orders by a rank correlation. We separately considered the three most valid cues and all cues. None of the correlations between predictive accuracy and degree of correspondence reach a value higher than .2 for any training set size.
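The article does not state which rank correlation was used. As an illustration, here is the correspondence measure computed with Spearman's rho (our assumption) on a hypothetical estimated order of five cues:

```python
from scipy.stats import spearmanr

population_ranks = [1, 2, 3, 4, 5]   # true validity ranks of five cues
estimated_ranks  = [2, 1, 3, 5, 4]   # ranks estimated from a minute sample

rho, _ = spearmanr(population_ranks, estimated_ranks)
print(rho)   # 0.8: imperfect but substantial correspondence
```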
If cue order does not affect predictive accuracy much, then what does? Cue directions do. In the lower panel of Figure 2, we graph, for all training set sizes, the correlation between the predictive accuracy of take-the-best (and MIN) and how closely the cue directions estimated from the training set correspond to the cue directions in the population. We measured the degree of correspondence between these two sets of cue directions by the number of cue directions in the population that are correctly estimated from the training set, for each one of the 5,000 repetitions of the simulation and for each training set size. As before, we considered the three most valid cues and all cues. The correlations start very high, between .6 and .7 for a training set with two objects. As the size of the training set increases, the correlations decrease, possibly because there is increasingly less variability in the number of correctly estimated cue directions.

Figure 2. Upper panel: Mean correlations (across 19 environments), as a function of the size of the training set (2–10 objects and 50% of the objects in the population), between the predictive accuracy of take-the-best with dichotomized cues and frequentist validity (TTBF) and minimalist with dichotomized cues (MIN) and how closely the cue order estimated from the training set approximates the cue order in the population, for the three most valid cues and for all cues. Lower panel: Mean correlations (across 19 environments), as a function of the size of the training set (2–10 objects and 50% of the objects in the population), between the predictive accuracy of TTBF and MIN and the number of cue directions in the population that are correctly estimated from the training set, for the three most valid cues and for all cues.

For all training set sizes, we calculated, using all data sets and all simulation repetitions per data set, the probability that the direction of a cue was estimated correctly. To avoid overestimating this probability, we used all repetitions, not only those in which a cue discriminated and its direction could be judged. Even so, the probability is high. By sampling two objects, cue direction is estimated correctly with a probability of about .6; the probability surpasses .7 for six objects and reaches .8 for 10 objects.
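A toy simulation suggests why so few objects suffice. The sketch below is our construction, not the article's analysis: it assumes a Gaussian environment with a single continuous cue whose correlation with the criterion is r, whereas the article's estimates come from its 19 real data sets.

```python
import math
import random

def p_direction_recovered(n, r=0.5, trials=10_000):
    """Probability that a sample of n objects yields the correct
    (here: positive) cue direction, judged by a majority of pairs.
    Purely illustrative; all distributional choices are assumptions."""
    hits = 0.0
    for _ in range(trials):
        ys = [random.gauss(0, 1) for _ in range(n)]
        xs = [r * y + math.sqrt(1 - r * r) * random.gauss(0, 1) for y in ys]
        concordant = discordant = 0
        for i in range(n):
            for j in range(i + 1, n):
                s = (xs[i] - xs[j]) * (ys[i] - ys[j])
                if s > 0:
                    concordant += 1
                elif s < 0:
                    discordant += 1
        if concordant > discordant:
            hits += 1
        elif concordant == discordant:
            hits += 0.5   # tie: direction is guessed
    return hits / trials

print(p_direction_recovered(2))   # about .67 for r = .5
```

With two objects there is a single pair, so recovery reduces to the probability that cue and criterion differences share the same sign; it grows with the sample, in qualitative agreement with the .6 to .8 figures reported above.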
Although the performance of both take-the-best and MIN correlates highly with the correct estimation of cue directions, MIN's poor performance in Figure 1 suggests that cue order also matters. It is better to look up valid cues first, as take-the-best does, rather than look them up randomly, as MIN does. For example, in training sets with two objects there is a .17 probability that the cue looked up first is the most valid (the probability that this would happen by chance equals .13); this probability increases to .3 for a training set with four objects, .51 for a set with eight objects, and .56 for a set with 10 objects.

In sum, minute samples do not veridically represent the population validity order. Yet, they afford an ordinary but informative piece of information—the direction of a cue. Moreover, even minute samples offer a slightly better-than-chance probability of looking up good cues first. This, together with good cue directions, explains the impressive robustness of take-the-best.

Cue Directions: Individual and Collective Intuitions

Our first investigation established that fast and frugal heuristics do not necessarily need good cue orders, relying instead on good cue directions to make accurate inferences. In our second investigation, we ask: How good or bad are people's intuitions about cue directions? The father of the TAL heuristic, Robyn Dawes (1979), wrote that "people are good at picking out the right predictor variables and at coding them in such a way that they have a conditionally monotone relationship with the criterion" (p. 573). But it is known that individuals often develop false beliefs about how variables are associated. For example, Chapman and Chapman (1969) found that students perceived positive correlations between people's interpretations of ink blots and homosexuality, although in reality the correlation in the experimental material was negative.

We empirically probed people's intuitions about the directions of cues in 10 of the data sets used before. To see whether these intuitions can lead heuristics to perform well, we also tested the performance of bootstraps of heuristics. That is, we used the cue directions judged by participants as input to take-the-best and TAL. Bootstrapping may refer to using the judgments of a person as input to the model of the same person. We use the term more broadly, including the use of an aggregate judgment of a group (defined later) as input.[2]

[2] We are not using the term bootstrap to refer to the statistical procedure based on resampling.

Method

Of the data sets used previously, we eliminated some that are obscure (e.g., predicting rainfall from cloud seeding) or strongly overlap with others (e.g., we selected predicting obesity instead of percentage of body fat). Using the same constraints, we reduced the cues to an average of 5.1 cues per data set, varying from three to eight, ending up with a total of 51 continuous and binary cues. Table 2 lists the remaining environments and cues.

Table 2
10 Environments and 51 Cues Used to Test People's Intuitions About Cue Directions in the Second Investigation

Environment | Cues
Predict dropout rate of the 57 Chicago public high schools | Grade, percentage of non-White students, percentage of low-income students
Predict the rate of homelessness in 50 U.S. cities | Percentage of public housing, unemployment rate, average temperature, vacancy rate, rent control, percentage of inhabitants below poverty line
Predict the mortality rate in 20 U.S. cities | Percentage of non-White people in urbanized areas, pollution level, average January temperature
Predict populations of the 83 German cities with at least 100,000 inhabitants | City in former East Germany, state capital, exposition site, university, in industrial belt, intercity train connection, license plate, soccer team in premier league
Predict the selling prices of 22 houses in Erie, Pennsylvania | Age of house, number of bathrooms, lot size, living space, property tax, number of bedrooms, garage
Predict the accident rate per million vehicle miles for 37 segments of a highway | Number of lanes, speed limit, percentage of trucks, shoulder width, segment length, lane width, intersections, volume, average traffic count
Predict the average motor fuel consumption per person for each of the 48 contiguous U.S. states | Population, miles per highway, motor fuel tax, percentage of population with license, number of licensed drivers
Predict percentage of body fat for 218 men | Weight, leg circumference, height
Predict the average amount of sleep time for 35 species of mammals | Maximum life span, brain weight, body weight, gestation index
Predict the number of species on 26 Galapagos islands | Distance to adjacent island, distance to coast, area of adjacent island, elevation
For two environments, participants were likely to have some direct experience (German city populations and body fat in men). It is unlikely, however, that our German participants had experience with the other environments, such as the housing market in Erie, Pennsylvania. Rather than asking our participants, 50 students from three universities in Berlin, to state the sign of a correlation between cue and criterion, we asked them whether an increase in the value of a specific cue would be associated with an increase or a decrease in the criterion value. We also asked them to order cues by how useful they thought the cues would be. The order in which environments were presented and the order in which cues were presented within environments were varied across participants. For some cues, some participants did not answer, but we collected more than 45 judgments for each cue.

How Good Are People's Intuitions About Cue Directions?

Across all 10 environments, people judged cue directions correctly in 67% of all cases. The directions of about half of the cues (27 out of 51) were judged correctly by more than 85% of the participants. The directions of about three fourths of the cues (38 out of 51) were judged correctly by more than 50% of the participants. Cues judged incorrectly by more than 50% of the participants occur mostly in three environments: mammals' sleep (four cues), highway accidents (three cues), and fuel consumption (three cues).

People's incorrect intuitions are not surprising. For example, the number of drivers with a license in a state is negatively correlated with the average fuel consumption in the state. Other cue directions may be justifiable in retrospect but are still surprising, such as, for example, the number of car accidents on a road segment being negatively correlated with the speed limit. Finally, some cue directions simply require expert knowledge. For instance, it is not clear, to dilettantes in biology such as the authors, why the gestation index is negatively correlated with how much a mammal sleeps.
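Scoring such judgments amounts to comparing each participant's judged sign with the sign found in the data set. A sketch with invented numbers (the cue names and counts below are hypothetical, not the study's data):

```python
# Each participant judged a cue's direction as +1 or -1; compare the
# judged signs against the direction in the corresponding data set.
judged = {
    "unemployment rate (homelessness)": [+1] * 46 + [-1] * 2,
    "speed limit (accident rate)":      [+1] * 40 + [-1] * 8,
}
true_direction = {
    "unemployment rate (homelessness)": +1,
    "speed limit (accident rate)":      -1,   # counterintuitive in these data
}

for cue, signs in judged.items():
    pct = sum(s == true_direction[cue] for s in signs) / len(signs)
    tag = "  <- majority wrong" if pct < 0.5 else ""
    print(f"{cue}: {pct:.0%} judged correctly{tag}")
```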
How Well Do Heuristics Perform When Fed With People's Cue Directions?

To answer this question, we constructed and tested bootstrap models of take-the-best and a benchmark that we call the calibrated take-the-best model. The predictive accuracy of calibrated take-the-best is the predictive accuracy of TTBF as analyzed in the first simulation. Across the 10 environments and 51 cues listed in Table 2, TTBF was calibrated on a training set with 50% of all objects in each environment. The predictive accuracy of the first bootstrap, which we call individual take-the-best, is the average of the predictive accuracies of 50 bootstraps, each using the cue directions and cue orders probed from one of the 50 participants. The second bootstrap, social take-the-best, used the cue directions of the majority of participants (García-Retamero, Takezawa, & Gigerenzer, in press). To determine cue order, social take-the-best exploits the strength of the majority agreement about cue directions: For example, if 60% of participants agreed on the direction of one cue, whereas 70% of participants agreed on the direction of another cue, the second cue would be ranked higher[3] (a sketch of this aggregation is given after Table 3 below).

[3] We also implemented a version of social take-the-best that uses people's intuitions about not only cue directions but also cue orders. It uses Borda counts, the sum of the ranks that a cue was given by all participants, as a proxy for cue ranks (García-Retamero et al., in press). This model reached essentially the same predictive accuracy as did social take-the-best with cue directions and the majority rule.

We also conducted the corresponding analysis for TAL, testing two bootstrap models and a benchmark, calibrated TAL. The predictive accuracy of the first bootstrap, individual TAL, is the average of the predictive accuracies of 50 bootstraps, each using the cue directions of one of the 50 participants. The second, social TAL, takes the direction of a cue to be the direction judged by the majority of the participants (García-Retamero et al., in press).

Table 3 shows the performance of the three take-the-best and three TAL models. Across all 10 environments, the bootstrap models lag from 7 to 10 percentage points behind the calibrated model. Table 3 also shows how accuracy depends on the environment. We distinguish post hoc between counterintuitive environments, in which cue directions proved difficult for people to judge (the mammals' sleep, highway accidents, and fuel consumption environments), and intuitive environments, in which cue directions were easier to judge (the other seven environments). In counterintuitive environments, for both take-the-best and TAL, the calibrated model does much better than the bootstrap models. In intuitive environments, however, the bootstrap models, especially the social bootstraps, do well and catch up with or even outperform the calibrated model with perfect knowledge about half of the environment.[4]

[4] The results on social TAL can be analytically justified: TAL with correct cue directions can be viewed as a majority rule with jurors with at-least-chance accuracy (Katsikopoulos & Martignon, 2006). The Condorcet jury theorem (Condorcet, 1785) delineates conditions under which the majority rule (TAL) can have high or low accuracy. It says that adding jurors (cues) with at-least-chance accuracy leads to a drastic increase in the accuracy of the majority rule; conversely, adding jurors with below-chance accuracy leads to a drastic decrease in the accuracy of the majority rule. This is what we found.

Table 3
Predictive Accuracy (%) of the Models in the Simulation in the Second Investigation

Model | All 10 environments | 3 counterintuitive environments | 7 intuitive environments
Take-the-best, calibrated | 68 | 68 | 68
Take-the-best, individual | 58 | 42 | 65
Take-the-best, social | 60 | 42 | 68
Tallying, calibrated | 67 | 69 | 66
Tallying, individual | 59 | 46 | 64
Tallying, social | 60 | 41 | 68

Note. The predictive accuracy of a calibrated model and two bootstrap models (called individual and social), separately for take-the-best and tallying. The calibrated models use all cue-related information in a training set with 50% of all objects in the population, whereas the bootstrap models use cue direction information provided by 50 people (see text for details). Results are provided across all environments and separately for those environments where cue directions were found to be difficult for people to judge (counterintuitive environments) and easy for people to judge (intuitive environments).
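As a concrete illustration of the aggregation behind social take-the-best, here is a sketch in Python (the data structures and the handling of ties are our assumptions):

```python
from collections import Counter

def social_cue_profile(judgments):
    """Aggregate participants' judged cue directions, as in social
    take-the-best: the majority sign becomes the cue's direction, and
    cues are ordered by the strength of the majority agreement.
    `judgments` maps cue name -> list of +1/-1 judgments. A sketch of
    the aggregation described in the text; tie handling is assumed."""
    profile = []
    for cue, signs in judgments.items():
        majority_sign, count = Counter(signs).most_common(1)[0]
        agreement = count / len(signs)
        profile.append((cue, majority_sign, agreement))
    profile.sort(key=lambda t: t[2], reverse=True)   # strongest majority first
    return profile

# Hypothetical data: 70% agree on cue B but only 60% on cue A,
# so cue B is ranked higher, as in the example in the text.
example = {"cue A": [+1] * 30 + [-1] * 20, "cue B": [-1] * 35 + [+1] * 15}
for cue, sign, agree in social_cue_profile(example):
    print(cue, sign, f"{agree:.0%}")
```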
General Discussion

In what follows, we first discuss why getting cue directions right appears to be so crucial for the performance of fast and frugal heuristics, and then we turn to the role of causal theories underlying intuitions about cue directions.

The Robust Beauty of Ordinary Information

The title of our article alludes to research by Dawes, Hastie, and Kameda on the "robust beauty" of TAL and its group analogue, the majority rule (Dawes, 1979; Hastie & Kameda, 2005). Consistent with the results of their work, the success of TAL in our simulations shows that aggregating cue directions is a robust strategy. What is puzzling about the remarkable performance of take-the-best with undichotomized cues is that it dispenses with aggregation. It relies on just the direction of literally a single cue, the one with the highest validity estimate on the basis of the training set. In the case of continuous cues, this cue will almost always discriminate between two objects. A full solution to this puzzle is still lacking, but we can gain some insight by using two approaches taken in the past to explain the strong performance of heuristics.

First, heuristics achieve high accuracy if the cue structure is noncompensatory. There are at least two meanings of a noncompensatory cue structure, which have been studied analytically: Katsikopoulos and Martignon (2006) showed that, essentially, if the cue with the highest validity has a much higher validity than do all other cues, then take-the-best with binary cues has maximum accuracy among all possible inference models. And Hogarth and Karelaia (2005) showed that, essentially, if the variance in criterion values accounted for by the cue that take-the-best looks up first is higher than the variance in criterion values accounted for by using all cues in a linear regression (adjusting for the total number of cues), then take-the-best with undichotomized cues is more accurate than regression is.

We offer a speculation for the performance of take-the-best with undichotomized cues rooted in how dichotomization affects validities.
One might expect the validity of a continuous cue, under conditions to be specified, to decrease when the cue is dichotomized. The values of two objects on a continuous cue can (a) both lie on the same side of the threshold used to dichotomize the cue or (b) lie on opposite sides of this threshold. In case (b), the validity of the cue is unaffected by dichotomization. In case (a), a dichotomized cue does not discriminate, whereas an undichotomized cue does. For environments where the validity of a continuous cue in case (a) is higher than its validity in case (b), dichotomization lowers cue validity. Irwin and McClelland (2003) made similar observations about the statistical power lost from dichotomizing continuous cues.

In a second approach to explaining the strong performance of take-the-best, Gigerenzer and Brighton (2009) used the fact that the predictive accuracy of any inference model is the sum of its predictions' bias and variance. They conjectured that heuristics achieve high predictive accuracy because their comparatively low variance compensates for their comparatively high bias. Extrapolating from this account, we suspect that take-the-best with undichotomized cues will have lower variance in its predictions than will take-the-best with dichotomized cues. When cues are continuous, take-the-best determines the relative ranks of two objects, according to the criterion, on the basis of the first cue only. When cues are dichotomized, however, the ranking of objects can, in the most extreme case, depend on the complete order of cues. That is, with three continuous cues there would be, in principle, three rankings of the objects, whereas with three dichotomized cues there would be, in principle, six (3!). This difference could decrease the variance in take-the-best's predictions with continuous cues, relative to dichotomized cues.
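The case (a)/(b) argument can be checked numerically. In the sketch below (a synthetic Gaussian environment of our construction, not one of the article's data sets), the dichotomized cue's validity equals the continuous cue's validity restricted to case (b) pairs, so whether dichotomization lowers validity depends on how case (a) validity compares with case (b) validity:

```python
import math
import random

random.seed(1)
r = 0.6   # assumed cue-criterion correlation
ys = [random.gauss(0, 1) for _ in range(300)]
xs = [r * y + math.sqrt(1 - r * r) * random.gauss(0, 1) for y in ys]
threshold = sorted(xs)[len(xs) // 2]   # dichotomize at the sample median

# Tally correct and incorrect paired comparisons by the continuous cue,
# split into case (a) (same side of the threshold) and case (b)
# (opposite sides); the dichotomized cue discriminates only in case (b).
tally = {"a": [0, 0], "b": [0, 0]}
for i in range(len(xs)):
    for j in range(i + 1, len(xs)):
        if ys[i] == ys[j]:
            continue
        case = "b" if (xs[i] > threshold) != (xs[j] > threshold) else "a"
        correct = (xs[i] - xs[j]) * (ys[i] - ys[j]) > 0
        tally[case][0 if correct else 1] += 1

v = {case: R / (R + W) for case, (R, W) in tally.items()}
v_continuous = (tally["a"][0] + tally["b"][0]) / sum(tally["a"] + tally["b"])
print(f"continuous: {v_continuous:.2f}, "
      f"dichotomized (= case b): {v['b']:.2f}, case a: {v['a']:.2f}")
```

In this particular toy environment, opposite-side pairs have the larger cue differences and hence the higher validity, so the median split does not hurt; environments satisfying the authors' condition, with case (a) validity above case (b) validity, would show the opposite pattern.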
Cue Directions: Occam's Razor and Causal Theories

Entities must not be multiplied beyond necessity. If so, what is gained by explaining a person's judgment for one kind of binary choice, say, which of two mountains in the Swiss Alps is higher, on the basis of another binary variable, intuitions about cue direction? Clearly, the most direct solution to the inference problem is to retrieve the mountains' heights directly from memory (see the concept of "local mental models" in Gigerenzer, Hoffrage, & Kleinbölting, 1991). However, there will be situations in which there is no direct knowledge about the criterion. Then, a person may vicariously substitute direct knowledge with probabilistic cues. Our analyses have shed light on the key role of cue directions in this process of substitution.

We speculate that people could arrive at intuitions about cue directions on the basis of their causal knowledge about how the world works. According to Waldmann, Hagmayer, and Blaisdell (2006), people have a "natural capacity to form causal representations" (p. 310), based, in part, on regularities such as "causes typically temporally precede their effects" (p. 308). García-Retamero, Hoffrage, and Dieckmann (2007) have argued that causal representations help people focus on causal associations between cues and criteria, which are likely to be robust. They demonstrated that people were faster to learn causal associations, relative to equally valid associations that were not presented to participants as causal, and that they were more likely to search for cues that were causally related to criteria.

Tentative evidence that causal theories shaped our participants' intuitions about cue directions can be found in one of the common errors in judged cue directions. For inferring the rate of highway accidents, people could causally link the speed limit (cue) to lower accident rates (criterion): "Low speed increases the possibility to respond in time when necessary, and thus the possibility to avoid a collision becomes higher." Counterintuitively, however, lower speed limits in the present data set are associated with more accidents. That is, causal theories can be and sometimes are wrong—but they nevertheless give rise to intuitions about cue directions.

Conclusion

The effort–accuracy tradeoff carries the ring of a general law of cognition: Investing less effort is tantamount to achieving lower accuracy. Challenging the belief that accuracy and effort inescapably trade off, research on fast and frugal heuristics has demonstrated that less information and computation can yield better performance. Countering the argument that heuristics' success rides on the effort put into calculating cue validities and orders, we showed that information limitations that reduce effort do not always hurt accuracy. Simple heuristics can be robust even if simplicity is secured through ordinary information about cue directions, garnered from limited knowledge or found in people's intuitions.

References

Brandstätter, E., Gigerenzer, G., & Hertwig, R. (2006). The priority heuristic: A process model of risky choice. Psychological Review, 113, 409–432.
Brunswik, E. (1955). Representative design and probabilistic theory in a functional psychology. Psychological Review, 62, 193–217.
Chapman, L. J., & Chapman, J. P. (1969). Illusory correlation as an obstacle to the use of valid psychodiagnostic signs. Journal of Abnormal Psychology, 74, 271–280.
Chater, N., Oaksford, M., Nakisa, R., & Redington, M. (2003). Fast, frugal and rational: How rational norms explain behavior. Organizational Behavior and Human Decision Processes, 90, 63–86.
Condorcet, N. C. (1785). Essai sur l'application de l'analyse à la probabilité des décisions rendues à la pluralité des voix [Essay on the application of analysis to the probability of majority decisions]. Paris, France: Imprimerie Royale.
Czerlinski, J., Gigerenzer, G., & Goldstein, D. G. (1999). How good are simple heuristics? In G. Gigerenzer, P. M. Todd, & The ABC Research Group (Eds.), Simple heuristics that make us smart (pp. 97–118). New York, NY: Oxford University Press.
Dawes, R. M. (1979). The robust beauty of improper linear models in decision making. American Psychologist, 34, 571–582.
Dougherty, M. R., Franco-Watkins, A. M., & Thomas, R. (2008). Psychological plausibility of the theory of probabilistic mental models and the fast and frugal heuristics. Psychological Review, 115, 119–213.
García-Retamero, R., Hoffrage, U., & Dieckmann, A. (2007). When one cue is not enough: Combining fast and frugal heuristics with compound cue processing. Quarterly Journal of Experimental Psychology, 60, 1197–1215.
García-Retamero, R., Takezawa, M., & Gigerenzer, G. (in press). How to learn good cue orders: When social learning benefits simple heuristics. In R. Hertwig, U. Hoffrage, & The ABC Research Group (Eds.), Social heuristics that make us smart. New York, NY: Oxford University Press.
Gigerenzer, G., & Brighton, H. (2009). Homo heuristicus: Why biased minds make better inferences. Topics in Cognitive Science, 1, 107–143.
Gigerenzer, G., Hoffrage, U., & Goldstein, D. G. (2008). Heuristics are plausible models of cognition: Reply to Dougherty, Franco-Watkins, and Thomas (2008). Psychological Review, 115, 230–239.
Gigerenzer, G., Hoffrage, U., & Kleinbölting, H. (1991). Probabilistic mental models: A Brunswikian theory of confidence. Psychological Review, 98, 506–528.
Gigerenzer, G., Todd, P. M., & The ABC Research Group (Eds.). (1999). Simple heuristics that make us smart. New York, NY: Oxford University Press.
Goldstein, D. G., & Gigerenzer, G. (2002). Models of ecological rationality: The recognition heuristic. Psychological Review, 109, 75–90.
Hastie, R., & Kameda, T. (2005). The robust beauty of majority rules in group decisions. Psychological Review, 112, 494–508.
Hertwig, R., Herzog, S. M., Schooler, L. J., & Reimer, T. (2008). Fluency heuristic: A model of how the mind exploits a by-product of information retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34, 1191–1206.
Hogarth, R. M., & Karelaia, N. (2005). Ignoring information in binary choice with continuous variables: When is less "more"? Journal of Mathematical Psychology, 49, 115–124.
Hogarth, R. M., & Karelaia, N. (2007). Heuristic and linear models of judgment: Matching rules and environments. Psychological Review, 114, 733–758.
Irwin, J. R., & McClelland, G. H. (2003). Negative consequences of dichotomizing continuous predictor variables. Journal of Marketing Research, 40, 366–371.
Juslin, P., & Persson, M. (2002). PROBabilities from EXemplars: A "lazy" algorithm for probabilistic inference from generic knowledge. Cognitive Science, 26, 563–607.
Katsikopoulos, K. V., & Martignon, L. (2006). Naïve heuristics for paired comparison: Some results on their relative accuracy. Journal of Mathematical Psychology, 50, 488–494.
Lee, M. D., & Cummins, T. D. R. (2004). Evidence accumulation in decision making: Unifying the "take the best" and "rational" models. Psychonomic Bulletin & Review, 11, 343–352.
Newell, B. (2005). Re-visions of rationality? Trends in Cognitive Sciences, 9, 11–15.
Payne, J. W., Bettman, J. R., & Johnson, E. J. (1993). The adaptive decision maker. Cambridge, England: Cambridge University Press.
Rakow, T., Hinvest, N., Jackson, E., & Palmer, M. (2004). Simple heuristics from the adaptive toolbox: Can we perform the requisite learning? Thinking and Reasoning, 10, 1–29.
Schooler, L. J., & Hertwig, R. (2005). How forgetting aids heuristic inference. Psychological Review, 112, 610–628.
Simon, H. A. (1956). Rational choice and the structure of environments. Psychological Review, 63, 129–138.
Tversky, A. (1972). Elimination by aspects: A theory of choice. Psychological Review, 79, 281–299.
Waldmann, M. R., Hagmayer, Y., & Blaisdell, A. P. (2006). Beyond the information given: Causal models in learning and reasoning. Current Directions in Psychological Science, 15, 307–311.

Author Note

This article was published Online First September 6, 2010, in Psychological Review, 2010, Vol. 117, No. 4, 1259–1266. DOI: 10.1037/a0020418. Konstantinos V. Katsikopoulos, Center for Adaptive Behavior and Cognition, Max Planck Institute for Human Development, Berlin, Germany, and Department of Mechanical Engineering and Engineering Systems Division, Massachusetts Institute of Technology; Lael J. Schooler, Center for Adaptive Behavior and Cognition, Max Planck Institute for Human Development; Ralph Hertwig, Faculty of Psychology, University of Basel, Basel, Switzerland.

We thank Anna Culen for her help with data entry and processing, Carola Fanselow for programming the simulations, members of the ABC Research Group for their comments and helpful discussions, and Laura Wiles for editing the article.

Correspondence concerning this article should be addressed to Konstantinos V. Katsikopoulos, Center for Adaptive Behavior and Cognition, Max Planck Institute for Human Development, Lentzeallee 94, 14195 Berlin, Germany. E-mail: katsikop@mpib-berlin.mpg.de

Received July 1, 2009
Revision received March 18, 2010
Accepted March 19, 2010