Prior and prejudice


NATURE NEUROSCIENCE  VOLUME 14 | NUMBER 8 | AUGUST 2011  943

NEWS AND VIEWS

question is how to relate it to a prior distri-
bution: how do those sensory neurons take 
into account the fact that some values of x are 
more likely than others?

There are two basic ways. One is to allocate 
more preferred values xj to the values of x that 
occur more frequently (Fig. 1b) and the other 
is to vary the width of the tuning curves as a 
function of xj (Fig. 1c). Both strategies encode 
some values of x with higher variability than 
others (high variability meaning high spread, 
low precision) and also create biases (high bias 
meaning high systematic deviation, low accu-
racy) with respect to the true values (Fig. 1b,c). 
The studies by Girshick et al.1 and Fischer  
and Peña2 dissect two examples in which 
these two strategies are combined (Fig. 1d). 
Although the neural circuits in each case are 
very different, the resulting computations are 
extremely similar.
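The two strategies can be sketched in a few lines of code. The example below is only an illustration, not the model used in either study: the function names, the sinusoidal warping of the preferred angles, the bell-shaped tuning function and all numerical values (50 neurons, a peak response of 8 spikes, widths between 15° and 45°) are assumptions chosen for convenience.

```python
import math

def preferred_angles(n, warp=0.3):
    """Preferred angles (deg) in [-180, 180).
    warp > 0 crowds them near 0 and +/-180; warp = 0 gives a uniform grid."""
    out = []
    for j in range(n):
        u = -math.pi + 2.0 * math.pi * j / n   # uniform grid in radians
        x = u - warp * math.sin(2.0 * u)       # monotonic as long as warp < 0.5
        out.append(math.degrees(x))
    return out

def tuning_widths(prefs_deg, base_width=30.0, mod=0.5):
    """Widths (deg): narrowest at 0 and +/-180, widest at +/-90 when mod > 0."""
    return [base_width * (1.0 - mod * math.cos(2.0 * math.radians(p)))
            for p in prefs_deg]

def mean_response(x_deg, pref_deg, width_deg, r_max=8.0):
    """Circular bell-shaped tuning curve that peaks at r_max when x equals pref."""
    d = math.radians(x_deg - pref_deg)
    k = 1.0 / math.radians(width_deg) ** 2
    return r_max * math.exp(k * (math.cos(d) - 1.0))

prefs = preferred_angles(50)    # strategy 1: non-uniform preferences
widths = tuning_widths(prefs)   # strategy 2: non-uniform widths
```

Setting `warp=0` or `mod=0` switches off one strategy and leaves the other, so the same sketch covers the uniform and non-uniform cases.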

In the study by Girshick et al.1, the goal was to 
explain two sets of observations indicating that 
there is an asymmetry in the representation of 
orientation (x in this case) in the visual systems 
of mammals. First, performance in orientation 
discrimination tasks is consistently better at the 
cardinal orientations (horizontal and vertical), 
a phenomenon known as the oblique effect. 
Second, neurophysiological measurements 
indicate that the preferred orientations of neu-
rons in the primary visual cortex (V1) are not 
distributed uniformly; rather, the cardinals are 
over-represented (as in Fig. 1b). From optimal-
ity arguments, a reasonable explanation for 
both phenomena is that visual scenes naturally 
contain more edges that are oriented vertically 
or horizontally. These ideas were not new, but 
Girshick et al.1 tested them rigorously.

First, the authors carefully measured the 
actual distribution of orientations in a large 
collection of photographs, thus determining 
the prior for orientation from natural scenes. 
Horizontal and vertical indeed proved to be 

Emilio Salinas is in the Department of Neurobiology 
and Anatomy, Wake Forest School of Medicine, 
Winston-Salem, North Carolina, USA. 
e-mail: esalinas@wakehealth.edu  

Prior and prejudice
Emilio Salinas

To best interpret new sensory information, populations of sensory neurons must represent the lessons of past 
experience. How do they do this? The same solution to this problem is now reported in two very different sensory 
systems, providing a classic example of computational convergence.

A young woman rebuffs a lad. She probably 
does not like him—or perhaps she likes him 
too much. Jane Austen exploits this ambigu-
ity between interpreting words and actions 
according to past experiences—prejudice—or  
to their literal meaning. But the tug-of-war 
between expectations and evidence is a fun-
damental problem that our brain encounters 
at all levels, from social interactions to the 
most basic perceptual judgments. Two stud-
ies in Nature Neuroscience now investigate 
how neural circuits combine the knowledge 
accumulated from previous encounters with 
sensory scenes, technically known as a prior 
distribution, with new stimuli. Although they 
analyze very different sensory computations, 
the determination of visual edge orientation 
in primates1 and the localization of sounds in 
owls2, they reach an identical conclusion about 
how populations of neurons may adapt their 
response properties to incorporate knowledge 
about the statistics of the world, and the solu-
tion is elegant.

The problem of how to combine prior 
expectations and current sensory informa-
tion in an optimal way is addressed through 
the principles of Bayesian inference, which 
provide a mathematical recipe for evaluating 
their relative importance. The generality of this 
problem may be illustrated by sports in which 
players continuously update the prior describ-
ing what the opponent is likely to do. In ten-
nis, for example, the server can direct his serve 
either to the middle or to the side of the court, 
and typically chooses whichever is hardest  

for the opponent to return. However, if the 
serve becomes predictable, then the returner 
can prepare accordingly and produce a win-
ning shot. For the returner the key trade-off is 
this: if the serves are slow enough (low noise 
in the sensory input), then he can simply see 
where the ball is going and choose without any 
bias whether to hit a forehand or a backhand, 
but if the serves are fast (high noise), then he 
must guess and commit to a particular motion 
early, else he has little chance of returning the 
serve. The Bayesian recipe finds the best prob-
abilistic strategy between these two extremes, 
one that relies on the evidence alone and another 
that is biased toward the prior.
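The trade-off can be made concrete with the simplest textbook case, a Gaussian prior combined with a Gaussian observation. The sketch below assumes that case purely for illustration; it is not a model from either study, and the function name and numbers are hypothetical.

```python
def posterior_mean_var(prior_mean, prior_var, obs, obs_var):
    """Combine a Gaussian prior with a Gaussian observation using Bayes' rule."""
    # The weight on the observation grows as its noise (obs_var) shrinks.
    w = prior_var / (prior_var + obs_var)
    mean = w * obs + (1.0 - w) * prior_mean
    var = prior_var * obs_var / (prior_var + obs_var)
    return mean, var

# Hypothetical numbers: prior centered at 0, observation at 10.
low_noise = posterior_mean_var(0.0, 4.0, 10.0, 0.25)   # estimate stays near 10
high_noise = posterior_mean_var(0.0, 4.0, 10.0, 16.0)  # estimate is pulled toward 0
```

With low observation noise the estimate follows the evidence; with high noise it is drawn toward the prior mean, which is the bias described above.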

A growing body of evidence indicates 
that human subjects often behave in such a  
statistically optimal way in a wide array of 
perceptual and motor tasks3–5, and that those 
probabilistic calculations may also determine 
fundamental properties of single neurons6. 
Many such studies have specifically shown 
that, in making perceptual judgments, indi-
viduals indeed take into account prior dis-
tributions, whether they arise naturally7,8 or 
are artificially imposed by the experimental 
design4,5,8,9. How then are such prior distribu-
tions represented by neural circuits, and how 
are they accessed?

Consider how populations of sensory 
neurons encode a given stimulus feature x.  
Typically, neuron j becomes maximally acti-
vated when x takes a particular value xj, and 
the response decreases as x differs from this 
preferred xj (Fig. 1a). Neurons across the 
population have different preferences, and 
their response curves as functions of x, or 
‘tuning curves’, overlap to cover the full range 
of x. Although this type of representation has 
been studied thoroughly6,10–15, a lingering 

It is particularly incumbent on those who never change their opinion, to be secure of judging  
properly at first. —Jane Austen, Pride and Prejudice

© 2011 Nature America, Inc. All rights reserved.



of computations that can be easily performed 
by neurons (for example, weighted sums of 
inputs). To infer the stimulus angle encoded 
by the model responses at any given time, they 
used a simple readout scheme known as the 
vector method13. Neuron j casts a vote in favor 
of a vector pointing at the angle xj such that the 
strength of the vote is equal to the response of 
the cell rj. Then all the weighted vectors are 
added and the angle of the resulting vector 
is considered to be the angle encoded by the 
population. This way, the most active neurons 
contribute more to the final answer. With this 
readout or decoding method, the performance 
of the model population in the orientation dis-
crimination task was indistinguishable from 
that of human subjects. The asymmetry in the 
neuronal representation fully accounted for 
the bias in behavior.
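The readout just described can be sketched as follows. This is a minimal illustration of the vector method as stated in the text, with a hypothetical function name and made-up responses. For orientation, which repeats every 180°, the angles would typically be doubled before summing and the result halved, which this sketch leaves out.

```python
import math

def vector_decode(responses, prefs_deg):
    """Population-vector readout: each neuron votes for its preferred angle
    with a strength equal to its response; returns the angle (deg) of the sum."""
    sx = sum(r * math.cos(math.radians(p)) for r, p in zip(responses, prefs_deg))
    sy = sum(r * math.sin(math.radians(p)) for r, p in zip(responses, prefs_deg))
    return math.degrees(math.atan2(sy, sx))

# Hypothetical example: three neurons preferring -90, 0 and 90 deg.
angle = vector_decode([1.0, 4.0, 3.0], [-90.0, 0.0, 90.0])
```

The decoded angle lies between the preferred angles of the two most active neurons, closer to the one with the larger response.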

Fischer and Peña2 studied the auditory 
system of owls, so the details are very differ-
ent, but they adopted a remarkably similar 
approach and conceptual framework. Their 
starting point was also a notorious asym-
metry in behavior. Owls can locate sounds 
along the horizontal plane accurately near the  
center of gaze, but they typically underestimate 
those originating further into the periphery. 
This central bias is substantial: on average, 
stimuli at ±45° elicit responses to ±33° or less2. 
Fischer and Peña2 accounted for these behav-
ioral results with a Bayesian model with two  

more common. Second, they tested the dis-
crimination capacity of people in a new task 
that allowed them to parametrically vary the 
noise or uncertainty of each oriented stimulus. 
This was crucial because perceived orientation 
varies with the amount of noise in the stimu-
lus (that is, with its visibility): noisy stimuli 
appear more horizontal or more vertical. This 
is exactly as expected: the larger the uncer-
tainty in the evidence, the stronger the reliance 
on the prior. Third, using a Bayesian model 
of the task applied to these psychophysical 
data, the authors inferred the internal prior 
used by the subjects and found that, on aver-
age, this prior was nearly identical to the prior 
obtained from natural images. This means that 
the neural representation of orientation in the 
brain is biased in a way that precisely matches 
the actual asymmetry found in nature. This 
match is a strong indicator of computational 
efficiency in the visual system.

Finally, Girshick et al.1 simulated the 
responses of a population of orientation- 
sensitive neurons with distributions of widths 
and preferred orientations based on reported 
data from neurophysiological experiments. 
Their model essentially applied the depen-
dencies shown in Figure 1b,c simultaneously 
to the same population. The objective was to 
investigate whether the Bayesian operations 
needed to combine the sensory evidence and 
the prior could be implemented with the types 

elements: a prior that favored sound sources near 
the center of gaze and a function that generated  
a noisy estimate of interaural time difference 
(ITD) for any given stimulus direction.

The ITD, which is the difference in the time 
of arrival of a sound to the two ears, is a crucial 
intermediate variable here because early audi-
tory neurons are tuned for ITD and the hori-
zontal angle of a sound is actually computed 
from it by specialized circuitry downstream. 
Thus, any uncertainty in ITD is carried over as 
uncertainty in source direction. Now, because 
for any sound direction the ITD that reaches 
the tympanic membrane is known from exper-
imental measurements, the Bayesian model 
had only two free parameters: the amount of 
noise in the ITD estimation and the width of 
the prior. By adjusting these two parameters, 
the model accounted for the original behav-
ioral data and for the behavior observed under 
two additional experimental conditions, one 
that altered the relationship between ITD and 
sound direction and another that increased 
the amount of noise in the owl’s perception 
of ITD.
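A two-parameter model of this kind can be sketched on a grid of candidate directions. The code below is an illustration under stated assumptions, not the authors' implementation: the sinusoidal ITD mapping is a placeholder for the experimentally measured one, the prior is taken to be Gaussian and centered on 0°, the estimate is the posterior mean, and all numbers are hypothetical.

```python
import math

def itd_from_direction(theta_deg, amp_us=260.0):
    """Placeholder ITD mapping (microseconds); the real one is measured."""
    return amp_us * math.sin(math.radians(theta_deg))

def bayes_direction_estimate(itd_obs, noise_sd_us=40.0, prior_sd_deg=25.0):
    """Posterior-mean direction (deg) on a grid from -90 to 90 deg.
    The two free parameters are the ITD noise and the width of the prior."""
    num = den = 0.0
    for i in range(361):
        theta = -90.0 + 0.5 * i
        prior = math.exp(-0.5 * (theta / prior_sd_deg) ** 2)
        err = (itd_obs - itd_from_direction(theta)) / noise_sd_us
        post = prior * math.exp(-0.5 * err ** 2)  # unnormalized posterior
        num += theta * post
        den += post
    return num / den

# Estimate for a source at 45 deg when the ITD is observed without error.
estimate = bayes_direction_estimate(itd_from_direction(45.0))
```

Under these assumptions the estimate falls short of 45°, the same kind of underestimation of peripheral directions seen in the behavioral data; narrowing the prior or increasing the ITD noise strengthens the pull toward the center.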

Next, Fischer and Peña2 developed a popu-
lation model describing the encoding of hori-
zontal sound direction in the optic tectum of 
the owl. Again, the objective was to figure out 
how the neurons could implement the proba-
bilistic operations of the Bayesian model. For 
this, they generated arrays of neuronal tuning  

[Figure 1, panels a–d. Panel titles: (a) Uniform widths, uniform preferences; (b) Uniform widths, non-uniform preferences; (c) Non-uniform widths, uniform preferences; (d) Non-uniform widths, non-uniform preferences. Axes: mean response (spikes), variability (deg) and bias (deg), each plotted against stimulus angle (deg) from −180 to 180.]

Figure 1  Encoding of a stimulus by a neuronal population. The angle on the x axis represents either the horizontal direction of a sound or the orientation 
of a visual stimulus (with the range of x rescaled by a factor of 2). (a) Top, a standard array of tuning curves with identical tuning width and uniformly 
distributed preferred angles. Bottom, variability (s.d.) and bias (mean) of the angle decoded over multiple trials from the responses of the population in a. 
Black circles (diameter, 10°) and orange spots depict variability (spot size) and bias (spot offset) at three stimulus angles. (b) Data presented as in a, but 
with variable density of preferred angles, highest at 0° and ±180° and lowest at ±90°. (c) Data presented as in a, but with variable tuning-curve widths. 
Narrowest curves peak at 0° and ±180° and widest ones at ±90°. (d) An array of tuning curves that approximates those found in the optic tectum of owls. 
All populations consisted of 50 model neurons with Poisson responses and had the same mean tuning-curve width. Encoded angles were found using the 
vector method.




neuronal population, but also to understand  
why different neurons have tuning curves of 
different shapes14,15. What makes a ‘good’ 
shape? What makes an optimal mixture of 
shapes for a population encoding a par-
ticular sensory feature? The answers will 
certainly depend on the organism’s lifestyle 
and its interactions with the environment, 
but there is hope that general principles will 
emerge12,14,15. The new studies have peeled a 
layer of mystery from this fundamental issue 
in computational neuroscience.
COMPETING FINANCIAL INTERESTS 
The author declares no competing financial interests.

1. Girshick, A.R., Landy, M.S. & Simoncelli, E.P.  
Nat. Neurosci. 14, 926–932 (2011).

2. Fischer, B. & Peña, J.L. Nat. Neurosci. 14, 1061–1066 
(2011).

3. Ernst, M.O. & Banks, M.S. Nature 415, 429–433 
(2002).

4. Körding, K.P. & Wolpert, D.M. Nature 427, 244–247 
(2004). 

5. Trommershäuser, J., Maloney, L.T. & Landy, M.S. 
Trends Cogn. Sci. 12, 291–297 (2008). 

6. Ma, W.J., Beck, J.M., Latham, P.E. & Pouget, A. Nat. 
Neurosci. 9, 1432–1438 (2006). 

7. Weiss, Y., Simoncelli, E.P. & Adelson, E.H. Nat. 
Neurosci. 5, 598–604 (2002). 

8. Ashourian, P. & Loewenstein, Y. PLoS ONE 6, e19551 
(2011).

9. Miyazaki, M., Yamamoto, S., Uchida, S. & Kitazawa, S.  
Nat. Neurosci. 9, 875–877 (2006).

10. Pouget, A., Dayan, P. & Zemel, R. Nat. Rev.  
Neurosci. 1, 125–132 (2000).

11. Paradiso, M.A. Biol. Cybern. 58, 35–49 (1988).

12. Berens, P., Ecker, A.S., Gerwinn, S., Tolias, A.S. & Bethge, M. 
Proc. Natl. Acad. Sci. USA 108, 4423–4428 (2011).

13. Salinas, E. & Abbott, L.F. J. Comput. Neurosci. 1, 
89–107 (1994).

14. Salinas, E. PLoS Biol. 4, e387 (2006).

15. Bonnasse-Gahot, L. & Nadal, J.P. J. Comput. 
Neurosci. 25, 169–187 (2008).

presumably, sounds in a forest may come from 
any direction. Rather, the prior function rep-
resents the relevance of the various sound 
directions.

Such an ‘importance coefficient’ of each 
direction may depend on many factors besides 
the associated frequency of occurrence. For 
instance, sounds coming from the back of the 
owl may be irrelevant because large orienting 
movements may alert the potential prey or 
require too much time or energy. In fact, the 
underestimation of sound directions has been 
reported in many species2. If, for whatever rea-
son, there is no point in responding to a par-
ticular direction, then detecting sounds from 
it is unnecessary; it just wastes resources14. 
In general, asymmetries in the distribu-
tions of preferences and widths in a popula-
tion can be used to assign different weights 
to different stimulus values because of their 
frequency, their potential for higher reward, 
motor constraints14, and so on. In the tennis 
analogy, a player may ignore balls coming to 
his backhand side either because they are too 
infrequent, because he cannot see well in that 
direction, or because he is hurt and cannot 
hit backhands. As a consequence, behavioral 
asymmetries may have multiple causes, and 
resolving them may require careful analyses 
such as those carried out by Girshick et al.1 
and Fischer and Peña2. Moreover, behavioral or 
neuronal responses that appear suboptimal under 
one prior may be optimal under another.

In a wider context, the goal is not just to 
identify the factors that determine the distri-
butions of widths and preferred values of a 

curves as functions of sound direction 
and compared the model responses to the 
behavioral data. This required two ingredi-
ents. First, they needed a read-out to infer 
the source angle encoded by the population’s 
responses, and they used the very same vector 
method as Girshick et al.1 Furthermore, they 
obtained an important theoretical result 
describing the mathematical conditions under 
which the vector method is equivalent to the 
Bayesian model2. Second, to fit the behavioral 
data, they had to adjust the distribution of pre-
ferred locations across the population. Their 
resulting model is qualitatively similar to that 
shown in Figure 1c, except that the owl’s tun-
ing curves are not perfectly symmetric. Finally, 
they showed that the distribution of preferred 
locations in the best-fitting model matched 
the actual distribution measured experimen-
tally, providing further proof of consistency 
between the behavioral, computational and 
neurophysiological results.

Both these studies create convincing links 
between psychophysical performance and 
neuronal representations using the formalism 
of Bayesian inference. There is a noteworthy 
difference between them, though. For edge 
orientation, the prior corresponds exactly to 
the frequencies with which horizontal, verti-
cal or other orientations are encountered in 
a visual scene. Thus, the statistics of natural 
images can fully account for the asymmetries 
in width and density in the V1 orientation 
tuning curves (Fig. 1b,c). For the owl, in con-
trast, the prior does not represent the distri-
bution of sound sources in the environment;  

of neuroectodermal origin, and involves a 
downstream molecule called β-catenin. In 
the absence of Wnt, Axin1 cooperates with 
glycogen synthase kinase 3 (GSK3) and phos-
phorylates β-catenin, thereby signaling its 
degradation. In the presence of Wnt, β-catenin 
is not phosphorylated and accumulates in the 
cell and modulates gene expression. Active 
Wnt has been shown to impair oligodendro-
cyte progenitor differentiation and repair of 
demyelination2–5.

Fancy and colleagues1 identified the pro-
tein Axin2, also known as Axil (in rat) and 
Conductin (in mouse), as a negative regulator  
of β-catenin stability (Fig. 1), even in the 

Patrizia Casaccia is in the Department of 
Neuroscience and Friedman Brain Institute, Mount 
Sinai School of Medicine, New York,  
New York, USA. 
e-mail: patrizia.casaccia@mssm.edu  

Anti-TANKyrase weapons promote myelination
Patrizia Casaccia

A study identifies mechanisms responsible for the inability to form new myelin after neonatal hypoxia. It identifies 
Axin2 as a potential therapeutic target for reversing the ‘differentiation block’ of oligodendrocyte-lineage cells.

Cerebral palsy and cognitive deficits repre-
sent the devastating consequences of preterm 
births and of perinatal hypoxic or ischemic 
injury of full-term infants. At a cellular level, 
disease severity correlates with the degree of 
white matter injury and is characterized by 
the inability of cells in the oligodendrocyte 
lineage to differentiate into myelin-forming 
cells. There are no therapies to overcome this  

differentiation block. A similar deficit in the 
ability to form new myelin can be detected 
in the adult brain after demyelination in 
people with multiple sclerosis and is associ-
ated with lack of repair. In this issue of Nature 
Neuroscience, Fancy and colleagues1 identify 
Axin2, an inhibitor of the Wnt pathway, as 
a promising new therapeutic target for drug 
development directed at favoring new myelin 
formation in the neonatal and adult brain.

Wnt proteins comprise a family of secreted 
ligands crucial for stem cell biology and 
embryonic development. Inappropriate regula-
tion of Wnt signaling occurs in several types of 
cancer, including colon, liver and brain tumors 
