

Emergence of Communication in Embodied Agents 
Evolved for the Ability to Solve a Collective Navigation Problem 
 

Davide Marocco         Stefano Nolfi 
Institute of Cognitive Science and Technologies, CNR, Viale Marx 15 

Rome, 00137, Italy 
[davide.marocco; stefano.nolfi]@istc.cnr.it 

In this paper we present the results of an experiment in which a collection of simulated robots that are 
evolved for the ability to solve a collective navigation problem develop a communication system that 
allows them to better cooperate. The analysis of the obtained results indicates how evolving robots 
develop a non-trivial communication system and exploit different communication modalities. Results 
also indicate how the possibility to co-adapt the robots’ individual and social/communicative 
behaviour plays a key role in the development of progressively more complex and effective individuals.   

1. Introduction 

The development of embodied agents able to interact autonomously with the physical world 
and to communicate on the basis of a self-organizing communication system is a new exciting 
field of research (Steels and Vogt, 1997; Cangelosi and Parisi, 1998; Steels, 1999; Steels and 
Kaplan, 2001; Marocco, Cangelosi and Nolfi, 2003; Quinn et al, 2003; for a review see Kirby, 
2002; Steels, 2003; Wagner et al., 2003; Nolfi, 2005). The objective is to understand how 
a population of agents equipped with a sensory-motor system and a cognitive apparatus 
can develop a grounded communication system and use their communication abilities to solve 
a given problem.  

Answering this question is important for both scientific and technological reasons.  From 
a scientific point of view, understanding how communication abilities and a communication 
system might emerge in a population of interacting embodied agents might shed light on the 
evolution of animal communication and on the origin of language. From a technological point 
of view, understanding the fundamental principles involved might lead to the development of 
innovative communication methods for multi-agent software systems, autonomous robots, and 
ubiquitous computing devices. 

In this paper we present the results of an experiment in which a collection of simulated 
robots that are evolved for the ability to solve a collective navigation problem develop a 
communication system that allows them to cooperate better.  

Robots are provided with simple sensory-motor systems that allow them to move, 
produce signals with varying frequencies, and gather information from their physical and 
social environment (including signals produced by other agents). Since the chosen problem 
admits a variety of qualitatively different solutions and since robots are selected on the basis 
of their ability to solve the collective navigation problem (and not on the basis of their 
communication abilities), evolving robots are left free to determine the circumstances in 
which communication is used, the structure of the communication system (i.e. the number, the 
type and the “meaning” of signals), the communication modalities (i.e. the role played by 
communicating individuals), and the relation between individual and social/communication 
abilities.  

The analysis of the obtained results indicates how evolving robots develop a non-trivial 
communication system and exploit different communication modalities. Results also indicate 
how the possibility to co-adapt the robots’ individual and social/communicative behaviours 
plays a key role in the development of progressively more complex and effective individuals. 

In the following section we will review the related literature. In section 3, we will 
describe our experimental setup and we will show how a communication ability emerges as a 
result of an indirect selective pressure. In section 4, we will describe the type of signals 
produced by evolved robots and their effects on other robots’ behaviours. In section 5 we will 
describe the modalities with which evolved individuals communicate. In section 6, we will 
describe the relation between the individual and social/communicative behaviour. Finally, in 
section 7, we will discuss the implications of the obtained results.  

2. Related literature 
In their pioneering work on evolution of communication Werner and Dyer (1992) evolved a 
simulated population of male and female agents living in a toroidal grid environment for the 
ability to ‘mate’ (i.e. for the ability of male agents to reach the physical location of female 
agents). The sensory-motor structure of the agents was designed so as to force them to rely on 
signalling behaviour and to prevent any possible alternative strategy. Indeed, females are 
immobile (i.e. they cannot reach males) and males are blind (i.e. they cannot detect the 
position of females). Females are allowed to detect the position and orientation of the closest 
male located in the 5x5 cells area surrounding them and to send signals (i.e. vectors of 3 
binary values) to all males located in the same surrounding area. Males, on the other hand, are 
allowed to detect the signals produced by females and to move. By analysing the obtained 
results, the authors showed how evolved individuals were able to solve the mating problem by 
exploiting their communication ability despite the fact that communication was not explicitly 
rewarded in the fitness function. By analysing how evolved males modified their motor 
behaviour on the basis of the signals detected, the authors showed how evolved females 
produced a few different signals whose effect could be described with sentences like “go 
forward”, “turn left”, and “turn right”. 

More recently, Cangelosi and Parisi (1998) and Marocco et al. (2003) demonstrated how 
communication can also emerge in experimental settings in which communication is not 
necessarily required to solve adaptive problems, but allows individuals to achieve better 
performance. In Marocco et al. (2003), for example, a population of robotic arm controllers 
was evolved for the ability to discriminate objects with different shapes (i.e. spherical 
or cubic objects) on the basis of tactile sensory information by continuing to touch spherical 
objects while avoiding cubic objects. Evolving robots are asked to play alternately the role 
of speaker and hearer. When they assume the role of speaker they only receive as input the 
state of tactile sensors and are allowed to produce a vector of floating point values to ‘name’ 
the object. When they assume the role of hearer, they receive as input both tactile and 
communication information (i.e. the vector of floating point numbers produced by a speaker 
agent that previously interacted with the same object). Evolved individuals display an ability 
to discriminate the two types of objects and to produce the appropriate motor behaviour (i.e. 
continuing to touch or avoiding the object) on the basis of tactile sensory information only, but 
they also develop an ability to name the two objects with different patterns and to use these 
patterns to discriminate the objects immediately, thus avoiding the time otherwise needed to 
discriminate objects through physical interaction. 

Quinn et al. (Quinn 2001; Quinn et al, 2003) evolved a team of mobile robots for the 
ability to move while remaining close to one another. Robots are only provided with 
proximity sensors and therefore do not have dedicated communication channels. Despite 
that, they evolve a primitive form of implicit communication based on bodily movement that 
allows them to coordinate and to assume different roles (see also Baldassarre et al, 2003). 
This is achieved through two sub-behaviours: (1) an approaching behaviour in which one of 
the two robots tends to produce a sustained level of activation on the infrared sensors of the 
other robot, and (2) a front-inversion behaviour in which the robot that experiences a 
sustained level of activation in its infrared sensors inverts its direction of movement. The 
combination of these two sub-behaviours, in fact, allows the two robots to align and to 
assume the role of follower and leader, respectively.  

Finally, Di Paolo (2000) reported the results of a set of experiments in which two 
simulated robots moving in an arena were evolved for the ability to approach each other 
and to remain close together as long as possible. Robots are provided with two motors 
controlling the two wheels, a sound organ able to produce sounds with different intensities, 
and two sound sensors symmetrically placed at ±45 degrees with respect to the frontal side of 
the robot that detect the intensity of the sound produced by the two robots. Evolved robots 
exploit the possibility to modulate the intensity of the produced sounds by producing 
rhythmical sounds with varying intensities phase-locked at some value near perfect anti-phase. 

Like the authors of the models reviewed above, we are interested in building a model in 
which communication can emerge without being explicitly rewarded.  However, we are also 
interested in experimental set-ups in which individual and social/communicative behaviour 
can co-evolve by mutually shaping one another. Moreover, we are interested in studying how 
individuals might discover categories that are useful from a communication point of view and 
that are not already explicitly or implicitly identified in the experimental setting. Finally, we 
are interested in studying how not only communication abilities but also communication 
modalities, that regulate how individuals interact/communicate, can emerge as a result of an 
indirect selective pressure. For this reason, we will not impose a restricted and predefined 
interaction schema and we will leave robots free to determine the modality with which they 
will interact. By restricted and predefined interaction schema we mean, for example, the 
interaction modality adopted in Werner and Dyer (1992), in which female and male 
individuals can only play the role of speaker and hearer, respectively, or the interaction 
modality adopted in Cangelosi and Parisi (1998) and Marocco et al. (2003), in which agents 
alternately assume the role of speaker or hearer and in which speakers are allowed to send 
to hearer robots a signal consisting of a single pattern after having interacted for a certain 
amount of time with the same object that will be experienced by the hearer.  

3. Experimental set-up and emergence of communication 
A group of four simulated robots live in a square arena of 270x270cm surrounded by walls 
that contains two circular target areas (see Figure 1, left). The robots have a circular body 
with a radius of 11 cm and are provided with: two motors controlling the two corresponding 
wheels, one communication actuator capable of sending signals with varying frequencies 
(signals are encoded as floating point values ranging from 0.0 to 1.0), eight infrared sensors 
(that detect obstacles up to a distance of about 15 cm), one ground sensor (which, by 
detecting the colour of the floor, can ascertain whether the robot is located on a target area or 
not), and four communication sensors that detect signals produced by other robots up to a 
distance of 100 cm from four corresponding directions (i.e. frontal [315°-45°], rear [135°-225°], 
left [225°-315°], right [45°-135°]) (see footnote 1). The communication sensors are implemented 
so that a robot can detect the signal emitted by only a single robot in each direction.  
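As an illustration only, the four angular sectors and the 100 cm range described above can be sketched as follows. The function names, the convention that bearings are measured clockwise from the robot's heading, the handling of sector boundaries, the choice of the nearest emitter as the single robot heard in each direction, and the use of 0.0 as the null reading are all assumptions made for this sketch; the text does not specify them.

```python
MAX_RANGE_CM = 100.0  # detection range stated in the text


def comm_sector(bearing_deg):
    """Map a bearing (degrees, assumed clockwise from the heading) to a sector."""
    b = bearing_deg % 360.0
    if b >= 315.0 or b < 45.0:
        return "frontal"
    if b < 135.0:
        return "right"
    if b < 225.0:
        return "rear"
    return "left"


def sense_signals(emitters):
    """Return one reading per sector from (bearing_deg, distance_cm, signal) tuples.

    Assumption: when several robots fall in the same sector, only the
    nearest one is heard; sectors with no emitter in range read 0.0.
    """
    readings = {"frontal": 0.0, "right": 0.0, "rear": 0.0, "left": 0.0}
    nearest = {}
    for bearing, distance, signal in emitters:
        if distance > MAX_RANGE_CM:
            continue  # out of range: not detected
        sector = comm_sector(bearing)
        if sector not in nearest or distance < nearest[sector]:
            nearest[sector] = distance
            readings[sector] = signal
    return readings
```

For example, `sense_signals([(10, 50, 0.45), (20, 30, 0.07)])` reports only the closer of the two frontal emitters under the nearest-emitter assumption.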
 

1 We are currently trying to implement these robots in hardware. The robots’ signal production and detection 
system will be based on Designer Systems DS-IRCM infrared wireless communication modules sold by 
Totalrobots, which allow the formation of a local bi-directional wireless network, and on appropriate software 
routines. Communication between modules only occurs within a given angular range and within a distance of up to 
10 meters. Interference between modules is prevented by hardware and software routines that check the 
intensity of incoming signals.  
 



Figure 1. Left: The environment and the robots. The square represents the arena surrounded by walls. The two 
grey circles represent two target areas. The four black circles represent four robots. Right: The neural controller 
of evolving robots. Internal neurons and recurrent connections are only included in one of the two experimental 
settings (see text). 

The robots’ neural controllers consist of neural networks with 14 sensory neurons (that 
encode the activation states of the corresponding 13 sensors and the activation state of the 
communication actuator at time t-1, i.e. each robot can hear its own emitted signal at the 
previous time step) directly connected to the three motor neurons that control the desired 
speed of the two wheels and the communication signal produced by the robot.  

In a first experimental setting, neural controllers did not include internal neurons and 
recurrent connections. In the second experimental setting, neural controllers also included two 
internal neurons with recurrent connections. In both cases the output of motor neurons was 
computed according to the logistic function (2). In the second experimental 
setting, the output of sensory and internal neurons was computed according to functions (3) 
and (4), respectively (for a detailed description of these activation functions and the relation 
with other related models, see Nolfi [2002]). We will call the robots of the former 
experimental setting reactive robots (since in their case motor actions can only be determined 
on the basis of the current sensory state, plus the copy of the communication neuron at 
time t-1) and the robots of the latter experimental setting non-reactive robots (since in their 
case the motor actions are also influenced by previous sensory and internal states).  
 
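\[ A_j = t_j + \sum_i w_{ij} O_i \tag{1} \]

\[ O_j = \frac{1}{1 + e^{-A_j}} \tag{2} \]

\[ O_j = \tau_j O_j^{(t-1)} + (1 - \tau_j) I_j \tag{3} \]

\[ O_j = \tau_j O_j^{(t-1)} + (1 - \tau_j) \left(1 + e^{-A_j}\right)^{-1} \tag{4} \]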
 
 


With Aj being the activity of the jth neuron, tj the bias of the jth neuron, wij the 
weight of the incoming connection from the ith to the jth neuron, Oi the output of the ith 
neuron, Oj(t-1) the output of the jth neuron at the previous time step, τj the time constant of the 
jth neuron, and Ij the intensity of the jth sensor.  

Robots were evolved for the ability to find and remain in the two target areas by 
subdividing themselves equally between the two areas. Each team of four robots was allowed 
to "live" for 20 trials, each lasting 100 seconds (i.e. 1000 time steps of 100 ms each). At the 
beginning of each trial the positions and orientations of the robots were randomly assigned 
outside the target areas. At each time step, the fitness of the team is the sum of a score of 0.25 
for each robot located in a target area and a score of -1.00 for each extra robot (i.e. each robot 
exceeding the maximum number of two) located in a target area. The total fitness of a team is 
computed by summing the fitness gathered by the four robots in each time step. 
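The scoring rule can be made concrete with a short sketch. It follows a literal reading of the description, in which an extra robot still counts as a robot located in a target area and therefore contributes both +0.25 and -1.00; this reading, the function names, and the representation of a robot's location as a target-area index or `None` are assumptions made for illustration.

```python
from collections import Counter

REWARD_PER_ROBOT = 0.25    # score for each robot located in a target area
PENALTY_PER_EXTRA = -1.00  # score for each robot beyond two in the same area
MAX_PER_AREA = 2


def step_fitness(area_of_robot):
    """Team fitness for one time step.

    area_of_robot holds, for each robot, the index of the target area
    it is on, or None if it is outside both areas.
    """
    counts = Counter(a for a in area_of_robot if a is not None)
    in_area = sum(counts.values())
    extras = sum(max(0, n - MAX_PER_AREA) for n in counts.values())
    return REWARD_PER_ROBOT * in_area + PENALTY_PER_EXTRA * extras


def trial_fitness(history):
    """Sum of the per-step team fitness over a trial."""
    return sum(step_fitness(step) for step in history)
```

Under this reading, four robots split two and two score 1.0 per time step, so dividing the trial total by the number of time steps gives a value whose upper bound is 1.0.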

The initial population consisted of 100 randomly generated genotypes that encoded the 
connection weights, the biases, and the time constants (in the case of non-reactive robots) of 
100 corresponding neural controllers (each parameter is encoded with 8 bits and normalized 
in the range [–5.0, +5.0], in the case of connection weights and biases, and in the range [0.0, 
1.0], in the case of time constants). Each genotype is translated into 4 identical neural 
controllers that are embodied in the four corresponding robots (i.e. teams are homogeneous 
and consist of four identical robots, for a discussion about this point and alternative selection 
schemas see Quinn, 2000, 2001; Quinn et al. 2003; Baldassarre et al. 2003). The 20 best 
genotypes of each generation were allowed to reproduce by generating five copies each, with 
2% of their bits replaced with a new randomly selected value. The evolutionary process lasted 
100 generations (i.e. the process of testing, selecting and reproducing robots is iterated 100 
times). The experiment was replicated 10 times for each of the two experimental conditions 
(reactive and non-reactive neural controllers).  
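The encoding and reproduction scheme can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the 8-bit value is assumed to be mapped linearly onto the parameter range, the 2% replacement is assumed to be applied independently to each bit (a replaced bit may keep its old value), and all function names are hypothetical.

```python
import random

BITS_PER_PARAM = 8


def decode(bits, low, high):
    """Map 8 bits linearly onto [low, high] (linear mapping is an assumption)."""
    value = int("".join(str(b) for b in bits), 2)
    return low + (high - low) * value / (2 ** BITS_PER_PARAM - 1)


def decode_genotype(genotype, ranges):
    """Decode consecutive 8-bit chunks, one (low, high) range per parameter."""
    params = []
    for k, (low, high) in enumerate(ranges):
        chunk = genotype[k * BITS_PER_PARAM:(k + 1) * BITS_PER_PARAM]
        params.append(decode(chunk, low, high))
    return params


def mutate(genotype, rate=0.02, rng=random):
    """Replace each bit with a new random bit with probability `rate`."""
    return [rng.randint(0, 1) if rng.random() < rate else b for b in genotype]


def next_generation(population, fitnesses, n_best=20, n_copies=5, rng=random):
    """Keep the n_best genotypes and produce n_copies mutated copies of each."""
    ranked = sorted(zip(fitnesses, population), key=lambda p: p[0], reverse=True)
    parents = [genotype for _, genotype in ranked[:n_best]]
    return [mutate(p, rng=rng) for p in parents for _ in range(n_copies)]
```

With 20 parents and five copies each, the population size stays at 100 from one generation to the next, matching the figures given above.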
 

Figure 2. The behaviour displayed by the team of evolved robots of one of the best replications for reactive (left) 
and non-reactive robots (right). The square and the grey circles indicate the arena and the target area 
respectively. Lines inside the arena indicate the trajectory of the four robots during a trial. The numbers indicate 
the starting and ending position of the corresponding robot (the ending position is marked with a white circle). 
 
By analysing the behaviour of one of the best teams of evolved robots for the two 
experimental conditions we can see how both reactive and non-reactive robots are able to find 
and remain in the two target areas by equally distributing themselves between the two. Figure 
2 (left) shows a typical behaviour exhibited by reactive robots. In this example robots #2 and 
#3 quickly reach two different empty target areas. Then, robot #1 joins robot #2 in the top-left 
target area. Robot #0 approaches and avoids the top-left target area (that already contains 
two robots) and later joins the bottom-left target area. In the example shown in the right part 
of Figure 2, which displays a typical behaviour exhibited by non-reactive robots, robots #2 and 
#3 quickly reach two different empty target areas. Later on, robot #1 and then robot #0 
approach and enter the bottom-right target area. As soon as the third robot (i.e. robot #0) 
enters the area, robot #1 leaves the bottom-right target area and, after exploring the 
environment for a while, enters and remains in the top-left target area.  

To determine whether evolving robots exploit the possibility to signal and to use other 
robots’ signals, and whether the type of neural architecture influences the obtained 
performance, we compared reactive and non-reactive robots in three conditions (see Figure 3): 
a normal condition, in which evolved teams were tested in the same condition in which they 
were evolved; a deprived condition, in which the same evolved teams were tested without being 
allowed to detect other robots’ signals (i.e. their communication sensors were always set to a 
null value); and a no-signal condition, in which the two sets of evolutionary experiments were 
replicated without allowing robots to detect other robots’ signals and the resulting robots 
were tested in the same deprived condition.  
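The deprived condition amounts to overwriting the four communication inputs before they reach the controller. A minimal sketch follows; the ordering of the 14 inputs (eight infrared, one ground, four communication, one copy of the robot's own previous signal) and the use of 0.0 as the null value are assumptions made for illustration, since the text does not specify them.

```python
# Assumed input layout: indices 0-7 infrared, 8 ground,
# 9-12 communication sensors, 13 own signal at time t-1.
COMM_SLICE = slice(9, 13)


def deprive(sensors, null_value=0.0):
    """Return a copy of the sensor vector with communication sensors nulled.

    The copy of the robot's own previous signal is left untouched,
    because only the detection of other robots' signals is disabled.
    """
    out = list(sensors)
    out[COMM_SLICE] = [null_value] * 4
    return out
```

The same masking, applied throughout evolution rather than only at test time, would correspond to the no-signal condition.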

  
[Bar chart: conditions Normal, Deprived and No-signals on the horizontal axis; vertical axis from 0 to 0.5.]
 

Figure 3. Average fitness of all teams of the last generations of 10 different replications of the experiments. Grey 
and black histograms represent the average fitness of reactive and non-reactive robots, respectively. Normal 
histograms represent the fitness obtained by testing the robot in a normal condition (in the same condition in 
which they have been evolved). Deprived histograms represent the fitness obtained by testing robots (evolved in 
a normal condition) in a control condition in which they are not allowed to detect other robots’ signals. No-
signals histograms represent the fitness obtained by evolving and testing robots in a control condition in which 
they are not allowed to detect other robots’ signals. Fitness values are normalized in the range [0.0-1.0], where 1.0 
corresponds to the case in which individuals spend the entire lifetime in target areas equally divided into two 
groups (i.e. a fitness value that cannot be reached in practice since robots first have to locate and reach the two 
target areas). In all cases, individuals have been tested for 1000 trials. 
 
Performance in the “Normal” condition is better than in the other two conditions. The 
difference is statistically significant (p < 0.001) both in the case of reactive and non-reactive 
robots. The fact that performance in the “Normal” condition is better demonstrates that robots 
use their ability to produce and detect signals.  

The fact that non-reactive robots outperform reactive robots in the normal condition (differences 
in performance are statistically significant) indicates that the possibility to integrate sensory-motor 
information through time is exploited by non-reactive individuals. Moreover, the fact 
that the differences in performance between reactive and non-reactive robots are not 
statistically significant in the “Deprived” and “No-Signal” conditions indicates that the 
possibility to integrate sensory-motor information through time is exploited by 
communicating individuals only. We will analyse the differences between robots’ individual 
and social behaviour and the relation between these two forms of behaviour in more detail in 
section 6. 

 
4. The evolved communication system: signals produced and their effects on other 
robots’ behaviour 

By analysing the teams of the best replication of the experiment we observed that evolved 
individuals developed a non-trivial communication system, both in the case of the reactive 
and non-reactive experiments. More specifically, evolving robots display an ability to develop 
a sort of lexicon (including 4-5 different signals), a perceptually grounded categorization of 
the physical and social world reflected by the different signals, an ability to appropriately 
modulate their motor behaviour on the basis of the signals detected, and an ability to appropriately 
modulate their signalling behaviour on the basis of the signals detected. 

In the next two subsections we will describe in detail the signals produced by reactive and 
non-reactive robots in different conditions and the effect of the detected signals on robots’ 
motor and signalling behaviours. As we will see, in the case of the best replication, reactive 
and non-reactive robots developed a similar signalling system. However, non-reactive robots 
outperform reactive robots in their ability to “use” the signals detected.  In section 5 we will 
describe the evolved communication modalities. Finally, in section 6, we will describe robots’ 
individual behaviour and the relation between individual and social/communicative 
behaviours.  

4.1 Experiment I – Reactive robots 
Reactive robots of the best replication (the same described in Figure 2, left) produce at least 
four different types of signals: 
 

(a) a signal A with a value of about 0.07 produced by robots located outside the target 
areas not interacting with other robots located inside target areas;  

(b) a signal B with a value of about 0.45 produced by robots located alone inside a 
target area;  

(c) a signal C, a highly varying signal with an average value of 0.25, produced by 
robots located inside a target area that also contains another robot; 

(d) a signal D with a value of about 0.01 produced by robots that are approaching a 
target area and are interacting with another robot located inside the target area.  

 
Robots receiving these four types of signals modify their motor and/or signalling behaviour 
on the basis of the signal received and on other available sensory information. More 
specifically:  
 
(1) robots located outside the target areas receiving signal A tend to modify their motor 
behaviour in a way that allows them to explore the environment more effectively, i.e. to 
find the target areas more quickly (see below);  

(2) robots located outside target areas receiving signal B tend to modify their motor 
behaviour (by approaching the robot emitting the signal and the corresponding target 
area) and their signalling behaviour (i.e. by producing signal D instead of signal A);  

(3) robots located outside the target areas receiving the signal C (i.e. the signal produced by 
two robots located inside a target area) modify their motor behaviour so as to move away 
from the signal source; 

(4) robots located inside the target areas that receive the signal C (i.e. the signal produced by 
two other robots located inside the target area) tend to modify their motor behaviour so as 
to exit from the target area. 

 
To verify the functionality of signal A, we measured the time elapsed until at least one robot 
of the team reaches one of the two target areas in a normal condition and in a control 
condition in which robots were not allowed to detect signals (i.e. in which the state of the four 
communication sensors of all robots was always set to a null value). By testing the best 
evolved team of robots in the two conditions, we observed that the time needed to reach the 
first target area is, on average, 6.727 s and 7.765 s in the normal and the control 
condition, respectively (grey bars in Figure 4). This indicates that signal A, produced by 
robots located outside target areas, allows the team to better explore the environment and, 
consequently, to reach the target areas more quickly.  
 

[Bar chart: conditions Normal and Deprived on the horizontal axis; vertical axis from 0 to 8.]
 

Figure 4. The average time elapsed (seconds) until at least one robot of the team reaches one of the two target 
areas in a normal condition (“Normal”) and in a control condition (“Deprived”) in which robots were not 
allowed to detect signals during the test. Black and grey bars represent the average time required by non-reactive 
and reactive robots, respectively. Average performance obtained by testing the robots for 1000 trials lasting 100 
seconds.  

 
To verify the functionality of signal B, we tested a team consisting of two robots placed in an 
environment including only a single target area (one of the robots was manually placed into 
the target area, while the other one was placed in a random position outside the area) in a 
normal condition and in a control condition in which robots were not allowed to detect signals 
(i.e. in which the state of the four communication sensors of all robots was always set to a 
null value). Testing the best evolved team of robots in the two conditions, we observed that 
the percentage of trials in which the robot placed outside was able to reach the target area 
within 100 seconds is 80.9% and 53.2% in the normal and control condition, respectively 
(grey bars in Figure 5). This demonstrates that robots detecting signal B modify their motor 
behaviour so as to approach the source of the signal. We will discuss the effect of signal B on 
robots’ signalling behaviour in the next section.  

 


[Bar chart: conditions Normal and Deprived on the horizontal axis; vertical axis from 0 to 100.]
 

Figure 5. Percentage of trials in which both robots were able to reach the target area. Tests of a team consisting 
of two robots placed in an environment including only a single target area in a normal condition (“Normal”) and 
in a control condition (“Deprived”) in which robots were not allowed to detect signals. Grey and black bars 
represent the performance of reactive and non-reactive robots, respectively. Average performance obtained by 
testing the robots for 1000 trials lasting 100 seconds.  
 
Robots located inside a target area produce signal B. However, two interacting robots located 
in the same target area reciprocally modulate their signalling behaviour so as to produce 
signal C (i.e. a highly varying signal with an average value of 0.25). Indeed, by placing two 
robots in a target area and preventing the first from detecting the signal produced by the second, 
we observed that the first robot produces signal B. To verify whether signal C reduces the 
chances that more than two robots enter the same target area, we tested a team consisting of 
three robots in an environment including only a single target area in a normal condition and in 
a “Deprived” control condition in which communication was disabled (i.e. in which the state 
of the four communication sensors of all robots was always set to a null value). At the 
beginning of each trial two robots are placed inside the target area and one robot is placed 
outside the target area with randomly selected positions and orientations. Testing the best 
evolved team of robots in the two conditions we observed that the percentage of trials in 
which the robot initially placed outside the target area erroneously enters the area is 4.8% and 
43.7%, in the normal and control condition, respectively (see grey bars in Figure 6).  

To verify whether signal C increases the chances that a robot exits from a target area that 
contains more than two robots we tested a team of three robots in an environment including 
only a single target area in a normal condition and in a “Deprived” control condition in which 
communication was disabled. At the beginning of each trial, all three robots were placed 
inside the target area with randomly selected positions and orientations. The percentage of 
times in which one of the three robots exits from the target area is 52.8% and 26.9% in normal 
and deprived conditions, respectively (see Figure 7, grey bars).  

 


[Bar chart: conditions Normal and Deprived on the horizontal axis; vertical axis from 0 to 100.]
 

Figure 6. Percentage of trials in which a third robot erroneously enters a target area that already contains two 
robots in a normal condition (“Normal”) and in a control condition (“Deprived”) in which robots were not 
allowed to detect signals. Grey and black bars represent the performance of reactive and non-reactive robots, 
respectively. Average performance obtained by testing the robots for 1000 trials lasting 100 seconds.  
 

[Bar chart: conditions Normal and Deprived on the horizontal axis; vertical axis from 0 to 100.]
 

Figure 7. Percentage of times in which one robot exits from a target area that contains three robots in a normal 
condition (“Normal”) and in a control condition (“Deprived”) in which robots were not allowed to detect 
signals. Grey and black bars represent the performance of reactive and non-reactive robots, respectively. 
Average performance obtained by testing the robots for 1000 trials lasting 100 seconds.  

4.2 Experiment II – Non-Reactive robots 
 
Non-reactive robots of the best replication (the same described in Figure 2, right) produce at 
least five different types of signals. The signals A, B, C and D are analogous to the 
corresponding signals produced by reactive robots (i.e. although the value of the signals 
varies, they are produced in the same circumstances and have functionally similar effects). 
More precisely, non-reactive robots produce the following signals:  
 
(a) a signal A with a value of about 0.42 produced by robots located outside the target areas 
that do not detect other robots’ signals;  

(b) a signal B with a value of about 0.85 produced by robots located alone inside a target 
area;  

(c) a signal C, a highly varying signal with an average value of 0.57, produced by robots 
located inside a target area that also contains another robot; 

(d) a signal D with a value of about 0.07 produced by robots outside target areas that are 
approaching a target area and are interacting with another robot located inside the target 
area;  

(e) a signal E, a highly varying signal with an average value of 0.33, produced by robots located 
outside the target areas interacting with other robots also located outside target areas.  

 
Robots receiving these five types of signals modify their motor and/or signalling behaviour on 
the basis of the signal received and of other available sensory information. More specifically:  
 
(1) robots located outside the target areas receiving signal A modify their signalling 
behaviour by producing signal E;  

(2) robots located outside target areas receiving signal B tend to modify their motor 
behaviour (by approaching the robot emitting the signal and the corresponding target 
area) and their signalling behaviour by producing signal D;  

(3) robots located outside the target areas receiving the signal C (i.e. the signal produced by 
two robots located inside the target area) modify their motor behaviour so as to move 
away from the signal source; 

(4) robots located inside target areas receiving the signal C (i.e. the signal produced by two 
robots located inside the target area) modify their motor behaviour so as to exit from the 
target area; 

(5) robots located outside the target areas receiving signal E tend to modify their motor 
behaviour to better explore the environment. 
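
The two lists above can be read as a stimulus-response table. The following is a minimal sketch of that table, not the evolved neural controller: the nominal values and the reported effects come from the lists, while the names `NOMINAL`, `RESPONSES`, `nearest_class` and `response`, and the nearest-value labelling rule, are assumptions made for illustration. Since signals C and E vary widely, labelling a single reading by its nearest nominal value is only a rough approximation.

```python
# Nominal (average) signal values reported for the non-reactive robots.
NOMINAL = {"A": 0.42, "B": 0.85, "C": 0.57, "D": 0.07, "E": 0.33}

# Reported effects, keyed by (receiver location, signal received).
# Each entry is (motor effect, signal the receiver switches to, or None).
RESPONSES = {
    ("outside", "A"): ("no motor effect reported", "E"),
    ("outside", "B"): ("approach the emitter and its target area", "D"),
    ("outside", "C"): ("move away from the signal source", None),
    ("inside", "C"): ("exit from the target area", None),
    ("outside", "E"): ("explore the environment more effectively", None),
}


def nearest_class(value: float) -> str:
    """Label a signal value with the closest nominal class (hypothetical rule)."""
    return min(NOMINAL, key=lambda name: abs(NOMINAL[name] - value))


def response(location: str, value: float):
    """Return the reported (motor effect, new signal) pair, or None if unreported."""
    return RESPONSES.get((location, nearest_class(value)))


if __name__ == "__main__":
    for loc, val in [("outside", 0.85), ("outside", 0.40), ("inside", 0.60)]:
        print(loc, val, nearest_class(val), response(loc, val))
```

Keeping the table separate from the labelling rule makes it easy to see that the same signal (C) has different effects depending on where the receiver is located.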

 
The fact that signals A and E produced by robots located outside target areas allow them to 
explore the environment more effectively (i.e. to find target areas more quickly) is 
demonstrated by the fact that the average time in which the first robot enters one of the two 
target areas is 5.922s and 6.478s in normal and deprived conditions, respectively (see the 
black bars in Figure 4). By testing the best teams of the other replications of the experiment, 
similar results were observed in most cases (results not shown). Overall, these results 
indicate that robots exploit their signalling behaviour to produce a form of coordinated 
exploration that increases their ability to quickly find the target areas. To identify the relative 
roles of the two signals we ran an additional test in which robots were allowed to produce and 
detect signal A but were not allowed to switch from signal A to E (they were forced to 
produce signal A even when they started to detect the signal A produced by other robots). The 
obtained result (i.e. an average time of 6.952s) indicates that the functionality is provided by 
signal E, while the role of signal A is to trigger the production of signal E. 

The fact that signal B increases the chances that other robots enter the target area from 
which the signal is produced is demonstrated by the fact that the percentage of trials in which 
a robot placed outside the target area enters a target area that already contains a single 
robot is 97.2% and 75.4% in the case of robots tested in normal and deprived conditions, 
respectively (see the black bars in Figure 5).  

The fact that signal C reduces the chances that other robots enter a target area that 
already contains two robots is demonstrated by the fact that the percentage of times in which 
a third robot joins a target area that already contains two other robots is 2.3% and 82.6% in 
normal and deprived conditions, respectively (see Figure 6, black bars). 

The fact that signal C increases the chances that a robot exits from a target area that 
contains more than two robots is demonstrated by the fact that the percentage of times in 
which one of three robots located in the same target area exits the area is 84.6% and 2.7% in 
normal and deprived conditions, respectively (see Figure 7, black bars). 
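
To make the size of these effects easier to compare, the short sketch below tabulates the normal and deprived figures quoted in the preceding paragraphs and computes their differences. All numbers are copied from the text; the dictionary layout and the helper names `difference` and `slowdown_percent` are illustrative assumptions.

```python
# Values copied from the text: (normal, deprived) for the non-reactive robots.
RESULTS = {
    "time for the first robot to enter an area (s)": (5.922, 6.478),
    "robot joins an area holding one robot (%)": (97.2, 75.4),
    "third robot joins an area holding two robots (%)": (2.3, 82.6),
    "one of three robots exits the area (%)": (84.6, 2.7),
}

# Average time when robots were forced to keep producing signal A (s).
FORCED_A_TIME = 6.952


def difference(normal: float, deprived: float) -> float:
    """Deprived minus normal, rounded to three decimals."""
    return round(deprived - normal, 3)


def slowdown_percent(normal: float, other: float) -> float:
    """Relative increase of `other` over `normal`, as a percentage."""
    return round(100.0 * (other - normal) / normal, 1)


if __name__ == "__main__":
    for name, (normal, deprived) in RESULTS.items():
        print(f"{name}: normal={normal}, deprived={deprived}, "
              f"difference={difference(normal, deprived):+}")
    print("exploration slowdown, deprived:", slowdown_percent(5.922, 6.478), "%")
    print("exploration slowdown, forced A:", slowdown_percent(5.922, FORCED_A_TIME), "%")
```

Under this reading, removing signals slows the first entry into a target area by roughly a tenth, whereas the effects on joining and leaving crowded areas are far larger.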



The functionality of signal D (both in the case of non-reactive and reactive robots) and 
more generally the functionality of the effects that signals detected have on the type of signals 
produced will be discussed in the next section. 

5. The evolved communication system: communication modalities 

Evolving robots might rely on mono or bi-directional communication forms. In mono-
directional communication forms, the motor behaviour or the signal produced by one 
individual affects the behaviour of a second individual but the behaviour of the latter 
individual does not alter the behaviour of the former individual. In these forms of 
communication, the two robots play the role of the ‘speaker’ and of the ‘hearer’, respectively, 
and communication can be described as a form of information exchange (in which 
communication allows the ‘hearer’ to access information that is available to the ‘speaker’) or 
as a form of manipulation (in which the ‘speaker’ alters the behaviour of the ‘hearer’ in a way 
that is useful to the ‘speaker’ or to both the ‘speaker’ and the ‘hearer’). In bi-directional 
communication forms, on the other hand, the motor or signalling behaviour of one individual 
affects the second individual and vice versa. In these forms of communication each robot 
plays both the role of the ‘speaker’ and of the ‘hearer’ (i.e. different roles cannot be 
identified). 

Another important aspect that characterizes communication forms is whether they consist 
of static or dynamical processes. In static communication forms, the signal produced by an 
individual is only a function of the current state of the individual. In dynamic communication 
forms, instead, the signal produced at a given time step is also a function of the signals 
produced and detected previously. As an example of a static communication form we might 
consider the case of a robot emitting an alarm signal continuously (until the danger 
disappears). As an example of a dynamic communication form we might consider the case of 
two individuals that alternately play the role of the speaker and of the hearer by taking turns 
(Iizuka and Ikegami, 2003a, 2003b). Bi-directional and dynamical communication forms 
might lead to emergent properties (e.g. synchronization or shared attention) that result from 
the mutual interaction between two or more individuals and that cannot be explained by the 
sum of the individual contributions only (Di Paolo, 2000).  
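
The distinction between static and dynamic forms can be illustrated with a toy example. The sketch below is not the model of Iizuka and Ikegami nor the controller evolved here; the alarm function, the `TurnTaker` class, the 0.5 threshold and the one-step delay are all assumptions chosen to keep the example small. The static emitter maps its current state directly to a signal, whereas each turn-taker keeps an internal role that depends on what it previously emitted and heard, so the same current input can produce different outputs.

```python
def static_signal(in_danger: bool) -> float:
    """Static form: the output depends only on the emitter's current state."""
    return 1.0 if in_danger else 0.0


class TurnTaker:
    """Dynamic form: the output depends on what was previously emitted and heard."""

    def __init__(self, speaking: bool):
        self.speaking = speaking

    def step(self, heard: float) -> float:
        # Emit according to the role held at the start of this step.
        out = 1.0 if self.speaking else 0.0
        if self.speaking:
            self.speaking = False   # hand over the turn after speaking
        elif heard > 0.5:
            self.speaking = True    # take the turn after hearing the partner
        return out


def run(steps: int = 6):
    """Let two turn-takers interact; each hears the other's previous output."""
    a, b = TurnTaker(True), TurnTaker(False)
    heard_a = heard_b = 0.0
    trace = []
    for _ in range(steps):
        out_a, out_b = a.step(heard_a), b.step(heard_b)
        heard_a, heard_b = out_b, out_a
        trace.append((out_a, out_b))
    return trace


if __name__ == "__main__":
    print(static_signal(True), static_signal(False))
    print(run())
```

In the resulting trace the two agents alternate as speaker, with one silent step between turns, which is a pattern that neither agent could produce alone.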

In the experiments reported in this paper the modalities that regulate communicative 
behaviours are not predefined but vary within evolving individuals. Indeed, as we will see, 
evolved agents use different communication modalities in different circumstances.  

To describe the communication modalities used, let us consider a simplified situation in 
which a team consisting of two robots is placed in an arena that includes only a single target 
area. Figure 8 (left) and Figure 9 show the typical motor and signalling behaviour exhibited 
by two reactive robots. Initially the two robots are both outside the target area and produce a 
signal with a value of about 0.07 (signal A). When the two robots get close enough to 
detect each other’s signal, they slightly change their motor trajectory so as to increase 
their chances of ending up in a target area. Robot #0 reaches the target area and starts to 
produce a signal with a value of about 0.45 (signal B). Once robot #1 gets close enough to 
robot #0 (i.e. as soon as it starts to detect the signal B produced by robot #0) it modifies its 
trajectory so as to approach the source of the signal and it starts to produce signal D (i.e. a 
signal with an almost null value). Later on, when robot #1 enters the area, the two robots start to 
produce a highly varying signal with a value of about 0.25 (signal C). Signal C affects the 
motor behaviour of robots located outside the target area (which tend to avoid the target area) 
and inside the target area (which tend to exit from areas that contain more than two robots). 
Signal C also affects the signalling behaviour of other robots located inside a target area. 
Indeed, robots located inside a target area switch their signalling behaviour from B to C when 
they detect the signal produced by another robot located in the same target area.  

 
Figure 8. The behaviour of two robots tested in an arena including a single target area. The dashed and full lines 
represent the trajectory of robot #0 and #1, respectively. The numbers indicate both the starting and ending 
positions of the corresponding robots. Left: typical behaviour exhibited by a reactive robot.  Right: typical 
behaviour exhibited by a non-reactive robot.  
 

[Figure 9 plot. Vertical axis: signal intensity (0 to 1). Horizontal axis: lifecycles (0 to 1500). 
Phase labels: “0 & 1 out”, “0 in, 1 out”, “0 & 1 in”. Signal labels: A, B, C, D.] 
 
Figure 9. Values of the signals produced by the two reactive robots during the behaviour shown in the left side 
of Figure 8. Dashed and full lines indicate the values of the signals produced by robot #0 and #1, respectively. 
Letters (A, B, C and D) indicate the 4 classes of signals produced by the robots. The black lines at the bottom of 
the figure indicate the three phases in which: (1) both robots are outside the target area, (2) robot #0 is in and 
robot #1 is out, and (3) both robots are inside the target area. The grey lines at the bottom of the figure indicate 
the phases in which the two robots are located within the signal range. The short signals produced by 
robots outside target areas are emitted when they detect an obstacle through their infrared sensors. These 
signals do not seem to play any functional role. 

As shown in Figure 8 (right) and Figure 10, non-reactive robots rely on similar 
communication modalities. Initially the two robots are both outside the target area and 
produce a signal with a value of about 0.42 (signal A). As soon as the two robots get close 
enough to detect each other’s signals, they produce a varying signal with an average 
value of 0.33 (signal E) and they vary their motor trajectory by increasing their turning angle 
so as to increase their chance of entering a target area. After some time robot #0 reaches the 
target area and starts to produce a signal with a value of about 0.85 (signal B). Later on, once 
robot #1 returns close enough to robot #0 and detects the signal B produced by robot #0, it 
modifies its motor trajectory (by approaching robot #0) and its signalling behaviour (by 
producing signal D, i.e. a signal with an almost null value, instead of signal A). When 
robot #1 also enters the area, the two robots start to produce a varying signal with an average value 
of about 0.57 (signal C). This signal reduces the probability that other robots will enter the 
area and, if an additional robot erroneously joins the area, increases the probability 
that one of the robots exits from the area.  

 
[Figure 10 plot. Vertical axis: signal intensity (0 to 1). Horizontal axis: lifecycles (0 to 1000). 
Phase labels: “0 & 1 out”, “0 in, 1 out”, “0 & 1 in”. Signal labels: A, B, C, D, E.] 

 
Figure 10. Values of the signals produced by the two robots provided with hidden units during the behaviour 
shown in the right side of Figure 8. Dashed and full lines indicate the values of the signals produced by robot #0 
and #1, respectively. Letters (A, B, C, D and E) indicate the 5 classes of signals produced by the robots. The 
black lines in the bottom part of the figure indicate the three phases in which: (1) both robots are outside the target 
area, (2) robot #0 is in and robot #1 is out, and (3) both robots are inside the target area. The grey line in the 
bottom part of the figure indicates the phases in which the two robots are located within the signal range. The 
short signals produced by robots outside target areas are emitted when they detect an obstacle through 
their infrared sensors. These signals do not seem to play any functional role. 

By analysing the functionality of the different signals and the context in which they are used, 
we can see how evolved robots use different communication modalities and select on the fly 
the modality that is appropriate for the current situation. 

The situation in which one robot is located inside a target area and another robot is 
located outside, within the communication range, is a case in which the former robot has 
access to information (related to the location of the target area) to which the latter robot does 
not have access. In this particular case, communication should be mono-directional since 
the latter robot should change its behaviour on the basis of the signal produced by the former 
robot while the former robot need not change its motor or signalling behaviour 
as a result of the signal produced by the latter robot. Indeed in this situation the evolved 
robots rely on a mono-directional communication form in which the former robot produces 
the signal B and the latter robot switches its signalling behaviour off by producing the signal 
D (i.e. a signal with an almost null value). This communication interaction can thus be 
described as an information exchange in which the former robot (the speaker) 
produces a signal that encodes information related to the location of the target area and the 
latter robot (the hearer) exploits this information to navigate toward the area. 
Alternatively, this communication interaction can be described as a form of manipulation in 
which the former robot (the speaker) manipulates the motor behaviour of the latter robot (the 
hearer) so as to drive it toward the target area.  

The ability of robots located outside target areas to switch their signalling behaviour off 
(i.e. to produce the signal D) as soon as they detect the signal B plays an important function 
both in the case of reactive and non-reactive robots. Indeed, by testing a team of two robots in 
an environment including a single target area, in a normal condition and in a control condition 
in which robots were prevented from switching between signal A and D, we 
observed that performance in the control condition is much worse. More precisely, the 
percentage of trials in which both robots were able to reach the target area within 100 seconds 
drops from 80.9% to 22.5% (in the case of reactive robots) and from 97.2% to 23.8% (in the 
case of non-reactive robots) between the normal and control conditions (Figure 11).  
 

[Figure 11 bar chart. Vertical axis: percentage of trials (0 to 100). Conditions: “Normal”, “No Modulation”.] 
 

Figure 11. Percentage of trials in which a team of two robots randomly placed in an environment with only one 
target area are both able to enter the target area. Tests were performed in a normal condition and in a control 
condition in which robots outside the target area were not allowed to switch their signalling behaviour from A to D.  
Grey and black bars represent the performance of reactive and non-reactive robots, respectively. Average 
performance obtained by testing the robots for 1000 trials lasting 100 seconds. 
 
On the contrary, when two robots are located in the same target area, neither of the two robots 
has access to the relevant information (i.e. the fact that the target area contains two robots).  
This information, however, can be generated by the interaction between the two robots 
through a bi-directional communication modality. This is indeed the communication modality 
that is selected by evolved robots in this circumstance. The signal produced by one of the two 
robots affects the signal produced by the second robot, and vice versa. This bi-directional 
interaction allows the two robots to switch from signal B (i.e. a signal that increases the 
chances that other robots will join the area) to signal C (i.e. a signal that decreases the 
chances that other robots will join the area).  

Interestingly, in this circumstance evolved robots also rely on a dynamical 
communication modality, i.e. they produce signals that vary in time as a result of signals 
previously produced and detected by the two robots. More precisely, in the case of non-
reactive robots, the signal C tends to vary in time as a result of the following factors: (1) the 
value of the signal detected inhibits the signal produced, (2) the intensity of the inhibition also 
depends on the direction of the detected signal, (3) the signal tends to be detected from constantly 
varying relative directions since robots located inside a target area turn on the spot.  
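
The three factors listed above can be combined in a toy simulation to show how a time-varying signal may arise from mutual inhibition. This is only a sketch under strong assumptions and not the evolved neural controller: the starting level of 0.85 is the value of signal B reported in the text, whereas the inhibition gain, the turning rate, the cosine-shaped direction weighting and the function name `simulate` are all hypothetical choices.

```python
import math


def simulate(steps: int = 100, base: float = 0.85, gain: float = 0.6,
             turn_rate: float = 0.3):
    """Two robots turning on the spot and inhibiting each other's signal."""
    signals = [base, base]          # both start at the 'alone in the area' level
    headings = [0.0, math.pi / 2]   # arbitrary initial orientations (radians)
    history = []
    for _ in range(steps):
        new_signals = []
        for i in (0, 1):
            heard = signals[1 - i]
            # (2) The inhibition strength depends on the relative direction
            # from which the partner's signal is detected (assumed weighting
            # in [0, 1], with the partner at a fixed bearing).
            weight = 0.5 * (1.0 + math.cos(headings[i]))
            # (1) The detected signal inhibits the signal produced.
            out = base - gain * weight * heard
            new_signals.append(min(1.0, max(0.0, out)))
        signals = new_signals
        # (3) Turning on the spot keeps changing the relative direction.
        headings = [(h + turn_rate) % (2 * math.pi) for h in headings]
        history.append(tuple(signals))
    return history


if __name__ == "__main__":
    for step, (s0, s1) in enumerate(simulate(steps=25)):
        print(f"{step:2d}  robot0={s0:.3f}  robot1={s1:.3f}")
```

Even with constant parameters, each robot's output rises and falls as it rotates, because the weight applied to the partner's signal changes with the heading.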



The production of an oscillatory signal with an average value of 0.57 (in the case of non-
reactive robots) in this situation, rather than a stable non-dynamical signal, plays an important 
functional role. Indeed, we observed that evolved robots rely on oscillatory signals in all the 
replications of the experiment (both in the case of reactive and non-reactive robots). 
Moreover, we observed that stable signals do not allow the robots to reach the same level of 
performance. To ascertain whether the production of a stable signal could provide the same 
functionality as this oscillatory signal we performed a test in which non-reactive robots were 
forced to emit a stable signal when located in a target area that contained two robots. Robots 
were allowed to behave normally in all other cases. The test was repeated 10 times by using 
stable signals with 10 different values ranging from 0.1 to 1.0. The fact that, as shown in 
Figure 12, the performance obtained is always lower than that obtained by allowing the 
robots to produce the oscillatory signal confirms that the dynamical nature of the signal is 
functional.  

 
[Figure 12 plot. Vertical axis: fitness (0 to 0.8). Horizontal axis: value of the stable signal (0 to 1).] 
 

Figure 12. The continuous line represents the fitness values obtained by a team of non-reactive robots tested in a 
control condition in which robots are forced to emit a stable signal when they are located in a target area that 
contains two or more robots. The vertical and horizontal axes represent the fitness and the value of the signal, 
respectively.  The dashed line represents a benchmark showing the value of the fitness reached by a team tested 
in the normal condition. For each condition robots are tested for 1000 trials. 
 
One reason that might explain the necessity to rely on an oscillatory signal in this 
circumstance is the fact that the signal C has at least three different functions: it informs other 
robots located in the target area of the presence of the signalling robot, it reduces the 
probability that other robots enter the target area, and it increases the probability that, when 
the target area contains more than two robots, one of the robots will exit the area. Indeed, by 
analysing the behaviour of the robots in the test in which robots were forced to produce 
signals with a fixed value we observed that: (a) when the value of the signal is below 0.7, 
robots tend to erroneously exit from the target area even when the area includes only two 
robots, and (b) when the value of the signal is 0.7 or above, both the ability to reduce the 
chances that other robots join the target area and the ability to increase the chances that 
robots exit from the target area (when the area contains more than two robots) are severely 
impaired. Another possible reason that might explain the necessity to produce an oscillatory 
signal is the fact that the signal C must produce the same effect (i.e. reduce the chances that 
other robots enter the target area) whether the signal is produced by two or by three 
interacting robots located in the same target area, but two different effects (i.e. increase the 
chances that one robot exits from the target area or not) when the signal is produced by three 
or two robots located in the same target area, respectively. 

The frequency of oscillation of signal C differs depending on whether the signal is produced by two or 
three robots located in the same target area. Indeed, by analysing the signals produced in these 
two circumstances with a Fourier Transform, we observed that the frequency and the 
spectrograms are different in the two cases (see Figure 13). These two signals, C1 and C2, 
emitted by a robot located in an area with one or two other robots, respectively, have 
different functions since signal C2 increases the chances that one of the robots exits from the 
target area while signal C1 does not. 
 
 
Figure 13. The two spectrograms (obtained by the Fourier Transform) of the signal C emitted by a robot. The 
left and right pictures correspond to the signal produced by a robot located in a target area that contains one 
or two other robots, respectively. The sampling frequency is 10 Hz, since each communication output is 
emitted by a robot every 0.1 seconds. Therefore, the frequency components range over [0, 5] Hz. 
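
The relation between the 10 Hz sampling rate and the [0, 5] Hz range follows from the Nyquist limit, and can be checked with a plain discrete Fourier transform. The sketch below uses only the Python standard library and a synthetic test signal; the 1 Hz oscillation, its 0.2 amplitude and the helper names are made up for the demonstration and are not measurements from the experiment. Only the 10 Hz sampling rate and the 0.57 mean are taken from the text.

```python
import cmath
import math

SAMPLE_RATE_HZ = 10.0  # one communication output every 0.1 s (from the text)


def amplitude_spectrum(samples):
    """Return (frequencies in Hz, amplitudes) for DFT bins 0 .. n // 2."""
    n = len(samples)
    mean = sum(samples) / n
    centred = [s - mean for s in samples]   # remove the constant offset
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        freqs.append(k * SAMPLE_RATE_HZ / n)
        amps.append(abs(coeff) / n)
    return freqs, amps


def dominant_frequency(samples):
    """Frequency (Hz) of the bin with the largest amplitude."""
    freqs, amps = amplitude_spectrum(samples)
    return freqs[max(range(len(amps)), key=amps.__getitem__)]


# Synthetic stand-in for an oscillating signal: mean 0.57, 1 Hz oscillation.
demo = [0.57 + 0.2 * math.sin(2 * math.pi * 1.0 * t / SAMPLE_RATE_HZ)
        for t in range(100)]

if __name__ == "__main__":
    freqs, _ = amplitude_spectrum(demo)
    print("highest resolvable frequency (Hz):", freqs[-1])
    print("dominant frequency (Hz):", dominant_frequency(demo))
```

With 100 samples the frequency resolution is 0.1 Hz, so two signals oscillating at different rates, such as C1 and C2, would show peaks in different bins.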

6. Relation between individual and social/communicative behaviour  

Since the robots’ individual and social behaviours co-evolve, we might wonder what the 
relation between these two forms of behaviour is and how the possibility to co-adapt these 
forms of behaviour is exploited in evolved individuals. 

The fact that the performance of robots that are tested in the “Deprived” control condition 
is similar to that of robots evolved and tested in a “No-signal” control condition (see Figure 3) 
indicates that evolved robots develop an effective individual behaviour (i.e. a behaviour that 
maximizes the performance that can be achieved without signalling) even if they have always 
been evaluated in a normal condition (in which signals are available). The adaptive pressure 
toward the development of an effective individual behaviour can be explained by considering 
that the social enhancement that can be achieved by exploiting the signal produced by the 
other robots is not always guaranteed. Indeed, the availability of the signals required depends on 
the presence of other robots in the right environmental locations which, in turn, is influenced by 
unpredictable variables such as the initial positions and orientations of the robots.  

By analysing the behaviour displayed by evolved robots tested in the “Deprived” control 
condition (Figure 15), we can see that both reactive and non-reactive robots are able to spend 
about 60% of their lifetime in the three most favourable conditions (in which the team gathers 
a fitness of 0.5, 0.75, or 1.0) and less than 10% of their lifetime in the two least favourable 
conditions (in which the team gathers a fitness of -0.25 or -1.0). This performance is 
achieved through a simple behaviour (see Figure 14) that includes the following elementary 
behaviours: (a) when robots approach walls or other robots they avoid the obstacles by 
turning approximately 90°; (b) when the robots are far from walls and are not located in target 
areas they move by producing a curvilinear trajectory; (c) when the robots are located in a 
target area, they remain in the area by turning on the spot.  

Reactive and non-reactive robots mainly differ with respect to the curvilinear trajectories 
produced far from walls, which lead to smaller and larger semi-circles in the case of reactive 
and non-reactive robots, respectively. These larger semi-circular trajectories allow non-
reactive robots to find the target areas much more quickly than reactive robots. 
Indeed, the percentage of lifecycles in which all four robots are located in the target areas 
is about 17% and 44%, on average, in the case of reactive and non-reactive robots, respectively 
(see Figure 15). The fact that non-reactive robots are better at finding the target areas, however, does not 
translate into better performance (in the “Deprived” condition) since non-reactive robots are 
more likely to spend their lifetime both in positive conditions (in which each target area 
contains one or two robots) and negative conditions (in which a target area contains three or 
four robots). As a consequence, although non-reactive robots display a better exploration 
strategy than reactive robots, the overall performance of reactive and non-reactive robots is 
similar in “Deprived” conditions. 
 

Figure 14. Typical behaviour displayed by the team of evolved robots in a “deprived” condition in which they 
are not allowed to detect other robots’ signals for reactive (left) and non-reactive robots (right). The numbers 
indicate the starting and ending position of the corresponding robot (the ending position is also marked with a 
white circle). 

 
[Figure 15 bar chart. Vertical axis: fraction of lifecycles (0 to 0.3). Horizontal axis: team state 
(void, 1, 2, 1+2, 2+2, 1+3, 3, 4).] 



Figure 15. Percentage of lifecycles spent by a team of four robots in the 8 possible different states tested in a 
“Deprived” condition in which robots are not allowed to detect other robots’ signals. “Void” indicates the case 
in which all four robots are located outside target areas (fitness = 0.0). “1” indicates the case in which only a 
single robot is located in a target area (fitness = 0.25). “2” indicates the case in which two robots are located in 
target areas (fitness = 0.5). “1+2” indicates the case in which one robot is located in a target area and two other 
robots are located in the other target area (fitness = 0.75). “2+2” indicates the case in which each of the two target areas 
contains two robots (fitness = 1.0). “1+3” indicates the case in which one target area contains one robot and the other 
contains three robots (fitness = 0.0). “3” indicates the case in which three robots are located in the same target area 
(fitness = -0.25). “4” indicates the case in which four robots are located in the same target area (fitness = -1.0). 
Grey and black bars represent the performance of reactive and non-reactive robots, respectively. Average 
performance obtained by testing the robots for 1000 trials lasting 1000 cycles. 
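
The fitness values listed in the caption of Figure 15 can be reproduced by a small lookup, sketched below. Treating the team fitness as the sum of one contribution per target area is an assumption made here because it matches every value listed; it is not claimed to be the fitness function used during evolution. The value 0.5 for two robots comes from the three most favourable conditions named in the text (0.5, 0.75 and 1.0), and the names `AREA_SCORE`, `STATES` and `team_fitness` are illustrative.

```python
# Contribution of one target area as a function of the robots it contains.
# Chosen so that the sums match the fitness values given for Figure 15;
# the additive form itself is an assumption.
AREA_SCORE = {0: 0.0, 1: 0.25, 2: 0.5, 3: -0.25, 4: -1.0}

# The eight team states, as (robots in one area, robots in the other area).
STATES = {
    "void": (0, 0), "1": (1, 0), "2": (2, 0), "1+2": (1, 2),
    "2+2": (2, 2), "1+3": (1, 3), "3": (3, 0), "4": (4, 0),
}


def team_fitness(robots_in_first_area: int, robots_in_second_area: int) -> float:
    """Sum the per-area contributions for a team of four robots."""
    if robots_in_first_area + robots_in_second_area > 4:
        raise ValueError("the team has only four robots")
    return AREA_SCORE[robots_in_first_area] + AREA_SCORE[robots_in_second_area]


if __name__ == "__main__":
    for name, (first, second) in STATES.items():
        print(f"{name:>4}: fitness = {team_fitness(first, second)}")
```

Under this reading, two robots sharing one area and two robots occupying one area each both yield 0.5, which is consistent with the caption listing a single state “2”.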
 
When we look at the relation between individual and social behaviour, however, we can see 
that the characteristics of the individual behaviour exhibited by reactive and non-reactive 
robots perfectly match with the characteristics of their social/communicative behaviour. In 
other words, individual and social behaviour are tightly co-adapted.  

The fact that reactive robots display a ‘sub-optimal’ exploration behaviour (i.e. a 
behaviour that does not maximize the probability of quickly finding and entering target areas) 
can be explained by considering their limited ability to avoid target areas that already contain 
two robots and to exit from target areas that contain more than two robots (see Figures 6 and 
7). A reliable ability to avoid situations in which more than two robots are located in the 
same target area thus constitutes a pre-requisite for the emergence of a better exploration 
ability. The lack of this pre-requisite in reactive individuals explains why their exploration 
ability is not further optimised. 

On the other hand, the better capability of non-reactive robots to avoid situations in 
which more than two robots are located in the same target area (see Figures 6 and 7) explains 
why evolved non-reactive robots developed a more effective exploration behaviour. Indeed, if 
we look at the time spent by robots in target areas that contain more than two robots, we can 
see that in a “Deprived” condition, non-reactive robots are much worse than reactive robots 
(Figure 15). In normal conditions, instead, non-reactive robots are much better than reactive 
robots (Figure 16).  
 

[Figure 16 bar chart. Vertical axis: fraction of lifecycles (0 to 0.4). Horizontal axis: team state 
(void, 1, 2, 1+2, 2+2, 1+3, 3, 4).] 
 

Figure 16. Percentage of lifecycles spent by a team of four robots in the 8 possible different states (see the 
legend of Figure 15) tested in a normal condition. Grey and black bars represent the performance of reactive and 
non-reactive robots, respectively. Average performance obtained by testing the robots for 1000 trials lasting 
1000 cycles. 
 

7. Discussion 
 
In this paper we have described the results of an experiment in which an effective 
communication system arises among a collection of initially non-communicating agents 
evolved for the ability to solve a collective navigation problem.  

By analysing the obtained results we observed how evolving individuals developed: (a) 
an effective communication system, (b) an effective individual behaviour, (c) an ability to rely 
on different communication modalities and to autonomously select the modality that is 
appropriate to the current circumstances. 

The communication system that emerges in the experiments is based on 4-5 different 
signals that characterize crucial features of the environment, of agent/agent relations, 
and of agent/environment relations (e.g. the relative location of a target area, the number of 
agents contained in a target area, etc.). These features, which have been discovered 
autonomously by the agents themselves, are grounded in the agents’ sensory-motor experiences. 
The signals used, therefore, refer not only to the characteristics of the physical environment 
but also to those of the social environment constituted by the other agents and by their current 
state. Evolved individuals also display an ability to appropriately tune their individual and 
communicative behaviour on the basis of the signals detected (e.g. by approaching, avoiding, 
or exiting a target area, by modifying their exploratory behaviour, etc.). Indeed, the type of 
signals produced, the context in which they are produced, and the effect of signals detected 
constitute three interdependent aspects of the communication system that co-adapt during the 
evolutionary process and co-determine the ‘meaning’ and the efficacy of each signal and of 
the communication system as a whole. 



The individual behaviour of evolved robots includes simple elementary behaviours that 
allow robots to avoid obstacles, explore the environment, and remain in target areas. 
Interestingly, the robots’ individual behaviour tends to be optimised (with respect to the possibility 
of obtaining the best possible performance when signals produced by other robots are not 
available) despite the fact that robots are always evaluated in social conditions during the 
evolutionary process. This unexpected result might be explained by considering that the required 
signals might not always be available (even in normal conditions in which robots are allowed 
to communicate) since their availability depends on the physical location of the robots in the 
environment, which in turn depends on unpredictable events such as the robots’ initial positions and 
orientations or noise. In other words, optimised individual behaviours guarantee good 
performance even when the required signals are not available. This tendency to optimise both 
individual and social behaviour leads to the development of control systems structured 
hierarchically according to a layered organization in which the individual abilities represent 
the most basic layer and communication/social ability represents a higher level layer that 
modulates the lower level. 

The fact that communication abilities represent a high level structure that modulates the 
basic individual behaviours of the robots does not prevent evolving robots from co-adapting their 
individual and communicative behaviour. Indeed, by comparing the results of different 
replications of the experiments, we observed that individual behaviours tend to be selected in 
order to maximize individual performance (when signals from other robots are not available) 
but also in order to maximize the performance that can be achieved by combining the robots’ 
individual and social capabilities.  

Evolved robots also exploit different communication modalities (e.g. mono-directional 
communication forms in which one robot acts as a ‘speaker’ and a second robot acts as a 
‘hearer’ or bi-directional communication forms in which two robots concurrently influence 
each other through their signalling and/or motor behaviour) by selecting the modality that is 
appropriate to each specific communicative interaction. Evolving individuals also engage in 
complex communication behaviours that involve three different robots that concurrently 
affect each other so as to produce appropriate collective behaviours (e.g. so as to push one of the 
three robots located inside the same target area out of the area). Evolved robots also exploit 
time-varying signals that allow them to generate information that is not available to any single 
robot (e.g. information related to how many robots are located in a target area) and that serve 
different functions. 

The analysis of the evolutionary dynamics suggests that new individual capabilities might 
represent a crucial pre-requisite for the development of new communication capabilities 
and vice versa. For example, the individual ability to explore the environment by entering and 
remaining in target areas represents a crucial pre-requisite for the development of an ability to 
produce signal B, which attracts other robots toward the target area. On the other hand, the 
emergence of a social/communicative ability to avoid target areas that contain two robots and 
to exit from areas that contain more than two robots represents a crucial pre-requisite for the 
development of better individual exploration strategies. In fact, as we showed in section 6, 
very effective exploration strategies provide an adaptive advantage only in combination with 
effective communication systems that allow robots to avoid situations in which more than 
two robots are located in the same target area. This process, in which progress in individual 
abilities might lay the basis for progress in communication abilities 
and vice versa, might lead to an open-ended evolutionary phase in which individuals tend to 
develop progressively more complex and effective strategies.  


Acknowledgments 

The research has been supported by the ECAGENTS project funded by the Future and 
Emerging Technologies programme (IST-FET) of the European Community under EU R&D 
contract IST-1940.   
 

References 

Baldassarre G., Nolfi S. & Parisi D. (2003). Evolving mobile robots able to display collective 
behaviour. Artificial Life, 9: 255-267. 

Cangelosi A. & Parisi D. (1998). The emergence of a ‘language’ in an evolving population of 
neural networks. Connection Science, 10: 83-97. 

Di Paolo E.A. (2000). Behavioural coordination, structural congruence and entrainment in a 
simulation of acoustically coupled agents. Adaptive Behavior, 8(1): 25-46. 

Kirby S. (2002). Natural Language from Artificial Life. Artificial Life, 8(2): 185-215. 

Iizuka H. and Ikegami T. (2003a). Adaptive Coupling and Intersubjectivity in Simulated 
Turn-Taking Behaviours. In Banzhaf et al. (Eds.), Proceedings of ECAL 03, Dortmund: 
Springer Verlag. 

Iizuka H. and Ikegami T. (2003b). Simulating Turn-taking Behaviors with Coupled 
Dynamical Recognizers. In R.K. Standish, M.A. Bedau and H.A. Abbass (Eds.), 
Proceedings of Artificial Life VIII, Cambridge, MA: MIT Press. 

Iizuka H. & Ikegami T. (2004). Simulating autonomous coupling in discrimination of light 
frequencies. Connection Science. 16(4): 283-299. 

Marocco D., Cangelosi A. & Nolfi S. (2003). The emergence of communication in 
evolutionary robots. Philosophical Transactions of the Royal Society London - A, 361: 
2397-2421. 

Nolfi S. (2002). Evolving robots able to self-localize in the environment: The importance of 
viewing cognition as the result of processes occurring at different time scales. Connection 
Science, 14(3): 231-244. 

Nolfi S. (2005). Emergence of Communication in Embodied Agents: Co-Adapting 
Communicative and Non-Communicative Behaviours. Connection Science, 17(3-4): 
231-248. 

Nolfi S. & Marocco D. (2001). Evolving robots able to integrate sensory-motor information 
over time, Theory in Biosciences, 120:287-310. 

Quinn M. (2000). Evolving cooperative homogeneous multi-robot teams. In Proceedings of 
the IEEE / RSJ International Conference on Intelligent Robots and Systems (IROS 2000). 
IEEE Press. 

Quinn M. (2001). Evolving communication without dedicated communication channels. In 
Kelemen, J. and Sosik, P. (Eds.) Advances in Artificial Life: Sixth European Conference on 
Artificial Life (ECAL 2001). Springer Verlag. 

Quinn M., Smith L., Mayley G. & Husbands P. (2003). Evolving controllers for a 
homogeneous system of physical robots: Structured cooperation with minimal sensors. 
Philosophical Transactions of the Royal Society of London, Series A: Mathematical, 
Physical and Engineering Sciences 361, pp. 2321-2344. 

Steels L. (1999). The Talking Heads Experiment. Antwerpen: Laboratorium. Limited 
Pre-edition. 

Steels L. (2003). Evolving grounded communication for robots. Trends in Cognitive Sciences, 
7(7): 308-312. 

Steels L. and Kaplan F. (2001). AIBO's first words: The social learning of language and 
meaning. Evolution of Communication, 4:3-32. 


Steels L. & Vogt P. (1997). Grounding adaptive language games in robotic agents. In: P. 
Husbands & I. Harvey (Eds.), Proceedings of the 4th European Conference on Artificial 
Life. Cambridge MA: MIT Press. 

Werner G.M. & Dyer M.G. (1991). Evolution of communication in artificial organisms. In 
C.G. Langton, C. Taylor, J.D. Farmer and S. Rasmussen (Eds.), Proceedings of the 
Workshop on Artificial Life, pp. 659-687. Reading, MA: Addison-Wesley. 

Wagner K., Reggia J.A., Uriagereka J., Wilkinson G.S. (2003). Progress in the simulation of 
emergent communication and language. Adaptive Behavior, 11(1):37-69.  

   