PII: 0953-5438(89)90029-5


Modelling devices and modelling speakers 

T.J.M. Bench-Capon and A.M. McEnery 

The roles played in an illocutionary act by models of the means of 
communication and the communicator are distinguished, and qualita- 
tive differences between the models appropriate in the two cases 
identified. Applied to human-computer interaction, this means that a 
user must have models of the computer both as a communications 
device and a communications medium, and of the system author as 
interlocutor. 

Keywords: speech acts, illocutionary acts, user models, human- 
computer interaction 

This short note is a response to the commentary (Barlow et al., 1989) on our 
views that computer dialogues should be seen as mediated discourse 
(Bench-Capon and McEnery, 1989). One point needs clarification: the commen- 
tary stated that we restricted our discussion to nonintelligent, nonadaptive 
systems. While finding the use of ‘intelligent’ somewhat question-begging, we 
did intend to include systems of all kinds currently found, including expert 
systems, knowledge-based systems, hypertext systems, and anything else that 
might be termed ‘AI’. 

Even so, the commentary does take a rather more inclusive view of 
interaction than was the concern of the original paper. In that paper we were 
addressing only the question of how a user should go about the interpretation 
of messages from the computer system, and the selection of input to that 
system. 

It may be helpful to explain our reason for this focus. It was essentially that 
the elements of the interaction in which we were interested were those 
concerned with communication. We should perhaps at this point make clear our 
views on the nature of communicative acts. Following Austin (1962) and Searle 
(1969), we believe that an act of communication involves the simultaneous 
performance of a number of distinguishable acts. There is the ‘utterance act’ 
which involves the making of certain sounds, in the case of speech, or certain 
marks, in the case of writing, or causing certain images to appear on a computer 
screen, as in programming. Second there is a ‘propositional act’ in that these
sounds, marks or images may have some conventional semantics assigned to 
them by virtue of being part of some language. Third there is the ‘illocutionary 
act’ in that the performer of the utterance act will have some intention to convey 
a meaning to the recipients of the act. This meaning is the illocutionary force, 
which may be distinct from the propositional force, the literal interpretation of 
the utterance. To complete the picture, there is the ‘perlocutionary act’, by 
which the recipient of the utterance ascribes a meaning to the utterance, which 
may be termed the ‘perlocutionary force’. A successful act of communication 
occurs when the illocutionary and perlocutionary forces coincide. An act of 
communication therefore requires both an originator of an utterance, whom we 
term the ‘speaker’ irrespective of the communication medium, and one or more 
recipients of the utterances (the ‘hearer(s)’). It is part of the responsibility of the 
speaker to make his utterance in such a way as to best promote the coincidence 
of the illocutionary and perlocutionary forces. This will influence the utterance 
act both in the way it is performed, perhaps suggesting the use of block capitals 
for some handwritten information, and the utterance made, since a variety of 
utterances with different propositional forces could be used to perform a given 
illocutionary act. Again in the original paper we concentrated on the latter 
aspect, giving stress to the selection of utterance taking into account the 
intended hearer and his cultural and linguistic propensities. 

Department of Computer Science, University of Liverpool, PO Box 147, Liverpool L69 3BX, UK. Tel: 
(051) 794 3697. Email: sq35@uk.ac.liverpool.ibm 

0953-5438/89/020220-05 $03.00 © 1989 Butterworth & Co (Publishers) Ltd 

Thus our focus was on the interaction as an act of communication, posing the 
questions as to who is the speaker and who is the hearer, and offering 
programmer and user as answers. This was because we were interested in this 
aspect of the interaction and not because we wished to deny that any 
interaction at all with the device took place: to draw an analogy with telephone 
conversations, the act of communication is between the two people at either 
end of the telephone line. The telephone is a necessary condition of their 
communication, and they must understand how to use the telephone 
effectively, and must interact with their devices, but the communicative 
interaction that is taking place remains between the conversationalists. The role 
of secondary producers is a side issue with respect to communications aspects; 
the enjoyment of a telephone conversation may be enhanced by the ergonomic 
design of the instrument, but it contributes little to the understanding of what 
is being said. So too with books; a reader needs to have a model of the device so 
he can find particular statements, and may relish the fine binding and pleasant 
typeface, but understanding the content, ascribing a perlocutionary force, 
depends most crucially on the reader’s model of the author. When we meet the 
phrase ‘Mr Elton actually violently making love to her’ (Austen, 1816) in a novel, 
it cannot be understood in the absence of a model of the author: we must 
envisage an altogether less physical scene knowing that it was written by Jane 
Austen than we would if it had been written by a contemporary author such as 
Jackie Collins. 

Again one must not, however, read us as denying that the model of the device 
has any role in the speaker’s selection of utterance and the hearer’s 
interpretation of that utterance. A wise speaker will avoid the use of a word 
which has a near homophone which could lead to misunderstanding when 
talking but need not take this trouble when writing. The use of the telephone 
places well-known constraints on useful forms of utterance; the lack of visual 
clues from the speaker may dictate a quite different form of words from what 
could be used in face to face conversation. Selection of utterance is a complex 
matter and must be influenced both by the speaker’s understanding of his 
audience, which we stressed, and the speaker’s understanding of the medium, 
which we did not. Similarly, interpretation is a complex matter to be performed 
in the light both of the hearer’s understanding of the speaker, and the hearer’s 
understanding of the medium. We remain of the opinion, however, that disregard 
of the medium is more by way of a hindrance to effective communication, 
whereas disregard of the partner in the communicative act is likely to have 
disastrous consequences. Thus the model of the medium may be seen as being 
of secondary importance. 

Where the act of communication is mediated by a device, a model of the 
device is required for two different reasons. First, and most obvious, is that a 
model is needed because otherwise the participants in the interaction simply 
could not use the device to communicate. It is vital to know where to speak into 
a telephone if one is to be heard. This aspect of the device model assumes a 
greater importance in the case of computers than books and telephones. That is 
because we are all familiar with telephones and books (of a variety of sorts: 
novels, text books, directories, etc.), and they are so standard in their behaviour 
that we can use them without needing to develop a fresh device model for each 
telephone or book encountered or, indeed, without thinking about the model to 
any great extent. Only when confronted with a particularly novel sort of 
telephone do we need to consider how to use it. Computers, in contrast, are 
neither so familiar, nor so standardised, and have a potentially far greater range 
of functionality. Therefore when confronted with a new computer system it may 
well take us some time and experimentation before we develop our device 
model, and understand how to use the system effectively. Systems with WIMP 
interfaces provided a good example of this: when they were first introduced 
even experienced computer users took some time to become accustomed to the 
idea of opening icons with double clicks, dragging an icon around the screen by 
holding down the mouse button, and deleting files by dragging their icon to the 
mysterious picture at the bottom of the screen. Their existing models of the 
computer as a device were useless as a guide to the behaviour of the device with 
this extended functionality, and so needed extension to accommodate this extra 
functionality. 

The second role of the device model is that it provides an understanding of 
the constraints operating on communication via the device. If communication is 
to be effective, these constraints must be understood by both participants in the 
communicative act, so they can work within them. Both of these roles together, 
however, are not sufficient to enable effective communication: ultimately 
successful understanding still requires that the hearer understands the speaker 
sufficiently to ascribe a meaning to the utterance. 

Interacting with Computers vol 1 no 2 (1989)


One concrete example may help here. Suppose a user is stuck while 
interacting with a system, knowing what he or she wishes to do, but unsure 
how to do it. The user must be aware, from knowledge of the constraints 
imposed by the medium, that it is necessary actively to seek help; while a 
puzzled look might elicit help in a face to face situation, it is of no avail in this 
kind of mediated situation. Next, also from the device model, the user must be 
aware of how to solicit help on the topic causing confusion. Assuming that the 
device model is sufficient to enable the user to reach this point, it will now be 
possible for the user to secure some help message. But this message needs to be 
interpreted, and here the user must employ some model of the originator of the 
message, namely the programmer. 

Users thus require models both of the computer, as a device and as a medium 
of communication, and of the system author, as speaker. The programmer 
requires similar models of the computer, and a model of the user as hearer. It is, 
however, vital to recognise that these models are qualitatively quite distinct. 
The models of the device are not only playing different roles from the model of 
the speaker, but are of an entirely different nature. The model of the system as a 
device is essentially there to answer questions of how to achieve certain effects, 
and comprises a number of causal hypotheses of the form ‘if I do this, then 
this will happen’. These hypotheses are meant to be consistent, so that the same 
cause is expected to result in the same effect. The model of the system as a means 
of communication is quite different: this comprises a number of constraints on 
the interaction resulting from the features of the medium as compared with 
other potential communications media. Thus this kind of model inhibits forms 
of utterance dependent on visual clues when using a telephone line, and 
forms of utterance depending on tone of voice when writing, and so on. 
Obedience to the constraints is, however, a matter of judgement: irony often 
depends on visual clues for its detection, but is not wholly impossible over the 
telephone, provided one is well known to the person with whom one is 
speaking. The mechanism of cause and effect is not paramount here. Models of 
speakers and hearers are different yet again; while there are strong causal 
elements here in that they are used to answer questions of the form ‘how can I best 
explain this to my hearer?’ and ‘what made the speaker use that expression?’, 
we have no expectation of consistency such as we have with devices. Use of the 
speaker/hearer models is an art, while use of the device models can be a science. 
This may just be because of the complexity and mutability of speakers and 
hearers as compared with devices, but it needs to be recognised by the users of 
such models. 

The remark towards the end of the commentary (Barlow et al., 1989) about the 
probable cognitive impossibility of a user modelling the producer of the system 
probably stems from a confusion of these two kinds of model. A definitive 
causal model, in the sense of knowing which buttons to press to evoke a 
particular response from a person, may well be impossible. This cannot, 
however, be the case for a model of the sort appropriate to a speaker, since were 
it impossible, it would be impossible to understand any communication. Such a 
model is a necessary prerequisite of any communicative function, as we 
attempted to argue in our original paper, and as is supported in Grice (1982), 
Diaper (1988) and elsewhere. Such a model must take into account any effect of 
the medium on the message, but while the characteristics of the device being 
used may be of some significance, the illocution will still depend for its success 
on the models held by the communicants of one another. All users of a language 
are more or less good at forming such models, and form such models whenever 
they participate in an interaction with another person. Nor is there any reason 
why the ‘best professional psychologists’ should be any better than anyone else 
at forming these kinds of models. The skill required here is not an 
understanding of the mind in a causal sense, but a social skill everyone 
develops and hones from infancy. 

To conclude, our original paper focused on one particular aspect of 
human-computer interaction, namely the interaction as a communicative act. 
We did so because we felt that this aspect had been neglected or misunder- 
stood: had we been trying to be uncontroversial we might have better titled our 
paper ‘People communicate through computers, not with them’. Of course it is 
necessary to consider the role of the device in the interaction, and the 
constraints that it imposes on the communication. We do, however, believe that 
it is fruitful to see the interaction fundamentally as a communication between 
programmer and user, since this enables the distinctions made in philosophy 
alluded to above to be exploited within the discipline of human-computer 
interaction. It is worth considering as separate issues how to aid users in 
constructing device models so they can know how to use the medium to 
communicate, how the utterance should be formed (particularly as the 
computer offers a range of options, such as text, pictures, and sound), what 
constraints the mediation of the computer places on the forms of utterance, and 
how the user can be aided in constructing a model of the speaker. Our view is 
that these distinctions are currently blurred, to the disservice of all concerned. 
Finally it would assist the designers of systems if they recognised themselves as 
speaking to their users through the computer rather than as building a system 
that would speak to users on its own account. We trust that this note helps make 
clear the distinctions that underlie our views, and corrects any overstatement in 
our original paper. 

References 

Austen, J. (1816) Emma Thomas Nelson, London, 116 

Austin, J.L. (1962) How to do things with words Oxford 

Barlow, J., Rada, R. and Diaper, D. (1989) ‘Interacting WITH computers’ Interacting with 
Computers 1, 1, 39-42 

Bench-Capon, T.J.M. and McEnery, A.M. (1989) ‘People interact through computers 
not with them’ Interacting with Computers 1, 1, 31-38 

Diaper, D. (1988) ‘Natural language communication with computers: theory, needs and 
practice’ in Duffin, P. (ed.) KBS in Government 88 Blenheim Online, Pinner, UK, 19-44 

Grice, H.P. (1982) ‘Meaning revisited’ in Smith, N. (ed.) Mutual knowledge Academic 
Press, London, UK 

Searle, J.R. (1969) Speech acts Cambridge University Press, Cambridge, UK 
