At the conference on Human Computer Interaction in Paris (CHI-2013), one of the more interesting panels asked why spoken word dialogues between humans and computers have not had the success predicted. Voice recognition is now good, and the points of interaction with machines make voice-based dialogues not only easy but often preferable for safety reasons. Using voice commands when driving a car, for example, is certainly less hazardous than keyboard data entry. Voice-based systems are quite common, too; most people can hardly say they reject them because of unfamiliarity. Finally, voice-based dialogues seem ‘natural’; ‘intuitive’ one might say.
One would think that, taken together, these reasons would make voice-based interactions – dialogues with computing – the norm. And yet they aren't.
Many of the participants in the panel (and those who added comments from the floor) suggested that the reason for this had to do with a profound resistance amongst users to speaking with computers. Something about doing so left people feeling as if trust was at issue. Either users don't trust the systems they are dialoguing with, fearing they are being misled or fobbed off with interactions designed to trap them, or they don't trust their own participation in such interactions: they fear they are being made fools of in ways they cannot understand.
These discussions led me to reflect on my own current reading. Dialogues with computing are certainly a hot topic – though the concern here is not with the adequacy of the technology that enables them, the speech recognition engines, dialogue protocols and so forth. It has to do with the purposes or consequences of such dialogues.
For example, Douglas Rushkoff argues in his brief and provocative book, Program or Be Programmed (2010), that when people rely on computers to do some job, it is not like Miss Daisy trusting her chauffeur to drive her car to the right destination (an allusion to the film and play Driving Miss Daisy). It's not what computers are told that is the issue. It's what computers tell us, the humans, as they get on with whatever task is at hand. And this in turn implies things about who and what we are because of these dialogues with computing.
According to Rushkoff, there is no knowing what the purpose of an interaction between person and machine might be: it is certainly not as simple as a question of command and response. In his driving metaphor, what comes into doubt are rarely questions about whether the computer has correctly heard and identified a destination. The dialogues that we have with computers lead us to doubt why some destination is chosen. This in turn leads to doubts about whether such choices should be in the hands of the human or the computer. The computer seems to 'know' more; why should it not decide?
John Naughton, in his From Gutenberg to Zuckerberg (2012), raises similarly large issues, again illustrated with destinations. For him we need to ask whether we can trust computing (and the internet in particular) to lead us to heaven or to dystopia – though the contrast he presents is not entirely without irony: heaven is represented in the duplicitous appeal of Huxley's Brave New World, dystopia in the self-evidently bleak form of Orwell's Nineteen Eighty-Four.
Meanwhile, Eli Pariser complains in his The Filter Bubble (2011) that we cannot trust the dialogues we have with search engines: today, in the age of 'the cloud' and massive aggregation systems, search engine providers can hide things away from us in ways that we cannot guess. When we ask search engines something, we cannot know what the answer will be, for search engine technology is now deciding what we need or want – even what is good for us to know. That this is so is at once sinister and capitalistic, Pariser argues: sinister since it is disempowering of the human, capitalistic since it places the market above the public good. Search engines take you to what companies want to sell, not to what you need to know.
These books, subtle though they are, seem to miss something: they all assume that the issue is one of trusting either the computer or ourselves – that dialogues are between two parties, and that not both parties can be trusted, at least not all the time. And, importantly, it is not always the computer that breaks trust: sometimes a computer does know more than the human interlocutor, and so should be trusted to make the right decisions in certain circumstances. What these authors seem to miss is the question of what speaking with computers says about the value that people – and society more generally – give to speech. John Durham Peters argues in his book, Speaking into the Air (1999), that one of the essential values that came out of the Old Testament was the Hebrew idea that speech distinguishes people from beasts. Or, rather, it is the capacity to speak to God that distinguishes humanity from the wild animal.
At the CHI conference I mentioned above, one of the panellists argued something similar: that people treat speaking as something hallowed, precious, a unique bond between people. It is therefore not a skill that should be debased into being a method of dealing with computers. As it happens, this individual, Professor Matt Jones of Swansea University, is a trained priest, and so this view might reflect his desire to honour the spoken word as does the Old Testament. But as I listened to the various points of view put forward, including his own, I began to think that perhaps it is something to do with the status given to speech that leads people to resist defiling it with the mere task of communicating with computers. Perhaps there is something about our capacity to talk with other people (and our Gods, if we so choose) that we want to preserve as well as honour.
This led me to think of Wittgenstein and his remark that if a lion could talk, we could not understand him. In his view, our conversations are about our human experience: what it means and feels like to be human.
And then, as I reflected on the tribulations that voice-based dialogues with computing induce – how foolish they can make one seem as they force us to keep repeating words and phrases – I began to realise that this foolishness might be making us feel less human. It degrades our hopes for what we want to be: gifted with words and talk, talk that bonds us with each other (and, for some, like Matt Jones, with their God).
And then, as I recalled the tasks one often seeks to undertake in such dialogues, I thought there was even more credit to the idea that talk with people is special. After all, a typical use of voice dialogues is found when someone calls a company to complain about a service or product. Their attempts to speak with a person are spurned: they end up engaged in endless and seemingly pointless dialogues with a computer!
This too, like the shame we feel when we are instructed on how to speak by computers, attests to our desire to speak to people.
Speech is not, then, a mere modality of interacting with computers; it is a modality that has especial status for people: it is the modality for being human. No wonder, then, that voice-based dialogues are not as popular as predicted. We really don't want dialogues with computers.