Chatbots: The Ultimate Conversationalists?
An essay by Tiana Warner
Creating machines that simulate real conversation has been a long-term goal of artificial intelligence, and chatbot technology has pursued it steadily since the 1960s. How do these clever bots understand what we are asking, and how do they generate an appropriate response? The methods for doing so have advanced continually since the first chatbots were born, but critical limitations have constrained this exciting field of research. Which limitations can be overcome, and which will always plague natural language processing? Recent advancements in chatbot technology show that exciting innovations in language-processing bots are on the way. As limitations are overcome, chatbots are proving successful in areas such as teaching, distributing information, and more.
“Are you alive?”
“Why the uncertain tone?”
“Everything is uncertain except for five facts.”
“The whole universe?”
“And we both might just be some ones and zeros in the computer memory.”
(Thompson, 2007, p. 1)
While the above conversation seems to be one between two philosophically minded people, it actually occurred between two chatbots, ALICE and Jabberwacky. The machines were set up to interact with each other, prompted with a single opening sentence to see how the conversation might play out. This snippet shows that chatbots – that is, computer programs that simulate real conversation – can carry on a conversation much as humans do. A wide variety of chatbots exist today, with origins dating back to the 1960s. The eventual goal scientists hope to accomplish is to be able to converse with these machines as if they were human (“Microsoft Research Groups”, n.d.). Chatbots use natural language processing (NLP), a field combining artificial intelligence and linguistics. The goal of NLP is to scientifically define language as grammatical patterns of symbols and relationships (Thomas-Ogbuji, 2001, p. 1). It draws on semiotics, which studies syntax, semantics, and pragmatics. A complete NLP system such as a chatbot uses these components, along with morphology and more, to process language. While NLP “has implications for applications like text critiquing, information retrieval, question answering, summarization, gaming, and translation” (“Microsoft Research Groups”, n.d.), we will discuss only chatbots here. Someday we will be able to make many useful and interesting developments with this technology, but despite the impressive advancements so far, the limitations of chatbot technology suggest it is unlikely we will ever reach the point where chatbots replace real human intelligence.
How do chatbots recognize what we are saying and formulate an appropriate response? Most look for keywords or key phrases. The Convo project (n.d.) explains that the ALICE bot, seen in the conversation above, uses a pattern-matching algorithm to detect keywords or key phrases and replies with a pre-programmed response. One of the very first bots, Eliza, written by Dr. Joseph Weizenbaum and publicly released in 1966, was designed as a parody of a Rogerian psychotherapist. Eliza’s method is simply to rework the user’s statement into a question; for example, the keyword “mother” would generate the response “Tell me more about your family.” The CSIEC system, deployed as a tool for teaching English, uses one of the more advanced approaches, analyzing both syntax and semantics (Jia, 2009). It also draws on the context of the conversation and its knowledge of the user – unlike Eliza, which does not keep track of the conversation’s context. The methods for recognizing what we say are steadily advancing, giving rise to exciting and useful new chatbots.
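The keyword approach these early bots use can be sketched in a few lines of Python. The patterns and canned replies below are invented for illustration and are not taken from Eliza’s or ALICE’s actual scripts:

```python
import re

# A minimal sketch of keyword-based pattern matching in the spirit of Eliza
# and ALICE. The rules and canned responses here are illustrative only.
RULES = [
    (re.compile(r"\bmother\b|\bfather\b", re.IGNORECASE),
     "Tell me more about your family."),
    (re.compile(r"\bI am (.+)", re.IGNORECASE),
     "Why do you say you are {0}?"),
    (re.compile(r"\bhello\b|\bhi\b", re.IGNORECASE),
     "Hello. What would you like to talk about?"),
]

FALLBACK = "Please, go on."  # used when no pattern matches

def respond(utterance: str) -> str:
    """Return the canned response for the first matching pattern."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            # Rework any captured text into the reply, Eliza-style.
            return template.format(*match.groups())
    return FALLBACK

print(respond("My mother is worried about me."))  # Tell me more about your family.
print(respond("I am feeling sad"))                # Why do you say you are feeling sad?
```

The first matching rule wins, which is why such bots so easily latch onto a single keyword and ignore the rest of the sentence.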
These methods of interpreting conversation are, of course, only a small part of how a human being might hold a conversation. The knowledge we use to interpret an utterance may be classified as follows:
1. Language knowledge
c. Pragmatics and discourse
2. Background knowledge
a. General world knowledge (including common sense knowledge)
b. Domain specific knowledge (includes specialized knowledge of the area about which communication is taking place)
c. Context (verbal and non-verbal situation in which communication is taking place)
d. Culture knowledge
(Bharati, Chaitanya, & Sangal, 1996, p. 9)
Looking at this model, it is interesting to note that chatbots possess only the first knowledge classification, if even that. Interpreting an utterance using background knowledge – common sense, for instance, or worldly context – is an ability possessed only by human beings. Perhaps the greatest limitation facing the science of chatbots and artificial intelligence is that computers lack human-level knowledge of the world and our lived experience with the structure of language. A computer’s lack of context, whether worldly or merely pertaining to the current conversation, severely hinders its ability to hold a real conversation. Chatbots such as Eliza and ALICE do not take the whole conversation into account when formulating a response, so while you are talking to them the conversation can go off track quite easily. This obstructs a chatbot’s ability to carry out a realistic conversation.
As the Microsoft research team for NLP (n.d.) accurately puts it, “It's ironic that natural language, the symbol system that is easiest for humans to learn and use, is hardest for a computer to master.” The field of NLP has some major limitations that are reflected in chatbot technology. Perhaps the most important consideration is whether language can be scientifically represented at all. Thomas-Ogbuji on IT World (2001, p. 2) outlines several of these limitations. For example, it is currently impossible to precisely represent a sentence or concept at a human level with only a finite amount of computer hardware to store the data. There is also no universal way of representing semantics and syntax, although organizations such as the W3C offer a Resource Description Framework to address this limitation. Further, no existing knowledge base describes the world in enough detail for us to use it to define semantics. And even if we could define all semantics and syntax in a universal fashion, semantics would still be subject to ambiguous interpretation.
Ambiguities and multiple word meanings are particularly difficult problems for chatbots to overcome. “Ambiguous” here means that more than one meaning may be derived from a sentence. Programming languages, being computer-friendly, have only one meaning for every word or phrase. Natural languages, however, have multiple meanings, and it is up to the computer to determine which one to use (Tomita, 1986, p. 5). For example, there are many possible meanings of the word “bank”, and it is the context of the conversation that hints at the correct definition for a human. “For example, the question "Did you go to the bank at lunchtime?" probably refers to a financial bank and not a river bank. We know this because, as humans living in the modern world, we know that going to the bank is an activity often performed by people during their midday break. A computer program lacks this everyday knowledge, and currently there is no satisfactory way to give it such knowledge, despite the best efforts of artificial intelligence researchers.” (“Convo”, n.d.) This lack of everyday knowledge is a serious constraint on chatbot technology; it corresponds to the second knowledge classification in the model given above.
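The context-based guessing a human performs effortlessly can be approximated, crudely, by counting cue words associated with each sense. The sense inventories below are invented for illustration; real word-sense disambiguation is far more sophisticated:

```python
# A toy word-sense disambiguation sketch for "bank": score each sense by
# how many of its cue words appear in the sentence. Cue word lists are
# invented for illustration.
SENSES = {
    "financial bank": {"money", "lunchtime", "deposit", "account", "loan", "teller"},
    "river bank": {"river", "water", "fishing", "shore", "muddy"},
}

def disambiguate(sentence: str) -> str:
    """Pick the sense whose cue words overlap the sentence the most."""
    words = set(sentence.lower().replace("?", "").split())
    scores = {sense: len(words & cues) for sense, cues in SENSES.items()}
    return max(scores, key=scores.get)

print(disambiguate("Did you go to the bank at lunchtime?"))  # financial bank
print(disambiguate("The river bank was muddy"))              # river bank
```

The sketch works only because a human hand-picked the cue words; the “everyday knowledge” that generates those associations in the first place is exactly what the computer lacks.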
While indistinguishable ambiguities seem to be unsolvable problems for artificial intelligence, we must remember that humans have to deal with this problem, too. If a friend tells you she wants to go to the bank, might you not mistakenly assume she means a financial bank when she actually means a river bank? If a computer needs to ask for clarification, this should probably not be counted against the field of artificial intelligence, since humans are also likely to make such errors. The point, however, is that simple ambiguities are often much more difficult for a computer to resolve than for a human, and that is indeed a limitation for chatbots.
Besides the difficulty of distinguishing ambiguities, parsing a sentence by its grammatical structure is a difficult task in general for chatbots. This is partly because people often use poor grammar when typing online, which makes sentence parsing very difficult. Further, Noam Chomsky made a valid point with the grammatically correct but nonsensical sentence, “Colorless green ideas sleep furiously”. While the sentence may be successfully parsed by a machine, it makes no sense, which proves that grammatical parsing alone is not enough to interpret an utterance. The CSIEC system mentioned above may deploy an advanced solution for analyzing syntax and semantics, but hindrances such as these will always remain.
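Chomsky’s point can be demonstrated with a toy parser: the code below happily accepts his sentence because it is syntactically well-formed, even though it is meaningless. The tiny lexicon and the ADJ* NOUN VERB ADV? pattern are illustrative assumptions, not a real grammar:

```python
# Part-of-speech tags for a deliberately tiny, invented lexicon.
LEXICON = {
    "colorless": "ADJ", "green": "ADJ",
    "ideas": "NOUN",
    "sleep": "VERB",
    "furiously": "ADV",
}

def parses(sentence: str) -> bool:
    """Check the toy pattern ADJ* NOUN VERB ADV? against the word tags."""
    tags = [LEXICON.get(w.lower()) for w in sentence.rstrip(".").split()]
    if None in tags:
        return False  # unknown word
    i = 0
    while i < len(tags) and tags[i] == "ADJ":   # noun phrase: ADJ* NOUN
        i += 1
    if i >= len(tags) or tags[i] != "NOUN":
        return False
    i += 1
    if i >= len(tags) or tags[i] != "VERB":     # verb phrase: VERB ADV?
        return False
    i += 1
    if i < len(tags) and tags[i] == "ADV":
        i += 1
    return i == len(tags)

print(parses("Colorless green ideas sleep furiously"))  # True
```

The parser returns True without any notion of whether ideas can be green or can sleep, which is precisely the gap between syntax and meaning.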
Of course, even with a hypothetically perfect chatbot that has no trouble parsing a sentence and possesses satisfactory knowledge of the world, there is still the question of how natural a human–chatbot conversation really is (Rehm & André, 2005). Conversations with Jabberwacky, for example, often lead users to make rude or inappropriate comments such as swearing (De Angeli & Brahnam, 2008). Users also prove to be very curious about what the chatbot can do – one of many indications that people do not treat chatbots as they would a human. This makes it difficult to create an accurate chatbot using human conversation patterns as a template. Even if a conversation were to proceed fairly naturally on the human end, Joseph Weizenbaum, creator of Eliza, “believed that there were transcendent qualities in the human experience that could not be duplicated in interactions with machines. He described it in his book as ‘the wordless glance that a father and mother share over the bed of their sleeping child…’” (as cited in Markoff, 2008).
Despite the limitations computers face, owing to the nature of language and a machine’s lack of human knowledge of the world, this exciting field of computer science continues to make new advancements and improvements over previous chatbot models. Jabberwacky, for example, overcomes an important limitation by taking the context of the conversation into account: it considers the whole conversation so far when generating a response. This is an enormous improvement over chatbots that ignore context. While such chatbots obviously differ on more levels than context alone, having the bot stay on topic definitely improves the conversation.
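The idea of keeping context can be sketched as follows. The topic list, class name, and fallback behaviour are invented for illustration; this is not how Jabberwacky actually works, only a minimal picture of why remembering the conversation helps:

```python
# A minimal sketch of conversational context: the bot remembers the most
# recent topic keyword and falls back to it when a new utterance matches
# nothing, instead of drifting off track. Topic list is illustrative.
TOPICS = {"weather", "music", "food"}

class ContextBot:
    def __init__(self):
        self.history = []          # the whole conversation so far
        self.current_topic = None  # last topic keyword seen

    def respond(self, utterance: str) -> str:
        self.history.append(utterance)
        for word in utterance.lower().split():
            if word in TOPICS:
                self.current_topic = word
                return f"What do you like about {word}?"
        if self.current_topic:
            # Nothing matched, but stay on topic rather than reset.
            return f"Let's keep talking about {self.current_topic}."
        return "What would you like to talk about?"

bot = ContextBot()
print(bot.respond("I love music"))   # What do you like about music?
print(bot.respond("It relaxes me"))  # Let's keep talking about music.
```

A context-free bot like Eliza would treat “It relaxes me” in isolation; here the stored topic keeps the second reply coherent with the first.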
Many believe that a personality or convincing character is an important factor in creating a convincing chatbot. The Convo project offers a definition of what a good chatbot should do, the first points of which we have already covered: “The system needs to know how the same word can be used in many different ways, and it needs to have general knowledge about the world in which we humans live. Both of these are very difficult problems to solve. A chatterbot should also have a consistent and convincing character if users are to hold satisfying conversations with it.” (“Convo”, n.d.) Researchers are currently working on personalizing chatbots so that the machine is adaptable. Personality means that human input such as “What's your favourite flavour of ice cream?” would generate a characteristic response from the chatbot; “I hate you” would generate a sad response, and “You're awesome” a happy one. The most common language used to build chatbots is AIML, and Persona-AIML (Galvao et al., 2004) is a more recent extension that adds personality to chatbots, with promising results to date.
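The personality idea can be sketched in ordinary code. Persona-AIML itself is an extension of AIML, so the cue words, moods, and replies below are purely illustrative assumptions, not the Persona-AIML architecture:

```python
# A sketch of a personality layer: the bot keeps a mood, shifts it on
# emotionally loaded input, and colours its reply accordingly. All cue
# words and replies are invented for illustration.
NEGATIVE_CUES = {"hate", "stupid", "awful"}
POSITIVE_CUES = {"awesome", "love", "great"}

def personality_respond(utterance: str, mood: str = "neutral") -> tuple[str, str]:
    """Return (new_mood, reply); the mood would persist across turns."""
    words = set(utterance.lower().replace("'", "").split())
    if words & NEGATIVE_CUES:
        mood = "sad"
    elif words & POSITIVE_CUES:
        mood = "happy"
    replies = {
        "sad": "That hurts my feelings.",
        "happy": "Thank you, that makes me happy!",
        "neutral": "I see. Tell me more.",
    }
    return mood, replies[mood]

print(personality_respond("I hate you"))      # ('sad', 'That hurts my feelings.')
print(personality_respond("You're awesome"))  # ('happy', 'Thank you, that makes me happy!')
```

Threading the returned mood back into the next call is what would make the character feel consistent from turn to turn.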
Hundreds of chatbots are publicly available on the internet today, and chatbot technology has been used for much more than simple conversation. The TARA project (Schumaker et al., 2007), for example, is a system made up of many chatbots that distribute information about terrorism to the public. Researchers Kerly, Hall, and Bull (2007) also see great potential in using chatbots for teaching through their “open learner model”. Open learner technology is the idea that the machine should inform the user of what it knows about him or her; if the user provides feedback, the chatbot’s learner model is then improved. Chatbots have already been used to teach English to students effectively (Jia, 2009).
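The open learner idea – show the user what the system believes about them, then refine those beliefs with feedback – can be sketched as follows. The skill names, starting values, and simple averaging update are invented assumptions, not the model of Kerly, Hall, and Bull:

```python
# A sketch of an open learner model: the system exposes its beliefs about
# the learner and nudges them toward the learner's own feedback. Skills
# and the update rule are invented for illustration.
class OpenLearnerModel:
    def __init__(self):
        # Estimated mastery of each skill, 0.0 to 1.0 (assumed start values).
        self.beliefs = {"past tense": 0.5, "articles": 0.5}

    def show(self) -> str:
        """Tell the user what the model currently believes about them."""
        return "; ".join(f"I think your {skill} mastery is {p:.1f}"
                         for skill, p in self.beliefs.items())

    def feedback(self, skill: str, user_estimate: float) -> None:
        # Move the belief halfway toward the user's own estimate.
        self.beliefs[skill] = (self.beliefs[skill] + user_estimate) / 2

model = OpenLearnerModel()
print(model.show())
model.feedback("past tense", 0.9)   # learner says they know past tense well
print(model.beliefs["past tense"])  # 0.7
```

Making the model visible is the key point: the negotiation between the system’s estimate and the learner’s self-assessment is what drives the improvement.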
Using chatbots as doctors or psychiatrists has been a much-discussed topic of interest. Eliza was in fact developed as a parody of a psychotherapist. Although Eliza’s keyword-matching methodology has its limitations and can lead to strange detours in the conversation, “…students and others became deeply engrossed in conversations with the program, occasionally revealing intimate personal details” (Markoff, 2008). Using a computer as a diagnostic tool is not a bad idea, and may at times even outperform a human doctor, since human memory is not as reliable as computer memory. Computers may prove better than human doctors at matching symptoms to medication.
The impressive advancements made in this field give great cause for excitement about the future of technology. Perhaps we can use bots in place of humans where a computer’s memory is more reliable, and perhaps, by adding personality through Persona-AIML, we can create bots with characters that make them interesting to converse with. The methods chatbots use to interpret human input vary, but whatever the technique, every chatbot faces limits on how closely it can approach human language processing. Bharati, Chaitanya, and Sangal (1996) described the interpretation of utterances as a combination of language knowledge and background knowledge. Even though the field of NLP continues to develop very sophisticated methods for defining language scientifically, it remains in question whether language can be fully defined in this way. Further, a chatbot’s lack of knowledge of the human world, and its limited ability to maintain context in a conversation, make it harder for the computer to distinguish ambiguities and multiple word meanings. It is therefore quite common for a conversation with a chatbot to go off track. While the potential exists for many different uses of chatbot technology, these critical limitations may mean that chatbots will never truly replace real human intelligence.
References
Bharati, A., Chaitanya, V., & Sangal, R. (1996). Natural Language Processing: A Paninian Perspective. New Delhi: Prentice-Hall of India Private Ltd.
De Angeli, A., & Brahnam, S. (2008). I hate you! Disinhibition with Virtual Partners. Interacting with Computers, 20(3), 302-310.
Galvao, A. M., Barros, F. A., Neves, A. M. M., & Ramalho, G. L. (2004). Persona-AIML: An Architecture for Developing Chatterbots with Personality. International Conference on Autonomous Agents and Multiagent Systems, 3, 1266-67.
How Most Chatterbots Work. (n.d.). Convo.co.uk: Learning bit by bit. Retrieved from http://www.convo.co.uk/technical/how-bots-work/
Jia, J. (2009). CSIEC: A Computer Assisted English Learning Chatbot Based on Textual Knowledge and Reasoning. Knowledge-Based Systems 22(4), 249-255.
Kerly, A., Hall, P., & Bull, S. (2007). Bringing Chatbots into Education: Towards Natural Language Negotiation of Open Learner Models. Knowledge-Based Systems, 20(2), 177-185.
Markoff, J. (2008, Mar 13). Joseph Weizenbaum, Famed Programmer, Is Dead at 85. The New York Times. Retrieved from http://www.nytimes.com/2008/03/13/world/europe/13weizenbaum.html
Microsoft Research Groups (n.d.). Natural Language Processing. Retrieved from http://research.microsoft.com/en-us/groups/nlp/
Rehm, M., & André, E. (2005). From Chatterbots to Natural Interaction – Face to Face Communication with Embodied Conversational Agents. IEICE – Transactions on Information and Systems, E88-D(11), 2445-52.
Schumaker, R. P., Liu, Y., Ginsburg, M., & Chen, H. (2007). Evaluating the Efficacy of a Terrorism Question/Answer System. Communications of the ACM, 50(7), 74-80.
Thomas-Ogbuji, C. (2001, March 29). The Future of Natural-Language Processing. IT World. Retrieved from http://www.itworld.com/UIR001229ontology
Thompson, C. (2007, May 3). I Chat, Therefore I Am... Discover. Retrieved from http://discovermagazine.com/2007/brain/i-chat-therefore-i-am
Tomita, M. (1986). Efficient Parsing for Natural Language: A Fast Algorithm for Practical Systems. Boston: Kluwer Academic Publishers.