3 Efficiency of Knowledge Transfer by Hearing a Conversation While Doing Something


Introduction
One of the most common means of acquiring useful knowledge is reading suitable documents and websites. However, this is time-consuming and cannot be done in parallel with other tasks. Is there a way to acquire knowledge when we cannot read written texts, such as while driving a car, walking around or doing housework? It is not easy to remember the contents of a document simply by listening to its reading aloud from the top, even if we concentrate while listening. In contrast, it is sometimes easier to remember words heard on the radio or television even if we are not concentrating on them.
While we are doing something, listening to a conversation is better than listening to a precise reading of a draft or summary for memorizing the contents and turning them into knowledge. We are therefore trying to improve the efficiency of knowledge transfer by "hearing a conversation while doing something." In order to support knowledge acquisition by humans, we aim to develop a system which provides people with useful knowledge while they are doing something or not concentrating on listening. We did not try to edit notes to be read out, or to summarize documents; rather, we aimed to develop a way of transferring knowledge. Specifically, in order to provide knowledge efficiently with computers, we consider how to turn the content into a dialogue that is easily remembered, and develop a system to produce dialogue by which one can easily acquire knowledge.
In the next section of this article, we explain our prototype system named "Sophisticated Eliza" (Isahara et al., 2005). Then, we discuss the idea of "Efficient knowledge transfer by hearing a conversation while doing something" (Yamamoto & Isahara, 2008).

In section 4, we evaluate the effectiveness of knowledge transfer via listening to conversation, comparing it with listening to monologues. We first choose several topics and select suitable documents on each topic. We then extract informative sentences from the documents and form a conversation by splitting each sentence into conversational fragments. To verify our hypothesis, we conduct an evaluation with human subjects of the usefulness of listening to conversations formed from these fragments. The experiments also suggest how to select suitable domains for our system.
In section 5, we introduce our prototype system and present some examples of the conversations extracted by the system. As for the information transfer system, although our final target is to handle topics which are practically useful, such as knowledge from newspapers, encyclopedias and Wikipedia, as a first step we tried to compile rules for small procedural domains such as cooking recipes.

Sophisticated Eliza
Recently, thanks to the improvement of natural language processing (NLP) technology, development of high-performance computers and the availability of huge amounts of stored linguistic data, there are now useful NLP-based systems. There are also practical speech synthesis tools for reading out documents and tools for summarizing documents. These tools do not necessarily use state-of-the-art technologies to achieve deep and accurate language understanding, but are based on huge amounts of linguistic resources that used not to be available. Although current computer systems can collect huge amounts of knowledge from real examples, it is not obvious how to transfer knowledge more naturally between such powerful computer systems and humans. We need to develop a novel way to transfer knowledge from computers to humans.
We believe that, based on large amounts of text data, it is possible to devise a system which can generate dialogue by a simple mechanism and give people the impression that two intelligent persons are talking. We verified this approach by implementing a system named Sophisticated Eliza, which simulates conversation between two persons on a computer. Sophisticated Eliza is not a Human-Computer Interaction system; instead, it simulates conversation between two people, and users acquire information by listening to the conversation generated by the system. Concretely, using an encyclopedia in Japanese (Kodansha International, 1998) as a knowledge base, we develop rules to extract information from the knowledge base and create fragments of conversation. We write rules with syntactic patterns to make a conversation, for example, "What is A?" "It's B." from "A is B." The system extracts candidate fragments of conversation using these simple scripts, and two voices then read the conversation aloud. This system cannot generate long conversations on one topic as humans do, but it can simulate short conversations from stored linguistic resources and continue conversations while changing topics. Figure 1 shows a screenshot of Sophisticated Eliza and Figure 2 shows its system flow. Figure 3 shows examples of conversation generated by the system.
The encyclopedia utilized here covers all aspects of Japan, e.g., history, culture, economy and politics. All sentences in the encyclopedia are analyzed syntactically using a Japanese parser, and we use rules to extract the fragments of conversation using information in the encyclopedia. The rules were compiled manually: we carefully analyzed the output of the Japanese parser and found useful patterns to extract knowledge from the encyclopedia.
The terms extracted during the syntactic analysis are stored in the keyword table and are used for selection of topics and words during the conversation synthesis.
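The "A is B." rule described above can be illustrated with a minimal sketch. It assumes English input and a plain regular expression, whereas the actual system applies syntactic patterns to the output of a Japanese parser; the names `COPULA`, `sentence_to_fragment` and `extract_fragments` are hypothetical and introduced only for illustration.

```python
import re

# Hypothetical surface pattern for English sentences of the form "A is B."
# (an assumption: the original rules work on Japanese parser output instead).
COPULA = re.compile(r"^(?P<a>.+?)\s+is\s+(?P<b>.+?)\.?$")


def sentence_to_fragment(sentence):
    """Return a (question, answer) pair for an "A is B." sentence, or None."""
    match = COPULA.match(sentence.strip())
    if match is None:
        return None
    a, b = match.group("a"), match.group("b")
    return (f"What is {a}?", f"It's {b}.")


def extract_fragments(sentences):
    """Keep only the sentences that match the pattern; the rest are skipped."""
    fragments = []
    for sentence in sentences:
        fragment = sentence_to_fragment(sentence)
        if fragment is not None:
            fragments.append(fragment)
    return fragments


if __name__ == "__main__":
    for question, answer in extract_fragments(
        ["Kyoto is a city in Japan.", "Mount Fuji erupted in 1707."]
    ):
        print(question, answer)
```

Sentences that do not match are simply dropped, which mirrors the idea that low recall is tolerable when a large amount of text is available.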

Fig. 3. Examples of Generated Conversation
Note that in our current system, we use Japanese documents as the input. Because we use only the syntactic information output by the Japanese parser, our mechanism is also applicable to other languages such as English. We use a rather simple mechanism to generate actual conversations in the system, which includes rules to select fragments containing similar words and rules to change topics. The contents of the encyclopedia are divided into seven categories, i.e. geography, history, politics, economy, society, culture and life. When a conversation moves from one topic to another, the system generates an utterance signaling the move. As for the speech synthesis part, we use the synthesizer "Polluxstar" developed by Oki Electric Industry Co. Ltd., Japan. The two authors of this paper, one male and one female, recorded 400 sentences each, and the two characters in the system talk to each other by impersonating our voices. The images of the two characters are also based on the authors.
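As a rough illustration of the fragment selection and topic change rules mentioned above, the following sketch chains fragments greedily by keyword overlap and inserts an utterance when the category changes. The `Fragment` structure, the greedy overlap criterion and the wording of the topic-change utterance are all assumptions made for this example, not the rules actually used in Sophisticated Eliza.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    question: str
    answer: str
    category: str        # e.g. one of the seven encyclopedia categories
    keywords: frozenset  # terms taken from a keyword table (assumed format)


def pick_next(current, candidates):
    """Prefer the candidate sharing the most keywords with the current fragment."""
    return max(candidates, key=lambda f: len(current.keywords & f.keywords))


def build_conversation(fragments, max_fragments=3):
    """Chain fragments greedily and announce every change of category."""
    if not fragments:
        return []
    remaining = list(fragments)
    current = remaining.pop(0)
    lines = [("A", current.question), ("B", current.answer)]
    used = 1
    while remaining and used < max_fragments:
        nxt = pick_next(current, remaining)
        remaining.remove(nxt)
        if nxt.category != current.category:
            # Hypothetical wording for the topic-change utterance.
            lines.append(("A", f"By the way, let's talk about {nxt.category}."))
        lines.append(("A", nxt.question))
        lines.append(("B", nxt.answer))
        current = nxt
        used += 1
    return lines
```

Each returned pair is a speaker label and an utterance, so the result can be passed directly to a two-voice synthesizer front end.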
Because this system uses simple template-like knowledge, it cannot generate semantically deep conversation on a topic by considering context or by compiling highly precise rules to extract script-like information from text. Thus, the mechanism used in this system has room for improvement to create conversations for knowledge transfer.

Efficiency of hearing a conversation compared with hearing a monologue
In the daily transfer of knowledge, such as in a cooking program on TV, there is not only the reading aloud of recipes by the presenter but also conversation between the cook and the assistant. Through such conversations, information which viewers want to know and which they should memorize is transferred to them naturally.
To verify this observation, we conducted experiments with human subjects.

Settings
We used the speech synthesizer "Polluxstar" by Oki Electric Industry Co. Ltd., which enables speech synthesis with one's own voice. We input voice data from the two authors of this paper (one male and one female).
We prepared three materials to be synthesized for the experiments: two recipes and one sports news article. We chose the recipes from a Japanese recipe site with movies (http://recipe.gnavi.co.jp/movie/sweetkitchen/). One is for a rice bowl with chicken and eggs, and the other is for gratin. For each dish, the site contains a short movie featuring a chef and an assistant, as well as a written recipe. For the dialogue example, we transcribed the entire conversation between the chef and the assistant and made the speech synthesizer read it aloud. For the monologue example, we simply made the speech synthesizer read the written recipe aloud with one of the two voices in the system. As for the news article, we chose an article about women's soccer games from a Japanese newspaper. For its monologue, we made the speech synthesizer read the article aloud with one of the two voices in the system. For its dialogue, we manually added inquiries about some points of the news and made the speech synthesizer read the result aloud.
We recruited participants for our experiment from among students of Toyohashi University of Technology, Japan. We had 33 participants, plus an additional 4 male student participants. Because the main topic of the experiment is recipes, we recruited mainly female students. The participants were asked to listen to all six synthesized speeches, i.e. two dialogues for cooking, two monologues for cooking, one dialogue for news, and one monologue for news, and to fill in a questionnaire after finishing each speech.
The items asked in the questionnaire are as follows.

It seems that dialogue is slightly better than monologue. However, the experiment on the gratin recipe shows a different result, i.e., monologue is better than dialogue. We checked the results carefully and found the following. The group that listened to the Gratin dialogue did so at the beginning of the experiment, whereas the group that listened to the Gratin monologue had already listened to the Riceball speech. Each participant who listened to the Gratin monologue therefore already knew what kind of questions would be asked and could concentrate on grasping those points. We conducted an additional experiment with a smaller number of participants, in which each participant listened to the Gratin dialogue after listening to the Riceball speech. The results then became closer to those of the Riceball case. The Gratin dialogue is therefore unlikely to be much worse than its monologue. In the free-answer section of the questionnaire, more participants wrote that they prefer dialogue to monologue than the reverse.
The same situation occurred in the Riceball case, i.e., the Riceball monologue was heard after the Gratin speech. The true difference between dialogue and monologue for the Riceball recipe may therefore be larger than the figures above indicate.

News
We asked participants which they preferred, monologue or dialogue. More than two thirds of the participants explicitly wrote that they prefer dialogue to monologue.

Discussion
As for recipe listening, dialogue seems slightly better than monologue. However, several factors in our experiment bias the result in favor of monologue. We used written text from the web as the text for the monologue. The important points of the recipe are listed at the end of that text, which makes them memorable to listeners. If we generated the dialogue text from the written text, the result would likely be better than in our current settings.
As for news listening, the second speaker inserted only a few inquiries about the topics to be discussed next. This is less a conversation than something like an interview. Some participants strongly preferred this style. We should establish a way to generate dialogues properly from texts.
Our hypothesis is that dialogue is more useful for obtaining information while doing something. However, in this experiment, participants were asked to listen to the monologue and dialogue and then answer a questionnaire. This situation differs from our intended setting. We should devise a more suitable way to verify our hypothesis.

Efficient knowledge transfer by hearing a conversation while doing something
We started to develop a mechanism to achieve natural knowledge acquisition for humans by turning information that is written in documents into conversational text. Efficient methods of acquiring knowledge include not only "reading documents" and "listening to passages read aloud," but also "hearing a conversation while doing something," provided that information is appropriately embedded into the conversation. We believe that we can verify that this "conversation hearing" can assist knowledge acquisition by developing a system for synthesizing conversations by collecting fragments of conversation and conducting experiments by using the system.
As a means to transfer information, contents conveyed by an interpretive reading with pronounced intonation are better retained in memory than if read monotonously from a document or summary. Furthermore, by turning contents into conversation style, even someone who is not concentrating on listening may become interested in the topic and acquire the contents naturally. This suggests that several factors in conversations, such as throwing in words of agreement, pauses and questions, which may appear to decrease the density of information, are actually effective means of transferring information matching humans' ability to acquire knowledge with limited concentration. Based on this idea, we propose a novel mechanism of an information transfer system by considering the way of transferring knowledge from computers to humans.
Various dialogue systems have already been developed as communication tools between humans and computers (Weizenbaum, 1966; Matsusaka et al., 1999). However, in our novel approach, the dialogue system regards the user as an outsider, presents a conversation between two speakers in the computer which is of interest to the outside user, and thus provides the user with useful knowledge.
There are dialogue systems (Nadamoto & Tanaka, 2004; ALICE; UZURA) which can join in a conversation between a human and a computer, but they simply create fragments of conversation and so do not sound like an intelligent human speaker. One reason is that they do not aim to provide knowledge or transfer information to humans, and few theoretical evaluations have been done in this field. In this research, we consider a way to transfer knowledge and develop a conversation system which generates dialogue by which humans can acquire knowledge from dialogue conducted by two speakers in the computer. We analyze the way to transfer knowledge to humans with this system. This kind of research is beneficial not only from an engineering viewpoint but also cognitive science and cognitive linguistics. Furthermore, a speech synthesis system in which two participants conduct spoken conversation automatically is rare. In this research, we develop an original information-providing system by assigning conversation to two speakers in the computer in order to transfer knowledge to humans.

System implementation
The principle of Sophisticated Eliza is that because a large amount of text data is available, even if the recall of information extraction is low, we can obtain sufficient information to generate short conversations. However, the rules still need to be improved by careful analysis of input texts.
As for the information transfer system, although our final target is to handle topics which are practically useful, such as knowledge from newspapers, encyclopedias and Wikipedia, as a first step we are trying to compile rules for small procedural domains such as cooking recipes. Concretely, we are developing the new system by iterating over the following five steps.
1. Enlargement of conversational script and template in order to generate sentences in natural conversation
We have already compiled simple templates for extracting fragments of conversation as a part of Sophisticated Eliza. We are now enlarging the set of templates to handle wider contexts, domain-specific knowledge and insertion of words. This enlargement is basically being done manually. Here, domain-specific knowledge includes domain documents in a specific format, such as recipes. Insertion of words includes words of agreement and encouragement for the other speaker, part of which is already introduced in Sophisticated Eliza. An example of synthesized conversation is shown in Figure 4.
2. Implementation of system in which two speakers (agents/characters) make conversation in a computer considering dialogue and document contexts
Using the conversational templates extracted based on the contexts, the system continues conversation with two speakers. Fundamental functions of this kind have already been developed for Sophisticated Eliza.
Here, there are two types of "context." One is the context in the documents, i.e. knowledge-base. For the recipe example, cooking heavily depends on the order of each process and on the result of each process. The other type is the context in the conversation.

Fig. 4. Example conversation
If all subevents included in an event were explicitly uttered in conversation, the conversation would be very dull and would obstruct understanding. For example, "Make hot water in a pan. Peel potatoes and boil them" is enough, and it is not necessary to say "boil peeled potatoes in the hot water in a pan." Appropriate use of ellipsis and anaphoric expressions based on the context in the conversation is a useful tool for easy understanding.
Though speech synthesis itself is out of the scope of our research, pauses in utterances are also important in natural communication.
3. Mechanism to extract (fragment of) knowledge from text
Sophisticated Eliza outputs informative short conversations, but the content of the conversation is not consistent as a whole. In this research, we are developing a system to provide people with some useful knowledge. We have to recognize the useful part of the knowledge base and to place great importance on the extracted useful part of the text. We previously reported how to extract an informative set of words using a measure of inclusive relations (Yamamoto et al., 2005), and will apply a similar method to this conversation system.
4. Improvement of conversation script and template considering "fragment of knowledge"
By considering the useful part of information written in the knowledge base, we modify the templates to extract conversational text. Contextual information such as ellipsis and anaphora is also treated in this part. As a first step, we will handle anaphora resolution in a specific domain, such as cooking, considering the factors described in step 2. We will use domain knowledge about cooking such as cookware, cookery and ingredients.
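The combination of document context (the order of recipe steps) and inserted words of agreement described in the steps above can be sketched as follows. This is only an illustrative toy under stated assumptions: the speaker labels, the prompts and the list `AGREEMENTS` are hypothetical, and the real templates are compiled manually and are considerably richer.

```python
from itertools import cycle

# Hypothetical words of agreement; the actual inserted words are not listed
# in the text, so these are placeholders for illustration.
AGREEMENTS = ["I see.", "OK.", "Sounds good."]


def recipe_to_dialogue(steps):
    """Turn ordered recipe steps into prompt / step / agreement turns.

    The order of the steps is preserved, because cooking depends on the
    order and on the result of each process.
    """
    lines = []
    agreements = cycle(AGREEMENTS)
    for index, step in enumerate(steps):
        prompt = "What do we do first?" if index == 0 else "What comes next?"
        lines.append(("Assistant", prompt))
        lines.append(("Chef", step))
        lines.append(("Assistant", next(agreements)))
    return lines


if __name__ == "__main__":
    for speaker, utterance in recipe_to_dialogue(
        ["Make hot water in a pan.", "Peel potatoes and boil them."]
    ):
        print(f"{speaker}: {utterance}")
```

The chef's lines are kept in their elliptical form ("boil them"), leaving the listener to resolve the reference from the preceding turn, in line with the remark on ellipsis above.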

Evaluation
We will conduct tests with participants to evaluate our methodology and verify the effectiveness of our method for transferring knowledge. So far, a small number of participants have reported that the voice of the system is rather easy to listen to; however, objective evaluation remains future work.

Conclusion
We introduced an approach for developing an information-providing system in order to support knowledge acquisition. The system can transfer knowledge to humans even while the person is doing something or is not concentrating on listening to the voice. Our approach does not create a summary of the key points of what is being read out, but focuses on the knowledge transferring method. Specifically, to provide knowledge efficiently, we consider what kinds of conversation are naturally retained in the brain, as such conversations may enable people to obtain knowledge more easily. We aim to construct an intelligent system which can create such conversations by applying natural language processing techniques.