iTech: An Interactive Virtual Assistant for Technical Communication

A manual accompanies almost every product or device. Manuals are usually included with products or services to provide customer assistance and technical information to users. However, as Thimbleby states, “User manuals are the scapegoat of bad system design” (Major, 1985; Thimbleby, 1996). Technical communication is delivered through several mediums, of which manuals are one example. Other mediums range from interactive animation to virtual reality (Hailey, 2004), with each new medium attempting to improve upon the drawbacks of the previous one. The first medium introduced was the paper manual. However, issues with paper manuals have been widely documented, especially by technicians in the armed forces. Problems include lack of portability, inaccuracy, and increasing content and complexity (Ventura, 1988). To address these drawbacks, alternative mediums for technical communication, such as online manuals, emerged.


Literature review

Technical communication
Technical communication refers to the process of delivering technical information to the user. Albing defines it as "…the creation, control, delivery, and maintenance of distributed information across the enterprise and in a network that includes sources and users" (Albing, 1996, p. 67). The effectiveness of a technical document is determined by the following factors (Zachary et al., 2001):
1. Is the analysis of the communication problem complete?
2. Is the goal/task to be explained clearly identified?
3. Is the vocabulary used to explain the goal comprehensive and does it follow conventional guidelines?
These factors are used for evaluating all mediums of technical communication. However, as the need for manuals has grown, little investment has been placed in their development. Paper manuals need frequent updates and therefore become outdated quickly; they can also be hard to understand, inaccessible, erroneous, and difficult to search if the index is not designed properly. Online manuals, on the other hand, provide additional benefits such as decreased search time, smaller documentation, and better search techniques (Barnett, 1998), but they also require increased query pre-processing by either the user or the search engine. The introduction of web-based mediums such as animations and virtual reality has also raised concerns about available technologies, end-user expectations, and user-centered design (Zachary et al., 2001). This suggests that mediums that provide higher degrees of user satisfaction and ease of use may be important to delivering effective technical communication.

Animated agents and interactive assistants
Interactive assistants aim to aid users in managing their environment (Kirste & Rapp, 2001). Because computers are becoming ever more ubiquitous, permeating many aspects of people's daily lives, there is a need for an efficient interface between users and computers. Interactive assistants address this need by providing natural, intuitive, and effective interaction between people and computers (Oviatt et al., 2000). Interactive assistants typically contain multimodal features including speech input and output, gesture and handwriting recognition, and animated agents or avatars. These features provide users with interaction choices that can circumvent personal and/or environmental limitations and that require little or no training. They also have great potential to promote new forms of computing and expand the accessibility of computing to a diverse group of users (Lester et al., 1997).
Interactive assistants have been used in various types of user help systems, including training, education, and marketing [11,12]. Additionally, research has been conducted on how the inclusion of such agents affects users' interactions with the system. Some of the earlier agents were designed in the domain of education. Rosis et al. designed the XDM-Agent, an animated character that aids in illustrating interface objects for software development in user-adapted interfaces and explains which tasks may be performed and

Design
iTech is an interactive virtual assistant that was designed and developed to address the limitations of the current mediums of technical communication, including paper-based and online systems. More specifically, iTech was designed to address the limitations associated with current technical documentation, including understandability, portability, accessibility, accuracy, search time, and the ability to make updates. Although these limitations do not all apply to every medium of technical communication, at least one of them applies to each medium. In the process of designing iTech, several additional limitations associated with automatic speech recognition (ASR) engines were encountered, namely population of the question database and conversational question answering. The design goal was to give iTech the ability to understand natural language queries from a variety of speakers as is, without any additional training. To do so, it was necessary to eliminate the pre-processing step that is associated with many other techniques. In addition, it was desirable for iTech to be able to return an appropriate answer even if the question asked did not appear in the database.
Automatic speech recognition (ASR) engines used in speaker-independent systems have a higher word error rate (WER) than engines that are trained for a specific speaker. The WER can be reduced by limiting the grammar, but natural language questions necessitate a larger grammar to account for the questions that may be asked. To allow for a large grammar, the database used to generate the grammar must be populated with all relevant questions that a user may ask. Each answer must then be mapped to a relevant question.
Additionally, each answer is not restricted to a specific question. Because of this, the database must be populated with a massive amount of data. Furthermore, current techniques for conversational question answering require pre-processing of queries (parts-of-speech tagging, semantic interpretation) before execution, which removes what the authors argue to be relevant information.
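As an illustration of the word error rate mentioned above, the following minimal sketch computes WER as the number of word substitutions, deletions, and insertions needed to turn the recognized text into the reference transcript, divided by the number of reference words. The function name and the sample utterances are hypothetical and are not taken from the iTech implementation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the word-level edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical utterance: one of six reference words was misrecognized.
print(word_error_rate("how do I delete a word", "how do I delete the word"))
```

A recognizer that drops a word is penalized in the same way as one that substitutes a word, which is why a larger grammar, with more confusable alternatives, tends to raise the WER.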
iTech utilizes the Answers First (A1) approach for conversational question answering (Wilson et al., 2010). Unlike many other information retrieval or natural language processing techniques, A1 requires no language processing before the query is executed. Once recognized, the user's query is sent to a server and decomposed into bigrams (word pairs). The bigrams are matched against a repository of questions using a question resolution algorithm, and the question with the highest concentration of matched terms is returned. This process continues, prompting the user as needed, until an appropriate solution is found.
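The bigram matching step can be sketched as follows. This is only an illustration under stated assumptions: "concentration of matched terms" is interpreted here as the fraction of a stored question's word pairs that also occur in the query, and the function names and sample questions are hypothetical. The published A1 algorithm (Wilson et al., 2010) may score matches differently.

```python
import re
from typing import List, Optional, Set, Tuple


def bigrams(text: str) -> Set[Tuple[str, str]]:
    """Split text into lowercase words and return the set of adjacent word pairs."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return set(zip(words, words[1:]))


def best_question(query: str, repository: List[str]) -> Tuple[Optional[str], float]:
    """Return the stored question with the highest share of matched word pairs."""
    query_pairs = bigrams(query)
    best, best_score = None, 0.0
    for question in repository:
        pairs = bigrams(question)
        if not pairs:
            continue
        # Assumed scoring: matched pairs divided by the question's total pairs.
        score = len(query_pairs & pairs) / len(pairs)
        if score > best_score:
            best, best_score = question, score
    return best, best_score


# Hypothetical repository and spoken query.
repository = [
    "how do i delete a word",
    "how do i delete a line",
    "how do i save a file",
]
print(best_question("um how can i delete a line please", repository))
```

Because only adjacent word pairs are compared, filler words such as "um" or "please" lower the number of matches without requiring any tagging or parsing of the query.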
iTech has a client-server architecture, as shown in Figure 4 (iTech's Architecture). The user initiates the conversation with iTech by pressing a button to speak and asking a question. The built-in speech recognition engine, Microsoft English ASR Version 5 Engine, recognizes the user's question and passes the recognized speech to the browser environment of the page where the Speech Application Language Tags (SALT) are hosted (Cisco Systems Inc. et al., 2002). Additional client-side scripts then manipulate the SALT elements. The resulting text of the recognized speech is then sent as a request to the server, which retrieves the relevant answer (Wilson et al., 2010). The retrieved answer is then displayed to the user.
The system works in the following way. A user starts the system by opening the application's browser. Once loaded, iTech welcomes the user and explains his purpose and how to ask a question. The user presses the 'Push 2 Speak' button and asks a question. The browser interacts with the user and identifies the exact content of the question. The question is converted into text and sent to the QRA module (Wilson et al., 2010). The QRA module performs three tasks. First, the user's text is broken into bigrams, or word pairs. Second, the QRA matches the question's terms against a corresponding table of word pairs residing in the KR. Third, the KR finds the question with the highest concentration of terms, and the indexed answer to that question is returned to iTech's interface with a link to the corresponding document. Finally, the answer is displayed for the user.
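The three QRA tasks can be sketched against a small relational store. The sketch below is an assumption-laden illustration, not the system's actual code: it uses Python's built-in sqlite3 module as a stand-in for the MySQL database, the table and column names (questions, word_pairs, w1, w2) and the document links are hypothetical, and questions are ranked by the raw number of matched word pairs.

```python
import sqlite3


def word_pairs(text):
    """Task 1: break the text into adjacent word pairs (bigrams)."""
    words = text.lower().split()
    return sorted(set(zip(words, words[1:])))


def build_kr(entries):
    """Create an in-memory store from (question, answer, link) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE questions "
        "(id INTEGER PRIMARY KEY, question TEXT, answer TEXT, link TEXT)"
    )
    conn.execute("CREATE TABLE word_pairs (question_id INTEGER, w1 TEXT, w2 TEXT)")
    for question, answer, link in entries:
        cur = conn.execute(
            "INSERT INTO questions (question, answer, link) VALUES (?, ?, ?)",
            (question, answer, link),
        )
        conn.executemany(
            "INSERT INTO word_pairs VALUES (?, ?, ?)",
            [(cur.lastrowid, a, b) for a, b in word_pairs(question)],
        )
    return conn


def resolve(conn, recognized_text):
    """Tasks 2 and 3: match pairs against the table and return the top answer."""
    pairs = word_pairs(recognized_text)
    if not pairs:
        return None
    conditions = " OR ".join("(wp.w1 = ? AND wp.w2 = ?)" for _ in pairs)
    params = [word for pair in pairs for word in pair]
    # Count matched pairs per stored question and keep the best-scoring one.
    return conn.execute(
        "SELECT q.answer, q.link, COUNT(*) AS hits "
        "FROM word_pairs wp JOIN questions q ON q.id = wp.question_id "
        "WHERE " + conditions + " "
        "GROUP BY q.id ORDER BY hits DESC, q.id LIMIT 1",
        params,
    ).fetchone()


# Hypothetical entries; the links are placeholders.
kr = build_kr([
    ("how do i delete a line", "Type dd to delete the current line.", "docs/delete-line.pdf"),
    ("how do i save a file", "Type :w to save the file.", "docs/save-file.pdf"),
])
print(resolve(kr, "how can i delete a line"))
```

Storing one row per word pair lets the database do the counting, so the script that receives the recognized text only has to build the query and return the answer together with its document link.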
iTech's interface is multimodal and can be housed on any personal computing device with a microphone or the ability to add a microphone. The microphone is used to collect the user's speech. The graphical user interface (GUI) consists of two frames: the Navigation frame and the Content frame. The Navigation frame consists of an animated agent and the Speech Application Language Tags (SALT). The presence of a likeable animated pedagogical agent has been shown to improve student performance by enhancing the student's desire to learn (Baylor & Ryu, 2003). This desire is increased as the student forges a personal connection with the agent, thereby making the learning experience more enjoyable. However, the agent must possess certain characteristics for this to be effective. The agent must be engaging, person-like, and credible; promoting relationships with the learner requires the presence of these characteristics (Baylor & Ryu, 2003).

iTech is male. This choice was deliberate: findings suggest that male pedagogical agents are perceived as more extraverted and agreeable, resulting in a more satisfying experience for the learner (Baylor & Kim, 2003). iTech's ethnicity was chosen as African-American, based on study results indicating that African-Americans were more inclined than Caucasians to choose an agent of the same ethnicity. The agent was generated using SitePal and embedded into an HTML file (Oddcast Inc., 2008). The SitePal application allows for greater developer control over the appearance of iTech. To enable the agent's perceived participation in conversations, SALT and JavaScript were used.
JavaScript was used to provide text-to-speech (TTS) capabilities to the agent, and SALT was used to enable iTech's hearing. SALT is embedded in a compliant browser and, using Microsoft's recognition engine, allows iTech to listen to users' questions. Once a question is recognized, the question resolution algorithm is applied, and the answer it identifies is retrieved and displayed in the Content frame.
When iTech is loaded for the first time, the Content frame displays the cover of the vi manual (see Figure 2, iTech's welcome screen and welcome instructions). Once interaction begins, the Content frame dynamically displays the solutions retrieved by the question resolution algorithm (QRA). The QRA is initiated by a PHP script that connects to a MySQL database that houses the KR.

Equipment
To test the iTech design, a usability study was conducted to measure performance and user satisfaction. The authors set out to answer three research questions related to search time and user satisfaction. Search time refers to the amount of time the participant spent referencing their assigned medium before the correct solution was found. After finding the correct solution, the user had to read the solution. The task completion time was the amount of time the participant spent referencing their assigned medium before the participant read the correct solution.
User satisfaction refers to the effectiveness, efficiency and user's overall experience with the system.

iTech: Hello, I am iTech ….
iTech: I am here to help you with the vi editor
iTech: When you need assistance. Just push the button to ask me a question
The experiment was set up in a private room furnished with one large table and five chairs. All testing was conducted on a Gateway 2000e CPU running the Windows XP operating system and equipped with a 17" Sony monitor, a standard scroll mouse, and a Logitech USB headset. In addition, Internet Explorer 6.x, the Microsoft Internet Speech Add-in 1.0, and the SecureCRT 4.07 software were installed on the machine. All user interactions were recorded with a Sony 700x Digital Handycam video recorder.

Participants
Seventy-four college-level students were recruited as participants. Institutional Review Board (IRB) approval was granted by Auburn University, and all participants signed informed consent forms before participating in the study. Participants had little or no exposure to the vi editor before participating in the study. The vi editor is a shorthand editor used on Unix and Linux operating systems. Computer programmers often used the vi editor on these systems because it provided an efficient editing tool at the Unix/Linux command line; however, the vi editor has a steep learning curve. The vi editor is not a WYSIWYG (What You See Is What You Get) editor. It requires the user to know several keyboard shortcuts before using it. As such, the vi editor is very difficult to use without proper training. Today, most users of Unix/Linux systems prefer WYSIWYG editors like pico; therefore, very few college students are familiar with vi. Many participants in the study had some experience using either Microsoft Office Word or Corel WordPerfect. To ensure that all participants had similar experience using a personal computer and editing text documents, recruitment was focused on students enrolled in at least one course from the College of Engineering.

Experiment
The usability evaluation was designed as a controlled experiment. To reduce the causal effects of other factors, the following controls were applied:
• All participants sat in the same chair in the same room with the researcher.
• Each participant completed the same task in the same order. The only independent variable that changed was the medium of technical communication.
• Participants were randomly selected to use the book manual, online manual, or iTech.
• Participants who were assigned the iTech medium used the Logitech USB headset.
• The delay time before starting the survey was the same for each participant. The pre-experiment survey was started when the participant arrived in the experiment room. The post-experiment survey was started immediately after the participant finished his or her task.
• Participants were asked not to discuss the experiment with others to ensure that all participants had equal knowledge of the experiment.
The experiment was conducted for three different mediums. Medium I was a book manual entitled Learning the vi Editor, published by O'Reilly. Medium II was the combination of a search engine and the electronic PDF version of the book manual used for Medium I. This combination was used to ensure that each medium being tested had the same content. To generate this medium, each section of the PDF was separated and saved as an individual file. Once the electronic manual was decomposed into individual sections, Google Desktop (Google Inc., 2009) was installed on the experiment computer. The preferences for Google Desktop were set to search a specific folder on the experiment computer's hard drive. This again ensured that the content of Medium II was the same as that of Mediums I and III. Medium II could be accessed through a floating desktop bar that was positioned in the top right corner of the monitor. When a participant entered a search query, a list of all relevant documents was returned. Medium III was iTech. iTech was populated with information from the book manual. The answers indexed in iTech were the same electronic copies of the individual sections of the manual used in the online medium (Medium II). Consistency in content was maintained across all three mediums to reduce the probability that any difference in search and/or task completion times was due to a variable other than the medium.
Twenty participants were assigned Medium I, twenty-four participants Medium II, and thirty participants Medium III at random. At the beginning of each experiment each participant was asked to fill out a pre-experiment questionnaire. Participants were then given an information sheet explaining the experiment and an instruction sheet that included tasks that the participants were asked to complete. Participants were assigned a medium and it was explained to the participant that they would be using the medium in completing the task. If the participant was assigned Medium I, they were given the book Learning the vi Editor. If the participant was assigned Medium II, the participant was directed to the floating desktop bar in the right hand corner of the screen and was instructed that he or she would be using a search engine linked to an online manual to assist in completing the task. If Medium III was assigned, the participant was instructed to put on and adjust the Logitech headset and iTech was launched. The participant was then directed to the SecureCRT terminal containing a file named example.txt to be edited. The participant was informed that they would be accessing the vi editor and the file from the current terminal. Lastly, the video recorder was started and the participant began his or her task.
The tasks were selected from the Exploring Microsoft Office 2003 textbook (Grauer, 2003).
Participants were asked to figure out how to open the specified file and edit it. Editing included deleting individual words, changing words, changing characters, deleting and inserting sentences, and deleting and inserting paragraphs. When the participants completed the assigned tasks, they were then asked to fill out a post-experiment survey.

Data collection
During the course of the experiment, several approaches were used to collect data, including video recordings and surveys. Table 1 (Experimental Instruments and Measures) provides an overview of the experimental measures and instruments used. Pre-experiment surveys were used to gather demographic information about participants and to determine whether they met the criteria established for classification as a vi editor novice. In addition, questions were asked about the participant's familiarity with computers, such as how long they had used a computer, how often they use a computer, their computer programming experience, and their experience with specific software applications like word processors.
Performance data was collected using a video camera. Recordings were used to measure search, reading, and task completion times. Characteristics of spoken queries such as the average number of spoken queries per search per user, the number of recognition errors, and the total number of spoken terms per query were also derived from the participants' utterances. Informal and formal user observations were also employed to gather performance data.
Post-experiment questionnaires were used to gather user satisfaction data. Two post-experiment questionnaires were designed for the experiment. One was administered to participants who used iTech and the other was administered to all other participants. Part I of the questionnaire was identical in both versions. It gathered overall participant ratings using six bi-polar rating scales. Part II of the questionnaire included a series of Likert-like scales on which participants were asked to rate their reactions to the system. This part of the questionnaire included statements concerning the medium's ease of use and intuitiveness.

Results
For the purposes of this paper, the authors focus on results related to search time, task completion time, and user satisfaction. The JMP statistical software package was used to analyze the data collected for each of these measures (SAS Institute, 1984).

Participant analysis
An analysis of participant data shows that participants' ages ranged from 18 to 27 years with a mean age of 20 years (see Table 2). Of these, 71% were male and 29% were female. The participants' average number of years of computer use was 12 and the minimum was 8. Therefore, the majority of the participants were comfortable using a computer.


Performance results (search time and task completion)
To determine if iTech provided an improvement in search time compared to the book and online mediums, an analysis of the distribution of search times for each medium was performed. Additionally, the Shapiro-Wilk test for normality was used for its resilience to the outliers present in the data. The Kruskal-Wallis nonparametric analysis of variance provides a method for coping with data that contain extreme outliers and that come from more than two independent groups. It does this by replacing the observation values with their ranks in the combined sample and applying a one-way analysis of variance F-test to the rank-transformed data (NIST, 2003). The test provided strong evidence to reject the null hypothesis that the mean search times were equal, so the differences in search time are statistically significant. The Kruskal-Wallis test allows for the comparison of three or more unpaired groups; however, it does not allow for deductions about specific pairs of means. The resulting p-value, which is very small, indicates that the difference in the group means is not a coincidence. However, this does not mean that every group differs from every other group. The Kruskal-Wallis test only determines that at least one group differs from one of the others. Thus, a post-test was applied to determine which groups differed from the other groups. The Tukey-Kramer test analyzes data of unequal sample sizes and determines whether the differences between all existing pairs are due to coincidence (NIST, 2003). The results of the Tukey-Kramer test provided very strong evidence that the differences in the pairs of means were statistically significant (see Table 4). The positive values between each pair of means indicate that the means are significantly different.
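To make the rank-based idea behind the Kruskal-Wallis test concrete, the sketch below computes the standard H statistic with a tie correction in plain Python. It is offered only as an illustration: it is not the JMP procedure used in the study, it uses the common H formulation rather than an F-test on ranks, and the search times are invented values for three hypothetical groups.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Compute the tie-corrected Kruskal-Wallis H statistic."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum squared / group size) over the groups.
    weighted, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over groups of tied values.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    ties = sum(t ** 3 - t for t in counts.values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# Invented search times in seconds for three hypothetical mediums.
book = [95, 120, 150, 410]
online = [180, 200, 260, 520]
assistant = [40, 55, 70, 90]
print(kruskal_wallis_h(book, online, assistant))
```

With three groups, H is compared against a chi-squared distribution with two degrees of freedom, whose 5% critical value is 5.991; a larger H leads to rejecting the hypothesis that all groups share the same distribution. Because only ranks enter the formula, an extreme value such as 520 counts no more than being the largest observation.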
Thus, there is sufficient evidence to deduce that the independent variable of medium type had a statistically significant effect on the search times, with the search times for the iTech medium being the shortest. Next, an analysis was performed to determine the effects of search time on task completion time. The tests applied to the search times were also applied to the task completion times. The mean task completion times are displayed in Table 5 and the normality spreads are shown in Figure 4. [...] understanding the text; therefore, although iTech found solutions faster, participants spent longer reading the solutions than with the Book medium. It should be noted that the solutions were all identical regardless of the medium used. Also, a significant proportion of participants did not read the solution carefully and as a result either had to return to the solution several times or implemented an incorrect action that led them further away from the correct action. These results suggest that improvements in the content and understandability of technical communications may add to the improvements in search time provided by the iTech medium. Lastly, performance analysis of task success showed that nearly all participants were able to complete the assigned tasks. Task success was determined by comparing the file updated by each participant to a correct version of the updated file. Of all participants, 95% successfully completed the task using one of the three mediums provided.

User satisfaction
To get a better idea of users' reactions to iTech compared to the book and online mediums, a post-experiment questionnaire was used to collect data using two rating scales. The first was a five-point bi-polar scale that presented several qualities that might influence usability. The rating means are shown in Table 7 (Bi-polar Rating Scales assessing General Usability). For each of these scales, a higher rating indicates a number closer to the positive side, with the exception of the anchor usable to not usable, for which a higher rating indicates a number closer to the negative side. A quick review suggests that the participants' reactions to iTech were generally more favorable than their reactions to the other two mediums. However, the means alone do not provide a complete picture of the users' evaluations. For example, although iTech's mean rating on the Terrible-Wonderful anchor is lower than that of the Online medium, iTech received 19 ratings at levels 4-5 while the Online medium received only 16. Therefore, an analysis of the entire distribution for each rating was conducted. Three scales were used to examine the usability of iTech: 1) terrible-wonderful, 2) dull-stimulating, and 3) boring-fun. The five-point rating inherently assigns the score of 3 a neutral rating, with scores 1 and 2 being negative and scores 4 and 5 positive. For the book medium, 33.33% of the participants rated that medium with a score of 4 or higher on the terrible to wonderful scale. The online medium received 53.33% and iTech 65.52% for the same score values (see Figure 6, Terrible Wonderful Distributions).

The second set of rating scales consisted of items designed to examine reactions to specific aspects of the participants' interaction experience. Each of these scales contained an assertion, e.g. 'The medium was easy to use', to which the participants responded using a five-point scale.
This scale contained the following ratings: Strongly Agree, Agree, Neutral, Disagree and Strongly Disagree. We assigned each rating a weight. This weight was used for statistical analysis. The Strongly Agree was assigned a rating of 5, Agree a rating of 4, Neutral a rating of 3, Disagree a rating of 2, and Strongly Disagree a rating of 1.
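The weighting described above can be applied mechanically, as in the following small sketch. The weights are the ones stated in the text; the function name and the sample responses are hypothetical and do not come from the study data. The sketch reports both the mean weight and the share of responses scored 4 or higher, which is the measure used when comparing distributions across mediums.

```python
# Weights as assigned in the text.
WEIGHTS = {
    "Strongly Agree": 5,
    "Agree": 4,
    "Neutral": 3,
    "Disagree": 2,
    "Strongly Disagree": 1,
}


def summarize(responses):
    """Return (mean weight, percentage of responses weighted 4 or higher)."""
    scores = [WEIGHTS[r] for r in responses]
    mean = sum(scores) / len(scores)
    share_positive = 100.0 * sum(s >= 4 for s in scores) / len(scores)
    return mean, share_positive


# Hypothetical responses to one statement from five participants.
sample = ["Strongly Agree", "Agree", "Agree", "Neutral", "Disagree"]
print(summarize(sample))
```

Reporting the share of positive scores alongside the mean helps when two mediums have similar means but differently shaped distributions.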

Version I of the post-experiment survey contained 10 Likert-like ratings and Version II of our post-experiment survey contained 22 ratings. The first 9 ratings for each questionnaire were identical and as a result were compared across all three mediums.

Fig. 8. Boring-Fun Bi-polar Distribution
The first property analyzed was the affordance of the mediums. This property was derived from the question, "It was easy to get started". Results show that iTech received a score of 4 or higher from 60.7% of the participants, while the book and online mediums received 46.67% and 30.0% respectively (See Figure 9. Affordance Distributions). This data is in agreement with the trends found in the mediums' search times. The online medium had the worst average search time with iTech having the best, suggesting that an application's affordance is an important feature of the application's success.

Fig. 9. Affordance Distributions
The scores for "understanding document updates" were over 80.0% for all mediums, suggesting that the selected tasks needed little training to get started. The results for the property of 'ease of use' reflect the problems with speech recognition accuracy, which were caused by heavy southern accents and incorrect usage of the recording box. Consequently, though the range of the medium averages is small, the scores for the iTech medium are the lowest in response to the statement, "It was easy retrieving an answer". The shares of scores of 4 or higher were as follows: book medium, 63.3%; online medium, 50.0%; and iTech medium, 48.27%. In spite of the recognition accuracy issues, iTech received the highest ratings with respect to knowing how to use the medium (see Figure 10, Getting Started Distributions). Next, we analyzed the users' reactions to the iTech medium. Before beginning this analysis, statements unique to iTech were placed into one of six possible categories. These categories represent the six factors used to investigate users' attitudes towards speech systems (Hone, 2003). Results are shown in Table 8.
Participants liked the appearance of iTech and results suggest that they would reuse the application. Participants were able to understand iTech and thought that the application retrieved their answers in an expedient fashion. In addition, they agreed that computer novices would be able to use the application. The high user satisfaction ratings were solidified by additional comments.
"Worked greater than expectations based on previous speech help programs…" "Pretty easy to use. User friendly" "I really enjoyed iTech …the layout and technology used was great" "It was overall very helpful and would be useful for people whom are computer literate".

Discussion and implications
Overall, the results suggest that iTech is a viable technology for use in the area of technical communication. Our research has shown that an animated agent that lets users speak questions and returns an appropriate solution decreases search time and task completion time and improves overall user satisfaction compared to both book manuals and online searchable manuals. Therefore, such systems may provide some improvement over traditional technical communication mediums such as books and online search systems. More generally, the fact that iTech is a speaker-independent system that employs conversational question answering techniques suggests additional advantages. Because iTech is speaker independent, there is no need for training. Additionally, because iTech allows users to ask spoken questions, query pre-processing time is eliminated.
Another advantage of iTech-like systems is that they improve on many of the limitations of online and paper manuals, including portability and frequent updates. iTech is computer-based, so there is no need for a bulky paper manual; only a computer, cell phone, or other Internet-connected device is needed. Also, iTech transfers the search task from the user to the computer, removing the need for users to understand indexes. Because iTech's content resides in a database on a server, making frequent updates is less time consuming and does not require reprinting and shipping manuals to users. Users will therefore have access to the most recent version of the manual at all times. The study results present a new opportunity for professional communicators to incorporate the best of these two mediums, search engines and manuals. Furthermore, iTech has the potential to change the way technical information is communicated across numerous domains. For example, automobile manuals have grown significantly in size. At the same time, these manuals have found their way online in the form of Adobe Acrobat documents that can be easily searched by drivers. iTech has the potential to be integrated within the vehicle to provide instant access to manual information using the driver's or passenger's voice. Another compelling domain is the military. When military personnel travel, they carry a great deal of equipment, including a laptop and manuals. iTech has the potential to consolidate their manuals into a single laptop with a natural language interface, e.g. typed or spoken text.
The need for effective technical communication to provide user assistance will continue to be an issue of importance as long as new products and devices are introduced into society. As these devices become more complex, so will the documentation accompanying them. Therefore, usability and user satisfaction will continue to be important factors in creating documentation that is, among other things, easy to search, easy to understand, and easy to use. iTech addresses the limitations of paper and online manuals by providing technical communication through a personable virtual interactive technical assistant.

Conclusions
The results of this study show that iTech yielded faster search times than its paper and online counterparts. iTech also had favorable usability results. Overall, the use of iTech was favorable and provided evidence that such a tool would be a viable option for providing technical assistance. Positive user comments show that users of iTech were satisfied with their experience with the tool. Finally, this research suggests that the application of interactive virtual assistants to technical communication is a viable research area for increasing usability and user satisfaction.

How to reference
In order to correctly reference this scholarly work, feel free to copy and paste the following: Dale-Marie Wilson, Aqueasha M. Martin and Juan E. Gilbert (2012). iTech: An Interactive Virtual Assistant for Technical Communication, Management of Technological Innovation in Developing and Developed Countries, Dr. HongYi Sun (Ed.), ISBN: 978-953-51-0365-3, InTech, Available from: http://www.intechopen.com/books/management-of-technological-innovation-in-developing-and-developedcountries/itech-an-interactive-virtual-assistant-for-technical-communication