To provide personalized services to web users, the first step is to identify, among all the web pages a user has visited, those that contain interesting and useful content. Many researchers have attempted this task using various usage logs, such as viewing time, scrolling, bookmarking, saving, and printing, that can be collected unobtrusively while users visit web pages. Consequently, finding usage logs that indicate users’ interest level efficiently has been an important research goal. However, finding logs that serve as good implicit interest indicators remains challenging. In other words, we need an efficient and effective method to elicit users’ interest implicitly in current web environments. In addition, although many researchers have focused on implicitly eliciting users’ interest level for the contents of web pages, many other factors that may lead users to interact with web pages have also been investigated [Kelly, 2004; Fogg et al., 2003; Wathen & Burkell, 2002; Chi et al., 2001; Kim & Allen, 2002; Kellar et al., 2007; Rieh, 2002]. For example, users may interact more with web pages that contain difficult content, or with web pages that have more complex layout structures, irrespective of their interest level. Therefore, these other influential factors should also be investigated carefully. As figure 1 shows, both the factors and the usage logs must be analyzed to understand web usage patterns. We aim to find influential factors that a computer system can identify while users browse the web.
To identify these factors correctly, it is also important to develop a tool that assesses users’ feedback about a web page. Such tools may run on the browser side or on the server side, but we are more interested in browser-side tools because they do not restrict information sources to a specific website and because they can derive timely information from dynamic web sites whose contents may change frequently. One important requirement of a good browser-side tool is that it should preserve the web browsing environment as much as possible, which in turn means that it should not depend too much on the implementation details of diverse web browsers. We hypothesized that such a tool for monitoring users’ interests could be built simply on the number of Windows GUI messages processed while users read a web page.
From this perspective, we designed several experiments to see whether the influence of these factors can be inferred efficiently at the browser side from Windows GUI messages. We also built a software module, called the Browser Monitoring Module (BMM), which runs behind Internet Explorer and counts Windows GUI messages. Our experimental results showed that the amount of message traffic, though it may sound simplistic, is indeed an effective indicator of users’ interest in a web page and of some other influential factors. Therefore, we concluded that the BMM can be used in our future designs of various user-adaptive services, for example, personalized web browsers, personalized search engines, recommendation systems, web usage mining, and so on.
2. Related work
Many studies have measured web users’ interest implicitly. Server-side analyses have shown good performance and have been successfully applied to consumer analyses of commercial web sites. Users’ interest can be analyzed more easily at the server side because relevant information can be found in log files maintained by the server. For example, users’ login/logout times, the web pages they have visited, their IP addresses, and so on can be obtained from the log file. However, server-side analysis has critical limitations: only users of a specific site can be analyzed, and the contents of a single server are not sufficient to construct a general user model. Through a browser-side analysis, on the other hand, users’ interest can be analyzed across various sites and a user model can be constructed using a wealth of information. Therefore, many researchers have focused on browser-side analyses. However, this form of analysis also brings certain challenges, largely because there are no standardized methods to determine which user activities are relevant to users’ interests. Finding relevant activities is important because explicit user feedback methods (e.g., think-aloud protocols or post interviews) cannot be applied in natural web browsing environments.
To predict users’ interest implicitly at the browser side, a modified web browser was built [Claypool et al., 2001]. This browser monitors the number of mouse clicks, mouse movement, the amount of scrolling, and the elapsed time on a page. With this method, it was found that a user’s interest in the contents of a web page is correlated not with a single activity but with a combination of several activities. To measure the amount of scrolling, however, the authors only counted the number of mouse clicks on the scrollbars and measured the duration of scrollbar usage, whereas users may also scroll with the up/down keys or a mouse wheel. In [Goecks & Shavlik, 2000], the authors measured the number of command-state changes to detect scroll activity and assessed status-bar-text changes for mouse activity. Such changes may result from these activities, but they may also be caused by other activities. In [Hijikata, 2004], the authors collected the results of several activities for analyzing user behaviors, but the methods of detecting the activities were not described in detail. In [Reeder et al., 2000], the authors built Weblogger, a tool to detect several activities in Internet Explorer. However, detecting all events from a browser is not an easy task. In [Kelly & Belkin, 2004], the length of time a user views a document in his/her web browser was investigated as implicit feedback, and the conclusion was that there is no significant relationship between display time and document preference. However, that study addressed display time only. In [Seo & Zhang, 2000], the authors used bookmarking as a relevant activity, but this approach is inadequate for dynamic websites whose contents may change frequently.
In numerous studies, the activities during task-oriented web browsing, such as using search engines, information seeking, and problem solving, have been analyzed. For example, in [Badi et al., 2006], the authors analyzed various user activities and document properties, but their focus was limited to a single task: organizing links to class material as a high school teacher would. However, users browse the web not only to search for important information but also for entertainment or distraction. In other words, one of the purposes of web browsing is simply to seek enjoyment.
Moreover, other factors that may influence users’ activities have been studied. These factors fall into three categories: content attributes, user characteristics, and context. Among content attributes, the familiarity of topics has been discussed in [Kelly, 2004], and information credibility has been considered, with important factors of credibility suggested, in [Fogg et al., 2003; Wathen & Burkell, 2002]. Information scent [Chi et al., 2001] has been understood as an influential factor on user activities, and cognitive authority and information quality have also been suggested [Rieh, 2002]. Structural complexity and reading pattern are still under debate. For user characteristics, cognitive style and problem-solving style have been studied [Kim & Allen, 2002]. The context factor that has been discussed most is the user’s task at hand [Kellar et al., 2007]. Finally, in [Kelly & Belkin, 2004], the relationships between display time and various factors (task, topic, usefulness, endurance, frequency, stage, persistence, familiarity, and retention) were investigated. However, although many factors may influence users’ activities, previous research has mostly focused on interest.
3. Our approach
We analyzed several methods that have been proposed thus far in order to identify some requirements. First, web user analyses should be conducted at the browser side and in real time. Second, it is necessary to find simple but effective methods for detecting user activities that can serve as a measure of interest and of the influence of other factors, with minimal unnatural change to the web-browsing environment. Finally, the proposed method should be evaluated with natural tasks in which users read web pages without any specific goal in mind. To meet these requirements, BMM detects Windows GUI messages while users are reading web pages, which makes it possible to measure user activities in real time without interrupting the users. We also evaluated the proposed method in a natural web browsing environment in which users could read web pages on various topics in any manner they wished.
3.1 Simple log data at a browser side
To analyze users’ implicit interest at the browser side, we have to monitor several usage logs, for example, the viewing time, scroll movement, sequences of visited URLs, keyboard typing, and so on. In our research, we chose several usage logs to record while users view different Web pages. The viewing time, which has been the main subject of related research so far, is the time during which users remain on a particular web page. The mouse wheel log is the number of WM_MOUSEWHEEL messages. For mouse and scrollbar movement, we measured the distance between two consecutive positions of the mouse cursor and of the scrollbar at regular intervals and summed the distances. We also counted the number of processed WM_PAINT messages, as WM_PAINT messages are processed when users change the size of their browser window, scroll within the window, move their mouse cursor, and so on. The number of mouse clicks and the amount of keyboard typing were also considered. We believe that these activities are good indicators of user interest in the contents of Web pages. We chose these logs because they can be measured without much effort. However, for scroll movement, we were unable to obtain the position of the scrollbar on some Web pages, and the WM_PAINT messages can be affected by the dynamic content of certain Web pages. This means that we have to be careful when using these data as logs for measuring user activities. We did not record some of the behaviors that have been considered in other studies (bookmarking, saving, printing, and copying and pasting) because users do not always show those behaviors on every valuable Web page, and hence their records do not suit our purpose. We also collected some physical data, the scroll height and the file size, of each visited Web page.
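The mouse and scrollbar movement measure described above (summing the distances between consecutive sampled positions) can be illustrated with a minimal Python sketch. The sample values, the variable names, and the function name `total_displacement` are hypothetical and only serve to show the arithmetic; the sampling interval and coordinate units used by BMM are not specified here.

```python
import math

def total_displacement(samples):
    """Sum the Euclidean distances between consecutive sampled positions."""
    return sum(math.dist(a, b) for a, b in zip(samples, samples[1:]))

# Hypothetical cursor positions (x, y) in pixels, sampled at a fixed interval.
cursor_samples = [(0, 0), (3, 4), (3, 4), (6, 8)]
# Hypothetical vertical scrollbar offsets, treated as one-dimensional points.
scroll_samples = [(0,), (120,), (80,)]

mouse_distance = total_displacement(cursor_samples)   # 5 + 0 + 5
scroll_distance = total_displacement(scroll_samples)  # 120 + 40
```

Note that a sum of this kind counts back-and-forth motion in full, so scrolling down and then up again adds to the total rather than cancelling out.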
3.2 Various subjective feedbacks
Several factors make users interact with web pages. For example, a user may stay for a relatively long time on a specific web page because its contents are interesting or because the user finds the contents more difficult than others. Sometimes a user may roll the mouse wheel more frequently on one Web page than on others because it is not easy for him/her to find the necessary information on that page. Therefore, we selected some factors that may exert an influence on user interactions with web pages: interest, credibility, complexity, and difficulty. Because these factors are inherently subjective and cannot be measured with usage logs alone, we collected various types of feedback regarding the current context directly from users.
3.3 Data logging software – Browser monitoring module (BMM)
In some previous studies, custom-built browsers have been used [Kellar et al., 2007], as has specialized logging software that works “in stealth mode” [Kelly & Belkin, 2004]. Custom-built browsers have several merits, since various data can be collected easily. Nevertheless, we developed a browser-monitoring module (BMM) that runs behind Internet Explorer without any modification to the browser, because we wanted to preserve the natural state of the Web browsing environment as much as possible.
BMM is monitoring software developed to detect Windows GUI messages while users read Web pages, which makes it possible to measure user activities in real time without interrupting the users. BMM uses a global hooker library, written in C++, which runs in the background and hooks all Windows operating system events. Using the Windows Shell API, BMM can also collect all instances of currently running Internet Explorer windows through the COM object, and the necessary properties of Web pages can be obtained from the same COM object. BMM itself is written in C# and runs on a Windows platform with .NET Framework 2.0.
BMM consists of four components: the hooker, the data recorder, the data aggregator, and the feedback window. The data to hook are the number of keys pressed, events of program focus changes, the number of WM_PAINT events, mouse click and mouse wheel messages, and so on. The hooker catches every message passed within the operating system, so we have to filter out irrelevant messages and record only the data necessary for our studies. For instance, because a WM_PAINT message is invoked whenever the O/S needs to redraw some part of a window, we have to ignore the messages from unfocused windows and count only the messages invoked for the currently focused browser window. The aggregator can acquire several properties of web pages by using the Document Object Model (DOM). The acquired properties are the viewing size of a document (in pixels), the file size (in bytes), the current location of the scrollbar, and the character set of the page. The location of the scrollbar is updated periodically so that the total displacement of the scrollbar can be estimated. However, a critical issue arises with several 'fancy' Web pages whose structures differ from standard Web documents, so that accessing the DOM property yields no data. The data aggregator also aggregates all data from these components, and the data recorder stores the aggregated data in a human-readable XML format for future analysis. After Web searching, users can use the feedback window to review the visited Web pages and choose radio buttons that record several types of assessment of the contents of each Web page. If users do not want to answer questions about some of the Web pages, they can simply remove those records. The structure of the BMM is shown in figure 2.
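The filtering step described above (counting only messages addressed to the currently focused browser window) can be sketched in Python as follows. This is an illustrative simulation, not the BMM implementation, which relies on a C++ hook library and C#. The message identifiers are the standard Win32 values; the event stream, the window handles, and the function name `count_focused_messages` are hypothetical.

```python
from collections import Counter

# Standard Windows message identifiers (values from the Win32 headers).
WM_PAINT = 0x000F
WM_KEYDOWN = 0x0100
WM_MOUSEMOVE = 0x0200
WM_LBUTTONDOWN = 0x0201
WM_MOUSEWHEEL = 0x020A

# Messages of interest and the (hypothetical) log name used for each.
TRACKED = {
    WM_PAINT: "paint",
    WM_KEYDOWN: "key",
    WM_MOUSEMOVE: "mouse_move",
    WM_LBUTTONDOWN: "mouse_click",
    WM_MOUSEWHEEL: "mouse_wheel",
}

def count_focused_messages(events, focused_hwnd):
    """Count tracked messages addressed to the focused browser window only."""
    counts = Counter()
    for hwnd, message in events:
        if hwnd == focused_hwnd and message in TRACKED:
            counts[TRACKED[message]] += 1
    return counts

# Hypothetical stream of (window handle, message id) pairs seen by a hook.
BROWSER, OTHER = 0x1A2B, 0x3C4D
events = [
    (BROWSER, WM_PAINT), (OTHER, WM_PAINT), (BROWSER, WM_MOUSEWHEEL),
    (BROWSER, WM_PAINT), (BROWSER, 0x0113), (OTHER, WM_MOUSEMOVE),
]
counts = count_focused_messages(events, BROWSER)
```

In this toy stream, the WM_PAINT message sent to the unfocused window and the untracked message `0x0113` (WM_TIMER) are both ignored.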
4. Experiment 1
In the first experiment, we analyzed viewing time and 3 GUI messages - WM_PAINT, WM_MOUSEWHEEL, and WM_MOUSEMOVE - and formulated the following hypotheses.
1. The number of processed GUI messages is relatively higher on web pages that contain interesting contents.
2. The amount of information in a web page affects the number of processed GUI messages.
Based on these hypotheses, we conducted experiments to verify the positive relationship between the number of processed GUI messages and users’ interest in the content. First, we collected 120 text-based web pages of varying content size offering information on various topics: Politics, Economics, Education, Engineering, Entertainment, Science, Health, and Sports. 25 subjects read each page in whatever manner they wished. To obtain appropriate data, the subjects were not told that some activities would be measured while they read the web pages. During the experiments, user activities while reading a web page and some measurable data were recorded in a log file for future analysis. In addition, whenever a subject finished reading a web page, a small window appeared in which the subject recorded his/her interest and preference level for the contents of the page on a scale of 5 levels. Because of some malfunctions of the BMM in the users’ browsing environments and failures to obtain user feedback properly, 5 users’ log files were excluded. We therefore analyzed 20 users’ log files.
Figure 3 shows an example of a log record. BMM records several data items: the visited URL, the number of typed keys, the number of GUI messages, the file size of the web page, the viewing time, and the feedback levels obtained from the user. Each element represents the log data of one web page that the user visited. Among these data, the number of GUI messages, the file size of the web page, the viewing time, and the user feedback were analyzed in this experiment.
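To make the shape of such a record concrete, the following Python sketch serializes one page visit to XML. The element names, attribute names, and values are all assumptions chosen for illustration; the actual schema used by BMM is the one shown in figure 3 and may differ.

```python
import xml.etree.ElementTree as ET

def page_record_to_xml(record):
    """Build one <page> element from a dict of logged values (hypothetical schema)."""
    page = ET.Element("page", url=record["url"])
    for name in ("typed_keys", "wm_paint", "wm_mousewheel", "wm_mousemove",
                 "file_size_bytes", "viewing_time_sec"):
        ET.SubElement(page, name).text = str(record[name])
    feedback = ET.SubElement(page, "feedback")
    for factor, level in record["feedback"].items():
        ET.SubElement(feedback, factor).text = str(level)
    return page

# Invented values for a single visited page.
record = {
    "url": "http://example.com/article",
    "typed_keys": 0,
    "wm_paint": 42,
    "wm_mousewheel": 17,
    "wm_mousemove": 230,
    "file_size_bytes": 15360,
    "viewing_time_sec": 95,
    "feedback": {"interest": 4},
}
xml_text = ET.tostring(page_record_to_xml(record), encoding="unicode")
```

Keeping one element per visited page makes it straightforward to drop a page later, which matches the option given to users of removing records in the feedback window.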
The main objective of the experiment was to determine whether there is a positive relationship between the number of processed GUI messages, normalized by the information size, and the users’ interest level. We measured the amount of users’ interaction on a web page as follows.
$$M_i = \frac{m_i}{S_i}, \qquad T_i = \frac{t_i}{S_i} \qquad (i:\ \text{index of web page}) \tag{1}$$
In Eq. (1), $M_i$ is the normalized value of the number of processed GUI messages $m_i$ on the $i$-th web page, $T_i$ is the normalized value of the elapsed time $t_i$ on the $i$-th web page, and $S_i$ is the file size of the $i$-th web page (amount of information). Because the absolute number of each user’s processed messages varies according to the user’s habits, we normalized each user’s $M_i$ and $T_i$ using min-max normalization and took the average of all users’ $M_i$ and $T_i$ for each interest level. From figure 4 and table 1, it is observed that the users caused the browser to process more GUI messages for interesting web pages. For example, the value of $M_i$ at interest level 5 is much higher than that at level 1 in all cases. We verified that the results are statistically significant on the basis of a one-way ANOVA test (p-value < 0.01). The pattern is most distinct for the WM_PAINT message, for which the value of $M_i$ increases with the interest level. We also observed a strong positive relationship between the viewing time and the interest level, but there remains debate as to whether viewing time can be used as a measure of interest [Kelly & Belkin, 2004].
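The normalization procedure described above can be sketched in Python as follows. The sketch assumes that message counts are first divided by file size, then min-max normalized within each user, and finally averaged per interest level across users; this order of operations, the function names, and all data values are assumptions made for illustration and are not experimental results.

```python
from collections import defaultdict

def min_max(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def mean_activity_by_interest(logs):
    """logs maps a user to a list of (message_count, file_size_bytes, interest_level)."""
    per_level = defaultdict(list)
    for pages in logs.values():
        # Normalize by information size, then rescale within this user.
        per_size = [count / size for count, size, _ in pages]
        scaled = min_max(per_size)
        for value, (_, _, level) in zip(scaled, pages):
            per_level[level].append(value)
    return {level: sum(v) / len(v) for level, v in sorted(per_level.items())}

# Invented logs for two users with very different absolute activity.
logs = {
    "u1": [(100, 1000, 1), (300, 1000, 3), (500, 1000, 5)],
    "u2": [(20, 2000, 1), (60, 2000, 5)],
}
means = mean_activity_by_interest(logs)
```

Rescaling within each user is what lets a habitually active user and a habitually passive one contribute comparably to the per-level averages.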
4.3 Conclusion of experiment 1
In this experiment, we used the number of processed GUI messages to predict users’ interest, as the messages are processed whenever users perform certain activities while reading web pages. It was found that the proposed method is simple and easy to develop while still being adequately effective. The results of our experiments showed that if a user engages in more activities that make the system process more GUI messages while reading a web page, even in the event that the page offers relatively little information, it can be inferred that the page contains interesting or preferable content. This provides an important guideline to follow, because finding preferable web pages is the first step of user modeling procedures and personalization services.
In this work, we considered only text-based web pages for ease of defining the amount of information on a web page. However, web pages contain an abundance of multimedia objects such as pictures and videos. Subsequent experiments should consider the use of a well-defined measure for the amount of information in such web pages. Also, because users’ interaction is believed to be influenced by the layout of a web page, this aspect will also be taken into consideration in later works.
5. Additional experiment
5.1 Experiment 2
In the second experiment, we collected 160 web pages of various content sizes offering information on various topics (Politics, Economics, Education, Engineering, Entertainment, Science, Health, and Sports), in which text, images, tables, and videos are all presented naturally with various layouts. 20 graduate students read each page in whatever manner they wished. We gave the subjects only a list of numbers to click, without showing any information about the contents of the web pages in advance, because we wanted to exclude any effect of information scent [Chi et al., 2001]. To obtain appropriate data, the subjects were not told that some of their activities would be recorded while they were viewing the web pages. The subjects’ activities and some necessary data were recorded in a log file for future analysis. After browsing all the web pages, the subjects were instructed to review the visited web pages and to rate each page for interest, difficulty, complexity, and credibility on 5-point scales through the feedback window.
5.2 Experiment 3
This experiment was an extension of the previous experiments and was conducted in a more natural environment in which the subjects could carry out their web searching tasks at their own pace. The usage logs and feedback recorded were the same as in the first experiment. We gave the subjects two tasks to perform. The first task was a kind of information gathering that requires accuracy, trust, efficiency, and responsibility for the search results. The subjects had to find laboratories in universities or companies that research topics similar to the subjects’ own research topics, and to read each page carefully to judge the relevance of the information. We encouraged the subjects to perform this task as normally as possible. The second task was a kind of information gathering and browsing that can be performed without any burden or responsibility for the search results. For example, the subjects could search for information about their hobbies, favorite products to buy, famous tourist spots, favorite sports or movie stars, and so on. We also encouraged the subjects to perform this task as normally as possible. Unlike the first and second experiments, in which the subjects could visit only the collected Web pages and had no advance clues about their contents, in this experiment the subjects could visit any Web page and use any search engine or portal site they wanted. As a result, we observed many re-visitation patterns. During the feedback phase, we therefore let the subjects delete the logs of Web pages that they had used only to find other Web pages to visit. In this way, we excluded the navigational Web pages [Fu et al., 2001].
5.3 Result of experiment 2
We had expected some differences between the result patterns of the first and second experiments because the types of web pages are quite different. However, there were no large differences between the results. Figure 5 and table 2 show that, as in the first experiment, there were positive correlations between the amounts of all usage logs and the interest levels. We also found significant differences in the amounts of usage logs among the interest levels. This suggests that the type of web page is not an important factor. In contrast to the interest levels, the difficulty and complexity levels showed negative correlations with the amounts of usage logs. The credibility levels showed no strong correlation with the amounts of usage logs. From these results, we concluded that the interest level has the most significant influence on the amount of usage logs, and that users are inclined to leave web pages with difficult contents or complex structures quickly, without much interaction.
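The sign of such a relationship can be checked with a correlation coefficient. The sketch below computes Pearson's r between feedback levels and the mean amount of a usage log at each level. The choice of Pearson's r is an assumption, since the text does not state which correlation measure was used, and the numbers are invented to show one increasing and one decreasing pattern; they are not the values in table 2.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

levels = [1, 2, 3, 4, 5]
# Invented mean normalized log amounts per level.
by_interest = [0.12, 0.20, 0.31, 0.38, 0.52]    # rises with interest
by_difficulty = [0.45, 0.41, 0.30, 0.24, 0.15]  # falls with difficulty

r_interest = pearson(levels, by_interest)
r_difficulty = pearson(levels, by_difficulty)
```

A coefficient computed over five level means only summarizes the trend; testing whether the levels differ significantly requires the per-page data, as in the ANOVA used for the first experiment.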
5.4 Result of experiment 3
Figure 6 and table 3 show that the viewing time and the amount of mouse movement have positive correlations with the interest levels and that the differences among the interest levels are statistically significant. The amounts of mouse wheel use, mouse clicks, and processed WM_PAINT messages also showed positive correlations with the interest level, but the differences were not statistically significant. The amount of usage logs increased with the complexity level but dropped steeply at level 5. The difficulty levels showed no strong correlation with the amount of usage logs. The most interesting pattern in the results of the third experiment was that the amount of usage logs showed a positive correlation with the credibility levels and that the differences in the amount of usage logs among the credibility levels were statistically significant. This result was not found in the second experiment, in which users browsed pre-collected web pages without proximal cues. Therefore, we concluded that, in ordinary web browsing environments, the usage logs are influenced by credibility levels as well as by interest levels.
6. Interest level inference
Because our experiments showed positive relationships between interest levels and the amount of usage logs, we used the usage logs to train decision trees and Bayesian networks to infer the interest levels. For construction and testing of the machine learning models, the logs were normalized to each subject’s scale in order to remove the variance caused by differences in viewing style, and we applied 10-fold cross validation. Inferring the 5 interest levels exactly was not easy: the classification accuracy was below 50%. Therefore, we merged levels 1 and 2 into a low-interest group and levels 4 and 5 into a high-interest group to examine the accuracy of binary classification. The overall classification accuracy was 82% using Bayesian networks constructed by a simulated annealing algorithm and 84% using decision trees.
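The evaluation procedure described here (merging the 5 levels into two groups and estimating accuracy with 10-fold cross validation) can be sketched in Python. The classifier below is a deliberately simple one-feature threshold rule used as a stand-in; it is not the decision tree or Bayesian network used in the experiments. The handling of level 3 (dropped), the function names, and the data are assumptions, and the accuracy obtained on this toy data says nothing about the reported 82% and 84%.

```python
import random

def to_binary(level):
    """Map a 5-point interest level to 0 (low), 1 (high), or None (level 3)."""
    if level in (1, 2):
        return 0
    if level in (4, 5):
        return 1
    return None

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]

def fit_threshold(xs, ys):
    """Stand-in classifier: midpoint between the two class means of one feature."""
    low = [x for x, y in zip(xs, ys) if y == 0]
    high = [x for x, y in zip(xs, ys) if y == 1]
    return (sum(low) / len(low) + sum(high) / len(high)) / 2

def cross_validate(xs, ys, k=10):
    """Return the fraction of held-out samples classified correctly."""
    correct = 0
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        threshold = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        correct += sum((xs[i] > threshold) == bool(ys[i]) for i in held_out)
    return correct / len(xs)

# Invented (normalized activity, interest level) pairs; level 3 is discarded.
samples = ([(0.10 + 0.02 * i, 1 + i % 2) for i in range(10)]
           + [(0.50, 3), (0.50, 3)]
           + [(0.70 + 0.02 * i, 4 + i % 2) for i in range(10)])
labelled = [(x, to_binary(lv)) for x, lv in samples if to_binary(lv) is not None]
xs, ys = zip(*labelled)
accuracy = cross_validate(xs, ys, k=10)
```

The toy data are perfectly separable, so the sketch only demonstrates the mechanics of label merging and fold-wise evaluation.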
Thus far, numerous researchers have attempted to obtain users’ preferences or interests implicitly for the contents of web pages by observing their interactions on a browser. This information could then later be used for information filtering, recommendation applications, and adaptive user interfaces. However, due to the limitations of interaction channels between users and computers – i.e., only a mouse and keyboard - predicting users’ interests is not an easy task.
In this chapter, we mainly used the number of processed GUI messages to predict users’ interest, as the messages are processed whenever users perform certain activities while reading web pages. It was found that the proposed method is simple and easy to develop while still being adequately effective. The results of our experiments showed that if a user engages in more activities that make the system process more GUI messages while reading a web page, even in the event that the page offers relatively little information, it can be inferred that the page contains interesting or preferable content. This provides an important guideline to follow, because finding preferable web pages is the first step of user modeling procedures and personalization services.
In our first experiment, in which we considered only text-based web pages, we found that if a user engages in more activities that make the system process more GUI messages while reading a web page, it can be inferred that the page contains interesting contents. In the later experiments, which used natural web pages containing images, frames, videos, and so on, we further confirmed that users tend to interact more with web pages that interest them. This result looks natural and simple, but it establishes an important fact: users’ interest in web contents can be inferred from the amount of simple interaction logs that are easy to measure. Because the amount of log data can be measured without any modification to current browsing environments, our method can be applied very easily.
Many previous studies [Fogg et al., 2003; Wathen & Burkell, 2002] have focused on credibility on the web, and it is currently considered an important factor for improving the web environment by usability researchers, media researchers, web designers, psychologists, HCI practitioners, and so on. However, the influence of information credibility on users’ interactions with web pages has not been clearly investigated. Our experimental results showed that users interact more with web pages that contain credible contents and that this pattern may be clearer in natural browsing environments. We think this may come from the fact that users select their links using various information scents; the interest or usefulness of the contents may therefore be partially determined in advance, so that users interact more with web pages that contain credible contents. Because current web environments give users multiple ways to find interesting contents (search engines, portal sites, commercial sites, and so on), credibility as well as interest plays an important role in making users interact with web pages. From this result, it is apparent to us that, in order to attract users’ attention to our web sites, we should consider what makes the contents of web pages more credible as well as more interesting.
Many web personalization systems make use of a user’s activity logs at the web browser in order to build a user model. A user model here is simply a structured summary of the web pages on which a user exhibited high activity. A web personalization system then uses the user model to determine whether a new page is worth recommending by computing the content similarity between the new page and the user model. It is important to realize at this point that the high-activity web pages reflect not only a user’s topics of interest but also the user’s way of assessing the credibility of a web page. If a personalization system intends to recommend pages that are both interesting and credible to a user, it should not depend on content similarity alone but should also utilize the user’s way of assessing the credibility of a web page, which can be inferred from the same web pages that provide keywords for the user model. A better user model should contain not only topic keywords but also the page attributes that a user considers important for a web page to be credible. In some previous studies [Fogg et al., 2003; Wathen & Burkell, 2002], several important factors for web credibility have been identified, and it is easy to see that many of these factors cannot be explained by content similarity alone. A new method for integrating the credibility factors into a recommendation model is therefore needed. From this point of view, we are currently carrying out extended experiments based on long-term monitoring of over 10 subjects, to verify whether we can find relationships among activity logs, credibility levels, and credibility factors, with the aim of developing an initial user model in which the credibility factors are also integrated.
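The content-similarity step mentioned above can be illustrated with cosine similarity over keyword counts, one common choice for comparing a page with a keyword-based user model. The text does not say which similarity measure a given personalization system uses, so the measure, the keyword weights, and the names below are assumptions. The sketch covers only topical similarity; how credibility attributes should be combined with it is the open question raised in the text.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between two keyword-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical user model built from keywords of high-activity pages.
user_model = Counter({"football": 3, "league": 2, "transfer": 1})
# Hypothetical candidate pages.
sports_page = Counter({"football": 2, "league": 1, "score": 1})
finance_page = Counter({"stock": 4, "market": 2})

sports_score = cosine_similarity(user_model, sports_page)
finance_score = cosine_similarity(user_model, finance_page)
```

Two pages with the same score under this measure may still differ greatly in attributes such as source or presentation, which is why similarity alone cannot capture credibility.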