An Overview of Interaction Techniques and 3D Representations for Data Mining


Introduction
Since the emergence of databases in the 1960s, the volume of stored information has grown exponentially every year (Keim (2002)). This information accumulation in databases has motivated the development of a new research field: Knowledge Discovery in Databases (KDD) (Frawley et al. (1992)) which is commonly defined as the extraction of potentially useful knowledge from data. The KDD process is commonly defined in three stages: pre-processing, Data Mining (DM), and post-processing (Figure 1). At the output of the DM process (post-processing), the decision-maker must evaluate the results and select what is interesting. This task can be improved considerably with visual representations by taking advantage of human capabilities for 3D perception and spatial cognition. Visual representations can allow rapid information recognition and show complex ideas with clarity and efficacy (Card et al. (1999)). In everyday life, we interact with various information media which present us with facts and opinions based on knowledge extracted from data. It is common to communicate such facts and opinions in a virtual form, preferably interactive. For example, when watching weather forecast programs on TV, the icons of a landscape with clouds, rain and sun, allow us to quickly build a picture about the weather forecast. Such a picture is sufficient when we watch the weather forecast, but professional decision-making is a rather different situation. In professional situations, the decision-maker is overwhelmed by the DM algorithm results. Representing these results as static images limits the usefulness of their visualization. This explains why the decision-maker needs to be able to interact with the data representation in order to find relevant knowledge. Visual Data Mining (VDM), presented by Beilken & Spenke (1999) as an interactive visual methodology "to help a user to get a feeling for the data, to detect interesting knowledge, and to gain a deep visual understanding of the data set", can facilitate knowledge discovery in data.
In 2D space, VDM has been studied extensively and a number of visualization taxonomies have been proposed (Herman et al. (2000), Chi (2000)). More recently, hardware progress has led to the development of real-time interactive 3D data representation and immersive Virtual Reality (VR) techniques. The inclusion of aesthetically appealing elements, such as 3D graphics and animation, increases the intuitiveness and memorability of a visualization. It also eases perception by the human visual system (Spence (1990), Brath et al. (2005)). Although there is still a debate concerning 2D vs 3D data visualization (Shneiderman (2003)), we believe that 3D and VR techniques have a better potential to assist the decision-maker in analytical tasks, and to deeply immerse the user in the data sets. In many cases, the user needs to explore data and/or knowledge from the inside-out and not from the outside-in, as in 2D techniques (Nelson et al. (1999)). This is only possible using VR and Virtual Environments (VEs). VEs allow users to navigate continuously to new positions inside the data sets, and thereby obtain more information about the data. Although the benefits offered by VR compared to desktop 2D and 3D still need to be proven, more and more researchers are investigating its use with VDM (Cai et al. (2007)). In this context, we are trying to develop new 3D visual representations to overcome some limitations of 2D representations. VR has already been studied in different areas of VDM such as pre-processing (Nagel et al. (2008), Ogi et al. (2009)), classification (Einsfeld et al. (2006)), and clustering (Ahmed et al. (2006)).
In this context, we review some work that is relevant for researchers seeking or intending to use 3D representation and VR techniques for KDD. We propose a table that summarizes 14 VDM tools focusing on 3D-VR and interaction techniques, based on 3 dimensions:
• Visual representations;
• Interaction techniques;
• Steps in the KDD process.
This paper is organized as follows: firstly, we introduce VDM. Then we define the terms related to this field of research. In Section 3, we explain our motivation for using 3D representation and VR techniques. In Section 4, we provide an overview of the current state of research concerning 3D visual representations. In Section 5, we present our motivation for interaction techniques in the context of KDD. In Section 6, we describe the related work about visualization taxonomy and interaction techniques. In Section 7, we propose a new classification for VDM based on both 3D representations and interaction techniques. In addition, we survey representative works on the use of 3D and VR interaction techniques in the context of KDD. Finally, we present possible directions for future research.
Beilken & Spenke (1999) presented the purpose of VDM as a way to "help a user to get a feeling for the data, to detect interesting knowledge, and to gain a deep visual understanding of the data set". Niggemann (2001) looked at VDM as a visual representation of the data close to the mental model. In this paper we focus on the interactive exploration of data and knowledge that is built on extensive visual computing (Gross (1994)).
Humans understand information by forming a mental model which captures only the main information; in the same way, data visualization, like the mental model, can reveal hidden information encoded in the data. In addition to the role of the visual data representation, Ankerst (2001) explored the relation between visualization and the KDD process. He defined VDM as "a step in the KDD process that utilizes visualization as a communication channel between the computer and the user to produce novel and interpreted patterns". He also explored three different approaches to VDM, two of which affect the final or intermediate visualization results. The third approach involves the interactive manipulation of the visual representation of the data rather than the results of the KDD methods. The three definitions recognize that VDM relies heavily on human perception capabilities and the use of interactivity to manipulate data representations. The three definitions also emphasize the key importance of the following three aspects of VDM: visual representations; interaction processes; and KDD tasks.
In most of the existing KDD tools, VDM is only used during two particular steps of the KDD process. In the first step (pre-processing), VDM can play an important role since analysts need tools to view and create hypotheses about complex (i.e. very large and/or high-dimensional) original data sets. VDM tools, with interactive data representation and query resources, allow domain experts to quickly explore the data set (de Oliveira & Levkowitz (2003)). In the last step (post-processing), VDM can be used to view and to validate the final results, which are mostly multiple and complex. Between these two steps, an automatic algorithm is used to perform the DM task. Some new methods have recently appeared which aim at involving the user more significantly in the KDD process; they use visualization and interaction more intensively, with the ultimate goal of gaining insight into the KDD problem described by vast amounts of data or knowledge. In this context, VDM can turn the information overload into an opportunity by coupling the strengths of machines with those of humans. On the one hand, methods from KDD are the driving force of the automatic analysis side, while on the other hand, human capabilities to perceive, relate and draw conclusions turn VDM into a very promising research field. Nowadays, fast computers and sophisticated output devices can create meaningful visualizations and allow us not only to visualize data and concepts, but also to explore and interact with this data in real-time. Our goal is to look at VDM as an interactive process in which the visual representation of data allows KDD tasks to be performed. The transformation of data/knowledge into a significant visualization is not a trivial task. Very often, there are many different ways to represent data and it is unclear which representations, perceptions and interaction techniques need to be applied. This paper seeks to facilitate this task, according to the data and the KDD goal to be achieved, by reviewing representation and interaction techniques used in VDM. KDD tasks have different goals and diverse tasks need to be applied several times to achieve a desired result. Visual feedback has a role to play, since the decision-maker needs to analyze such intermediate results before making a decision. We can distinguish two types of cognitive process within which VDM assists users to make a decision:
• Exploration: the user does not know what he/she is looking for (discovery).
• Analysis: the user knows what he/she is looking for in the data and tries to verify it (visual analysis).

From 2D to 3D visualization and virtual reality
There is a controversial debate on the use of 2D versus 3D and VR for information visualization. In order to justify our choice of 3D and VR, we first review the difference between 3D visualizations and VR techniques:
• 3D visualization is a representation of an object in a 3D space by showing length, width and height coordinates on a 2D surface such as a computer monitor. 3D visual perception is achieved using visual depth cues such as lighting, shadows and perspective.
• VR techniques enable user immersion in a multi-sensorial VE, and use interaction devices and stereoscopic images to improve the perception of depth and of the relative 3D position of objects.

2D versus 3D
Little research has been dedicated to the comparison of 2D and 3D representations. 3D representations have generally not been advised ever since the publications by Tufte (1983) and Cleveland & McGill (1984). Nevertheless, the experiments of Spence (1990) and Carswell et al. (1991) show that there is no significant difference in accuracy between 2D and 3D for the comparison of numerical values. In particular, Spence (1990) pointed out that it is not the apparent dimensionality of visual structures that counts but rather the actual number of parameters that show variability. Under some circumstances, information may be processed even faster when represented in 3D rather than in 2D. Concerning the perception of global trends in data, the experimental results of Carswell et al. (1991) also show an improvement in answer times using 3D, but to the detriment of accuracy. Other works compare 2D and 3D within the framework of interactive visualization. Ware & Franck (1994) indicated that displaying data in 3D instead of 2D can make it easier for users to understand the data. Finally, Tavanti & Lind (2001) pointed out that realistic 3D displays could support cognitive spatial abilities and memory tasks, namely remembering the place of an object, better than 2D displays.
On the other hand, several problems arise, such as intensive computation, more complex implementations than 2D interfaces, and user adaptation and disorientation. The first problem can be addressed by using powerful and specialized hardware. However, one of the main problems of 3D applications is user adaptation. In fact, most users only have experience with classical windows, icons, menus and pointing devices (WIMP) and 2D-desktop metaphors. Therefore, interaction with 3D presentations, and possibly the use of special devices, demands considerable adaptation effort. There is still no commonly-accepted standard for interaction with 3D environments. Some research has shown that it takes users some time to understand what kind of interaction possibilities they actually have (Baumgärtner et al. (2007)). In particular, as a consequence of a richer set of interactions and a higher degree of freedom, users may be disoriented.

Toward virtual reality
To overcome the limitations of interaction with 3D representations, VR interfaces and input devices have been proposed. These interfaces and devices offer simpler and more intuitive interaction techniques (selection, manipulation, navigation, etc.), and more compelling functionality (Shneiderman (2003)). In VR, the user can always access external information without leaving the environment and the context of the representation. Also, the user's immersion in the data allows him/her to take advantage of stereoscopic vision, which helps to disambiguate complex abstract representations (Maletic et al. (2001)). Ware & Franck (1996) compared the visualization of 2D and 3D graphs. Their work shows a significant improvement in intelligibility when using 3D. More precisely, they found that the ability to decide whether two nodes are connected is improved by a factor of 1.6 when adding stereo cues, by 2.2 when using motion parallax depth cues, and by a factor of 3 when using stereoscopic as well as motion parallax depth cues. Aitsiselmi & Holliman (2009) found that participants obtained better scores when performing a mental rotation task on a stereoscopic screen instead of a 2D screen. This result supports the efficiency of VR and shows that the extra depth information given by a stereoscopic display makes it easier to move a shape mentally. It is generally considered that only stereoscopy allows one to fully exploit the characteristics of 3D representations. It helps the viewer to judge the relative size of objects and the distances between them. It also helps the viewer to mentally move a shape in the 3D visualization area. Finally, Cai et al. (2007) found that visualization increases robustness in object tracking and positive detection accuracy in object prediction. They also found that the interactive method enables the user to process the image data 30 times faster than manually. As a result, they suggested that human interaction may significantly increase overall productivity.
We can therefore conclude that stereoscopy and interaction are the two most important components of a VE and the most useful to users. The equipment used should thus be taken into account from the very beginning of application design, and consequently be included in any taxonomy of VDM techniques.

Visual representations for Visual Data Mining
One of the problems that VDM must address is to find an effective representation of something that has no inherent form. In fact, it is crucial not only to determine which information to visualize but also to define an effective representation to convey the target information to the user. The design of a visual representation must address a number of different issues: what information should be presented? How should this be done? What level of abstraction should be supported? For example, suppose a user tries to find interesting relations between variables in large databases. This information may be visualized as a graph (Pryke & Beale (2005)) or as an abstract representation based on a sphere and cone (Blanchard et al. (2007)).
Many representations for VDM have been proposed. For instance, some visual representations are based on abstract representations, such as graphs (Ahmed et al. (2006)), trees (Einsfeld et al. (2007), Buntain (2008)), and geometrical shapes (Ogi et al. (2009), Nagel et al. (2008), Meiguins et al. (2006)), and others on virtual world objects (Baumgärtner et al. (2007)). The classification proposed in this chapter provides some initial insight into which techniques are oriented to certain data types, but does not assert that one visual representation is more suitable than others to explore a particular data set. Selecting a representation depends largely on the task being supported and is still a largely intuitive process.

Abstract visual representations
3D representations are still abstract and require the user to learn certain conventions, because they do not look like what they refer to or they do not have a counterpart in the real world. There are 3 kinds of abstract representations: graphs, trees, and geometrical shapes.

Graphs
A graph (Figure 3) is a network of nodes and arcs, where the nodes represent entities while the arcs represent relationships between entities. For a review of the state of the art in graph visualization see Herman et al. (2000).
At the beginning, graph visualization was used in 2D space to represent components around simple boxes and lines. However, several authors think that larger graph structures can be viewed in 3D (Parker et al. (1998)). In their empirical study, which measured path-tracing ability in 3D graphs, Ware & Franck (1996) suggested that the amount of information that can be displayed in 3D with stereoscopic and motion depth cues exceeds that of 2D representations by a factor of 3. Another experiment with new display technologies confirmed this result and showed much greater benefits than previous studies: the experiments of Ware & Mitchell (2008) showed that the use of a stereoscopic display, kinetic depth and 3D tubes was much more beneficial than using lines to display the links as in previous studies.

A technique based on the hyper system (Hendley et al. (1999)) for force-based visualization can be used to create a graph representation. The visualization consists of nodes and links whose properties are given by the parameters of the data. Data elements affect parameters such as node size and color, link strength and elasticity. The dynamic graph algorithm enables the self-organization of nodes in the visualization area by the use of a force system in order to find a steady state and determine the position of the nodes. For example, Beale (2007) proposed the Haiku system (Figure 3(b)), which provides an abstract 3D perspective of clustering algorithm results based on the hyper system. One of the characteristics of this system is that the user can choose which parameters are used to create the distance metrics (distance between two nodes), and which ones affect the other characteristics of the visualization (node size, link elasticity, etc.). Using the hyper system allows related things (belonging to the same cluster) to be near to each other, and unrelated things to be far away.
Visualization of hierarchical information structures is an important topic in the information visualization community (Van Ham (2002)). Because trees are generally easy to lay out and interpret (Card et al. (1999)), this approach finds many applications in classification visualization (Buntain (2008)). 3D trees were designed to display a larger number of entities than 2D representations, in a comprehensible form (Wang et al. (2006)). Various methods have been developed for this purpose, including space-filling techniques and node-link techniques.
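To make the force-based placement described above for the hyper system more concrete, the following is a minimal sketch of a generic spring-and-repulsion layout in 3D. It is not the algorithm of Hendley et al. (1999) or of Haiku; the function name `force_layout`, its parameters (`rest_length`, `repulsion`, `step_size`, `max_move`) and the inverse-square repulsion are all assumptions chosen for illustration.

```python
import math
import random


def force_layout(nodes, links, steps=300, rest_length=1.0,
                 repulsion=0.5, step_size=0.05, max_move=0.1, seed=0):
    """Place nodes in 3D so that linked nodes settle close together.

    nodes: list of node identifiers.
    links: dict mapping (a, b) pairs to a spring strength; a hypothetical
           stand-in for a data-driven "link strength" parameter.
    """
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1) for _ in range(3)] for n in nodes}
    for _ in range(steps):
        force = {n: [0.0, 0.0, 0.0] for n in nodes}
        # Every pair of nodes repels, which pushes unrelated nodes apart.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                delta = [pa - pb for pa, pb in zip(pos[a], pos[b])]
                dist = max(math.sqrt(sum(d * d for d in delta)), 1e-6)
                push = repulsion / (dist * dist)
                for k in range(3):
                    force[a][k] += push * delta[k] / dist
                    force[b][k] -= push * delta[k] / dist
        # Each link is a spring pulling its endpoints toward rest_length.
        for (a, b), strength in links.items():
            delta = [pb - pa for pa, pb in zip(pos[a], pos[b])]
            dist = max(math.sqrt(sum(d * d for d in delta)), 1e-6)
            pull = strength * (dist - rest_length)
            for k in range(3):
                force[a][k] += pull * delta[k] / dist
                force[b][k] -= pull * delta[k] / dist
        # Move each node along its net force, capping the step for stability.
        for n in nodes:
            move = [step_size * f for f in force[n]]
            length = math.sqrt(sum(m * m for m in move))
            scale = min(1.0, max_move / length) if length > 0 else 0.0
            for k in range(3):
                pos[n][k] += scale * move[k]
    return pos
```

In such a sketch, nodes joined by links end up near each other while unlinked groups drift apart, which is the qualitative behaviour the text attributes to force-based layouts; a real system would also stop once the positions reach a steady state rather than after a fixed number of steps.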
Space-filling techniques (Van Ham (2002), Wang et al. (2006)), based upon the 2D tree-map visualization proposed by Johnson & Shneiderman (1991), have been successful for visualizing trees that have attribute values at the node level. Space-filling techniques are particularly useful when users care mostly about nodes and their attributes but do not need to focus on the topology of the tree, or consider that the topology of the tree is trivial (e.g. 2 or 3 levels). The users of space-filling techniques also require training because of the unfamiliar layout (Plaisant et al. (2002)).
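As an illustration of the space-filling idea, the sketch below implements a simple 2D "slice-and-dice" tree-map layout, in which each node receives a rectangle whose area is proportional to its size and the split direction alternates with depth. The dictionary-based tree format and the function names are assumptions made for this example, and the 3D variants cited above extend the idea in ways not shown here.

```python
def total_size(node):
    """Size of a leaf, or the summed size of all leaves below a node."""
    children = node.get("children")
    if not children:
        return node["size"]
    return sum(total_size(child) for child in children)


def slice_and_dice(node, x, y, w, h, depth=0):
    """Return (name, x, y, width, height) for every leaf of the tree.

    Assumes every leaf has a positive 'size'. The rectangle is split
    along x at even depths and along y at odd depths.
    """
    children = node.get("children")
    if not children:
        return [(node["name"], x, y, w, h)]
    rects = []
    total = total_size(node)
    offset = 0.0  # fraction of the parent rectangle already used
    for child in children:
        share = total_size(child) / total
        if depth % 2 == 0:
            rects += slice_and_dice(child, x + offset * w, y,
                                    share * w, h, depth + 1)
        else:
            rects += slice_and_dice(child, x, y + offset * h,
                                    w, share * h, depth + 1)
        offset += share
    return rects


tree = {"name": "root", "children": [
    {"name": "A", "size": 1},
    {"name": "B", "children": [
        {"name": "B1", "size": 2},
        {"name": "B2", "size": 1},
    ]},
]}
rects = slice_and_dice(tree, 0.0, 0.0, 4.0, 2.0)
```

The layout shows node attributes (here, size) directly as area, while the parent-child topology is only implied by nesting, which matches the trade-off described above.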
Node-link techniques, on the other hand, have long been frowned upon in the information visualization community because they typically make inefficient use of screen space. Even trees of a hundred nodes often need multiple screens to be completely displayed, or require scrolling since only part of the tree is visible at a given time.
[...] (1997), making the opacity of each voxel a function of point density. Using scatter-plots is intuitive since each data item is faithfully displayed. Scatter-plots have been used successfully for detecting relationships in two dimensions (Bukauskas & Böhlen (2001), Eidenberger (2004)). This technique hits limitations if the data set is large, noisy, or if it contains multiple structures. With large amounts of data, the number of displayed objects makes it difficult to detect any structure at all.
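The idea of making voxel opacity a function of point density can be sketched as follows. The source does not specify the mapping, so the linear normalisation by the densest voxel, the `voxel_size` parameter and the function name `voxel_opacity` are assumptions for illustration only.

```python
from collections import Counter


def voxel_opacity(points, voxel_size=1.0):
    """Map each occupied voxel to an opacity in (0, 1].

    points: iterable of (x, y, z) coordinates.
    Opacity is the voxel's point count divided by the largest count,
    which is one possible density-to-opacity function among many.
    """
    counts = Counter(
        tuple(int(c // voxel_size) for c in point) for point in points
    )
    if not counts:
        return {}
    peak = max(counts.values())
    return {voxel: n / peak for voxel, n in counts.items()}


sample = [(0.1, 0.2, 0.3), (0.4, 0.5, 0.6), (1.5, 0.1, 0.1)]
opacity = voxel_opacity(sample)
```

Aggregating points into voxels in this way trades the faithful display of each data item for a bounded number of displayed objects, which is one way of coping with the clutter that large scatter-plots produce.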

Virtual worlds
Trying to find easily-understandable data representations, several researchers have proposed the use of real-world metaphors. This technique uses elements of the real world to provide insights about data. For example, some of these techniques are based on a city abstraction. The virtual worlds (sometimes called cyber-spaces) for VDM are generally based either on the information galaxy (Krohn (1996)) or the information landscape metaphor (Robertson et al. (1998)). The difference between the two metaphors is that in the information landscape, the elevation of objects is not used to represent information (objects are placed on a horizontal floor). The specificity of virtual worlds is that they provide the user with some real-world representations.

Interaction techniques for Visual Data Mining
Interaction techniques can empower the user's perception of information when visually exploring a data set (Hibbard et al. (1995)). The ability to interact with visual representations can greatly reduce the drawbacks of visualization techniques, particularly those related to visual clutter and object overlap, providing the user with mechanisms for handling complexity in large data sets. Pike et al. (2009) explored the relationship between interaction and cognition. They consider that the central precept of VDM is that the development of human insight is aided by interaction with a visual interface. As VDM is concerned with the relationship between visual displays and human cognition, merely developing novel visual metaphors is rarely sufficient to make new discoveries or to provide confirmation or negation of a prior belief.
Interaction also allows the integration of the user in the KDD process. KDD is not a completely human-guided process, since DM algorithms analyze a data set searching for useful information and statistically valid knowledge. The degree of automation of the KDD process actually varies considerably, since different levels of human guidance and interaction are usually required. But it is still the algorithm, and not the user, that is looking for knowledge. In this context, de Oliveira & Levkowitz (2003) suggested that VDM should have a greater role than a traditional application of visualization techniques to support the non-analytic stages of a KDD process. It is through the interactive manipulation of a visual interface that knowledge is constructed, tested, refined and shared.
We can distinguish 3 different interaction categories: exploration, manipulation and human-centered approaches.

Visual exploration
Visual exploration techniques are designed to take advantage of the considerable visual capabilities of human beings, especially when users try to analyze tens or even hundreds of graphic variables in a particular investigation. Visual exploration allows the discovery of data trends, correlations and clusters to take place quickly, and can support users in formulating hypotheses about the data. It is essential in some situations to allow the user to simply look at the visual representation in a passive sense. This may mean moving the view point around in order to reveal structure in the data that may otherwise be masked and overlooked. In this way, exploration provides the means to view information from different perspectives to avoid occlusion and to see object details. It can be very useful to have the ability to move the image to resolve any perceptual ambiguities that exist in a static representation when a large amount of information is displayed at once. The absence of certain visual cues (when viewing a static image) can mask important results (Kalawsky & Simpkin (2006)).
Navigation is often the primary task in 3D worlds and refers to the activity of moving through the scene. The task of navigation presents challenges such as supporting spatial awareness and providing efficient and comfortable movements between distant locations. Some systems enable users to navigate without constraint through the information space (Nagel et al. (2008)).
In visual exploration, the user can also manipulate the objects in the scene. In order to do this, interaction techniques provide means to select objects and to zoom in and zoom out to change the scale of the representation. Beale (2007) has demonstrated that using a system which supports the free exploration and manipulation of information delivers increased knowledge even from a well-known data set. Many systems provide a virtual hand or a virtual pointer (Einsfeld et al. (2007)), a typical approach used in VEs, which is considered intuitive as it simulates real-world interaction (Bowman et al. (2001)).
• Select: this technique provides users with the ability to mark interesting data items in order to keep track of them when too many data items are visible, or when the perspective is changed.In these two cases, it is difficult for users to follow interesting items.By making items visually distinctive, users can easily keep track of them even in large data sets and/or with changed perspectives.
• Zoom: by zooming, users can simply change the scale of a representation so that they can see an overview (context) of a larger data set (using zoom-out) or the detailed view (focus) of a smaller data set (using zoom-in).The essential purpose is to allow hidden characteristics of data to be seen.A key point here is that the representation is not fundamentally altered during zooming.Details simply come into focus more clearly or disappear into context.
Visual exploration (as we can see in Section 7) can be used in the pre-processing step of the KDD process to identify interesting data (Nagel et al. (2008)), and in post-processing to validate DM algorithm results (Azzag et al. (2005)). For example, in VRMiner (Azzag et al. (2005)) and in ArVis (Blanchard et al. (2007)), the user can point to an object to select it and then obtain information about it.

Visual manipulation
In KDD, the user is essentially faced with a mass of data that he/she is trying to make sense of. He/she should look for something interesting. However, interest is an essentially human construct, a perspective of relationships among data that is influenced by tasks, personal preferences, and past experience. For this reason, the search for knowledge should not be left to computers alone; the user has to guide it depending upon what he/she is looking for, and hence which area to focus computing power on. Manipulation techniques provide users with different perspectives of the visualized data by changing the representation. One of these techniques is the capability of changing the attributes presented in the representation. For example, in the system shown by Ogi et al. (2009), the user can change the combination of presented data. Other systems have interaction techniques that allow users to move data items more freely in order to make the arrangement more suitable for their particular mental model (Einsfeld et al. (2006)). Filter interaction techniques enable users to change the set of data items being presented based on some specific conditions. In this type of interaction, the user specifies a range or condition, so that only data meeting those criteria are presented. Data outside the range or not satisfying the conditions are hidden from the display or shown differently; even so, the actual data usually remain unchanged, so that whenever users reset the criteria, the hidden or differently-illustrated data can be recovered. The user is not changing data perspectives, just specifying conditions within which data are shown. ArVis (Blanchard et al. (2007)) allows the user to look for a rule with a particular item in it. To do this, the user can search for it in a menu which lists all the rule items and allows the wanted object to be shown.
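The filter interaction described above can be sketched in a few lines. The key property is that filtering only decides what is shown and never modifies the underlying data, so resetting the criteria recovers everything. The function name `visible_items`, the `confidence` attribute and the sample rules are hypothetical and do not come from any of the cited tools.

```python
def visible_items(items, attribute, low, high):
    """Return the items whose attribute lies in the range [low, high].

    The input list is never modified, so widening the range again
    brings previously hidden items back into view.
    """
    return [item for item in items if low <= item[attribute] <= high]


# Hypothetical association rules with a confidence value each.
rules = [
    {"name": "r1", "confidence": 0.92},
    {"name": "r2", "confidence": 0.40},
    {"name": "r3", "confidence": 0.75},
]

shown = visible_items(rules, "confidence", 0.7, 1.0)     # user narrows the range
restored = visible_items(rules, "confidence", 0.0, 1.0)  # user resets the filter
```

A system that shows filtered-out items "differently" rather than hiding them could return a visibility flag per item instead of a shortened list; the data set itself would still remain untouched.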

Applications of Virtual Reality

Human-centered approach
In most existing KDD tools, interaction can be used in two different ways: exploration and manipulation. Some new methods have recently appeared (Baumgärtner et al. (2007), Poulet & Do (2008)) which try to involve the user in the DM process more significantly, using visualization and interaction more intensively. In this task, the user manipulates the DM algorithm and not only the graphical representation. The user sends commands to the algorithm in order to manipulate the data to be extracted. We speak here about local knowledge discovery. This technique allows the user to focus on knowledge that is interesting from the user's point of view, in order to make the DM tool more generically useful to the user. It is also necessary for the user to be able to either change the view point or manipulate a given parameter of the knowledge discovery algorithm and observe its effect. There must therefore be some way in which the user can indicate what is considered interesting and what is not, and to do this the KDD tool needs to be dynamic and versatile (Ceglar et al. (2003)). The human-centered process should be iterative, since it is repeated until the desired results are obtained. From a human interaction perspective, a human-centered approach closes the loop between the user and the DM algorithm in a way that allows the user to respond to results as they occur by interactively manipulating the input parameters (Figure 8).
By involving the user more intensively in the KDD process, this new kind of approach has the following advantages (Poulet & Do (2008)):
• The quality of the results is improved by the use of human-knowledge recognition capabilities;
• Using the domain knowledge during the whole process (and not only in the interpretation of the results) allows guided searching for knowledge;
• The confidence in the results is improved as the DM process gives more comprehensible results.
In ArVis (Blanchard et al. (2007)), the user can navigate among the subsets of rules via a menu providing neighborhood relations. By applying a neighborhood relation to a rule, the mining algorithm extracts a new subset of rules. The previous subset is replaced by the new subset in the visualization area.
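The iterative loop between user and DM algorithm can be sketched as follows. This is only an illustration under stated assumptions: the toy miner `mine_pairs` (co-occurring item pairs above a minimum support) stands in for a real DM algorithm, and `feedback` is a hypothetical callback that plays the role of the user inspecting the results and either returning a new parameter value or `None` to stop.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List, Optional, Set

Results = Dict[FrozenSet[str], float]


def mine_pairs(transactions: List[Set[str]], min_support: float) -> Results:
    """Toy DM step: item pairs whose relative frequency reaches min_support."""
    n = len(transactions)
    if n == 0:
        return {}
    counts: Dict[FrozenSet[str], int] = {}
    for items in transactions:
        for pair in combinations(sorted(items), 2):
            key = frozenset(pair)
            counts[key] = counts.get(key, 0) + 1
    return {k: c / n for k, c in counts.items() if c / n >= min_support}


def human_centered_loop(
    transactions: List[Set[str]],
    min_support: float,
    feedback: Callable[[Results, float], Optional[float]],
    max_rounds: int = 10,
) -> Results:
    """Re-run the miner until the user (feedback) is satisfied."""
    results = mine_pairs(transactions, min_support)
    for _ in range(max_rounds):
        new_support = feedback(results, min_support)
        if new_support is None:       # the user accepts the current results
            break
        min_support = new_support     # the user changed an input parameter
        results = mine_pairs(transactions, min_support)
    return results


data = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
# Simulated user: too many results means "raise the threshold to 0.5".
final = human_centered_loop(data, 0.1, lambda res, s: 0.5 if len(res) > 2 else None)
```

In an interactive tool the callback would be replaced by the visualization and its input widgets; the loop structure stays the same.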

Related work on taxonomies of visual representations and interaction techniques
Many researchers have attempted to construct a taxonomy for visualization. Chi (2000) used the Data State Model (Chi & Riedl (1998)) to classify information visualization techniques. This model is composed of 3 dimensions with categorical values: data stages (value, analytical abstraction, visualization abstraction, and view), transformation operators (data transformation, visualization transformation, and visual mapping transformation), and within-stage operators (value stage, analytical stage, visualization stage, and view stage). The model shows how data change from one stage to another, each change requiring one of the three types of data transformation operators. This state model helps implementers understand how to apply and implement information visualization techniques. Tory & Moller (2004) present a high-level taxonomy for visualization which classifies visualization algorithms rather than data. Algorithms are categorized according to the assumptions that they make about the data being visualized. Their taxonomy is based on 2 dimensions: the data values (discrete or continuous) and how the algorithm designer chooses to display attributes (spatialization, timing, color, and transparency).
Another area of related research is interaction and user interfaces. In this area, Bowman et al. (2001) present an overview of 3D interaction and user interfaces (3DUI). This paper also discusses the effect of common VE hardware devices on user interaction, as well as interaction techniques for generic 3D tasks and the use of traditional WIMP styles in 3D environments. They divide most user interaction tasks into three categories: navigation, selection/manipulation, and system control. Arns (2002) considers that Bowman's taxonomy is too general and can encompass too many parts of a VR system. For that reason, she created a classification for virtual locomotion (travel) methods. This classification includes information on display devices, interaction devices, travel tasks, and the two primary elements of virtual travel: translation and rotation. Dachselt & Hinz (2005) have proposed a classification of 3D-widget solutions by interaction purpose/intention of use, e.g., direct 3D object interaction, 3D scene manipulation, exploration, and visualization. Finally, Teyseyre & Campo (2009) presented an overview of 3D representations for visualizing software, describing several major aspects such as visual representations, interaction issues, evaluation methods, and development tools.
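To make the four data stages and three transformation operators of the Data State Model more concrete, here is a minimal sketch of a pipeline from raw values to a view. The concrete operators (counting categories, ordering bars, rendering text) are illustrative assumptions chosen for brevity; they are not part of Chi's model, which only prescribes the stages and the kinds of operators between them.

```python
from collections import Counter
from typing import Dict, List, Tuple


def data_transformation(values: List[str]) -> Dict[str, int]:
    """Value -> analytical abstraction: count occurrences of each category."""
    return dict(Counter(values))


def visualization_transformation(abstraction: Dict[str, int]) -> List[Tuple[str, int]]:
    """Analytical abstraction -> visualization abstraction: ordered (label, length) bars."""
    return sorted(abstraction.items(), key=lambda kv: (-kv[1], kv[0]))


def visual_mapping_transformation(bars: List[Tuple[str, int]]) -> str:
    """Visualization abstraction -> view: render each bar as a line of text."""
    return "\n".join(f"{label} {'#' * length}" for label, length in bars)


values = ["rain", "sun", "rain"]                       # value stage
abstraction = data_transformation(values)              # analytical abstraction
bars = visualization_transformation(abstraction)       # visualization abstraction
view = visual_mapping_transformation(bars)             # view
```

Within-stage operators would be functions that map a stage to itself, for example filtering `values` before any transformation is applied.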

A new classification of Visual Data Mining based on visual representations and interaction techniques
In this section, we present a new classification of VDM tools composed of 3 dimensions: visual representations, interaction techniques, and KDD tasks. Visualization tools currently have to provide not only effective visual representations but also effective interaction metaphors to facilitate exploration and help users achieve insight.
A good 3D representation without a good interaction technique does not make a good tool. This classification looks at some representative tools for different KDD tasks, e.g., pre-processing and post-processing (classification, clustering, and association rules). Different tables summarize the main characteristics of the reported VDM tools with regard to visual representations and interaction techniques. Other relevant information, such as interaction actions (navigation, selection and manipulation, and system control), input-output devices (CAVE, mouse, hand tracker, etc.), presentation (3D representation or VR representation), and year of creation, is also reported.
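One way to picture the three dimensions is as fields of a record describing each tool. The sketch below is a hypothetical data structure, not the format of the chapter's tables; the field names are assumptions, and the three entries are simplified readings of the prose descriptions in this chapter rather than copies of table rows.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class VDMTool:
    """Hypothetical record: one tool positioned along the three dimensions."""
    name: str
    kdd_task: str                         # dimension: KDD task
    visual_representation: str            # dimension: visual representation
    interaction_actions: FrozenSet[str]   # dimension: interaction technique
    input_device: str                     # additional reported information


# Simplified entries based on the prose descriptions, for illustration only.
TOOLS = [
    VDMTool("3DVDM", "pre-processing", "3D scatter-plot",
            frozenset({"navigation"}), "wand"),
    VDMTool("GEOMI", "clustering", "clustered graph",
            frozenset({"navigation"}), "head gestures"),
    VDMTool("Haiku", "combined methods", "3D graph",
            frozenset({"navigation", "manipulation"}), "mouse"),
]


def tools_for_task(tools: List[VDMTool], kdd_task: str) -> List[str]:
    """Names of the tools that address a given KDD task."""
    return [t.name for t in tools if t.kdd_task == kdd_task]


def tools_with_action(tools: List[VDMTool], action: str) -> List[str]:
    """Names of the tools that offer a given interaction action."""
    return [t.name for t in tools if action in t.interaction_actions]
```

Queries such as `tools_for_task(TOOLS, "clustering")` then correspond to reading one of the per-task summary tables.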

Pre-processing
Pre-processing (in VDM) is the task of data visualization before the DM algorithm is used. It is generally required as a starting point of KDD projects so that analysts may identify interesting and previously unknown data by the interactive exploration of graphical representations of a data set, without heavy dependence on preconceived assumptions and models. The basic visualization technique used for data pre-processing is the 3D scatter-plot method, where 3D objects with attributes are used as markers. The main principle behind the design of traditional VDM techniques, such as the Grand Tour (Asimov (1985)), parallel coordinates (Inselberg & Dimsdale (1990)), etc., is that they are viewed from the outside in. In contrast, VR lets users explore the data from the inside out by allowing them to navigate continuously to new positions inside the VE in order to obtain more information about the data. Nelson et al. (1999) demonstrated, through comparisons between 2D and VR versions of the VDM tool XGobi, that the VR version of XGobi performed better.
In the Ogi et al. (2009) system, the user can see several data set representations integrated in the same space and can toggle the visibility of each data set. This system can be used to represent the relationships among several data sets in 3D space, but it does not allow the user to navigate through the data set and interact with it; the user can only change the visual mapping of the data set. However, the main advantage of this system is that the data can be presented with a high degree of accuracy using high-definition stereo images, which is especially beneficial when visualizing a large amount of data. This system has been applied to the visualization and analysis of earthquake data. Using the third dimension allows the visualization of both the overall distribution of the hypocenter data and the individual location of any earthquake, which is not possible with a conventional 2D display.
Figure 9 shows hypocenter data recorded over 3 years. The system allows the visualization of several databases at the same time (e.g., map data, terrain data, basement depth, etc.), and the user can toggle the visibility of each data set in the VE. For example, the user can change the visualized data from the combination of hypocenter data and basement depth data to the combination of hypocenter data and terrain data. Thus, the system can show the relationships between any two of the data sets at a time.
Table 2. 3D VDM tool summary for pre-processing KDD task
As a result of using VR, the 3DVDM system (Nagel et al. (2008)) is capable of providing real-time user response and navigation as well as showing dynamic visualization of large amounts of data. Nagel et al. (2008) demonstrated that the 3DVDM visualization system allows faster detection of non-linear relationships and substructures in data than traditional methods of data analysis. An alternative proposal is the DIVE-ON (Data mining in an Immersed Visual Environment Over a Network) system, proposed by Ammoura et al. (2001). The main idea of DIVE-ON is visualizing and interacting with data from distributed data warehouses in an immersed VE. The user can interact with such sources by walking or flying towards them. He/she can also pop up a menu, scroll through it, and execute all environment, remote, and local functions. DIVE-ON thereby makes intelligent use of the natural human capability of interacting with spatial objects and offers considerable navigation possibilities, e.g., walking, flying, transporting, and climbing.
Inspired by treemaps, Wang et al. (2006) presented a novel space-filling approach for tree visualization of file systems (Figure 10). This system provides a good overview of a large hierarchical data set and uses nested circles to make it easier to see groupings and structural relationships. By clicking on an item (a circle), the user can see the associated sub-items, represented by the nested circles, in a new view. The system provides the user with a control panel allowing him/her to filter files by type; by clicking on one file type, the other file types are filtered out. A zoom-in/zoom-out function allows the user to see folder or file characteristics such as name, size, and date. A user-feedback system means that user interaction techniques are friendly and easy to use.
[…] (2006) presented a tool for multidimensional VDM visualization in an augmented-reality environment, where the user may visualize and manipulate information in a real-time VE without the use of devices such as a keyboard or mouse, and may interact simultaneously with other users in order to make a decision related to the task being analyzed. This tool uses a 3D scatter-plot to visualize the objects. Each visualized object has specific characteristics of position (x, y, and z axes), color, shape, and size that directly represent data item values. The main advantage of this tool is that it provides users with a dynamic menu, which is displayed in an empty area when the user wants to execute certain actions. The tool also allows users to perform many manipulation tasks such as real-time attribute filtering, semantic zoom, and rotation and translation of objects in the visualization area. A detailed comparison of these techniques is presented in Table 2.
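The scatter-plot mapping described above, where marker position and size directly represent data item values, can be sketched as a small visual-mapping function. This is an assumption-laden illustration rather than the implementation of any cited tool: the function names, the min-max normalization, and the default size range are all hypothetical choices, and color and shape are left out to keep the example short.

```python
from typing import Dict, List, Tuple

Record = Dict[str, float]


def normalize(values: List[float]) -> List[float]:
    """Min-max scale values to [0, 1]; constant columns map to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def to_markers(
    records: List[Record],
    x: str, y: str, z: str, size: str,
    size_range: Tuple[float, float] = (0.1, 1.0),
) -> List[Dict[str, float]]:
    """Map four data attributes onto marker position (x, y, z) and size."""
    if not records:
        return []
    xs = normalize([r[x] for r in records])
    ys = normalize([r[y] for r in records])
    zs = normalize([r[z] for r in records])
    ss = normalize([r[size] for r in records])
    smin, smax = size_range
    return [
        {"x": px, "y": py, "z": pz, "size": smin + s * (smax - smin)}
        for px, py, pz, s in zip(xs, ys, zs, ss)
    ]


records = [
    {"a": 0.0, "b": 10.0, "c": 1.0, "d": 2.0},
    {"a": 10.0, "b": 20.0, "c": 3.0, "d": 4.0},
]
markers = to_markers(records, x="a", y="b", z="c", size="d")
```

Changing the attribute names passed to `to_markers` corresponds to the manipulation technique of changing which attributes are presented, without touching the data.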

Post-processing
Post-processing is the final step of the KDD process. Upon receiving the output of the DM algorithm, the decision-maker must evaluate and select the interesting part of the results.

Clustering
Clustering is used for finding groups of items that are similar. Given a set of data items, this set can be partitioned into a set of classes, so that items with similar characteristics are grouped together.
The GEOMI system proposed by Ahmed et al. (2006) is a visual analysis tool for the visualization of clustered graphs or trees. The system implements block model methods to associate each group of nodes with a corresponding cluster. Two nodes are in the same cluster if they have the same neighbor set. This tool allows immersive navigation in the data using 3D head gestures instead of the classical mouse input. The system only allows visual exploration: users can walk into the network and move closer to nodes or clusters by simply aiming in their direction. Nodding or tilting the head rotates the entire graph along the X and Y axes respectively, which provides users with intuitive interaction.
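The clustering criterion stated above (two nodes belong to the same cluster if they have the same neighbor set) can be sketched in a few lines. This is only an illustration of that criterion under the assumption that the graph is given as an adjacency mapping; the function name is hypothetical and the code is not GEOMI's implementation.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set


def cluster_by_neighbor_set(adjacency: Dict[str, Set[str]]) -> List[List[str]]:
    """Group nodes whose neighbor sets are identical."""
    groups: Dict[FrozenSet[str], List[str]] = defaultdict(list)
    for node, neighbors in adjacency.items():
        # Nodes with equal neighbor sets share the same dictionary key.
        groups[frozenset(neighbors)].append(node)
    return sorted(sorted(members) for members in groups.values())


# "a" and "b" both link to {c, d}; "c" and "d" both link to {a, b}.
graph = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b"},
}
clusters = cluster_by_neighbor_set(graph)
```

Using the frozen neighbor set as a dictionary key makes the grouping a single pass over the nodes.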
The objective of @VISOR (Baumgärtner et al. (2007)), which follows a human-centered approach, is to create a system for interaction with documents, metadata, and semantic relations. The human capabilities exploited in this context are spatial memory and the fast visual processing of attributes and patterns. Artificial intelligence techniques assist the user, e.g., in searching for documents and calculating document similarities.
VRMiner (Azzag et al. (2005)), in turn, uses a stereoscopic display and intuitive navigation, which allow the user to easily select an interesting viewpoint. VRMiner users have found that using this tool helps them solve 3 major problems: detecting correlations between data dimensions, checking the quality of discovered clusters, and presenting the data to a panel of experts. In this context, the stereoscopic display plays a crucial role alongside the intuitive navigation.
A detailed comparison of these techniques is presented in Table 3.

Classification
Classification consists of determining, given a set of pre-defined categorical classes, which of these classes a specific data item belongs to (Buntain (2008)). The class structure and the relations among those classes can be presented to the user in graphic form to facilitate understanding of the knowledge domain. This view can then be mapped onto the document space, where shapes, sizes, and locations are governed by the sizes, overlaps, and other properties of the document classes. This view provides a clear picture of the relations between the resulting documents. Additionally, the user can manipulate the view to show only those documents that appear in a list of results from a query. Furthermore, if the results view includes details about subclasses of results and "near miss" elements in conjunction with positive results, the user can refine the query to find more appropriate results, or widen the query to include more results if insufficient information is forthcoming. The third dimension gives the user a more expressive space, complete with navigation methods such as rotation and translation. In 3D, overlapping lines or labels can be avoided by rotating the layout to a better point of view.
DocuWorld (Einsfeld et al. (2006)) is a prototype for a dynamic semantic information system. This tool allows computed structures as well as documents to be organized by users. Compared to the Web Forager (Card et al. (1996)), a workspace for organizing documents with different degrees of interest at different distances from the user, DocuWorld provides the user with more flexible possibilities for storing documents at user-defined locations and visually indicates cluster-document relations (different semantics of connecting clusters to each other).
A detailed comparison of these techniques is presented in Table 4.

Association rules
On account of the enormous quantities of rules that can be produced by DM algorithms, association rule post-processing is a difficult stage in an association rule discovery process. Post-processing is also exploited during association rule mining to reduce the search space and avoid generating huge amounts of rules. Götzelmann et al. (2007) proposed a VDM system to analyze error sources of complex technical devices. The aim of the proposed approach is to extract association rules from a set of documents that describe malfunctions and errors of complex technical devices, followed by a projection of the results onto a corresponding 3D model. Domain experts can evaluate the results obtained by the DM algorithm by exploring the 3D model interactively in order to find spatial relationships between different components of the product. 3D enables a flexible spatial mapping of the results of statistical analysis. Visualizing statistical data on their spatial reference object, by modifying visual properties to encode data (Figure 6(a)), can reveal a priori unknown facts which were hidden in the database. By interactively exploring the 3D model, unknown sources and correlations of failures can be discovered that depend on the spatial configuration of several components and the shape of complex geometric objects.
Table 5. 3D VDM tool summary for association rules KDD task
A detailed comparison of these techniques is presented in Table 5.

Combining several methods
The Haiku tool (Figure 3(b)) combines several DM methods: clustering, classification, and association rules (Beale (2007)). In this tool, the use of 3D graphs allows the visualization of high-dimensional data in a comprehensible and compact representation. The interface provides a large set of 3D manipulation features for the structure, such as zooming in and out, moving through the representation (flying), rotating, jumping to a specific location, viewing data details, and defining an area of interest. The only downside is that the control is done using a mouse. A detailed presentation is shown in Table 6.

Conclusion
A new classification of VDM tools composed of 3 dimensions (visual representations, interaction techniques, and DM tasks) has been presented, along with a survey of visual representations and interaction techniques in VDM. We can see that most of the recent VDM tools still rely on interaction metaphors developed more than a decade ago and do not take into account the new interaction metaphors and techniques offered by VR technology. It is questionable whether these classical visualization/interaction techniques are able to meet the demands of the ever-increasing mass of information, or whether we are losing ground because we still lack the means to properly interact with the databases to extract relevant knowledge. Devising intuitive visual interactive representations for DM, and providing real-time interaction and mapping techniques that are scalable to the huge size of many current databases, are some of the research challenges that need to be addressed. In answer to this challenge, Mackinlay (1986) proposes two essential criteria to evaluate data mapping by visual representation: expressiveness and effectiveness. Firstly, expressiveness criteria determine whether a visual representation can express the desired information. Secondly, effectiveness criteria determine whether a visual representation exploits the capabilities of the output medium and the human visual system. Although the criteria were discussed in a 2D-graphic context, they can be extended to 3D and VR visualization. Finally, VDM is inherently cooperative, requiring many experts to coordinate their activities to make decisions. Thus, collaborative visualization research may help to improve VDM processes. For example, current technology provided by 3D collaborative virtual worlds for gaming and social interaction may support new methods of KDD.

Fig. 3. An example of graph representations: (a) Source code: Ougi (Osawa et al. (2002)), (b) Association rules: Haiku (Beale (2007)), (c) DocuWorld (Einsfeld et al. (2006))
A well-known node-link representation, the cone tree, was introduced by Robertson et al. (1991) for visualizing large hierarchical structures in a more intuitive way. 3D trees may be displayed vertically (Cone Tree) or horizontally (Cam Tree). Buntain (2008) used 3D trees for ontology classification visualization (Figure 4(a)). Each leaf represents a unique concept in the ontology, and the transparency and size of each leaf are governed by the number of documents associated with the given concept. A molecule is constructed by clustering together spheres that share common documents, and surrounds the leaves with a semi-transparent shell (Figure 4(b)).


Fig. 6. Example of virtual world representation: (a) faults projected onto a car model (Götzelmann et al. (2007)), (b) document classification in @VISOR (Baumgärtner et al. (2007))
(… Einsfeld et al. (2006), Azzag et al. (2005)). Other systems restrict movement in order to reduce possible user disorientation (Ahmed et al. (2006)). As an illustration, in VRMiner (Azzag et al. (2005)) a sensor with six degrees of freedom is fixed to the user's hand (Figure 7), allowing him/her to easily define a virtual camera in 3D space. For example, when the user moves his/her hand forward in the direction of the object, he/she may zoom in or out. The 3DVDM system (Nagel et al. (2008)) allows the user to fly around and within the visualized scatter-plot. The navigation is controlled by the direction of a "wanda" device tracked with 6 degrees of freedom. In contrast, in GEOMI (Ahmed et al. (2006)), the user can only rotate the representation along the X and Y axes but not along the Z axis.


Fig. 8. The human-centered approach
• Data values: discrete or continuous
• How the algorithm designer chooses to display attributes: spatialization, timing, color, and transparency.

Table 1 presents the different modalities of each of the three dimensions. The proposed taxonomy takes into account both the representation and the interaction technique. In addition, many visualization design taxonomies include only a small subset of techniques (e.g., locomotion, Arns (2002)).
Table 1. Dimension modalities

Table 3. 3D VDM tool summary for clustering KDD task
In SUMO (Figure 4), a tool for document-class visualization is proposed

Table 6. 3D VDM tool combining several methods