Present-day home appliances have more functions, are more complicated, and are expected to process information together as more home networks and protocols are developed. This situation makes many users feel uneasy as they need to understand more complex information. They cannot intuitively understand what functions objects have and it has become more difficult to accept information from them in these situations. Therefore engineers are faced with a massive challenge to improve their interfaces and design products that facilitate easier use.
However, it is difficult to improve the designs and interfaces of all objects. Instead of improving the designs or the interfaces of objects, we preferred to provide information via anthropomorphic and communicative agents, such as a humanoid robot (Kanda et al., 2003) or a virtual agent (Mukawa et al., 2003), which seemed to be more useful and user friendly.
We propose a “display robot” as one such agent system. It transforms an object into an agent by using anthropomorphization, which makes the interaction between humans and the object more intuitive. Using the display robot, users can understand the functions of objects more intuitively and can accept information from them. We also think that the display robot can solve the problem in which users perceive the agents themselves as “obstacles” to acquiring information (Fukayama et al., 2003) (Fig. 1 top). The display robot does not use additional agents that are unrelated to an object; instead, it makes the object itself the agent that interacts with users (Fig. 1 bottom). As this does not introduce any additional agents into the field of interaction, users are not encumbered by additional information. If the object is anthropomorphized and has an imaginary body image, it is also possible to identify its segments, such as its “head” or “stomach.” Using the virtual body image, it can also use metaphorical and intuitive expressions for functions, such as “Something is wrong with my stomach.”
We have already conducted experiments to evaluate the anthropomorphization of an object (Osawa et al., 2006) and its virtual body image (Osawa et al., 2007). We used three anthropomorphized refrigerators in these experiments: the first was anthropomorphized by eye-like parts attached to its top, the second by the same parts attached to its bottom, and the third by voice only. The studies found that users can detect requests by an object more easily if it is anthropomorphized using the eye-like parts than if it is just the object itself. They also indicated that the eye-like appearance reinforced the “body image of the stomach” when the Iris-board was attached to the top of the object, and that users could recognize its top segment as the “head” and interact with it as such.
However, these experiments were conducted with a limited category, i.e., university students. Therefore we needed to find what sorts of people (gender and age) accept anthropomorphized objects.
We developed eye-like parts and arm-like parts for this study and used them to conduct on-the-spot research on human-object interaction. Our results indicate that anthropomorphization by the display robot was accepted mostly by female participants and by all age groups except those aged 10 to 19.
2.1. Theoretical background
Reeves and Nass noted in The Media Equation (Reeves & Nass, 1996) that people can accept objects as communicative subjects and act as if they had a “virtual” body under some circumstances. Their study revealed that we have a tendency to regard non-communicative objects as communicative agents.
Bateson et al. demonstrated the effect of anthropomorphization in an experiment using an honesty box (Bateson et al., 2006). They attached a picture of an eye to the top of a menu, and participants gazed 2.76 times more at this than at the picture of a flower that had also been attached to its top. Their study revealed that attaching human-like parts to a menu affects human actions.
The display robot extends this “virtual” body of an object, which participants basically accept, by attaching human-like moving body parts to it to extend its subjectivity. For example, if washing machines are anthropomorphized, users can accept their doors as being “mouths” (Fig. 2). An anthropomorphic agent on a machine was considered in C-Roids (Green et al., 2001). With an attached display robot, however, a user can accept the machine's “virtual body.” This kind of robot therefore extends the expressiveness of machines further than C-Roids.
We can convert instructions from the object using these virtual body images. For example, an anthropomorphized washing machine using a display robot can use intuitive expressions like “please throw it in my mouth” instead of “please throw it through the door.” We think that these expressions are intuitive to users and increase their intimacy with the object.
2.2. System construction
Figure 3 outlines the system construction for the display robot.
The display robot first calculates the scale of its virtual body image and determines its basic motions and voices for interaction. The main process runs on the scenario server (Fig. 3 center), which selects an appropriate scenario and generates speech and eye and arm motions according to the selected scenario. The eye and arm motions are affected by the scale and position of the virtual body image, which is constructed according to the location of the user's face and the locations of the eye-like parts and arm-like parts.
2.3. Eye-like parts
The eye-like parts imitated human eyes.
The human eye (1) enables vision and (2) indicates what a person is looking at (Kobayashi & Kohshima, 2001). We focused on the second function, indicating the object being looked at, and hence designed a positioning algorithm for the iris.
The eye-like module that simulates the human eye (Fig. 4) uses an “iris” that represents the human iris and pupil together. The open elliptical region on the right in Fig. 4 represents the sclera and the closed circle, the iris and pupil. Here, the eye-like parts looking at a cup consist of a pair of displays to simulate the eyes. The locations of the irises are calculated with respect to the location of the object, which is acquired by a position sensor.
First, it calculates each iris position. Each board has an “imaginary eyeball,” and the eye-like parts calculate the point of intersection, p, of board plane A with the vector from the object, i, to the center of the eyeball, c. Based on this point of intersection, the eye-like parts convert the global coordinates of p into display coordinates, i; these processes are performed in both eye-like panels (Fig. 5).
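The intersection and coordinate conversion described above can be sketched as a line-plane intersection followed by a projection onto the panel's axes. This is only an illustrative sketch, not the authors' implementation: the function name `iris_position`, the panel reference point `panel_origin`, the panel normal `panel_normal`, and the in-plane unit axes `u_axis` and `v_axis` are all assumptions introduced here, and the parameter `obj` stands for the object position to avoid reusing the symbol i.

```python
def sub(a, b):
    """Component-wise difference of two 3-D vectors."""
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    """Dot product of two 3-D vectors."""
    return sum(x * y for x, y in zip(a, b))


def iris_position(obj, eye_center, panel_origin, panel_normal, u_axis, v_axis):
    """Return (p, display_xy) for one panel, or None if there is no intersection.

    obj          -- global position of the object being looked at
    eye_center   -- center c of the imaginary eyeball (assumed behind the panel)
    panel_origin -- hypothetical reference point on board plane A
    panel_normal -- normal of board plane A
    u_axis/v_axis -- hypothetical unit vectors spanning the display surface
    """
    direction = sub(eye_center, obj)          # line from the object to c
    denom = dot(panel_normal, direction)
    if abs(denom) < 1e-12:                    # line is parallel to the panel
        return None
    t = dot(panel_normal, sub(panel_origin, obj)) / denom
    p = tuple(o + t * d for o, d in zip(obj, direction))  # intersection with A
    offset = sub(p, panel_origin)
    # Project the global point p onto the panel axes to get display coordinates.
    return p, (dot(offset, u_axis), dot(offset, v_axis))
```

Running the same function once per panel, each with its own eyeball center and axes, mirrors the statement that the process is performed in both eye-like panels.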
Second, it calculates the orientation of the front of the anthropomorphized target from the directions of the two eye boards.
In certain cases, for example if the eye-like parts are based on a single panel, additional sensors such as gyros are needed to calculate the normal vector a, i.e., the orientation of panel A.
Since the eye-like parts use two panels, a is calculated from the vector r between the position sensors in the right and left panels. This calculation assumes that the two panels are oriented symmetrically about the plane midway between the two boards, that the panels are placed vertically (i.e., their pitch angles are 90 degrees), and that the tilt angle is known. Under these restrictions, the eye-like parts can calculate the iris positions even if one of the two panels moves.
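One way to read the two-panel calculation is that, with vertical panels, the front direction is horizontal and perpendicular to r, and each panel normal is that direction rotated by the known tilt angle. The sketch below follows that reading and is not taken from the source. It assumes a global z-up coordinate frame, takes r from the object's own left panel to its own right panel, and chooses the rotation signs arbitrarily; the function names are hypothetical.

```python
import math


def cross(a, b):
    """Cross product of two 3-D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def front_normal(left_sensor, right_sensor):
    """Horizontal unit vector perpendicular to r (assumed front direction)."""
    r = tuple(rc - lc for lc, rc in zip(left_sensor, right_sensor))
    up = (0.0, 0.0, 1.0)                      # assumption: global z axis is up
    n = cross(up, r)
    length = math.sqrt(sum(c * c for c in n))
    if length < 1e-12:
        raise ValueError("sensors are vertically aligned; front is undefined")
    return tuple(c / length for c in n)


def rotate_about_z(v, angle):
    """Rotate a vector about the vertical axis by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])


def panel_normals(left_sensor, right_sensor, tilt):
    """Normals of the left and right panels, tilted symmetrically by +/- tilt."""
    a = front_normal(left_sensor, right_sensor)
    return rotate_about_z(a, tilt), rotate_about_z(a, -tilt)
```

Because only the two sensor positions and the tilt angle enter the calculation, the normals can be recomputed whenever either sensor reports a new position.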
2.4. Arm-like parts
The arm-like parts of the robot imitated a human arm in all respects except in terms of manipulating objects.
When the arm-like parts pointed at the outside of an attached common object, we used the vector from the root of the limb to the tip of the hand as the pointing vector, as shown on the left side of Fig. 6 according to Sugiyama's study on pointing gestures of a communication robot (Sugiyama et al., 2006). However, when the arm-like parts pointed at the inside of an attached common object, we used the vector from the root of the hand to the tip of the hand as the pointing vector, as shown on the right side of Fig. 6.
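The choice between the two pointing vectors can be written as a small selection rule. The sketch below is illustrative only: the joint names `limb_root`, `hand_root`, and `hand_tip` and the boolean flag `target_inside_object` are hypothetical names for the quantities described in the text.

```python
def pointing_vector(limb_root, hand_root, hand_tip, target_inside_object):
    """Return the vector used to indicate the pointing direction.

    Outside the attached object: root of the limb -> tip of the hand.
    Inside the attached object:  root of the hand -> tip of the hand.
    """
    origin = hand_root if target_inside_object else limb_root
    return tuple(t - o for o, t in zip(origin, hand_tip))
```

For a target inside the attached object, only the hand segment determines the direction, so the rest of the arm can stay folded close to the object.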
The display robot did not need to manipulate other objects, because the target already has its own task and our devices are used only for expression. Rather than supporting manipulation, these devices must be simple and light so that they can be easily attached. We developed human-like robotic devices and attached them to our target by using hook and loop fasteners.
The eye-like parts consist of a TFT LC panel. The positions of the pupils and irises were determined using the 3-D coordinates of the places where the parts were attached and their direction vectors. The eye-like parts were 2 cm wide. They were thin and could be attached anywhere. They can be used to gaze in any direction, as if the eye implemented on the object were watching.
The arm-like parts consist of six servo motors. The hand had three motors and could express delicate gestures with its fingers. The hands looked like long gloves and were covered with cloth, which concealed the mechanism, as required for intuitive interaction.
The parts' locations are obtained from ultrasonic 3D tags (Nishida et al., 2003) on the parts. The tags send ultrasonic waves to installed ultrasonic receivers, which calculate the 3D coordinates of the tags. The humanoid parts search for “anthropomorphize-able” objects according to the locations of the parts.
|Scale|120 mm x 160 mm x 50 mm|
|Connection method|Velcro tape|
|Cover|Sponge sheet, plastic board|
To evaluate the display robot, we conducted research in which it was attached to home appliances. Subjects were given an “invitation task” for interaction, in which an anthropomorphized home appliance directly invited users to interact by using its eyes and arms.
We conducted research in a booth at a university laboratory. The research was conducted over two days. We did experiments for five hours on the first day and seven hours on the second day.
The flow of the interaction between the display robot and users is mapped in Fig. 8. We first attached the eye-like parts, arm-like parts, camera, and speaker to the object and initialized the coordinates of all the devices. After they had been set up, the display robot detected the user's face with the camera and calculated its position. After it had detected the face, the display robot gazed at it by showing the pupil and iris on the eye-like parts and pointed toward the user with the arm-like parts. If detection lasted 4 s, the display robot randomly chose one of four utterances (“Hello!”, “Welcome!”, “Hey!”, and “Yeah!”), said it, and beckoned to the user. With the devices attached, the display robot invited users to a booth at the laboratory according to the flow in Fig. 8.
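The interaction flow can be sketched as a loop over camera frames that tracks how long a face has been detected continuously. This is a simplified sketch under stated assumptions, not the authors' control program: frames are given as hypothetical `(timestamp, face_position)` pairs with `None` when no face is found, the action labels are invented for illustration, and restarting the timer after each greeting is an assumption the source does not specify.

```python
import random

GREETINGS = ("Hello!", "Welcome!", "Hey!", "Yeah!")
DETECTION_SECONDS = 4.0   # continuous detection time before greeting


def run_interaction(frames, rng=None):
    """Turn a sequence of (timestamp, face_position or None) into actions."""
    rng = rng or random.Random()
    actions = []
    detected_since = None            # start time of the current detection run
    for timestamp, face in frames:
        if face is None:
            detected_since = None    # face lost: reset the detection timer
            continue
        if detected_since is None:
            detected_since = timestamp
        # While a face is visible, gaze at it and direct the arm toward it.
        actions.append((timestamp, "gaze_and_point", face))
        if timestamp - detected_since >= DETECTION_SECONDS:
            # Pick one of the four utterances at random and beckon.
            actions.append((timestamp, "speak_and_beckon", rng.choice(GREETINGS)))
            detected_since = None    # assumption: restart timing after greeting
    return actions
```

Passing a seeded `random.Random` instance makes the chosen utterance reproducible, which can help when replaying a recorded session.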
We attached the display robot to a small trash box on a desk on the first day (Fig. 9 left), and attached it to an exercise bike on the second day (Fig. 9 right). We manually input the positions of all devices.
3.1. Method of evaluation
We sent participants a questionnaire after the interactions. Participants answered it voluntarily. The questionnaire consisted of two parts: a paired-adjective test (7-level evaluations of the 17 paired-adjective phrases in Table 3) and a free description of their impressions of being watched and being called by the display robot.
There were 52 valid replies to the questionnaire (17 on the first day and 35 on the second). There were 31 male and 16 female participants (five did not identify their gender). Only 46 participants gave their age. The age of the participants ranged from under ten to over fifty years old. Most participants did not interact with the robots until the experiment started and then all the participants interacted with them.
4.1. Sociability value extracted using basic method of analysis
We could not evaluate the results obtained from the questionnaire (17 values from -3 to 3) by simply using the paired-adjective-test results, because participants were not obliged to complete the questionnaire. We therefore applied a principal component analysis to the results of the paired-adjective test to find hidden trends. We found six axes whose eigenvalues exceeded one. The results are listed in Table 4.
The most effective axis for evaluating the display robot was PC1 (sociability value), which accounted for approximately 30% of the variance. We calculated the sociability values of participants according to gender and age categories. The average value for male participants was -0.378 and the average value for female participants was 0.434 (Fig. 10). The average values by age are shown in Fig. 11. We also categorized participants into those who thought the interaction was positive and those who thought it was negative for the watching and calling situations. The results are listed in Tables 5 and 6.
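The axis-selection step can be illustrated with a small principal component analysis that keeps only the axes whose eigenvalue exceeds one. The sketch below assumes that the values compared against one are eigenvalues of the correlation matrix of the questionnaire items, a common retention rule; the source does not state which matrix or software was used. It uses a plain Jacobi eigenvalue iteration to stay dependency-free, and all function names are hypothetical.

```python
import math


def correlation_matrix(rows):
    """Correlation matrix of the columns of rows (columns must not be constant)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(m)]
    z = [[(r[j] - means[j]) / stds[j] for j in range(m)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / n for b in range(m)]
            for a in range(m)]


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-18):
    """Eigenvalues of a symmetric matrix, largest first (cyclic Jacobi method)."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the off-diagonal entry a[p][q].
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):            # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):            # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def retained_axes(rows):
    """(eigenvalue, share of total variance) for axes with eigenvalue above one."""
    eigenvalues = jacobi_eigenvalues(correlation_matrix(rows))
    total = sum(eigenvalues)
    return [(value, value / total) for value in eigenvalues if value > 1.0]
```

Here `rows` would hold one list of 17 ratings per participant; the share returned for the first axis corresponds to the proportion of variance attributed to PC1.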
5.1. Difference between genders
The sociability values by gender are plotted in Fig. 10. These results indicate that female participants had a more favorable impression of the display robot than male participants. One female participant wrote in her free description that she felt the display robot was “cute and unique.” Other descriptions by female participants indicated that they intuitively saw the display robot and the object as a unified agent. Some female participants seemed surprised after the researcher had explained that the display robot and the object were separate devices.
The difference may have arisen because the female participants accepted the display robot and object as one unified character (agent) and felt good about it, whereas male participants accepted the display robot and object as separate devices. Male participants also paid attention only to the display robot's functions and found deficiencies in the devices. They felt the display robot was weirder than the female participants did.
We not only need to improve the accuracy of the display robot's devices but also to design a natural scenario for male users to increase their favorable impressions.
5.2. Differences between age groups
The sociability values for the six different age groups are plotted in Fig. 11. We can see that the values drop from the under-10 group to the 10-to-19 group and then gradually increase with age.
The reasons for this phenomenon may be as follows. If participants are under 10 years old, they freely admit that the object has eyes and arms. However, if they are 10 to 19, they think it is embarrassing to interact with anthropomorphized objects, and their sociability values decrease as a result. In observing the participants, the experimenters found that those under 10 years of age acted aggressively with the display robot, pulling its arms or pushing its eyes, whereas those from 10 years of age up to university age watched the display robot from a distance.
We also found that those who were more than 30 years old had greater sociability values than younger participants. This may have been because they could interact objectively with the anthropomorphized object and felt less embarrassed because they were older.
These results indicate that 10- to 19-year-olds had a tendency to find interaction with anthropomorphized objects embarrassing. We need to design a more attractive scenario in which the 10-to-19 age group can interact with objects without being embarrassed.
5.3. Impressions for watching action
Table 5 shows that participants who felt negatively about being watched said that they felt the object was horrible because it could not gaze at them accurately. It also shows that participants who felt positively about being watched said that they felt the Iris-board itself was beneficial. To improve participants' impressions of the display robot, we need to make its gaze more precise by developing more accurate face recognition and capturing a wider area with the camera.
5.4. Impressions for calling action
Table 6 shows that participants who felt negatively about being called said that they could not understand the intentions of the anthropomorphized objects and could not respond to them. We expected that the invitation by an object using its eye gaze and beckoning would attract participants toward the object. The research results indicate that participants could not understand the “invitation by the object” because the trash box and exercise bike basically had no functions in this setting and had no need to invite people. We found that we needed to design scenarios that extended the “intention of the object.” For example, if the trash box is anthropomorphized, it needs to interact in a situation where “it needs to collect garbage,” and if the exercise bike is anthropomorphized, it needs to interact in a situation where “it needs participants to exercise.” However, participants who felt positively about being called said that they felt it was not only “cute or cool” but also “safe.” This indicates that anthropomorphization increased the subjectivity of the objects and that participants felt more glances from them.
This chapter proposed a display robot that acts as an agent to anthropomorphize objects by changing them, using devices that are like human body parts. We conducted research on the interaction between users and anthropomorphized objects using the display robot. As a result, we found that anthropomorphization by the display robot was mostly appreciated by female participants and accepted by people of all ages except for those aged 10 to 19.
However, in future work we need to clarify how the virtual body image is created and what interactions are possible by conducting more experiments and further research.