"Human-Robot Interaction", book edited by Daisuke Chugo, ISBN 978-953-307-051-3, Published: February 1, 2010 under CC BY-NC-SA 3.0 license. © The Author(s).

# Interaction between a Human and an Anthropomorphized Object

By Hirotaka Osawa and Michita Imai
DOI: 10.5772/8128

## 1. Introduction

Present-day home appliances have more functions, are more complicated, and are expected to process information together as more home networks and protocols are developed. This situation makes many users feel uneasy as they need to understand more complex information. They cannot intuitively understand what functions objects have and it has become more difficult to accept information from them in these situations. Therefore engineers are faced with a massive challenge to improve their interfaces and design products that facilitate easier use.

However, it is difficult to improve the designs and interfaces of all objects. Instead of improving the designs or the interfaces of objects, we chose to provide information via anthropomorphic and communicative agents, such as a humanoid robot (Kanda et al., 2003) or a virtual agent (Mukawa et al., 2003), which seemed more useful and user friendly.

We propose a “display robot” as one such agent system. It transforms an object into an agent through anthropomorphization, which makes the interaction between humans and the object more intuitive. Using the display robot, users can understand the functions of objects more intuitively and can accept information from them. We also think that the display robot can solve the problem where users perceive the agents themselves as “obstacles” to acquiring information (Fukayama et al., 2003) (Fig. 1 top). The display robot does not introduce an additional agent unrelated to the object; instead, it makes the object itself the agent that interacts with users (Fig. 1 bottom). Because no additional agent is created in the field of interaction, users are not encumbered by additional information. It is also possible to identify the object's segments, such as its “head” or “stomach”, if it is anthropomorphized and has an imaginary body image. The object can also use metaphorical and intuitive expressions for its functions, such as “Something is wrong with my stomach”, using the virtual body image.

We have already conducted experiments to evaluate the anthropomorphization of an object (Osawa et al., 2006) and its virtual body image (Osawa et al., 2007). We used three anthropomorphized refrigerators in these experiments: the first was anthropomorphized by eye-like parts attached to its top, the second by the same parts attached to its bottom, and the third by voice only. The study found that users can detect requests by an object more easily if it is anthropomorphized using the eye-like parts than if it is just the object itself. It also indicated that the eye-like appearance reinforced the “body image of the stomach” when the Iris-board was attached to the top of the object, so that users could recognize its top segment as the “head” and interact with it as such.

#### Figure 1.

Difference between anthropomorphic agent and display robot.

However, these experiments were conducted with a limited category of participants, i.e., university students. We therefore needed to find out what sorts of people (in terms of gender and age) accept anthropomorphized objects.

We developed eye-like parts and arm-like parts for this study and used them for on-the-spot research on human-object interaction. Our results indicate that anthropomorphization by the display robot was accepted mostly by female participants and by all age groups except those aged 10 to 19.

## 2. Design

### 2.1. Theoretical background

Reeves and Nass noted in The Media Equation (Reeves & Nass, 1996) that people can accept objects as communicative subjects and act as if the objects had a “virtual” body under some circumstances. Their study revealed that we tend to regard non-communicative objects as communicative agents.

Bateson et al. demonstrated the effect of anthropomorphization in an experiment using an honesty box (Bateson et al., 2006). They attached a picture of an eye to the top of a menu, and participants gazed 2.76 times more at this than at a picture of a flower that had also been attached to its top. Their study revealed that attaching human-like parts to a menu affects human actions.

The display robot extends this “virtual” body of an object, which participants basically accept, by attaching human-like moving body parts to the object to extend its subjectivity. For example, if a washing machine is anthropomorphized, users can accept its door as a “mouth” (Fig. 2). An anthropomorphic agent placed on a machine has been considered in C-Roids (Green et al., 2001). With a display robot attached, however, the user can accept the machine's own “virtual body”, so this kind of robot extends the expressiveness of machines more than C-Roids does.

### Figure 2.

Difference between anthropomorphic agent and display robot.

We can convert instructions from the object using these virtual body images. For example, a washing machine anthropomorphized with a display robot can use intuitive expressions like “please throw it in my mouth” instead of “please throw it through the door.” We think that these expressions are intuitive to users and increase their intimacy with the object.

### 2.2. System construction

Figure 3 outlines the system construction for the display robot.

The display robot first calculates the scale of its virtual body image and determines its basic motions and voices for interaction. The main process runs on the scenario server (Fig. 3 center), which selects an appropriate scenario and generates speech, eye motions, and arm motions according to the selected scenario. The eye and arm motions depend on the scale and position of the virtual body image, which is constructed from the location of the user's face and the locations of the eye-like and arm-like parts.

### Figure 3.

System construction.

### 2.3. Eye-like parts

The eye-like parts imitated human eyes.

The human eye (1) enables vision and (2) indicates what a person is looking at (Kobayashi & Kohshima, 2001). We focused on the second function, indicating the object being looked at, and therefore designed a positioning algorithm for the iris.

The eye-like module that simulates the human eye (Fig. 4) uses an “iris” that represents the human iris and pupil together. The open elliptical region on the right in Fig. 4 represents the sclera and the closed circle, the iris and pupil. Here, the eye-like parts looking at a cup consist of a pair of displays to simulate the eyes. The locations of the irises are calculated with respect to the location of the object, which is acquired by a position sensor.

### Figure 4.

Human Eye.

First, it calculates each iris position as shown below. Each board has an “imaginary eyeball” and it calculates the point of intersection, p, of a vector from the object, i, to the center of the eyeball, c, and board plane A. Based on this point of intersection, the eye-like parts convert the global coordinates of p into display coordinates, i; these processes are performed in both eye-like panels (Fig. 5).

Second, it calculates the orientation of the front of the anthropomorphized target from the directions of the two eye boards as shown below.

In certain cases, for example if the eye-like parts are based on one panel, additional sensors such as gyros are needed to calculate the normal vector a, i.e., the orientation of panel A.

### Figure 5.

Positioning of iris on each board.

Since the eye-like parts use two panels, a is calculated from the vector r between the position sensors in the right and left panels. This calculation is subject to restrictions: the two panels must be oriented symmetrically about the plane midway between the two boards, the panels must be placed vertically (i.e., their pitch angles are 90 degrees), and the tilt angle must be known. Under these restrictions, the eye-like parts can calculate the iris positions even if one of the two panels moves.
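The iris positioning described in this section amounts to a line-plane intersection followed by a change of coordinates. The following is a minimal sketch of that idea, not the authors' implementation: the function name `iris_on_board`, the board axes `board_u` and `board_v`, and the example coordinates are all assumptions introduced for illustration.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def add(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def scale(a: Vec3, s: float) -> Vec3:
    return (a[0] * s, a[1] * s, a[2] * s)


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def iris_on_board(target: Vec3, eyeball_center: Vec3, board_origin: Vec3,
                  board_normal: Vec3, board_u: Vec3,
                  board_v: Vec3) -> Tuple[float, float]:
    """Return the iris position in board (display) coordinates.

    The line from the looked-at target to the imaginary eyeball center is
    intersected with the board plane; the intersection point p is then
    expressed along the board's in-plane unit axes board_u and board_v
    (hypothetical names for the display's horizontal and vertical axes).
    """
    direction = sub(eyeball_center, target)
    denom = dot(board_normal, direction)
    if abs(denom) < 1e-9:
        raise ValueError("line of sight is parallel to the board plane")
    t = dot(board_normal, sub(board_origin, target)) / denom
    p = add(target, scale(direction, t))  # intersection in global coordinates
    rel = sub(p, board_origin)
    return (dot(rel, board_u), dot(rel, board_v))


# Example with assumed geometry: board in the z = 0 plane, eyeball center
# 5 cm behind it, target 1 m in front and off to one side.
x, y = iris_on_board(
    target=(0.5, 0.2, 1.0),
    eyeball_center=(0.0, 0.0, -0.05),
    board_origin=(0.0, 0.0, 0.0),
    board_normal=(0.0, 0.0, 1.0),
    board_u=(1.0, 0.0, 0.0),
    board_v=(0.0, 1.0, 0.0),
)
```

In this sketch the iris shifts toward the side where the target is, and a target straight ahead of the eyeball maps to the board origin. The same function would be called once per panel, each with its own eyeball center and board pose.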

### 2.4. Arm-like parts

The arm-like parts of the robot imitated a human arm in all respects except in terms of manipulating objects.

When the arm-like parts pointed at the outside of an attached common object, we used the vector from the root of the limb to the tip of the hand as the pointing vector, as shown on the left side of Fig. 6 according to Sugiyama's study on pointing gestures of a communication robot (Sugiyama et al., 2006). However, when the arm-like parts pointed at the inside of an attached common object, we used the vector from the root of the hand to the tip of the hand as the pointing vector, as shown on the right side of Fig. 6.

### Figure 6.

Pointing vector.
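The choice between the two pointing vectors can be sketched as a small function. This is an illustrative reading of the rule above, with hypothetical parameter names (`limb_root`, `hand_root`, `hand_tip`, `target_inside_object`); how the system decides whether a target is inside or outside the object is not specified in the text and is simply passed in as a flag here.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def pointing_vector(limb_root: Vec3, hand_root: Vec3, hand_tip: Vec3,
                    target_inside_object: bool) -> Vec3:
    """Return the unit pointing vector of an arm-like part.

    Outside targets: vector from the root of the limb to the tip of the hand.
    Inside targets:  vector from the root of the hand to the tip of the hand.
    """
    start = hand_root if target_inside_object else limb_root
    v = (hand_tip[0] - start[0], hand_tip[1] - start[1], hand_tip[2] - start[2])
    norm = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    if norm == 0.0:
        raise ValueError("start point and hand tip coincide")
    return (v[0] / norm, v[1] / norm, v[2] / norm)


# Assumed example pose: the arm is bent at the wrist.
outside = pointing_vector((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0), False)
inside = pointing_vector((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0), True)
```

With the bent-arm pose above, the two rules give different directions, which is the point of distinguishing them: the whole-limb vector suits distant targets, while the hand-only vector suits nearby targets on the object itself.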

### 2.5. Implementation

The display robot did not need to manipulate other objects, because the target already has its own task and our devices are used only for expression. Rather than supporting manipulation, these devices must be simple and light so that they can be easily attached. We developed human-like robotic devices and attached them to our target using hook-and-loop fasteners.

The eye-like parts consisted of a TFT LC panel. They determined the positions of the pupils and irises using the 3-D coordinates of the places they were attached to and their direction vectors. The eye-like parts were 2 cm wide. They were thin and could be attached anywhere. They could gaze in any direction, as if the object itself had eyes and were watching.

The arm-like parts consisted of six servo motors. The hand had three motors and could express delicate gestures with its fingers. The hands looked like long gloves; they were covered with cloth, which concealed the mechanism, as required for intuitive interaction.

The parts' locations are obtained from ultrasonic 3D tags (Nishida et al., 2003) on the parts. The tags send ultrasonic waves to installed ultrasonic receivers, which calculate the 3D positions of the tags. The humanoid parts search for “anthropomorphize-able” objects according to the locations of the parts.

Specifications of parts for an experiment are presented in Tables 1 and 2, and the parts are depicted in Fig. 7.

| Item | Specification |
| --- | --- |
| Scale | 120 mm x 160 mm x 50 mm |
| Weight | 180 g |
| TFT controller | ITC-2432-035 |
| Wireless module | ZEAL-Z1 (19200 bps) |
| Microcontroller | Renesas H8/3694 |
| Connection method | Velcro tape |
| Cover | Sponge sheet, plastic board |

### Table 1.

Specification of eye parts.

| Item | Specification |
| --- | --- |
| Scale | 250 mm x 40 mm x 40 mm |
| Weight | 250 g |
| Motor | Micro-MG x 3, GWS-pico x 3 |
| Wireless module | ZEAL-Z1 (9600 bps) |
| Microcontroller | Renesas H8/3694 |
| Connection method | Velcro tape |
| Cover | Aluminum, sponge, rubber, gloves |

### Table 2.

Specification of arm parts.

## 3. Research

We attached the display robot to home appliances in order to evaluate it. Subjects were given an “invitation task”, in which an anthropomorphized home appliance directly invited users to interact using its eyes and arms.

We conducted research in a booth at a university laboratory. The research was conducted over two days. We did experiments for five hours on the first day and seven hours on the second day.

#### Figure 7.

Humanoid parts.

The flow of the interaction between the display robot and users is mapped in Fig. 8. We first attached the eye-like parts, arm-like parts, camera, and speaker to the object and initialized the coordinates of all the devices. After setup, the display robot detected the user's face with the camera and calculated its position. Once it had detected the face, the display robot gazed at it by showing the pupil and iris on the eye-like parts and pointed at the user with the arm-like parts. If detection lasted 4 s, the display robot randomly chose one of four utterances (“Hello!”, “Welcome!”, “Hey!”, and “Yeah!”), said it, and beckoned to the user. The display robot with the devices attached invited users to a booth at the laboratory according to the flow in Fig. 8.

#### Figure 8.

Flow of interaction between display robot and users.
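The flow in Fig. 8 can be approximated as a small state machine driven by face-detection results. The sketch below is only an illustration under stated assumptions: the class name `InvitationFlow`, the action tuples, and the choice to greet once per continuous detection are hypothetical, while the 4 s threshold and the four utterances come from the description above.

```python
import random

GREETINGS = ("Hello!", "Welcome!", "Hey!", "Yeah!")
DWELL_SECONDS = 4.0  # detection must last this long before the robot calls out


class InvitationFlow:
    """Turns face-detection results into gaze, pointing, and greeting actions."""

    def __init__(self, rng=None):
        self.rng = rng if rng is not None else random.Random()
        self.first_seen = None  # time at which the current face was first detected
        self.greeted = False    # assumption: greet once per continuous detection

    def step(self, now, face_position):
        """Process one camera frame; face_position is None if no face was found."""
        if face_position is None:
            self.first_seen = None
            self.greeted = False
            return []
        if self.first_seen is None:
            self.first_seen = now
        # While a face is visible, keep gazing and pointing at it.
        actions = [("gaze", face_position), ("point", face_position)]
        if not self.greeted and now - self.first_seen >= DWELL_SECONDS:
            actions.append(("say", self.rng.choice(GREETINGS)))
            actions.append(("beckon", None))
            self.greeted = True
        return actions


flow = InvitationFlow(random.Random(0))
face = (0.3, 1.5, 1.0)  # assumed face position in metres
early = flow.step(0.0, face)    # gaze and point only
called = flow.step(4.0, face)   # gaze, point, say, beckon
```

Passing the random number generator in makes the greeting choice reproducible in tests; in a deployed system the default generator would be used.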

We attached the display robot to a small trash box on a desk on the first day (Fig. 9 left), and attached it to an exercise bike on the second day (Fig. 9 right). We manually input the positions of all devices.

#### Figure 9.

Anthropomorphized trash box and exercise bike.

### 3.1. Method of evaluation

We gave participants a questionnaire after the interactions, which they answered voluntarily. The questionnaire consisted of two parts: a paired-adjective test (7-level evaluations of the 17 paired-adjective phrases in Table 3) and a free description of their impressions of being watched and being called by the display robot.

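| Adjective | Paired adjective |
| --- | --- |
| Formal | Informal |
| Flexible | Inflexible |
| New | Old |
| Horrible | Gentle |
| Uninteresting | Interesting |
| Cold | Hot |
| Intimate | Not intimate |
| Unpleasant | Pleasant |
| Lively | Gloomy |
| Foolish | Wise |
| Plain | Showy |
| Slow | Fast |
| Selfish | Unselfish |
| Simple | Complex |
| Difficult to understand | Understandable |
| Weak | Strong |
| Cool | Queer |

### Table 3.

Paired-adjective phrases.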

## 4. Result

There were 52 valid replies to the questionnaire (17 on the first day and 35 on the second). There were 31 male and 16 female participants (five did not identify their gender). Only 46 participants gave their age. The age of the participants ranged from under ten to over fifty years old. Most participants did not interact with the robots until the experiment started and then all the participants interacted with them.

### 4.1. Sociability value extracted using basic method of analysis

We could not evaluate the results obtained from the questionnaire (17 values from -3 to 3) by simply using the paired-adjective-test results, because participants were not obliged to complete the questionnaire. We therefore applied a principal component analysis to the results of the paired-adjective test to find hidden trends. We found six axes (principal components) whose eigenvalues exceeded one. The results are listed in Table 4.

| Component (explained variance) | Adjective pair | Loading |
| --- | --- | --- |
| PC1: Sociability value (28.8%) | Hot / Cold | 0.793 |
| | Flexible / Inflexible | 0.680 |
| | Fast / Slow | 0.657 |
| | Showy / Plain | 0.613 |
| | Wise / Foolish | 0.598 |
| PC2: Uniqueness value (11.26%) | Cool / Weird | 0.611 |
| | New / Old | 0.600 |
| | Plain / Showy | 0.526 |
| | Flexible / Inflexible | 0.451 |
| PC3: Intuitiveness value (8.30%) | Cool / Weird | 0.458 |
| | Understandable / Difficult to understand | 0.443 |
| | Horrible / Gentle | 0.438 |
| PC4: Simplicity value (7.96%) | Understandable / Difficult to understand | 0.490 |
| | Simple / Complex | 0.475 |
| | Lively / Gloomy | 0.391 |
| PC5: Freshness value (7.06%) | Cool / Weird | 0.480 |
| | Gentle / Horrible | 0.422 |
| | Flexible / Inflexible | 0.404 |
| PC6: Intimateness value (6.40%) | Intimate / Not intimate | 0.679 |
| | Selfish / Unselfish | 0.353 |
| | Plain / Showy | 0.321 |

### Table 4.

Categories using basic method of analysis.
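As a rough consistency check on the six retained axes, the explained-variance percentages in Table 4 can be converted to eigenvalues. The sketch below assumes that the analysis was run on the correlation matrix of the 17 questionnaire items, in which case the eigenvalues sum to 17; the text does not state this, so the resulting numbers are illustrative rather than reported values. The function name `implied_eigenvalue` is hypothetical.

```python
N_ITEMS = 17  # number of paired-adjective items in the questionnaire

# Explained variance (percent) of each component, as listed in Table 4.
EXPLAINED_PERCENT = {
    "PC1 sociability": 28.8,
    "PC2 uniqueness": 11.26,
    "PC3 intuitiveness": 8.30,
    "PC4 simplicity": 7.96,
    "PC5 freshness": 7.06,
    "PC6 intimateness": 6.40,
}


def implied_eigenvalue(percent, n_items=N_ITEMS):
    """Eigenvalue implied by an explained-variance percentage.

    Assumes PCA on a correlation matrix, where the eigenvalues sum to the
    number of items, so eigenvalue = (percent / 100) * n_items.
    """
    return percent / 100.0 * n_items


if __name__ == "__main__":
    for name, percent in EXPLAINED_PERCENT.items():
        print(f"{name}: {implied_eigenvalue(percent):.3f}")
    print(f"cumulative: {sum(EXPLAINED_PERCENT.values()):.2f}%")
```

Under this assumption every listed component has an implied eigenvalue above one, and the six components together account for roughly 70% of the variance.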

The most effective axis for evaluating the display robot was PC1 (sociability value), which accounted for 28.8% of the variance. We calculated the sociability values of participants by gender and by age category. The average value for male participants was -0.378 and the average value for female participants was 0.434 (Fig. 10). The average values by age are shown in Fig. 11. We also sorted participants' free descriptions of the watching and calling actions into positive and negative impressions. The results are listed in Tables 5 and 6.

### Figure 10.

Distribution of “sociability value” by gender.

### Figure 11.

Distribution of “sociability value” by age.

| Positive | Negative |
| --- | --- |
| Joyful! | Suspicious |
| Surprised | Horrible |
| Cute | Terrible |
| It is very strange. | It was sure it watched me. |
| It has eyeglasses. | I do not know whether it gazed at me or not. |
| Very interesting. | I did not understand it. |
| | I was terrified to think about what it would do. |
| | Its upward glance was unnatural. |

### Table 5.

Impressions for watching action.

| Positive | Negative |
| --- | --- |
| Wonderful. | Surprised. |
| I woke up | Vague. |
| Friendly | All thing I could say was "Yes." |
| Surprised! | Its timing was astonishing |
| I felt good. | Machinelike. |
| I felt relieved if it be... | Confused. |
| | What did we must to do? |
| | I could not hear its voice. |
| | I was amazed. |
| | I was surprised because I did not think it could talk. |

### Table 6.

Impressions for calling action.

## 5. Discussion

### 5.1. Difference between genders

The sociability values by gender are plotted in Fig. 10. These results indicate that female participants had a more favorable impression of the display robot than male participants did. One female participant wrote in her free-description answer that she felt the display robot was “cute and unique”. Other descriptions by female participants indicated that they perceived the display robot and the object intuitively as a unified agent. Some female participants seemed surprised after the researcher had explained that the display robot and the object were separate devices.

The difference may have arisen because the female participants accepted the display robot and object as one unified character (agent) and felt good about it, whereas male participants perceived the display robot and object as separate devices. Male participants also paid attention only to the display robot's functions and found deficiencies in the devices. They felt the display robot was weirder than the female participants did.

We not only need to improve the accuracy of the display robot's devices but also to design a natural scenario for male users to increase their favorable impressions.

### 5.2. Differences between age groups

The sociability values for the six different age groups are plotted in Fig. 11. We can see that the values drop from the under-10 group to the 10 to 19 group and then gradually increase with age.

The reasons for this phenomenon may be as follows. Participants under 10 years old freely accept that the object has eyes and arms. However, participants aged 10 to 19 find it embarrassing to interact with anthropomorphized objects, and their sociability values decrease as a result. The experimenters observed that participants under 10 years of age acted aggressively with the display robot, pulling its arms or pushing its eyes, whereas those from 10 years of age up to university age watched the display robot from a distance.

We also found that those who were more than 30 years old had greater sociability values than younger participants. This may have been because they could interact objectively with the anthropomorphized object and felt less embarrassed because they were older.

These results indicate that 10 to 19 year olds tended to find interaction with anthropomorphized objects embarrassing. We need to design a more attractive scenario in which the 10 to 19 year old age group can interact with objects without being embarrassed.

### 5.3. Impressions for watching action

The results are listed in Tables 5 and 6.

Table 5 shows that participants who felt watching was negative said that they felt the object was horrible because it could not gaze at them accurately. It also shows that participants who felt watching was positive said that they felt the Iris-board itself was beneficial. To improve participants' impressions of the display robot, we need to make its gaze more precise by developing more accurate facial recognition and capturing a wider area with the camera.

### 5.4. Impressions for calling action

Table 6 shows that participants who felt calling was negative said that they could not understand the intentions of the anthropomorphized objects and could not respond to them. We expected that an invitation by an object using its eye gaze and beckoning would attract participants toward the object. The research results indicate that participants could not understand the “invitation by the object” because the trash box and exercise bike basically had no functions and there was no need to invite people. We found that we needed to design scenarios that extended the “intention of the object.” For example, if the trash box is anthropomorphized, it needs to interact in a situation where “it needs to collect garbage”, and if the exercise bike is anthropomorphized, it needs to interact in a situation where “it needs participants to exercise.” However, participants who felt calling was positive said that they felt it was not only “cute or cool” but also “safe”. This indicates that anthropomorphization increased the subjectivity of the objects and that participants felt more looked at by them.

## 6. Conclusion

This chapter proposed a display robot that turns an object into an agent by anthropomorphizing it with devices that resemble human body parts. We studied the interaction between users and anthropomorphized objects using the display robot. We found that anthropomorphization by the display robot was mostly appreciated by female participants and accepted by people of all ages except those aged 10 to 19.

However, we still need to clarify how the virtual body image is created and what interaction is possible by conducting more experiments and further research.

## 7. Acknowledgements

The first author was supported in part by the JSPS Research Fellowships for Young Scientists. This work was supported in part by a Grant in Aid for the Global Center of Excellence Program for the “Center for Education and Research of Symbiotic, Safe and Secure System Design” from the Ministry of Education, Culture, Sports, Science and Technology in Japan.

## References

1 - M. Bateson, D. Nettle, G. Roberts, 2006. Cues of being watched enhance cooperation in a real-world setting. Biology Letters, 2, 412-414.
2 - A. Fukayama, V. Pham, T. Ohno, 2003. Acquisition of Body Image by Anthropomorphization Framework. Proceedings of Joint 3rd International Conference on Soft Computing and Intelligent Systems and 7th International Symposium on Advanced Intelligent Systems, 103 743 53 58.
3 - A. Green, 2001. C-Roids: Life-like Characters for Situated Natural Language User Interface. Proceedings of 15th International Symposium on Robot and Human Interactive Communication, 10, 140-145, Bordeaux-Paris, France.
4 - H. Kobayashi, S. Kohshima, 2001. Unique morphology of the human eye and its adaptive meaning: comparative studies on external morphology of the primate eye. Journal of Human Evolution, 40(5), 419-435.
5 - N. Mukawa, A. Fukayama, T. Ohno, N. Sawaki, N. Hagita, 2001. Gaze Communication between Human and Anthropomorphic Agent. Proceedings of 10th International Symposium on Robot and Human Interactive Communication, 10, 366-370, Bordeaux-Paris, France.
6 - Y. Nishida, H. Aizawa, T. Hori, N. H. Hoffman, T. Kanade, M. Kakikura, 2003. 3D ultrasonic tagging system for observing human activity. Proceedings of International Conference on Intelligent Robots and Systems, 1, 785-791.
7 - H. Osawa, J. Mukai, M. Imai, 2006. Anthropomorphization of an Object by Displaying Robot. Proceedings of 15th International Symposium on Robot and Human Interactive Communication, 15, 763-768, Hatfield, United Kingdom.
8 - H. Osawa, J. Mukai, M. Imai, 2007. Anthropomorphization Framework for Human-Object Communication. Journal of Advanced Computational Intelligence and Intelligent Informatics, 11(8), 1007-1014.
9 - B. Reeves, C. Nass, 1996. The Media Equation: How People Treat Computers, Television, and New Media Like Real People and Places. Univ. of Chicago Press.
10 - O. Sugiyama, T. Kanda, M. Imai, H. Ishiguro, N. Hagita, 2006. Three-layer model for generation and recognition of attention-drawing behavior. Proceedings of International Conference on Intelligent Robots and Systems, 5843-5850, IEEE/RSJ.