Implementation of a Framework for Imitation Learning on a Humanoid Robot Using a Cognitive Architecture

Robots are designed to assist human beings in completing tasks. As described in many research journals and works of science fiction, researchers and the general public expect that one day robots will complete certain tasks independently and autonomously in our world. A typical implementation of Artificial Intelligence (AI) in robotics research is expected to incorporate Knowledge Representation, Automated Reasoning, Machine Learning, and Computer Vision [Russell and Norvig, 2010]. Since the beginning of this century, researchers in both robotics and artificial intelligence have highlighted the importance of the cognitive sciences [Russell and Norvig, 2010]. Cognitive Science is based on tests and experiments on animals and humans to obtain an understanding of cognition, whereas AI research is based on designing and testing algorithms on computer-based artificial systems. In their development, research in Cognitive Science and research in AI benefit each other. Cognitive Science provides theoretical foundations and solutions to problems in AI, and AI enhances research in Cognitive Science and suggests possible research directions for it. Recently, an emerging field named Cognitive Robotics has developed, which aims for robots to generate human-like behaviors and intelligence by integrating perception, action, learning, decision-making, and communication [Kawamura and Browne, 2009]. Cognitive Robotics largely incorporates concepts from research in the cognitive sciences and tries to simulate the information-processing architecture of human cognition. Currently, some long-term requirements have been proposed for Cognitive Robotics, which concentrate on the embodiment of cognitive concepts [Anderson, 2003; Sloman et al., 2006]. Increasing achievements in Cognitive Robotics have begun to attract the attention of the robotics research community.
However, it is still difficult to design true cognition, especially human-like cognition, on a robotic platform due to limitations in mechanism, computation, architecture, etc. [Brooks, 1991]. Therefore, on one side, researchers try to place robots in dynamic environments shared with humans to complete tasks [Brooks, 1991]; on the other side, researchers have not yet obtained a general architecture for generating complex behaviors for robots in such environments [Sloman, 2001]. In recent research, robots have been placed in robotic-aid settings, where they can assist humans to complete tasks [Tan et al., 2005].


An important feature of robotics research is the design of human-like mechanisms for robots [Brooks et al., 1999]. Humanoid robots, which are designed to be human-like, have gradually been selected as the platform for experimentally implementing the conceptual designs of Cognitive Robotics. An increasing number of humanoid robots have been designed to fulfill this dream (such as NASA Robonaut, Aldebaran's NAO, and ISAC) [Tan and Kawamura, 2011]. The human-like mechanism provides a possible platform for researchers to design human-like behaviors [Tan and Liao, 2007b], and it is expected that one day such human-like artificial creatures can live in our real world. However, there is no denying that the essential part of designing a truly human-like robot is to give robots the abilities and skills to generate human-like behaviors. Even for a robot with six degrees of freedom (DOF), it is not easy to plan motion in a human-existing environment [Yang et al., 2006]. Furthermore, the mechanism of a humanoid robot is much more complicated than that of an industrial robot, because the number of degrees of freedom is much higher. Therefore, it is difficult to use traditional methods to plan motions for humanoid robots. When humanoid robots are asked to carry out human-like behaviors, they are much more difficult to program or control because such behaviors are more complex. Additionally, as stated in the preceding paragraphs, humanoid robots are expected to be placed in a dynamic, human-existing environment to complete tasks. Therefore, researchers began to look for solutions in the cognitive sciences. Recently, imitation learning has been considered a powerful tool for rapidly transferring skills between humans and robots [Uchiyama, 1978; Schaal, 1999]. Human beings often try to imitate the behaviors of other people in their environment. This ability is especially important for children.
Concepts of imitation learning can be found in psychology and biology research [Billard, 2001]. Therefore, it is reasonable to implement imitation learning for robots, since we are striving to develop cognitive skills for robots. In typical imitation learning, first, human teachers demonstrate a behavior sequence to complete a certain task in a specified situation. Second, this behavior sequence is learned by recording the behavior, either using the encoders on the robot (if the demonstration is given by manually moving the arms of the robot) or by recording the trajectory in the task space (if the demonstration is given by showing the robot the task in the task space). Third, the robot generates similar behavior in similar but different situations to complete tasks. Imitation learning aims to teach robots behaviors and skills from demonstrations and to let the robots adapt them to similar situations, so that robots reproduce similar movements to complete similar tasks. Imitation learning algorithms can be divided into two categories: one trains robots to extract and learn the motion dynamics [Schaal and Atkeson, 2002; Ijspeert et al., 2003; Tan and Kawamura, 2011], and the other lets robots learn higher-level behaviors and action primitives by imitation [Dillmann et al., 2000; Bentivegna and Atkeson, 2001]. Both methods require a set of predefined basis behaviors to ensure convergence. For most tasks, a human teacher shows robots several behaviors in a behavior sequence to complete a task. That means the behavior sequence is composed of several behaviors, and each behavior has its own parameters.
Recent imitation learning methods mostly concentrate on reproducing new movements with dynamics similar to one single behavior in a behavior sequence [Ijspeert et al., 2002] or on generating a behavior that has dynamics similar to the whole behavior sequence [Dillmann et al., 2000]. The problem is therefore that, if robots are required to generate new behaviors in a similar situation but the task constraints have changed substantially, robots may sometimes fail to complete the task using current methods. In a Chess-Moving-Task, a humanoid robot named ISAC is required to grasp the Knight and move it two steps forward and subsequently one step left. The demonstration by a human teacher is shown in the left picture of Fig. 2. As shown in the middle picture of Fig. 2, if the Knight is placed not too far away from its position in the demonstration, ISAC can use current imitation methods to complete the task. But if the Knight is placed far away from its position in the demonstration, ISAC fails to grasp and move the piece in the expected style, as shown in the right picture of Fig. 2. That is because current learning methods can guarantee convergence to the global goal, but the local goals are not taken into consideration. Therefore, in a biologically inspired way, robots should understand the task as a behavior sequence. The reasons for choosing the segmentation of a behavior sequence are: first, the behavior-based cognitive control method provides a robust way of manipulating the object and completing a task through learning, because behavior-based methods can train robots to understand the situation and the task-related information; second, segmentation provides a more robust approach for robots to handle complicated tasks.
As in Fig. 2, if ISAC knows that this task is composed of several parts, the grasping sub-task can be guaranteed by adapting the first reaching behavior. However, robots do not have knowledge of how to segment the behavior sequence in a reasonable way. Former research methods segment behaviors based on their dynamics. However, these methods are not applicable to some complex tasks, in which a global goal is achieved by achieving several local goals.

Fig. 2. Chess-moving task
Fuzzy methods [Dillmann et al., 1995] and the Hidden Markov Model (HMM) [Yang et al., 1997] are normally used for segmentation. Another type of segmentation method detects change-points on the trajectory [Konidaris et al., 2010]. Kulic and Nakamura [Kulic et al., 2008] proposed a segmentation method based on the optical flow in the environment, which is a cluster-based method. These methods are not robust, because they rely on the analysis of the dynamics of the demonstration and on pre-defined behavior primitives, which are not suitable for extension. Readers can simply imagine a new situation in which the pre-defined behavior primitives cannot be used, or a noisy environment in which the sensory information of the demonstration can be largely affected. In this paper, we propose a cognitive segmentation method.
Imitation learning provides a possible solution for behavior generation for humanoid robots. However, researchers gradually found that, although it is possible to design algorithms and architectures for a specific task in a specific situation, it seems difficult to design a general imitation learning method that generates behaviors in a large number of different situations. Researchers have started to look for the solution in the cognitive sciences, because research in the cognitive sciences investigates the stable learning processes in human or animal brains and can possibly provide solutions for current behavior-based robotics research, especially for research on humanoid robots [Tan and Liang, 2011]. Recently, cognitive architectures have received broad attention from the robotics research community, because they provide methods for using cognitive processes [Tan and Liang, 2011]. Current cognitive architectures can be divided into four categories: Symbolic, Connectionist, Reactive (Behavior Based Connectionist), and Hybrid.
Some well-known Symbolic architectures include ACT-R [Anderson et al., 1997], SOAR [Laird et al., 1987], EPIC [Keiras and Meyer, 1997], Chrest [Gobet et al., 2001], and Clarion [Sun, 2003], in which symbolic knowledge is stored for automated reasoning. Among Connectionist architectures, BICS [Haikonen, 2007], Darwinism [Krichmar and Reeke, 2005], and CAP2 [Schneider, 1999] have been developed, in which connectionist methods are implemented to generate behaviors or decisions. A typical Reactive architecture is Subsumption [Brooks, 1986; Brooks, 1991], which directly couples the sensory information and the behavior primitives. Some researchers have begun to recognize the need for both deliberative interaction and reactive interaction in cognitive robots, which motivates the research on hybrid architectures [Kawamura et al., 2004]. Such integration offers the promise of robots that are both fluent in routine operations and capable of adjusting their behavior in the face of unexpected situations or demands. Typical Hybrid architectures include RCS [Albus and Barbera, 2005] and JACK [Winikoff, 2005]. In our lab, we developed a hybrid cognitive architecture named the ISAC Cognitive Architecture [Kawamura et al., 2004]. In this paper, we propose to use the ISAC Cognitive Architecture to implement imitation learning for a humanoid robot. The rest of the paper is organized as follows: Section II introduces the design of the proposed system, Section III describes the experimental setup, procedure, and results, Section IV discusses the proposed system and future work, and Section V summarizes the work in this paper.

System design
For ISAC, demonstrations are segmented into behaviors in sequences. The segmented behaviors are recognized based on the pre-defined behavior categories. The recognized behaviors are modeled and stored in a behavior sequence. When new task constraints are given to the robot, ISAC generates the same behavior sequence with new parameters for each behavior, the dynamics of which are similar to those of the behaviors in the demonstration. These behaviors are assembled into a behavior sequence and sent to the low-level robotic control system to move the arm and control the end-effectors to complete a task. Fig. 1 is the system diagram of the ISAC Cognitive Architecture, which is a multi-agent hybrid architecture. This cognitive architecture provides three control loops for cognitive control of robots: Reactive, Routine, and Deliberative. Behaviors can be generated through this cognitive architecture. Imitation learning is primarily involved in the Deliberative control loop. Three memory components are implemented in this architecture: the Working Memory System (WMS), the Short Term Sensory Memory (STM), and the Long Term Memory (LTM).

The knowledge learned from the demonstrations is stored in the Long Term Memory (LTM). When given a new task in a new situation, ISAC retrieves the knowledge from the LTM and generates a new behavior according to the sensed task information in the STM and WMS [Kawamura et al., 2008].
- Perceptual Agents (PA): The PA obtains the sensory information from the environment. Normally, the encoders on the joints of the robot, the cameras on the head of the robot, and the force feedback sensor on the wrist of the robot are implemented in this agent.
- Central Executive Agent (CEA): The CEA provides central processing, decision making, and control policy generation for the different task goals, which are stored in the Goal Agent (GA). In the hierarchical architecture, this component accesses all of the sensed information and makes decisions for tasks.
- Goal Agent (GA): Correspondingly, the GA stores the motivations or goals of tasks in situations.
- Long Term Memory (LTM): The LTM stores memory, especially knowledge, for long-term use. Procedural, semantic, and episodic knowledge are stored in this component. In imitation learning, the learned skill or knowledge is stored as procedural and episodic knowledge using a mathematical model.
- Internal Rehearsal System (IRS): The IRS evaluates the results of the decisions made by the CEA through internal rehearsal.

(Diagram labels: First-Order Response Agent; Executive Control Agent; Perception-Action Agent.)

The activity of ISAC can be divided into two stages: Learning from Demonstration and Generation by Imitating. Figure 2 displays the control loop for the first stage, Learning from Demonstration.

Learning from demonstration
This stage is divided into three steps: demonstration, segmentation, and recognition.

Demonstration
Assumption-We assume 1) human teachers are well-trained, 2) behaviors of the same part in different demonstrations should have similar dynamics, and 3) the behavior sequences of the demonstrations should be composed of the same number of behaviors.
For the Chess-Moving-Task, the Knight (we used blue cubes for easy grasping because the control of ISAC's arm is not very precise) was placed at several different positions, and several demonstrations were given to the robot by manually moving the robot's arm to complete the task. A human teacher gives each demonstration by manually moving the right arm of ISAC. The PA senses the movement of the right arm using the encoders on its joints, and records the movement of the Knight using the camera mounted on the head of ISAC. Sensed information is stored in the STM as data matrices. Three data structures are used for the recorded demonstration: $\theta$, $P_{\mathrm{Knight}}$, and $P_{\mathrm{effector}}$. $\theta$, which is an $N \times 7$ matrix, records the joint angles of the robot's right arm and the temporal information on the sampling points, and is used to calculate the position of the end-effector in the Cartesian space. $P_{\mathrm{Knight}}$, which is a matrix with $M$ rows, records the positions of the Knight on the chess board in the Cartesian space in the demonstration and the temporal information related to the sampling points. $P_{\mathrm{effector}}$, which is a matrix with $N$ rows, records the position values of the robot's end-effector and the temporal information related to the sampling points:

$$P_{\mathrm{effector}} = \text{forward kinematics}(\theta) \quad (1)$$

These data arrays are stored in the STM for the CEA to process.

Segmentation
Segmentation is important for the system proposed in this paper. We propose that the segmentation be based on the change of world (environmental) states. A change of world states occurs, for example, when a static object in the environment begins to move, when a signal in the environment moves from one area to another, or when some object comes into the environment. The CEA obtains the segmentation method from the LTM and segments the sensed information in the STM. In the Chess-Moving-Task, we define the change of the world states as the status of the Knight changing from static to moving. Upon that, the behavior sequence can be segmented.
The position of the Knight observed up to time step $t$ is treated as a probabilistic distribution with mean $\mu_t = \frac{1}{t}\sum_{i=1}^{t} P_i$, and the variance of this probabilistic distribution is

$$\sigma_t^2 = \frac{1}{t}\sum_{i=1}^{t} \left\| P_i - \mu_t \right\|^2 .$$

At time step $t+1$, the distance $d_{t+1}$ between $P_{t+1}$ and the mean of this distribution can be calculated through simple calculation:

$$d_{t+1} = \left\| P_{t+1} - \mu_t \right\| .$$

Heuristically, if $|d_{t+1}|$ exceeds a heuristic threshold, the Knight can be considered as having been moved from its initial position. Initial knowledge of the segmentation method is given to the robot. In the experiment, ISAC observed the environment and set an array T to record the temporal information; the elements of this array are the times at which the state of the Knight changed. The recorded array T is used as the segmentation parameters.
Assumption-In order to implement the behavior recognition, the 'grasping' behavior is defined in advance, because it is almost impossible to observe the grasping using cameras. When the Knight begins to move along the trajectory of the end-effector on the robot's arm, we can assume that a grasping happened just before this movement. In the real implementation, a grasping behavior is added between the 'static' status and the 'begins to move' status of the Knight. A behavior sequence is obtained as {Behavior 1, Grasping, Behavior 2}.
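As a minimal sketch of the change-of-world-state test described above, the following Python function keeps running statistics of the resting position and reports the first time stamp at which a new sample lies too far from the mean. The function name `first_move_time` and the parameters `k`, `warmup`, and `floor` are illustrative assumptions; the text does not specify the threshold actually used on ISAC.

```python
import math
from statistics import fmean


def first_move_time(positions, times, k=3.0, warmup=5, floor=1.0):
    """Return the time stamp at which a tracked object leaves its resting position.

    positions: sequence of (x, y, z) samples from the camera.
    times: matching time stamps.
    k, warmup and floor are illustrative assumptions, not values from the text.
    Returns None if no movement is detected.
    """
    rest = []  # samples attributed to the static state so far
    for p, t in zip(positions, times):
        if len(rest) >= warmup:
            # Mean resting position, computed per axis.
            mu = [fmean(axis) for axis in zip(*rest)]
            # Root-mean-square distance of the resting samples to their mean.
            sigma = math.sqrt(fmean(math.dist(q, mu) ** 2 for q in rest))
            # floor keeps the threshold above zero for nearly noise-free data.
            if math.dist(p, mu) > k * max(sigma, floor):
                return t
        rest.append(p)
    return None
```

In this sketch the returned time stamp would be the single element of the array T for the Chess-Moving-Task; samples before it belong to Behavior 1 and samples after it to Behavior 2, with the predefined grasping behavior placed in between.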

Recognition
The CEA obtains the criterion of recognition from the LTM and recognizes the behaviors in the segmented behavior sequences. For each behavior, the sampled points should be pre-processed before modeling. Prior to that, it is necessary to determine what type of behavior they belong to. Therefore, the sampled points are aligned and reduced to a low-dimensional space to extract the features. PCA [Wold et al., 1987], LLE [Roweis and Saul, 2000], Isomap [Tenenbaum et al., 2000], etc. can be used to realize this step. For this particular Chess-Moving-Task, the sampled trajectory is simply projected onto the X-Y plane, because the variations in the Z-direction are very small. Behaviors are divided into two types. One is the 'Common Behavior', whose parameters can be modified according to different task constraints, e.g., different position values of the Knight on the chess board. The other is the 'Special Behavior', whose parameters remain the same under different task constraints. The criterion for judgment is based on scaling. In the latent space, if the axes of the trajectories of the behaviors satisfy the scaling criterion, we consider the behavior a 'Special Behavior'; otherwise, it is a 'Common Behavior'.
The probability of a common behavior, $P_{\mathrm{common}}$, is calculated from this criterion, and the probability of a special behavior is calculated as $P_{\mathrm{special}} = 1 - P_{\mathrm{common}}$. Heuristically, these probabilities are compared against thresholds to decide whether the behavior is considered a common behavior or a special behavior. However, this criterion is not applicable in all situations. It should be modified through an iterative machine learning process. Intuitively, a 'Common Behavior' is one whose parameters we can modify to adapt it to different situations, and a 'Special Behavior' is one that, in specific tasks, should not be changed, so that robots follow the demonstration strictly. After that, points are interpolated to align the signals in order to extract the common features of the behaviors. Based on the assumption in section 2.1, the position values on the obtained behavior trajectory are calculated as the average of the sampled position values at the common timing points:

$$\bar{P}(t_k) = \frac{1}{n}\sum_{i=1}^{n} P_i(t_k),$$

where $n$ is the number of demonstrations and $P_i(t_k)$ is the position in demonstration $i$ at the common timing point $t_k$. The learned knowledge is stored in the LTM for the generation stage.
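The interpolation and averaging step can be illustrated with a short Python sketch. It assumes linear interpolation onto evenly spaced common timing points and handles one coordinate at a time; the function names `resample` and `average_demonstrations` and the choice of linear interpolation are assumptions for illustration, since the text does not state which interpolation scheme was used.

```python
def resample(times, values, n):
    """Linearly interpolate one coordinate onto n evenly spaced timing points.

    times must be increasing with at least two samples, and n must be >= 2.
    """
    t0, t1 = times[0], times[-1]
    out = []
    j = 0  # index of the left neighbour of the current query time
    for k in range(n):
        t = t0 + (t1 - t0) * k / (n - 1)
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        span = times[j + 1] - times[j]
        w = (t - times[j]) / span if span else 0.0
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return out


def average_demonstrations(demos, n=50):
    """Average several demonstrations of one coordinate at common timing points.

    demos: list of (times, values) pairs, one pair per demonstration.
    """
    resampled = [resample(t, v, n) for t, v in demos]
    return [sum(column) / len(column) for column in zip(*resampled)]
```

Because each demonstration is resampled over its own duration, demonstrations of different lengths are aligned in normalized time before they are averaged, which matches the assumption that the same behavior has similar dynamics across demonstrations.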

Generation by Imitating
The robot obtains the environmental information from the PA and sends it to the STM. The CEA analyzes the sensed information from the STM and sends it to the GA to generate the goal of the task. The CEA gets the generation method from the LTM and generates the same behavior sequence. In the behavior sequence, behaviors have dynamics similar to those of the behaviors in the demonstrations. The generated behavior sequences are sent to the WMS.
Subsequently, they are sent to the actuators to complete tasks in similar but slightly different situations. In behavior modeling, the classical Locally Weighted Projection Regression (LWPR) method is used to model the trajectory of a 'Common Behavior'. Heuristically, 10 local models are chosen to model the trajectory with specific parameters. For the general LWPR method, see [Vijayakumar and Schaal, 2000]. Given new task constraints, specifically the new position value of the Knight on the chess board, ISAC uses the same behavior sequence to reach, grasp, and move the Knight. The behavior sequence is obtained in section 2.2, and DMP is chosen to generate a new trajectory for behavior 1. The DMP method, originally proposed by Ijspeert [Ijspeert et al., 2003], records the dynamics of demonstrations and generates new trajectories with similar dynamics; it is an effective tool for robots to learn from human demonstrations. The formulation of the original DMP algorithm is given by the differential equations

$$\tau \dot{z} = \alpha_z \left( \beta_z (g - y) - z \right), \qquad \tau \dot{y} = z + f,$$

where $g$ is the goal state, $z$ is the internal state, $f$ (a LWPR model) is calculated to record the dynamics of the demonstration, $y$ is the position generated by the differential equations, and $\dot{y}$ is the correspondingly generated velocity. $\alpha_z$, $\beta_z$, and $\tau$ are constants in the equations. Following the original DMP paper, $\alpha_z$, $\beta_z$, and $\tau$ are chosen heuristically as 1, 1/4, and 1 to achieve convergence. When the position value of the Knight is given, the CEA generates a new trajectory which has dynamics similar to the demonstration of behavior 1. After the grasping, behavior 2 is generated by strictly following the behavior obtained in the learning stage.
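To make the role of the DMP constants concrete, the sketch below integrates the standard two-equation DMP form, $\tau\dot{z} = \alpha(\beta(g - y) - z)$ and $\tau\dot{y} = z + f$, with a forward Euler scheme for a single coordinate. It is an illustration under stated assumptions rather than the implementation used on ISAC: the function name `rollout_dmp`, the step size, the duration, and the representation of the learned term as a plain function of time (`forcing`, standing in for the LWPR model) are all hypothetical.

```python
def rollout_dmp(y0, g, forcing=lambda t: 0.0, alpha=1.0, beta=0.25, tau=1.0,
                dt=0.01, duration=40.0):
    """Integrate a one-dimensional DMP with forward Euler and return the positions.

    y0: start position; g: goal position.
    forcing: hypothetical stand-in for the learned (LWPR) term, as a function of time.
    alpha, beta, tau default to the constants 1, 1/4 and 1 quoted in the text.
    dt and duration are illustrative assumptions.
    """
    y, z = y0, 0.0
    ys = [y]
    for i in range(int(round(duration / dt))):
        t = i * dt
        z_dot = alpha * (beta * (g - y) - z) / tau   # tau * dz/dt = alpha*(beta*(g - y) - z)
        y_dot = (z + forcing(t)) / tau               # tau * dy/dt = z + f
        z += z_dot * dt
        y += y_dot * dt
        ys.append(y)
    return ys
```

With a zero forcing term these constants give a critically damped system, so the position approaches the goal without overshoot; a learned forcing term that decays over time reshapes the path while the trajectory still converges to the new goal, which is what allows behavior 1 to be re-targeted to a new Knight position.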

Experimental results
The proposed experimental scenario is: 1) a human teacher demonstrates a behavior by manually moving the right arm of ISAC to reach, grasp, and move the Knight; 2) ISAC records the demonstrations using the encoders on its right arm and the cameras on its head; 3) ISAC generates several behaviors to complete tasks in similar but different situations. Using the cognitive segmentation method proposed in this paper, each of the three demonstrations is segmented into 2 parts, as shown in Fig. 6. Each part is considered as a behavior. The black circle marks the point of the change of the world states, where the Knight began to move with the end-effector. Based on the assumption in section 2.1, the first behaviors of all demonstrations are considered to be the same type of behavior, and likewise for the second behaviors. As shown in Fig. 6, the three demonstrations are each segmented into 2 behaviors with a changing point, which is considered as the grasping behavior. In Fig. 9, on each dimension, the sampled trajectories of the different behaviors in each demonstration are normalized in time and magnitude. The magnitude was normalized to the range (0, 1) for each dimension of the trajectory. This is possible because DMP is chosen for trajectory generation and DMP only requires the dynamics of the trajectory. In practice, the DMP algorithm automatically normalizes the sampled trajectory; therefore, different time periods and magnitudes do not affect the results of the experiments. In Fig. 7, the first row shows the trajectories of Behavior 1 in the X-direction, the second row shows the trajectories of Behavior 1 in the Y-direction, and the third row shows the trajectories of Behavior 1 in the Z-direction.
The first column is the trajectory of Behavior 1 in demonstration 1, the second column is the trajectory of Behavior 1 in demonstration 2, the third column is the trajectory of Behavior 1 in demonstration 3, and the fourth column is the processed trajectory of Behavior 1, which will be used for future behavior generation. Behavior 2 is calculated by averaging the sampled position values at the common timing points. In Fig. 9, the left figure is the trajectory of Behavior 2 in demonstration 1 on the X-Y plane, the left middle figure is the trajectory of Behavior 2 in demonstration 2 on the X-Y plane, the right middle figure is the trajectory of Behavior 2 in demonstration 3 on the X-Y plane, and the right figure is the processed trajectory of Behavior 2, which will be used for future behavior generation. By comparing Fig. 11 with the fourth column of Fig. 9, similar dynamics of the two trajectories in the X, Y, and Z directions can be found. Grasping is added between Behavior 1 and Behavior 2 based on the assumption in section 2.2.

Fig. 13. A generated new behavior sequence in Cartesian space

Fig. 13 shows the generated behavior sequence for the new task constraints. The red line is behavior 1, the blue line is behavior 2, and the intersection point of the two behaviors is the 'grasping' behavior.
The simulation results show that the end-effector moves to reach the Knight at (450, 215, -530) with dynamics similar to behavior 1 in the demonstration, and then moves the Knight in the same way as behavior 2. Although the position of the Knight has been changed, the movement of the Knight is the same as in the demonstration. From Fig. 13, it is concluded that this algorithm successfully trained the robot to reach, grasp, and move the chess piece in the expected way. Practical experiments were carried out on the ISAC robot as shown in Fig. 13. The Knight (blue piece) was placed at (450, 215, -530), which is far away from its position in the demonstrations, and ISAC was asked to reach, grasp, and move it to the red grid in an expected way similar to the demonstrations. The experimental results demonstrate that ISAC successfully learns this behavior sequence and can reach, grasp, and move the piece in the expected way.
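The normalization in time and magnitude applied to the sampled trajectories can be sketched as follows for one coordinate of one behavior. The function name `normalize_trajectory` is an assumption, and the sketch uses a simple min-max rescaling, which maps the end points onto 0 and 1; the text does not state the exact normalization formula.

```python
def normalize_trajectory(times, values):
    """Rescale time stamps and one coordinate of a trajectory to the unit interval.

    Returns (normalized_times, normalized_values). Min-max rescaling is an
    illustrative assumption about how the normalization is carried out.
    """
    t0, t1 = times[0], times[-1]
    lo, hi = min(values), max(values)
    if t1 == t0 or hi == lo:
        raise ValueError("trajectory must span a non-zero time and magnitude range")
    norm_t = [(t - t0) / (t1 - t0) for t in times]
    norm_v = [(v - lo) / (hi - lo) for v in values]
    return norm_t, norm_v
```

After this step, demonstrations of different durations and amplitudes share the same axes, so only the shape of the movement, that is, its dynamics, remains for the trajectory model to learn.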

Discussion and future work
The main idea of this paper is to demonstrate the effectiveness of cognitive segmentation of the behavior sequence. From the simulation and the practical experiments, we can conclude that this method is successful and useful for future study on cognitive imitation learning. Because ISAC is pneumatically driven and not very precise, it cannot grasp the piece every time, although it can move the end-effector according to the behavior sequence and complete the movement successfully. Fig. 12 shows only one successful experiment. We have finished the current work of applying the proposed method on the simulation platform and on the ISAC robot. Our long-term goal is to design a robust behavior generation method which enables robots to work safely in a dynamic human-robot interaction environment. This requires that the system have the ability to learn knowledge from demonstration, develop new skills through learning, and adapt the skills to new situations. Therefore, robots should learn from the demonstration, store the knowledge, and retrieve the knowledge for future tasks. For this reason, in cognitive robotics, the functions of robotic control, perception, planning, and interaction are always incorporated into cognitive architectures. Upon that, researchers can use general cognitive processes to enhance imitation learning processes and store the knowledge for other cognitive functions.

www.intechopen.com
The Future of Humanoid Robots - Research and Applications

Some researchers are interested in transferring skills and behaviors between robots. Skill transfer between robots is not implemented in this paper; however, it can still be incorporated in this cognitive architecture. Skill transfer is divided into two parts: demonstration and observation. Assume that ISAC is asked to transfer the skills to another robot, named Motoman. ISAC demonstrates the behavior sequences to Motoman, strictly following the demonstrations from the human teacher. Therefore, there are three demonstrations to reach, grasp, and move the Knight. The only difference between the observation by ISAC and the observation by Motoman is that Motoman should observe the demonstrations using the camera and convert the recorded data in the task space, which is the movement of the right hand of ISAC, to the joint space. $O$, which is a matrix with $N$ rows, records the position values of the end-effector on the right arm of ISAC and the temporal information related to the sampling points:

$$\theta = \text{inverse kinematics}(O) \quad (13)$$

$P_{\mathrm{Knight}}$ is still a matrix with $M$ rows recording the positions of the Knight on the chess board in the Cartesian space in the demonstration and the temporal information related to the sampling points. After the observation, Motoman uses the same process as described in the preceding paragraphs to segment, recognize, and generate new behaviors in similar but slightly different situations. In the future, our lab intends to implement skill transfer using the framework and cognitive architecture described in this paper. The existing problem is that this cognitive framework relies heavily on the vision system. In this paper, we proposed a cognitive segmentation method using the visual information from the cameras. As is well known, a vision system is not very stable and is easily affected by environmental factors, e.g., light, perspective, and noise.
Therefore, how to design a stable vision system is crucial for robotics research, especially for humanoid robots and cognitive robots [Tan and Liao, 2007a].
Another possible direction for future work is to design a probabilistic imitation learning method, especially for the generation stage. It is known that the sensed information and the results of actions are uncertain. Therefore, robots should make probabilistic decisions in imitation learning to complete tasks. Generally, imitation learning is considered to be the learning of control policies [Argall, 2009]. Therefore, how robots choose a policy from many candidate policies can be designed in a probabilistic way. Machine learning is normally an iterative process in which computers or robots obtain experience from exercises, whether good or bad. Decision making is improved by utilizing the results of the iterative learning in practice. In Atkeson's famous inverted pendulum experiment, the robot tried to improve its performance through a feedback process. Initially, the robot may fail several times to hold the pendulum in a balanced position. However, it can gain experience from the failures and finally succeed. Currently, two types of teaching methods can be used for robots to improve and speed up the learning process: one is to provide the demonstration at the beginning and let robots try to complete a similar but different task by reproducing the demonstration; the other is to train robots to learn starting from scratch and to correct the behavior of the robots during the learning process. Both methods are effective for robotic imitation learning. We expect that the two kinds of methods can be combined by simulating the way in which human beings learn from demonstrations from childhood. Currently, Cognitive Science receives broad attention from the robotics research community. It is expected that robots can behave like real humans and live with us in the future, and it is reasonable for researchers to seek solutions or motivation from interdisciplinary research.

Conclusion
This paper proposes a cognitive framework that incorporates cognitive segmentation and the DMP algorithm in a cognitive architecture in order to generate new behaviors in similar but different situations using imitation learning. The simulation and experimental results demonstrate that this method is effective in solving basic problems in imitation learning when the task constraints change substantially. The main contribution of this paper is that it provides a framework and architecture for robots to complete some complicated tasks, especially in situations where several task constraints have been changed. A cognitive segmentation method is proposed in this paper, and the experimental results demonstrate that the integration of robotic cognitive architectures with imitation learning technologies is successful. Fundamentally, current research in imitation learning for robots is still a control problem in which the amount of sensory information increases greatly. Cognitive robots should understand the target of the task, incorporate the perceptual information, and use cognitive methods to generate suitable behaviors in a dynamic environment. This paper provides a possible solution, which can be used in different cognitive architectures, for future cognitive behavior generation.