The paper presents an analysis of human reaching movements in the manipulation of flexible objects. Two models, the minimum hand jerk and the minimum driving hand force-change, are used for modelling and verification of experimental data. The data are collected with a haptic system supporting dynamic simulation of the flexible object in real time. We describe some initial experimental results and analyze the applicability of the models. It is found that even for short-term movements the human motion planning strategy can depend on arm inertia and configuration. This conclusion is based on the experimental evidence of multi-phased hand velocity profiles that can be well captured by the minimum driving hand force-change criterion. To support the latter observation, an experiment with reinforcement learning was conducted.
1. Introduction
Recently, reproducing human-like motions has become a focus of attention in many research fields such as human motor control and perception, humanoid robotics, and robotic rehabilitation and assistance (Pollick et al., 2005; Tsuji et al., 2002; Amirabdollahian et al., 2002). In a bio-mimetic analogy, the human arm can be considered as a chain of rigid bodies actuated by driving mechanisms (muscles) and controlled by a computer (central nervous system, CNS), which might be instructive for the design of control systems for advanced manipulators. However, little is known about the actual motion strategies planned by the CNS. Human motion planning models available in the literature remain mostly phenomenological and descriptive; they rely on bulky experimental measurements made with motion capture systems, encephalographs, force-feedback devices, etc. On the other hand, models based on optimal control methods are very attractive because they take into account trajectory formation, boundary conditions, and dynamic properties of the arm and environment. In addition, the minimized performance indexes may have a natural interpretation related to human behaviour.
When humans make rest-to-rest movements in free space, there is, in principle, an infinite choice of trajectories. However, many studies have shown that human subjects tend to choose unique trajectories with invariant features. First, hand paths in rest-to-rest movements tend to be straight (or slightly curved) and smooth. Second, the velocity profile of the hand trajectory is bell-shaped (Morasso, 1981; Abend et al., 1982). It is well established that for unconstrained reaching movements, the trajectory of the human hand can be predicted with reasonable accuracy by the minimum hand jerk criterion (MJC) (Flash & Hogan, 1985). More generally, in optimization approaches, the trajectory is predicted by minimizing an integral performance index over the movement time T.
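For unconstrained point-to-point movements, the minimum hand jerk trajectory has a well-known closed form, a fifth-order polynomial in normalized time, which produces the bell-shaped velocity profile described above. The sketch below evaluates it; the function names are illustrative, and the values L = 0.2 m and T = 1.15 s are borrowed from the experiments later in the text purely as an example.

```python
def min_jerk_position(t, L, T):
    """Hand position at time t for a free rest-to-rest reach of length L in time T."""
    tau = t / T  # normalized time in [0, 1]
    return L * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)


def min_jerk_velocity(t, L, T):
    """Time derivative of min_jerk_position: a symmetric bell-shaped profile."""
    tau = t / T
    return (L / T) * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)


# Example values (assumed here only for illustration).
L_EXAMPLE = 0.2   # m
T_EXAMPLE = 1.15  # s
peak_velocity = min_jerk_velocity(T_EXAMPLE / 2, L_EXAMPLE, T_EXAMPLE)
```

The peak velocity occurs at mid-movement and equals 1.875 L/T, so it depends only on the distance and duration of the reach, not on any inertial parameter.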
When movement is constrained by a 3D curve (door opening is a typical example of constrained movement), there is no uncertainty in the spatial trajectory, but the temporal hand velocity profile becomes an important indicator of human hand control. Haptic technologies afford great opportunities for studying human motion planning because virtually any constraints and dynamic environments can be probed for verifying optimality criteria. For example, in studying multi-mass object transport using a PHANToM-based haptic interface (Svinin et al., 2006a; Svinin et al., 2006b), it was shown that the MJC models hand movement much better than the lowest-order polynomial model that is common in the control of robotic and mechatronic systems with flexible elements. This led to the conclusion that the CNS plans reaching movements in the hand space coordinates rather than in the object space. It was speculated that the trajectories of the human arm in comfortable reaching movements can be predicted without taking into account the inertial properties of the arm, which gave good reason to believe that the arm dynamics are already “prewired” in the CNS while the object dynamics (the novel environment) are acquired by learning. In (Goncharenko et al., 2006) different curvature types of 3D constraints were considered for the tasks of rest-to-rest rigid body movement and bimanual crank rotation. Among several performance indexes, only two criteria were confirmed to be the best candidates for the description of motion control in these tasks: the MJC and the minimum force change criterion (MFCC).
Roughly speaking, the MFCC is a dynamic version of the MJC. While the latter ignores the inertial properties of the human arm, the former takes them into account. The two criteria give very close results for the hand velocity profiles if the stiffness of the haptic device is high enough or if the transported object is relatively lightweight. In general, however, the theoretical predictions of these criteria can be very different (Svinin et al., 2006c). It is therefore important to design experiments that help to distinguish between the two criteria and demonstrate which of them is the correct choice. This constitutes the main goal of this paper: to demonstrate experimentally that the hand mass-inertia properties and configuration cannot be ignored in the prediction of human motion planning in a highly dynamic environment.
This chapter is organized as follows. The next section formulates the MJC and MFCC for the task of a rest-to-rest transport of a flexible object and introduces a concept of dynamically equivalent configurations. Sections 3 and 4 describe primary experiments with a haptic system for two dynamic configurations. Section 5 describes experiments with reinforcement learning, and the last section concludes the chapter.
2. Optimality criteria for the task of rest-to-rest mass transport
A model of rest-to-rest movements is shown in Figure 1. The object is connected to the hand by a virtual spring of zero initial length. In the initial configuration, the positions of the hand and the object coincide. A human subject is asked to make a reaching movement of length L in time T and to stop the object without exciting oscillations. For this task, the MJC and its dynamic constraint are:
where xh is the coordinate of the human hand, xo is the object coordinate, mo is the mass of the object, and k is the stiffness of the spring. Defining the natural frequency
The boundary conditions corresponding to rest-to-rest states of both the hand and the object under the dynamic constraint (2) can also be expressed through xo alone:
The solution of the problem (1), (2) can be represented as a combination of a fifth-order polynomial and trigonometric terms, as was proven in (Svinin et al., 2006a; Svinin et al., 2006b).
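For orientation, one way to write the relations numbered (1) to (3) and the boundary conditions that is consistent with the symbol definitions above is sketched below. This is a reconstruction, not a quotation: it assumes the usual factor 1/2 in the criterion (which does not affect the minimizer) and zero hand velocity and acceleration at both ends of the movement.

```latex
\begin{align}
J &= \frac{1}{2}\int_0^T \left(\frac{d^3 x_h}{dt^3}\right)^2 dt
  && \text{(minimum hand jerk, cf. (1))} \\
m_o \ddot{x}_o + k\,(x_o - x_h) &= 0
  && \text{(object dynamics, cf. (2))} \\
\omega_o = \sqrt{k/m_o}, \qquad x_h &= x_o + \frac{\ddot{x}_o}{\omega_o^2}
  && \text{(hand coordinate through } x_o\text{)} \\
J &= \frac{1}{2}\int_0^T \left(x_o^{(3)} + \frac{x_o^{(5)}}{\omega_o^2}\right)^2 dt
  && \text{(criterion in terms of } x_o\text{, cf. (3))} \\
x_o(0) = 0,\quad x_o(T) &= L,\quad x_o^{(i)}(0) = x_o^{(i)}(T) = 0,\ \ i = 1,\dots,4
  && \text{(rest-to-rest boundary conditions)}
\end{align}
```

Under these assumptions the Euler-Lagrange equation is of tenth order in xo, and its characteristic polynomial has a sixfold zero root and double roots at ±iωo, which is consistent with a solution built from a fifth-order polynomial and trigonometric terms.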
It was also shown that the hand velocity profile corresponding to this solution can have either one phase (bell-shaped) or two phases, while the object velocity is always single-phased. For example, in Figure 2 the hand velocity for the MJC is shown by the thick black line and the object velocity by the thin black line. The graphs are given for T = 1.15 s, k = 150 N/m, mo = 3.2 kg, L = 0.2 m.
Unlike the MJC, the MFCC takes into account the hand dynamics:
where mh is the mass of the hand and f stands for the driving hand force. Again, we can rewrite the criterion (4) in a form similar to (3), taking into account (2), (5), and defining the natural frequency
From (6) and (3) it can be seen that the MFCC converges to the MJC when the mass ratio mo /mh ≪ 1. However, when the mass ratio is not negligibly small, the additional parameter mh significantly influences the solution of (6). Namely, there can be more than two phases in the hand velocity profile. In Figure 2 the hand velocity for the MFCC is shown by the thick grey line, and the object velocity by the thin grey line (T = 1.15 s, k = 150 N/m, mo = 3.2 kg, mh = 0.8 kg, L = 0.2 m). The complete solution and theoretical properties of the MFCC are given in (Svinin et al., 2006c). The portrait of the phase transition for the MFCC is shown in Figure 3, where the numbers inside the areas correspond to the number of phases.
In this figure, point A corresponds to the parameters used to calculate the profiles shown in Figure 2 (non-dimensional movement time 17.6, mass ratio mo /mh = 4).
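The relation between the two criteria can be made explicit with a short derivation. The sketch below assumes that the hand is modelled as a point mass mh driven by the force f and coupled to the object through the same spring; it is a reconstruction consistent with the symbol definitions in the text, not a quotation of the numbered equations (4) and (5).

```latex
\begin{align}
J &= \frac{1}{2}\int_0^T \left(\frac{df}{dt}\right)^2 dt
  && \text{(minimum hand force-change, cf. (4))} \\
m_h \ddot{x}_h + k\,(x_h - x_o) &= f
  && \text{(hand dynamics, cf. (5))} \\
f &= m_h \ddot{x}_h + m_o \ddot{x}_o
  && \text{(adding the object dynamics (2))} \\
\frac{df}{dt} &= m_h \left( x_h^{(3)} + \frac{m_o}{m_h}\, x_o^{(3)} \right)
\end{align}
```

In this form, for mo /mh ≪ 1 the integrand reduces to the squared hand jerk scaled by the constant mh², so the minimizer coincides with that of the MJC, in line with the statement that the two criteria agree for a relatively lightweight object.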
Note that one point on the non-dimensional phase diagram can correspond to different sets of physical parameters. In this connection we can define dynamically equivalent systems as systems corresponding to the same point on the phase transition diagram. Define
3. Experiment plan and setup configuration
It is interesting that for fixed mh , T, and L the velocity profiles given by the solutions of (3) and (6) are exactly the same for various mo and k that keep the mass ratio mo /mh and the natural frequency of the object constant. Then, to draw a conclusion in favour of either the MJC or the MFCC for each subject, we may select two different parameter sets that are dynamically equivalent to the parameters used for the hand velocity calculations. The profiles depicted in Figure 2 are clearly two-phased (MJC) and three-phased (MFCC), and their magnitudes are significantly different. Of course, we cannot expect each subject’s “effective” hand mass to be close to 0.8 kg. Because of the ergonomics of the experimental layout, the forearm mass can partially contribute to the “effective” mass. The standard anthropometric mass of the human forearm is 1.48 kg (Chandler et al., 1976); however, mh is uncertain and can vary from 0.5 to 1.5 kg, or even more if the arm joints are not fixed. To resolve this uncertainty, we completed two experimental series for each subject using the concept of dynamically equivalent systems in the following manner.
Step 1. As an initial guess, we assume mh = 0.8 kg and set the other parameters to T = 1.15 s, k = 150 N/m, mo = 3.2 kg, L = 0.2 m. When a subject completes a long series of trials, we compare his average hand velocity profile with those shown in Figure 2. If the average profile is three-phased and closely matches the MFCC curve, we conclude that the MFCC is preferable and that the hand mass is very close to 0.8 kg. Otherwise, the next step is carried out.
Step 2. Using a curve matching procedure, we estimate a new “effective” mh , recalculate the new dynamically equivalent parameters k and mo , and ask the subject to repeat the experimental trials. Hand mass and velocities are analyzed again after the series is completed.
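The recalculation in Step 2 can be illustrated with a short sketch. It assumes that dynamic equivalence means keeping the mass ratio mo/mh and the ratio k/mh unchanged, so that the point on the non-dimensional diagram does not move for a fixed T. The normalization of time by the natural frequency of the two-mass system, sqrt(k(1/mo + 1/mh)), is also an assumption: it is chosen because it reproduces the quoted value 17.6 for point A. The function names are hypothetical.

```python
import math


def nondimensional_point(mh, mo, k, T):
    """Return (non-dimensional time, mass ratio) for a hand-spring-object system.

    Assumption: time is scaled by the natural frequency of the two-mass
    system, sqrt(k * (1/mo + 1/mh)).
    """
    omega = math.sqrt(k * (1.0 / mo + 1.0 / mh))
    return omega * T, mo / mh


def equivalent_configuration(mh_new, mh0=0.8, mo0=3.2, k0=150.0):
    """Scale mo and k with the new hand mass so that mo/mh and k/mh are preserved."""
    scale = mh_new / mh0
    return mo0 * scale, k0 * scale


# Point A: the initial configuration used for Figure 2.
point_a = nondimensional_point(mh=0.8, mo=3.2, k=150.0, T=1.15)

# Reconfiguration for a re-estimated hand mass of 1.1 kg.
mo_new, k_new = equivalent_configuration(1.1)
point_new = nondimensional_point(mh=1.1, mo=mo_new, k=k_new, T=1.15)
```

With a re-estimated hand mass of 1.1 kg this gives mo = 4.4 kg and k = 206.25 N/m, which agrees with the rounded Table 3 entries for subjects S3 and S4, and the non-dimensional point is unchanged.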
To analyze human movements, we reconfigured our experimental setup (Figure 4), previously used for multi-mass object movement analysis (Goncharenko et al., 2006). In the setup, a haptic device (1.5/3DOF PHANToM, maximum exertable force 8.5 N) was connected to a computer (dual core 3.0 GHz CPU).
Five naïve right-handed male subjects were selected to participate in the experiment. The subjects were instructed to move a virtual flexible object “connected” to the hand through the PHANToM stylus. The hand-object system was at rest at the start point. The subjects were requested to move the object and smoothly stop both the hand and the object at a target point. They made these rest-to-rest movements along a straight line (from left to right) in the horizontal plane using the PHANToM stylus. The travelling distance was set to L = 0.2 m. The object dynamics were simulated using the 4th-order Runge-Kutta method with a fixed time step Δt = 0.001 s corresponding to the PHANToM cycle. The position and velocity of the hand and of the simulated object were recorded at 100 Hz. (Stylus position and velocity are measured by the hardware.) PHANToM feedback forces and object acceleration were recorded as well. The subjects were requested to produce the specified reaching movement in a natural way, at their own pace, trying to make as many successful trials as possible. To count successful trials we introduced the following set of tolerances: object and hand final position 0.2±0.005 m, object and hand final velocity 0±0.05 m/s, object final acceleration 0±0.16 m/s², hand start velocity 0±0.05 m/s, trial total time 1±0.2 s. A reaching trial is successful when the simulated and hardware-measured data satisfy all the above tolerances; haptic interaction is then stopped and an audio signal prompts the subject to proceed with the next trial.
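The simulation step and the success test described above can be sketched as follows. This is a minimal illustration, not the software used in the experiments: the function names are hypothetical, the object is a point mass on a linear spring, and the measured hand position is assumed to be held constant within each 1 ms integration step.

```python
def rk4_step(xo, vo, xh, mo, k, dt):
    """Advance the object state (xo, vo) by one 4th-order Runge-Kutta step.

    Assumption: the hand position xh is constant during the step.
    """
    def acc(x):
        # Spring force on the object divided by its mass.
        return k * (xh - x) / mo

    k1x, k1v = vo, acc(xo)
    k2x, k2v = vo + 0.5 * dt * k1v, acc(xo + 0.5 * dt * k1x)
    k3x, k3v = vo + 0.5 * dt * k2v, acc(xo + 0.5 * dt * k2x)
    k4x, k4v = vo + dt * k3v, acc(xo + dt * k3x)
    xo_new = xo + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    vo_new = vo + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return xo_new, vo_new


def trial_successful(xh_end, xo_end, vh_end, vo_end, ao_end,
                     vh_start, duration, L=0.2):
    """Check a trial against the tolerances listed in the text (SI units)."""
    return (abs(xh_end - L) <= 0.005 and abs(xo_end - L) <= 0.005
            and abs(vh_end) <= 0.05 and abs(vo_end) <= 0.05
            and abs(ao_end) <= 0.16
            and abs(vh_start) <= 0.05
            and abs(duration - 1.0) <= 0.2)
```

With the hand held still, repeated calls to `rk4_step` reproduce the undamped oscillation of the object about the hand position, which is a convenient sanity check of the integrator.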
Unlike in our previous experiments with multi-mass objects (Svinin et al., 2006a), the time tolerance is very narrow because the solutions of tasks (1), (4) are sensitive to T. To show the subjects that they are within the time window, we implemented additional visual feedback in the system (a colored semaphore). Taking into account that the initial hand speed tolerance is not relevant to the target point, the described task was expected to be difficult and sport-like, without a high success rate. In order to collect statistically representative datasets, the subjects were asked to complete 2000 trials each, split equally over two days, with a different object configuration on each day.
4. Preliminary experimental results
When all the subjects had completed the first series of 1000 trials on Day 1, the parameters mh , mo , and k were changed, the setup was reconfigured, and the subjects completed a new 1000-trial series with the new configuration on Day 2. In our previous experiments (Svinin et al., 2006a) a stable growth of motor learning progress (trial success rate) was observed. In this difficult task with narrow tolerance windows, the total success rate was low, ranging from about 4% to 28%, but still sufficient for statistical analysis (Table 1). On average, the second configuration was more difficult for the subjects. There were also no obvious learning trends within individual series: all the subjects quickly found their own control strategy after approximately the first 100-200 trials, and then the success rate remained variable, locally oscillating around 10-15% (see Figure 5 for an example). Sometimes successful trials followed one after another, and sometimes the subjects lost their control strategy for a long period. After 500 trials the subjects took breaks of about 15-20 min.
|Subject||Day 1 (1000 trials)||Day 2 (1000 trials)|
|S1||272 (27.2%)||71 (7.1%)|
|S2||149 (14.9%)||42 (4.2%)|
|S3||119 (11.9%)||72 (7.2%)|
|S4||280 (28.0%)||178 (17.8%)|
|S5||105 (10.5%)||120 (12.0%)|
The reaching time for successful trials varied within the time tolerance window (from 0.8 s to 1.2 s); its average was shifted from the centre of the window but was very close to 1.15 s for each subject (Table 2). This makes it possible to correctly map each individual trial profile to the unified time interval of 1.15 s.
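The text does not state how individual profiles are mapped to the unified interval. One plausible approach, shown here only as an assumption, is to rescale time linearly to the reference duration, resample on a common grid by linear interpolation, and scale the velocity by the ratio of durations so that the travelled distance is preserved. The function name and the grid size are hypothetical.

```python
def resample_to_reference(times, velocities, t_ref=1.15, n=101):
    """Map one trial's velocity profile onto [0, t_ref] using n samples.

    Assumptions: times are strictly increasing with at least two samples;
    time is rescaled linearly; velocity is multiplied by duration / t_ref
    so that the integral (distance) is unchanged.
    """
    t0 = times[0]
    duration = times[-1] - t0
    scale = duration / t_ref
    out_t, out_v = [], []
    j = 0
    for i in range(n):
        tau = i / (n - 1)              # normalized time in [0, 1]
        t = t0 + tau * duration        # corresponding time in the original trial
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        w = (t - times[j]) / (times[j + 1] - times[j])
        w = min(max(w, 0.0), 1.0)      # guard against rounding at the ends
        v = velocities[j] + w * (velocities[j + 1] - velocities[j])
        out_t.append(tau * t_ref)
        out_v.append(v * scale)
    return out_t, out_v


def average_profiles(profiles):
    """Point-wise mean of velocity profiles resampled on the same grid."""
    count = len(profiles)
    return [sum(p[i] for p in profiles) / count for i in range(len(profiles[0]))]
```

Once all successful trials share the same grid, their point-wise mean gives an average hand velocity profile that can be compared with the model predictions.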
|Subject||Day 1||Day 2|
We re-estimated new hand mass
where N is the number of measurements in each successful trial, M is the number of successful trials,
|Subject||Initial configuration (mh [kg], mo [kg], k [N/m])||Configuration after Day 1 (mh [kg], mo [kg], k [N/m])||Hand mass estimated after Day 2 [kg]|
|S1||0.8, 3.2, 150||1.3, 5.1, 239||1.5|
|S2||0.8, 3.2, 150||1.4, 5.4, 253||2.1|
|S3||0.8, 3.2, 150||1.1, 4.4, 206||1.4|
|S4||0.8, 3.2, 150||1.1, 4.4, 206||2.3|
|S5||0.8, 3.2, 150||0.9, 3.6, 169||0.9|
Initially, the subjects were not instructed to fix their elbow or shoulder joints. It is interesting that only subject S5 found his own comfortable arm configuration: he fixed his elbow joint on both experimental days, while the other subjects did not. This can partially explain the fact that the estimated hand masses are higher for subjects S1-S4 after Day 1 (Table 3).
After Day 1 subjects S1-S4 demonstrated slightly left-skewed two-phased hand velocity profiles with a maximal magnitude of less than 30 cm/s. The profile form cannot be explained quantitatively by either the MJC or the MFCC for the hand mass mh = 0.8 kg. However, applying the matching criterion (7) formally, one can find an optimal mh for the MFCC that is significantly different (after Day 1 and Day 2) from the initially assumed hand mass. Moreover, the matching error is lower for the MFCC than for the MJC. Figure 6 (left) shows that the error of the MJC is 0.057 while the error of the MFCC is 0.04 at the optimal “effective” hand mass of 1.4 kg. In the right part of Figure 6 the thick grey line shows the average experimental hand velocity profile, and the two thick black lines depict the profiles predicted by the MJC (two-phased) and the MFCC (three-phased) for mh = 0.8 kg. After finding the optimal “effective” hand masses and applying the principle of dynamically equivalent systems, the haptic simulator was reconfigured after Day 1 as shown in Table 3, and the experiments were repeated on Day 2. Nevertheless, the experimental hand velocity profiles remained two-phased for subjects S1-S4, with a magnitude of less than 30 cm/s. The second estimation by criterion (7) showed that there is uncertainty in the “effective” hand masses for subjects S1-S4.
At the same time, the statistically representative results for subject S5 (with fixed elbow joint) are strongly in favour of the MFCC. Figure 7 shows the experimental hand velocities for subject S5 and those predicted by the MFCC (at mh = 0.9 kg). The thick grey and black lines are the average experimental and predicted profiles, and the thin grey lines depict the last 30 successful trials on each experimental day. Matching by the criterion (7) showed that the re-estimated “effective” hand mass (0.9 kg) is very close to the initial estimate (0.8 kg).
The only difference between subject S5 and subjects S1-S4 is that S5 fixed his elbow joint by placing the elbow on a stand. Evidently, different muscle groups worked for S5 and for S1-S4, and the physical limits of S1-S4 may not have allowed them to reach velocities higher than 30 cm/s. Also, the significant difference between the hand masses estimated after Day 1 and Day 2 for S1-S4 means that modelling the “effective” hand mass as a point mass is dubious for an arm configuration without joint fixation.
5. Reinforcement learning and arm configuration
After the preliminary experiments it was decided to ask one subject from the group S1-S4 to repeat the experiments in order to check whether three-phased hand velocity profiles can be achieved after reinforcement learning. In the reinforcement learning task, the haptic system was repeatedly used in the following teaching mode: it was programmed to drag the subject’s hand close to the average trajectory of subject S5. In this case the subject’s hand passively followed the driving PHANToM stylus. The teaching mode was intended to let the subject learn the movement of subject S5.
Subject S3 participated in the experiment on Day 3. First, he completed 1000 trials in the teaching mode (Task A), and then, after a 20 min break, he was asked to reproduce the learnt movement 1000 times (Task B) in the standard simulator mode (mass-spring transport) described in the previous sections. In both series, his elbow joint was not fixed. Figure 8 shows the average hand velocity profiles of the subject for this experiment. The black line is the average profile of S3 after Day 1, and the light grey line (two-phased, left-skewed) is the average profile of Task B (after reinforcement learning). Even though the subject said that he remembered the desired movement from the teaching mode, it can be seen from Figure 8 that the profile of Task B is not three-phased. Moreover, he found the desired movement less comfortable than his previous self-learnt control strategy.
Finally, the subject was asked to complete Task A and Task B with his elbow joint fixed on a stand. In this case he found the desired movement much more comfortable, and the average hand velocity profile was very close to the profile predicted by the MFCC (Figure 8, three-phased profile). The “effective” hand mass estimated by (7) was 0.85 kg.
6. Conclusions
An analysis of human reaching movements in the task of mass transport is presented. Two models, the minimum hand jerk (MJC) and the minimum driving hand force-change (MFCC), are used for modelling and verification of experimental data. The data were collected with a haptic system supporting object dynamics simulation in real time. The importance of the research is that knowledge of human control strategies may be beneficial for the design of human-like control algorithms for advanced robotic systems. Perhaps the main contribution of the paper is the demonstration that human motion planning strategies cannot be captured by the minimum jerk criterion alone, without taking into account the configuration of the human arm and its inertia. For many reaching tasks the MJC and the MFCC predict similar hand velocities, and it is important to distinguish between the criteria.
First, we theoretically predicted (with the MJC and the MFCC) a special configuration of the mass-spring system for which the expected hand velocity profiles differ significantly in magnitude and number of phases. The experiments then demonstrated that human arm configuration and ergonomics are important factors for correct theoretical predictions of the hand velocity profiles. Statistically representative results for the arm configuration with a fixed elbow joint are strongly in favour of the MFCC. Therefore, the hand mass/inertia properties and ergonomics cannot be ignored for hand motion planning in a highly dynamic environment. For these skilful tasks a subject forms a unique natural hand velocity profile. Reinforcement learning, “programmed” by another skilful person’s profiles, may not provide comfortable control strategies for the subject.
In future research, it would be worthwhile to analyze the movements for different types of experimental scenarios. It would also be interesting to explain our experimental results without arm joint fixation by replacing equations (2), (5) with models of the arm with two links and joints, including the joint stiffness and viscosity and the dynamics of the hardware. In addition, we found that many of the experimental profiles were slightly skewed to the left. In this respect, studying non-zero boundary conditions (in particular, non-zero hand acceleration) of the optimization problems could clarify these effects.