The goal of this research is to examine if and how aided target recognition (AiTR) cueing capabilities facilitate multitasking (including operating a robot) by gunners in a military tank crewstation environment. Specifically, we examine if gunners are able to effectively perform their primary task - maintaining local security - while performing a pair of secondary tasks: (1) managing a robot and (2) communicating with fellow crew members. According to Mitchell (2005), who used the Improved Performance Research Integration Tool (IMPRINT) to examine the workload of the crew of a future tank system, the gunner is the most viable option for performing the robotics control tasks compared to the other two positions (i.e. vehicle commander and driver). She found that the gunner had the fewest instances of overload and, therefore, may be able to assume control of the robot. However, she also discovered that there were instances in the model when the gunner dropped his/her primary tasks of detecting and engaging targets to perform robotics tasks, which could be catastrophic for the team and mission during a real operation. If the gunner is the individual who will most likely be assigned the responsibility of robotics control, then it is important to consider what design changes will be necessary to allow successful multitasking without a critical performance decrement in maintaining local security.
Based on Mitchell’s modeling work, Chen and Joyner (2009) conducted a simulation experiment. The results showed that, when the robotics operator had to perform robot targeting and local security (i.e. gunner’s tasks) at the same time, both workload and performance degraded compared with a baseline single-task condition. More specifically, as the robotics task became more difficult, the participants’ gunnery task performance became worse and their assessed workload increased. Indeed, past research in dual-task performance has shown that operators may encounter difficulties when both tasks involve focal vision. For example, Horrey and Wickens (2004) demonstrated that participants could not effectively detect road hazards while operating in-vehicle devices. Additionally, Murray (1994) found that as the number of monitored displays increased, the operators’ reaction time for their target search tasks also increased linearly. In fact, response times almost doubled when the number of displays increased from 1 to 2 and from 2 to 3 (a slope of 1.94 was obtained). Since both the gunnery and the robotics tasks in Chen and Joyner were heavily visual, we considered tapping into another modality - touch. Our hypothesis was that parsing additional information by using relatively untapped modalities might alleviate the resource demands and could help the operator effectively transition between displays (Wickens, 2002).
1.2. Tactile cueing
In the current study, we examined if and how tactile cueing, which delivered simulated AiTR capabilities (i.e. cues to the direction of a potential target), enhanced gunner’s performance in a military multitasking environment. In the first experiment, the simulated AiTR was perfectly reliable; in the second experiment, it was either false-alarm prone (FAP) or miss prone (MP). Sklar and Sarter (1999) found tactile cueing to be particularly useful for target detection and response time with a concurrent visual task, both in conjunction with visual cueing and alone. Terrence et al. (2005) compared spatial auditory and spatial tactile cues and found that participants perceived the tactile cues both faster and more accurately. In another study by Krausman et al. (2005), on the other hand, tactile cueing was not found to be more effective than auditory cueing in terms of response time, although it was more effective than visual cueing. Additionally, participants rated tactile cueing as the most helpful among the three types of alerts.
Spatial attention has been found to have cross-modal links across visual, auditory, and tactile inputs (Spence & Driver 1997). The level of effectiveness of one spatial information display relative to other display modalities may be dependent on the operational context of the experimental procedure (i.e. the demands of the tasks). Ho et al. (2005) found vibrotactile alerts were powerful directors of spatial attention in simulated driving scenarios, with faster responses even when reliability levels made the alerts spatially non-predictive. Clearly, there are potential benefits to offloading information to the relatively underutilized sensory pathways, though the exact nature of the performance gains is in need of further elucidation. With proper implementation, tactile alerts may improve performance when multitasking with man-machine interfaces (Van Erp & Van Veen 2004).
1.3. Imperfect automation and multitasking performance
In the real world, cueing systems are often FAP or MP, depending on the threshold settings of the alert. Meyer (2001, 2004) suggests that FAP and MP alerts have distinct effects on operators’ use of automated systems. Specifically, high FA rates reduce the operator’s compliance with automation (compliance defined as taking actions based on the alerts). Conversely, high miss rates reduce the operator’s reliance on automation (reliance defined as failure to take precautionary actions when there is no alert). Wickens, Dixon, Goh et al. (2005) showed that the operator’s automated task performance degraded when the FA rate of the alerts for the automated task was high. On the other hand, when the miss rate was high, the concurrent task performance was affected more than the automated task because the operator had to allocate more visual attention to monitor the automated task. Similarly, Dixon and Wickens (2006) showed that FAs and misses affected compliance and reliance, respectively, and their effects appeared to be relatively independent of each other.
In contrast to Meyer’s model and the aforementioned findings, Dixon et al. (2007) showed that FAP automation impaired “performance more on the automated task than did miss-prone automation, (e.g. the “cry wolf” effect) and hurt performance (both speed and accuracy) at least as much as MP automation on the concurrent task (p. 570-571).” FAP automation was found to affect both operator compliance and reliance, while MP automation affected only operator reliance. The authors suggested that the FAP automation had a negative impact on reliance because of the operator’s overall reduced trust in the automated system. Similarly, Wickens, Dixon, and Johnson (2005) demonstrated a greater cost associated with FAP automation (than with MP automation), which affected both the automated and concurrent tasks.
Furthermore, Wickens and Dixon (2005) demonstrated that when the reliability level is below approximately 70%, operators often ignore the alerts. In their meta-analytic study, Wickens and Dixon found that “a reliability of 0.70 was the ‘crossover point’ below which unreliable automation was worse than no automation at all.” Although Wickens and his colleagues have done extensive research in this area, their studies were conducted in a different environment (unmanned aerial vehicle control display monitoring), and they did not use tactile cueing. The current study was the first one to examine these issues in the context of combined roles of gunner and robotics operator. Since an AiTR cannot have a perfect reliability rate in foreseeable real world operations, the data from this study should provide useful information to the design community of future military systems, in which AiTR will play an integral role.
1.4. Individual differences in spatial ability and attentional control
In the current study, we also sought to investigate the effects of individual differences in spatial ability (SpA) and perceived attentional control (PAC) on the operators’ concurrent performance. SpA has been found to be a significant factor in virtual environment navigation (Stanney & Salvendy, 1995), learning to use a medical teleoperation device (Eyal & Tendick, 2001), target search tasks (Chen et al., 2008, Chen & Joyner, 2009), and robotics task performance (Cassenti et al., 2009, Lathan & Tracey, 2002, Menchaca-Brandan et al., 2007). For example, Lathan and Tracey (2002) demonstrated that people with higher SpA performed better in a teleoperation task through a maze. They finished their tasks faster and had fewer errors. In a recent study, Cassenti et al. (2009) demonstrated that robotics operators with higher SpA (measured by a mental rotation test) performed robot navigation tasks significantly better than those with lower SpA. Our previous studies (Chen et al., 2008, Chen & Joyner, 2009) also found SpA to be a good predictor of the operator’s robotics and gunnery task performance. In the domain of visual spatial displays, Stanney and Salvendy (1995) found that high SpA individuals outperformed those with low SpA on tasks that required visuo-spatial representations to be mentally constructed. While many SpA tests focus on visually presented stimuli, the interconnections of sensory modalities at the level of spatial perception may translate into differential effects of multisensory spatial displays across SpA levels (Spence et al., 2004).
In addition to SpA, we also examined the relationship between attentional control and multitasking performance. Several studies show that there are individual differences in multitasking performance, and some people are less prone to performance degradation during multitasking conditions (Rubinstein et al., 2001, Schumacher et al., 2001). There is some evidence that attention-switching flexibility can predict performance of such diverse tasks as flight training and bus driving (Kahneman et al., 1973). There is also evidence that people with better attention control can allocate their attention more flexibly and effectively (Bleckley et al., 2003, Derryberry & Reed, 2002), and this was partially confirmed by Chen and Joyner (2009). It is likely that operators with different levels of attention switching abilities may react differently to automated systems with FAs and misses. In other words, operators’ compliance and reliance behaviors may be altered based on their ability to effectively switch their attention among the systems. For example, the complacency effect may be more severe for poor attentional control individuals compared with those with better attentional control. The current study sought to examine if the compliance vs. reliance effects reported in the literature might be moderated by individual attentional control.
1.5. Current study
In the current study, we simulated a military tank crewstation environment and incorporated AiTR signals (tactile or a combination of tactile and visual) to help participants locate potential threats in the immediate environment while controlling a robot. The primary task of the gunner was to determine which action to take, based on a visual determination of whether a potential threat was hostile or neutral. This task was to be performed while conducting other tasks (including the remote targeting task with the robot and a concurrent communication task). In the first experiment, the simulated AiTR was perfectly reliable; in the second experiment, it was either FAP or MP. For the first experiment, it was hypothesized that tactile signals would improve performance in both the gunnery and the robotics control tasks as they could signal the appropriate times to transition from the robotics control tasks back to the gunner’s primary task of maintaining local security around the simulated vehicle. The tactile signals also provide directional information along the azimuth for targets around the vehicle, which may also facilitate performance. Additionally, assisting the gunnery task with the AiTR was expected to enhance the operators’ performance of the concurrent tasks, as more mental resources could be directed to these tasks (Young & Stanton, 2007a, Dixons et al., 2004). Past research has shown that automation can help reduce the performance gap between experts and novices (Young & Stanton, 2007a). It is, therefore, reasonable to expect greater performance improvement for the participants with lower SpA when automation is introduced.
For the second experiment, based on the data from Wickens, Dixon, Goh et al. (2005), we expected that the operator’s gunnery (automated) task performance would degrade if the FA rate of the AiTR for the gunnery system was high because of reduced compliance with the automation. Conversely, if the cueing was MP, the operator’s robotics (concurrent) task performance would be affected more than the gunnery task because of reduced reliance on the automation. More mental and visual resources would be devoted to checking the raw data for the automated task, and therefore, the performance of the concurrent task would be degraded. On the other hand, there was evidence that FAP automation was more detrimental to both the automated and concurrent tasks than MP automation (Dixon et al., 2007). Therefore, it is likely that FAP automation would have a more negative impact on the overall performance than would MP automation. In other words, there have been conflicting results in the literature regarding the independence of the effects of FAP and MP automation on operator compliance and reliance. It is possible that individual differences may be responsible for some of the observed differences in the literature. Therefore, we investigated the effects of individual differences on FAP and MP conditions as a possible explanation for the discrepancies.
2. Experiment 1
Twenty college students (4 females and 16 males, mean age = 21.0) participated in this study. Participants were compensated $8 per hour or with class credit for their participation.
The experiment was conducted using the Tactical Control Unit (TCU) (developed by the U.S. Army Research Laboratory’s Robotics Collaborative Technology Alliance) for the robotics control tasks (Figure 1). The TCU is a one-person crew station from which the operator can control several simulated robots, which can either perform their tasks semi-autonomously or be teleoperated. The operator performed the instructed robotics tasks through the use of a 19 in. touch-screen display. A joystick was used to manipulate the direction in which the robotic vehicles moved when in Teleop mode. The robot simulated in our study was the eXperimental unmanned vehicle (XUV) developed by the Army Research Laboratory. The gunnery station was implemented using an additional screen and controls to simulate the out-the-window view and line-of-sight fire capabilities (Figure 1). The interface consisted of a 15 in. flat panel monitor and a joystick. Participants used the joystick to rotate the viewfinder 360 degrees, zoom in and out, and engage targets.
AiTR Displays.
To augment target detection in the gunnery component, visual and tactile alerts were used to cue the participant to the direction of a target as determined by the AiTR. The visual alerts were displayed in the lower right area of the screen, with the target icons presented around the overhead-view diagram of the simulated vehicle gunner station. The target icon appeared in one of eight possible locations around the gunner, corresponding to 45˚ increments along a 360˚ azimuth. As the gunner rotated the view, the turret portion of the vehicle diagram moved along the eight possible orientations to allow the gunner to place his/her field of view on the cued target.
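The eight-position cueing geometry described above can be illustrated with a short sketch. It assumes (these conventions are not stated in the text) that bearings are measured clockwise in degrees, that position 0 is straight ahead of the current view and positions increase clockwise, and that a target is assigned to the nearest 45° position. The function name `cue_position` is hypothetical.

```python
NUM_POSITIONS = 8                   # eight cue locations, as described in the text
SECTOR_DEG = 360 / NUM_POSITIONS    # 45 degrees between neighbouring locations


def cue_position(target_bearing_deg: float, view_heading_deg: float) -> int:
    """Return the cue index for a target, relative to the gunner's current view.

    Assumed convention: 0 = straight ahead, 2 = right, 4 = directly behind,
    6 = left (indices increase clockwise).
    """
    # Bearing of the target relative to where the gunner is looking, in [0, 360).
    relative = (target_bearing_deg - view_heading_deg) % 360
    # Snap to the nearest 45-degree position; the final modulo folds 360 onto 0.
    return int((relative + SECTOR_DEG / 2) // SECTOR_DEG) % NUM_POSITIONS


if __name__ == "__main__":
    # A target directly behind the gunner maps to the rear position.
    print(cue_position(180, 0))
    # After the view rotates 90 degrees to the right, the same target is cued on the right.
    print(cue_position(180, 90))
```

Under these assumptions, the cue index changes only when the relative bearing crosses a sector boundary, which is one way a display could step through the eight orientations as the view rotates.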
Tactually, target positions relative to the gunner were presented using eight electromechanical transducers known as ‘tactors,’ each delivering a 250 Hz sinusoidal, salient (approximately 20 dB above threshold) vibrotactile stimulus harmlessly to the skin. The eight tactors were arranged equidistantly on an elasticized belt worn around the abdomen just above the navel. This configuration was based upon research conducted by Cholewiak et al. (2004) who found that additional tactors within this ring reduced inter-tactor distance and compromised localization performance. The tactile stimulus parameters were programmed onto a battery-powered controller board governing all eight tactors. This board was, in turn, controlled by a computer running the simulation and presenting targets for the visual and tactile conditions. The tactile stimulus had a 300 ms duration, which was determined based upon the simulation’s refresh rates for updating AiTR information. To match the visual condition as closely as possible, a target that was directly behind the gunner (6 o’clock position) would cause the tactor on the spine to activate. If the gunner moved the turret to the right, the vibrotactile stimulus would then appear to move along the right side of the body. If the tactor above the navel was active, then this indicated the corresponding hostile target should now be in the gunner’s field of view. Participants had an opportunity to familiarize themselves with both types of signals during training.
Communication Task Materials.
The communication task was administered concurrently with the experimental scenarios. The questions included simple military-related reasoning tests and simple memory tests. These cognitive tasks were included to simulate an environment where the gunner was communicating with fellow crew members in the vehicle. For the reasoning tests, there were questions such as ‘if the enemy is to our left, and our UGV is to our right, what direction is the enemy to the UGV?’ For the memory tests, the participants were asked to repeat some short statements or keep track of three radio call signs (e.g. “Bravo 83”) and they had to report to the experimenter whether the call signs they heard were one of those they were keeping track of. Test questions were pre-recorded by a male speaker and were presented at the rate of one question every 33 seconds via a synthetic speech program, DECtalk®.
Questionnaires and Spatial Tests.
A demographics questionnaire was administered at the beginning of the training session. The Cube Comparison (Ekstrom et al., 1976), the Hidden Patterns tests (Ekstrom et al., 1976), and the Spatial Orientation Test (Gugerty & Brooks, 2004) were used to assess participants’ SpA. The Cube test requires participants to compare, in 3-minutes, 21 pairs of 6-sided cubes and determine if the rotated cubes are the same or different (only 3 sides of each cube are shown). The Hidden Patterns test measures flexibility of closure and involves identifying specific patterns or shapes embedded within distracting information. The Spatial Orientation test, modeled after the cardinal direction test developed by Gugerty and Brooks (2004), is a computerized test consisting of a brief training segment and 32 test questions. Both accuracy and response time were automatically captured by the program. Participants were designated as high SpA or low SpA based on their composite scores of the three spatial tests (median split).
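The median-split classification described above can be sketched as follows. The text does not state how the composite score was formed, so this sketch assumes one common approach: standardize each test to z-scores and average them. It also assumes that a participant exactly at the median is labeled low. The function names and the example scores are hypothetical.

```python
from statistics import mean, median, pstdev


def z_scores(values):
    """Standardize raw scores (assumes the scores are not all identical)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def median_split_spa(cube, hidden_patterns, orientation):
    """Label each participant 'high' or 'low' SpA from three test scores.

    Assumption: the composite is the mean of the three z-scores; the
    source only says a composite of the three tests was used.
    """
    standardized = zip(z_scores(cube), z_scores(hidden_patterns), z_scores(orientation))
    composite = [mean(triple) for triple in standardized]
    cut = median(composite)
    # Assumption: scores exactly at the median fall in the low group.
    return ["high" if c > cut else "low" for c in composite]


if __name__ == "__main__":
    # Hypothetical scores for four participants (not data from the study).
    labels = median_split_spa(
        cube=[5, 10, 15, 20],
        hidden_patterns=[50, 60, 70, 80],
        orientation=[10, 14, 18, 22],
    )
    print(labels)
```

Standardizing before averaging keeps a test with a larger raw range from dominating the composite; a simple sum of raw scores would be an equally plausible reading of the text.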
A questionnaire about attentional control (Derryberry & Reed, 2002) was used to evaluate participants’ PAC. The attentional control survey consists of 21 items and measures perceived attention focus and shifting. The scale has been shown to have good internal reliability (α = .88). Derryberry and Reed conducted an experiment to examine the relationship between self-reported (i.e. attentional control survey score) and actual attentional control. They found that participants with a high survey score could better resist interference in a Stroop-like spatial conflict task. In one of our previous studies (Chen & Joyner, 2009), we observed a positive, although somewhat weak, relationship between attentional control survey score and some multitasking performance measures.
Participants’ workload was evaluated using the computer-based version of NASA-TLX (Hart & Staveland, 1988). Finally, a usability questionnaire was used to assess participants’ reliance on tactile and/or visual cueing for the gunnery task when both types of alerts were available. Participants rated their preference on a 5-point scale (from 1 to 5: entirely visual- predominately visual- both visual & tactile- predominately tactile- entirely tactile).
2.1.3. Experimental design
The overall design of the experiment is a 2 x 2 x 3 mixed design. The between-subject variable is participants’ SpA (low vs. high). The within-subject variables are Robotics Task type (Auto vs. Teleop) and AiTR type (Baseline- no alerts vs. Tactile alerts only vs. Tactile + Visual alerts) (see Procedure). There were six within-subject conditions:
Auto-BL (baseline): No alerts + control of a semi-autonomous UGV
Teleop-BL: No alerts + Teleoperating a UGV
Auto-Tac: Tactile alerts + control of a semi-autonomous UGV
Teleop-Tac: Tactile alerts + Teleoperating a UGV
Auto-TacVis: Tactile alerts + Visual alerts + control of a semi-autonomous UGV
Teleop-TacVis: Tactile alerts + Visual alerts + Teleoperating a UGV
The reliability level of the alerts was 100%. However, only hostile targets were cued, not the neutral targets. The participants had to detect the neutral targets on their own. It was decided to not include a visual-cueing condition due to the fact that our simulated environment was heavily visual. Therefore, visual alerts were not expected to be effective if not combined with a non-visual modality.
After the informed consent process, participants were administered the surveys and spatial tests. After these tests, participants received training, which was self-paced and was delivered by PowerPoint® slides showing the elements of the TCU, steps for completing various tasks, several mini-exercises for practicing the steps, and 2 exercises for performing the robotics tasks (details presented later). After the tutorial on TCU, participants were trained on the gunnery tasks. The entire training session lasted about 2.5 hrs.
The experimental session took place on a different day but within a week of the training session. Before the experimental session began, participants were given some practice trials and review materials, if necessary, to refresh their memories. After the refresher training, participants completed one combined exercise in which they performed all three tasks (i.e. gunnery, robotics, and communication tasks) at the same time. Participants then changed into one of the laboratory cotton T-shirts in order to standardize how the tactors were applied to the skin. The experimenter then measured the participant around the abdomen just above the navel, adjusted the tactile belt, and arranged the tactors so that they were equidistant for the participant’s abdomen. Once fitted with the tactile display, the participant was seated in front of the gunner monitor. A test pattern would confirm that all eight tactors were working properly and that the participant could readily perceive the stimuli. The experimenter then explained the nature of the AiTR system and the corresponding visual or tactile cues that would be provided.
In the experimental trials, participants’ tasks were to use their robot to locate targets (i.e. enemy dismounted soldiers) in the remote environment and also find targets in their immediate environment. The tank was simulated as traveling along a designated route, which was approximately 4.3 km and lasted about 15 minutes. There were 10 hostile and 10 neutral targets randomly placed along the route in each gunnery scenario. Hostile targets were enemy soldiers dressed in military uniform and carrying a gun; neutral targets were civilians dressed in typical Middle Eastern attire without any weapons. Participants were instructed to engage the hostile targets and verbally report spotting the neutral targets. Only hostile targets were cued (in the non-baseline conditions); no alerts occurred when neutral targets appeared, so participants had to detect the neutral targets independently. In total, there were six 15-minute scenarios, corresponding to the six experimental conditions, the order of which was counterbalanced according to a Williams square design.
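A Williams square for the six conditions can be generated with the standard cyclic construction, sketched below. This illustrates the general technique only; the text does not report the specific sequences used or how participants were assigned to them. The function name `williams_square` is hypothetical, and the condition labels are taken from the list of six within-subject conditions.

```python
CONDITIONS = ["Auto-BL", "Teleop-BL", "Auto-Tac", "Teleop-Tac", "Auto-TacVis", "Teleop-TacVis"]


def williams_square(n: int):
    """Build an n x n Williams square (n even) as rows of condition indices.

    Each condition appears once in every row and every column, and each
    ordered pair of distinct conditions occurs exactly once as
    immediate neighbours (first-order carryover balance).
    """
    if n % 2 != 0:
        raise ValueError("this construction needs an even number of conditions")
    # First row interleaves low and high indices: 0, 1, n-1, 2, n-2, ...
    first, low, high = [0], 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Remaining rows are cyclic shifts of the first row.
    return [[(idx + shift) % n for idx in first] for shift in range(n)]


if __name__ == "__main__":
    for row in williams_square(len(CONDITIONS)):
        print(" -> ".join(CONDITIONS[i] for i in row))
```

With an even number of conditions, a single square of six sequences is enough for first-order balance; an odd number of conditions would need two squares.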
There were two types of robotics tasks: Auto and Teleop. The Auto control task required the operator to monitor the video feed as the robot traveled autonomously, examine still images generated from the reconnaissance scans, and detect targets. The Teleop task required the operator to manually manipulate and drive the robot (using a joystick) along a predetermined route using the TCU to detect randomly placed targets for each scanning checkpoint. For both the Auto and Teleop tasks, upon detecting a target, participants needed to place the target on the map, label the target, and then send a spot-report.
While the participants were performing their gunnery and robotics tasks, they simultaneously performed the communication task by answering questions delivered to them via DECtalk®. There were two-minute breaks between experimental scenarios. Participants filled out the NASA-TLX after they completed each scenario and the usability survey at the end of the experimental session.
The dependent measures include mission performance (i.e. number of targets detected in the remote environment using the robot and number of targets detected in the immediate environment), communication task performance, and workload assessment.
2.2.1. Target detection performance
Gunnery Task.
Participants were designated as high SpA or low SpA based on their composite SpA test scores (median split). A mixed analysis of variance (ANOVA) was performed to examine the effects of the concurrent robotics tasks on the gunnery task performance (percentage of hostile targets detected), with the Robotics Task condition (Auto vs. Teleop) and the AiTR condition (Baseline vs. Tac vs. TacVis) being the within-subject factors and SpA (High vs. Low) as the between-subject factor. The analysis revealed that AiTR condition significantly affected number of targets detected, F(2, 36) = 78.6, p <.001. Simple contrasts with the Baseline condition as the reference category showed that target detection in Baseline was significantly lower than in the Tac and TacVis conditions. Participants with higher SpA had significantly higher gunnery task performance than did those with lower SpA, F(1, 18) = 5.7, p <.05 (Figure 2).
Participants’ detection of neutral targets was also assessed. Since the AiTR only alerted the participants when hostile targets were present, the neutral target detection could be used to indicate how much visual attention was devoted to the gunnery station. An ANOVA revealed a significant main effect for both Robotics, F(1, 19) = 13.2, p <.005, and AiTR, F(2, 38) = 18.1, p <.0001. Post-hoc (LSD) tests showed that Baseline was highest and Tac was lowest, and the differences between each pair were all significant.
Robotics Task.
Since participants’ task performance in the Auto condition was assisted by the capabilities of the TCU, it was determined that only the performance data from the Teleop condition would be included for the analyses. Performance data from the Tac and TacVis conditions were merged to form the AiTR condition and were compared with the Baseline condition. Performance in the Baseline condition was significantly lower than in the AiTR condition, F(1,18) = 5.3, p <.05. Those with higher SpA outperformed those with lower SpA in the Baseline condition, F(1,18) = 5.9, p <.05, but not in the AiTR conditions (Figure 3).
2.2.2 Communication task performance
Performance data from the Tac and TacVis conditions were again merged to form the AiTR condition and were compared with the Baseline condition. The difference between these two conditions was significant, F(1, 19) = 7.4, p <.05, with performance lower in the Baseline (no AiTR) condition.
2.2.3 Workload assessment
Weighted ratings of the scales of the NASA-TLX were used for this analysis. Participants’ perceived workload was significantly affected by the Robotics condition, F(1, 18) = 5.2, p <.05, as well as the AiTR condition, F(2, 32) = 4.3, p <.05 (Figure 4). The workload assessment was higher in the Teleop condition (M = 70.22) and when the gunnery task was unassisted by the AiTR (M = 70.5).
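As background on the weighted ratings used here, the standard NASA-TLX procedure weights each of the six subscale ratings by the number of times that subscale was chosen in 15 pairwise comparisons, then divides the weighted sum by 15. The sketch below shows that arithmetic; the function name and the example ratings and weights are hypothetical and are not data from this study.

```python
SUBSCALES = ("mental", "physical", "temporal", "performance", "effort", "frustration")


def weighted_tlx(ratings: dict, weights: dict) -> float:
    """Compute an overall weighted NASA-TLX workload score (0-100 scale).

    ratings: subscale rating on a 0-100 scale.
    weights: times each subscale was selected across the 15 pairwise
             comparisons (each 0-5, summing to 15).
    """
    if sum(weights[s] for s in SUBSCALES) != 15:
        raise ValueError("weights must sum to 15 (one per pairwise comparison)")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15


if __name__ == "__main__":
    # Hypothetical single-participant example.
    ratings = {"mental": 80, "physical": 30, "temporal": 75,
               "performance": 40, "effort": 70, "frustration": 55}
    weights = {"mental": 5, "physical": 0, "temporal": 4,
               "performance": 2, "effort": 3, "frustration": 1}
    print(round(weighted_tlx(ratings, weights), 1))
```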
2.2.4. AiTR display usability assessment
A usability questionnaire captured participant preferences for presentation of AiTR information. Following their interaction with the AiTR systems, 65% of participants responded that they either relied predominantly or entirely on the tactile AiTR display. Only 15% responded that they either relied predominantly or entirely on the visual AiTR display. AiTR preference was also significantly correlated with participants’ SpA (i.e., composite score of the spatial tests), r =.53, p =.016.
3. Experiment 2
The goal of this experiment was to examine the effects of unreliable alerts on gunners’ concurrent performance of gunnery, robotics, and communication tasks. Both tactile and visual displays were incorporated to provide directional cueing for the gunnery targeting task (based on a simulated AiTR capability). Two types of imperfect AiTR were simulated: false-alarm-prone (FAP) and miss-prone (MP). We were particularly interested in investigating discrepancies in previous research related to compliance and reliance effects as a function of type of AiTR error. Effects of individual differences in SpA and perceived attentional control (PAC) were also evaluated.
Twenty-four college students (4 females and 20 males, mean age = 22.3) participated in this study. Participants were compensated $15/hr or with class credit for their participation.
The simulators and cueing displays were identical to those used in Experiment 1. The simulated AiTR was either FAP or MP, with a reliability level at 60%. The low reliability level was deliberately chosen to investigate if the compliance vs. reliance effects as well as the individual differences reported previously in the literature would be amplified in the high workload multitasking environment in the current study. The FAP condition consisted of ten hits (i.e. alerts when there were targets), eight FAs (i.e. alerts when there were no targets), no misses (i.e. no alerts when there were targets), and two correct rejections (CRs) (i.e. no alerts when there were no targets). The MP condition consisted of two hits, no FAs, eight misses, and ten CRs.
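The outcome counts above can be checked against the stated 60% reliability level with a short calculation. The sketch assumes that reliability means the proportion of cueing opportunities with a correct outcome (hits plus correct rejections, divided by all outcomes); that definition is inferred from the counts rather than stated in the text. The class and property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CueingOutcomes:
    hits: int                 # alert given, target present
    false_alarms: int         # alert given, no target
    misses: int               # no alert, target present
    correct_rejections: int   # no alert, no target

    @property
    def total(self) -> int:
        return self.hits + self.false_alarms + self.misses + self.correct_rejections

    @property
    def reliability(self) -> float:
        # Assumed definition: proportion of correct outcomes.
        return (self.hits + self.correct_rejections) / self.total

    @property
    def hit_rate(self) -> float:
        return self.hits / (self.hits + self.misses)

    @property
    def false_alarm_rate(self) -> float:
        return self.false_alarms / (self.false_alarms + self.correct_rejections)


# Counts as reported for the two Experiment 2 conditions.
FAP = CueingOutcomes(hits=10, false_alarms=8, misses=0, correct_rejections=2)
MP = CueingOutcomes(hits=2, false_alarms=0, misses=8, correct_rejections=10)

if __name__ == "__main__":
    for name, cond in (("FAP", FAP), ("MP", MP)):
        print(name, cond.reliability, cond.hit_rate, cond.false_alarm_rate)
```

Under this definition both conditions have the same overall reliability while differing sharply in hit rate and false-alarm rate, which is what allows the two error types to be compared at a matched reliability level.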
The communication task materials, spatial tests, and surveys (i.e., Attentional Control Survey, NASA-TLX, and Usability Survey) were identical to those used in Experiment 1. Participants were also asked to evaluate their trust in the AiTR system using a modified survey by Jian et al. (2000) (items 22-33).
3.1.3. Experimental design
The overall design of the study is a 2 x 3 mixed design. The between-subject variable is AiTR type (FAP vs. MP). The within-subject variable is Robotics Task type (Monitor vs. Auto vs. Teleop) (see Procedure).
The preliminary session (i.e., surveys and spatial tests) and the training session were identical to Experiment 1 and lasted about 2.5 hrs. The experimental procedure was also identical to Experiment 1, except that it followed the training session on the same day and the participants were told that the AiTR cueing was unreliable. There were three types of robotics tasks: Monitor, Auto, and Teleop. The Monitor task required the operator to continuously monitor the video feed as the robot traveled autonomously and verbally report detection of targets. There were twenty targets (five hostile and fifteen neutral) along the route. The Auto and Teleop tasks were identical to those in Experiment 1. While the participants were performing their gunnery and robotics control tasks, they simultaneously performed the communication task by answering questions delivered to them via DECtalk®. There were 2-min breaks between experimental scenarios. Participants assessed their workload using the computerized NASA-TLX after each scenario. They also evaluated their perceived utility of and trust in the AiTR at the end of the experiment. The entire experimental session lasted about 1 hr.
The dependent measures include mission performance (i.e. number of targets detected in the remote environment using the robot and number of hostile/neutral targets detected in the immediate environment), communication task performance, and perceived workload.
3.2.1. Target detection performance
3.2.1.1. Gunnery Task.
A mixed ANOVA was performed to examine the effects of the concurrent robotic control tasks on gunnery task performance (percentage of hostile targets detected), with the AiTR condition (FAP vs. MP) as the between-subject factor and the Robotics Task condition (Monitor vs. Auto vs. Teleop) as the within-subject factor. The analysis revealed that the Robotics condition significantly affected the number of targets detected, F(2, 15) = 4.6, p <.05 (Figure 5). Post hoc (LSD) tests showed that target detection in the Monitor condition was significantly higher than in the Auto and Teleop conditions. Neither AiTR nor the Robotics x AiTR interaction was significant.
Participants with higher SpA had significantly higher gunnery task performance than did those with lower SpA, F(1, 16) = 6.3, p <.05. When comparable data from both experiments were examined in the same analysis (with only the TacVis condition from Experiment 1 and the Auto and Teleop conditions from Experiment 2), it was found that AiTR reliability contributed significantly to hostile target detection performance in the gunnery task, F(2,30) = 11.8, p <.001. Post hoc (LSD) tests showed that detection with perfectly reliable AiTR (Experiment 1) was significantly higher than with MP, and detection with FAP was also significantly higher than with MP, p’s <.05.
Participants’ SpA was found to affect their gunnery task performance, and there was a significant SpA x AiTR reliability interaction (Figure 6). As Figure 6 shows, there was a large difference between low SpA and high SpA individuals in the FAP condition.
Participants were classified as high or low PAC based on their attentional control survey scores (median split). There was a significant AiTR x PAC interaction, F(1, 16) = 7.4, p <.05 (Figure 7, upper left). Those with lower PAC performed better with the FAP cueing, whereas those with higher PAC performed at a similar level regardless of the AiTR conditions.
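The median-split classification described above can be illustrated with a small sketch. The participant IDs and scores below are hypothetical, not data from the study, and the handling of scores exactly at the median (assigned to the low group here) is an assumption, since the text does not say how ties were treated.

```python
from statistics import median


def median_split(scores: dict[str, float]) -> dict[str, str]:
    """Label each participant 'high' or 'low' relative to the group median.

    Assumption: scores equal to the median are labeled 'low'.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low")
            for pid, score in scores.items()}


# Hypothetical attentional control survey scores for four participants.
example_scores = {"P01": 52, "P02": 61, "P03": 47, "P04": 70}
print(median_split(example_scores))
```

A median split keeps group sizes roughly equal but discards within-group variation in the survey scores, which is worth bearing in mind when reading the PAC interactions reported here.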
In order to further examine the effect of task load on reliance on AiTR, the data of the MP condition were analyzed separately. Due to the small sample size (N = 12), no significant differences were found between those with high vs. low PAC, F(1, 10) = 1.4, p >.05. However, the trend was evident that, while those with high PAC maintained a fairly stable level of reliance throughout the experimental conditions, those with low PAC became increasingly reliant on the AiTR (and missed more targets) as task load became heavier (i.e. Teleop > Auto > Monitor, based on Chen & Joyner, 2009) (Figure 8). For the low PAC participants, the difference between the Monitor and Teleop conditions was statistically significant, F(1, 6) = 7.1, p <.05.
Participants’ detection of neutral targets was also assessed. Since the AiTR only alerted the participants when hostile targets were present, the neutral target detection could be used to indicate how much visual attention was devoted to the gunnery station. A mixed ANOVA revealed a significant main effect for Robotics, F(2,15) = 4.4, p <.05. Post hoc tests (LSD) showed that neutral target detection in the Teleop condition was significantly lower than in the Auto condition. The main effect for AiTR failed to reach statistical significance, F(1, 22) = 3.3, p >.05. There was a significant AiTR x PAC interaction, F(1, 16) = 3.6, p <.05 (Figure 7, upper right panel). Those with lower PAC performed at about the same level, regardless of the AiTR type, while those with higher PAC performed better with the MP cueing than with the FAP cueing. When comparable data from both experiments were examined in the same analysis (with only the TacVis condition from Experiment 1 and the Auto and Teleop conditions from Experiment 2), it was found that both the main effect of Robotics and the Robotics x PAC interaction were significant, F(1,30) = 8.8, p =.006 and F(1,30) = 4.5, p =.04 respectively (Figure 9). The difference between low PAC and high PAC individuals was larger in the Teleop condition than in the Auto condition.
3.2.1.2. Robotics Task.
A mixed ANOVA revealed that there was a significant main effect for Robotics, F(2,15) = 25.4, p <.001 (Figure 10). The Monitor condition was significantly higher than both the Auto and the Teleop conditions, in terms of percentage of targets detected. The main effect for AiTR was not significant, p >.05. There was a significant Robotics x AiTR interaction, F(2,32) = 4.0, p <.05. The Monitor task performance stayed at the same level regardless of the AiTR types. The Auto task performance was slightly higher with the MP cueing (although the difference failed to reach statistical significance), while the Teleop task performance was significantly higher with the FAP cueing (p <.05). There was also a significant AiTR x PAC interaction, F(1,16) = 4.8, p <.05 (Figure 7, lower left panel). Those with lower PAC had a better performance with the FAP cueing, while those with higher PAC performed better with the MP cueing.
3.2.2. Communication task performance
A mixed ANOVA revealed that there was a significant main effect for Robotics, F(2,44) = 3.3, p <.05. Communication task performance in the Monitor condition was significantly higher than in the Teleop condition, F(1,22) = 5.5, p <.05. Neither the main effect for AiTR nor the Robotics x AiTR interaction was significant, p’s >.05 (Figure 7, lower right panel). When comparable data from both experiments were examined in the same analysis (with only the TacVis condition from Experiment 1 and the Auto and Teleop conditions from Experiment 2), it was found that the main effect of AiTR reliability was significant, F(2,29) = 5.3, p =.011 (Figure 11). Post hoc (LSD) tests showed that communication task performance in Experiment 1 (perfect reliability) was significantly better than with either FAP or MP (p’s <.05).
3.2.2. Workload assessment
Participants’ self-assessment of workload (weighted ratings of the scales of the NASA-TLX) was significantly affected by the Robotics condition, F(2,15) = 25.1, p <.001 (Figure 12). The perceived workload was significantly higher in the Teleop condition (M = 77.7) than in the Auto condition (M = 69.6) and the Monitor condition (M = 61.1). The difference between Auto and Monitor was also significant. The main effect for AiTR was not significant, p >.05. There was a significant Robotics x AiTR interaction, F(2,15) = 5.5, p <.05.
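For readers unfamiliar with how a weighted NASA-TLX rating is formed, the sketch below follows the standard procedure: six subscales are each rated from 0 to 100, each subscale receives a weight equal to the number of times it was chosen across 15 pairwise comparisons, and the overall score is the weighted sum divided by 15. The ratings and weights shown are hypothetical, and the function name is illustrative; the study's computerized implementation is not described in the text.

```python
SUBSCALES = (
    "mental_demand", "physical_demand", "temporal_demand",
    "performance", "effort", "frustration",
)


def weighted_tlx(ratings: dict[str, float], weights: dict[str, int]) -> float:
    """Overall weighted NASA-TLX workload on a 0-100 scale."""
    # The 15 pairwise comparisons mean the six weights must sum to 15.
    if sum(weights[s] for s in SUBSCALES) != 15:
        raise ValueError("pairwise-comparison weights must sum to 15")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15


# Hypothetical ratings (0-100) and weights (0-5) for one participant.
ratings = {"mental_demand": 80, "physical_demand": 30, "temporal_demand": 75,
           "performance": 40, "effort": 70, "frustration": 55}
weights = {"mental_demand": 5, "physical_demand": 0, "temporal_demand": 4,
           "performance": 1, "effort": 3, "frustration": 2}

score = weighted_tlx(ratings, weights)
print(round(score, 1))
```

Because the weights differ by participant, two people with identical subscale ratings can have different overall scores, which is why the condition means reported above are averages of individually weighted ratings.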
3.2.3. AiTR display usability assessment
Following their interaction with the AiTR systems, 41% of participants responded that they relied predominantly or entirely on the tactile AiTR display, while 36% responded that they relied predominantly or entirely on the visual AiTR display. AiTR preference was also significantly correlated with SpA (composite spatial test scores), r =.51, p <.01. Those with higher SpA tended to prefer tactile cueing over visual cueing. Conversely, those with lower SpA favored visual cueing over tactile cueing. Figure 13 shows the data from both experiments examined in the same analysis, F(1,35) = 12.1, p =.001. There was also a significant negative correlation between the participants’ ages and their preference for the tactile display, r = -.42, p =.003 (i.e., older participants tended to prefer the visual cueing display while younger participants tended to prefer the tactile display).
4. General discussion
In this study, we simulated a military tank crewstation environment and examined the performance and workload of the combined position of gunner and robotics operator. More specifically, we investigated the effects of AiTR (with either perfect reliability or imperfect reliability [FAP vs. MP]) on operator’s performance of the automated (i.e., gunnery) task as well as the concurrent tasks (i.e., robotics and communication). According to Chen and Joyner (2009), adding a robotics task to the gunner’s tasking environment resulted in approximately 30% reduction in target detection for the gunnery task. In Experiment 1, the structural interference for the gunnery task created by concurrent performance of the robotics task was mitigated by augmenting the gunnery task via tactile cueing. Results of Experiment 2 showed that the operator’s gunnery task performance in detecting hostile targets was significantly better in the Monitor condition than in the other two robotics task conditions, consistent with the findings of Chen and Joyner (2009). In both Chen and Joyner (2009) and Experiment 2, the workload associated with the Monitor condition was significantly lower than the other robotics conditions. These results suggest that the operator had more visual and mental resources for the gunnery task when the robotics task was simply monitoring the video feed, compared with the other two robotics conditions. Also consistent with past research (Lathan & Tracey, 2002, Vincow, 1998) and Chen and Joyner (2009), participants’ SpA was found to be an accurate predictor of their gunnery performance in both Experiments 1 & 2. Thomas and Wickens (2004) showed that there were individual differences in scanning effectiveness and its associated target detection performance. However, Thomas and Wickens did not examine the characteristics of those participants who had more effective scanning strategies. 
The findings of the current study along with Chen and Joyner indicate that SpA may be an important factor for determining scanning effectiveness. Figure 6 shows that when there was an increased requirement for visual scanning (i.e., FAP), the difference in effectiveness of scanning (i.e., target detection performance) between high SpA and low SpA was especially large. Our findings support the recommendation by Lathan and Tracey that military missions can benefit from selecting personnel with higher SpA to operate robotic devices.
Results of Experiment 2 also showed that there was a significant interaction between types of unreliable AiTR and participants’ PAC. For those with high PAC, our data are consistent with the notion that operator reliance on and compliance with automation are independent constructs and are separately affected by system misses and false alarms (Dixon & Wickens, 2006, Meyer, 2001, 2004, Wickens, Dixon, Goh et al., 2005). Based on Figure 7, it is evident that high PAC participants did not comply with alerts in the FAP condition. Since the FAP AiTR had a 0% miss rate, a full compliance should result in a detection rate over 84%, as reported in Experiment 1 (with perfectly reliable AiTR). As predicted, Figure 7 shows that in MP conditions, high PAC participants did not rely on the AiTR and detected more targets than were cued. However, an examination of the data for the low PAC participants revealed a completely opposite trend. Specifically, with the FAP condition, low PAC participants showed a strong compliance with the alerts, which resulted in a good performance in target detection (at a similar level as in Experiment 1). With the MP condition, however, low PAC participants evidently overly relied on the automation and therefore had a very poor performance. Indeed, Figure 8 shows that as task load became heavier, those with low PAC became increasingly reliant on the AiTR (and missed more targets), while those with high PAC maintained a fairly stable level of reliance throughout the experimental conditions. According to Biros et al. (2004), higher task loads tend to induce a higher level of reliance on automated systems. Data of Experiment 2 suggest that this heightened level of reliance is also moderated by PAC. More specifically, only those with low PAC tend to exhibit over-reliance on automation (i.e. complacency) under a heavy task load.
Data of both Experiments 1 and 2 showed that the gunner’s detection of neutral targets (which was not aided by AiTR) was significantly worse when s/he had to teleoperate a robot (vs. when the robot was semi-autonomous) or when the gunnery task was aided by AiTR. These findings suggest that participants devoted significantly less visual attention to the gunnery station when their robot required teleoperation or when their gunnery task was assisted by AiTR. On average, in Experiment 1, participants detected 45% of the neutral targets when there was no AiTR; they only detected 28% when there was. These results are consistent with automation research showing that operators may develop over-reliance on the automatic system and that this complacency may negatively affect their task performance (Chen & Joyner, 2009, Dzindolet et al., 2001, Parasuraman et al., 1993, Thomas & Wickens, 2004, Young & Stanton, 2007b). It is worth noting that these findings, along with the results of the current study, do not necessarily suggest that manual manipulation of sensor devices be used instead of AiTR devices. However, the issue of over-reliance on these automatic capabilities needs to be taken into account when designing the user interface where these features are present. Data of Experiment 2 also showed that those with lower PAC performed at about the same level, regardless of the AiTR type, while those with higher PAC had a significantly better performance with the MP cueing. This suggests that higher PAC participants devoted more visual attention to the gunnery station (implying a reduced reliance on automation for the gunnery task) when the AiTR was MP than when the AiTR was FAP. Although we did not measure participants’ scanning behaviors, the detection rate of neutral targets on the gunnery station provides an estimate of the amount of operator’s visual attention on the automated task environment. 
Again, the data of high PAC participants seem to support the hypothesis that MP automation reduces operator reliance. However, the same phenomenon was not observed for the low PAC participants. Figure 9 shows that, with data from both experiments, the difference in neutral target detection performance between high PAC and low PAC individuals appeared to widen when the robotics task was Teleop, compared with the Auto condition. This finding suggests that high PAC individuals were able to allocate more visual attention to the gunnery tasking environment when the multitasking requirement was more demanding (i.e., Teleop) than did the low PAC individuals.
For the robotics tasks, the results of Experiment 1 showed that participants’ teleoperation performance improved significantly when their gunnery task was assisted by AiTR. Therefore, AiTR benefited not only the automated task (i.e., gunnery) but also the concurrent task (i.e., robotics). In the current study, structural interference for the robotics task caused by concurrent performance of the gunnery task was successfully mitigated by providing cues to assist the gunnery task. This finding is consistent with previous research on the effects of automating the primary task on enhancing the concurrent visual tasks (Dixon et al., 2004; Young & Stanton, 2007a). Additionally, it was evident that AiTR was more beneficial for enhancing the concurrent robotics task performance for those with lower SpA than for those with higher SpA. When AiTR was available to assist those operators with low SpA, the performance of their concurrent task was improved to a similar level as those with higher SpA. These results are consistent with other findings showing that vehicle automation helps reduce the performance gap between experts and novices (Young & Stanton, 2007a). These results may have important implications for system design and personnel selection for the future military programs. The data of Experiment 2 showed that participants had the best performance when the task was only monitoring the video feed. Moreover, the Monitor task performance stayed at the same level, regardless of the AiTR types. On the other hand, the Teleop task performance was significantly higher with the FAP cueing. This is consistent with previous studies that MP automation degrades concurrent task performance more than FAP (Dixon & Wickens, 2006, Wickens, Dixon, Goh et al., 2005). However, the same trend was not observed for the other two robotics tasks, which were less challenging than the Teleop task. 
Therefore, it appears that the adverse effect of MP automation on concurrent tasks is only manifest in more challenging task conditions. The data of Experiment 2 also showed that again, there was a significant interaction between AiTR type and PAC. Consistent with the previous two performance measures (gunnery-hostile and gunnery-neutral), the low PAC participants exhibited a larger performance decrement with the MP conditions. The performance of the high PAC participants, on the other hand, showed a completely opposite trend. These results suggest that the high PAC participants’ reduced compliance with the FAP alerts did not help them with their concurrent task, compared with the MP conditions; conversely, their reduced reliance on the MP alerts did not impair their performance. Overall, the low PAC participants showed the most pronounced adverse effect of MP alerts on concurrent performance. In contrast, the FAP alerts not only helped them with their automated task but also their concurrent task.
Taking the three main performance measures together (i.e. Gunnery-Hostile, Gunnery-Neutral, and Robotics), it appears that overall, for high PAC participants, FAP alerts were more detrimental than MP alerts. FAP alerts affected not only their automated task but also the concurrent task. This finding is consistent with the conclusion of Dixon et al. (2007) that FAP degraded overall performance more than MP automation. However, it is worth noting that for low PAC participants, we observed the opposite pattern: MP automation was more harmful than FAP automation. The overall data suggest that low PAC participants had a higher trust in the automation system than did high PAC participants. It is likely that low PAC participants had more difficulty in performing multiple tasks concurrently and had to rely on automation when available. High PAC participants, in contrast, tended to rely on their own multitasking ability to perform the tasks. It is interesting to note that there was no significant difference in the participants’ self-assessment of their trust in the AiTR system between high PAC and low PAC groups. This suggests that the participants’ self-assessed trust in automation may not truly reflect their actual use (i.e., actual trust) of automation. Our results are consistent with past research (de Vries et al., 2003, Lee & Moray, 1994) that self-confidence is a critical factor in moderating the effect of trust (in automation) on reliance (on the automatic system). Lee and Moray found that when self-confidence exceeded trust, operators tended to use manual control. When trust exceeded self-confidence, automation was used more. Our present data suggest that this relationship between self-confidence and level of reliance is also moderated by the operator’s PAC.
Participants’ communication task performance was better when their gunnery task was aided by AiTR (Experiment 1) and when their robotics task was Monitor rather than Teleop (Experiment 2). With data from both experiments examined in the same analysis, it was also found that communication task performance was significantly better when the AiTR was perfectly reliable than when it was either FAP or MP. Again, this result suggests that reliable AiTR not only enhanced the tasks it was designed for but also benefited concurrent tasks. It also shows that our cognitive communication task was sensitive to the task load manipulations we implemented for the concurrent tasks. Overall, these results are consistent with the conclusion by Young and Stanton (2007a) that a common resource pool feeds separate processing channels. In our case, as the visual channel is assisted, the auditory task is enhanced by the additional resources available in the general pool. This, however, conflicts with the Multiple Resource Theory (Wickens, 2002), which predicts difficulty insensitivity (i.e. changes in the difficulty of one task have little impact on the performance of the concurrent task if different resources are used). According to Naveh-Benjamin et al. (2000), information encoding processes require more attention than retrieval and are more prone to the effects of competing demands of multitasking. It is, therefore, likely that the information-encoding process of the communication task in our study was more disrupted by the concurrent tasks when there was no AiTR or when a more challenging robotics task (i.e., Teleop) was performed.
Participants’ workload assessment was found to be affected by the type of concurrent robotics task as well as whether their gunnery task was aided by AiTR. They experienced higher workload when the robot required teleoperation or when their gunnery task was unassisted by AiTR. These results are consistent with Mitchell’s (2005) analysis and with the findings of Chen and Joyner (2009) and Schipani (2003), which evaluated robotics operator workload in a field setting. Although many of the ground robots in the Army’s future robotics programs will be semi-autonomous, teleoperation will still be an important part of any missions involving robotics (e.g., when robots encounter obstacles or other problems). The higher workload associated with teleoperation needs to be taken into account when designing the user interfaces for the robots (see Chen et al., 2007, for a review of user interface designs for teleoperated robots).
The data of both Experiments 1 and 2 showed significant positive correlations of AiTR preference with SpA, indicating that as AiTR ratings tended toward considerable reliance on the tactile display, there was a concurrent shift toward higher SpA. Perhaps those with higher SpA can more easily employ the spatial tactile signals in the dual task setting and therefore have a stronger preference for something that makes the gunnery task easier to complete. Individuals with lower SpA, on the other hand, may not have utilized the spatial tactile cues to their full extent and therefore continued to prefer the visual AiTR display. According to Kozhevnikov et al. (2002), visualizers with lower SpA tend to rely on iconic imagery while those with higher SpA tend to prefer using spatial-schematic imagery while solving problems. Therefore, it is likely that in our study, those who preferred visual AiTR displays might be more iconic in their mental representations. However, this preference may have caused degraded target detection performance due to more visual attention being devoted to the visual AiTR display, not to the simulated environment. In contrast, those who were more spatial could take advantage of the directional information of the tactile display to help them with the visually demanding tasks, resulting in a more effective performance. Finally, our data showed that older participants tended to prefer the visual cueing display while younger participants tended to prefer the tactile display. It is not clear to what extent this shift is related to the decline of SpA as people age (Berg et al., 1982).
In this study, we conducted two simulation experiments and examined the effectiveness of AiTR capabilities (with either perfect reliability, FAP, or MP) for enhancing the performance of gunners who also had to simultaneously operate a robot and maintain effective communication with fellow crew members. Overall, the findings of these experiments suggest that reliable automation (i.e., AiTR in Experiment 1) for one task benefits not only the automated task but also the concurrent tasks (i.e., robotics and communication in this case). The tactile cues alerted the operator to key moments to transition from the robotics task to the gunnery task, and afforded the operator the ability to timeshare effectively between the two tasks in detecting hostile targets. As searching around the vehicle was normally a task that demanded constant visual resources, the tactile cues alleviated this continuous burden by altering the demand into discrete time increments. Although parsing across available resource types may alleviate some performance decrements, it is still exceedingly difficult to fully insulate the primary task from any impinging secondary task. The automation implemented into the gunnery task via the AiTR must also be closely examined as the nature of the human-system interaction is now markedly different. Operators may develop an over-reliance on the AiTR for their tasks and overlook other developments that are not detected by the system (e.g., the neutral targets in the current study). Additionally, when selecting personnel for simultaneously performing gunnery and robotic tasks, it might be beneficial to take into account their SpA. Chen et al. (2008), Chen and Joyner (2009), and the current study all demonstrated the superior performance of those with higher SpA. This is especially important if AiTR is not available to assist the operators with their tasks. 
These data on individual differences can be used in future human performance modeling efforts (e.g., IMPRINT) as input data to modeling tasks and, therefore, enhance future model analyses.
The data of Experiment 2 suggest that there is a strong interaction between the type of AiTR unreliability and participants’ PAC for almost all the performance measures. Overall, it appears that for high PAC participants, FAP alerts were more detrimental than MP alerts. FAP alerts affected not only their automated task but also the concurrent task. However, for low PAC participants, MP automation was more harmful than FAP automation. Future research should incorporate performance-based measures of attentional shifting effectiveness (e.g., Synthetic Work Environment) in addition to surveys such as the attentional control survey. In the area of SpA, Experiment 2 replicated the finding of Experiment 1 that the operator’s preference of modality of the AiTR display is correlated with his or her SpA. Low SpA individuals prefer visual cueing over tactile cueing, although tactile display would be more effective in highly visual environments (so visual attention can be devoted to the tasks, not to the cues). These findings may have important implications for personnel selection, system designs, and training development. For example, to better enhance the task performance for low SpA individuals, the visual cueing display should be more integrated with the visual scene. Augmented reality (i.e., visual overlays) is a potential technique to embed directional information onto the video (Calhoun & Draper, 2006). Additionally, the capabilities and limits of the automated systems should be conveyed to the operator, when feasible, in order for the operator to develop appropriate trust and reliance (Lee & See, 2004).
This project was funded by the U.S. Army’s Robotics Collaboration ATO. The author thanks Mr. Michael Barnes of ARL - HRED, Dr. Peter Terrence of State Farm Insurance, and MAJ Carla Joyner of U.S. Military Academy for their contributions to this project.