Sample of human error modes (adopted from Ref. ).
As the primary cause of software defects, human error is the key to understanding, detecting and preventing software defects. This chapter first reviews the state of the art of an emerging area: software fault defense based on human error mechanisms. Then, an approach for human error analysis (HEA) is proposed. HEA consists of two important components: human error modes (HEM) and an updated version of causal mechanism graphs (CMGs). Human error modes are the general erroneous patterns that humans tend to exhibit across a variety of activities. The causal mechanism graph provides a way to extract the error-prone contexts in software development and link them to general human error modes. HEA can be used at various phases of software development, for both defect detection and prevention. An application case demonstrates how to use HEA.
- human error analysis
- software defect prevention
- fault detection
- causal mechanism graph
- software quality assurance
Software has become a major determinant of how reliable, safe and secure computer systems can be in various safety-critical domains, such as aerospace and energy. Although software reliability engineering has remained an active research subject for over 40 years, software is still often orders of magnitude less reliable than hardware. There are over 200 software reliability models, but each applies to only a few cases. Intuitively, a model that had captured the essence of an entity of interest should be able to describe that entity in a variety of contexts. It is therefore necessary to reflect on what has been overlooked in current research and practice in software (reliability) engineering.
Software, as a pure cognitive product [1, 2], does not fail in the same way that hardware fails. Software has no material or manufacturing problems such as corrosion or aging. How a software system performed in the last second tells nothing about whether it will fail in the next second, and people can hardly anticipate the consequences of a software failure until it happens. Given the cognitive nature of software faults, there is a need to build software dependability theories on the foundation of cognitive science.
As the primary cause of software defects, human error is the key to understanding and preventing software defects. Software defects are by nature the manifestations of cognitive errors of individual software practitioners and/or of miscommunication between software practitioners. Though the cognitive nature of software was recognized as early as the 1970s, significant progress on using human error theory to defend against software defects has been made only in recent years.
This chapter reviews the new interdisciplinary area: Software Fault Defense based on Human Error mechanisms (SFDHE) and proposes an approach for human error analysis (HEA). HEA is at the core of various methods used to defend against software faults in the SFDHE area.
The chapter is organized as follows: Section 2 reviews the emerging area SFDHE; Section 3 proposes the method for human error analysis (HEA); Section 4 presents an application example; Section 5 concludes the chapter.
2. The new interdiscipline: Software Fault Defense based on Human Error mechanisms (SFDHE)
Human cognition plays a central role in software development, even in modern large projects [4–7]. A previous analysis of a large set of industrial data shows that 87% of the severe residual defects are caused by individual cognitive failures, independent of process consistency. Approaches for defending against cognitive errors are therefore necessary to improve software dependability.
Software Fault Defense based on Human Error mechanisms (SFDHE), first proposed in 2011 by Huang, is an area that aims to systematically predict, prevent, tolerate and detect software faults through a deep understanding of the causal mechanisms underlying software faults, namely the cognitive errors of software practitioners. It is an interdisciplinary area built on integrative theories in software engineering, systems engineering, software reliability engineering, software psychology and cognitive science.
2.2. State of the art
2.2.1. Human error mechanisms underlying software faults
The first phase of SFDHE is to identify the factors that influence software fault introduction, as well as how these factors interact to form a software defect. The factors related to programming performance are traditionally studied in software psychology and have been thoroughly reviewed in the literature. However, few studies have focused on identifying the factors that influence human errors in programming. One of Huang’s recent experimental studies compared the effects of various human factors on fault introduction rate. The results show that a few dimensions of programmers’ cognitive styles and personality traits are related to fault introduction rate as significantly as conventional program metrics.
To study human errors in software engineering, general human error theories need to be integrated with the cognitive nature of software development. Huang developed an integrated cognitive model of software design and, based on this model, proposed a human error taxonomy for software fault prevention. Another human error taxonomy was recently developed by Anu and Walia et al., with an emphasis on software requirement review. These taxonomies differ in detail because they serve different purposes; however, both build on Reason’s human error theory as their foundation.
A recent experiment examined how an erroneous pattern called “postcompletion error” manifests itself in software development. A postcompletion error is a specific type of human error in which one tends to omit a subtask that is carried out at the end of a task but is not a necessary condition for achieving the main goal. Postcompletion errors have been observed in a variety of tasks by psychologists, but empirical studies in software engineering are lacking. The author’s experiment shows that 41.82% of programmers committed the postcompletion error in the same way. As the first attempt to link general human error modes (HEM) to programming contexts, the study sets a paradigm for investigating the human error mechanisms underlying software defects.
2.2.2. Software fault prevention based on human error mechanisms
A key activity of the traditional defect prevention process is to identify root causes. Root causes are generally classified into four categories (method, people, tool, and requirement), and detailed causes are analyzed by brainstorming with cause-effect diagrams. Such taxonomies are too abstract to help organizations with little experience. Huang’s human error taxonomy has been used to advance the traditional software defect prevention process [16, 17].
Huang also developed an approach called defect prevention based on human error theories (DPeHE), which proactively prevents software defects by strengthening software developers’ cognitive ability to prevent human errors. Whereas conventional defect prevention focuses on organizational software process improvement, DPeHE focuses on software developers’ metacognitive ability to prevent cognitive errors. DPeHE builds this ability in two stages. In the first stage, it provides developers with explicit knowledge of human error mechanisms and prevention strategies. In the second stage, developers use the provided strategies and devices to practice error regulation during their real programming work. Through this training program, developers gain better awareness of error-prone situations and a better ability to prevent errors. The method has received very positive feedback from a variety of industrial users.
2.2.3. Software fault tolerance based on human error mechanisms
Independent development (i.e., development by isolated teams) is used to promote fault tolerance in N-version programming. However, empirical evidence shows that coincident faults are introduced even if the redundant versions are truly built independently [19, 20]. Programmers are prone to make the same errors under certain circumstances, thus introducing the same faults at certain places. Huang has sought first to understand why, how and under what circumstances programmers tend to introduce the same faults, and then to find a scientific way to achieve fault diversity and enhance the fault tolerance of software systems. Huang’s theory relates the likelihood of identical faults to the “performance level” of the activity required from the programmers. Remarkably, the most frequent coincident fault does not occur at difficult task points that involve knowledge-based performance, but rather at an easy task point that involves rule-based performance.
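To make the effect of coincident faults concrete, the following minimal sketch runs hypothetical versions of a toy function through a majority voter. It is not taken from the cited studies: the toy specification (absolute value), the function names and the injected faults are all assumptions made for illustration.

```python
from collections import Counter


def version_a(x):
    # Correct implementation of the toy specification (absolute value).
    return -x if x < 0 else x


def version_b(x):
    # Hypothetical fault: the negative branch is forgotten.
    return x


def version_c(x):
    # The same fault as version_b, introduced independently (a coincident fault).
    return x


def version_d(x):
    # A different hypothetical fault: negative inputs are clamped to zero.
    return 0 if x < 0 else x


def majority_vote(outputs):
    """Return the value produced by a strict majority of versions, or None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def run_n_version(versions, x):
    """Run every version on the same input and vote on the results."""
    return majority_vote([version(x) for version in versions])


# Coincident faults out-vote the correct version: prints -3 instead of 3.
print(run_n_version([version_a, version_b, version_c], -3))
# Diverse faults disagree, so no majority forms and the failure is detectable: prints None.
print(run_n_version([version_a, version_b, version_d], -3))
```

In this sketch the voter masks nothing when two versions share the same fault, whereas diverse faults at least leave the disagreement visible. This is why fault diversity, rather than development independence alone, is the property of interest.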
2.2.4. Software fault detection based on human error mechanisms
Since the idea of using human error theories to promote software fault detection at various stages of the software development lifecycle was presented in 2011, significant progress has been made [11, 21]. Anu and Walia et al. developed a human error taxonomy for requirement review and observed positive effects on subjects’ fault detection effectiveness. Li, Lee and Huang et al. [21, 22] introduced human error theories to prioritize test strategies at the coding and evolution phases.
3. Human error analysis
Human error analysis (HEA) is the core process of the various methods for defending against software faults in SFDHE. HEA can be employed at different phases of software development, for both defect detection and prevention, as shown in Figure 1. For instance, HEA can be used to support requirement review, design review and code inspection. At the requirement and design phases, HEA can also help identify contexts prone to trigger software developers’ cognitive errors at the next phase, so that strategies can be taken to prevent those errors.
HEA consists of two components: human error modes (HEM) and the causal mechanism graph (CMG). Human error modes are the erroneous patterns that psychologists have observed to recur across diverse activities [12, 14]. CMG provides a way to extract a specific set of contexts from the artifact under analysis (e.g., requirement, design and code) and link them to the general conditions associated with a human error mode.
3.1. Human error modes
Though human errors appear in different “guises” in different contexts, they take a limited number of underlying modes. A human error mode is a general erroneous pattern that recurs across a variety of activities. Understanding such recurring error modes is essential to identifying software defects and the contexts prone to trigger a human error. A sample of the error modes is described in Table 1. These error modes were observed to manifest themselves in software development contexts in the author’s previous experimental studies [5, 7, 13] or in industrial historical data. Further examples of software defects associated with these human error modes are documented in the literature.
|Error mode name|Explanation and scenarios|
|---|---|
|Lack of knowledge|Software defects are introduced when one omits related knowledge, or does not even realize that related knowledge is required. This error mode is especially likely to appear when the problem is interdisciplinary.|
|Postcompletion error [13, 14]|If the ultimate goal is decomposed into several subgoals, a subgoal is likely to be omitted when two conditions hold: the subgoal is not a necessary condition for achieving its superordinate goal, and the subgoal is carried out at the end of the task.|
|Problem representation error|The developer misunderstands the task material and constructs a wrong situation model of the problem, owing to ambiguity in the material.|
|Apply “strong but now wrong” rules|People tend to behave the same way in a context that is similar to past circumstances, neglecting the countersigns of exceptional or novel circumstances. In software development, this means that when solving problems, developers tend to prefer rules that have been successful in the past. The more frequently and successfully a rule has been used before, the more likely it is to be recalled.|
|Schema encoding deficiencies|Features of a particular situation are either not encoded at all or misrepresented in the conditional component of the rule.|
|Selectivity|Psychologically salient, rather than logically important, task information is attended to. In software development, “selectivity” means that when a developer is solving a problem, mistakes occur if attention is given to the wrong features or not given to the right ones, resulting in a wrong problem representation or in the selection of wrong rules or schemata to construct solutions.|
|Confirmation bias|People tend to seek evidence that could verify their hypotheses rather than refute them, whether in searching for evidence, interpreting it, or recalling it from memory. Some authors restrict the term to the selective collection of evidence.|
|Problems with complexity|As problem complexity rises, error symptoms tend to occur, such as delayed feedback, insufficient consideration of processes in time, difficulties with exponential developments, thinking in causal series rather than causal nets, thematic vagabonding, and encysting (topics are lingered over and small details attended to lovingly).|
|Biased review|People tend to believe that all possible courses of action have been considered, when in fact very few have been considered.|
|Inattention|Failing to attend to a routine action at a critical time causes forgotten actions, forgotten goals, or inappropriate actions. “Automatic processing” in software development happens when no problem-solving activities are involved, such as typing. Slips may happen without proper monitoring and error detection.|
3.2. Causal mechanism graphs
The author recommends a graphic tool called the causal mechanism graph (CMG) for causal mechanism modeling. CMG is a notation system first used to represent and model the complex causal mechanisms that determine software dependability, which encompasses attributes such as reliability, safety, security, maintainability and availability [23, 24].
A causal mechanism graph is capable of capturing logic, time and scenario features, which are essential to describing how various factors interact to produce an effect. The notations in CMG allow researchers to model causal mechanisms more accurately: logic symbols allow for various logical combinations between causes or effects; the scenario symbol enables the identification of situations in which a relation is likely to exist; and time flow allows a number of cause-effect units to develop into a cause-effect chain. Moreover, the notations are designed to capture the recurrent patterns of comprehensive causal mechanisms (e.g., activate and conflict).
CMG is especially suitable for representing one’s cognitive knowledge, as it allows one to model dynamic causal mechanisms in a robust way. This feature, combined with excellent reliability and validity, positions CMG as a powerful method to extract and model the human error mechanisms underlying software faults. A sample of the CMG notation adapted for human error analysis is shown in Table 2.
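The three features named above (logic, scenario and time flow) can be approximated in code. The sketch below is only a rough data-structure analogue and does not reproduce the CMG notation itself: the class `CauseEffectUnit`, the function `propagate`, the deterministic firing rule and the example scenario are all hypothetical, introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class CauseEffectUnit:
    """One cause-effect unit (hypothetical analogue of a CMG element)."""
    causes: List[str]
    effect: str
    logic: str = "AND"               # how the causes combine: "AND" or "OR"
    scenario: Optional[str] = None   # situation in which the relation is assumed to hold

    def fires(self, facts: Set[str], scenarios: Set[str]) -> bool:
        # The relation only applies when its scenario (if any) is active.
        if self.scenario is not None and self.scenario not in scenarios:
            return False
        hits = [cause in facts for cause in self.causes]
        return all(hits) if self.logic == "AND" else any(hits)


def propagate(chain: List[CauseEffectUnit], facts: Set[str], scenarios: Set[str]) -> Set[str]:
    """Walk the units in list order (a stand-in for time flow) and collect the effects reached."""
    reached = set(facts)
    for unit in chain:
        if unit.fires(reached, scenarios):
            reached.add(unit.effect)
    return reached


# Illustrative chain; the scenario label is an assumption, not part of the source notation.
chain = [
    CauseEffectUnit(
        causes=["A.2 is the last step", "A.2 is not necessary for A.1"],
        effect="postcompletion error",
        logic="AND",
        scenario="A.2 is not visually highlighted",
    ),
    CauseEffectUnit(causes=["postcompletion error"], effect="blank line omitted"),
]

facts = {"A.2 is the last step", "A.2 is not necessary for A.1"}
print(propagate(chain, facts, {"A.2 is not visually highlighted"}))
```

A real causal mechanism describes tendencies rather than certainties, so the boolean `fires` rule should be read as "this context is prone to the effect", not as a prediction that the effect will occur.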
3.3. An application example
A requirement segment is extracted and shown in Figure 2. To complete the “Jiong” problem, a programmer first needs to calculate the structure of a “Jiong” using a recursive or iterative algorithm (A.1 in Figure 2), and then print a blank line after the word (A.2 in Figure 2).
Using HEA, we see that this requirement segment contains three conditions: (1) A.1 is the main requirement; (2) A.2 is not a necessary condition to A.1; (3) A.2 is the last step of A. These three conditions constitute a scenario that tends to trigger a “postcompletion error.” A postcompletion error is an error pattern whereby one tends to omit a subtask that should be carried out at the end of a task but is not a necessary condition for the achievement of the main goal.
This requirement was presented to student programmers in a programming contest in a previous study. The results show that 23 out of 55 programmers (41.8%) committed the error of “forgetting to print a blank line after each word,” in the same way as observed by psychologists in other tasks.
It is notable that “printing a blank line” is a very simple requirement and has been explicitly specified; the requirement is correct and clear. According to current requirement quality criteria such as correctness, completeness, unambiguity and consistency, it contains no features prone to trigger a software development error. In fact, it led significantly more programmers to commit an error than any other location did, and remarkably in the same way.
Once the error-prone representation is identified, one can use strategies to prevent it from triggering development errors. For instance, the requirement writer may highlight (e.g., using bright colors and/or bold font) the places of postcompletion tasks in the requirement documents (“printing a blank line after each word” in the “Jiong” case), since visual cues are an effective way to reduce postcompletion errors. Though using styles to facilitate readers’ cognitive process is not new in software requirement engineering, the contribution here is to tell the writer the exact location that should be highlighted, in order to reduce a developer’s error-proneness.
This chapter emphasizes the necessity of understanding the cognitive nature of software and software faults, and reviews the emerging area of defending against software defects based on human error theories (SFDHE). An approach of human error analysis (HEA) is proposed to detect and/or prevent software defects at various stages of the software development life cycle. The application to a requirement review shows that HEA is able to identify an error-prone scenario that is not captured by any existing criteria for requirement quality. HEA offers a promising perspective to advance the current practices of software fault detection and prevention.
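The three-condition check can be written down as a small routine. The sketch below is one possible encoding under stated assumptions: the `Step` class, the `necessary_for_main` flag (a judgment supplied by the analyst, not inferred automatically) and the function name are hypothetical, and the step texts paraphrase the description above rather than quote Figure 2.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    label: str
    text: str
    necessary_for_main: bool  # analyst's judgment: is this step required to achieve the main step?


def postcompletion_candidates(steps: List[Step], main_label: str) -> List[Step]:
    """Return the final step if it matches the postcompletion scenario, else an empty list.

    Conditions checked: (1) a main requirement exists; (2) the final step is not
    necessary for it; (3) the step in question is the last one in the sequence.
    """
    if not any(step.label == main_label for step in steps):
        return []
    last = steps[-1]
    if last.label != main_label and not last.necessary_for_main:
        return [last]
    return []


# The "Jiong" requirement segment, paraphrased for illustration.
jiong_steps = [
    Step("A.1", "Calculate the structure of the Jiong", necessary_for_main=True),
    Step("A.2", "Print a blank line after the word", necessary_for_main=False),
]

flagged = postcompletion_candidates(jiong_steps, main_label="A.1")
print([step.label for step in flagged])  # prints ['A.2']
```

The routine only flags a location for the reviewer's attention; deciding whether a step is necessary for the main goal remains a human judgment.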
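As a minimal sketch of this prevention strategy, the routine below wraps a flagged step in a visual marker. The function name, the `**` marker (standing in for bold or colored text) and the sample requirement sentence are assumptions for illustration; the sentence is a paraphrase, not the wording of the original requirement.

```python
def highlight_step(requirement_text: str, step_text: str, marker: str = "**") -> str:
    """Wrap the first occurrence of step_text in marker so that it stands out visually."""
    if step_text not in requirement_text:
        raise ValueError("step text not found in requirement")
    return requirement_text.replace(step_text, f"{marker}{step_text}{marker}", 1)


# Paraphrased requirement text, used only to demonstrate the call.
requirement = "Print the structure of the word. Print a blank line after each word."
print(highlight_step(requirement, "Print a blank line after each word."))
```

The point of the sketch is that the location to highlight comes from the analysis of the requirement's structure, so the writer does not have to guess which sentence deserves emphasis.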