The number of operators in industrial plants is steadily being reduced. The main challenge arises in emergency situations: how can operators be supported in managing their time without increasing operational risks? The Alert Diagnosis System (SDA) addresses this problem and aims to increase operator situational awareness (the ability to perceive, understand and predict the future behavior of a process) through new technological paradigms, such as expert systems and ecological Human Machine Interfaces (HMI), in order to support the operation, maintenance and optimization of oil and gas refining, exploration and production plants. In SDA, the most critical alerts are shown by priority, along with decision trees, trend charts and variable comparison charts. SDA aims to assist control room operators in solving a critical problem in the oil industry: the loss of the safety function associated with alarms during an alarm flood. The results of SDA are presented through its implementation in Sulfur Recovery Units (URE) in the state of Rio de Janeiro, Brazil.
- operator support
- alarm floods
- alarm processing
- fault diagnosis
- expert system
- sulfur recovery units
Industrial plants generally consist of a large number of integrated and interlinked process units. Information about the plant status is provided by automated systems, which gather data from sensors spread over different parts of the process units and assist control room operators in making decisions and performing tasks to keep the plant operating safely and efficiently.
Control room operators are alerted by automated systems through alarms. Nowadays, with digital technology, an alarm can be created within seconds and at almost zero cost. As a result, the number and frequency of alarms have increased significantly over the years. Alarms are typically configured for a single operating state, so a change of operating state, such as a plant shutdown or startup, may cause many alarms to occur at the same time. The amount of information presented is then greater than the human operator can actually perceive, so many alarms are missed. This condition is called an alarm flood. During an alarm flood, operators can be overwhelmed by the large number of alarms and may be unable to keep the plant in a safe operating condition, posing a risk not only to the production process but also to the environment and human lives.
Alarm floods are one of the main causes of industrial plant accidents and cost millions of dollars each year. Clear examples of accidents to which alarm floods contributed are the Three Mile Island nuclear power plant accident in 1979; the explosion at Esso Australia’s gas plant at Longford, Victoria, in 1998; the loss of the P-36 oil rig in the Campos Basin, Rio de Janeiro State, Brazil, in 2001, which resulted in 11 deaths and the total loss of the rig, with an estimated financial loss of USD 400 million; and the Texas City oil refinery explosion in 2005.
The system presented in this study provides real-time support for control room operators in critical plant situations. It also assists them in time management and in the decision-making process.
In order to present the methodologies developed and applied in SDA, this chapter is divided into the three items described below. Item 2 presents a brief description of the evolution of alarm systems in industrial plants. Item 3 presents SDA, highlighting the methodologies implemented: the expert system and the ecological human machine interface. Finally, item 4 presents the conclusion of the chapter.
2. Alarm systems
Until the 1950s, an industrial plant control room was nothing more than a wall (panel) full of individual process indicators, lights, switches, and moving pen charts. When something went wrong in the process, one or more lights (alarms) came on in the panel along with beeps, indicating to operators which part of the process the problem was in. In this system, introducing a new alarm was very expensive, because each alarm had to be designed and implemented individually in what was often an electromechanical system. With the evolution of computing, many of these items became obsolete and no longer met the needs of the operators.
In 1975 the first distributed digital control systems were created to assist industrial processes. These systems were responsible for recording process variable data, calculating plant efficiency, assisting in process monitoring and management, and informing the operators about the plant state through alarms. In these systems, a new alarm could be added at no cost and within seconds. This led to a significant increase in the total number of alarms per operator, as shown in Figure 1.
However, increasing the number of configured alarms did not make the process safer, nor did it make problem identification and decision-making easier for operators. Because all events are treated as alarms, a major plant disturbance generates a huge number of alarms, causing the so-called alarm flood. ANSI/ISA 18.2 defines an alarm flood as the occurrence of 10 or more annunciated alarms in any 10-minute period per operator. Since the alarms in these systems are presented as a sequence of events (SOE), it is impossible for the operator to understand them in a timely manner, and under stress, alarms that are very important for understanding the situation often go unnoticed. In light of this problem, the major international engineering bodies came together to outline a set of methods, definitions and best practices for the design of alarm systems.
Studies on the most advanced alarm systems in operation [6, 7, 8] report that prioritization based on safety and plant urgency is the factor most frequently cited in the advanced alarm systems literature, and point out that the meaning of an alarm for control room operators depends on four factors: urgency, safety consequences, productivity consequences and relevance to the current task. They also cite the importance of categorizing alarms through time-based color coding available to operators and a dynamic severity rating. In other words, it is important that alarms are colored and segregated by urgency and ordered into categories by severity.
The NRC alarm study shows how the methods by which alarm processing results are presented to operational staff affect performance. The specific techniques analyzed in this study were suppression and dynamic prioritization. With suppression, minor alarms are not presented to operators but can be accessed upon request. In dynamic prioritization, the least important alarms are still presented to operators, but are displayed differently from the most important ones. Because designers cannot anticipate every possible plant disturbance, some alarms may gain relevance for decision-making in a specific context. Thus, one of the advantages of dynamic prioritization cited in the study is that, unlike suppression, this approach does not hide any alarms from operators.
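The flood criterion quoted above can be checked with a sliding window over alarm timestamps. The following is a minimal sketch, not part of SDA or of the standard itself: the function name `flood_times` and the use of timestamps in seconds for a single operator are assumptions made for illustration.

```python
from collections import deque


def flood_times(alarm_times, threshold=10, window_s=600):
    """Return the alarm timestamps (in seconds) at which the number of
    alarms annunciated in the preceding 10-minute window reaches the
    threshold. Hypothetical helper for a single operator's alarm log."""
    window = deque()
    flagged = []
    for t in sorted(alarm_times):
        window.append(t)
        # Drop alarms that are 10 minutes old or older relative to t.
        while t - window[0] >= window_s:
            window.popleft()
        if len(window) >= threshold:
            flagged.append(t)
    return flagged


# Ten alarms in under five minutes qualify as a flood at the tenth alarm.
burst = [i * 30 for i in range(10)]
# One alarm every two minutes never reaches ten alarms in ten minutes.
steady = [i * 120 for i in range(10)]
print(flood_times(burst), flood_times(steady))
```
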
According to Publication 191 of the British-based Engineering Equipment and Materials Users’ Association (EEMUA), alarm systems are an important means of automatic plant monitoring, drawing operator attention to significant process changes that require evaluation and action. They consist of field equipment, signal transmission, processing and visualization screens, and are important tools that help the operator to:
Keep the plant process within a safe operating range. In this way, the operator is advised of potentially hazardous situations before the Emergency Shutdown System (ESS) is forced to intervene. This improves plant assessment and helps to reduce demands on the ESS, increasing plant production and safety;
Recognize and act to avoid situations that may endanger the plant. The role of the ESS is to intervene in a hazardous situation; however, there may be cases where the plant deviates from its normal design operating conditions to a state in which the ESS is unable to act efficiently, such as during plant startup, which involves a change of state;
Identify deviations from operating conditions that could lead to financial losses;
Understand complex process conditions. Alarms can be an important diagnostic tool and are one of several sources of information that an operator can use during a critical process.
An alarm system is a crucial element in process plant operation. When well planned, it provides an additional layer of protection and can help the operator prevent an abnormal situation from spreading. It also offers benefits to the plant that include increased safety, increased production, quality improvement and cost reduction.
3. Alert diagnosis system—SDA
SDA is a real-time operation support system that provides control room operators with an optimized flow of information on changes in critical plant process variables, in order to support fast decision-making that can avoid operational situations such as unexpected shutdowns, loss of efficiency and even accidents. Thus, SDA is designed to support control room operators in managing their time and taking the right actions to keep the plant in a safe state.
The information provided on the SDA Human Machine Interface (HMI) is different from that of traditional alarm systems. In SDA, alarms are called diagnostic alerts. A diagnostic alert may or may not be formed from alarms that already exist in the plant supervisory system. It can be simple (formed by a single alarm) or compound (formed by a set of alarms combined through operators such as and, or, between, less than, rate, etc.). Thus, diagnostic alerts not only identify changes in the process but also identify problems (failures in the process); by themselves they already indicate the diagnosis of the situation. In addition, they are presented in a non-SOE approach, sorted by priority, so that the highest priority alert always occupies the top of the alarm list.
SDA is designed to monitor and process information on process variables that it acquires and calculates, as well as process variables coming from other sources, such as external databases, text files and Excel spreadsheets. SDA consists of a data acquisition system, a knowledge base or rule bank (KB-SDA), a real-time diagnostic system, and an HMI. The real-time diagnostic system is an object-oriented expert system based on the Artificial Intelligence Monitoring System (AIMS) technique. Figure 2 presents the SDA data flow.
For example, with approximately 1000 (binary/analog) variables, one SDA cycle takes 1 second. In other words, all acquired data (binary/analog) and the variables calculated from them are presented in less than 1 second. If a particular state or diagnostic alert depends only on the current values collected, it is updated immediately, at 1-second intervals. If the state (or diagnostic alert) depends on a rate or time variation, it is tracked in 1-second steps until it reaches the required threshold. In other words, the response time to the operator is 1 second.
In this way, SDA works at a level above the plant supervisory system. In alarm avalanche situations, it shows the highest priority alerts and the behavior of the process variables associated with them, quickly giving the operator clear situational awareness of the plant and offering benefits that include increased safety, increased production, improved quality and cost savings.
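To make the distinction between simple and compound diagnostic alerts concrete, the sketch below builds alert conditions from small composable predicates. It is an illustration only and does not reproduce the SDA rule syntax: the class `DiagnosticAlert`, the helper functions and all instrument tags (`FAL_101`, `PI_102`, `TI_103`, `TAH_103`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Snapshot = Dict[str, float]  # current value of each plant variable, by tag


def is_on(tag):
    return lambda s: bool(s[tag])


def less_than(tag, limit):
    return lambda s: s[tag] < limit


def between(tag, low, high):
    return lambda s: low <= s[tag] <= high


def all_of(*conditions):  # logical "and"
    return lambda s: all(c(s) for c in conditions)


def any_of(*conditions):  # logical "or"
    return lambda s: any(c(s) for c in conditions)


@dataclass
class DiagnosticAlert:
    tag: str
    description: str
    condition: Callable[[Snapshot], bool]

    def active(self, snapshot: Snapshot) -> bool:
        return self.condition(snapshot)


# Simple alert: formed by a single existing alarm.
simple = DiagnosticAlert("LOW_FEED_FLOW", "Feed flow low", is_on("FAL_101"))

# Compound alert: several alarms and limits combined with logical operators.
compound = DiagnosticAlert(
    "BLOWER_FAILURE",
    "Low feed flow with low discharge pressure and abnormal temperature",
    all_of(
        is_on("FAL_101"),
        less_than("PI_102", 2.0),
        any_of(less_than("TI_103", 900.0), is_on("TAH_103")),
    ),
)

snapshot = {"FAL_101": 1, "PI_102": 1.5, "TI_103": 850.0, "TAH_103": 0}
print(simple.active(snapshot), compound.active(snapshot))
```

Because the compound alert only turns on when the whole combination holds, its activation already names a failure rather than a single variable deviation.
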
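The behavior of a rate-dependent alert over successive cycles can be sketched as follows. This is a simplified illustration that assumes one sample per 1-second cycle; the class name `RateAlert`, the limit of 2.0 units per second and the sample values are invented for the example.

```python
class RateAlert:
    """Alert that turns on when a variable rises faster than a limit.

    Hypothetical sketch: one call to step() corresponds to one cycle.
    """

    def __init__(self, limit_per_s, cycle_s=1.0):
        self.limit_per_s = limit_per_s
        self.cycle_s = cycle_s
        self.previous = None
        self.active = False

    def step(self, value):
        # A rate needs two samples, so the first cycle cannot raise the alert.
        if self.previous is not None:
            rate = (value - self.previous) / self.cycle_s
            self.active = rate >= self.limit_per_s
        self.previous = value
        return self.active


alert = RateAlert(limit_per_s=2.0)
samples = [100.0, 100.5, 101.0, 103.5]  # one reading per 1-second cycle
states = [alert.step(v) for v in samples]
print(states)
```

The alert is evaluated at every cycle and only changes state in the cycle where the computed rate reaches the limit.
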
3.1 Expert systems
Expert Systems, or Knowledge Based Systems, were developed in the 1960s by the Stanford Heuristic Programming Project as a new intelligent method to find solutions for complex problems such as disease diagnosis. Edward Feigenbaum, widely known as the father of expert systems, defined an expert system as “an intelligent computer program that uses knowledge and inference procedures to solve problems that are difficult enough to require significant human expertise for their solutions.” In other words, an Expert System (ES) is a computational system that emulates the decision-making ability of a human expert in a given topic.
A basic ES is composed of a knowledge base (KB), where the intelligence of the system is stored, and an inference engine that processes current facts based on this knowledge to generate new facts and conclusions (Figure 3).
The most relevant advantage of using an ES is the independence between the KB and the inference engine. The KB can be changed or adapted to new knowledge without the need to remodel the inference engine. This capability makes this type of system a significant tool for handling diagnosis problems in many different types of power plants. ES are classified according to the paradigm in which information is represented in the knowledge base. The information can be represented in the knowledge base in many forms: logical trees, rules and classes.
The inference engine is responsible for combining the facts of a problem with the knowledge represented in the knowledge base to establish new facts and conclusions. The chaining of information in the inference engine can be done in two ways: forward chaining and backward chaining. A forward chaining system begins with the initially known facts and uses the rules to draw new conclusions or take certain actions. All rules are checked to see whether the initial facts satisfy any of them. Each satisfied rule is then fired, generating new facts that are used to trigger other rules, and so on until the problem is solved. Each cycle of this process consists of three steps:
Matching: the rules whose antecedents are satisfied by the facts are identified;
Conflict resolution: when more than one rule has its antecedents satisfied, the engine must decide which rule is fired first;
Execution: the selected rule is executed, which can result in new facts as well as new rules.
In backward chaining, the inference process begins by choosing a solution and performs a search similar to depth-first search [12, 13]. At first the set of known facts is empty; as rules are fired, it becomes the set of facts that support the solution (object states). Thus, rules are triggered to generate values for object states or to generate intermediate facts that are later used to determine object state values.
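The three steps above can be sketched as a small forward chaining loop. This is a generic textbook-style illustration, not the AIMS inference engine: the `Rule` structure, the numeric `priority` used for conflict resolution and the fact names are assumptions, and for brevity a fired rule only adds a fact (it never adds new rules).

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Rule:
    name: str
    antecedents: FrozenSet[str]
    consequent: str
    priority: int = 0


def forward_chain(initial_facts, rules):
    facts = set(initial_facts)
    fired = []
    while True:
        # Matching: rules whose antecedents are all known facts and whose
        # consequent would add something new.
        agenda = [
            r for r in rules
            if r.antecedents <= facts and r.consequent not in facts
        ]
        if not agenda:
            return facts, fired
        # Conflict resolution: the highest priority rule is fired first
        # (on ties, the rule listed first wins).
        rule = max(agenda, key=lambda r: r.priority)
        # Execution: firing the rule asserts its consequent as a new fact.
        facts.add(rule.consequent)
        fired.append(rule.name)


rules = [
    Rule("R1", frozenset({"low_air_flow"}), "poor_combustion", priority=1),
    Rule("R2", frozenset({"poor_combustion", "high_so2"}),
         "low_sulfur_recovery", priority=2),
    Rule("R3", frozenset({"low_air_flow"}), "check_blower", priority=5),
]
facts, fired = forward_chain({"low_air_flow", "high_so2"}, rules)
print(fired)
```

R1 and R3 both match the initial facts, so conflict resolution decides that R3 fires first; R2 only becomes eligible after R1 has produced its new fact.
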
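A goal-driven search of this kind can be sketched recursively. The version below is a generic illustration under stated assumptions, not the AIMS implementation: rules are plain `(antecedents, consequent)` tuples, `observations` stands for the values available from the plant, and the `proven` set starts empty and is filled as rules fire.

```python
def backward_chain(goal, rules, observations, proven=None, _open=None):
    """Try to establish `goal` by depth-first search over the rules.

    Returns (success, proven), where `proven` is the set of facts that
    were established along the way. Hypothetical sketch.
    """
    proven = set() if proven is None else proven
    open_goals = set() if _open is None else _open
    if goal in proven:
        return True, proven
    if goal in observations:
        proven.add(goal)
        return True, proven
    if goal in open_goals:  # guard against circular rules
        return False, proven
    open_goals.add(goal)
    for antecedents, consequent in rules:
        if consequent != goal:
            continue
        # Each antecedent becomes a sub-goal, explored depth first.
        if all(
            backward_chain(a, rules, observations, proven, open_goals)[0]
            for a in antecedents
        ):
            proven.add(goal)
            open_goals.discard(goal)
            return True, proven
    open_goals.discard(goal)
    return False, proven


RULES = [
    (("low_air_flow",), "poor_combustion"),
    (("poor_combustion", "high_so2"), "low_sulfur_recovery"),
]
OBSERVED = {"low_air_flow", "high_so2"}

ok, support = backward_chain("low_sulfur_recovery", RULES, OBSERVED)
print(ok, sorted(support))
```
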
3.1.1 SDA expert systems—AIMS
AIMS technology is a framework for developing real-time monitoring systems that uses object-oriented (OO) concepts and expert systems. The AIMS kernel is an object-oriented, knowledge-based (KB) expert system that acquires and calculates variables as well as their interdependencies and maps them onto a network of hierarchical objects, where rules are implicit in the object operators and the network topology.
The state of monitored variables updates a fact base, which is used by a real-time inference machine that activates and triggers knowledge base rules. A mail server is responsible for updating the operators on the HMI and manipulating the information.
The KB system has two main characteristics: (a) offline knowledge acquisition and maintenance, in which knowledge is built and modified through the KB module, and (b) real-time monitoring, in which a network of hierarchical objects, created according to the KB rules and used by the inference engine, represents the rules that define the real-time application.
The knowledge domain, which comprises all monitored and calculated variables used by SDA as well as their interdependencies, is mapped onto a hierarchical network of parent and descendant nodes, where each node contains a variable represented by an object that determines its attributes and operations. Each object is associated with a hierarchical level, which is used by the inference engine when firing the KB rules. The lowest level is represented by the acquisition variables. Figure 4 shows an example of a network of hierarchical objects with three levels.
In Figure 4, a link from V12 to V22 means that V22 is generated by V12. This hierarchical network represents all the rules contained in the KB, which can be expressed as IF-THEN structures, such as:
IF (V11 updated or V12 updated)
THEN (update V22 applying the operator considering its dependencies).
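The way such a rule stays implicit in the network topology can be sketched as follows. This is a simplified illustration rather than the AIMS data structure: the `Node` class, the `propagate` function, the extra node `V31` and the operators attached to the nodes are assumptions, and pending nodes are processed from the lowest hierarchical level upwards.

```python
class Node:
    """A variable in the hierarchical object network (hypothetical sketch)."""

    def __init__(self, name, parents=(), operator=None, value=None):
        self.name = name
        self.parents = list(parents)
        self.operator = operator  # how the value is computed from the parents
        self.value = value
        # Acquisition variables are level 1; a derived variable sits one
        # level above its highest parent.
        self.level = 1 + max((p.level for p in self.parents), default=0)
        self.children = []
        for parent in self.parents:
            parent.children.append(self)

    def recompute(self):
        new_value = self.operator(*(p.value for p in self.parents))
        changed = new_value != self.value
        self.value = new_value
        return changed


def propagate(updated_nodes):
    """IF a parent was updated THEN update its descendants, level by level."""
    pending = {child for node in updated_nodes for child in node.children}
    fired = []
    while pending:
        node = min(pending, key=lambda n: (n.level, n.name))
        pending.remove(node)
        fired.append(node.name)
        if node.recompute():
            pending.update(node.children)
    return fired


v11 = Node("V11", value=False)
v12 = Node("V12", value=False)
v22 = Node("V22", [v11, v12], operator=lambda a, b: a or b)
v31 = Node("V31", [v22], operator=lambda a: "ALERT" if a else "normal")

v12.value = True  # a new acquisition arrives for V12
fired = propagate([v12])
print(fired, v31.value)
```

No explicit rule object is stored: the link from V12 to V22 plus the operator held by V22 is the rule.
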
The knowledge structure of the KB module is based on five main classes, shown in the hierarchical structure of Figure 5. Variable is an abstract class from which three classes are derived: analog (for analog variables), binary (for binary variables) and rate (for time variations).
The Message class represents the facts created from the inference of triggered rules. Every time a variable changes, it generates a new fact (creating a new Message object). At the beginning of the monitoring process, the acquired variables are Message objects.
Knowledge acquisition and maintenance are carried out in the KB module. In the KB module interface, the table shown in Figure 6 allows the user to create object variables, edit their properties and operator rules, and specify their dependencies on other variables (relatives and heritage).
Users are not allowed to define more than one link between two variables (duplicate rules) or inconsistent rules (different operations linked to the same antecedent with the same consequent). The problem of rules without a condition side is eliminated because every rule is created by linking two existing variables.
The framework automatically keeps track of time and of the timing of rule activation, and does not allow descendant nodes to have a refresh rate higher than that of their parents. If more than one rule is activated by the inference engine, a conflict resolution strategy is applied. The AIMS conflict resolution strategy takes as its first criterion the hierarchical level (in the object network) of the variables affected by the rule. Rules related to lower level variables have higher priority and are triggered first. The inference process ends when there are no more rules to be activated.
In summary, AIMS is able to receive binary and analog variable data from an acquisition system, process the data, perform calculations through logical and arithmetic operators and create new alarms (alarms not provided directly by the acquisition system).
The new alarms are called diagnostic alerts and can be viewed as process alarms that require operator action, symptom alarms that indicate the failure of a unit component, or even prediction alarms that alert the user that a failure could occur in the future if the current situation persists. Thus, the diagnostic alerts generated by SDA themselves represent a diagnosis of the current situation of the plant unit.
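One possible reading of this class structure is sketched below. It is an assumption-laden illustration, not the KB module code: the method names `convert` and `update`, the fields of `Message` and the instrument tags are invented, and Figure 5 may organize the classes differently.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A fact produced when a variable changes."""
    variable: str
    value: object
    timestamp: float


class Variable(ABC):
    def __init__(self, name):
        self.name = name
        self.value = None

    @abstractmethod
    def convert(self, raw):
        """Turn a raw reading into this variable's value type."""

    def update(self, raw, timestamp):
        """Store the new value; return a Message if it changed, else None."""
        new_value = self.convert(raw)
        if new_value == self.value:
            return None
        self.value = new_value
        return Message(self.name, new_value, timestamp)


class Analog(Variable):
    def convert(self, raw):
        return float(raw)


class Binary(Variable):
    def convert(self, raw):
        return bool(raw)


class Rate(Variable):
    """Time variation of a reading between two consecutive updates."""

    def __init__(self, name):
        super().__init__(name)
        self._last = None  # (reading, timestamp) of the previous update

    def convert(self, raw):
        return float(raw)

    def update(self, raw, timestamp):
        previous, self._last = self._last, (float(raw), timestamp)
        if previous is None or timestamp == previous[1]:
            return None  # a rate needs two readings at different times
        rate = (float(raw) - previous[0]) / (timestamp - previous[1])
        return super().update(rate, timestamp)


temperature = Analog("TI_101")
first = temperature.update(350, 0.0)
repeat = temperature.update(350.0, 1.0)  # unchanged value, no new fact
slope = Rate("dTI_101")
slope.update(100.0, 0.0)
rate_msg = slope.update(103.0, 1.0)
print(first, repeat, rate_msg)
```
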
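These consistency checks can be sketched as a small rule bank that validates each new link before storing it. The class `RuleBank`, its method `add_rule` and the error messages are hypothetical and only illustrate the three constraints described above.

```python
class RuleBank:
    """Stores one operator per (antecedent, consequent) link. Sketch only."""

    def __init__(self, variables):
        self.variables = set(variables)
        self.links = {}  # (antecedent, consequent) -> operator name

    def add_rule(self, antecedent, consequent, operator):
        # A rule always links two existing variables, so a rule without a
        # condition side cannot be created.
        for name in (antecedent, consequent):
            if name not in self.variables:
                raise ValueError(f"unknown variable: {name}")
        key = (antecedent, consequent)
        if key in self.links:
            same = self.links[key] == operator
            kind = "duplicate" if same else "inconsistent"
            raise ValueError(f"{kind} rule for {antecedent} -> {consequent}")
        self.links[key] = operator


bank = RuleBank({"V11", "V12", "V22"})
bank.add_rule("V11", "V22", "or")
bank.add_rule("V12", "V22", "or")

errors = []
for args in [("V11", "V22", "or"), ("V11", "V22", "and"), ("V99", "V22", "or")]:
    try:
        bank.add_rule(*args)
    except ValueError as exc:
        errors.append(str(exc))
print(errors)
```
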
3.2 SDA HMI
The SDA HMI concentrates the most important (critical) information in one place, preventing the loss of important alerts during emergency situations that cause alarm floods in traditional automation systems. It thus supports operators in identifying the most critical problems, helping to prioritize actions and suppressing low criticality information. In normal operation, SDA also provides real-time plant efficiency indicators, enabling actions that optimize operations (increase production) and make them more reliable, improving operational continuity.
All information stored in KB SDA and processed in AIMS is presented to the users through the HMI, which is refreshed every 1 second. The SDA HMI was developed based on the state-of-the-art approach known as Ecological HMI [5, 6], in which plant process variables (binary or analog) are represented by graphical objects that convey the behavior of these variables quickly and clearly, giving the operator real-time situational awareness of the plant. The graphic objects used in the SDA HMI are: Plane Graph (x, y), Deviation Diagram, Sparkline, Radar Graph, and Digital Diagram.
The SDA HMI is divided into five different areas: (I) date/time and Binary Logical Annunciators Panel (ALB); (II) ecological HMI; (III) diagnostic alert list by priority; (V) other options, such as the ecological HMI of all diagnostic alerts, an extended diagnostic alert list, the diagnostic alert sequence list, and additional information such as instrument code, source, initial cause and action to be taken. Figure 7 shows the five areas in the SDA HMI with the ecological HMI option in area V. Figure 8 shows the five areas in the SDA HMI with the diagnostic alert list option in area V.
Diagnostic alerts are presented in the HMI in three distinct areas:
in an area similar to the binary alarm annunciators (ALB), where diagnostic alerts are grouped by subsystems and by units (area I). Figure 9 shows this information;
in a diagnostic alert list sorted by priority: critical, high, medium, and low, displayed in red, orange, yellow, and blue, respectively (area III). In area III, only the seven most critical diagnostic alerts are presented (Figure 10). The diagnostic alert list consists of:
priority icon: icon indicating the diagnostic alert priority;
tag: diagnostic alert name;
description: diagnostic alert description;
on/off indication: indication whether the diagnostic alert is on or off;
remaining response time: in SDA, each diagnostic alert is registered in the KB SDA with an associated response time, that is, a time interval considered appropriate for the operator to take the necessary actions to correct the problem identified by the alert;
activation date/time: date and time at which the diagnostic alert was activated.
It is important to point out that the most important diagnostic alert always occupies the top of the list, owing to the dynamic prioritization method.
and in another priority list in area V. This list shows further diagnostic alerts by priority, from the 8th onwards, that are not shown in area III. It consists of: priority icon, tag, description, on/off indication, remaining response time and activation date/time. Figure 11 shows this information.
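The split between the seven alerts of area III and the remaining alerts of area V can be sketched as a sort followed by a slice. This is an illustration under assumptions: the priority ranking from critical down to low, the tie-break by remaining response time, the class `ActiveAlert` and all alert tags and times are invented for the example and are not taken from the SDA implementation.

```python
from dataclasses import dataclass

# Assumed ranking: a smaller number means the alert is shown higher up.
PRIORITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class ActiveAlert:
    tag: str
    priority: str
    activated_at: float     # activation time, in seconds
    response_time_s: float  # response time registered for the alert

    def remaining(self, now):
        """Remaining response time, never below zero."""
        return max(0.0, self.response_time_s - (now - self.activated_at))


def split_for_display(alerts, now, top_n=7):
    """Return (area III list, area V list), both sorted by priority.

    Within one priority, the alert with the least remaining response
    time comes first (an assumption made for this sketch).
    """
    ordered = sorted(
        alerts, key=lambda a: (PRIORITY_ORDER[a.priority], a.remaining(now))
    )
    return ordered[:top_n], ordered[top_n:]


alerts = [
    ActiveAlert("A1", "low", 0.0, 600.0),
    ActiveAlert("A2", "critical", 0.0, 120.0),
    ActiveAlert("A3", "medium", 0.0, 300.0),
    ActiveAlert("A4", "high", 0.0, 300.0),
    ActiveAlert("A5", "critical", 30.0, 60.0),
    ActiveAlert("A6", "low", 0.0, 900.0),
    ActiveAlert("A7", "medium", 0.0, 200.0),
    ActiveAlert("A8", "high", 0.0, 100.0),
    ActiveAlert("A9", "low", 0.0, 300.0),
]
area_iii, area_v = split_for_display(alerts, now=60.0)
print([a.tag for a in area_iii], [a.tag for a in area_v])
```

Re-running the split at every refresh keeps the most critical alert at the top of the list regardless of the order in which the alerts were activated.
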
3.3 SDA implementation
SDA was designed to be implemented in any onshore and offshore production system. Currently, SDA is implemented in two Sulfur Recovery Units in the state of Rio de Janeiro, Brazil with the following results:
Presents the quality and importance/priority of the seven streams that make up the feed of the Units, together with technical documentation, available online and in real time in SDA. If the operator does not notice in a short time that the quality of a stream is poor and does not quickly divert it, deposition will occur and lead to unit shutdown, causing financial losses and increased emissions to the environment. This information is always available from the ecological HMI, and the operator is able to check the deviation quickly by looking at the SDA HMI;
Presents all diagnostic alerts regarding the prioritization of H2S, SO2 and NH3 detectors across the whole area of the Units. These warnings are critical and prevent the operator from entering an area where the presence of gas could endanger their health and physical integrity;
Presents the prioritization of diagnostic alerts for the critical unit processes stored in the KB SDA, which was built from the operational experience of operators and design engineers, HAZOP information and other sources. This knowledge was elaborated in advance, considering various scenarios, in order to assist the operators in decision-making in abnormal situations;
Presents, in real-time, information about sulfur removal efficiencies from Units and their subsystems. This information enables operators to act to minimize the tendency of units to lose efficiency;
Monitors critical control loops in manual mode, active alarms, bypasses, etc.;
Performs diagnostics of the process and of unit equipment: vessels, distillation towers, heat exchangers, furnaces, compressors, etc.;
Tracks the desired limits or operating regions (pressure, temperature and composition) of the processes and equipment, ensuring that they are respected not only during normal operation but also during plant startups and shutdowns;
Considers the different unit operating modes (startup, shutdown, steady state, disturbances, etc.) to generate diagnostic alerts specific to each situation.
Early tests with SDA show that concentrating all critical information about the process on a single HMI helps the operator with time management, since it avoids the need to consult different tools and screens to gain situational awareness and make decisions. In addition, the flexibility to create diagnostic alerts enables the monitoring of situations not covered by the traditional alarm management system.
Tests applied to evaluate operator situational awareness (the ability to perceive, understand and predict the future behavior of a process) showed that with traditional automation systems, situational awareness was around 50% in normal operation and dropped below 20% during emergency situations. With SDA, situational awareness during emergencies was around 55%, higher than that obtained in normal operation with traditional automation systems. This demonstrates the importance of an intelligent real-time decision support system for oil industry control room operators that aims to anticipate failures and, especially, to avoid risk situations.
SDA is also being evaluated for use on offshore production platforms to increase the safety and reliability of these processes. This system can be implemented in any industrial process to provide an intelligent system coupled with an ergonomic HMI that reduces the operators' cognitive load when searching for the root cause of a problem.
SDA was developed to support operators in time management during emergency situations by providing optimized plant process information that increases operator situational awareness, so that they can make appropriate decisions and take actions that increase the safety, integrity and reliability of plant processes and equipment.
The SDA HMI aims to increase operator situational awareness by explaining the reasons for diagnostic alerts through decision trees, trend graphs, etc., and by presenting recommendations for mitigating actions. The aim is to help operators identify process failures quickly and prevent the abnormal situation from spreading and increasing risks to people, equipment and the environment.
The methodologies developed and implemented make SDA a support system that can easily be applied to different onshore and offshore installations in the oil industry, benefiting the sector mainly by reducing the number of disturbances and shutdowns and by increasing production and plant safety. In addition, SDA helps to address alarm floods, anticipate failures, avoid hazardous situations, avoid misdiagnosis, prevent equipment breakdown and/or unavailability, prevent leaks, and avoid emissions (e.g., flaring) and debris. This increases the safety, reliability and production of complex oil industry processes. A good indication of this impact was the installation of SDA in Sulfur Recovery Units in Brazil.
In conclusion, SDA is a new safety barrier that provides diagnostic alerts which can prevent an abnormal situation from evolving into an accident and may decrease the chance of human error, especially in identifying the root cause, which, if misdiagnosed, may lead operators to take wrong actions that increase the consequences of incidents, accidents or plant shutdowns.
The authors would like to acknowledge: CAPES (Coordination for the Improvement of Higher Education Personnel), CNPq (National Council for Scientific and Technological Development) and Petrobras for financial support.