Open access peer-reviewed chapter

Real-Time Fault Detection and Diagnosis Using Intelligent Monitoring and Supervision Systems

Written By

Gustavo Pérez Alvarez

Submitted: 21 May 2019 Reviewed: 15 October 2019 Published: 05 February 2020

DOI: 10.5772/intechopen.90158

From the Edited Volume

Fault Detection, Diagnosis and Prognosis

Edited by Fausto Pedro García Márquez



In monitoring and supervision schemes, fault detection and diagnosis are characteristic of highly efficient, high-quality production systems. To achieve these properties, such schemes rely on techniques that detect and diagnose failures in real time. Detection signals that a fault has occurred, and diagnosis provides its root cause and location. Fault detection is based on mathematical models of signals and processes, while fault diagnosis draws on systems theory and process modeling. Monitoring and supervision complement each other in fault management, thus enabling normal and continuous operation. Their application avoids stopping productive processes by detecting failures early and by applying real-time actions to eliminate them, such as predictive and proactive maintenance based on process conditions. Integrating all these methodologies yields intelligent monitoring and supervision systems capable of real-time fault detection and diagnosis. Their high performance is associated with statistical decision-making techniques, expert systems, artificial neural networks, fuzzy logic and computational procedures, which make them efficient and fully autonomous in making decisions during the real-time operation of a production system.


  • automatic control
  • availability
  • intelligent systems
  • monitoring
  • predictive maintenance
  • supervision

1. Introduction

Advances in production techniques have increased the capacity of industrial production systems: the equipment used in these processes has become more reliable and more available in operation, making production processes more efficient.

One of the most critical questions in automated system design today is the reliability and availability of the system. A traditional way to improve them is to improve the quality, reliability, and robustness of the individual components of the system, such as the sensors, actuators, controllers and/or computers used together in modern monitoring processes. Even so, fault-free operation cannot be guaranteed. Process monitoring and fault diagnosis are therefore a vital part of modern systems for automatically managing the operation of production systems [1, 2].

Since the life cycle stages of production process equipment require high investments, maintenance and operation procedures must ensure high availability and reliability rates in order to achieve appropriate return times on the investments made. These performance indexes are improved by reducing the number of failures and managing their severity, while ensuring an increase in overall security.

To achieve these goals, two important techniques allow optimized maintenance management: predictive and proactive maintenance, which are complemented by corrective and preventive maintenance. This set of techniques offers its best results through the implementation of efficient real-time monitoring and supervision structures, making production systems highly reliable in supplying their products and in the quality of the products offered. Corrective maintenance corrects the problem; preventive maintenance prevents the problem.

Predictive maintenance, on the other hand, consists of frequently measuring physical quantities considered representative of the equipment and analyzing their behavior to infer its state or operating condition. This makes it possible to suggest the most appropriate moment to act on equipment that shows signs of being in the initial stage of a fault (early failure, in which the root cause is continuously but only slightly affecting the equipment), and thus to anticipate the emergence of a serious system failure. The predictive maintenance process yields a report on the operational condition of the equipment. Issuing this report basically comprises four stages:

  • Identification of the failure modes that are occurring;

  • Fault location;

  • Evaluation of its extension;

  • Estimation of the remaining life of the equipment or component in question.

In traditional predictive maintenance processes, all these steps are performed manually. Alternatively, they can be automated using computer systems known as Systems for Automatic Fault Diagnosis [3].

As can be inferred, selecting, implementing, operating and maintaining a System for Automatic Fault Diagnosis is not a simple task. Care is required at each stage so that the results the system provides after implementation are within the initial specification. This calls for appropriate tools and strategies at each step in order to maximize the chance of executing it successfully.

Proactive maintenance is a procedure that minimizes the impact of missing or reduced maintenance on the equipment of a production system and, by its nature, complements the other maintenance techniques. Its main action is to analyze performance indicators, identify the root causes of failures and of equipment degradation, and remove them before the severity of a fault increases [4].

This chapter describes the various methodologies for converting an online monitoring and supervision system into an intelligent system that detects and diagnoses failures and is autonomous enough to take the necessary actions in real time to avoid them and to find and eliminate their causes.

The content has two basic objectives: to discuss some factors that are important for successfully implementing and using these systems, and to present the main benefits of monitoring and supervising several physical quantities of the equipment in an integrated and simultaneous way, with the goal of increasing the accuracy of fault detection and diagnosis.

The technological development in this area has allowed the emergence of innovative methodologies for the detection and diagnosis of failures. The failure detection method recognizes that the failure has occurred, and fault diagnosis finds the root cause and location of that failure. In general, fault detection methods are based on mathematical models of signal and process, and on methods of systems theory and process modeling to generate fault symptoms. Fault diagnosis methods use causal relationships between fault and symptom, applying statistical decision methods, artificial intelligence and computational software [5].

Among the existing model-based fault diagnosis schemes, the so-called observer-based technique has received much attention since the 1990s. This technique was developed within the framework of advanced control theory, which provides powerful tools for designing observers as well as efficient and reliable data-processing algorithms for reconstructing process variables.

The content described here is intended to provide an introduction to advanced monitoring and supervision, focused as a framework or intelligent assembly for fault detection and diagnostics [1, 6], and fault-tolerant systems especially for processes characterized by continuous and sampled (discrete) signals.

In general, almost all physical signals are continuous, for example, position and velocity of a body, speech or music picked up by a microphone, voltage or current in an electric circuit.

Sampling an analog signal or waveform is the process by which the signal is represented by a discrete set of numbers. These numbers, or samples, are equal to the signal value at well-determined instants (the sampling times). Samples must be obtained in such a way that it is possible to reconstruct the signal accurately. That is, the original waveform, defined in “continuous” time, is represented in “discrete” time by samples obtained at conveniently spaced sampling instants.

An application-oriented approach is also taken, using methods that have proven their performance in practical applications.
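
The sampling process described above can be sketched in a few lines of Python. This is only an illustration and not part of the original chapter: the function names `sample_signal` and `tone`, the 1 Hz and 9 Hz test tones and the 8 Hz sampling rate are assumptions chosen for the example. It suggests why the sampling instants must be "conveniently spaced": a tone above half the sampling rate produces the same samples as a lower-frequency tone, so the original waveform can no longer be recovered from the samples alone.

```python
import math

def sample_signal(signal, sampling_rate_hz, n_samples):
    """Return n_samples values of signal(t) taken every 1/sampling_rate_hz seconds."""
    period = 1.0 / sampling_rate_hz
    return [signal(n * period) for n in range(n_samples)]

def tone(frequency_hz):
    """Continuous-time sine wave of the given frequency (hypothetical test signal)."""
    return lambda t: math.sin(2.0 * math.pi * frequency_hz * t)

# Assumed values, for illustration only.
FS = 8.0                                  # sampling rate in Hz
slow = sample_signal(tone(1.0), FS, 16)   # 1 Hz tone: below FS / 2
fast = sample_signal(tone(9.0), FS, 16)   # 9 Hz tone: above FS / 2, aliases onto 1 Hz

# The two sample sequences are numerically indistinguishable.
max_difference = max(abs(a - b) for a, b in zip(slow, fast))
print(f"largest difference between the two sample sets: {max_difference:.2e}")
```

In this sketch the 9 Hz tone differs from the 1 Hz tone by exactly one full turn per sample, which is why the two sets of samples coincide.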


2. Monitoring and supervision of systems

The monitoring and supervision of processes aim to show the real state of the equipment involved in a productive process, indicating undesirable or non-permitted states and the appearance of a change in its initial phase (early failure). Such a situation requires appropriate and immediate action to avoid catastrophic damage in the future.

Deviations from the normal behavior of the parameters of a piece of equipment or a system arise from faults and/or errors, which can be attributed to several causes. These changes are symptoms of possible failures in their early state, and failing to take the necessary actions to eliminate them can lead to real failures that may compromise the performance of productive systems. The justification for monitoring and supervision systems is to avoid these defects or failures by continuously collecting real-time information on the behavior of the equipment of a production system (provided by the monitoring system) and by supervising it (evaluating the collected data) to determine whether a device or piece of equipment is operating normally or at risk.

The content presented in this chapter focuses primarily on system monitoring and supervision. It shows the changes that can be made in these two areas of observation and analysis of the behavior of a system's parameters during operation to make them more efficient in solving the problems of production systems. The fundamental objective is to integrate these two areas into a single set through the Smart System technique, allowing their unified, real-time application to decision making in any area of a production system [2].

The Smart System technique will make productive systems economically efficient by improving their performance, quality, reliability of supply, operational flexibility, safety, etc.

2.1 Fault diagnosis monitoring systems

It should be noted that selecting, implementing, operating and maintaining a system for automatic fault diagnosis is a complex task [2, 7]. The results provided by this system must be within the programmed specifications. This calls for appropriate tools and strategies at each step in order to maximize the chance of executing it successfully.

The concept of predictive maintenance is directly linked to monitoring the condition (state) of one or more pieces of equipment. Monitoring as such is a basic tool for implementing predictive maintenance strategies. Monitoring can be classified by the type of sensor installation (permanent or mobile) or by the data acquisition strategy (“continuous/on-line” or “periodic/off-line”).

“Continuous/on-line” monitoring systems often work in an integrated way with the Supervisory and Control Systems, or “Supervisory Systems” of the production systems, both of which have individual requirements for data acquisition and functions totally different from one another. The integration of these two systems allows for the “continuous” acquisition of operating data and the variables of slow variation (temperatures, levels, position values, static pressures, etc.) normally available in these systems.

Automatic Diagnostic Systems (ADS) are the next step beyond pure and simple monitoring. These more advanced systems receive information from the monitoring system and, through intelligent software, manage a “Knowledge Bank” in which information obtained from various physical parameters is cross-checked and integrated to produce a result that is closer to what is really wanted: an effective aid to decision-making.

2.2 Main features of automatic fault diagnostic systems

Automatic processing systems consist of computer programs based on artificial intelligence techniques and are responsible for automatically processing all information from the monitoring systems. The main objective of integrating these systems into the operation of a productive system is the automatic detection of incipient faults, that is, faults in the initial phase of their formation (early faults), together with their identification, location and an estimate of their degree of severity.

The main characteristic of ADSs is that they can handle the large amounts of data generated by Monitoring Systems in a systematic, frequent and automatic way, and optimize data storage over long periods of operation (months or years). Another intrinsic attribute of ADSs is that they need less and less user intervention the longer they are in use. A further important feature is their suitability as a Knowledge Management tool in predictive maintenance [8, 9].

The characteristic limitation of this type of system, as well as of any type of Monitoring System traditionally used, is presented when dealing with faults of instant or catastrophic evolution. For this, “Protection Systems”, with fixed and well-established alarm limits, should be considered as the main option. The principles of operation, as well as the necessary technical characteristics, relating to the acquisition, communication and processing of data from each of these systems are fundamentally different and should not be confused [10].

Basically, ADSs have the function of reporting the occurrence of failures when they are still in their infancy, while the Protection Systems must act at the moment an unacceptable operating situation occurs.

Technological development in monitoring and supervision systems will allow the structured and optimized evolution of automated detection and diagnosis, the next step beyond pure and simple monitoring. These systems receive information from the monitoring system, which consists essentially of sensors, and use intelligent-system and expert-system techniques to manage a “knowledge bank”, also called the knowledge base, for decision making.

Evaluating the information provided by the monitoring and supervision system makes it possible to detect and locate a problem and diagnose its root cause. At the same time, the best action can be selected to mitigate changes in the behavior of the parameters of interest and to eliminate the cause that produces them. Finally, the system itself decides whether to take this action online or offline, depending on the severity and robustness of the problem.

Another important characteristic considered in the design of these systems is that they require less and less user intervention over time. In traditional monitoring systems, the accumulation of stored data not yet processed by the user is a natural consequence of the monitoring process itself, and the effort needed to treat such data never diminishes. In ADSs, by contrast, the manual work involved in processing information decreases over time, because these systems include tools and mechanisms for retaining and improving the knowledge registered in them. Thus, by using knowledge management tools, maintenance team members can track, correct, insert, retrieve, and refine existing content in their Knowledge Bank (expert systems).

In this way, it can be said that the joint monitoring and supervision system has become intelligent and consequently autonomous to operate a production system in an efficient way from a technical as well as an economic point of view.

2.3 Structuring of monitoring and supervision systems in an intelligent system

This section describes the methodology used to convert a monitoring and supervision system into an intelligent system that detects, locates and diagnoses failures, making it possible to take the most appropriate actions to eliminate them and to find their causes so that they can be avoided. This will improve the efficiency of production systems.

Smart System or Smart Grid in general terms is the application of information technologies in production systems, integrated with communication systems and with an automated network infrastructure. This technique requires the installation of sensors in all the fundamental equipment of the production systems, structuring a reliable two-way communication system with wide coverage with the various devices and automation of the physical assets.

Current sensors contain chips that capture information about the behavior of equipment parameters. These devices collect the information, and readings that show changes are sent through a communication system to an operation center, where they are analyzed to determine what is significant.

This process must occur in real time and online. When significant information is present, a centralized analysis system (specialized software) evaluates it and determines what changes have occurred and what should be done to improve the performance of a given parameter.
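
The idea that sensors forward only readings that have changed can be sketched as a simple deadband filter. The chapter does not specify how a "change" is decided, so the deadband rule, the function name `changed_readings` and the temperature values below are assumptions made purely for illustration.

```python
def changed_readings(readings, deadband):
    """Yield (index, value) only when the value has moved more than
    `deadband` away from the last value that was reported."""
    last_reported = None
    for index, value in enumerate(readings):
        if last_reported is None or abs(value - last_reported) > deadband:
            last_reported = value
            yield index, value

# Hypothetical bearing temperatures in degrees Celsius.
temperatures = [60.0, 60.1, 59.9, 60.2, 63.5, 63.6, 67.0, 66.9]
reported = list(changed_readings(temperatures, deadband=1.0))
print(reported)  # [(0, 60.0), (4, 63.5), (6, 67.0)]
```

Only three of the eight readings would be sent to the operation center in this example, which is the kind of traffic reduction that makes wide sensor coverage practical.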

Figure 1 presents a block diagram describing, at a macro level, the sequential structure of an intelligent monitoring and supervision system. This configuration is a technological innovation in the area of intelligent automation. It is implemented through reference software that assists in evaluating, detecting, locating and diagnosing a problem or failure and in applying the most appropriate actions to eliminate it.

Figure 1.

Flowchart of intelligent monitoring and supervision system.

An overview of each of the steps that make up an intelligent monitoring and supervision system (see the flowchart in Figure 1) is presented below, highlighting the methodologies and techniques used in each step to reach the level of efficiency required to solve the various problems that arise in production equipment. This efficiency is measured by the degree of real-time automation and by the autonomy in decision making in the presence of a disturbance. These indicate a fully intelligent system that safeguards the integrity and security of a production system, avoiding collapses and economic and technical damage.

2.3.1 Monitoring

To monitor is to observe, analyze and be aware of possible signs that something is not normal. In information technology, “not normal” can indicate that one or more parts of a system are unavailable, or simply that a parameter of a device has changed.

In this phase, changes over time in the magnitudes of the relevant parameters are observed. These changes must be recorded in a data collection system, the Database, which makes it possible to build a history of the behavior over time of a given variable or parameter relative to a reference level or behavior threshold [11, 12].

This process can only be carried out through a robust system of sensors installed at strategic points of the equipment or system, allowing it to be observed continuously over time [1].

Monitoring is carried out using the following methodologies:

  1. Digital recorders: these perform digital recording of all information from the sensors. It is through these devices that the history of the behavior of a parameter or variable over time is built. This information is usually stored in binary form.

  2. Remote digital sensors (threshold): a transducer is the name given to a sensor or actuator, that is, a device for detection or actuation in a given process.

With the advent of microcontrollers and microprocessors and the wide availability of tools and resources for digital processing, it became possible to give transducers a high computing capacity.

The intelligent transducer is the integration of: (a) an analog or digital sensor or an actuator, (b) a processing unit, and (c) a network interface.

A smart transducer transforms the sensor’s raw signals into a standardized digital representation, transmitting this digital signal to its users through a standardized digital communication protocol.
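
The three parts of a smart transducer can be sketched as follows. Everything in this example is hypothetical: the class name `SmartTransducer`, the 12-bit converter, the linear scaling and the use of a JSON message as a stand-in for a standardized digital communication protocol are assumptions, since the chapter does not name a specific device or protocol.

```python
import json
from dataclasses import dataclass

@dataclass
class SmartTransducer:
    """Hypothetical smart transducer: sensor + processing unit + network interface."""
    sensor_id: str
    unit: str
    adc_bits: int       # resolution of the assumed analog-to-digital converter
    range_min: float    # engineering value at ADC count 0
    range_max: float    # engineering value at the full-scale ADC count

    def to_engineering_units(self, raw_count: int) -> float:
        """Processing unit: linear scaling of a raw ADC count."""
        full_scale = (1 << self.adc_bits) - 1
        if not 0 <= raw_count <= full_scale:
            raise ValueError(f"raw count {raw_count} outside 0..{full_scale}")
        span = self.range_max - self.range_min
        return self.range_min + span * raw_count / full_scale

    def encode(self, raw_count: int, timestamp_s: float) -> str:
        """Network interface: serialize one reading as a JSON message."""
        message = {
            "sensor_id": self.sensor_id,
            "timestamp_s": timestamp_s,
            "value": round(self.to_engineering_units(raw_count), 3),
            "unit": self.unit,
        }
        return json.dumps(message, sort_keys=True)

# Assumed oil-pressure sensor with a 0..10 bar range.
oil_pressure = SmartTransducer("P-101", "bar", adc_bits=12, range_min=0.0, range_max=10.0)
message = oil_pressure.encode(raw_count=2048, timestamp_s=0.5)
print(message)
```

The point of the sketch is that the consumer of the message never sees raw counts: it receives a value with its unit and identity attached, regardless of the sensing principle used.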

Table 1 lists the sensors belonging to the two existing sensor types used in the monitoring process.

  3. Oscillography: this aims to enable the post-event analysis of disturbances, unlike protection systems, which must act in real time in response to disturbances. In fact, oscillography is a complementary tool to protection systems, as it allows the disturbance analysis specialist to verify the settings of a given protection, as well as any defects that may arise.

Active sensors Passive sensors
Thermoelectric Resistive
Piezoelectric Capacitive
Pyroelectric Inductive
Photovoltaic Resonant
Hall effect

Table 1.

Types of sensors.

A very useful calculation performed by specialists from the oscillograms is to determine the distance at which a disturbance occurred. In this case, the specialist informs the maintenance team in which region of the transmission line it must act in order to repair the damage caused by the disturbance, making its work easier and more efficient. In addition, the expert performs other procedures, such as the phasor analysis to verify the balance between the phases and the harmonic analysis to observe the intensity of the harmonics present in the signal.
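
The harmonic analysis mentioned above can be illustrated with a single-bin discrete Fourier transform evaluated over one cycle of the recorded waveform. This is a minimal sketch under stated assumptions: the function name `harmonic_phasor`, the 64 samples per cycle and the synthetic record (a fundamental of amplitude 100 with a third harmonic of amplitude 20) are invented for the example and do not come from a real oscillogram.

```python
import cmath
import math

def harmonic_phasor(samples, harmonic):
    """Single-bin DFT over exactly one fundamental cycle.
    Returns a complex phasor whose magnitude is the peak amplitude of the
    requested harmonic and whose angle is its phase (cosine reference)."""
    n = len(samples)
    total = sum(
        x * cmath.exp(-2j * math.pi * harmonic * k / n)
        for k, x in enumerate(samples)
    )
    return 2.0 * total / n

# Hypothetical one-cycle record, for illustration only.
SAMPLES_PER_CYCLE = 64
record = [
    100.0 * math.cos(2 * math.pi * k / SAMPLES_PER_CYCLE)
    + 20.0 * math.cos(2 * math.pi * 3 * k / SAMPLES_PER_CYCLE + math.pi / 4)
    for k in range(SAMPLES_PER_CYCLE)
]

fundamental = abs(harmonic_phasor(record, 1))
for h in (1, 3, 5):
    magnitude = abs(harmonic_phasor(record, h))
    print(f"harmonic {h}: {100.0 * magnitude / fundamental:6.2f} % of fundamental")
```

The same phasors, computed for each phase of a three-phase record, could also be compared in magnitude and angle as a rough check of the balance between phases.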

The digitalization of the oscillography signals motivated the growth of the number of computational tools developed to aid in the analysis of perturbations, also allowing the development of sophisticated signal processing tools and intelligent processing systems.

Nowadays the use of oscillography has become quite frequent for the recording of events in production systems (electrical systems, mechanical systems, etc.), since it is possible to observe the development sequence of an event and the interaction between the elements of the system that are part of the event. This implies the progressive growth of the number of oscillography files.

Thus, the need arises to study and develop compression methods to reduce the space needed to store these files and make better use of resources. One proposal is a compression method based on synthesis of oscillography files, using redundant adaptive decompositions that provide a representation coherent with the phenomena present in the recorded signals. These decompositions are based on the Matching Pursuit (MP) technique.

Remote and real-time monitoring allows observation of data related to operating conditions, mechanical parameters (fuel, temperature, engine speed, oil pressure and level, vibration, etc.), electrical parameters (currents, powers, voltages, oscillations, etc.), hydraulic parameters (flow, cavitation, water hammer, etc.) and operating hours.
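
To give an idea of how a Matching Pursuit decomposition compresses a record, the following sketch greedily approximates a short signal with atoms from a redundant dictionary. It is a generic textbook form of the algorithm, not the specific method referred to above: the dictionary (a few sinusoids plus unit impulses), the 32-sample signal and all function names are assumptions made for illustration.

```python
import math

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def build_dictionary(n):
    """Hypothetical redundant dictionary: sinusoids at 1..4 cycles per record
    (indices 0..7) followed by one unit impulse per sample (indices 8..8+n-1)."""
    atoms = []
    for cycles in range(1, 5):
        atoms.append(normalize([math.cos(2 * math.pi * cycles * k / n) for k in range(n)]))
        atoms.append(normalize([math.sin(2 * math.pi * cycles * k / n) for k in range(n)]))
    for position in range(n):
        atoms.append([1.0 if k == position else 0.0 for k in range(n)])
    return atoms

def matching_pursuit(signal, dictionary, n_terms):
    """Greedy decomposition with unit-norm atoms.
    Returns ([(atom_index, coefficient), ...], residual)."""
    residual = list(signal)
    terms = []
    for _ in range(n_terms):
        correlations = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(correlations[i]))
        coefficient = correlations[best]
        residual = [r - coefficient * a for r, a in zip(residual, dictionary[best])]
        terms.append((best, coefficient))
    return terms, residual

def synthesize(terms, dictionary, length):
    """Rebuild the approximation from the stored (index, coefficient) pairs."""
    out = [0.0] * length
    for index, coefficient in terms:
        out = [o + coefficient * a for o, a in zip(out, dictionary[index])]
    return out

# Assumed record: a steady oscillation plus a short spike at sample 10.
N = 32
signal = [5.0 * math.cos(2 * math.pi * k / N) + (3.0 if k == 10 else 0.0) for k in range(N)]
dictionary = build_dictionary(N)
terms, residual = matching_pursuit(signal, dictionary, n_terms=2)
print("stored terms:", [(i, round(c, 3)) for i, c in terms])
print("relative residual energy:", dot(residual, residual) / dot(signal, signal))
```

In this example two (index, coefficient) pairs stand in for 32 samples, and each selected atom corresponds to a recognizable phenomenon in the record (the oscillation and the spike).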

2.3.2 Supervision

Supervision is the process that analyzes the collected data to detect unwanted or non-permitted states. It checks whether a parameter is within the permitted limits or shows unusual variations.

The supervision area receives information from the Monitoring System and, through intelligent software, manages a “Knowledge Bank” in which information obtained from various physical parameters is cross-checked and integrated to produce a result that is closer to what is really wanted: an effective aid to decision-making.

The supervisory system automatically processes the information collected by the monitoring system through internal routines using intelligent techniques (Computational Intelligence). The objective is the automatic detection of incipient faults, that is, early detection of faults, their identification, location and estimation of the degree of severity [4, 13].

Within the area of supervision three very significant and decisive procedures are performed in solving the problems of a productive system:

  1. Evaluation of the information collected;

  2. Diagnosis of the changes present in the collected data: the system reports whether they are faults in their early state or catastrophic failures, together with their root cause and their location;

  3. Elimination of failures and their root cause. This prevents damage and collapse in a production process. The actions to be taken are decided here and, depending on the severity of the failure, they are executed online or offline.

Evaluation

At this stage, variations or changes in the normal behavior of a parameter are detected by comparison with a previously defined reference value. This check is performed in real time and its result is compared with past behavior, which makes it possible to decide whether a real abnormality is present or whether it is simply an isolated event.

The behavior history is analyzed and an image of the state of the selected parameters is created [8, 9]. This image is compared to the behavior of these same parameters in real time. This part of the evaluation uses only stored (digital) data that reaches a threshold value (reference value).
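
One possible way to turn this comparison into code is sketched below. The chapter does not say how the "image" of past behavior is built or how an isolated event is told apart from a real abnormality, so the choices here (mean and standard deviation as the reference, a band of three standard deviations, and three consecutive out-of-band readings as the persistence rule) are assumptions, as are all names and values.

```python
from statistics import mean, stdev

def build_reference(history):
    """Summarize past behavior as (mean, standard deviation)."""
    return mean(history), stdev(history)

def classify(readings, reference, k=3.0, persistence=3):
    """Label each new reading 'normal', 'suspect' or 'abnormal'.
    A reading outside mean +/- k * std is 'suspect'; it becomes 'abnormal'
    once `persistence` consecutive readings have been outside the band."""
    centre, spread = reference
    labels = []
    run = 0
    for value in readings:
        if abs(value - centre) > k * spread:
            run += 1
            labels.append("abnormal" if run >= persistence else "suspect")
        else:
            run = 0
            labels.append("normal")
    return labels

# Hypothetical stored history and real-time readings of one parameter.
history = [50.1, 49.8, 50.0, 50.3, 49.9, 50.2, 49.7, 50.0]
live = [50.1, 51.5, 50.0, 50.9, 51.2, 51.6, 50.2]

reference = build_reference(history)
labels = classify(live, reference)
print(list(zip(live, labels)))
```

Here the single excursion to 51.5 is treated as an isolated event, while the run 50.9, 51.2, 51.6 is flagged as abnormal on its third reading.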

An evaluation of the recorded oscillograms is also performed, interpreting all recorded graphs related to the behavior of a given parameter.

To perform the database evaluation process with the historical behavior of the parameters of interest, there are tools or methodologies that allow the execution of this process in an optimized and efficient manner. These methodologies are as follows and depend greatly on the type of signal being monitored:

  1. Statistical and Probabilistic Techniques

    At this stage, the most widely used statistical method is least squares, which applies mainly to processes with linear characteristics. The probabilistic methods most commonly used to determine the state probabilities and the probability density function of a piece of equipment or a system are: (a) the state space method, or Markov process, which describes the states and the possible transitions between them; and (b) Monte Carlo simulation, which runs many computational simulations of a process over a given period and, once the simulations are complete, estimates the desired indices, such as the probability of a failure occurring, its frequency and its duration.

  2. Kalman filter

    The Kalman filter produces estimates of the true values of measured quantities and of associated quantities by predicting a value, estimating the uncertainty of the predicted value, and calculating a weighted average of the predicted value and the measured value. The greatest weight is given to the value with the least uncertainty. The estimates generated by the method tend to be closer to the true values than the original measurements, since the weighted average has a lower estimated uncertainty than either of the values used in its calculation. From a theoretical point of view, the Kalman filter is an algorithm for efficiently making exact inferences about a linear dynamic system, which is a Bayesian model similar to a hidden Markov model, but in which the state space of the unobserved variables is continuous and all observed and unobserved variables have a normal distribution (often a multivariate normal distribution).

  3. Fourier Transform

    This technique applies mainly to stationary periodic signals.

    The Fourier transform is an integral transform that expresses a function in terms of sinusoidal basis functions, i.e. as a sum or integral of sinusoidal functions multiplied by coefficients. There are several closely related variants of this transform, depending on the type of function being transformed. The Fourier transform decomposes a function of time (a signal) into frequencies, just as a musical chord can be expressed as the amplitudes (or volumes) of its constituent notes. The Fourier transform of a function of time is a complex-valued function of frequency whose absolute value represents the amount of each frequency present in the original function and whose complex argument is the phase offset of the sinusoidal basis function at that frequency.

  4. Wavelet Transform

    This technique applies mainly to non-stationary periodic signals.

    Many time series exhibit non-stationary behavior, such as changing trends and structural breaks between the beginning and the end of an event. These features are often the most important parts of the signal, and the Fourier transform cannot capture them efficiently.

    The wavelet transform is a very useful tool for analyzing these non-stationary series.

    The wavelet transform has attractive qualities that make it a very useful method for time series whose characteristics may vary in both time and frequency (or scale).

    The wavelet transform allows the signal to be decomposed into a set of basis functions at different resolution levels and localization times. From these levels it is possible to reconstruct or represent a function by using the wavelet bases and the coefficients of these levels appropriately.

  5. ARMA (p,q) model

    This technique applies mainly to stochastic signals.

    In the statistical analysis of time series, autoregressive moving average (ARMA) models provide a parsimonious description of a weakly stationary stochastic process in terms of two polynomials, one for the autoregression and one for the moving average.

    Given a time series of data X, the ARMA model is a tool for understanding and perhaps predicting future values in the series. The model consists of two parts, an autoregressive (AR) part and a moving average (MA) part. The AR part involves regressing the variable on its own lagged, that is, past values. The MA part involves modeling the error term as a linear combination of error terms that occur contemporaneously and at various times in the past.

    The model is generally referred to as the ARMA (p,q) model, where p is the order of the autoregressive part and q is the order of the moving average part.

    In signal processing, a time series is a collection of observations made sequentially over time. In linear regression models with cross-section data the order of observations is irrelevant to the analysis, in time series the order of data is fundamental. A very important feature of this type of data is that neighboring observations are dependent and the interest is to analyze and model this dependency.

    Within probability theory, a stochastic process is a family of random variables representing the evolution of a value system over time. It is the probabilistic counterpart of a deterministic process. Instead of a process that has a single way of evolving, as in the solutions of ordinary differential equations, for example, in a stochastic process there is an indetermination: even if one knows the initial condition, there are sometimes infinite directions in which the process can evolve.

    In discrete time, as opposed to the continuous-time case, the stochastic process is a sequence of random variables, such as a Markov chain. The variables corresponding to the various times may be completely different, the only requirement being that their values all lie in the same space, that is, in the codomain of the function. One possible approach is to model the random variables as random functions of one or more deterministic arguments, in most cases the time parameter. Although the random values of a stochastic process at different times may be independent random variables, in the most common situations they exhibit a complex statistical dependence.

Diagnosis

At this stage the type and degree of variation, the type of failure, its location, its severity and the incidence on the performance of an element or component are determined, and most particularly the root cause of this disturbance is identified [8].

In order to achieve the objectives of the fault diagnosis stage, two very important techniques give the monitoring and supervision system its intelligent and autonomous decision-making capability [5, 7, 14]. The application of these two methodologies is the main difference between classical and intelligent monitoring and supervision. They are described below:

  1. Fault detection methods

    They detect faults, locate them and determine their degree of severity. The methodologies used in this technique to detect faults are:

    1. Fault detection with limit checking

      It is a simple and frequently used method that detects faults by checking the limits of a directly measured variable Y(t). The measured variables of a process are monitored and checked to see whether their absolute values or trends exceed a threshold. An additional possibility is to check their plausibility.

      To detect failures in a device of a production system it is necessary to establish variation limits for the variables of interest. Usually these limits are the maximum values these variables can reach. When the values of the variables reach these limits, it can be inferred that a variable is changing abnormally. If such changes are observed, whether continuous or discrete, it can be concluded that the device is in the process of failing.

      The fault detection threshold verification technique is based on two procedures in order to achieve its goal, namely:

      • Binary thresholds;

      • Fuzzy thresholds.

      In many situations the binary decision between “normal” and “disturbance” is artificial, because there is rarely a marked difference between these two states. Therefore, the fuzzy threshold procedure is a more realistic alternative for detecting changes in the behavior of variables.
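
The limit-checking ideas above (absolute-value check, trend check, and binary versus fuzzy, or diffuse, thresholds) can be illustrated with a minimal Python sketch. The function names, the linear ramp used for the fuzzy threshold and the temperature readings are assumptions made for illustration; they are not taken from the chapter.

```python
def binary_alarm(y, y_min, y_max):
    """Binary threshold: True when the measured value leaves [y_min, y_max]."""
    return y < y_min or y > y_max


def fuzzy_alarm(y, y_warn, y_max):
    """Fuzzy threshold: degree of 'disturbance' in [0, 1].

    0 below y_warn, 1 above y_max, linear ramp in between (an assumed shape).
    """
    if y <= y_warn:
        return 0.0
    if y >= y_max:
        return 1.0
    return (y - y_warn) / (y_max - y_warn)


def trend_alarm(samples, dt, rate_max):
    """Trend check: True when any |dY/dt| between samples exceeds rate_max."""
    return any(abs(b - a) / dt > rate_max for a, b in zip(samples, samples[1:]))


# Hypothetical bearing temperature readings in degrees Celsius, one per minute.
temperature = [70.0, 71.0, 73.0, 78.0, 84.0]
for y in temperature:
    print(y, binary_alarm(y, 20.0, 85.0), round(fuzzy_alarm(y, 75.0, 85.0), 2))
print("trend alarm:", trend_alarm(temperature, dt=1.0, rate_max=5.0))
```

In this example the binary check stays silent because no reading exceeds 85, while the fuzzy threshold already reports a growing degree of disturbance and the trend check flags the last step, which is the kind of early indication the text argues for.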

  2. Fault diagnosis methods—root cause identification.

    They identify the impact a failure has on the performance of an element or device. This is strongly related to the severity of the failure. The important part of the diagnostic step is that the system is able to identify the root cause of a problem.

    Many measured signals show oscillations that are either harmonic or stochastic in nature or both [9]. If changes to these signals are related to actuator, process and sensor failures, signal model-based failure detection methods may be applied.

    Assuming special mathematical models for the measured signal, suitable features can be determined. Comparing them with the features observed for normal behavior yields changes in these features, which are considered analytical symptoms.

    The signal models can be divided into non-parametric models, such as frequency spectra or correlation functions, and parametric models, such as amplitudes for distinct frequencies or ARMA models.

    At this stage, the signals analyzed are usually of the following types, because they occur frequently in different production processes:

    • Periodic signals;

    • Non-stationary periodic signals;

    • Stochastic signals.
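
As a minimal sketch of a signal-model feature and its analytical symptom, the Python example below estimates the amplitude of one frequency component of a periodic signal and compares it with the value obtained for normal behavior. The function names, the 50 Hz vibration signal and the amplitudes are hypothetical and serve only to illustrate the comparison described above.

```python
import math


def amplitude_at(signal, freq_hz, sample_rate_hz):
    """Estimate the amplitude of one frequency component by correlating the
    signal with a cosine and a sine at that frequency (a single DFT bin)."""
    n = len(signal)
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    re = sum(x * math.cos(w * k) for k, x in enumerate(signal))
    im = sum(x * math.sin(w * k) for k, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n


def symptom(signal, freq_hz, sample_rate_hz, normal_amplitude):
    """Analytical symptom: change of the feature relative to normal behavior."""
    return amplitude_at(signal, freq_hz, sample_rate_hz) - normal_amplitude


# Hypothetical vibration signal: a 50 Hz component sampled at 1 kHz for 1 s.
fs, f0 = 1000.0, 50.0
normal = [1.0 * math.sin(2 * math.pi * f0 * k / fs) for k in range(1000)]
faulty = [1.6 * math.sin(2 * math.pi * f0 * k / fs) for k in range(1000)]

a_normal = amplitude_at(normal, f0, fs)
print(round(a_normal, 3))
print(round(symptom(faulty, f0, fs, a_normal), 3))
```

The estimate is exact here only because the record contains a whole number of periods; for arbitrary record lengths a window function or a longer record would be needed.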

2.4 Applying corrective actions: alternatives

Among the various alternatives available to eliminate these changes in the behavior of a given parameter, the one sought is the best from the technical and economic point of view and, most importantly, one that can be applied in real time, either in online or offline mode [15]. The choice depends on the severity of the failure, for example, whether it is an imminent failure with serious consequences or only the symptoms of an incipient failure.

This optimized solution is found by applying intelligent optimization techniques such as:

  • Expert systems;

  • Neural networks;

  • Fuzzy logic.

An expert system (ES) is capable of processing non-numerical information and presenting conclusions on a given subject, as long as it is properly guided and “fed”. Another common feature of expert systems is an uncertain-reasoning mechanism that allows uncertainty about domain knowledge to be represented. In other words, an ES employs human knowledge to solve problems that require the experience of one or more specialists. Applying expert systems requires a robust database in which the knowledge base (the expert knowledge needed to solve numerous problems) is stored.

2.4.1 Knowledge base

The knowledge base is a permanent but specific element of an expert system. This is where the information of an expert system is stored, that is, the facts and rules. Information stored in a particular domain makes the system an expert in that domain.

2.4.2 Blackboard

Expert systems exchange information through a mechanism called a blackboard. A blackboard is a place within computer memory where information stored in an expert system is “pinned” so that any other expert system can use it if it needs that information to achieve its goals.

The blackboard is a structure that contains information that can be examined by cooperative expert systems. What these systems do with this information depends on the application.

In addition, a blackboard (also called a scratchpad or working memory, that is, temporary memory) has a useful life limited to the course of a query and is linked to a specific query. It is an area of memory used to evaluate the rules retrieved from the knowledge base in order to arrive at a solution.

Information is recorded and erased in an inference process until the desired solution is reached.

2.4.3 Inference engine

The inference engine, or inference machine, is a permanent element that can even be reused by various expert systems. It is the part responsible for fetching the knowledge base rules to be evaluated and for directing the inference process [13]. Knowledge must be prepared for correct interpretation, and objects must be in a certain order, represented by a context tree.

Basically, the inference engine works in the following steps:

  • Select and search;

  • Evaluate and search.

Summarizing the above tasks, it can be said that the rules necessary to reach a goal must be sought in the knowledge base. These rules will be placed on the blackboard, and existing rules will only be evaluated after the most recent ones.

The evaluation order on the blackboard follows a stack-like structure to achieve the most recent goal. The rule will continue to be evaluated as long as the assumption conditions are true, otherwise the rule will be dropped, the set goal will be unstacked and a new rule will be loaded.

When a value of a parameter in a given context is not known and is not in the stack structures, one should then look for new information in the knowledge base, search for new rules, or ask the user directly.
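
The interplay between knowledge base, blackboard and inference engine described in Sections 2.4.1 to 2.4.3 can be sketched as a small backward-chaining loop in Python. This is only an illustration under stated assumptions: the rule contents, the names `KNOWLEDGE_BASE`, `infer` and `ask`, and the use of recursion in place of an explicit goal stack are hypothetical, and the rules are assumed to contain no cycles.

```python
# Knowledge base: permanent store of rules (the contents are hypothetical).
# Each rule is (conclusion, [premises that must all be true]).
KNOWLEDGE_BASE = [
    ("bearing_fault", ["high_vibration", "high_bearing_temperature"]),
    ("schedule_maintenance", ["bearing_fault"]),
]


def infer(goal, blackboard, ask):
    """Backward-chaining inference engine.

    blackboard: dict mapping parameter -> bool, the working memory of one query.
    ask: callback used when no rule concludes a parameter; it stands in for
         asking the user directly.
    The recursion plays the role of the goal stack: the most recent goal is
    resolved first, then control returns to the goal that required it.
    """
    if goal in blackboard:                      # value already known
        return blackboard[goal]
    rules = [prem for concl, prem in KNOWLEDGE_BASE if concl == goal]
    if not rules:
        value = ask(goal)                       # no rule available: ask the user
    else:
        value = False
        for premises in rules:
            # all() stops at the first false premise, so the rule is dropped
            # as soon as one of its conditions fails.
            if all(infer(p, blackboard, ask) for p in premises):
                value = True
                break
    blackboard[goal] = value                    # record the result for reuse
    return value


# Facts from the monitoring stage are pinned on a fresh blackboard per query.
blackboard = {"high_vibration": True, "high_bearing_temperature": True}
print(infer("schedule_maintenance", blackboard, ask=lambda name: False))
print(blackboard)
```

Keeping the blackboard as a per-query dictionary mirrors the text: the knowledge base is permanent, while the working memory is written and discarded during a single inference process.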

2.4.4 Neural networks

An artificial neural network is made up of several processing units whose operation is quite simple. These units are usually connected by communication channels, each associated with a certain weight. Units perform operations only on their local data, which is the input received through their connections. The intelligent behavior of an artificial neural network comes from the interactions between the network's processing units.

Neural networks allow the optimized selection of a particular solution alternative for a given event or change. In this process the neural network is used to evaluate the results of the expert system: the final solution should be the best of all those presented, and the neural network makes it possible to establish which one that is.
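
A minimal Python sketch of this idea follows: simple processing units compute a weighted sum of their local inputs, and a small network scores the alternatives proposed by the expert system so that the best one can be selected. The weights, the three features and the candidate actions are hypothetical, hand-picked values; a real network would learn its weights from operating data.

```python
import math


def unit(inputs, weights, bias):
    """One processing unit: weighted sum of its local inputs, then a sigmoid."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-s))


def score(features, hidden_layer, output_unit):
    """Forward pass through one hidden layer and a single output unit."""
    hidden = [unit(features, w, b) for w, b in hidden_layer]
    w_out, b_out = output_unit
    return unit(hidden, w_out, b_out)


# Hypothetical, hand-picked weights; not trained and not from the chapter.
HIDDEN = [([2.0, -1.0, -1.5], 0.0), ([-1.0, 2.0, -0.5], 0.0)]
OUTPUT = ([1.5, 1.0], -1.0)

# Candidate actions proposed by the expert system, each described by assumed
# features scaled to [0, 1]: (effectiveness, availability kept, cost).
candidates = {
    "online adjustment": (0.6, 0.9, 0.2),
    "scheduled repair": (0.8, 0.5, 0.5),
    "replace component": (0.95, 0.2, 0.9),
}
best = max(candidates, key=lambda name: score(candidates[name], HIDDEN, OUTPUT))
print(best)
```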

2.4.5 Fuzzy logic

Fuzzy logic is based on fuzzy set theory. Traditionally, a logical proposition has two extremes: either it is completely true or it is completely false. In fuzzy logic, however, a premise has a degree of truth ranging from 0 to 1, so that it can be partially true or partially false.

Fuzzy logic is the logic that supports modes of reasoning that are approximate rather than exact. Fuzzy systems modeling and control are techniques for rigorously handling qualitative information. Derived from the concept of fuzzy sets, fuzzy logic forms the basis for the development of process modeling and control methods and algorithms. It reduces the complexity of design and implementation, which makes it a solution to control problems that were hitherto intractable by classical techniques.

In classical and modern control theories, the first step in implementing process control is to derive the mathematical model that describes the process. The procedure requires knowing in detail the process to be controlled, which is not always feasible if the process is too complex. Existing control theories apply to a wide variety of systems where the process is well defined.

However, none of these techniques is capable of solving real problems whose mathematical modeling is impractical. For example, in many situations a considerable amount of essential information is known a priori only qualitatively. Similarly, performance criteria are available only in linguistic terms. This picture leads to imprecision and uncertainty that make it impossible to use most of the theories applied so far.

Fuzzy modeling and control theory assess how imprecision and uncertainty should be managed and, in so doing, become powerful enough to manipulate knowledge properly. This technology considers the relationship between inputs and outputs, aggregating various process and control parameters. This allows processes considered complex to be reconsidered, so that the resulting control systems provide a more accurate result as well as stable and robust performance. The sheer simplicity of implementing fuzzy control systems can reduce the complexity of a project to a point where previously intractable problems become solvable.
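
How qualitative, linguistic knowledge can be evaluated numerically is shown in the minimal Python sketch below. It uses triangular membership functions, min and max for the fuzzy AND and OR, and a weighted average to obtain a crisp result. The variable ranges, the two rules and the function names are assumptions chosen for illustration, not values from the chapter.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def severity(temperature, vibration):
    """Evaluate two linguistic rules and defuzzify by weighted average.

    Rule 1: IF temperature is high AND vibration is high THEN severity is 1.0
    Rule 2: IF temperature is normal OR vibration is low THEN severity is 0.0
    AND is taken as min and OR as max (a common choice, not the only one).
    """
    temp_normal = tri(temperature, 40.0, 60.0, 80.0)   # degrees Celsius (assumed)
    temp_high = tri(temperature, 60.0, 90.0, 120.0)
    vib_low = tri(vibration, -1.0, 0.0, 4.0)           # mm/s (assumed)
    vib_high = tri(vibration, 2.0, 6.0, 10.0)

    w1 = min(temp_high, vib_high)    # firing strength of rule 1
    w2 = max(temp_normal, vib_low)   # firing strength of rule 2
    if w1 + w2 == 0.0:
        return 0.0                   # no rule fires; treated as "no disturbance"
    return (w1 * 1.0 + w2 * 0.0) / (w1 + w2)


print(round(severity(75.0, 4.0), 3))
```

A reading of 75 °C and 4 mm/s is partly "normal" and partly "high", so the result lies between the two rule outputs instead of jumping from 0 to 1 at a fixed limit.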

2.5 Solution

The intelligent monitoring and supervision system decides autonomously, but its decision should indicate the best action to take to mitigate or eliminate a particular change. This intervention must take place in real time and in online mode. It should show the type of action that will be performed, the point where the intervention will be made, the components that will be affected and the intervention time. This solution can be given by the following procedures:

2.5.1 Maintenance

Depending on the situation, maintenance can be carried out in real time and in online mode, meaning that the maintenance team can make the necessary adjustments in the shortest possible time without shutting down the equipment or reducing its availability. The smart system should provide recommendations for making these corrections without compromising equipment operation. This is accomplished through the intervention of the expert system, which decides what type of maintenance to perform.

2.5.2 Predictive and proactive maintenance

Depending on the situation, this maintenance can be carried out in real time and in online mode, meaning that the maintenance team can make the necessary adjustments in the shortest possible time without shutting down the equipment or reducing its availability.

The smart system should provide recommendations for corrective action without compromising equipment operation. This is accomplished through expert system intervention.

2.5.3 Element or device replacement

The intelligent system must have the ability to make this decision, supported by technical and economic criteria (losses).

2.6 Innovation

The final solution will demonstrate the versatility, autonomy and efficiency of monitoring and supervision systems when they are structured and operated as intelligent systems. Procedural errors with these systems will be minimal, which ensures safety and accuracy.

2.7 Benefits

The benefits are the safe (high availability and reliability) and efficient operation of production systems, balanced and timely investments, and reduced operating and energy costs.

The following is a typical output (report) of the software developed to implement the methodology, in particular the part related to data evaluation, fault diagnosis, root cause determination, decisions on the actions to be performed and the execution of these actions [3, 4].

The following is an output of the software developed to implement the proposed intelligent system framework [1, 15]. All the tools presented in this chapter have been included:


3. Conclusions

Intelligent monitoring and control systems make it possible to minimize the risk of failure of production systems. This is the case with generation systems (taken as the reference for this project), where intelligent systems are widely used, especially in large, high-technology plants. They consequently increase reliability (a reduced failure rate per year) and availability, and they improve the quality of the energy supply by reducing the duration and frequency of power interruptions, by improving the DEC, FEC, DIC, FIC and DMIC indicators, and by enabling reliable management of load and distributed generation.

Centralization of information processing by intelligent monitoring systems will improve the operating efficiency of electrical systems, optimize maintenance processes within generation plants and consequently increase or maintain the estimated useful life of the generators, which benefits power utilities economically.

The transformation of the current monitoring and supervision systems of hydroelectric plants into intelligent systems effectively represents a technological advance over conventional monitoring systems. What defines the quality of the response of these systems, in terms of supervision and diagnosis, is the experience of those responsible for the analysis of failure modes.

The data management infrastructure established by the power utilities will determine how much benefit is extracted from the system and will be responsible for keeping it operating efficiently. The choice of the best strategy for data and information management will depend on the policies that companies adopt for handling them. The benefits of smart grid technology in monitoring systems will come when there is a functional data management policy within the utilities. An “excess of monitored parameters” should be avoided as much as possible. Prioritization criteria for “detectable” or “observable” failures must be considered. For each observable failure mode there will always be a form of detection with the most favorable sensitivity/installation cost ratio, which in principle should be chosen.

An action of great interest that should be considered in monitoring and supervision systems is the integration of auxiliary systems, so that their analysis and diagnosis are conducted together with those of the main systems, with minimal impact on installation cost. The probability that failures in auxiliary systems (ancillary services) cause forced stops of the equipment and the system is high and, in some situations, similar to that of the main systems.

Importantly, there is no single application or solution for intelligent systems or smart grids. Many of these functions will not become viable unless they coexist with others, and they should be implemented according to the needs of the utilities. Thus, individual functions, such as monitoring and fault detection in generators or feeder circuits, may not have their benefits evaluated separately.


References

  1. Selak L, Butala P, Sluga A. Condition monitoring and fault diagnostics for hydropower plants. Computers in Industry. 2014;65(6):924-936. ISSN 0166-3615
  2. Perez GA, Nelson K. Integration of distributed generation in power distribution networks and its structure as an intelligent generation system. In: 2015 IEEE PES Innovative Smart Grid Technologies Latin America (ISGT LATAM). Montevideo, Uruguay. pp. 134-139
  3. Working Group A1.11. Guide for On-Line Monitoring of Turbogenerators. CIGRE; 2010
  4. Working Group A1.10. Survey of Hydrogenerator Failures. CIGRE; 2009
  5. Wenye W, Yi X, Mohit K. A survey on the communication architectures in smart grid. Computer Networks. 2011;55(15):3604-3620
  6. Wei D, Lu Y, Jafari M, Skare P, Rhode K. An integrated security system of protecting Smart Grid against cyber attacks. In: Innovative Smart Grid Technologies (ISGT). Gaithersburg, MD, United States; 2010. pp. 1-7. INSPEC Accession Number: 11205470
  7. Xiang L, Wenye W, Jianfeng M. An empirical study of communication infrastructures towards the smart grid: Design, implementation and evaluation. IEEE Transactions on Smart Grid. 2013;4(1):170-183
  8. Isermann R. Fault Diagnosis Systems: An Introduction from Fault Detection to Fault Tolerance. Berlin, Heidelberg, New York: Springer; 2009. ISBN-10 3-540-24112-4. ISBN-13 978-3-540-24112-6. Library of Congress Control Number: 2005932861
  9. Ding SX. Model-Based Fault Diagnosis Techniques. Berlin, Heidelberg: Springer-Verlag; 2009. eBook ISBN: 978-3-540-76304-8. DOI: 10.1007/978-3-540-76304-8
  10. Helio PA Jr, Levy AFS, Carvalho AT. Estudo sobre a Influência dos Acopladores Capacitivos na Sensibilidade da Medição de Descargas Parciais em Máquinas Elétricas Rotativas. In: XX SNPTEE, Recife, Brasil. 2009
  11. Omori J. O projeto de Smart Grid da COPEL. In: Smart Grid Brazil Forum, São Paulo. 2010
  12. Piirainen J. Applications of horizontal communication in industrial power stations [Master of Science thesis]. Tampere University; 2010
  13. Xin Y, Baldine I, Chase J, Beyene T, Parkhurst B, Chakrabortty A. Virtual smart grid architecture and control framework. In: IEEE Conference Publications (SmartGridComm). IEEE; 2011. pp. 1-6
  14. Wenye W, Zhuo L. Cyber security in the smart grid: Survey and challenges. Computer Networks. 2013;57(5):1344-1371
  15. Guimaraes PHV, Murilo A, Andreoni M, Mattos DMF, Ferraz LHG, Pinto FAV, et al. Comunicação em redes elétricas inteligentes: Eficiência, Confiabilidade, Segurança e Escalabilidade. In: Minicursos do Simpósio brasileiro de redes de computadores, SBRC-2013, Brasília, DF, Brazil. 2013. pp. 101-164
