
From the book "Big Data on Real-World Applications", edited by Sebastian Ventura Soto, José M. Luna and Alberto Cano, ISBN 978-953-51-2490-0, Print ISBN 978-953-51-2489-4. Published: July 20, 2016 under CC BY 3.0 license. © The Author(s).

Chapter 1

Novel Rule Base Development from IED-Resident Big Data for Protective Relay Analysis Expert System

By Mohammad Lutfi Othman, Ishak Aris and Thammaiah Ananthapadmanabha
DOI: 10.5772/63756






Many Expert Systems for analyzing the performance of intelligent electronic devices (IEDs), such as protective relays, have been developed to ascertain correct operation, maximize availability, and minimize the risk of misoperation. However, manual handling of the overwhelming volume of relay-resident big data, together with heavy dependence on protection experts’ differing knowledge and on voluminous relay manuals, has hindered the maintenance of these Expert Systems. This chapter therefore studies the design of an Expert System called Protective Relay Analysis System (PRAY), which is embedded with a rule base construction module. The module provides the facility for intelligently maintaining the knowledge base of PRAY through the prior discovery of relay operation (association) rules using a novel integrated data mining approach that combines Rough-Set-Genetic-Algorithm-based rule discovery with a Rule Quality Measure. PRAY runs its relay analysis by first validating whether a protective relay under test operates as expected, by comparing hypothesized and actual relay behavior. In the case of relay maloperation or misoperation, it diagnoses the presented symptoms by identifying their causes. This study illustrates how, with the prior hybrid-data-mining-based knowledge base maintenance of an Expert System, the regular and rigorous analyses of protective relay performance carried out by power utilities can be conveniently achieved.

Keywords: association rule, data mining, digital protective relay, expert system, power system protection analysis, rough set theory

1. Introduction

According to the IEEE Working Group D10 of the Line Protection Subcommittee, Power System Relaying Committee, Expert Systems have been proposed since the early 1980s as potential tools for engineers to develop intelligent performance analysis systems for intelligent electronic devices (IEDs) such as protective relays [1]. Protection performance analyses have been reported mainly for offline tasks such as settings coordination, postfault analysis, and fault diagnosis [2–13].

Kezunovic et al. [6] describe automated substation fault analysis using an Expert System based on disturbance data acquired by digital fault recorders (DFRs). This fault analysis helps protection engineers determine whether a protective relay has operated correctly. Figure 1 illustrates the block diagram of the Expert System. The knowledge base consists of rules written in CLIPS (an Expert System shell), which the forward chaining inference engine applies to the processed data. The rules are built by interviewing experts, by an empirical approach based on Electromagnetic Transient Program (EMTP) simulation, and by utilizing actual big field substation data.


Figure 1.

The Expert System block diagram [6].

Luo and Kezunovic’s [10] implementation of the Expert System in automated protection analysis is tailored more specifically to the detailed analysis of a specific protective relay, relying only on the big data recorded within it. Figure 2 illustrates the block diagram of the analysis system, which is built on the CLIPS language within a Visual C++ framework. The analysis system revolves around the strategy of comparing predicted (hypothesized) and actual (factual) protection operation in terms of the statuses and corresponding timings of logic operands. A match between the predicted and actual operation validates the correctness of the actual status and timing of that operand. Otherwise, a misoperation is identified, and diagnosis is initiated to trace the reasons. The predicted statuses and timings of active logic operands are essentially a hypothesis of relay operations, derived by forward chaining reasoning. They form the knowledge base in the rules used by the CLIPS inference engine.


Figure 2.

The Expert System block diagram for validation and diagnosis of protective relay [10].


Figure 3.

Structure of Expert System for protection coordination [13].

Tuitemwong and Premrudeepreechacharn [13] implement Expert System (ES) analysis to improve the protection coordination settings of protective devices in a distribution system in the presence of distributed generators (DG). By selecting suitable protection coordination settings, this analysis system determines the correct protection system performance in a DG-present power distribution system. The proposed structure of the ES is shown in Figure 3. The inference engine uses coordination rules and selection rules to generate satisfactory coordination settings based on the processed equipment data, circuit data, protection data, and DG data in the knowledge base. In the case of conflicting settings, users can make their own decision. The rules are set for the specific distribution system protection and may be changed when necessary.

The common problem with the aforementioned implementations of rule-based Expert Systems in protection system analysis is the difficulty of upgrading the knowledge base, which is made up of “if-then” rules used by the decision-making inference engine. Upgrading by expansion and refinement is necessary to adapt the Expert System to continuously changing power network topologies, protection strategies, and the multiplicity of protective relay functions [14]. However, acquiring knowledge of relay operation characteristics for upgrading the knowledge base has not been an easy task due to

  1. the burdensome manual handling of voluminous protective relay stored data and

  2. the heavy dependence on the protection experts’ differing knowledge and on voluminous relay manuals.

It would be beneficial to formulate a novel technique that relieves the effort needed to acquire knowledge when building and maintaining the knowledge base. Such a technique should allow the knowledge base to be adjusted by training a protective relay device on as many disturbances as possible in order to produce a complete inventory of rules. To help realize this, the authors’ previous work on an integrated data mining approach under the Knowledge Discovery in Database (KDD) framework serves as the prior step before the Expert System knowledge base upgrading strategy is performed [15–17].

2. Integrated data mining approach to hypothesize expected relay behavior from recorded relay event report

Under the KDD framework, Othman et al. [15–17] investigate the implementation of a novel integrated data mining approach under supervised learning in order to discover the knowledge of (i.e., to “hypothesize”) the expected relay behavior. The knowledge extracted from the large resident event reports of a digital distance protective relay comes in the form of association rules, as shown in Figure 4. The integrated data mining approach adopts the following computational intelligence methods:

  1. Rough set theory: Used to select the minimal subsets (i.e., reduction) of attributes while maintaining the original syntax of the relay’s big event report data.

  2. Genetic algorithm: Used to explore the optimal sets of the above subsets of reduced attributes from which simple yet accurate prediction rules (i.e., decision algorithm) can be constructed.

  3. Rule quality measure: Used to extract the pertinent association rule from the above population of prediction rules in order to determine the tripping logic of the relay upon fault detection. This is what is referred to as the hypothesization of protective relay operation. This final version of knowledge representation shall be the main constituent of the Expert System knowledge base.


Figure 4.

Data mining analysis steps in hypothesizing distance relay operation characteristics from big relay event data.

In the study, the large event report is a PSCAD-simulated raw operation recording of an AREVA-modeled distance protective relay as shown in Table 1 (only a portion of time events is shown to reduce page usage). This big data, which is prior to data preparation, is a representation of the relay’s decision system (DS) for zone 1 A–G fault—the so-called predata-preparation DS [18].


Table 1.

Predata-preparation of distance protective relay’s decision system for zone 1 A-G fault (only a portion of attribute columns (from a total of 108) and time events are shown to reduce page usage).

The decision system is an information table of the event report that can be considered as a pair $(U, A)$ of finite, nonempty sets. $U$ is the universe of objects (i.e., time-tagged relay events $t_n$, hence the name event report) and $A$ is the set of attributes {e.g., ir, irp, vam, iam, ibm, icm, CB52a_B, CB52b_B, VTmcb_B, CRZ4, pg_Z3PkUp, pg_Z4PkUp, pp_Z1PkUp, pp_Z2PkUp, AGflt, c50_Z1, b50_Z3, Dist_ab_Z2, pg_TrpZ1f, TrpBOPZ1, WI_CRTrp, Trip_PhA, etc.}. Each attribute $a \in A$ defines an information function $f_a: U \to V_a$, where $V_a$ is the set of values of the attribute $a$, called the domain of $a$. For instance, the set of values of the attribute pg_Z1PkUp (the “zone 1 ground distance pick-up” element) is expressed as pg_Z1PkUp: $U \to \{0, 1\}$, which defines the relay element’s active states according to the presence of a ground fault in the protected section of the transmission line (i.e., no fault present or zone 1 ground fault present).


Table 2.

The predata-mining DS of distance protective relay subjected to zone 1 A-G fault.

Here, $A = C \cup D$ is a nonempty finite union of the sets of condition and decision attributes (condition attributes $c_i \in C$ represent the multifunctional protective elements and analog measurands, while the decision attribute $d_i \in D$ represents the relay’s trip output).
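As a minimal illustration of the pair $(U, A)$ with $A = C \cup D$, the sketch below stores a handful of relay events as a small table and exposes the information function $f_a$ and the domain $V_a$. The attribute names are taken from the list above, but the event values and the helper names `f` and `domain` are invented for illustration only and are not part of the authors' implementation.

```python
# Toy decision system DS = (U, A); all event values below are invented.
CONDITION_ATTRS = ["pg_Z1PkUp", "AGflt", "c50_Z1"]  # C: protective elements
DECISION_ATTRS = ["Trip_PhA"]                        # D: relay trip output

# U: time-tagged relay events t_n, each mapping attribute name -> value
U = {
    "t1": {"pg_Z1PkUp": 0, "AGflt": 0, "c50_Z1": 0, "Trip_PhA": 0},
    "t2": {"pg_Z1PkUp": 1, "AGflt": 1, "c50_Z1": 0, "Trip_PhA": 0},
    "t3": {"pg_Z1PkUp": 1, "AGflt": 1, "c50_Z1": 1, "Trip_PhA": 1},
}

A = CONDITION_ATTRS + DECISION_ATTRS  # A is the union of C and D


def f(a, t):
    """Information function f_a: U -> V_a, the value of attribute a at event t."""
    return U[t][a]


def domain(a):
    """V_a: the set of values attribute a takes over the universe U."""
    return {f(a, t) for t in U}
```

For this toy table, `domain("pg_Z1PkUp")` gives the two active states of the zone 1 ground distance pick-up element described in the text.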

The sheer volume of this data makes the manual extraction of relay operation characteristics for Expert System development laborious. Thus, the aforementioned novel integrated data mining strategy is necessary to address this issue.

The resulting prepared decision table (after data selection, preprocessing, and transformation) of the distance protective relay's decision system is shown in Table 2. It is also called postdata-preparation DS or predata-mining DS. “.” denotes data patterns that are similar to events immediately before and after them. Thus, they are not presented in order to reduce the table dimension. It is noticeable that the number of attributes has been substantially reduced by the data preparation strategy to merely 46 from the original 108 in the large raw event report.

The important analysis steps in the framework of Rough Set based data mining for deriving the distance relay decision algorithm from its event database are illustrated in Figure 4 and discussed below.

The first step is the computation of reducts, a process of reducing the number of attributes while still maintaining the original data syntax. Within this step the following substeps are executed:

  1. Computation of the D-discernibility matrix of C (denoted as $M_C(D)$). An element of $M_C(D)$ is defined as the set of all condition attributes that discern events $t_i$ and $t_j$ which do not belong to the same equivalence class of the relation U|IND(D).

  2. Subsequent derivation of the discernibility function $f_C(D)$ in Conjunctive Normal Form (CNF) (also called POS form in Boolean algebra) from $M_C(D)$. The CNF is reduced to its final form by applying the absorption law and omitting duplicate disjunctive terms (sums), without carrying out the multiplication among the disjunctive terms of the final CNF.

  3. In an empirical database such as this relay event data, the calculation needed to arrive at the final Disjunctive Normal Form (DNF), from which the eventual reducts are found, is extremely computationally intensive. (The DNF is obtained if the multiplication among the disjunctive terms of the final CNF is performed.) The generation of reducts is in this case an NP-hard problem [19]. Thus, a Genetic Algorithm is adopted to compute approximations of reducts by finding approximate minimal hitting sets (analogous to reducts) of the sets corresponding to the discernibility function [20, 21].
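The three substeps above can be sketched on a toy table as follows. This is only an illustration under stated assumptions: the four events and the attributes `a`, `b`, `c`, `d` are invented, and the Genetic Algorithm of the source is deliberately replaced by an exhaustive search for minimal hitting sets, which is feasible only because the table is tiny.

```python
from itertools import combinations

# Toy decision table (invented values): condition attributes a, b, c; decision d.
EVENTS = {
    "t1": {"a": 0, "b": 0, "c": 0, "d": 0},
    "t2": {"a": 1, "b": 0, "c": 1, "d": 1},
    "t3": {"a": 1, "b": 1, "c": 0, "d": 0},
    "t4": {"a": 0, "b": 1, "c": 1, "d": 1},
}
C = ["a", "b", "c"]
D = "d"


def discernibility_matrix(events, cond, dec):
    """Entry (ti, tj): condition attributes discerning two events with different decisions."""
    matrix = {}
    for ti, tj in combinations(sorted(events), 2):
        if events[ti][dec] != events[tj][dec]:
            matrix[(ti, tj)] = frozenset(
                a for a in cond if events[ti][a] != events[tj][a]
            )
    return matrix


def cnf_clauses(matrix):
    """Discernibility function in CNF: one sum (clause) per non-empty entry,
    duplicates dropped and supersets removed by the absorption law."""
    clauses = {entry for entry in matrix.values() if entry}
    return {c for c in clauses if not any(other < c for other in clauses)}


def minimal_hitting_sets(clauses, cond):
    """Exhaustive stand-in for the GA: minimal attribute sets intersecting every clause."""
    found = []
    for size in range(1, len(cond) + 1):
        for subset in combinations(cond, size):
            s = set(subset)
            if all(s & clause for clause in clauses) and not any(h <= s for h in found):
                found.append(s)
    return found


clauses = cnf_clauses(discernibility_matrix(EVENTS, C, D))
reducts = minimal_hitting_sets(clauses, C)
```

Here the CNF is (a + c)(b + c), and multiplying it out gives the DNF c + ab, so the search returns the two reducts {c} and {a, b}.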

Next, prediction rules (denoted as $C \xrightarrow{\text{pred}} D$) are generated, with the reducts discovered above serving as the templates from which the prediction rules are created. This is principally done by superimposing each reduct in the reduct set over the original decision table DS and then reading off the domain values of the condition and decision attributes. The resulting logical patterns (denoted as media/eq3.png) that relate descriptions of conditions to decision classes shall have the representation shown in Eq. (1):


These prediction rules that are an exact representation of the characteristics of the relay decision system (table) DS can be described as the relay decision algorithm and can be designated as ALG(DS), i.e.,


where $(C \xrightarrow{\text{pred}} D)_t$ is the set of minimal prediction rules $C \xrightarrow{\text{pred}} D$ for an event $t \in U$, i.e.,


This ALG(DS) can be evaluated for its accuracy as follows:

  1. The entire original relay data set DS is partitioned into training and test sets using k-fold cross validation technique.

  2. The classification performance of the relay decision algorithm is estimated by rule firing and voting strategies.
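The evaluation procedure above can be sketched in a few lines. The sketch assumes a toy table with invented values, an assumed reduct `["pg_PkUp"]`, a simple interleaved fold split, and majority voting among the events supporting a fired rule; the actual partitioning and voting schemes used by the authors are not specified here.

```python
from collections import Counter, defaultdict

# Invented toy events: (condition values, decision); not taken from the relay report.
EVENTS = [
    ({"pg_PkUp": 0, "Q50": 0}, 0),
    ({"pg_PkUp": 0, "Q50": 1}, 0),
    ({"pg_PkUp": 1, "Q50": 1}, 1),
    ({"pg_PkUp": 1, "Q50": 0}, 1),
    ({"pg_PkUp": 0, "Q50": 0}, 0),
    ({"pg_PkUp": 1, "Q50": 1}, 1),
]
REDUCT = ["pg_PkUp"]  # assumed reduct for this toy table


def generate_rules(train, reduct):
    """Superimpose the reduct on the table and read off antecedent -> decision counts."""
    rules = defaultdict(Counter)
    for cond, dec in train:
        antecedent = tuple((a, cond[a]) for a in reduct)
        rules[antecedent][dec] += 1
    return rules


def classify(rules, cond, reduct):
    """Fire the matching rule and let its supporting events vote on the decision."""
    antecedent = tuple((a, cond[a]) for a in reduct)
    votes = rules.get(antecedent)
    return votes.most_common(1)[0][0] if votes else None


def k_fold_accuracy(events, reduct, k):
    """Average accuracy over k folds; fold i holds out every k-th event starting at i."""
    scores = []
    for i in range(k):
        test = events[i::k]
        train = [e for j, e in enumerate(events) if j % k != i]
        rules = generate_rules(train, reduct)
        hits = sum(classify(rules, cond, reduct) == dec for cond, dec in test)
        scores.append(hits / len(test))
    return sum(scores) / k


accuracy = k_fold_accuracy(EVENTS, REDUCT, k=3)
```

Because the decision in this toy table follows `pg_PkUp` exactly, every held-out event is classified correctly.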

The discovered ALG(DS) has been evaluated and verified by Othman et al. [15–17] to be capable of predicting and discriminating future relay events having unknown trip state in unsupervised learning. This evaluation is necessary before the eventual deduction of the relay association rule is allowed to take place.

Finally, postpruning (or filtering) is performed on the generated prediction rules $(C \xrightarrow{\text{pred}} D)$ so as to discover relay association rules (denoted as $C \xrightarrow{\text{assoc}} D$). These pertinent association rules essentially characterize the tripping decision logic of the protective relay upon fault detection. This has been referred to at the outset as the hypothesization of protective relay operation. This final version of knowledge representation shall be the main constituent of the Expert System knowledge base.

Because the population of prediction rules to be filtered is very large, it is difficult to determine manually which rules are more useful, interesting, or important. Therefore, a measure of rule quality called the G2 Likelihood Ratio Statistic, as well as a measure of rule interestingness, is used to select the most appropriate relay association rules and filter away the unwanted ones.
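To give a feel for how such a quality measure ranks rules, the sketch below computes the standard likelihood ratio statistic $G^2 = 2 \sum O \ln(O/E)$ on the 2x2 contingency table of a rule (antecedent matched or not, decision class matched or not). This is the textbook form of the statistic; the exact variant and normalization used for rule filtering in ROSETTA may differ, and the function name and cell counts are assumptions for illustration.

```python
import math


def g2_statistic(a, b, c, d):
    """Likelihood ratio statistic G2 = 2 * sum(O * ln(O / E)) for a 2x2 table.

    a: events matching the rule antecedent and its decision class
    b: events matching the antecedent but not the decision class
    c: events not matching the antecedent but in the decision class
    d: events matching neither
    """
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row = [a + b, c + d]
    col = [a + c, b + d]
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing (limit of x ln x)
                e = row[i] * col[j] / n  # expected count under independence
                g2 += o * math.log(o / e)
    return 2.0 * g2
```

A rule whose antecedent is independent of the decision class scores zero, while a rule that separates the classes perfectly scores highest, so sorting rules by this value pushes the most informative ones to the top.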

As mentioned above, these finally discovered relay association rules essentially describe the logical pattern of the correlating descriptions of conditions (i.e., C, the attribute set for various multifunctional protection elements) and the decision class (i.e., D, the attribute for trip assertion status). Thus, the symbol $C \to D$ is used to illustrate the C-D association, and the rule is accordingly labeled the “CD-association rule”.

The final CD-association rule for one such fault condition, the zone 1 A–G fault, is shown in Eq. (4). Different fault conditions would yield correspondingly different association rules to describe the relay’s behavior.

IF Zag(123) AND CB52_A(closed) AND pg_PkUp(123) AND FltType(AGflt) AND pp50_Z3(A) AND pp50_Z4(A) AND p50_Z1(A) AND p50_Z3(A) AND r50(1234) AND Q32(Fwd) AND Zload(0) AND Q50(1234) AND Dist_ag(123) AND pg_Trp(1) THEN Trip(A)    (4)

It is important to note that Eq. (4) defines the necessary triggering of the required relay multifunctional protective elements (antecedent) in order to recognize the zone 1 phase-A-to-ground fault and consequently assert the trip signal (consequent) to open pole A of the circuit breaker concerned. This is what the protection engineers would like to know in understanding the domain of the distance relay in responding to the fault.

Thus, it is necessary to verify that this rule correctly interprets the behavior of the distance relay subjected to the zone 1 A–G fault, as represented by the predata-mining DS in Table 2. Out of all the relay events in the entire relay event report, events $t_{90}$ and $t_{91}$, identified as the fault detection and trip signal assertion instances, respectively, are used as the cross reference to verify the exactness of the above-mentioned rationalized CD-association rule. In Table 2, the rule is seen to be an exact interpretation of the relay events $t_{90}$ and $t_{91}$. Thus, the discovered rationalized CD-association rule is verified.

The eventually discovered $(C \xrightarrow{\text{assoc}} D)$, and thus the desired hypothesis, has been proven to be an exact manifestation of the relay operation characteristics hidden in the event report [15–17]. The intelligent data mining framework provides a convenient facility for exhaustively discovering the available knowledge of relay behavior from big event data recorded under as many fault contingencies as possible. Ultimately, a complete rule base for the inference execution of an Expert System for relay operation analysis can be developed. This is the motivation for developing an Expert System called Protective Relay Analysis System (PRAY), which provides a platform for gathering previously discovered rules for its knowledge base construction.

3. Developing protective relay analysis system (PRAY) expert system

The concept of protective relay performance analysis follows the convention that, in any analysis, known or correct events must first be hypothesized (expected operations are assumed); an analysis is then performed to confirm (validate) or refute the hypothesis by matching the expected operations against the actual operations of the device under test [22]. If it is determined that the protective relay operation was incorrect, a diagnosis of the cause must be performed [8]. This fundamental concept forms the very basis of developing PRAY for distance protection.

PRAY is developed as an application tool under LabVIEW framework from National Instruments [23]. The main components of PRAY are as shown in Figure 5 and described as follows:


Figure 5.

Architecture of Protective Relay Analysis System (PRAY).

  1. Construction of a rule base for PRAY’s inference engine by collating as an array all relay CD-association rules discovered from the KDD processes performed on trained relay. All attributes of each rule in the rule base shall be time tagged and arranged in a chronological order so that validation and diagnosis of the analyzed relay’s operations can be presented in an apparent operations logical sequence.

  2. Construction of phase and ground distance impedance channels (attributes) and fault-type channel. Using these channels, further identification processes of fault type, faulted zone, and distance to fault are executed and later used in singling out the most suitable relay CD-association rule from the rule base.

  3. Inferring, from the rule base according to both impending fault type and zone of pick-up, an expected relay CD-association rule to be best chosen as a hypothesis for the prediction of operations logic of the relay under analysis.

  4. Validation of occurrence of protective element pick-ups and their correctness of operations against hypothesis of the selected relay CD-association rule.

  5. Identification of the symptoms of relay element misoperation, their diagnosis, and the suggestion of possible solutions.

  6. Graphical plots of ground and phase impedance locus against respective ground and phase distance quadrilateral characteristics. The distance characteristics are constructed based on parameter settings taken from the relay under analysis. Instantaneous filtered voltages and currents and logic operands are also plotted.

3.1. PRAY inputs

The different inputs needed by PRAY for its analysis functions are as follows:

  1. Relay CD-association rules: These rules, saved in plain text format by the KDD process, are collated via a graphical user interface (GUI) dialog input. The user is prompted to import a sufficient number of rules. The collated rules are converted into an array to form the rule base for the Expert System inference engine. Each rule input is an outcome of KDD after the Rough-Set-and-Genetic-Algorithm-based data mining and the Rule Quality Measure (G2 Likelihood Ratio Statistic) in ROSETTA [24]. In its untreated form, each rule input consists of a number of sub-CD-association rules. These subrules are rationalized into a single CD form by taking their conjunction and simplifying the resulting Boolean function with the law of absorption.

  2. Analyzed relay event reports in the form of raw and prepared decision systems (relay DSs): The raw relay DS is data converted from the relay-resident IEEE COMTRADE format to the DIAdem native format (.tdm), which is needed for processing in LabVIEW [25]. The prepared relay DS is the file that results from applying the same data preparation process as that used in the KDD for the trained relay. This prepared relay DS, also in DIAdem format (.tdm), has the same data structure as that used in the KDD; it is thus ready for Rough Set data mining, although no data mining is executed on it for the Expert System analysis. Having the same data structure is important so that the prepared DS of the relay under analysis can be correctly cross validated against a CD-association rule chosen from the PRAY rule base.

  3. Protection parameter settings: Embedded in the same tdm file as a “channel group” separate from the raw relay DS’s channel group. The relay settings are originally recorded by the relay under analysis as a number of COMTRADE files. Since they are in the same file as the raw relay DS, they are also converted by DIAdem into tdm format.

  4. Performance specifications: The user has the option to key in values for these parameters. For simplicity of analysis, the TNB specifications for relay tripping time according to the various zones of protection are included as default values that require no user input. (TNB stands for Tenaga Nasional Berhad, a major Malaysian utility.)
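The collation of plain-text rules into a rule base array (input 1 above) might look like the following sketch. PRAY itself is built in LabVIEW, so this Python version is only an analogy. It assumes that each rule has already been rationalized into the one-line `IF ... AND ... THEN ...` form of Eq. (4) and that comment lines start with `%`; both the file syntax and the function names are assumptions rather than the documented ROSETTA export format.

```python
import re

# One term of a rule: element name followed by its value in parentheses.
TERM = re.compile(r"(\w+)\(([^()]*)\)")


def parse_rule(text):
    """Split 'IF x(v) AND y(w) THEN z(u)' into antecedent and consequent (name, value) pairs."""
    body = text.strip()
    upper = body.upper()
    if not upper.startswith("IF ") or " THEN " not in upper:
        raise ValueError("not an IF ... THEN ... rule: " + text)
    split_at = upper.index(" THEN ")
    antecedent = TERM.findall(body[3:split_at])
    consequent = TERM.findall(body[split_at + 6:])
    return {"if": antecedent, "then": consequent}


def build_rule_base(rule_texts):
    """Collate parsed rules into a list, skipping blank lines and '%' comment lines."""
    return [
        parse_rule(line)
        for line in rule_texts
        if line.strip() and not line.lstrip().startswith("%")
    ]


rule_base = build_rule_base([
    "% hypothetical export, shortened from Eq. (4)",
    "IF Zag(123) AND CB52_A(closed) AND FltType(AGflt) THEN Trip(A)",
])
```

Each entry of `rule_base` keeps the antecedents in their written order, which leaves room for attaching time tags to them later.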

3.2. PRAY reasoning strategy for validation and diagnosis

The reasoning for validation and diagnosis in relay operation analysis starts with the identification of fault type, faulted zone, and distance to fault by PRAY itself. The fault type and the picked-up faulted zone are then used to compute the index into the rule base array of the subarray containing the appropriate relay CD-association rule for the relay under analysis. This chosen rule acts as the hypothesis of the anticipated operations of individual protective elements in the relay under analysis when a particular fault has occurred. All the antecedents and the consequent in the rule have been arranged in sequential order during the rule base construction according to the time instances tagged alongside them. Time tagging is important so that validation and diagnosis of relay operations can be executed according to the logical sequence stipulated by the hypothesis. This logical sequence is in fact indicative of the relay operations logic. The following is a fictitious example of a relay operation hypothesis based on a chosen relay CD-association rule:

  • 0.000 CB52_B(closed) Q32(Fwd)

  • 0.096 p50_Z1(B)

  • 0.097 FltType(BGflt)

  • 0.100 Q50(1234) r50(1234)

  • 0.104 Zload(0)

  • 0.107 Dist_bg(123) Zbg(123) pg_PkUp(123) pg_Trp(1)

  • 0.108 Trip(B)

The consequent Trip(B) is associated with antecedents occurring beforehand. Any protective elements (antecedents) on the same row having the same time tagging indicate that they pick up (or stay in certain states) in concurrence. Expectedly, the last row having the highest tagged time must be the consequent (decision attribute) Trip(B).
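The selection and chronological ordering of a hypothesis can be sketched as below, using the fictitious example above. The dictionary keyed by (fault type, zone) is a hypothetical stand-in for PRAY's array indexing, and the function names are invented for illustration.

```python
from itertools import groupby

# The fictitious hypothesis from the text as (time tag in s, element) pairs,
# deliberately listed out of order to show the sorting step.
HYPOTHESIS = [
    (0.108, "Trip(B)"),
    (0.000, "CB52_B(closed)"), (0.000, "Q32(Fwd)"),
    (0.096, "p50_Z1(B)"),
    (0.097, "FltType(BGflt)"),
    (0.100, "Q50(1234)"), (0.100, "r50(1234)"),
    (0.104, "Zload(0)"),
    (0.107, "Dist_bg(123)"), (0.107, "Zbg(123)"),
    (0.107, "pg_PkUp(123)"), (0.107, "pg_Trp(1)"),
]

# Hypothetical rule base keyed by (fault type, zone); PRAY itself uses an array index.
RULE_BASE = {("BG", 1): HYPOTHESIS}


def select_hypothesis(rule_base, fault_type, zone):
    """Pick the rule for the identified fault type and picked-up zone."""
    return rule_base[(fault_type, zone)]


def chronological_rows(hypothesis):
    """Group elements sharing a time tag into rows, ordered by increasing time."""
    ordered = sorted(hypothesis, key=lambda pair: pair[0])
    return [
        (t, [name for _, name in group])
        for t, group in groupby(ordered, key=lambda pair: pair[0])
    ]


rows = chronological_rows(select_hypothesis(RULE_BASE, "BG", 1))
consequent_time, consequent = rows[-1]
```

The last row carries the highest time tag and holds only the consequent, matching the expectation stated in the text.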

The validation strategy of the operations of the analyzed relay starts by iterating through all antecedents in the hypothesis and comparing each one with that of the corresponding attribute of the prepared DS of the relay under analysis. Matched values result in messages describing the correctness of operations of the respective protective elements. On the other hand, any differences in the cross matches (either due to wrong pick-up values or nonassertion of the respective protective elements) will produce messages describing the relay’s failed elements. The result of the validation is presented starting from the consequent (decision attribute, “Trip”) at the top followed by antecedents arranged in descending sequence according to the order of the time tags in the hypothesis.
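A minimal sketch of this validation loop is given below. The hypothesis is a shortened form of the fictitious example, the actual states are invented to mimic a relay that failed to trip, and the message wording is hypothetical rather than PRAY's own output.

```python
# Hypothesis rows (time tag in s, expected element states) in chronological order;
# a shortened form of the fictitious example in the text.
HYPOTHESIS = [
    (0.096, {"p50_Z1": "B"}),
    (0.100, {"Q50": "1234", "r50": "1234"}),
    (0.108, {"Trip": "B"}),
]

# Invented states read from the prepared DS of the relay under analysis;
# None stands for an element that never asserted.
ACTUAL = {"p50_Z1": None, "Q50": "1234", "r50": "1234", "Trip": None}


def validate(hypothesis, actual):
    """Compare each expected state with the actual one, consequent first."""
    messages = []
    for time_tag, expected in sorted(hypothesis, key=lambda row: row[0], reverse=True):
        for name, value in expected.items():
            got = actual.get(name)
            if got == value:
                messages.append(f"{time_tag:.3f} {name}({value}): correct")
            elif got is None:
                messages.append(f"{time_tag:.3f} {name}({value}): FAILED, not asserted")
            else:
                messages.append(f"{time_tag:.3f} {name}({value}): FAILED, got {got}")
    return messages


report = validate(HYPOTHESIS, ACTUAL)
```

The report starts with the consequent and then lists the antecedents in descending order of their time tags, as described above.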

Diagnosis is carried out on failed, inoperative, or misoperative protective elements. To view the cause and effect of events, a hierarchical tree is constructed from the hypothesis, in which the nodes are hierarchically time sequenced, increasing in time from the downstream nodes toward the root node. The root node (topmost) is the consequent of all the downstream antecedent nodes. Antecedents at the same level (i.e., having the same indentation) are concurrent in time. For the above-mentioned hypothesis, the diagnosis follows this hierarchy:


Trip(B)
  - Dist_bg(123)
  - Zbg(123)
  - pg_PkUp(123)
  - pg_Trp(1)
    - Zload(0)
      - Q50(1234)
      - r50(1234)
        - FltType(BGflt)
          - p50_Z1(B)
            - CB52_B(closed)
            - Q32(Fwd)
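The hierarchy can be derived mechanically from the time tags, as the following sketch suggests. It assumes the consequent is the root at depth 0 and that each earlier time group sits one level deeper; the two-space indentation and the function name are illustrative choices only.

```python
from itertools import groupby

# The fictitious hypothesis from the text: (time tag in s, element).
HYPOTHESIS = [
    (0.000, "CB52_B(closed)"), (0.000, "Q32(Fwd)"),
    (0.096, "p50_Z1(B)"),
    (0.097, "FltType(BGflt)"),
    (0.100, "Q50(1234)"), (0.100, "r50(1234)"),
    (0.104, "Zload(0)"),
    (0.107, "Dist_bg(123)"), (0.107, "Zbg(123)"),
    (0.107, "pg_PkUp(123)"), (0.107, "pg_Trp(1)"),
    (0.108, "Trip(B)"),
]


def diagnosis_tree(hypothesis):
    """Return indented lines: consequent at the root, earlier time groups nested deeper."""
    ordered = sorted(hypothesis, key=lambda pair: pair[0], reverse=True)
    lines = []
    for depth, (_, group) in enumerate(groupby(ordered, key=lambda pair: pair[0])):
        for _, name in group:
            lines.append("  " * depth + ("- " if depth else "") + name)
    return lines


tree = diagnosis_tree(HYPOTHESIS)
print("\n".join(tree))
```

Elements that share a time tag end up at the same indentation, so concurrent pick-ups appear as siblings in the printed tree.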

4. PRAY analysis system results

In the rule base construction of PRAY, each of the imported CD-association rules, prior to being rationalized using the concept of Boolean function manipulation by applying the law of absorption, would be formatted by ROSETTA into a text file. When imported into PRAY, the file will be cleared of all unnecessary data such as comments and rule interestingness numerical measures leaving only the required relay CD-association rules for subsequent rationalization.

Figure 6 illustrates the GUI for the constructed rule base. The size of the rule base and the selected subarray (0-indexed) of the collated rule base array are shown. The size of the rule base reflects the number of fault contingencies on which the relay has been trained.


Figure 6.

GUI for constructed rule base.

Figure 7 illustrates the GUI for the analysis of a distance protective relay operation under a zone 1 A–G fault. Using the data in the relay’s raw tdm file, PRAY discovered that an A–G fault had indeed occurred in zone 1 of the relay under analysis, at approximately 39 km from its location in the substation. From this information, an appropriate relay CD-association rule was chosen and displayed in the GUI. This rule is used to analyze whether the relay under analysis took the appropriate measures to clear the fault. The Validation field displays the correctness of the actions taken by the relay, obtained by cross matching the operations of individual protective elements anticipated by the rule with the corresponding attributes from the preprocessed tdm file of the relay under analysis. The consequent “Trip” is validated to have correctly sent a pole A trip signal to the circuit breaker. This is followed by the correct antecedent statuses arranged in descending sequence according to the hypothesis. The relay tripping time of 1.2 ms is compliant with the TNB requirement of 25 ms for zone 1 operation. The circuit breaker operating time and fault clearance time are also displayed in the GUI.
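The compliance check on the tripping time amounts to a simple comparison, sketched below. Only the 25 ms zone 1 limit comes from the text; defining the tripping time as the interval between fault detection and trip assertion, as well as the function and variable names, are assumptions made for illustration.

```python
# Maximum relay tripping time per zone in ms; only the zone 1 figure (25 ms)
# comes from the text, limits for other zones are not given there.
MAX_TRIP_TIME_MS = {1: 25.0}


def relay_tripping_time_ms(fault_detect_s, trip_assert_s):
    """Assumed definition: trip assertion instant minus fault detection instant."""
    return (trip_assert_s - fault_detect_s) * 1000.0


def is_compliant(trip_time_ms, zone, limits=MAX_TRIP_TIME_MS):
    """True when the tripping time is within the limit specified for the zone."""
    return trip_time_ms <= limits[zone]
```

With the 1.2 ms tripping time reported above, `is_compliant(1.2, 1)` holds, whereas a hypothetical 30 ms trip in zone 1 would be flagged.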


Figure 7.

GUI for analysis of distance protective relay operations.


Figure 8.

GUI for ground distance quadrilateral characteristics plots.

Figure 8 shows the graphical plots of the ground impedance loci against the respective ground distance quadrilateral characteristics. Since the fault is an A–G fault occurring in zone 1, only the trajectory of Zag enters zone 1 of the ground quadrilateral characteristics, while all phase impedances remain outside the phase quadrilateral characteristics, as expected.
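Whether an impedance locus has entered a quadrilateral zone can be checked with a standard point-in-polygon test, as in the sketch below. The zone vertices and the locus points are invented for illustration and do not reflect the settings of the relay under analysis.

```python
def inside_polygon(point, vertices):
    """Ray-casting test: True if point (R, X) lies inside the polygon given by its vertices."""
    r, x = point
    inside = False
    n = len(vertices)
    for i in range(n):
        r1, x1 = vertices[i]
        r2, x2 = vertices[(i + 1) % n]
        if (x1 > x) != (x2 > x):  # edge straddles the horizontal line through the point
            r_cross = r1 + (x - x1) * (r2 - r1) / (x2 - x1)
            if r < r_cross:  # crossing lies to the right of the point
                inside = not inside
    return inside


# Hypothetical zone 1 quadrilateral in the R-X plane (ohms); not the relay's actual settings.
ZONE1 = [(-2.0, 0.0), (10.0, 0.0), (12.0, 8.0), (0.0, 8.0)]

# Invented impedance locus moving from the load region into zone 1.
locus = [(40.0, 5.0), (20.0, 6.0), (6.0, 4.0)]
entered_zone1 = [inside_polygon(z, ZONE1) for z in locus]
```

Only the last point of this invented locus falls inside the quadrilateral, which is the kind of zone entry that the Zag trajectory in Figure 8 exhibits.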


Figure 9.

Validation of misoperative relay.

Figure 9 illustrates a screenshot of PRAY’s validation for a distance relay that had failed to operate (maloperated) when the transmission line it was protecting was subjected to a zone 1 A–G fault. PRAY discovered that an A–G fault had occurred in zone 1 of the relay under analysis, at approximately 40 km forward of its location in the substation. (This is the same fault that occurred in the above analysis of the same relay operating successfully.) From this information, an appropriate relay CD-association rule was chosen as the hypothesis (similar to the above) and used to validate that appropriate measures had not been taken to clear the fault. The consequent “Trip” was validated to have not sent a pole A trip signal to the circuit breaker. The descending sequence of antecedents indicated that although the negative sequence overcurrent (Q50) and residual overcurrent supervision (r50) elements had operated correctly, signifying the impending A–G unbalanced fault, the zone 1 overcurrent supervision element (p50_Z1) had failed to do likewise. This was believed to have contributed to the relay’s failure to trip. Looking at the operation logic of the different protective elements at different levels of sequence in the Diagnosis field’s hierarchical tree, the failure of the overcurrent element p50_Z1 is diagnosed as the possible cause of the relay maloperation. Looking up the symptom related to the malfunctioning p50_Z1 element, as shown in Figure 10, reveals that an incorrect threshold setting could have caused its failure.


Figure 10.

Diagnosis of misoperative relay.

5. Summary

The developed Protective Relay Analysis (PRAY) Expert System has demonstrated how the problems related to the maintenance of rule base of an Expert System can be addressed. By collating all the necessary relay CD-association rules discovered previously from the earlier KDD processes involving integrated-Rough-Set-and-Genetic-Algorithm data mining, Rule Quality Measure, and rule interestingness and importance judgments (as discussed in the authors’ cited works), a maintainable knowledge base for inference strategy can be conveniently prepared. Although this study revolves around analyzing a modeled distance relay’s big event data by hypothesis discovery, validation, and diagnosis, it is envisaged that using this approach a more rigorous analysis implementation of actual protective relay of different types can be embarked on.


This work was supported by Universiti Putra Malaysia under the Geran Putra IPB scheme, project no. GP-IPB/2013/9412101.


C: rule condition attribute(s)
CB52_B: status of circuit breaker
CD: relay decision rule, general term for (CassocD) and (CpredD)
(CassocD): relay CD-association rule
(CpredD): relay CD-prediction rule
CD-association rule: a relay association rule associating between C and D
CD-decision alg.: a set of relay prediction rules that predict D from C (alg. is algorithm)
CD-prediction rule: rule that predicts D from C
CNF: conjunctive normal form (i.e., product of sum (POS) in Boolean algebra)
COMTRADE: common format for transient data exchange, an IEEE file format
D: rule decision attribute
Dist_bg: zone of Gnd Dist flt (ground distance fault)
DNF: disjunctive normal form (i.e., sum of product (SOP) in Boolean algebra)
DS/DT: decision system/decision table
fC(D): discernibility function
FltType: fault type
GA: genetic algorithm
G2: G2 likelihood ratio statistic, a rule quality measure
IS: information system
KDD: knowledge discovery in database
MC(D): D-discernibility matrix of C
p50_Z1: phase overcurrent supervision in zone
pg_PkUp: ground distance pick-up
pg_Trp: ground distance trip
PRAY: Protective Relay Analysis System, an Expert System
Q32: negative sequence directionality
Q50: zone having NonDir Neq Seq OvrCurrt (nondirectional negative sequence overcurrent)
r50: residual overcurrent supervision in zone
REDD(C): D-reducts of C, sets of reduced number of indispensable attributes
RST: rough set theory
M (fC(D)): multiset
M (fC(D))Min Hit Set: minimal hitting set
SOP: sum of products
Trip: relay pole trip signals
U|IND(D): indiscernibility-relation/equivalence-class/elementary-sets about universe of relay events U with respect to D
Zbg: zone of ground distance pick-up
Zload: impedance encroaching load characteristic


1 - M. Enns, L. Budler, T. W. Cease, A. Elneweihi, E. Guro, and M. Kezunovic. Potential applications of expert systems to power system protection. IEEE Transactions on Power Delivery. 1994;9(2):720–728.
2 - C. Fukui and J. Kawakami. An expert system for fault section estimation using information from protective relays and circuit breakers. IEEE Transactions on Power Delivery. 1986;1(4):83–90.
3 - Y. Sun and C. C. Liu. RETEX (Relay Testing Expert): an expert system for analysis of relay testing data. IEEE Transactions on Power Delivery. 1992;7(2):986–994.
4 - D. Kosy, V. Grinberg, and M. Siegel. Screening digital relay data to detect power network fault response anomalies. In: SPIE Proc. 2nd International Symposium on Measurement Technology and Intelligent Instruments (ISMTII); Wuhan, People’s Republic of China; 29 Oct–5 Nov 1993.
5 - M. Kezunovic, P. Spasojevic, C. Fromen, and D. Sevcik. An expert system for transmission substation event analysis. IEEE Transactions on Power Delivery. 1993;8(4):1942–1949.
6 - M. Kezunovic, I. Rikalo, and C. W. Fromen. Expert system reasoning streamlines disturbance analysis. IEEE Computer Applications in Power. 1994;7(2):15–19.
7 - M. Kezunovic, I. Rikalo, C. W. Fromen, and D. R. Sevcik. New automated fault analysis approaches using intelligent system technologies. CiteSeerx Scientific Literature Digital Library and Search Engine, College of Info. Sci. and Tech., Pennsylvania State Univ [Internet]. 1998 [Updated: 1998]. Available from: [Accessed: 21 Dec 2015].
8 - M. Kezunovic and X. Luo. Automated analysis of protective relay data. In: CIRED 18th International Conference on Electricity Distribution; Turin; 2005.
9 - S. MacArthur, J. McDonald, S. Bell, and G. Burt. Expert systems and model based reasoning for protection performance analysis. In: IEE Colloquium on Artificial Intelligence Applications in Power Systems; 1995.
10 - X. Luo and M. Kezunovic. Fault analysis based on integration of digital relay and DFR data. In: IEEE Power Engineering Society General Meeting; 12–16 June 2005.
11 - Z. Wen-jing and J. Qing-quan. Research and simulation of an expert system on the wide-area back-up protection system. In: IET 9th International Conference on Developments in Power Systems Protection (DPSP 2008); Glasgow, UK; 17–20 March 2008.
12 - K. Tuitemwong and S. Premrudeepreechacharn. Expert system for protective devices coordination in radial distribution network with small power producers. In: IEEE Lausanne POWERTECH; Lausanne; 2007.
13 - K. Tuitemwong and S. Premrudeepreechacharn. Expert system for protection coordination of distribution system with distributed generators. International Journal of Electrical Power and Energy Systems. 2011;33(3):466–471.
14 - M. M. Saha, E. Rosolowski, and J. Izykowski. Artificial Intelligent Application to Power System Protection [Internet]. 2000. Available from: [Accessed: 23 Dec 2015].
15 - M. L. Othman, I. Aris, S. M. Abdullah, M. L. Ali, and M. R. Othman. Knowledge discovery in distance relay event report: a comparative data-mining strategy of rough set theory with decision tree. IEEE Transactions on Power Delivery. 2010;25(4):2264–2287.
16 - M. L. Othman, I. Aris, M. R. Othman, and H. Osman. Rough-set-based timing characteristic analyses of distance protective relay. Applied Soft Computing. 2012;12(8):2053–2062.
17 - M. L. Othman, I. Aris, M. R. Othman, and H. Osman. Rough-set-and-genetic-algorithm based data mining and rule quality measure to hypothesize distance protective relay operation characteristics from relay event report. International Journal of Electrical Power and Energy Systems. 2011;33(8):1437–1456.
18 - M. L. Othman, I. Aris, and N. I. Abdul Wahab. Modeling and simulation of industrial numerical distance relay aimed at knowledge discovery in resident event report. Simulation: Transactions of Society for Modelling and Simulation International. 2014;90(6):660–686.
19 - A. Øhrn. ROSETTA Technical Reference Manual. Trondheim, Norway: Norwegian University of Science and Technology; 2000.
20 - D. S. Hochbaum. Approximation Algorithms for NP-Hard Problems. Boston, MA: PWS Publishing Company; 1996.
21 - A. Øhrn. Discernibility and Rough Sets in Medicine: Tools and Applications. Trondheim, Norway: Norwegian University of Science and Technology; 1999.
22 - MAAC. Requirements for Protection System Operation Reporting and Analysis. Cleveland, Ohio: Mid Atlantic Area Council; 2003.
23 - National Instruments. LabVIEW Basics I Introduction Course Manual. Austin, TX: National Instruments Corporation; 2006.
24 - A. Øhrn, J. Komorowski, A. Skowron, and P. Synak. The design and implementation of a knowledge discovery toolkit based on rough sets: The ROSETTA system. In: L. Polkowski and A. Skowron, editors. Rough Sets in Knowledge Discovery 1: Methodology and Applications, Studies in Fuzziness and Soft Computing. Heidelberg, Germany: Physica-Verlag, Springer; 1998. pp. 376–399.
25 - National Instruments. DIAdem: Data Mining, Analysis, and Report Generation. Austin, TX: National Instruments Corporation; 2005.