Open access peer-reviewed chapter

A Fuzzy Rule Based Approach to Geographic Classification of Virgin Olive Oil Using T-Operators

Written By

Suzan Kantarcı-Savaş and Efendi Nasibov

Submitted: 20 November 2017 Reviewed: 05 July 2018 Published: 26 September 2018

DOI: 10.5772/intechopen.79962

From the Edited Volume

Potential of Essential Oils

Edited by Hany A. El-Shemy

Abstract

Olive oil is an important agricultural food product. Protected designation of origin (PDO) and protected geographical indication (PGI) labels help protect the intellectual property rights of consumers and producers, so geographic classification is increasingly important for tracing geographical indications. This chapter proposes a geographical classification system for virgin olive oils based on chemical parameters, which are inherently fuzzy. The proposed system constructs its rules with a fuzzy decision tree algorithm: rules are produced by the fuzzy ID3 algorithm, which applies fuzzy entropy to the fuzzified data. The reasoning procedure is a weighted rule-based system adapted to fuzzy reasoning with different T-operators. The olive oil data set is fuzzified with the fuzzy c-means algorithm, and the number of clusters for each variable is selected with the partition coefficient validity criterion. The model is evaluated against other decision tree approaches (C4.5 and the standard fuzzy ID3 algorithm) and with the FID3 reasoning method under eight different T-operators. The conclusions are supported by statistical analysis. Experimental results indicate that rule weights have an important effect on the fuzzy reasoning method for the geographic classification system.

Keywords

  • fuzzy decision tree
  • fuzzy rule
  • T-operators
  • geographic classification
  • olive oil

1. Introduction

Geographical indications are important signs used on products. Their aim is to specify the geographical origin of a product and to certify its qualities. There are two kinds of geographical indications: protected designation of origin (PDO) and protected geographical indication (PGI). These indications are generally used for agricultural products, among which olive oil has a prominent place. It is therefore necessary to observe the properties of olive oils produced in different regions or from different olive varieties. The geographical classification problem investigates the relationship among the chemical and sensory parameters for each region.

Machine learning and chemical data now come together in the information age. Machine learning is concerned with the design and development of algorithms that let computers learn from data. It aims to discover relationships in the data and to extract knowledge without prior assumptions. Several machine learning algorithms are available for this purpose.

Decision trees are among the most commonly used machine learning methods. There are several decision tree algorithms, such as ID3, C4.5, and CART. Fuzzy logic has been incorporated into decision tree algorithms to handle uncertainty, and the resulting trees are called fuzzy decision trees [1, 2, 3]. A fuzzy decision tree consists of nodes for testing attributes, edges for branching by test values of fuzzy sets, and leaves for deciding the class according to class membership.

Chemical measurements also carry uncertainty [4, 5, 6, 7, 8]. In this study, the geographical classification problem is based on chemical measurements. The study proposes an improved methodological approach for the classification of olive oil samples based on the fuzzy ID3 classification approach.

The proposed system constructs its rules with a fuzzy decision tree algorithm. Its reasoning procedure is a weighted rule-based system adapted to fuzzy reasoning with different T-operators. The model is evaluated against other decision tree approaches (C4.5 and the standard fuzzy ID3 algorithm) and with the FID3 reasoning method under eight different T-operators. The study uses 101 virgin olive oil samples collected from four regions (North Aegean, South Aegean, Mediterranean, and South East), described by measurements of chemical parameters. Min-max normalization was applied to the data set. Nonparametric methods were preferred for the statistical analysis because of the data structure. A leave-one-out procedure was used to measure the performance of the algorithms. The Friedman aligned ranks test and pairwise comparisons were used to evaluate the fuzzy reasoning method under different T-operators, and the unweighted and weighted fuzzy reasoning approaches were compared. The rest of the chapter is organized as follows: Section 2 presents the definition of the geographical classification problem and related work. The preliminaries, namely fuzzification, the fuzzy ID3 algorithm, and the fuzzy rule-based classification system, are given in Section 3. The experimental study on the unweighted and weighted fuzzy rule-based approach to the geographic classification of virgin olive oil using T-operators is given in Section 4, and the conclusion is presented in Section 5.

2. Geographic classification problem

The geographic classification problem aims to find the region of origin of an unassigned olive oil sample. The problem arises from the need to support the traceability of olive oil under the protected designation of origin policy, and defining a methodology for it is an important issue for Turkey. In the literature, researchers have generally studied the classification of olive oils [9, 10]. Principal component analysis, linear discriminant analysis, probabilistic neural networks, and classification binary trees were the preferred techniques for evaluating the parameters [9, 10]. Back propagation artificial neural networks (BP-ANN) have also been used to solve this kind of problem [11]. In [12], adulteration in olive oil was detected by near-infrared spectroscopy together with chemometric techniques such as principal component analysis and partial least squares regression (PLS), and with data pretreatment methods such as signal detection correction. Principal component analysis and the SIMCA classification model [13] are other methods that support the geographic classification problem, which is outlined in Figure 1.

Figure 1.

Geographical classification problem scheme for olive oil.

3. Preliminaries

This section briefly explains fuzzy logic and the fuzzy c-means algorithm as a fuzzification tool. It then reviews the fuzzy ID3 tree builder combined with fuzzy rule-based classification and its reasoning method. Finally, it introduces T-operators and proposes a weighted fuzzy ID3 reasoning method based on different types of T-operators.

3.1. Fuzzy logic and fuzzy c-means algorithm as fuzzification tool

Fuzzy set theory was first proposed in 1965 [14]. A fuzzy subset v of the universe of discourse U is described by a membership function $\mu_v : U \to [0, 1]$, which represents the degree to which $u \in U$ belongs to the set v. Each value is thus described by a membership degree. The process of transforming values into membership degrees for each term of a fuzzy variable is called fuzzification. The literature offers many types of membership functions, such as triangular, trapezoidal, and Gaussian membership functions [15]; triangular membership functions are generally preferred. Alternatively, the fuzzy c-means (FCM) algorithm, suggested in [16] and improved in [17], can be used to obtain the membership degrees for each term of the fuzzy variables. FCM is a clustering algorithm that seeks a fuzzy c-partition matrix U by minimizing the objective function $J_m$ (Eq. (1)):

$$J_m(U, v) = \sum_{k=1}^{n} \sum_{i=1}^{c} (\mu_{ik})^m (d_{ik})^2 \tag{1}$$

where

$$d_{ik} = d(x_k, v_i) = \left[ \sum_{j=1}^{p} (x_{kj} - v_{ij})^2 \right]^{1/2}, \quad k = 1, \ldots, n; \; i = 1, \ldots, c \tag{2}$$

and $\mu_{ik}$ is the membership degree of the kth data point in the ith cluster. The dimensionality of the data space is denoted by p. The parameter $m > 1$ controls the sharpness of the fuzzification process. In Eq. (2), $d_{ik}$ is a distance measure (usually the Euclidean distance) between the kth data point and the ith cluster center in p-dimensional space, and $v_i$ is the ith cluster center. Eq. (3) gives the cluster centers:

$$v_{ij} = \frac{\sum_{k=1}^{n} (\mu_{ik})^m x_{kj}}{\sum_{k=1}^{n} (\mu_{ik})^m}, \quad i = 1, 2, \ldots, c; \; j = 1, 2, \ldots, p \tag{3}$$

Membership degrees are calculated according to Eq. (4):

$$\mu_{ik} = \frac{1}{\sum_{z=1}^{c} \left( \frac{\lVert x_k - v_i \rVert}{\lVert x_k - v_z \rVert} \right)^{\frac{2}{m-1}}}, \quad i = 1, 2, \ldots, c; \; k = 1, \ldots, n \tag{4}$$

Validity indicators are used to determine the number of clusters (c) [18, 19, 20]. One of them is the partition coefficient, defined in Eq. (5):

$$V_{PC} = \frac{1}{n} \sum_{i=1}^{c} \sum_{j=1}^{n} (\mu_{ij})^2 \tag{5}$$

The optimal number of clusters is the value of c that gives $\max V_{PC}(U, c)$. The number of clusters of a fuzzy variable is the number of its fuzzy linguistic terms.
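As an illustration of Eqs. (3)–(5), the following is a minimal sketch of FCM for a single numeric variable, which is how a variable would be fuzzified into linguistic terms. It is not the chapter's MATLAB implementation: the function names, the fixed initial centers, the stopping tolerance, and the handling of a point that coincides with a center are all assumptions made for this example.

```python
def fcm_memberships(data, centers, m=2.0):
    """Eq. (4): membership u[i][k] of point k in cluster i (1-D data)."""
    exponent = 2.0 / (m - 1.0)
    u = [[0.0] * len(data) for _ in centers]
    for k, xk in enumerate(data):
        dists = [abs(xk - v) for v in centers]
        zeros = [i for i, d in enumerate(dists) if d == 0.0]
        if zeros:
            # Assumption: a point lying on a center gets full membership there.
            for i in zeros:
                u[i][k] = 1.0 / len(zeros)
            continue
        for i, di in enumerate(dists):
            u[i][k] = 1.0 / sum((di / dz) ** exponent for dz in dists)
    return u


def fcm_centers(data, u, m=2.0):
    """Eq. (3): cluster centers as membership-weighted means."""
    centers = []
    for row in u:
        weights = [uik ** m for uik in row]
        centers.append(sum(w * x for w, x in zip(weights, data)) / sum(weights))
    return centers


def partition_coefficient(u):
    """Eq. (5): V_PC lies between 1/c (fully fuzzy) and 1 (crisp)."""
    n = len(u[0])
    return sum(uik ** 2 for row in u for uik in row) / n


def fcm(data, centers, m=2.0, max_iter=100, tol=1e-9):
    """Alternate Eq. (4) and Eq. (3) until the centers stop moving."""
    for _ in range(max_iter):
        u = fcm_memberships(data, centers, m)
        new_centers = fcm_centers(data, u, m)
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, fcm_memberships(data, centers, m)


# Hypothetical normalized measurements of one chemical parameter.
data = [0.0, 0.1, 0.2, 0.8, 0.9, 1.0]
centers, u = fcm(data, [0.0, 1.0])
pc = partition_coefficient(u)
```

To choose the number of linguistic terms for a variable, one would run `fcm` for several candidate values of c and keep the one with the largest `partition_coefficient`.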

3.2. Fuzzy rule-based classification system (FRBCS)

Fuzzy rule-based classification systems (FRBCSs) are very useful for solving classification problems. They have been applied to different kinds of real-life problems, such as image processing [21] and medical problems [22].

Given an object described by a feature vector $x \in S^N$, a classifier assigns it a class $C_j$ from a predefined class set $C = \{C_1, C_2, \ldots, C_M\}$; that is, it realizes a mapping $D : S^N \to C$ [23].

A classifier can be a neural network, a decision tree, a fuzzy decision tree, etc. If the classifier produces a set of fuzzy rules, the system is called a fuzzy rule-based classification system (FRBCS).

The antecedents of fuzzy rules are defined by fuzzy variables, which provides computational flexibility. A classification problem is solved by using a set of training samples and a classifier; the resulting model provides the class of a new sample. The scheme of a classification problem solved with the fuzzy ID3 algorithm combined with a fuzzy rule-based classification system is summarized in Figure 2.

Figure 2.

A classification problem with fuzzy ID3 algorithm combined with FRBCS.

In this study, the fuzzy iterative dichotomizer 3 (fuzzy ID3) algorithm is used as the classifier. The algorithm generates rules by constructing a tree during the learning process. Fuzzy entropy is used to find the attribute that carries the maximum information and the minimum uncertainty. Each path of the tree corresponds to a rule, and each leaf node holds a rule weight (RW) for each class. $RW_j$ denotes the weight of the jth rule and is equal to its fuzzy confidence value $CF_j$. After rule induction, fuzzy rule-based reasoning is performed to carry out the classification task.

The literature gives three definitions of fuzzy rules [23]. In this study, the following type of rule, constructed from fuzzy decision trees, is used in the experiments.

Fuzzy rules with a class and a certainty degree in the consequent [24]:

$R_k$: If $x_1$ is $A_1^k$ and … and $x_N$ is $A_N^k$ then Y is $C_j$ with $r^k$

where $r^k$ is the certainty degree of the classification in the class $C_j$ for a pattern belonging to the fuzzy subspace delimited by the fuzzy antecedent.

3.3. Fuzzy iterative dichotomizer 3

A fuzzy decision tree is the adaptation of the decision tree structure to fuzzy logic. Many decision tree algorithms have been adapted in this way. A tree is generated, and the decision rules are obtained from each path from the root to the leaves of the tree. Fuzzy iterative dichotomizer 3 (fuzzy ID3), defined in [2], is widely used as a classification tree builder. It is the fuzzy-logic adaptation of the ID3 algorithm proposed by Quinlan in [25]. One of its important advantages is that it can deal with both crisp and fuzzy variables defined by the user. The algorithm splits the data set according to an attribute selected by a measure called information gain, which is based on fuzzy entropy. It seeks the attribute that carries the most discriminating information.

Let a training set consist of P samples, and let $x_p = (x_{p1}, \ldots, x_{pn})$ be the pth sample of the training set, where $x_{pi}$ is the value of the ith attribute ($i = 1, 2, \ldots, n$) of the pth training sample. Each sample belongs to a class $y_p \in C = \{C_1, C_2, \ldots, C_m\}$, where m is the number of classes of the problem [26]. Assume there are N labeled fuzzified patterns and n attributes $A = \{A_1, A_2, \ldots, A_n\}$. For each k with $1 \le k \le n$, the attribute $A_k$ takes $m_k$ values, the fuzzy subsets $A_{k1}, A_{k2}, \ldots, A_{km_k}$. C denotes the classification target attribute, taking m values $C_1, C_2, \ldots, C_m$. The symbol $M(\cdot)$ denotes the cardinality of a given fuzzy set, that is, the sum of the membership values of the fuzzy set [2, 26].

The induction process of fuzzy ID3 is as follows:

Step 1: Produce a root node that contains the set of all data. Each data item is fuzzified, and at initialization every data item has membership degree 1 in the root node.

Step 2: Select the attribute for each internal node by using the following steps:

Step 2a: For each linguistic label $A_{ki}$ ($i = 1, 2, \ldots, m_k$), compute its relative frequencies with respect to each class $C_j$ ($j = 1, 2, \ldots, m$):

$$p_{ki}^{j} = \frac{M(A_{ki} \cap C_j)}{M(A_{ki})} \tag{6}$$

Step 2b: For each linguistic label $A_{ki}$ ($i = 1, 2, \ldots, m_k$), compute its fuzzy classification entropy:

$$\mathrm{Entr}_{ki} = -\sum_{j=1}^{m} p_{ki}^{j} \log p_{ki}^{j} \tag{7}$$

Step 2c: Compute the average fuzzy classification entropy ($E_k$) of each attribute:

$$E_k = \sum_{i=1}^{m_k} \frac{M(A_{ki})}{\sum_{j=1}^{m_k} M(A_{kj})} \, \mathrm{Entr}_{ki} \tag{8}$$

Step 2d: Select the attribute (Attr) that maximizes the information gain ($G_k$) [27]:

$$\mathrm{Attr} = \max_{1 \le k \le n} (G_k), \quad \text{where } G_k = E_k - \mathrm{Entr}_{ki} \tag{9}$$

Step 2e: Assign the selected attribute as the root node and its linguistic labels as candidate branches of the tree.

Step 3: Pick one branch to analyze. Remove the branch if it is empty. If the branch is nonempty, calculate the relative frequencies (Eq. (6)) of all objects within the branch for each class. If the relative frequency of one class is above the given threshold $\theta_r$, or all the attributes have already been expanded for this branch, close the branch as a leaf. Otherwise, among the attributes that have not yet been expanded in this branch, select the one with the smallest average fuzzy classification entropy (Eq. (8)) as a new decision node for the branch, and add its linguistic labels as candidate branches to analyze. At each leaf, each class has its relative frequency [27].

Step 4: Repeat Step 3 while there are branches to analyze. When no candidate branches remain, the decision tree is complete [27].
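The entropy computations in Steps 2a to 2c can be sketched as follows. This is only an illustrative sketch, not the authors' OliveDeSoft code. It assumes crisp class labels, so that $M(A_{ki} \cap C_j)$ is the sum of the memberships in $A_{ki}$ of the samples labeled $C_j$, and it assumes a base-2 logarithm, which the chapter does not specify. Attribute selection follows the criterion stated in Step 3 (smallest average fuzzy classification entropy); the function and variable names are hypothetical.

```python
import math


def average_fuzzy_entropy(term_memberships, labels, classes):
    """Eqs. (6)-(8) for one attribute.

    term_memberships[i][p] is the membership of sample p in linguistic
    term A_ki; labels[p] is the (crisp) class of sample p.
    """
    cardinalities = [sum(mu) for mu in term_memberships]  # M(A_ki)
    total = sum(cardinalities)
    e_k = 0.0
    for mu, card in zip(term_memberships, cardinalities):
        if card == 0.0:
            continue  # an empty term contributes nothing
        entr = 0.0
        for c in classes:
            # Eq. (6): relative frequency of class c within term A_ki
            p = sum(m for m, y in zip(mu, labels) if y == c) / card
            if p > 0.0:
                entr -= p * math.log2(p)  # Eq. (7), base 2 assumed
        e_k += card / total * entr  # Eq. (8)
    return e_k


def select_attribute(candidates, labels, classes):
    """Pick the attribute with the smallest average fuzzy entropy."""
    return min(
        candidates,
        key=lambda name: average_fuzzy_entropy(candidates[name], labels, classes),
    )


# Hypothetical example: attribute "a1" separates the classes, "a2" does not.
labels = ["north", "north", "south", "south"]
candidates = {
    "a1": [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]],
    "a2": [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]],
}
best = select_attribute(candidates, labels, ["north", "south"])
```

Within a branch of the tree, the memberships passed to `average_fuzzy_entropy` would be those of the samples as they arrive at that node, so samples with low membership in the branch weigh less in the frequencies.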

The rule structure generated from each branch of the fuzzy decision tree.

After the induction of the fuzzy decision tree, a rule is generated from each branch, with each branch acting as a path. The rule $R_j$ is given as follows [27]:

Rule $R_j$: If $x_1$ is $A_{j1}$ and … and $x_n$ is $A_{jn}$ then Class = $C_j$ with $RW_j$, where $R_j$ is the label of the jth rule, $x = (x_1, x_2, \ldots, x_n)$ is an n-dimensional pattern vector representing the example, $A_{ji}$ is a fuzzy set, $C_j \in C$ is the class label, and $RW_j$ is the rule weight. In a fuzzy decision tree, each leaf node holds rule weights, which are obtained from the relative frequency of each class (as given in Step 3) [27].

3.4. Fuzzy reasoning method based on T-operators

A fuzzy reasoning method (FRM) is an inference procedure that derives a class assignment from a set of fuzzy if-then rules. It combines the information of the fired rules with the pattern to be classified, which supports the generalization capability of the classification system [25]. This section is organized as follows. First, the general model of fuzzy reasoning is presented together with the classical FRM. Then, within this general model, eight alternative FRMs are proposed. Finally, Section 4 presents the experiments that show the behavior of the proposed reasoning methods.

3.4.1. General model of fuzzy reasoning

Let $x_p = (x_{p1}, \ldots, x_{pn})$ be the pth example of the training set, which is composed of P examples, where $x_{pi}$ is the value of the ith attribute ($i = 1, 2, \ldots, n$) of the pth sample. Each example belongs to a class $y_p \in C = \{C_1, C_2, \ldots, C_m\}$, where m is the number of classes of the problem. Assume that $x_p$ is a new example to be classified. The FID3 reasoning procedure is given in [2], and the fuzzy reasoning method for FARC-HD in [28] is summarized in four steps. In our approach, the fuzzy ID3 reasoning method is combined with T-operators. T-operators were developed from the triangular inequalities [29, 30]. Combined with fuzzy set theory, they are used to compute the intersection and union of two fuzzy sets [31, 32]. There are different types of T-operators, which are also called T-norms and T-conorms in the literature [33], and they are used in different types of problems [33]. T-operators are two-place functions from $[0, 1] \times [0, 1]$ to $[0, 1]$ that are monotonic, commutative, and associative [33].

A T-norm is used to find the intersection of two fuzzy sets A and B. The intersection of A and B is a fuzzy set C, written as C = A and B, whose membership function (MF) is related to those of A and B by

$$\mu_C(x) = \mu_A(x) \wedge \mu_B(x) \tag{10}$$

A T-conorm, on the other hand, is used to find the union of two fuzzy sets A and B. The union of A and B is a fuzzy set C, written as C = A or B, whose MF is related to those of A and B by

$$\mu_C(x) = \mu_A(x) \vee \mu_B(x) \tag{11}$$

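The T-operators used in the fuzzy reasoning method are given in Table 1 [27].

Nonparametric operators [27]

| Ref | T-norm operators | T-conorm operators |
|---|---|---|
| Zadeh [14] | $T_1(x, y) = \min(x, y)$ | $T_1^*(x, y) = \max(x, y)$ |
| Product Sum [41, 42] | $T_2(x, y) = x \cdot y$ | $T_2^*(x, y) = x + y - x \cdot y$ |
| Nonparametric Hamacher [43] ($\lambda = 0$) | $T_3(x, y) = \frac{x \cdot y}{x + y - x \cdot y}$ | $T_3^*(x, y) = \frac{x + y - 2 \cdot x \cdot y}{1 - x \cdot y}$ |

Parametric operators [27]

| Ref | T-norm operators | T-conorm operators | Parameter range |
|---|---|---|---|
| Hamacher [43] | $T_4(x, y) = \frac{x \cdot y}{\lambda + (1 - \lambda)(x + y - x \cdot y)}$ | $T_4^*(x, y) = \frac{x + y - (2 - \lambda) \cdot x \cdot y}{\lambda + (1 - \lambda)(1 - x \cdot y)}$ | $\lambda \ge 0$ |
| Yager [44] | $T_5(x, y) = \max\left(1 - \left((1 - x)^p + (1 - y)^p\right)^{1/p}, 0\right)$ | $T_5^*(x, y) = \min\left(\left(x^p + y^p\right)^{1/p}, 1\right)$ | $p > 0$ |
| Dombi [45] | $T_6(x, y) = \frac{1}{1 + \left[\left(\frac{1}{x} - 1\right)^{\lambda} + \left(\frac{1}{y} - 1\right)^{\lambda}\right]^{1/\lambda}}$ | $T_6^*(x, y) = \frac{1}{1 + \left[\left(\frac{1}{x} - 1\right)^{-\lambda} + \left(\frac{1}{y} - 1\right)^{-\lambda}\right]^{-1/\lambda}}$ | $\lambda > 0$ |
| Dubois and Prade [46] | $T_7(x, y) = \frac{x \cdot y}{\max(x, y, \lambda)}$ | $T_7^*(x, y) = 1 - \frac{(1 - x) \cdot (1 - y)}{\max(1 - x, 1 - y, \lambda)}$ | $\lambda \in [0, 1]$ |
| Weber [41] | $T_8(x, y) = \max\left(\frac{x + y - 1 + \lambda \cdot x \cdot y}{1 + \lambda}, 0\right)$ | $T_8^*(x, y) = \min(x + y + \lambda \cdot x \cdot y, 1)$ | $\lambda > -1$ |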

Table 1.

T-Operators used in fuzzy reasoning method.
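The operator pairs of Table 1 can be written as small functions that return the T-norm and T-conorm values together. The sketch below is illustrative only: the function names are hypothetical, the Dombi T-conorm is written in an algebraically equivalent form to avoid negative powers, and the guards for the boundary points where a denominator vanishes are assumptions of this example. The default parameter values are the ones fixed in the experimental study.

```python
def zadeh(x, y):
    return min(x, y), max(x, y)


def product_sum(x, y):
    return x * y, x + y - x * y


def hamacher(x, y, lam=0.25):
    # lam = 0 gives the nonparametric Hamacher pair (T3, T3*).
    den_t = lam + (1.0 - lam) * (x + y - x * y)
    den_s = lam + (1.0 - lam) * (1.0 - x * y)
    t = 0.0 if den_t == 0.0 else x * y / den_t
    s = 1.0 if den_s == 0.0 else (x + y - (2.0 - lam) * x * y) / den_s
    return t, s


def yager(x, y, p=2.0):
    t = max(1.0 - ((1.0 - x) ** p + (1.0 - y) ** p) ** (1.0 / p), 0.0)
    s = min((x ** p + y ** p) ** (1.0 / p), 1.0)
    return t, s


def dombi(x, y, lam=1.0):
    if x == 0.0 or y == 0.0:
        t = 0.0
    else:
        a = ((1.0 - x) / x) ** lam + ((1.0 - y) / y) ** lam
        t = 1.0 / (1.0 + a ** (1.0 / lam))
    if x == 1.0 or y == 1.0:
        s = 1.0
    else:
        # Equivalent to 1 / (1 + b ** (-1 / lam)) but safe when b == 0.
        b = (x / (1.0 - x)) ** lam + (y / (1.0 - y)) ** lam
        root = b ** (1.0 / lam)
        s = root / (1.0 + root)
    return t, s


def dubois_prade(x, y, lam=0.25):
    den_t = max(x, y, lam)
    den_s = max(1.0 - x, 1.0 - y, lam)
    t = 0.0 if den_t == 0.0 else x * y / den_t
    s = 1.0 if den_s == 0.0 else 1.0 - (1.0 - x) * (1.0 - y) / den_s
    return t, s


def weber(x, y, lam=15.0):
    t = max((x + y - 1.0 + lam * x * y) / (1.0 + lam), 0.0)
    s = min(x + y + lam * x * y, 1.0)
    return t, s


OPERATORS = {
    "Zadeh": zadeh,
    "Product-Sum": product_sum,
    "Nonparametric Hamacher": lambda x, y: hamacher(x, y, lam=0.0),
    "Hamacher": hamacher,
    "Yager": yager,
    "Dombi": dombi,
    "Dubois-Prade": dubois_prade,
    "Weber": weber,
}
```

A quick sanity check for any such pair is the boundary behavior: a T-norm should satisfy T(x, 1) = x and a T-conorm should satisfy T*(x, 0) = x.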

3.4.2. Fuzzy rule evaluation measures in data mining

In the field of data mining, two measures, confidence and support, are used to evaluate rules. Assume that the fuzzy rule $R_j$ is written as $A_q \Rightarrow C_q$, where $A_q = (A_{q1}, \ldots, A_{qn})$. Fuzzy versions of the two rule evaluation measures were given in [34, 35, 36, 37] as follows.

Assume that m labeled patterns,

$$x_p = (x_{p1}, \ldots, x_{pn}), \quad p = 1, \ldots, m \tag{12}$$

are given from M classes for an n-dimensional pattern classification problem.

In the literature [38, 39, 40], the compatibility grade of each training pattern $x_p$ with the antecedent $A_q$ is defined by the product operation as $\mu_{A_q}(x_p) = \mu_{A_{q1}}(x_{p1}) \times \cdots \times \mu_{A_{qn}}(x_{pn})$, where $\mu_{A_{qi}}(\cdot)$ is the membership function of the antecedent fuzzy set $A_{qi}$.

The confidence of the fuzzy rule $A_q \Rightarrow C_q$ is written as follows [39, 40]:

$$c(A_q \Rightarrow C_q) = \frac{\sum_{x_p \in \text{Class } C_q} \mu_{A_q}(x_p)}{\sum_{p=1}^{m} \mu_{A_q}(x_p)} \tag{13}$$

The confidence is a numerical approximation of the conditional probability. The support of $A_q \Rightarrow C_q$, on the other hand, is written as follows [39, 40, 41, 42, 43, 44, 45, 46]:

$$s(A_q \Rightarrow C_q) = \frac{\sum_{x_p \in \text{Class } C_q} \mu_{A_q}(x_p)}{m} \tag{14}$$

The support measures the coverage of the training patterns by $A_q \Rightarrow C_q$.
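The two measures can be illustrated with a short sketch of Eqs. (13) and (14), using the product compatibility grade defined above. The function names and the triangular-style membership function `low` are hypothetical and serve only this example; an antecedent is represented as one membership function per attribute.

```python
from math import prod


def compatibility(pattern, antecedent):
    """Product of the membership degrees of each attribute value."""
    return prod(mf(x) for mf, x in zip(antecedent, pattern))


def confidence_and_support(patterns, labels, antecedent, target):
    """Eq. (13) and Eq. (14) for the rule antecedent => target."""
    grades = [compatibility(x, antecedent) for x in patterns]
    in_class = sum(g for g, y in zip(grades, labels) if y == target)
    total = sum(grades)
    confidence = in_class / total if total > 0.0 else 0.0
    support = in_class / len(patterns)
    return confidence, support


def low(x):
    """Hypothetical membership function: 1 at x = 0, falling to 0 at x = 1."""
    return max(0.0, 1.0 - x)


patterns = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
labels = ["a", "b", "b"]
conf, supp = confidence_and_support(patterns, labels, [low, low], "a")
```

Here the compatibility grades are 1.0, 0.5, and 0.0, so the confidence of the rule for class "a" is 1.0 / 1.5 and its support is 1.0 / 3.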

3.4.3. Heuristic methods for rule weight specification

When the consequent class is determined, there are many ways to assign weights to the rules [38, 39, 40]. In general, the consequent $C_q$ of the fuzzy rule $A_q \Rightarrow C_q$ in [38] is the class with the maximum confidence for the antecedent $A_q$:

$$c(A_q \Rightarrow C_q) = \max \{ c(A_q \Rightarrow \text{Class } h) \mid h = 1, 2, \ldots, M \} \tag{15}$$

The confidence $c(A_q \Rightarrow C_q)$ can be used as the rule weight $RW_q$ of the fuzzy rule $A_q \Rightarrow C_q$.

When a set of antecedent fuzzy sets is given for each attribute, the antecedent part of each fuzzy rule (i.e., $A_q$) is defined as a combination of antecedent fuzzy sets for the n attributes. In [36], the confidence is used directly for each class, giving a fuzzy rule with multiple consequent classes [23]:

$$RW_{qh} = c(A_q \Rightarrow \text{Class } h), \quad h = 1, 2, 3, \ldots, M \tag{16}$$
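The difference between the single-consequent choice of Eq. (15) and the multiple-consequent weights of Eq. (16) can be sketched as follows. The sketch assumes that the compatibility grades of the training patterns with the antecedent have already been computed; the function names and the example values are hypothetical.

```python
def rule_weights(grades, labels, classes):
    """Eq. (16): one weight per class, equal to the rule confidence."""
    total = sum(grades)
    if total == 0.0:
        return {c: 0.0 for c in classes}
    return {
        c: sum(g for g, y in zip(grades, labels) if y == c) / total
        for c in classes
    }


def winning_class(weights):
    """Eq. (15): the consequent class with the maximum confidence."""
    return max(weights, key=weights.get)


# Hypothetical compatibility grades of three training patterns.
grades = [1.0, 0.5, 0.25]
labels = ["a", "b", "b"]
weights = rule_weights(grades, labels, ["a", "b"])
consequent = winning_class(weights)
```

With these values the weights are 1.0 / 1.75 for class "a" and 0.75 / 1.75 for class "b", so a single-consequent rule would point to "a" while a multiple-consequent rule keeps both weights.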

The adaptation of the generalized model to weighted fuzzy reasoning based on T-operators.

The steps of FID3 reasoning based on T-operators are given below:

Step 1: Antecedent degree of a rule. The strength of activation of the if-part is computed for the pattern $x_p$ and for every rule in the rule base (RB), where each rule is obtained from a path of the fuzzy decision tree:

$$\mu_{A_j}(x_p) = T\left(\mu_{A_{j1}}(x_{p1}), \ldots, \mu_{A_{jn_j}}(x_{pn_j})\right) \tag{17}$$

where $\mu_{A_{ji}}(x_{pi})$ is the matching degree of the example with the ith antecedent of the rule $R_j$, which is obtained from a leaf node at the end of a path, T is a T-norm (listed in Table 1), and $n_j$ is the number of antecedents of the rule.

Step 2: Consequent degree for a class. The consequent degree in favor of class l given by the rule $R_j$ for the pattern $x_p$ is computed as follows, where the weight $RW_{jl}$ is computed according to the multiple consequent classes (Eq. (16)):

$$b_j^l(x_p) = T\left(\mu_{A_j}(x_p), RW_{jl}\right) \tag{18}$$

Step 3: Confidence degree for a class. The confidence degree for the class l according to all rules in the RB is computed by aggregating the association degrees of the rules for that class with a T-conorm T* (listed in Table 1) [2, 27]:

$$\mathrm{conf}_l(x_p) = T^*\left(b_1^l(x_p), b_2^l(x_p), \ldots, b_R^l(x_p)\right) \tag{19}$$

where $b_j^l(x_p)$, $j = 1, 2, \ldots, R$, is the association degree of the pattern $x_p$ to the class l according to the jth rule.

Step 4: Classification. The class with the highest confidence degree is assigned as the predicted class [2, 27]:

$$\mathrm{Class} = \arg\max_{l = 1, 2, \ldots, m} \mathrm{conf}_l(x_p) \tag{20}$$
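Steps 1 to 4 can be sketched as one function that takes the T-norm and T-conorm as parameters, so that any pair of operators can be plugged in. This is a minimal illustration rather than the authors' implementation: the rule representation (a list of attribute-index and membership-function pairs plus a dictionary of class weights), the two-rule rule base, and all names are assumptions of this example.

```python
from functools import reduce


def classify(pattern, rules, classes, t_norm, t_conorm):
    """Weighted fuzzy reasoning, Eqs. (17)-(20).

    rules is a list of (antecedent, weights) pairs, where antecedent is a
    non-empty list of (attribute_index, membership_function) pairs and
    weights maps each class to its rule weight RW_jl.
    """
    confidence = {}
    for label in classes:
        degrees = []
        for antecedent, weights in rules:
            # Step 1, Eq. (17): matching degree of the if-part
            match = reduce(t_norm, (mf(pattern[i]) for i, mf in antecedent))
            # Step 2, Eq. (18): combine with the rule weight for this class
            degrees.append(t_norm(match, weights.get(label, 0.0)))
        # Step 3, Eq. (19): aggregate over all rules with the T-conorm
        confidence[label] = reduce(t_conorm, degrees)
    # Step 4, Eq. (20): class with the highest confidence degree
    return max(classes, key=confidence.get), confidence


def low(x):
    return 1.0 - x  # hypothetical membership function on [0, 1]


def high(x):
    return x  # hypothetical membership function on [0, 1]


# Hypothetical rule base with two single-antecedent rules.
rules = [
    ([(0, low)], {"N": 0.9, "S": 0.1}),
    ([(0, high)], {"N": 0.2, "S": 0.8}),
]

# Product-Sum pair (T2, T2*) from Table 1.
label, conf = classify(
    (0.25,), rules, ["N", "S"],
    t_norm=lambda x, y: x * y,
    t_conorm=lambda x, y: x + y - x * y,
)
```

For the pattern above the matching degrees are 0.75 and 0.25, which gives confidence degrees of 0.69125 for class "N" and 0.26 for class "S" under the Product-Sum pair. Swapping in `min` and `max` reproduces the Zadeh pair without changing the rest of the procedure.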

4. Experimental study on fuzzy rule-based approach to geographic classification of virgin olive oil using T-operators

This section summarizes the fuzzy rule-based approach to the geographic classification of virgin olive oil and presents the solution step by step. First, the olive oil samples and the methodology used in their chemical analyses are described. Second, the performance measure and the statistical tests are explained. Fuzzy reasoning methods with nonparametric operators are then examined, and the behavior of the weighted fuzzy ID3 reasoning method based on different T-operators is observed. Finally, the weighted and unweighted fuzzy reasoning methods based on different T-operators are compared.

4.1. Olive oil samples

Olives were collected from selected trees of the cultivars studied in this work: Ayvalik, Memecik, Kilis Yaglik, and Nizip Yaglik. The samples were collected in the 2002–2003, 2004–2005, and 2005–2006 harvest seasons. A total of 101 olive oil samples [47] were used for the experimental study. These samples came from different regions [North Aegean (33), South Aegean (53), Mediterranean (4), and South East (11)]. Detailed information about the chemical analysis of the samples is given in earlier studies [27, 47, 48]. PCA was applied in SPSS 20.0, and the partition coefficients and the fuzzy c-means algorithm were computed in MATLAB 2015. Software named OliveDeSoft was developed in Visual C# for the experimental study (Intel i7, 2.4 GHz, 4 GB RAM) [48]. The data were fuzzified with fuzzy c-means (FCM), and the partition coefficient determined the number of clusters [19, 20]. The calculated partition coefficient value for each cluster is given in a former study [27].

4.2. Performance measure and statistical tests

In a former study [27], principal component analysis was performed on this data set to explore the data structure. The results obtained from the chemical analyses clearly explain the geographic origin of the virgin olive oils, except for one region (Mediterranean), which has fewer samples than the other regions and is therefore not explained. The analysis was carried out in IBM SPSS 20. Because the chemical measurements are fuzzy, the fuzzy ID3 algorithm, which is based on fuzzy logic, is used for the classification in this study. The classical ID3 algorithm works with categorical variables, whereas fuzzy ID3 has the advantage of handling numerical variables through fuzzy variables: each numeric variable is converted to a fuzzy variable, and the fuzzy c-means algorithm is used for the fuzzification. The proposed approach uses eight different T-operators in the reasoning procedure. The performances of the standard fuzzy ID3 presented in [2, 27] and of the C4.5 [49] algorithm are also examined in the experimental study. A leave-one-out validation procedure was used to measure the performance of the algorithms, and the accuracy rate is used to compare the methods [13]. In the experimental study, the threshold value for the fuzzy decision tree is set to $\theta_r = 0.75$. The parameters of the parametric operators are fixed as Yager p = 2, Hamacher λ = 0.25, Dombi λ = 1, Dubois λ = 0.25, and Weber λ = 15 for the fuzzy reasoning procedure. The performances of the unweighted and weighted fuzzy reasoning approaches are also compared.
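The leave-one-out procedure can be sketched generically as below. The `train` and `predict` callables are hypothetical placeholders for any classifier, such as a fuzzy ID3 tree builder and its reasoning method; the toy majority-class classifier is only there to make the sketch runnable.

```python
def leave_one_out_accuracy(samples, labels, train, predict):
    """Train on all samples but one, test on the held-out sample, repeat."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train(train_x, train_y)
        if predict(model, samples[i]) == labels[i]:
            correct += 1
    return 100.0 * correct / len(samples)


# Toy stand-in classifier: always predicts the majority training class.
def train_majority(xs, ys):
    return max(set(ys), key=ys.count)


def predict_majority(model, x):
    return model


accuracy = leave_one_out_accuracy(
    [0.1, 0.2, 0.3, 0.9], ["a", "a", "a", "b"],
    train_majority, predict_majority,
)
```

With 101 samples, each correctly classified sample contributes 100 / 101 ≈ 0.99 percentage points, so, for example, an accuracy of 86.14% corresponds to 87 of the 101 held-out samples being classified correctly.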

Fuzzy reasoning method with nonparametric operators: The C4.5 algorithm also uses entropy as its splitting criterion. It is the improved version of the ID3 algorithm and was presented by Quinlan in 1993 to work on numerical data [27]. Its accuracy is 86.14%. The accuracy of the fuzzy ID3 algorithm with the reasoning method in [2] is also 86.14% [27].

The results for the nonparametric approaches in Table 2 show that three of the nonparametric operators reach the same accuracy as the C4.5 algorithm, whereas the accuracy obtained with the Zadeh T-operators is lower, at 82.18%.

| Algorithm | Accuracy rate (%) |
|---|---|
| C4.5 | 86.14 |
| Fuzzy ID3 reasoning with weighted Product-Sum (Umano) | 86.14 |
| Fuzzy ID3 reasoning with weighted Zadeh T-operators ($T_1$ & $T_1^*$) | 82.18 |
| Fuzzy ID3 reasoning with weighted Product-Sum ($T_2$ & $T_2^*$) | 86.14 |
| Fuzzy ID3 reasoning with weighted nonparametric Hamacher ($\lambda = 0$) ($T_3$ & $T_3^*$) | 86.14 |

Table 2.

The performance results of each algorithm for nonparametric operators [27].

Behavior of the weighted fuzzy ID3 reasoning method based on different T-operators: The Friedman aligned ranks test was used as a nonparametric statistical procedure to detect statistical differences among the results obtained for 20 threshold ($\theta_r$) values (Table 3).

| Algorithm | Rank |
|---|---|
| Zadeh | 3.02 |
| Umano | 6.40 |
| Product-Sum | 6.80 |
| Nonparametric Hamacher ($\lambda = 0$) | 6.88 |
| Yager | 3.55 |
| Hamacher | 6.80 |
| Dombi | 2.90 |
| Dubois | 3.22 |
| Weber | 5.42 |

Friedman aligned ranks: Total N = 20; Test Statistic = 76.396; Degrees of Freedom = 8; Asymptotic Sig. (2-sided test) = 0.000.

Table 3.

Friedman aligned ranks for weighted Fuzzy ID3 reasoning based on different T-operators.

The adjusted significance values of the pairwise comparisons for weighted fuzzy ID3 reasoning based on different T-operators [27] with 20 different thresholds (range = 0.71–0.90) are given in Table 4.

| | Weber | Zadeh | Yager | Hamacher | Nonparametric Hamacher ($\lambda = 0$) | Product sum | Umano | Dubois |
|---|---|---|---|---|---|---|---|---|
| Dombi | 0.128 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.002 | 1.000 |
| Dubois | 0.399 | 1.000 | 1.000 | 0.001 | 0.001 | 0.001 | 0.009 | |
| Umano | 1.000 | 0.004 | 0.036 | 1.000 | 1.000 | 1.000 | | |
| Product sum | 1.000 | 0.000 | 0.006 | 1.000 | 1.000 | | | |
| Nonparametric Hamacher ($\lambda = 0$) | 1.000 | 0.000 | 0.004 | 1.000 | | | | |
| Hamacher | 1.000 | 0.000 | 0.006 | | | | | |
| Yager | 1.000 | 1.000 | | | | | | |
| Zadeh | 0.201 | | | | | | | |

Table 4.

The results of pairwise comparisons for weighted Fuzzy ID3 reasoning based on different T-operators with 20 different thresholds (range = 0.71–0.90) via adjusted significance values.

The Friedman aligned ranks test reports a p-value of 0.000 (p < 0.001), which means that there are significant differences among the results. Pairwise comparisons were therefore performed, and their results are shown in Table 4. These nonparametric tests were performed in IBM SPSS 20.

Comparison of the weighted and unweighted fuzzy reasoning methods based on different T-operators: The accuracy rates obtained at different thresholds with the unweighted fuzzy reasoning method based on different T-operators are given in Table 5. The maximum value, 88.11%, is obtained with the Dombi T-operators at $\theta_r = 0.85$. Better results can thus be reached by using different threshold values.

| $\theta_r$ | Zadeh | Umano | Product-sum | Nonparametric Hamacher (λ = 0) | Yager (p = 2) | Hamacher (λ = 0.25) | Dombi (λ = 1) | Dubois (λ = 0.25) | Weber (λ = 15) |
|---|---|---|---|---|---|---|---|---|---|
| 0.71 | 85.15 | 85.15 | 85.15 | 84.16 | 85.15 | 85.15 | 85.15 | 82.18 | 51.48 |
| 0.72 | 85.15 | 85.15 | 85.15 | 84.16 | 85.15 | 85.15 | 85.15 | 82.18 | 51.48 |
| 0.73 | 85.15 | 85.15 | 85.15 | 84.16 | 85.15 | 85.15 | 85.15 | 82.18 | 85.15 |
| 0.74 | 85.15 | 85.15 | 85.15 | 84.16 | 85.15 | 85.15 | 85.15 | 82.18 | 85.15 |
| 0.75 | 86.14 | 86.14 | 86.14 | 85.15 | 86.14 | 86.14 | 86.14 | 83.16 | 86.14 |
| 0.76 | 86.14 | 86.14 | 86.14 | 85.15 | 86.14 | 86.14 | 86.14 | 83.16 | 86.14 |
| 0.77 | 84.16 | 84.16 | 84.16 | 83.17 | 84.16 | 84.16 | 84.16 | 82.18 | 84.16 |
| 0.78 | 82.18 | 82.18 | 82.18 | 81.19 | 82.18 | 82.18 | 82.18 | 82.18 | 82.18 |
| 0.79 | 86.14 | 84.16 | 84.16 | 85.15 | 84.16 | 86.14 | 84.16 | 84.16 | 84.16 |
| 0.80 | 86.14 | 84.16 | 84.16 | 85.15 | 84.16 | 86.14 | 84.16 | 84.16 | 84.16 |
| 0.81 | 86.14 | 84.16 | 84.16 | 85.15 | 84.16 | 86.14 | 84.16 | 84.16 | 84.16 |
| 0.82 | 86.14 | 84.16 | 84.16 | 85.15 | 84.16 | 86.14 | 84.16 | 84.16 | 84.16 |
| 0.83 | 86.14 | 84.16 | 84.16 | 85.15 | 84.16 | 86.14 | 84.16 | 84.16 | 84.16 |
| 0.84 | 87.13 | 87.13 | 87.13 | 86.14 | 87.13 | 87.13 | 87.13 | 87.13 | 87.13 |
| 0.85 | 87.13 | 86.14 | 86.14 | 86.14 | 86.14 | 87.13 | 86.14 | 88.11 | 86.14 |
| 0.86 | 87.13 | 86.14 | 86.14 | 86.14 | 86.14 | 87.13 | 86.14 | 86.14 | 86.14 |
| 0.87 | 86.14 | 83.17 | 83.17 | 85.15 | 83.17 | 86.14 | 83.17 | 86.14 | 83.17 |
| 0.88 | 85.15 | 36.63 | 36.63 | 84.16 | 36.63 | 85.15 | 36.63 | 36.63 | 36.63 |
| 0.89 | 84.16 | 37.62 | 37.62 | 83.17 | 37.62 | 86.14 | 37.62 | 35.64 | 37.62 |
| 0.90 | 84.16 | 42.57 | 42.57 | 83.17 | 42.57 | 83.17 | 42.57 | 40.59 | 42.57 |
| Ave. | 85.54 | 77.97 | 77.97 | 84.46 | 77.97 | 85.59 | 77.97 | 77.03 | 74.60 |

Table 5.

Accuracy rates (%) obtained at different thresholds for unweighted fuzzy reasoning based on different T-operators.

Maximum values are given as bold.

The accuracy rates obtained at different thresholds with the weighted fuzzy reasoning method based on different T-operators are given in Table 6. The Umano T-operators, the Product-Sum T-operators, the nonparametric Hamacher ($\lambda = 0$), and the Hamacher ($\lambda = 0.25$) T-operators reached the maximum accuracy rate of 88.12% at $\theta_r = 0.84$. While unweighted fuzzy reasoning based on the Dombi T-operators ($\lambda = 1$) reached its maximum accuracy rate of 88.11% at $\theta_r = 0.84$, weighted fuzzy reasoning based on the Dombi T-operators ($\lambda = 1$) reached 87.13% at $\theta_r = 0.84$.

| $\theta_r$ | Zadeh | Umano | Product-sum | Nonparametric Hamacher (λ = 0) | Yager (p = 2) | Hamacher (λ = 0.25) | Dombi (λ = 1) | Dubois (λ = 0.25) | Weber (λ = 15) |
|---|---|---|---|---|---|---|---|---|---|
| 0.71 | 81.19 | 85.15 | 85.15 | 85.15 | 84.16 | 85.15 | 79.21 | 83.17 | 85.15 |
| 0.72 | 82.18 | 85.15 | 85.15 | 85.15 | 84.16 | 85.15 | 78.22 | 83.17 | 85.15 |
| 0.73 | 82.18 | 85.15 | 85.15 | 85.15 | 84.16 | 85.15 | 78.22 | 83.17 | 85.15 |
| 0.74 | 82.18 | 85.15 | 85.15 | 85.15 | 84.16 | 85.15 | 78.22 | 83.17 | 85.15 |
| 0.75 | 82.18 | 86.14 | 86.14 | 86.14 | 85.15 | 86.14 | 78.22 | 84.16 | 86.14 |
| 0.76 | 82.18 | 86.14 | 86.14 | 86.14 | 85.15 | 86.14 | 78.22 | 84.16 | 86.14 |
| 0.77 | 80.20 | 84.16 | 84.16 | 84.16 | 83.17 | 84.16 | 81.19 | 83.17 | 84.16 |
| 0.78 | 79.20 | 82.18 | 83.17 | 83.17 | 79.21 | 83.17 | 81.19 | 80.20 | 80.20 |
| 0.79 | 81.18 | 84.16 | 85.15 | 85.15 | 79.21 | 85.15 | 81.19 | 81.19 | 80.20 |
| 0.80 | 81.18 | 84.16 | 85.15 | 85.15 | 79.21 | 85.15 | 81.19 | 81.19 | 80.20 |
| 0.81 | 80.20 | 84.16 | 85.15 | 85.15 | 79.21 | 85.15 | 81.19 | 81.19 | 80.20 |
| 0.82 | 80.20 | 85.15 | 85.15 | 85.15 | 79.21 | 85.15 | 81.19 | 81.19 | 80.20 |
| 0.83 | 80.20 | 85.15 | 85.15 | 85.15 | 79.21 | 85.15 | 81.19 | 81.19 | 80.20 |
| 0.84 | 80.20 | 88.12 | 88.12 | 88.12 | 80.20 | 88.12 | 87.13 | 83.17 | 81.19 |
| 0.85 | 80.20 | 87.13 | 87.13 | 87.13 | 80.20 | 87.13 | 86.14 | 82.18 | 81.19 |
| 0.86 | 73.27 | 85.15 | 85.15 | 85.15 | 78.22 | 85.15 | 82.18 | 82.18 | 81.19 |
| 0.87 | 72.28 | 36.64 | 36.64 | 66.34 | 76.24 | 36.63 | 36.63 | 35.64 | 76.24 |
| 0.88 | 75.25 | 36.64 | 36.64 | 36.64 | 74.26 | 36.67 | 36.63 | 35.64 | 76.24 |
| 0.89 | 75.27 | 35.64 | 35.64 | 35.64 | 76.24 | 35.64 | 34.65 | 32.67 | 78.22 |
| 0.90 | 74.26 | 41.58 | 41.58 | 41.58 | 77.23 | 41.58 | 38.61 | 38.61 | 78.22 |
| Ave. | 79.56 | 75.76 | 75.94 | 77.36 | 80.65 | 75.95 | 72.27 | 73.10 | 81.74 |

Table 6.

Accuracy rates (%) obtained at different thresholds with weighted fuzzy reasoning based on different T-operators.

Maximum values are given in bold.
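The operator families compared in Tables 5 and 6 differ in how they combine two membership degrees. As a minimal sketch, the following Python functions use the standard textbook forms of the Zadeh (min/max) operators and of the Hamacher family with parameter λ. The function names are illustrative assumptions, and the forms actually used in this chapter are the ones given in its definitions of the T-operators. For λ = 1 the Hamacher T-norm reduces to the algebraic product, and λ = 0 gives the nonparametric Hamacher product.

```python
def zadeh_and(a, b):
    """Zadeh T-norm: minimum of two membership degrees."""
    return min(a, b)


def zadeh_or(a, b):
    """Zadeh T-conorm: maximum of two membership degrees."""
    return max(a, b)


def hamacher_and(a, b, lam=0.0):
    """Hamacher T-norm (textbook form): ab / (lam + (1 - lam)(a + b - ab))."""
    denom = lam + (1.0 - lam) * (a + b - a * b)
    # The denominator vanishes only for lam = 0 and a = b = 0; the limit is 0.
    return 0.0 if denom == 0.0 else (a * b) / denom


def hamacher_or(a, b, lam=0.0):
    """Hamacher T-conorm (textbook form), dual of hamacher_and."""
    denom = 1.0 - (1.0 - lam) * a * b
    # The denominator vanishes only for lam = 0 and a = b = 1; the limit is 1.
    return 1.0 if denom == 0.0 else (a + b - (2.0 - lam) * a * b) / denom


if __name__ == "__main__":
    # Hypothetical membership degrees of one sample in two fuzzy conditions.
    a, b = 0.8, 0.6
    print("Zadeh AND:", zadeh_and(a, b))
    for lam in (0.0, 0.25, 1.0):
        print("Hamacher AND, lambda =", lam, ":", hamacher_and(a, b, lam))
```

Because min is the largest T-norm, the Zadeh operators always return a value at least as high as any Hamacher T-norm for the same inputs, which is one reason the operator choice can move a rule across a fixed threshold.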

The performances of weighted and unweighted fuzzy reasoning were compared separately for each T-operator with the Wilcoxon signed-rank test. The performances of unweighted and weighted fuzzy reasoning differ significantly for the Zadeh (p < 0.001), Yager (p < 0.001), Dombi (p < 0.001), Dubois (p < 0.05), and Weber (p < 0.001) T-operators.
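The paired comparison can be illustrated with a small, self-contained Python sketch of the Wilcoxon signed-rank test. It is not the authors' statistical procedure. It uses only the Zadeh accuracies printed in Tables 5 and 6 for θ_r = 0.80 to 0.90 (11 of the 20 thresholds), it assumes that the first column of Table 5 is the Zadeh column as in Table 6, and it computes an exact two-sided p-value by enumerating all sign assignments. The function names are hypothetical.

```python
from itertools import product

# Zadeh accuracies (%) for theta_r = 0.80 ... 0.90, copied as printed.
unweighted = [86.14, 86.14, 86.14, 86.14, 87.13, 87.13,
              87.13, 86.14, 85.15, 84.16, 84.16]   # Table 5
weighted = [81.18, 80.20, 80.20, 80.20, 80.20, 80.20,
            73.27, 72.28, 75.25, 75.27, 74.26]     # Table 6


def average_ranks(values):
    """Rank values in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(x, y):
    """Return (W+, W-, exact two-sided p-value) for paired samples x and y."""
    # Round to the two printed decimals and drop zero differences.
    diffs = [round(a - b, 2) for a, b in zip(x, y) if round(a - b, 2) != 0]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    observed = min(w_plus, w_minus)
    total = sum(ranks)
    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        plus = sum(r for s, r in zip(signs, ranks) if s)
        if min(plus, total - plus) <= observed + 1e-9:
            extreme += 1
    return w_plus, w_minus, extreme / 2 ** len(ranks)


w_plus, w_minus, p_value = wilcoxon_signed_rank(unweighted, weighted)
print(w_plus, w_minus, p_value)
```

On this subset every difference favours the unweighted method, so the smaller rank sum is zero. The enumeration grows as 2^n, which is still manageable for the 20 thresholds used in the chapter; a normal approximation is the usual alternative for larger samples.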

When the performance of each T-operator is averaged over the 20 thresholds (0.71–0.90), Hamacher (λ = 0.25) has the highest value for the unweighted fuzzy reasoning approach with 85.59%, and Weber (λ = 15) has the highest value for the weighted fuzzy reasoning approach with 81.74%.

5. Conclusion

Geographical classification of olive oil is an important topic. It has been crucial for human health from past to present, and it is central to the traceability of designation-of-origin olive oil. In our pioneering study, we addressed a geographic classification system for olive oil. In this chapter, chemical measurements were used for the experimental study. Because chemical measurements contain imprecise information, the fuzzy ID3 classifier was selected for the classification of olive oil samples. In addition, a fuzzy ID3 reasoning method based on T-operators has been suggested, and we ran experiments to evaluate the performance of the proposed fuzzy reasoning method on the geographic classification problem. We also propose a weighted fuzzy reasoning approach based on T-operators. Three nonparametric operators [Product-Sum_Umano, Product-Sum, and Nonparametric Hamacher (λ = 0)] reach the same performance as the C4.5 algorithm, whereas the accuracy obtained with Zadeh T-operators is lower, at 82.18%. We then checked the performance of the parametric operators. A statistical procedure was applied to detect differences among the results for the 20 threshold (θ_r) values. There are significant differences between the results of the unweighted and weighted fuzzy reasoning approaches. The weighted fuzzy reasoning approach based on Umano T-operators, Product-Sum T-operators, Nonparametric Hamacher (λ = 0), and Hamacher (λ = 0.25) reached the maximum accuracy rate of 88.12% at θ_r = 0.84. We therefore claim that using different parameters and weights for each rule can yield better reasoning performance.

Acknowledgments

The authors would like to thank Erden Kantarcı for his valuable support and Mrs. Ummuhan Tibet and Dr. Aytac Gumuskesen for allowing us to use the data set.

References

  1. 1. Chang RLP, Pavladis T. Fuzzy decision tree algorithms. IEEE Transactions on Systems, Man, and Cybernetics. 1977;7:28-35. DOI: 10.1109/TSMC.1977.4309586
  2. 2. Umano M, Okamoto H, Hatono I, Tamura H, Kawachi F, Umedzu S, Kinoshita J. Fuzzy decision trees by fuzzy ID3 algorithm and its application to diagnosis systems. In: Proceedings of the 3rd IEEE Conference on Fuzzy Systems; 26-29 June 1994; Orlando, FL, USA; 1994. pp. 2113-2118. DOI: 10.1109/FUZZY.1994.343539
  3. 3. Yuan Y, Shaw MJ. Induction off fuzzy decision trees. Fuzzy Sets and Systems. 1995;69:125-139. DOI: 10.1016/0165-0114(94)00229-Z
  4. 4. Aparicio R, Aparicio-Ruiz R. Chemometrics as an aid in authentication. In: Jee M, editor. Oils and Fats Authentication. Oxford, United Kingdom: Blackwell Publishing; and Boca Raton, FL: CRC Press; 2002. pp. 156-180
  5. 5. Marini F. Artificial neural networks in foodstuff analyses: Trends and perspectives A review. Analytica Chimica Acta. 2009;635(2):121-131. DOI: 10.1016/j.aca.2009.01.009
  6. 6. Harrington PB. Fuzzy multivariate rule-building expert systems: Minimal neural networks. Journal of Chemometrics. 1991;5:467-486. DOI: 10.1002/cem.1180050506
  7. 7. Harrington PB. Minimal neural networks: Differentiation of classification entropy. Chemometrics and Intelligent Laboratory Systems. 1993;19:143-154. DOI: 10.1016/0169-7439(93)80098-3
  8. 8. Harrington PB, Kister J, Artaud J, Dupuy N. Automated principal component-based orthogonal signal correction applied to fused near infrared-mid infrared spectra of French olive oils. Analytical Chemistry. 2009;81(17):7160-7169. DOI: 10.1021/ac900538n
  9. 9. Rezzi S, Axelson DE, Hėberger K, Reniero F, Marini C, Guillou C. Classification of olive oils using high throughput flow “H” NMR fingerprinting with principal component analysis, linear discriminant analysis and probabilistic neural networks. Analytica Chimica Acta. 2005;552(1):13-24. DOI: 10.1016/j.aca.2005.07.057
  10. 10. Petrakis PV, Agiomyrgianaki A, Christophoridou S, Spyros A, Dais P. Geographical characterization of Greek virgin olive oils (Cv. Koroneiki) using “H” and “P NMR” fingerprinting with canonical discriminant analysis and classification binary trees. Journal of Agricultural and Food Chemistry. 2008;56:3200-3207. DOI: 10.1021/jf072957s
  11. 11. Marini F, Balestrieri F, Bucci R, Magrý AD, Magrý AL, Marini D. Supervised pattern recognition to authenticate Italian extra virgin olive oil varieties. Chemometrics and Intelligent Laboratory Systems. 2004;73:85-93. DOI: 10.1016/j.chemolab.2003.12.007
  12. 12. Cichelli A, Pertesana GP. High performance liquid chromotographic analysis of chlorophylls, pheophytins and catotenoids in virgin olive oils: chemometric approach to variety classification. Journal of Chromatography A. 2004;1046:141-146. DOI: 10.1016/j.chroma.2004.06.093
  13. 13. Gurdeniz G, Ozen B, Tokatlı F. Comparison of fatty acid profiles and mid-infrared spectral data for classification of olive oils. European Journal of Lipid Science and Technology. 2010;112:218-226. DOI: 10.1002/ejlt.200800229
  14. 14. Zadeh LA. Fuzzy sets. Information and Control. 1965;8:338-353. DOI: 10.1016/S0019-9958(65)90241-X
  15. 15. JSR J, Sun CT, Mizutani E. Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence. Upper Saddle River: Prentice Hall; 1997
  16. 16. Dunn JC. A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. Journal of Cybernetics. 1973;3:32-57. DOI: 10.1080/01969727308546046
  17. 17. Bezdek JC. Pattern Recognition with Fuzzy Objective Function Algorithms. Vol. 256. New York: Plenum; 1981
  18. 18. Bezdek JC. Cluster validity with fuzzy numbers. Journal of Cybernetics. 1974:58-73. DOI: 10.1080/01969727308546047
  19. 19. Bezdek JC. Numerical taxonomy with fuzzy sets. Journal of Mathematical Biology. 1974;1:57-71. DOI: 10.1007/BF02339490
  20. 20. Dunn J. Well separated clusters and optimal fuzzy partitions. Journal of Cybernetics. 1974;4:95-104. DOI: 10.1080/01969727408546059
  21. 21. Nakashima T, Schaefer G, Yokota Y. A weighted fuzzy classifier and its application to image processing tasks. Fuzzy Sets and Systems. 2007;158(3):284-294. DOI: 10.1016/j.fss.2006.10.011
  22. 22. Sanz J, Galar M, Jurio A, Brugos A, Pagola M, Bustince H. Medical diaognosis of cardiovascular diseases using an interval-valued fuzzy rule based classification system. Applied Soft Computing. 2014;20:103-111. DOI: 10.1016/j.asoc.2013.11.009
  23. 23. Cordón O, Jesus MJ, Herrera F. A proposal on reasoning methods in fuzzy rule-based classification systems. International Journal of Approximate Reasoning. 1999;20:21-45. DOI: 10.1016/S0888-613X(00)88942-2
  24. 24. Ishibuchi H, Nozaki K, Tanaka H. Distributed representation of fuzzy rules and its application to pattern classification. Fuzzy Sets and Systems. 1992;52:21-32. DOI: 10.1016/0165-0114(92)90032-Y
  25. 25. Quinlan JR. Induction of decision trees. Machine Learning. 1986;1:81-106. DOI: 10.1007/BF00116251
  26. 26. Sanz JA, Bustince H, Fernández A, Herrera F. IIVFDT: Ignorance functions based interval-valued fuzzy decision tree with genetic tuning. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems. 2012;20(2):1-30. DOI: 10.1142/S0218488512400132
  27. 27. Nasibov E, Kantarcı Savaş S, Vahaplar A, Kınay AÖ. A survey on geographic classification of virgin olive oil with using T-operators in fuzzy decision tree approach. Chemometrics and Intelligent Laboratory Systems. 2016;155:86-96. DOI: 10.1016/j.chemolab.2016.04.004
  28. 28. Elkano M, Galar M, Sanz JA, Fernández A, Barrenechea E, Herrera F. Enhancing multiclass classification in FARC-HD fuzzy classifier: On the synergy between n-dimensional overlap functions and decomposition strategies. IEEE Transactions on Fuzzy Systems. 2015;23(5):1562-1580. DOI: 10.1109/TFUZZ.2014.2370677
  29. 29. Menger K. Statistical metrics. Proceedings of the National Academy of Sciences of the United States of America. 1942;28:535-537
  30. 30. Schweizer B, Sklar A. Probabilistic Metric Spaces. Amsterdam: North-Holland; 1973
  31. 31. Höhle U. Probabilistic uniformization of fuzzy topologies. Fuzzy Sets and Systems. 1978;1:311-332. DOI: 10.1016/0165-0114(78)90021-0
  32. 32. Alsina C, Trillas E, Valverde L. On some logical connectives for fuzzy set theory. Journal of Mathematical Analysis and Applications. 1983;93:15-26. DOI: 10.1016/0022-247X(83)90216-0
  33. 33. Gupta MM, Qi J. Theory of T-norms and fuzzy inference methods. Fuzzy Sets and Systems. 1991;40:431-450. DOI: 10.1016/0165-0114(91)90171-L
  34. 34. Marsala C, Bouchon-Meunier B. Choice of a method for the construction of fuzzy decision trees (Published in conference proceedings style.). In: Fuzzy Systems (FUZZ’03) The 12th IEEE International Conference, 1, 584-589. May 2003. pp. 23-28. DOI: 10.1109/FUZZ.2003.1209429
  35. 35. Pedrycz W, Sasnowski ZA. C-Fuzzy decision trees. IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews. 2005;35(4):498-511. DOI: 10.1109/TSMCC.2004.843205
  36. 36. Ishibuchi H, Yamamoto T. Rule weight specification in fuzzy rule based classification systems. IEEE Transactions on Fuzzy Systems. 1992;13(4):428-435. DOI: 10.1109/TFUZZ.2004.841738
  37. 37. Fernandez A, Almansa E, Herrera F. Chi-Spark-RS: An spark-built evolutionary fuzzy rule selection algorithm in imbalanced classification for big data problems (Published in conference proceedings style.). In: Fuzzy Systems (FUZZ’17) IEEE International Conference, 1, 1-6; 9-12 July 2017. DOI: 10.1109/FUZZ-IEEE.2017.8015520
  38. 38. Ishibuchi H, Nakashima T. Effect of rule weights in fuzzy rule weights in fuzzy rule based classification systems. IEEE Transactions on Fuzzy Systems. 2001;9(4):506-515. DOI: 10.1109/91.940964
  39. 39. Ishibuchi H, Yamamoto T, Nakashima T. Fuzzy data mining: Effect of fuzzy discretization. In: Proceeding 1st IEEE International Conference Data Mining; November 2001; San Jose, CA. pp. 241-248. DOI: 10.1109/ICDM.2001.989525
  40. 40. Hong T-P, Kuo C-S, Chi SC. Trade off between computation time and number of rules for fuzzy mining from quantitative data. International Journal of Uncertainty, Fuzziness and Knowlege-Based Systems. 2001;9(5):587-604. DOI: 10.1142/S0218488501001071
  41. 41. Weber S. A general concept of fuzzy connectives, negations and implications based on t-norms and t-conorms. Fuzzy Sets and Systems. 1983;11:115-134
  42. 42. Bandler W, Kohout L. Fuzzy power sets and fuzzy implication operators. Fuzzy Sets and Systems. 1980;4:13-30
  43. 43. Oussallah M. On the use of Hamacher’s t-norms family for information aggregation. Information Sciences. 2003;153:107-154
  44. 44. Yager RR. On a general class of fuzzy connectives. Fuzzy Sets and Systems. 1980;4:235-242
  45. 45. Dombi J. A general class of fuzzy operators, the De Morgan class of fuzzy operators and fuzziness induced by fuzzy operators. Fuzzy Sets and Systems. 1982;8:149-163
  46. 46. Dubois D, Prade H. New results about properties and semantics of fuzzy set-theroetic operators. In: Wang PP, Chang SK, editors. Fuzzy Sets. New York: Plenum Press; 1986. pp. 59-75
  47. 47. Gumuşkesen AS, Yemiscioglu F. Project Name: Türkiye'deki Zeytin Çeşitlerinin ve Zeytinyağlarının Bölgesel Olarak Karakterizasyonu (2007) Project Number: 2005/BİL/020[Internet]. Available from: http://food.ege.edu.tr/d-83/akademikyapi.html [Accessed: February 24, 2016]
  48. 48. Kantarcı S, Vahaplar A, Kınay AÖ, Nasiboğlu E. Influence of different T-norm and T-conorm operators in fuzzy decision trees. In: Proceedings of 2015 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). 2015. pp. 1-6
  49. 49. Quinlan JR. C4.5: Programs for Machine Learning. San Mateo, California: Morgan Kaufmann; 1993
