Ten students assessed under five academic subjects.
A multicriteria (MC) problem usually consists of a set of predetermined alternatives or subjects to be analyzed, which are described under a finite number of criteria. MC problems arise in many application areas. There are three common goals in solving such problems: ranking, sorting, or grouping the alternatives according to their overall scores. Most MC methods require the criteria weights to be combined mathematically with the values of the criteria to find the overall score of each alternative. This chapter provides an overview of the practical consideration of evaluators’ credibility or superiority in calculating the criteria weights and overall scores of the alternatives. To show how the degree of credibility of evaluators can be practically considered in solving a real problem, a numerical example on the evaluation of students’ academic performance is provided in the Appendix at the end of the chapter. The degree of credibility of the teachers who participated in weighting the academic subjects was determined objectively, and rank-based criteria weighting methods were used in the example. Including the degree of credibility of the evaluators who participate in solving multicriteria problems would make the results more realistic and accurate.
- multicriteria problem
Multicriteria decision-making (MCDM) is now considered a discipline of knowledge in its own right, and it has been expanding very fast. Basically, it is about how to make a decision when the issue at hand involves multiple criteria. The MC problem consists of two main components, alternatives and criteria. In real-life situations, the alternatives are options, organizations, people, or units to be analyzed, which are described under a finite set of criteria or attributes. If the number of alternatives is finite and known, the task is to select the best or the optimal alternative, to rank the alternatives according to their overall quality or performance, or to sort or group the alternatives based on certain measurements or values. In this case, the MC problem is usually called a multiattribute decision-making (MADM) problem. The MADM methods are utilized to handle discrete MCDM problems. This chapter focuses on MADM problems or, more generally, MCDM problems that have a finite number of predetermined alternatives described by several criteria or attributes. MCDM problems can be found in various sectors.
1.1 Examples of multicriteria decision-making problems
Selection problems are of an MCDM type, and we face simple versions of them almost every day, for example, when we want to select a dress or a shirt to wear. The decision to choose a dress or shirt is based on certain attributes or factors, such as the function (office, leisure, or business), color preference, and style or fashion. Here, the types of dress or cloth are the alternatives, while all factors that form the basis of evaluation are the attributes. Another example is when we want to choose the best location to set up projects such as housing, industrial or agricultural activities, recreation centers, hotels, and so on. Many factors or criteria that may conflict with each other should be considered by the decision-makers. Selecting the best candidate for a position, which can be conducted in many settings such as face-to-face interviews or online tests, is also an MCDM problem since the selection is based on certain requirements. Selecting employees in different organizations, with different scopes of jobs and different requirements imposed by each organization, can also be categorized as an MCDM problem.
Other examples are the selection of the best supplier for a manufacturing firm [3, 4], the selection of the best personal computer, and the selection of a suitable e-learning system to be implemented in an educational institution. These studies focused on selecting the best alternative from a finite number of alternatives that were described under a few evaluation criteria. They share the same main issue, namely the relative importance or weights of the evaluation criteria toward the overall performance of the alternatives under study. The studies provide ways to find the weights subjectively and to aggregate the weights when a group of decision-makers is involved in judging the importance of the criteria.
In addition, an evaluation of a program, for example, is usually conducted after identifying the aspects of the program to evaluate. There may be many programs to be evaluated under several aspects, with the involvement of one evaluator or a group of evaluators. In a different situation, there may be only one program to be evaluated under several aspects by one or many evaluators. Besides, many other evaluations are usually performed in the presence of many criteria, such as the evaluation of employees’ performance, the evaluation of learning approaches, and the evaluation of students’ performance. In a study on the evaluation of students’ academic performance in primary schools, five academic subjects were assumed to contribute differently toward the overall performance of the students. A few experienced teachers were asked to evaluate the degree of importance of the subjects. The resulting weights of the academic subjects were incorporated in finding the overall academic performance of the year-six students in one selected primary school in the northern part of Malaysia. For the purpose of illustrating the practical consideration of the credibility of the evaluators, the problem of evaluating students’ academic performance is extended by considering the credibility of the teachers who participated in weighting the academic subjects. The detailed discussion is available in the Appendix at the end of the chapter.
1.2 Credibility of the evaluators
Referring to those examples of MCDM scenarios, decision-maker(s) or evaluator(s) are involved in many stages of the evaluation process in searching for the optimal solution. As all MCDM problems have two main components, the alternatives and the criteria or attributes, the decision-maker(s) or evaluator(s) would be involved in at least two tasks: deciding the quality of each alternative based on each of the criteria and finding the relative importance of the criteria toward the overall performance of the alternatives. As usually arises in solving MCDM problems, the criteria contribute at different levels of importance, and this should be a concern to the decision-maker(s) or evaluator(s). The criteria or attributes of the units to be analyzed should not be assumed to contribute equally toward the overall quality of the alternatives.
Besides the challenge of finding suitable evaluator(s) or decision-maker(s), who might come with different backgrounds and experience, the evaluators also come with different levels of superiority or credibility that should be taken into consideration. This issue should be taken seriously because the results may be misleading if those who are involved in the evaluation or judgment do not have enough experience or are not credible enough to give judgment regarding the MCDM problem under study. Moreover, the results may differ among the evaluators if the evaluators are at different levels of superiority. Therefore, the credibility of the expert(s), evaluator(s), or decision-maker(s) who are involved in assessing the quality of the alternatives or the relative importance of the attributes should be taken into consideration.
Webster’s New World College Dictionary defines credibility as the quality of being trustworthy or believable. Credibility is also associated with good reputation, honor, and the standing of someone who stands out in the professional community. Meanwhile, professionalism refers to the competence or skill expected of a professional. In other words, a professional is someone who is skilled, reliable, and entirely responsible for carrying out their duties and profession. This definition of professionalism resembles that of credibility, so the two are like two sides of a coin that cannot be separated. For the purpose of assessment or evaluation, professionalism and credibility are the competencies of assessors in carrying out their functions and roles well, with full commitment, trustworthiness, and accountability.
It is normal that the assessors have different levels of credibility, and their credibility should be considered together with their assessments or evaluations. This chapter provides an overview of the current work on how the credibility of the decision-maker(s) or evaluator(s) could be considered, especially in evaluating the importance of the criteria or attributes of any MCDM problem under investigation, how to quantify the credibility of those people, and how those quantitative values could be incorporated in finding the overall scores of the alternatives. This issue falls under the concept of group decision-making and extends it with the consideration of the degree of superiority or credibility of the decision-maker(s) or evaluator(s). By considering the different relative importance of the attributes together with the different levels of credibility or superiority of those who are involved in finding the optimal solution of the MCDM problem, the solution of the problem would be more realistic, accurate, and representative of the true setting of the problem.
To achieve its objective, the chapter is organized as follows. The next section describes the basic notation for this chapter. Section 3 discusses the concept of weights and the related methods, particularly the rank-based weighting methods. Section 4 discusses the aggregation of criteria weights and the values of criteria. Section 5 explains how to aggregate the credibility of the evaluators who are involved in finding the weights or relative importance of the criteria. Furthermore, Section 5 also illustrates two approaches to aggregating the degree of credibility of evaluators in order to find the overall performance of the alternatives and their rankings. Section 6 suggests a few ways to quantify the credibility of the evaluators. The conclusion of the chapter is in Section 7, which is followed by a list of all references of the chapter. A numerical example is provided in the Appendix at the end of the chapter.
2. Basic notation
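Let $A = \{a_1, a_2, \dots, a_m\}$ be a set of $m$ alternatives that are described under $n$ criteria, $C = \{c_1, c_2, \dots, c_n\}$. Let $x_{ij}$ be the value of alternative $a_i$ under criterion $c_j$, and let $w_j$ be the weight of criterion $c_j$, where $i = 1, \dots, m$ and $j = 1, \dots, n$.
In relation to the numerical example in the Appendix, the students are the alternatives, while the academic subjects are the criteria. So, $A = \{a_1, \dots, a_{10}\}$ represents a set of 10 students that are described under five academic subjects, $C = \{c_1, \dots, c_5\}$, and $x_{ij}$ is the score of student $a_i$ under academic subject $c_j$, where $i = 1, \dots, 10$ and $j = 1, \dots, 5$. The weights of the criteria, $w_j$, refer to the relative importance of the academic subjects toward the composite or final score of each student.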
3. Weights of criteria
For finding the relative importance of the criteria, or simply the weights of the criteria, there are many methods available in the literature, which are classified into two main approaches: objective methods and subjective methods. The objective methods are data-driven methods in which the values of the criteria should be available prior to the evaluation of the criteria’s relative importance. Based on the criteria’s values, proxy measures such as standard deviation, correlation, variance, range, coefficient of variation, and entropy [13, 14, 15, 16, 17] are calculated to represent the criteria weights. The concept of entropy was introduced in communication theory, where it usually refers to uncertainty and is often used to quantify the information in a message. In the MCDM domain, the entropy measure has become one of the proxy measures of criterion weights. In other words, these objective methods produce weights of criteria based on the intrinsic information of the criteria, and they do not require evaluators to do the criteria weighting. No further discussion is included in this chapter because objective weights are not the focus of the chapter.
3.1 Rank-based weighting methods
This subsection focuses on rank-based weighting methods [18, 19], as these methods are used in this chapter to illustrate the practical consideration of evaluators’ credibility in evaluating the relative importance of criteria for some real-life multicriteria problems. These methods are very easy to use yet perform well. Three popular rank-based methods are rank-sum (RS), rank reciprocal (RR), and rank order centroid (ROC). The mathematical representations of the three methods are as follows.
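Suppose $r_j$ is the rank of criterion $j$ given by an evaluator, where $r_j$ is an integer with possible values from 1 to $n$, and $n$ is the number of criteria. The ranks are transformed into the weights $w_j$ by the RS, RR, and ROC methods, respectively, as follows:

$$w_j(\mathrm{RS}) = \frac{n - r_j + 1}{\sum_{k=1}^{n} (n - r_k + 1)} \tag{1}$$

$$w_j(\mathrm{RR}) = \frac{1/r_j}{\sum_{k=1}^{n} 1/r_k} \tag{2}$$

$$w_j(\mathrm{ROC}) = \frac{1}{n} \sum_{k=r_j}^{n} \frac{1}{k} \tag{3}$$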
Referring to the numerical example in the Appendix, there are five criteria representing five academic subjects; $r_j$ is the rank of academic subject $j$, where $r_j$ is an integer with possible values from 1 to 5, while the values $r_1, \dots, r_5$ represent the ranks of academic subjects 1 to 5 that can be transformed into the weights of academic subjects 1 to 5, $w_1, \dots, w_5$, respectively.
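To make the conversion from ranks to weights concrete, the following minimal sketch implements the standard textbook forms of the RS, RR, and ROC methods. The function names and the example ranks are illustrative assumptions, not values taken from the Appendix.

```python
def rank_sum(ranks):
    """Rank-sum (RS): weight proportional to (n - r + 1)."""
    n = len(ranks)
    raw = [n - r + 1 for r in ranks]
    total = sum(raw)
    return [v / total for v in raw]


def rank_reciprocal(ranks):
    """Rank reciprocal (RR): weight proportional to 1 / r."""
    raw = [1 / r for r in ranks]
    total = sum(raw)
    return [v / total for v in raw]


def rank_order_centroid(ranks):
    """Rank order centroid (ROC): (1/n) * sum of 1/k for k = r..n."""
    n = len(ranks)
    return [sum(1 / k for k in range(r, n + 1)) / n for r in ranks]


# Hypothetical ranks of five criteria (1 = most important).
ranks = [1, 2, 3, 4, 5]
rs = rank_sum(ranks)
rr = rank_reciprocal(ranks)
roc = rank_order_centroid(ranks)

for name, weights in (("RS", rs), ("RR", rr), ("ROC", roc)):
    print(name, [round(w, 3) for w in weights])
```

Each method returns weights that sum to one, so no further normalization is needed before aggregation.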
Many studies have examined the performance of these rank-based methods as criteria weighting methods. For example, a simulation experiment was conducted to investigate the performance of the three rank-based weighting methods (RS, RR, ROC) and equal weights (EW), where the data were generated on a random basis. Three performance measures of the methods were “hit rate,” “average value loss,” and “average proportion of maximum value range achieved.” The results show that ROC was the best technique in most cases and on every measure. Another study on these three rank-based weighting techniques and EW concludes that the rank-based methods have higher correlations with the so-called true weights than EW.
In another study, the EW, RS, and ROC methods were compared to the direct rating and ratio weight methods. Basically, the direct rating method is a simple type of weighting approach in which the decision-maker or the evaluator must rate all the criteria according to their importance. The evaluator can directly quantify their preference for the criteria. The rating does not constrain the decision-maker’s responses since it is possible for the evaluator to alter the importance of one criterion without adjusting the weight of another. The comparison was conducted under the condition that the evaluators’ judgments of the criteria weights are uncertain and subject to random errors. The results show that direct rating tends to give better decision quality when the uncertainty is small, while ROC provides results comparable to the ratio weights when a large degree of error is present. Please note that the ratio weight method requires the evaluators to first rank the criteria based on their importance. The evaluators then allocate a certain value, such as 10, to the least important attribute, and the rest of the attributes are judged as multiples of 10. The weight of a criterion is obtained by dividing the criterion’s value by the sum of all attributes’ values.
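The ratio weight calculation described above can be sketched as follows. The point values are hypothetical and only illustrate the normalization step; the function name is an assumption.

```python
def ratio_weights(points):
    """Normalize ratio judgments (multiples of the least important value)."""
    total = sum(points)
    return [p / total for p in points]


# Hypothetical judgments: the least important criterion gets 10 points,
# and the others are judged as multiples of 10.
points = [40, 30, 20, 10]
weights = ratio_weights(points)
print(weights)
```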
The superiority of ROC over other rank-based methods was subsequently confirmed under different simulation conditions. An investigation of the RS, RR, and ROC weighting methods was also carried out by changing the number of criteria from two to seven. It was found that ROC gives the largest gap between the weights of the most important criterion and the least important one. RS provides the flattest weight function, which is linear in form. For RR, the weight descends most steeply from the most important criterion to the second most important one, and then the function becomes flatter. In relation to rank-based weighting methods, another method, called the generalized sum of ranks (GRS), was proposed. Further investigation was carried out in which the performance of GRS was compared to RS, RR, and ROC using a simulation experiment. The result of the investigation shows that GRS has a similar performance to ROC.
Based on the previous discussion, it can be concluded that the three rank-based weighting methods, RS, RR, and ROC, have good features, especially the ROC method. Therefore, these rank-based methods are used in the current study to illustrate how to include the degree of credibility of the evaluators who are involved in ranking the importance of the criteria. Furthermore, converting the ranks into weight values is not difficult, and the related formulas are given in Equations (1), (2), and (3).
3.2 Other subjective weighting methods
Other subjective weighting methods are the analytic hierarchy process (AHP) [4, 27, 28], swing methods [29, 30], the graphical weighting (GW) method, and the Delphi method. The AHP technique was introduced in 1980. It is a very popular MC approach, and it is carried out by conducting pairwise comparisons of the importance of each pair of criteria. A prioritization procedure is implemented to derive a corresponding priority vector, which represents the criteria weights. If the judgments are consistent, all prioritization procedures give the same results. If the judgments are inconsistent, prioritization procedures will provide different priority vectors. Nevertheless, AHP is widely criticized for being a tedious process, especially when there is a significant number of criteria or alternatives.
For the swing method, the evaluator must first identify an alternative with the worst consequences on all attributes. The evaluator can change one of the criteria from its worst consequence to its best. Then, the evaluator is asked to choose the criterion that he/she would most prefer to change from its worst to its best level; the criterion with the most preferred swing is the most important, and 100 points are allocated to it.
The GW method begins with a horizontal line that is marked with a series of numbers, such as (9-7-5-3-1-3-5-7-9). The evaluator is expected to place a mark on the horizontal line that represents the relative importance of a criterion, on the basis that a criterion is either more, equally, or less important than another criterion by a factor of 1 to 9. Then, a decision matrix is built as a pairwise comparison matrix. A quantitative weight for a criterion can be calculated by taking the sum of each row, and then the row sums are normalized to obtain an overall weight vector. The GW method enables the evaluators to express preferences in a purely visual way. However, GW is sometimes criticized since it allows evaluator(s) to assign weights in a more relaxed manner.
A Delphi subjective weighting method requires one focus group of evaluators to evaluate the relative importance of the criteria. The evaluators remain anonymous to each other, which can reduce the risk of personal effects or individual bias. The evaluation is conducted in more than one round until the group reaches a consensus on the relative importance of the criteria under study. The main advantage of this method is that it avoids confrontation among the experts. However, assembling such a focus group is quite costly and time-consuming.
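The row-sum step of the GW method can be sketched as below. The 3 x 3 pairwise comparison matrix is a hypothetical example built with reciprocal entries, which is an assumption about how the marks on the line are turned into a matrix.

```python
def row_sum_weights(matrix):
    """Sum each row of a pairwise comparison matrix, then normalize."""
    row_sums = [sum(row) for row in matrix]
    total = sum(row_sums)
    return [s / total for s in row_sums]


# Hypothetical comparisons: criterion 1 is 3 times as important as
# criterion 2 and 5 times as important as criterion 3, and so on.
matrix = [
    [1, 3, 5],
    [1 / 3, 1, 3],
    [1 / 5, 1 / 3, 1],
]
gw_weights = row_sum_weights(matrix)
print([round(w, 3) for w in gw_weights])
```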
4. Aggregation of criteria weights and values of criteria
Finding the final score of each alternative is very important since the final scores of the alternatives are required to rank the alternatives. Basically, those alternatives with higher scores should be positioned at higher rankings and vice versa. In order to find the overall or composite or final values of each alternative, the criteria weights should be aggregated with each alternative’s values of the corresponding criteria. There are many aggregation methods available in literature. The section focuses on simple additive weighted average (SAW) method as the chapter uses SAW in the numerical example (in the Appendix at the end of the chapter). Furthermore, SAW method is a very well-established method and very easy to use .
4.1 Simple additive weighted average (SAW) method
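The mathematical equation for SAW is given as follows:

$$S_i = \sum_{j=1}^{n} w_j x_{ij} \tag{4}$$

where $S_i$ is the overall score of alternative $i$, $w_j$ is the weight of criterion $j$, and $x_{ij}$ is the value of alternative $i$ under criterion $j$. Based on $S_i$, where $i = 1, \dots, m$, the alternatives can be ranked.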
SAW is an old method, and MacCrimmon was one of the first researchers to summarize it, in 1968. As a well-established method, it is used widely in solving MC problems, particularly for the evaluation of alternatives. Basically, this method is the same as the simple arithmetic average method, but instead of having the same weight for all criteria, the SAW method usually uses distinct weights for the criteria. As given in Eq. (4), the overall performance of each alternative is obtained by multiplying the rating of each alternative on each criterion by the weight assigned to the criterion and then summing these products over all criteria. The best alternative is the one that obtains the highest score, and it is selected or ranked in the first position. Many recent studies have used the SAW method, for example, [39, 40, 41], and a review of its applications is also available.
Besides SAW, which is also known as the weighted sum method (WSM), there is another averaging technique called the weighted product model (WPM), simple geometric weighted (SGW) method, or simple geometric average method. In WPM, the overall performance of each alternative is determined by raising the rating of the alternative on each criterion to the power of the criterion weight and then multiplying these powers over all criteria. However, WPM is slightly more complex than SAW since WPM involves powers and multiplications.
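The two averaging techniques can be compared with a short sketch. The marks, weights, and function names below are hypothetical; the example assumes that higher values are better on every criterion and that the weights sum to one.

```python
import math


def saw_scores(values, weights):
    """SAW: weighted sum of each alternative's criterion values."""
    return [sum(w * x for w, x in zip(weights, row)) for row in values]


def wpm_scores(values, weights):
    """WPM: product of criterion values raised to the criterion weights."""
    return [math.prod(x ** w for w, x in zip(weights, row)) for row in values]


def rank_alternatives(scores):
    """Return the rank of each alternative (1 = highest score)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks


# Hypothetical data: three alternatives under three criteria.
values = [[90, 80, 70], [70, 90, 85], [60, 60, 95]]
weights = [0.5, 0.3, 0.2]

saw = saw_scores(values, weights)
wpm = wpm_scores(values, weights)
print(saw, rank_alternatives(saw))
print([round(s, 2) for s in wpm], rank_alternatives(wpm))
```

With weights that sum to one, each WPM score is a weighted geometric mean and therefore never exceeds the corresponding SAW score.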
4.2 Other aggregation methods
Other aggregation methods include AHP, the technique for order preference by similarity to ideal solution (TOPSIS), and VIKOR.
AHP and TOPSIS are two different aggregation methods. TOPSIS relies on the concept of a compromise solution, where the best alternative is the one that has the shortest distance from the ideal solution and the farthest distance from the negative ideal solution. In other words, alternatives are prioritized according to their distances from the positive ideal solution and the negative ideal solution, and the Euclidean distance is used to evaluate the relative closeness of the alternatives to the ideal solution. TOPSIS consists of a series of steps, starting with the weighted normalization of all performance values against each criterion. Some recent applications of the TOPSIS method are available [45, 46, 47, 48].
VIKOR method is quite similar to the TOPSIS method, but there are some important differences, one of which concerns the normalization process. TOPSIS uses vector normalization, where the normalized value could be different for different evaluation units of a certain criterion, while VIKOR uses linear normalization, where the normalized value does not depend on the evaluation unit of a criterion. VIKOR has also been used in many real-world MCDM problems such as mobile banking services, digital music service platforms, military airport location selection, concrete bridge projects, risk evaluation of construction projects, maritime transportation, and energy management.
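The main TOPSIS steps can be sketched as follows. This is a minimal illustration under the assumption that all criteria are benefit criteria (higher is better) and that the alternatives are not all identical; the data and the function name are hypothetical.

```python
import math


def topsis(values, weights):
    """Return the relative closeness of each alternative to the ideal."""
    n = len(weights)
    # Step 1: vector normalization, then weighting.
    norms = [math.sqrt(sum(row[j] ** 2 for row in values)) for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)]
                for row in values]
    # Step 2: positive and negative ideal solutions (benefit criteria).
    ideal = [max(row[j] for row in weighted) for j in range(n)]
    anti_ideal = [min(row[j] for row in weighted) for j in range(n)]
    # Step 3: Euclidean distances and relative closeness.
    closeness = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti_ideal)
        closeness.append(d_neg / (d_pos + d_neg))
    return closeness


# Hypothetical data: three alternatives under three criteria.
values = [[90, 80, 70], [70, 90, 85], [60, 60, 95]]
weights = [0.5, 0.3, 0.2]
closeness = topsis(values, weights)
print([round(c, 3) for c in closeness])

# A dominating alternative coincides with the ideal solution.
simple = topsis([[10, 10], [5, 5]], [0.5, 0.5])
```

Alternatives are then ranked in descending order of closeness.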
5. Inclusion of credibility of evaluators in solving multicriteria problems
This section discusses how credibility can be included practically in solving MC problems. Suppose the evaluators are requested to evaluate the relative importance of the criteria based on rank-based weighting methods as explained in Section 3.1. Suppose there is a panel of $p$ evaluators, and let $r_{jk}$ be the rank of criterion $j$ evaluated by evaluator $k$, where $k = 1, \dots, p$. In order to include the credibility of the evaluators, let us introduce a new set of values that represents the different credibility of the evaluators. Let $\lambda_k$ be the degree of credibility of evaluator $k$, where $0 \le \lambda_k \le 1$ and $\sum_{k=1}^{p} \lambda_k = 1$. There are two approaches by which the degree of credibility of the evaluators could be attached in finding the overall scores. The first approach is in calculating the final weights of the criteria as given in Figure 1, and the second approach is in computing the overall performance of the alternatives as given in Figure 2.
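For the first approach as portrayed in Figure 2, the degree of credibility of the evaluators is attached to the weights obtained from the ranks of the criteria by using any of Eq. (1), Eq. (2), or Eq. (3). So, here there are as many sets of criteria weights as there are evaluators, and these sets are combined with the degrees of credibility into one set of final criteria weights.
For the second approach, the criteria weights obtained from each evaluator are kept, and then each set of weights is aggregated with the quality values of each alternative. So, here there are as many sets of overall scores as there are evaluators, and these sets are combined with the degrees of credibility into the final overall score of each alternative.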
Referring to the numerical example in the Appendix, there were three evaluators involved in ranking the importance of the five academic subjects, and the number of students is 10. So, $r_{jk}$ is the rank of academic subject $j$ given by teacher $k$, with $j = 1, \dots, 5$ and $k = 1, 2, 3$.
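The two approaches can be illustrated with the sketch below. It is one possible reading, assuming rank-sum weights, SAW aggregation, and a credibility-weighted sum as the combination rule in both approaches; the ranks, marks, and variable names are hypothetical and are not the Appendix data.

```python
def rank_sum(ranks):
    """Rank-sum weights: proportional to (n - r + 1)."""
    n = len(ranks)
    raw = [n - r + 1 for r in ranks]
    total = sum(raw)
    return [v / total for v in raw]


def saw(row, weights):
    """Weighted sum of one alternative's criterion values."""
    return sum(w * x for w, x in zip(weights, row))


# Hypothetical ranks of five subjects given by three evaluators.
ranks_by_evaluator = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [3, 2, 1, 4, 5],
]
credibility = [1 / 6, 2 / 6, 3 / 6]  # degrees of credibility, sum to one
marks = [  # hypothetical marks of two students under five subjects
    [85, 90, 78, 88, 70],
    [92, 75, 95, 80, 85],
]

weights_by_evaluator = [rank_sum(r) for r in ranks_by_evaluator]

# Approach 1: combine the evaluators' weights first, then aggregate.
final_weights = [
    sum(lam * w[j] for lam, w in zip(credibility, weights_by_evaluator))
    for j in range(5)
]
scores_1 = [saw(row, final_weights) for row in marks]

# Approach 2: aggregate per evaluator first, then combine the scores.
scores_2 = [
    sum(lam * saw(row, w) for lam, w in zip(credibility, weights_by_evaluator))
    for row in marks
]
print(scores_1, scores_2)
```

Under these particular assumptions the two routes give the same scores, because SAW is linear in the weights; a different combination rule or intermediate rounding would make them diverge.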
6. Quantification of credibility of evaluators
Credibility is synonymous with professionalism, integrity, trustworthiness, authority, and believability. One study focuses on how to assess the credibility of expert witnesses. A 41-item measure was constructed based on the ratings by a panel of judges, and a factor analysis indicated that credibility is a product of four factors: likeability, trustworthiness, believability, and intelligence. Another study concerns the credibility of information in the digital era. Credibility is said to have two main components: trustworthiness and expertise. However, the authors conclude that the relation among youth, digital media, and credibility today is sufficiently complex to resist simple explanations, and their study represents a first step toward mapping that complexity and providing a basis for future work that seeks to find explanations.
It can be argued that the degree of credibility of evaluators, judges, or decision-makers can be determined subjectively or objectively. The former can be done by using a certain construct, while the latter can be based on certain objective or exact measures such as years of experience, salary scale, or amount of salary. The quantification of the degree of credibility opens a new potential area of research, as very little research has been done, especially on finding suitable objective proxy measures of the degree of credibility.
Finding the degree of credibility subjectively requires more time and is much harder, as it involves a construct or an instrument that would be used as a rating mechanism to obtain the degree of credibility. Meanwhile, finding the degree of credibility based on objective information is simpler and easier to do. As an illustration of how to quantify the credibility objectively, suppose there are three experts with their basic salaries in a simple ratio of 1:2:3. This ratio can be converted to 0.167:0.333:0.500, so that the sum of the credibility values of the evaluators is equal to 1. These values can be used to represent the degrees of credibility of evaluators or experts 1, 2, and 3, respectively. The degrees of credibility are scaled to sum to one to keep later calculations simple and to make the values easier to interpret. Here, evaluator 3 is the most credible one since he/she has the highest salary among the three, and it is usual practice that those with greater expertise are paid more. The same computation can be used for years of experience or salary scale.
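The conversion of an objective proxy measure into degrees of credibility amounts to a simple normalization, sketched below with the 1:2:3 salary ratio from the text. The function name is an illustrative assumption.

```python
def credibility_from_measure(values):
    """Scale a proxy measure (e.g., salary) so the degrees sum to one."""
    total = sum(values)
    return [v / total for v in values]


salaries = [1, 2, 3]  # basic salaries in the ratio 1:2:3
credibility = credibility_from_measure(salaries)
print([round(c, 3) for c in credibility])  # [0.167, 0.333, 0.5]
```

The same function could be applied to years of experience or salary scale.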
The numerical example in the Appendix extends the problem of evaluating students’ academic performance which is discussed earlier in the Introduction. Here, the credibility of the teachers who were asked to assess the relative importance of the five subjects was considered. In order to incorporate the degree of credibility of the teachers, a new set of values is introduced to represent these different degrees of credibility. The example shows two ways of calculations on how the credibility values could be included in finding the overall scores of the alternatives. As expected, the overall scores and the overall ranking are different as compared to overall scores of not considering the different credibility of the teachers. The details and the step-by-step methodology are also included in the Appendix.
This chapter provides an overview of the practical consideration of evaluators’ credibility in evaluating the relative importance of criteria for some real-life multicriteria problems. The credibility of the evaluators who are involved in solving any multicriteria problem should be included in calculating the overall scores of the alternatives or the units of analysis. This chapter demonstrates how the credibility of evaluators who participate in finding the criteria weights can be combined with the criteria weights and the criteria values of the alternatives. Rank-based criteria weighting methods are used as an illustration in a numerical example on the evaluation of students’ academic performance at the end of the chapter. Other subjective criteria weighting methods can also be used, but with caution, especially at the stage of aggregating the criteria weights and criteria values. There may be only one approach to the aggregation, depending on the underpinning concepts of the aggregation method. The chapter uses the simple additive weighted average method as the aggregation method since the method is very well established. The use of other aggregation techniques is also plausible. The chapter also suggests a few practical proxy measures of credibility, but these are still very limited. More research should be conducted to find ways of measuring the credibility of evaluators or experts, either subjectively or objectively. Including the credibility of evaluators in solving multicriteria problems is realistic since the evaluators come from different backgrounds and levels of experience. Quantifying the evaluators’ credibility subjectively or objectively opens a new line of inquiry in the group decision-making field. Furthermore, the credibility of the evaluators should also be considered in multicriteria problems in other areas, so that the results are more practical and accurate.
Mr. Zachariah is a class teacher of 10 excellent students in one of the best primary schools of a country. The 10 students were already given the final marks of five main academic subjects by their respective teachers as in Table 1. Mr. Zachariah must rank the students according to their performance because these students will be given awards and recognition on their graduation day.
|Native language||English language||Mathematics||Science||History|
Suppose three experienced teachers, Edward, Mary, and Foong, were asked to evaluate the relative importance of the five academic subjects, with their degrees of credibility as discussed in the previous section; that is, the salary ratio of the three teachers is 0.167:0.333:0.500. The rank-based technique is used to convert the rankings of importance of the academic subjects given by these three teachers into weights by using Eq. (1).
The results are given in Table 2. Column 2 displays the ranking of the criteria evaluated by teacher 1, and column 3 shows the corresponding criteria weights as computed by Eq. (1), while columns 4 and 5 and columns 6 and 7 show the respective results for teachers 2 and 3. The second last column of the table summarizes the criteria weights when the teachers have the same credibility; these values were computed as the simple arithmetic average of the three teachers’ weights for the corresponding criterion. The last column has the final weights that were calculated as an average as well, but with consideration of the different degrees of credibility according to Approach 1 as given in Figure 2. Please note that both sets of final weights already sum to one, so a normalization process to guarantee that the weights sum to one is not necessary.
|Teacher 1 (0.167)||Teacher 2 (0.333)||Teacher 3 (0.500)||Final weight same credibility (SC)||Final weight different credibility (DC)|
Now, in order to find the overall performance of each student, for example, the overall performance of student 1 without consideration of credibility of teachers in evaluating the relative importance of the academic subjects, it is simply done by multiplying row 2 of Table 1 with its corresponding criteria weights in the second last column of Table 2 by using Eq. (4) as follows:
The same process is performed to find the overall score of student 1 when the credibility of the teachers in finding the weights of the criteria is considered, except that the weights in the last column of Table 2 are used instead.
Table 3 gives the overall scores and the corresponding final rankings of all students based on average criteria weights with the same (SC) and different (DC) credibility of the teachers. The overall scores are all different, while the rankings are different especially for ranks 8 and 9 and 4 and 5.
Table 4 summarizes the three individual overall scores obtained from the three teachers without consideration of their credibility, while the second last column and the last column give the average of the three overall scores and the corresponding rankings, respectively.
Table 5 shows the three overall scores by consideration of the credibility of teachers in finding the academic subjects’ weights, and the average overall scores of the three overall scores. The ranking of the students is based on the average overall scores in column 5 of the table. Here, Approach 2 as in Figure 3 is used to find the final overall scores of the students.
To make the comparison easier, Table 6 summarizes the overall scores and their corresponding rankings of the students with SC and DC of the teachers when calculating the academic subjects’ weights based on Approach 2.
Although the two sets of overall scores are different, the rankings based on both sets are the same except for ranks 8 and 9. There is not much difference in the overall rankings since the MC problem considered here is only a small-scale problem with only 10 alternatives and 5 criteria. However, the two sets of overall values are totally different. There may be many more differences in terms of rankings if a bigger MC problem with more alternatives and more criteria is considered. The final ranking of the students obtained by considering the different credibility of the teachers should be selected as the practical and valid result.