Computer and Information Science » Human-Computer Interaction » "Human-Computer Interaction", book edited by Inaki Maurtua, ISBN 978-953-307-022-3, Published: December 1, 2009 under CC BY-NC-SA 3.0 license

Chapter 15

Adaptive Evaluation Strategy Based on Surrogate Models

By Yi-nan Guo and Jian Cheng
DOI: 10.5772/7732

Article top

Overview

Human-computer interface in AES-IGAs.
Figure 1. Human-computer interface in AES-IGAs.

							ρ(t) in experiments.
Figure 2. ρ(t) in experiments.
Evaluation error of surrogate models.
Figure 3. Evaluation error of surrogate models.
The trend of human fatigue.
Figure 4. The trend of human fatigue.

Adaptive Evaluation Strategy Based on Surrogate Models

Yi-nan Guo1 and Jian Cheng1

1. Introduction

Interactive genetic algorithm(IGA) is a kind of genetic algorithms in which individuals’ fitness values are evaluated by human subjectively. They combine canonical genetic algorithms with human intelligent evaluations. Now they have been applied to optimization problems whose objectives can not be expressed by explicit fitness functions, such as music composition(Biles,1996), production design (Takagi,1998), image retrieval (Takagi,1999) and so on. However there exits human fatigue in evaluation, which limits population size and the number of evolution generations. Moreover, subjective fitness values given by human mainly reflect human cognition and preference, which is related to domain knowledge. So IGAs need more knowledge than other genetic algorithms for explicit optimization functions.

In IGAs, human fatigue is a key problem which limits the application of the algorithm. According to physiological characters of human, if human are absorbed in one work for a long time, they are easy to feel tired. However, human need to evaluate each individual in each generation. More individuals human evaluate, they feel more tired. If cognition knowledge about human preference to optimization problems can be extracted from the evolution and utilized to evaluate individuals instead of human, the number of individuals evaluated by human shall decrease which will alleviate human fatigue. This cognition knowledge is generally described by surrogate models.

Surrogate models are trained by samples which consist of human evaluated individuals and their fitness values. Hence, they approximatively reflect human preference. When human feel tired, surrogate models can replace human to evaluate all individuals or part of individuals in each generation so as to reduce the number of individuals evaluated by human. This can reduce human burden and alleviate human fatigue.

Here, four key issues about surrogate models which influenc the performances of the algorithm are taken into account. Firstly, surrogate models must exactly keep consistency with human cognition and preference in order to ensure the convergence of the algorithms. So how to obtain the models with good prediction precision and generalization is the base. The second issue is when to use surrogate models instead of human in evaluation. The last two problems are how many and which individuals are selected to calculate fitness values by surrogate models in each generation.

Up to now, many researches on surrogate models have been done. But most of them focus on the structure of surrogate models. Different models adopting artificial neural networks(Zhou,2005) or support vector machines(Wang,2003) are proposed. However, how to rationally utilize surrogate models are not taken into account enough. Firstly, in existing researches, no matter whether surrogate models exactly reflect human real preference, the models are used to evaluate individuals when human feel tired. Secondly, population size is small and fixed all the time. And the number of individuals evaluated by the models in each generation is also fixed. These make IGAs alleviating human fatigue limitedly. Thirdly, few of researches concern which individuals are selected to be evaluated by models. To solve above problems, latter three issues are mainly studied, including when the models are used in evaluation, how many and which individuals are evaluated by the models in each generation.

In the chapter, assume that human preference to optimization problems is stable. A novel interactive genetic algorithms adopting adaptive evaluation strategy based on surrogate models(AES-IGAs) is proposed. Firstly, two measures reflecting the models’ precision and human fatigue are given. Based on them, evaluation process is divided into three phases. In different phase, human participate in evaluation to different extent. Secondly, the number of individuals evaluated by surrogate models is adaptively tuned in evolution according to above two measures so as to alleviate human fatigue at most. Because surrogate models calculate individuals’ fitness values by computers which do not need human participation, population size is enlarged when only surrogate models are adopted in evaluation so as to improve convergence speed. Thirdly, how to select rational individuals evaluated by the models from population is studied. Fuzzy c-means clustering algorithm is adopted And three kinds of approximate distance is put forward as measures. To validate the rationality of AES-IGAs, experiments based on fashion evolutionary design system are done. And testing results are analyzed at last.

2. Startup Mechanism about Surrogate Model

Startup mechanism offers some conditions which decide when to use surrogate model in evaluation process. That is, in which generation these conditions are satisfied, population can be evaluated by surrogate models in proper proportion. In general, when human feel tired or surrogate models have learned human preference exactly, the models are adopted to calculate the individuals’ fitness values. So startup mechanism about surrogate models normally includes two conditions. They are the condition of human fatigue and the condition of models’ precision. When any of conditions is satisfied, surrogate models are start up to evaluate.

Suppose F M is the fitness value calculated by surrogate models. F U is the fitness value given by human. x M and x U respectively denote individuals evaluated by the models and human. X(t) expresses the population in t-th generation. Adaptive evaluation strategy is shown as follows(Guo,2007).

F(X(t))={FM(xM,t),FU(xU,t)},xMxU,xM,xUX(t)s.t.FM,((Fa(t)ε)(Pr(t)Ψ))

Here, Fa(t)≥ε describes the condition of human fatigue where Fa(t) expresses the degree of human fatigue and ε is the threshold of Fa(t).

Definition 1: the degree of human fatigue

The degree of human fatigue reflects how tired human are. In general, human will spend more time evaluating individuals when there are more similar individuals in the population. Therefore, human feel more tired when the total number of individuals evaluated by human is more and time for evaluation in each generation is more.

Letting v(t) denotes how long human spend evaluating. β(t) is the proportion of human evaluated individuals to population. The degree of human fatigue is defined as follows(Guo, 2007).

Fa(t)=1-e-tv(t)β(t)S(t)

In formula(2), S(t) denotes the similarity of population, which describes average similarity of individuals in population, shown as follows.

S(t)=2i=1|P1|j=i+1|P|l=1nσl(xi(t),xj(t))|P||P1|

where |P| is population size and n is the length of individuals. σ l (xi (t),xj (t)) expresses the similarity of l-th bit between two individuals. σ l (xi (t),xj (t))=1 if l-th bit of xj (t) is same as it of xj (t), otherwise σ l (xi (t),xj (t))=0.

In formula(1), Pr(t)≤Ψ describes the condition of models’ precision where Pr(t) expresses the reliability of surrogate models and Ψ is the threshold of Pr(t).

Definition 2: the reliability of surrogate models

The reliability of surrogate models reflect the consistency between surrogate models and human preference. It is measured by average Euclid distance between individuals’ fitness values calculated by the models and them given by human, shown as follows.

Pr(t)=1|P|i=1|P|(FU(xi,t)-FM(xi,t))2

Obviously Pr(t) also describes the generalization of surrogate models.

In a word, whether surrogate models are utilized lies on two conditions: whether or not the degree of human fatigue exceed the threshold; whether or not reliability of surrogate models exceed the threshold.

3. The Number of Substituted Individuals Evaluated by Surrogate Models

In AES-IGAs, some individuals are evaluated by human, whereas others’ fitness values are calculated by surrogate models. Here, individuals evaluated by the models in each generation are called substituted individuals. How many substituted individuals are there is a key problem, which influences the performance of AES-IGAs. Up to now, in most of IGAs adopting surrogate models based evaluation strategy, the number of substituted individuals is fixed and equal to fifty percent of population size. This limits the effect of surrogate models on performances. Aiming at this problem, the number of substituted individuals adaptively varies in AES-IGAs.

In general, one hopes to evaluate less individuals in IGAs when he feels tired. And when surrogate models can reflect human preference better, it shall be used to evaluated more individuals instead of human so as to reduce human burden. Therefore, two factors including human fatigue and models’ precision are considered to decide the number of substituted individuals. When human feel more tired or models’ precision is better, less individuals are evaluated by themselves. So the number of substituted individuals in t-th generation is defined as

NM(t)=|P|ρ(t)

Here, ρ(t) is the proportion of substituted individuals to population, called the substituted proportion, shown as follows.

ρ(t)=e-(Pr(t)/f¯)(ε-Fa(t))

Here, f¯ denotes the upper bound of the fitness values. Obviously the substituted proportion also satisfies ρ(t)+β(t)=1.

4. Substitution Mechanism about Surrogate Model

In existing researches, the substituted proportion in evolution is normally fixed after surrogate models are used. Corresponding evaluation process in IGAs can be divided into two phases: human-evaluation phases and models-evaluation phases. According to the substituted proportion, two kinds of dual-phase evaluation strategies with different ρ(t) are analyzed as follows.

(1) While the conditions of startup mechanism are satisfied, ρ(t)=1. That is, the evaluation process is divided into two phases. When surrogate models are not used, population is evaluated by human only. Otherwise, all individuals are evaluated by surrogate models.

(2) While the conditions of startup mechanism are satisfied, ρ(t)=C(0<C<1). Here C is a constant (Zhou,2005). That is, when surrogate models are adopted, population is not evaluated by surrogate models only. Some of individuals are evaluated by human, others are evaluated by surrogate models. And the number of substituted individuals is |P|ρ(t) .

In above two evaluation strategies, fixed substituted proportion does not make surrogate models used enough. Therefore, variable substituted proportion is adopted in this chapter. Corresponding evaluation process is different from above two instances. It contains three phases.

Phase I: Population is evaluated by human only

In this phase, the models are not used and all of individuals are evaluated by human. The evaluation mode is defined as follows.

F(X(t))={FU(xU,t)},xUX(t)s.t.((Fa(t)ε)(Pr(t)Ψ))

And the number of substituted individuals is

NM(t)=0

It is obvious that in this phase, human do not feel tired and surrogate models can not exactly reflect human preference. Above phenomena possibly appear at the beginning of evolution. So this evaluation mode is usually adopted in the former evolution. Note that surrogate models shall be continually update according to evaluated individuals in this phase.

Phase II: Population is mixed evaluated by human and surrogate models

In this phase, surrogate models are startup because they learn human preference approximatively. However, the degree of human fatigue does not exceed the threshold. That means human still participate in the evaluation process. And human evaluated individuals are used as samples to improve models’ precision further. Obviously some of individuals are evaluated by human and others’ fitness values are calculated by the models. So the evaluation mode are shown as follows.

F(X(t))={FM(xM,t),FU(xU,t)},xMxU,xM,xUX(t)s.t.((Fa(t)ε)(Pr(t)Ψ))

In order to reduce human burden gradually, the number of substituted individuals in phase II is increasing along with the degree of human fatigue.

NM(t)=|P|e(Pr(t)/f¯)(εFa(t))

In above two phases, population size is fixed and small because human participate in the evaluation process. In general, population size in IGAs is less than ten so as to alleviate human visual fatigue.

Phase III: Population is evaluated by surrogate models only

In this phase, all of individuals are evaluated by surrogate models. So the evaluation mode is described as follows.

F(X(t))={FM(xM,t)},xMX(t)s.t.((Fa(t)ε)(Pr(t)Ψ))

In phase III, human often feel very tired. Individuals’ fitness values given by human may differ from their real preference. So these individuals can not be used to train surrogate models. And the models are not updated any more. This evaluation mode is normally adopted in the latter evolution. Because the evaluation process based on surrogate models is done by computers, there does not exit human fatigue in phase III. That means the evolution of AES-IGAs is the same as traditional genetic algorithms. So population size can be enlarged. But how to increase population size is a key problem.

Higher the models’ precision is, the generalization of the model is better. So enlarged population size is defined as follows.

Np(t)=|P|f¯Pr(τk)+0.5

Here, Pr(τk) expresses the reliability of surrogate models in Phase III. It is obvious that the exponent in formula(12) may be 1, 2 or 3. So population size may be enlarged to corresponding multiple of |P|.

5. Surrogate Model-based Evaluation Strategy with Oriented Selection of Substituted Individuals

In existing researches on surrogate model-based IGAs, substituted individuals are often randomly selected from population after the evaluation mode is decided. But if selected substituted individuals are far away from existing training samples, their fitness values calculated by surrogate models may differ from human cognition. This will misguide the evolution away from human real preference. Aiming at above problem, an oriented selection method of substituted individuals is proposed. Here, substituted individuals are selected in terms of the distance between individuals and samples so as to improve the evaluation precision to the utmost extent.

Taken oriented selection process in t-th generation as example, algorithm steps are shown as follows(Guo,2007).

Step1: The number of substituted individuals NM(t) is decided in terms of human fatigue.

Step2: Samples are composed of evaluated individuals and their fitness values given by human. They are grouped into sub-sample sets adopting Fuzzy-C mean algorithm.

Step3: Aiming at each sub-sample set, a surrogate model described by artificial neural networks is trained.

Step4: The distance between each individual in population and each samples or each sub-sample sets are calculated.

Step5: Individuals are sorted according to above distance. Then individuals with less distance are regarded as substituted individuals.

Step6: The fitness values of substituted individuals are calculated by their affiliated clusters’ surrogate model.

Obviously how to group samples, train surrogate models and select substituted individuals are three main issues. Here, former two problems are mainly discussed.

5.1. Sample clustering

How to construct sample set and decide the cluster number are two key problems in sample clustering.

Sample set are composed of evaluated individuals and their fitness values given by human. They are expressed by x U(t) and f(x U(t)). Because human cognition may vary in evolution, the fitness value of same individual may be variant in different generation. In order to track human preference, latest fitness values of any individuals are reserved. Therefore, sample set are described as follows.

Q(t)={(xU,f(xU))|xUlxUk,l,k=1,2,|Q(t)|}

Sample set are grouped into clusters by fuzzy c-means clustering algorithm (Karayiannis, 1997). Suppose L=2,3,…,K(K<|Q(t)|) is the cluster number, which is normally decided according to domain knowledge. Improper K will lead to wrong clustering result. Aiming at this problem, comparison method (Cheng,2006) is adopted to obtain suitable cluster number. Assume that Qi (t)(i=1,2, …,L) is sub-sample set. And each cluster center is

ci(t)=l=1|Q(t)|uilm(t)xUl(t)l=1|Q(t)|uilm(t),i=1,2,L

Here, m[1,) is the weighing index. uil[0,1] denotes the subjection degree of xUl(t) to Qi(t), which satisfies i=1Luil=1,l=1,2,,|Q(t)| .

5.2. Selection of substituted individuals

How many substituted individuals are there in each generation is the principal issue, which is decided by ρ(t). However, which individuals are selected as substituted individuals depends on their approximate distances. Here, the distances between an individual and each sample or each sub-sample set are called approximate distances.

Suppose xj (t) is an individual. Three kinds of approximate distances are defined as follows(Cheng,2009).

(1) Center approximate distance

The clustering center of each sub-sample set is used to judge which cluster does an individual belong to. Suppose ci (t) is clustering center of i-th cluster. The minimum distance between xj (t) and ci (t) is defined as center approximate distance, expressed by D1.

D1j=mini{1,2,L}d1(xj(t),ci(t))

If individuals are encoded by real number, d1 is described as

d1(xj(t),ci(t))=h=1n(xjh(t)cih(t))2

If individuals are encoded by binary number, d1 is described as

d1(xj(t),ci(t))=h=1n(xjh(t)Cih(t))

Because only clustering centers of sub-sample sets are taken into account, the computation complexity of center approximate distance is O(|P|L).

(2) Cluster approximate distance

All of samples in each sub-sample set are used to judge which cluster does an individual belong to. The minimum distance beween xj(t) and samples in Qi(t) is defined as cluster approximate distance, expressed by D2.

D2j=mini{1,2,L}d2(xj(t),Qi(t))

Suppose xUi(t) is a sample in i-th cluster. If individuals are encoded by real number, d2 is described as

d2(xj(t),Qi(t))=1|Q(t)i|l=1|Qi(t)|h=1n(xjh(t)xUi,lh(t))2

If individuals are encoded by binary number, d2 is described as

d2(xj(t),Qi(t))=1|Qi(t)|l=1|Qi(t)|[h=1n(xjh(t)xUi,lh(t))]

Here, |Q(t)|=i=1L|Qi(t)| . So the computation complexity of cluster approximate distance is O(|P||Q(t)|).

(3) Generalized approximate distance

All of samples in sample set are used to judge which cluster does an individual belong to. The minimum distance between xj (t) and x U(t) is defined as generalized approximate distance, expressed by D3.

D3j=minl{1,2,|Q(t)|}d3(xj(t),xUl(t))

If individuals are encoded by real number, d3 is described as

d3(xj(t),xUl(t))=h=1n(xjh(t)xUlh(t))2

If individuals are encoded by binary number, d3 is described as

d3(xj(t),xUl(t))=h=1n(xjh(t)xUlh(t))

Obviously the computation complexity of generalized approximate distance is O(|P||Q(t)|).

Through comparison of the computation complexity of above three approximate distances,

we know that the computation complexity of generalized approximate distance is the same as it of cluster approximate distance. In addition, population size is normally more than the cluster number, namely |Q(t)|≥L. So the computation complexity of center approximate distance is least.

6. Analysis of Convergence

In genetic algorithms, whether and how fast they converge are used to measure the algorithm’s performances. And many researches have discussed genetic algorithms’ convergence. However, IGAs’ convergence is seldom taken into account. In this chapter, AES-IGAs’ convergence is analyzed in terms of drift analysis and the first hitting time condition proposed by Jun He(He, 2001).

Definition 3: critical generation

In AES-IGAs, when t>τk, surrogate models are used to calculate all individuals’ fitness values. If the algorithms can converge to optimal solution under above condition, τk is called critical generation as shown in follows(Guo,2007). Whether τk exists is a key problem which influences AES-IGAs’ convergence.

τk=min{t|(ρ(t+1)=1)ΛD(X(t+ζ))=0,ζ[0,T-t),t(0,T]}

Where

D(X(t+ζ))=min{i|D(xi(t+ζ)),xi(t+ζ)X(t+ζ)}

Here, D(xi(t+ζ)) is simplified description of D(xi(t+ζ),x*),x*S* . S* is the optimal variable space. D() denotes the distance between two individuals. If individuals are encoded by binary number, D() is calculated by Hamming distance. Otherwise, if individuals are encoded by real number, D() is calculated by Euclidean distance.

Theorem 1: For any generation satisfied t>τk, all individuals are evaluated by surrogate models. During this phase, the evolution of AES-IGAs is the same as genetic algorithms. Based on the first hitting time drift condition with exponential form(He,2001), there exists ζ¯=min{ζ|D(X(t+ζ))=0} , which satisfies drift condition E[ζ¯|d(X(τk))0]h(|P|) . Here, h(|P|) is a polynomial function of |P|.

Proof: Assume that individuals are encoded by binary number. So the distance between an individual and optimal solution is

D(xi(t))=l=1nσl(xi(t),x*)

Given a random sequence {D(X(τk+j)),j=0,1,} , X(τk+j),D(X(τk+j))|P| is obviously satisfied. If the algorithms adopt one-point crossover, bit mutation and roulette selection, following inequation is obtained in terms of drift condition.

E[D(X(τk+j+1))D(X(τk+j))|D(X(τk+j))0]h1(|P|)

Here, h 1(|P|)>0.From Theorem.1 in reference(He,2001), we know E[ζ¯|d(X(τk))0]|P|h1(|P|) .

Let h(|P|)=|P|h1(|P|) . Then E[ζ¯|d(X(τk))0]h(|P|) is satisfied.

Theorem 2: When t>τk, the substituted proportion ρ(t) converges to 1.

Proof: Suppose {ρ(t),ρ(t+1)|ρ(t)0} is a random sequence. The gradient of ρ(t) is

ρ(t+1)ρ(t)=e[(Pr(t+1)/f¯)(εFa(t+1))]e[(Pr(t)/f¯)(εFa(t))]=e[(Pr(t+1)/f¯)(εFa(t+1))]+[(Pr(t)/f¯)(εFa(t))]

Considering εFa(t)0andFa(t+1)Fa(t)εFa(t+1)εFa(t) , then

Pr(t+1)Pr(t)Pr(t+1)(εFa(t+1))Pr(t)(εFa(t))Pr(t)(εFa(t))Pr(t+1)(εFa(t+1))0e[(Pr(t+1)/f¯)(εFa(t+1))]+[(Pr(t)/f¯)(εFa(t))]1

Therefore, ρ(t+1)ρ(t)1 . This shows that random sequence {ρ(t),ρ(t+1)|ρ(t)0} increases by degrees. And ρ(t) continuously varies along with t. When tτk ,Faε. So

ρ(t)=e[(Pr(t)/f¯)(εFa(t))]1

This makes limρ(t)tτk=1 while τkT . Therefore, total number of individuals evaluated by human in evolution is

Nsum=t=1τk|P|(1ρ(t))

7. Experiments and Analysis

7.1. Background for experiments

Here, fashion evolutionary design system is adopted as a typical background to validate the rationality of IGAs with adaptive evaluation strategy (AES-IGAs). The goal of the system is to find a dress which wins human favor (Kim,2000). Visual Basic 6.0 as programming tool for human-machine interface and Microsoft Access as database are utilized. Matlab 6.5 is adopted to train surrogate models based on artificial neural networks. System’s human-computer interface is shown as follows.

media/image61.png

Figure 1.

Human-computer interface in AES-IGAs.

In fashion evolutionary design system, each dress is composed of collar, skirt and sleeve. Each part has two factors including pattern and color which described by two bits. So each dress is expressed by twelve bits, which act as six gene-meaning-units (GM-units) (Guo,2007). Each gene-meaning-unit has four alleles. The meanings of each allele in gene-meaning-unit are shown in Table 1.

GM-unitallele
meaningcodemeaningcodemeaningcodemeaningcode
collar’s patternmedium collar00high collar01wide collar10gallus11
sleeve’s patternlong sleeve00medium sleeve01short sleeve10non-sleeve11
skirt’s patternlong skirt00formal skirt01medium skirt10short skirt11
colorpink00green01black10white11

Table 1.

The meanings of each allele in gene-meaning-unit.

7.2. Desired objectives and parameters in experiments

In order to validate the rationality of adaptive evaluation strategy and the influence on performance of IGAs, two groups of experiments are designed. They have different desired objectives which reflect different psychological requirements of human. Desired objectives of experiments are shown as follows.

Experiment I: To find a favorite dress fitting for summer without the limit of color.

Experiment II: To find a favorite dress fitting for summer and the color is pink.

In both experiments, artificial neural network is adopted as surrogate model. The values of parameters about the model and the evolution process are shown in Table 2.

parameters about the
evolution
crossover probabilitymutation probabilitypopulation sizegeneration ε ψ
0.60.018400.80.9
parameters about the
model
input neuronshidden neuronsoutput neuronslearning rateepochserror
61510.091500010-2

Table 2.

The values of parameters.

7.3. Comparison of different substituted proportion

In order to validate the rationality of AES-IGAs, thirty persons are gathered to do two groups of experiments aiming at desired objective of experiment II. In experiments, when surrogate models are used in evaluation process, two kinds of fixed substituted proportion and adaptive substituted proportion are adopted respectively. Testing results done by all persons are integrated, as shown in Table 3.

the substituted
proportion
average generationaverage number of human evaluated individualsaverage generation when the models are adopted
ρ(t)=0.51510411
ρ(t)=1148811
ρ(t)=e-(Pr(t)/f¯)(ε-Fa(t)) 12558

Table 3.

Comparison of the performance by different substituted proportion.

Obviously when different ρ(t) are adopted, convergence speed of the algorithms are similar. But total number of individuals evaluated by human varies. The reason for this is that surrogate models are adopted in evaluation only when formula(1) is satisfied. Firstly, if ρ(t)=0.5, half of population is evaluated by human when surrogate models are adopted. However, human shall evaluate individuals all along. So total number of individuals evaluated by human is more and human will feel more tired. Secondly, if ρ(t)=1, all individuals are evaluated by surrogate models when human feel tired and the models can reflect human preference exactly. Although human evaluate fewer individuals than first strategy, the models are used later than adaptive evaluation strategy. That is, less individuals are evaluated by human in AES-IGAs. Therefore, IGAs adopting adaptive evaluation strategy can effectively alleviate human fatigue more.

7.4. Comparison of different population size in Phase III

When population is evaluated by surrogate models only, fixed population size |P| and population size NP are adopted in experiments respectively. Testing results are shown in Table 4.

population sizeaverage generationaverage number of human evaluated individuals
|P|1255
Np1055

Table 4.

Comparison of different population size in Phase III.

It is obvious that different population size in Phase III do not influence the degree of human fatigue in evolution because average number of individuals evaluated by human is same. As we know, no matter how many individuals are there in Phase III, human do not participate in evaluation any more. That is, total number of human evaluated individuals in evolution only depends on the evaluation in Phase I and Phase II. However, convergence speed is faster while population size in Phase III is enlarged. The reason for this is that exploration of the algorithm with larger population size is better.

7.5. Critical generation when Phase III is started

Thirty persons are gathered to do two groups of experiments aiming at above two desired objectives. Results in experiments are shown in Figure 2.

media/image66.png

Figure 2.

ρ(t) in experiments.

From Figure 2, we know AES-IGAs converges. This matches Theorem.2. In addition, the substituted proportion is increasing along with the evolution. Aiming at desired objective of experiment II, all individuals are evaluated by the models when t≥8. That is, critical generation is 8. Obviously desired objective of experiment II include more detail requirements. So human need to pay attention to more GM-units, which make human feel more tired. This leads to larger critical generation. However, aiming at desired objective of experiment I, AES-IGAs converge when t=7. At this time, the degree of human fatigue do not exceed the threshold. So surrogate models are not completely used to replace human in evaluation process. That means critical generation do not exist in experiment I.

7.6. Comparison of different selection methods of substituted individuals

In order to analyze the performance of oriented selection methods, following three measures are given.

(1) Generation when stop rules are satisfied or human find the satisfied individuals reflects convergence speed, as shown in follows.

T=argmint|f(x(t))-f*|δ

(2) Total number of individuals evaluated by human in evolution influences the degree of human fatigue. Obviously T or N U is larger, human feel more tired.

NU=t=1T(|P|NM(t))

(3) Evaluation error of surrogate model describes square root error of the fitness values given by human human f U(x M(t)) and the models f M(x M(t)). Note that f U(x M(t)) is only used as reference and not utilized in evolution. Obviously this measure reflects generalization of the models.

EF(t)=1NM(t)f¯k=1NM(t)(fM(xMk(t))fU(xMk(t))2

Sixteen persons are divided into four groups. Aiming at desired objective of experiment II, each group does experiments adopting random selection method and three kinds of oriented selection methods respectively. Here, cluster number L=3. This means human preference is divided into three ranks, including favor, normal, dislike. Testing results done by each group are integrated, as shown in Table 5.

measuresTNU
random selection18112
oriented selection (D1)value1492
comparison with random selection↓22.2%↓17.9%
oriented selection (D2)value1276
comparison with random selection↓33.3%↓32.1%
oriented selection (D3)value1280
comparison with random selection↓33.3%↓28.6%

Table 5.

Comparison of different selection methods.

From Table5, we know that: (I) Because the number of samples is small in AES-IGAs, generalization of surrogate models are bad. If substituted individuals are randomly selected, they may be far from samples, which may lead to large evaluation error of surrogate model. So convergence of IGAs adopting random selection method is worse. (II) Human may favor more than one satisfied dress. Hence desired objectives of both experiments in fact are implicit multi-mode functions. Because oriented selection methods adopting D2 or D3 all consider the distances between all samples and individuals, useful information embodied in evolution are utilized enough. So their performance are better. However, oriented selection methods adopting D1 only takes the distribution of cluster center into account. Incomplete information of individuals may result in larger evaluation error so as to make the performance worse.

In above experiments, evaluation error of surrogate models varies during the evolution, as

shown in Figure 3. Obviously no matter which selection method is adopted, EF increases in former evolution. During this stage, human do not feel tired and can not exactly decide which one is he most like. Surrogate models are trained based on samples and do not participate in evolution. Along with the evolution, human gradually know what they favor. Surrogate models are close to human preference. So evaluation error decrease in latter evolution. In addition, oriented selection methods adopting D2 or D3 reserve samples’ information to largest degree. Hence AES-IGAs using these selection methods have less evaluation error. This can effectively avoid misguiding evolution.

media/image71.png

Figure 3.

Evaluation error of surrogate models.

In each generation, different substituted individuals are selected when above four selection methods are respectively adopted. In order to compare the difference among random selection method and oriented selection methods, the number of dissimilar substituted individuals in each generation are noted, as shown in Table 6. Here, NMD(t) denotes the number of dissimilar substituted individuals.

generation6781011121518
random selectionNM(t)00004444
oriented selection (D1)NM(t)0004444-
NMD(t)0004322-
oriented selection (D2)NM(t)04444---
NMD(t)04443---
oriented selection (D3)NM(t)004444--
NMD(t)004433--

Table 6.

Comparison of dissimilar substituted individuals selected by different methods.

In a word, the performances of oriented selection methods are better than random selection method. Through comparison of three kinds of oriented selection methods, oriented selection methods using D2 or D3 has better performance and stability, yet larger computation complexity. On the contrary, computation complexity of oriented selection methods using D1 is less, whereas the algorithm’s performance is worse. Considering multi-mode property of human preference and small population size in IGAs, oriented selection methods using D2 is more fit for IGAs than others.

7.7. Comparison of selection methods with different cluster number

Experiments adopting AES-IGAs with different cluster number are done. Here, two fixed cluster number including 3 or 6 and adaptive cluster number are adopted. Testing results are shown in Table 7.

L36adaptive
measuresTNUTNUTNU
random selection181121710815100
oriented selection (D1)value149212801172
comparison with random selection↓22.2%↓17.9%↓29.4%↓25.9%↓26.7%↓28.0%
oriented selection (D2)value127611761276
comparison with random selection↓33.3%↓32.1%↓29.4%↓29.6%↓20.0%↓24.0%
oriented selection (D3)value128012841176
comparison with random selection↓33.3%↓28.6%↓35.3%↓22.2%↓26.7%↓24.0%

Table 7.

Comparison of selection methods with different cluster number.

By compare the results, we know that: (I) AES-IGAs using adaptive cluster number have best performance. Because cluster number is more reasonable, the clustering precision of individuals is better. This will speed up convergence. (II) No matter which kind of approximate distance is adopted, oriented selection methods converge faster than random selection method. And oriented selection method adopting D2 has better stability. The reason for this is individuals in each cluster contains more information than cluster center, which makes evaluation error less. Hence, surrogate model can participate in evaluation earlier so as to alleviate human fatigue at most.

In order to analyze the influence of selection methods and cluster number on human fatigue, the curves describing the degree of human fatigue are shown in Figure 4. Note that the threshold of human fatigue is 0.95 in these experiments.

media/image73.jpeg

Figure 4.

The trend of human fatigue.

Obviously no matter which kind of selection methods are used, human fatigue all can be alleviated as long as cluster number is reasonable. Especially the degree of human fatigue is less if oriented selection methods using D2 or D3 are adopted.

7.8. Comparison of performance about IGAs

In order to validate the improvement in performance of IGAs with adaptive evaluation strategy, thirty persons are gathered. Everyone do experiments aiming at both desired objectives adopting conventional IGA and AES-IGA respectively. Testing results for above four experiments done by all persons are integrated, as shown in Table 8.

experimentsIII
algorithmsIGAAES-IGAIGAAES-IGA
average generation1473112
average number of individuals evaluated by human1123824855
average number of individuals evaluated by human in each generation8585
generations when Fa(t)≥ε---8

Table 8.

Comparison of the performance between IGAs and AES-IGAs.

Through comparison of testing results in experiment I, generation adopting AES-IGA averagely reduces 52.1% than IGA. And total number of individuals evaluated by human adopting AES-IGA averagely reduces 71.5%. These indicate adaptive evaluation strategy can effectively alleviate human fatigue and speed up convergence.

8. Conclusion

In order to further alleviate human fatigue in interactive genetic algorithms, surrogate models are used to evaluate individuals instead of human. When to utilize surrogate models, how many and which individuals are calculate fitness values adopting the models are studied. Startup mechanism about surrogate models considering the degree of human fatigue and the models’ precision is given. Variable proportion of population evaluated by surrogate models is proposed. Three phases including evaluated by human only, mixed evaluated by human and the models, evaluated by the models only, are discussed according to the number of substitution individuals. Especially, population size is enlarged in third phase. Surrogate model-based evaluation strategy with oriented selection of substituted individuals is proposed. Three kinds of approximate distance are presented so as to judge which cluster individuals belongs to. At last, convergence of AES-IGAs is proved.

Taking fashion evolutionary design system as a testing platform, the rationality of adaptive evolution strategy is validated aiming at different psychological requirements of human. Testing results indicate AES-IGAs converge faster than canonical IGAs and human feel less tired. Therefore, AES-IGAs can effectively alleviate human fatigue and improve the convergence speed so as to reduce human burden for evaluation which makes human absorbed in more creative design work.

Acknowledgements

This work was supported by National Natural Science Foundation of China (Grant No. 60805025) and the 863 Project of China (Grant No. 2007AA12Z162).

References