## 1. Introduction

### 1.1. What are feeders?

Automating assembly processes is vitally important for meeting manufacturing and production requirements. Feeders form a critical part of automated assembly lines (Singh et al., 2009). They are used to feed discrete parts from bulk supplies to assembly stations, work cells or assembly cells on the production line. They convert a random mass of parts into a flow with a fixed geometrical pattern, so that the parts become an integral contribution to the production process and are delivered at a pre-determined rate. Most of the time these parts are added to other parts to become the finished product at the end of an assembly line (Mitchell Jr., 2010). Feeders are sometimes also used as inspection devices (Boothroyd, 2005). Part feeders can be designed to reject certain kinds of defective parts which, if fed to the machine, may cause it to break down. An assembly process requires the correct parts, in suitable amounts, at the appropriate places and at the right times; otherwise the entire production line may come to a halt. Ad-hoc setting of system parameters results in either starvation or saturation, where too few or too many parts, respectively, are delivered to the work cells.

This chapter describes a method of studying the behaviour of part feeding devices via a statistical analysis of the given system, carried out to formulate its empirical model. Once the model has been formulated and its validity confirmed, it can be used to suggest values of inputs and operating factors to get the desired output. The regression model can also be used to find the local optimum.

### 1.2. Types of feeders

Vibratory feeders are the most widely employed and versatile part-feeding devices in the industry (Boothroyd, 2005). Detailed theoretical analysis of vibratory feeders has been carried out (Redford and Boothroyd, 1967; Parmeshwaran and Ganpathy, 1979; Morrey and Mottershead, 1986; Ding and Dai, 2008, etc.).

While vibratory feeders remain the part-feeders of choice for general purpose requirements, many other designs of feeders have been developed for feeding parts having special features like headed parts or abrasive materials. Such feeders can usually be classified under:

Some common feeders that fall under these categories have been listed in Table 1.

To describe the optimization technique for a feeding system, a Reciprocating-Fork Hopper Feeder shall serve as an example throughout this chapter. As described in Table 1, it consists of a shallow cylindrical bowl that rotates about an axis inclined at a small angle (10°) to the vertical. The bowl is mounted on the output shaft of a gear box, which in turn is connected to the motor shaft by a belt and two pulleys (Singh et al., 2009). A two-pronged fork reciprocates in the vertical plane above the rotating cylindrical bowl.

During the first stage of operation, the fork is dipped into the bowl, and due to the centrifugal force produced by rotation of the bowl, parts having the right orientation start climbing up the fork. The fork is then lifted up along with the parts nestled between its prongs. Finally, due to the effect of gravity, these parts slide down the delivery chute. For the parts being handled by this feeder to maintain their orientation throughout the operation, it is necessary to use only headed parts. The fork reciprocates by means of a pneumatic actuator.

### 1.3. Need for optimization

Part feeders, which singulate and orient the parts prior to packing and insertion, are critical components of automated assembly lines and one of the biggest obstacles to rapid development of assembly systems (Gudmundsson & Goldeberg, 2007). The design of part feeders is responsible for 30% of the cost and 50% of the work cell failures (Boothroyd et al., 1982; Nevins and Whitney, 1978). Optimization of feeding systems thus becomes a critical issue in the development of assembly lines.

### 1.4. Analysis of feeders

The objective of this chapter is to outline methods to analyse part feeding systems on the basis of their functional specifications. Vibratory bowl feeders, which require appropriate resonant frequencies to achieve optimal feeding conditions, have been analysed by Ding & Dai (2008), who developed a systematic dynamic model of the bowl feeder and studied the effect of various design (particularly assembly) parameters on the resonant frequencies, and thus on throughput.

For the other common part feeders, however, the governing equations of the system are not available, and we attempt to infer the underlying structure by studying the behaviour of the system under certain conditions. Tests are designed by carefully choosing different input values, so as to create scenarios that allow us to explore the functional relationship between the system inputs and outputs. The following sections are arranged to loosely reflect the process of optimizing feeding systems.

## 2. Parameter selection

### 2.1. Objective identification

Objective selection plays a crucial role in the overall process, since it influences the type of experiments to be run and the analysis to be carried out. A poorly defined objective leads to a poorly planned and poorly executed experiment, which will not yield the information required for our purpose and may lead to scrapping of the whole effort, wasting precious time, effort, money and resources. In industrial environments such scenarios create a wrong impression about the effectiveness of the process amongst the workers, particularly if it is a first-time implementation. Workers and operators would then prefer to work according to traditional trial-and-error methods for obtaining the best performance rather than follow a systematic optimization procedure.

Depending on the system, the process of selecting objectives may vary in complexity. For a simple system, this task can be carried out by the operator on the basis of experience; for complex systems a panel of experts is required to identify the measures of output that require investigation. Once the objectives are defined they have to be organised in a hierarchy of their relative importance. This facilitates the selection of variables, the procedures for conducting the experiment and the techniques for measuring the response, so that no critically important information is lost due to lack of foresight.

The objective of an optimization exercise involving part feeding devices can range from maximizing feed rate, through minimizing feed cycle time or cost, to other relevant aspects. Depending upon the time and capital available, the amount of information required, the equipment available and the complexity of the system, such an exercise can have one or more objectives. In the case of feeders the output of interest is often the unrestricted feed rate, and the objective is to maximize it or to obtain a particular pre-determined value of this output as per requirements.

In the study of the reciprocating-fork hopper feeder considered as the example, the orientation of the parts to be fed is not of primary importance, as the fork is designed specifically to maintain the orientation of the headed parts being fed by the system. Hence, the primary objective is to observe the behaviour of the part flow rate by means of regression modelling.

### 2.2. Process factor selection

Process factor selection depends on the stated goal of investigation. We have to identify primary factors affecting the targeted output. Sometimes these primary factors in turn depend on secondary factors.

Process variables include both *inputs* and *outputs* - i.e., *factors* and *responses*.

The factors that have a reasonable effect on the throughput of a part feeder in general are:

*Load Sensitivity*, i.e. sensitivity of the feed rate to changes in the load, i.e. the amount of parts present in the feeder.

*Part Specifications* like geometry (shape and size), weight etc.

*Feeder Design Specifications* like angle of inclination of hopper, depth and shapes of grooves, length of slots etc.

*Operating Conditions* like frequency of reciprocation, speed of rotation etc.

Factors for a Reciprocating Fork Hopper Feeder are:

**I. Load Sensitivity (Quantity of parts in feeder)**

As has been explained earlier, the parts are circulated in the bowl of the feeder by a motor continuously during the operation of the feeder. The presence of an excessively large number of parts causes:

overloading of the system,

abrasion of the feeder bowl,

abrasion of parts due to friction against each other, and

overloading of the fork, as a large number of parts are forced onto it. Overloading of the fork is a serious concern as it can disturb the designed parameter settings of the system.

On the other hand, if too few parts are present in the feeder bowl:

there will not be enough parts present to load the fork sufficiently,

fewer parts will climb up the fork due to lack of back pressure from other parts, and

scraping of the fork on the bowl will create abraded tracks.

**II. Part Specifications**

Part Size

The head diameter of the part should be slightly greater than the gap between the fork prongs for the parts to be picked up in the correct orientation. Also, if the part is too long, it is picked up at a slant at the edge of the fork, leading to fewer parts being picked up.

Part Shape

Hexagonally shaped heads were found to have a greater probability of being picked up than circular shaped heads.

Weight of Part

The weight of the parts should lie within the limits prescribed during design of the fork. Parts that are too heavy may render the fork unable to lift them or cause the fork to bend under repeated stress.

**III. Feeder Design Specifications**

Angle of inclination of fork

If the angle of inclination of the fork is not appropriate, the fork will pick up few pieces or none at all. The pieces that are picked up will not slide smoothly ahead; they may get jammed or be thrown, neither of which is desirable.

Angle of inclination of bowl

The bowl has to be inclined at a small angle so that, when the fork is horizontal, the parts in contact with it lie at a lower level than the fork and can climb up between its prongs.

**IV. Operating Conditions**

Speed of rotation of bowl

This is a very important consideration, as rotation of the bowl is what facilitates the pick-up of the pieces by the fork, by keeping them in constant contact with it. Too high a speed will result in the parts being thrown to the very edges of the bowl and outside the path of the fork. At high speeds, even if the parts are in contact with the fork, it is often unable to pick them up as the parts do not get enough time to climb up the fork. At too low a speed, the rotation of the bowl ceases to have an effect on the process.

Frequency of reciprocation of fork

The reciprocation of the fork picks up parts and delivers them to the next stage. Too high a frequency of reciprocation does not allow the parts to climb up the prongs of the fork. At most the fork picks up a few pieces and, because the parts lack the time to slide down, throws them randomly. This is an extremely dangerous situation, as the flying parts can hit workers and machinery, causing injuries and damage. On the other hand, too low a frequency of reciprocation results in too few pieces being picked up, which wastes the potential of the feeder and slows down operations.

Ratio of time during which fork is lifted up to pointed down

The time period of each cycle of the reciprocation of the fork consists of three times:

Time when the fork is descending: Too fast a descent will lead to the fork crashing into the rotating bowl, damaging the fork, the bowl and the parts in contact. It may alter the angle of the fork. It can also cause parts to fly out of the bowl, which is an extremely dangerous situation.

Time when the fork is horizontal: This time allows for the parts to climb up and nestle securely between the prongs of the fork. Too small a time will not allow sufficient parts to climb up the fork. If too large a time is allowed the number of parts that can be picked up will reach saturation and the fork will stay in position not picking up additional parts, thus holding up the operation.

Time when the fork is ascending: This time allows the parts to slide smoothly down the delivery chute. It is very important that the parts maintain contact with the chute until they can fall into the bin under gravity. If they break contact too soon they will fly off.

Table 2 describes some of the process factors (when the objective is maximization of feed rate) for the rest of the common part feeders mentioned in Table 1. Note that factor I (load sensitivity) applies to all the feeders listed in the table and has therefore been omitted to save space. Preferred part shapes refer to the shapes which will give the maximum feed rate for that feeder. The factors mentioned here are collected from various results reported in the literature (Boothroyd, 2005). However, the lack of standard governing models means that these cannot be considered absolute, and actual experimentation must be carried out to find out the effect of these process factors on the objective function.

After the critical factors have been identified, their operational values have to be determined. In the case of a Reciprocating Fork Hopper Feeder, the values of parameters like angle of inclination of fork, weight of parts, shape of parts, part size etc. are determined and held constant. Three critical factors are taken as variable: the speed of rotation of the bowl, the part population, and the number of strokes of the fork per minute.

### 2.3. Factor level selection

Complete coverage of the entire region of operation of the system is often not possible; we have to limit our experiments to a region that is determined to be most relevant to our purpose on the basis of past experience or other preliminary analysis.

Once all factors affecting the output significantly are listed, constraints affecting them are identified. This can be done based on previous experience i.e. empirical data or based on process capability. This section will explain the process of determination of factor levels for each factor according to constraints. It is not always feasible to consider the whole set of viable values of a factor. In some cases, extreme values will give runs that are not feasible; in other cases, extreme ranges might move one out of a smooth area of the response surface into some jagged region, or close to an asymptote.

In our chosen example we have isolated the three variables of interest. For each of these variables we will identify, based on previous work, a range of operation where conducting the investigation will be most fruitful. This is summarised in Table 3 and explained in the following paragraphs. To facilitate the application of transformations if required later, the low and high levels of each factor are coded as -1 and +1 respectively.

Speed of Rotation of Cylindrical Bowl

The rotation of the bowl is provided by an adjustable speed motor at the base of the feeder, and its speed is measured in rotations per minute (rpm). As already discussed, speeds that are too high or too low are not conducive to obtaining a good output; the speed of rotation is therefore varied over the range 500 rpm to 1050 rpm.

Part Population

The number of parts being circulated in the rotating bowl is varied from 300 to 700 parts. For fewer than 300 initial parts, the number of parts picked up during operation of the feeder is negligible, whereas the number of parts transferred stagnates even if the initial part population is increased beyond 700.

Number of strokes per minute

The number of strokes that the fork makes per minute is varied from 4 to 8. These levels are selected on the basis of previous experience: at fewer than 4 strokes too few parts are picked up, while at more than 8 strokes the parts fly off because their time of contact with the fork is too short.
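The coding of factor levels can be illustrated with a minimal sketch. It assumes the usual convention that the low end of each range maps to -1 and the high end to +1, and uses the three ranges quoted above; the dictionary keys and function names are illustrative, not taken from the original study.

```python
# Factor ranges quoted in the text; the key names are hypothetical labels.
FACTOR_RANGES = {
    "speed_rpm": (500.0, 1050.0),
    "part_population": (300.0, 700.0),
    "strokes_per_min": (4.0, 8.0),
}


def to_coded(name, actual):
    """Map an actual factor value onto the coded scale (low -> -1, high -> +1)."""
    low, high = FACTOR_RANGES[name]
    centre = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return (actual - centre) / half_range


def to_actual(name, coded):
    """Inverse mapping: coded value back to engineering units."""
    low, high = FACTOR_RANGES[name]
    centre = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return centre + coded * half_range
```

Coded units make the factors dimensionless and directly comparable, so the magnitude of each fitted coefficient reflects the relative influence of its factor over the studied range.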

## 3. Process characterization – experimental designs

Obtaining the operating conditions at which the system output is optimal would be straightforward if theoretical expressions or models were available. In the absence of such expressions, a model has to be created, and the information needed to determine the performance of the device has to be obtained empirically. This can be done efficiently using Design of Experiments.

### 3.1. Design of experiments

The statistical Design of Experiments (*DOE*) is an efficient procedure for planning experiments so that the data obtained can be analysed to yield valid and objective conclusions (NIST, 2010). Once the objectives have been defined, the variables selected and their ranges chosen, a set of conditions needs to be defined that brings them together. The manipulation of these conditions in order to observe the response of the system is called an experiment.

The individual experiments to be performed can be organised in such a manner that the information obtained from each supplements the other. Thus, well planned experimental designs maximize the amount of information that can be obtained for a given amount of experimental effort. More importantly, the validity of experimental analysis is affected by the organisation and execution of the experiments. The significance, validity and confidence level of results obtained from a series of experiments can be improved drastically by incorporating elements like randomisation and replication into it. Randomisation is carried out to eliminate any existing bias or inertia in the experimental set-up. Randomisation of experiments can be done in two common ways:

Completely Randomised Designs

As the name suggests, the sequence in which a set of experiments is performed is completely random. One of the ways in which this can be accomplished is by generating random numbers using a computer and assigning this random sequence of numbers to the run order and then performing the experiments in accordance with the new and modified arrangement.

Randomised Block Design

In this method the experimental subjects are first divided into homogeneous groups called blocks, which are then randomly assigned for experimental treatments so that all treatment levels appear in a block (NIST, 2010).
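As a rough sketch of the randomised block idea, the snippet below gives every block the full set of treatments and shuffles their order independently within each block. The treatment and block labels are hypothetical, and the fixed seed is only there to make the run sheet reproducible.

```python
import random


def randomised_block_design(treatments, blocks, seed=None):
    """Return {block: treatments in an independently shuffled run order}.

    Every treatment appears exactly once in every block; only the order
    within a block is random.
    """
    rng = random.Random(seed)
    plan = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)
        plan[block] = order
    return plan


# Hypothetical example: three treatments run on two different days (blocks).
plan = randomised_block_design(["T1", "T2", "T3"], ["day 1", "day 2"], seed=1)
```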

### 3.2. Selecting experimental design

The choice of experimental design has to be in accordance with the parameter selection, i.e. process objectives, number of process variables and their levels. The possibilities for modelling data are also related to the experimental design chosen. This interconnection between all levels of the process demands that special attention be paid to the step of selecting the experimental design. Depending on the requirements of the experimenter, a number of designs have been developed, like:

Completely Randomised Designs

These designs are generally used when the experiment is small and the experimental units are roughly similar. A completely randomized design is the simplest type of randomization scheme in that treatments are assigned to units completely by chance. In addition, units should be run in random order throughout the experiment (Cochran & Cox, 1957). In practice, the randomization is typically performed by using a computer program or random number tables. Statistical analysis is carried out by one-way ANOVA and is relatively simple. The design is quite flexible, and missing information is not very problematic due to the large number of degrees of freedom assigned to error.

Randomised Block Designs

Randomised block designs are constructed when, in addition to the factor whose effect we wish to study, there are other noise factors affecting the measured result. In order to reduce the effect of undesirable noise in the error term, the method of blocking is employed, in which the experimental units are divided into relatively homogeneous sub-groups. Within a block the noise factor, called the blocking factor, is held constant at a particular level and the treatments are assigned randomly to the units in the block. This is done for all levels of the noise factor so that its effect may be estimated and subsequently eliminated.

Full Factorial Designs

The full factorial design of experiments consists of the exhaustive list of treatments obtained by combining all levels of all the factors with each other. The number of experimental runs is given by:

$$N = n^{k}$$

Where,

N = number of experimental runs

n = number of levels of a factor

k = number of factors

For full factorial designs at large number of levels or for large number of factors, the number of runs required becomes prohibitive either in terms of cost or time or resources required. For this reason many other designs have been proposed for handling such cases, e.g. Plackett-Burman designs for 5 or more factors at 2 levels.

Fractional Factorial Designs

It has been observed that there tends to be a redundancy in Full Factorial Designs in terms of an excess number of interactions that are estimated (Box et al., 1978). Though full factorial designs can provide an exhaustive amount of information, in practical cases estimation of higher order interactions is rarely required. This problem can be addressed by selecting an appropriate fraction of experiments from the full factorial design. In these arrangements certain properties are employed to select a (1/n)^{p} fraction of the complete design, and the reduced number of runs is given by:

$$N = n^{k-p}$$

Where,

N = Number of experimental runs

n = Number of levels

k = Number of factors

p = exponent of the fraction to be run, i.e. a (1/n)^{p} fraction

The chosen fraction should be balanced, i.e. each factor occurs an equal number of times at all the levels, and the design is orthogonal. An experimental design is orthogonal if the effects of any factor balance out (sum to zero) across the effects of the other factors ^{[4]}.
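Balance and orthogonality can be checked mechanically. The sketch below builds a 2^(3-1) half fraction of a three-factor, two-level design using the generator C = A·B (an illustrative choice, not one prescribed by the text) and verifies both properties on the coded columns.

```python
from itertools import product


def half_fraction_2_3():
    """Build a 2^(3-1) half fraction with the assumed generator C = A*B."""
    runs = []
    for a, b in product((-1, 1), repeat=2):
        runs.append((a, b, a * b))
    return runs


def is_balanced(runs):
    """Each factor appears equally often at -1 and +1 (column sums are zero)."""
    return all(sum(column) == 0 for column in zip(*runs))


def is_orthogonal(runs):
    """Every pair of factor columns has a zero dot product."""
    columns = list(zip(*runs))
    return all(
        sum(x * y for x, y in zip(columns[i], columns[j])) == 0
        for i in range(len(columns))
        for j in range(i + 1, len(columns))
    )


fraction = half_fraction_2_3()
```

With this generator the main effect of C cannot be separated from the A·B interaction, which is the price paid for halving the number of runs from 8 to 4.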

The interactions that have been confounded are lost, but the main effects and other interactions (not confounded) can be more precisely estimated due to reduced block size. This can be expressed in terms of the Resolution of the design:

Resolution III Design

Easy to construct but main effects are aliased with 2-factor interactions.

Resolution IV Design

Main effects are aliased with 3-factor interactions and 2-factor interactions are confounded with other 2-factor interactions.

Resolution V Design

Main effects are aliased with 4-factor interactions and 2-factor interactions are aliased with 3-factor interactions.

For the example considered, a 2-level factorial design is selected to investigate the behaviour of a reciprocating fork-hopper feeder for the three factors selected. This gives rise to a 2^{3} run design, i.e. the total experiment consists of 8 runs. Taking 4 replicates of each experiment, we finally have 32 randomised runs that need to be conducted.
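A run sheet of this kind can be generated in a few lines. The following sketch enumerates the 8 treatment combinations of a 2^3 design in coded units, replicates each 4 times and shuffles the resulting 32 runs; the function name and the seed are illustrative assumptions.

```python
import random
from itertools import product


def replicated_full_factorial(n_factors=3, replicates=4, seed=None):
    """Return a completely randomised run order for a replicated 2^k design."""
    treatments = list(product((-1, 1), repeat=n_factors))  # 2^k combinations
    runs = [t for t in treatments for _ in range(replicates)]
    random.Random(seed).shuffle(runs)  # randomise the order of all runs
    return runs


# Each tuple is (speed, part population, strokes per minute) in coded units.
run_sheet = replicated_full_factorial(seed=42)
```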

## 4. Analysis of data

### 4.1. Preliminary analysis

Once an experimental design has been selected that is commensurate with the requirements of the project, the experiments are performed under the stipulated conditions and data is collected. The data obtained by performing the experiments is checked for suitability for further analysis. The best way to examine the data is by means of various plots such as Normal and Half-Normal Plots, Pareto charts, FDS graphs, etc. The right graphs and plots of a dataset can uncover anomalies or provide insights that go beyond what most quantitative techniques are capable of discovering (NIST, 2010). This analysis is further supplemented by evaluating statistics to check for outliers and errors. Some common evaluators are:

Response distributions

Histograms

A histogram essentially shows the frequency with which data points fall within each of a set of ranges. The ranges into which the data points are divided are called classes in general, or bins in the particular case of histograms. A histogram can also be described as the plot of frequency of response vs. response. These graphs provide information regarding the presence of outliers, the range of the data, the centre point of the data and the skewness of the data. A histogram is shown in Figure 1(a).

Box Plots

Boxplots, also called box and whisker plots, are a quick graphic approach for examining data sets. This plot provides an excellent visual summary of the important aspects of a distribution. The box stretches from the lower hinge which is drawn at the 25th percentile to the upper hinge which is drawn at the 75th percentile and therefore contains the middle half of the scores in the distribution (Lane, 2001). The median is shown as a line across the box. Therefore 1/4 of the distribution is between this line and the top of the box and 1/4 of the distribution is between this line and the bottom of the box. Two lines, called whiskers, extend from the front and back of the box. The front whisker goes from Q1 to the smallest non-outlier in the data set, and the back whisker goes from Q3 to the largest non-outlier. A box plot is shown in Figure 1(b).
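The quantities drawn in a box plot can be computed directly. The sketch below assumes the common 1.5 × IQR rule for flagging outliers, which the text does not specify, and uses the inclusive quartile method of Python's `statistics` module; other quartile conventions give slightly different hinges.

```python
import statistics


def box_plot_summary(data):
    """Return hinges, median, whisker ends and outliers for a box plot."""
    values = sorted(data)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr  # assumed outlier rule (1.5 x IQR)
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # smallest non-outlier
        "whisker_high": inside[-1],  # largest non-outlier
        "outliers": outliers,
    }


# Hypothetical data with one obvious outlier.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```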

Typical DOE plots

Pareto plots

In terms of quality improvement, the Pareto effect states that 80% of problems usually stem from 20% of the causes. Pareto charts are therefore extremely helpful when the goal of the investigation is to screen all possible variables affecting a system output, and to select and isolate the parameters that are most significant. This is achieved by ranking the main effects and variable interactions in descending order of their contribution to the output. The chart consists of a bar graph showing the parameters in prioritized order: the bars are placed in rank order, so that the bar at the left has the highest impact on the output, and it can be determined which variables should be selected for further study. The purpose of the Pareto chart is to distinguish the vital few from the trivial many; it is therefore desirable that only a few variables on the left side of the chart account for most of the contribution to the output. A second stage of investigations can then be embarked upon, dealing with fewer parameters and thus smaller and more economical experiments. In Figure 2, it can be clearly seen that feedrate has the greatest impact on the throughput of the feeder, whereas the interaction effect of speed and population has almost negligible impact.
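The ranking behind a Pareto chart is simple to reproduce. The sketch below sorts effects by absolute magnitude and accumulates their percentage contribution; the effect values are made up purely for illustration and are not the results of the feeder study.

```python
def pareto_ranking(effects):
    """Rank effects by absolute size and add the cumulative percentage.

    effects: dict mapping a term name to its estimated effect
             (at least one effect must be non-zero).
    Returns a list of (term, |effect|, cumulative percentage) tuples.
    """
    ranked = sorted(effects.items(), key=lambda item: abs(item[1]), reverse=True)
    total = sum(abs(value) for _, value in ranked)
    rows = []
    running = 0.0
    for term, value in ranked:
        running += abs(value)
        rows.append((term, abs(value), 100.0 * running / total))
    return rows


# Hypothetical effect estimates for three factors and their interactions.
example_effects = {"A": 2.0, "B": -10.0, "C": 6.0, "AB": 0.5, "AC": 1.0, "BC": 0.5}
ranking = pareto_ranking(example_effects)
```

In this invented example the two largest effects account for 80% of the total, which is the kind of pattern that justifies dropping the remaining terms from a second-stage experiment.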

Normal or half-normal plots of the effects

The half-normal probability plot is a graphical tool that uses the ordered estimated effects to help assess which factors are important and which are unimportant. Quantitatively, the estimated effect of a given main effect or interaction, and its rank relative to other main effects and interactions, is obtained via least squares estimation. Having such estimates in hand, one can then construct a list of the main effects and interactions ordered by effect magnitude. Figure 3 (a) shows a Normal Plot and Figure 3 (b) shows a Half-Normal Plot.
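The coordinates of a half-normal plot can be sketched as follows. The plotting position (i - 0.5)/m used here is one common convention and is an assumption, as is the function name; statistical packages may use slightly different positions.

```python
from statistics import NormalDist


def half_normal_points(effects):
    """Pair each ordered |effect| with its theoretical half-normal quantile.

    effects: dict mapping a term name to its estimated effect.
    Returns (term, |effect|, quantile) tuples in increasing |effect|.
    """
    ranked = sorted(effects.items(), key=lambda item: abs(item[1]))
    m = len(ranked)
    std_normal = NormalDist()
    points = []
    for i, (term, value) in enumerate(ranked, start=1):
        # Assumed plotting position for the i-th smallest of m effects.
        probability = 0.5 + 0.5 * (i - 0.5) / m
        points.append((term, abs(value), std_normal.inv_cdf(probability)))
    return points


# Hypothetical effect estimates, for illustration only.
points = half_normal_points({"A": 2.0, "B": -10.0, "C": 6.0, "AB": 0.5})
```

Unimportant effects tend to fall on a straight line through the origin, while important effects stand away from that line towards the upper right.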

FDS Graph

When the goal is optimization, the emphasis is on producing a fitted surface as precisely as possible. How precisely the surface can be drawn is a function of the standard error (SE) of the predicted mean response—the smaller the SE the better. Figure 1 shows a contour plot of standard error for two of the factors in the. As can be seen from the graph the predictions around the perimeter of the design space exhibit higher standard errors than near the centre. To circumvent this, the design should be centred at the most likely point for the potential optimum. The fraction of design space plot is shown in Figure 4. It displays the area or volume of the design space having a mean standard error less than or equal to a specified value. The ratio of this volume to the total volume is the fraction of design space (Whitcomb, 2011).

### 4.2. Theoretical model creation

The goal of this experimental investigation is to formulate an appropriate empirical model relating the response, say y, to the independent variables, say x1, x2, x3 etc., and to predict the behaviour of the system with sufficient accuracy. This can be mathematically expressed as,

$$y = f(x_1, x_2, x_3, \ldots) + e$$

In this case the function f, which defines a relationship between the response (feed rate) and independent variables like part population, is not known and needs to be approximated. Usually first or second order models are sufficient to approximate the function f. The term e represents variability that is not accounted for by the function f, and is assumed to have a normal distribution with mean zero and constant variance. Often the step of model creation is preceded by coding the variables to make them dimensionless, with mean zero and the same standard deviation.

The model used to fit the data should be consistent with the goal of the experiment, to the extent that even the experimental design and data collection methodology are chosen such that maximum information can be extracted from the observed data for model creation.

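The first order model is appropriate when the independent variables are varied over a relatively small region, such that the curvature of the selected region in response space is negligible. A first order model where only the main effects of the variables are deemed significant is given by:

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + e$$

where \beta_0 is the intercept, \beta_i is the coefficient of the i-th of the k independent variables and e is the error term.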

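When the interactions of the factors amongst each other also play a significant role in the response along with the main effects, then the model is given by:

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + e$$

where \beta_{ij} is the coefficient of the interaction between x_i and x_j.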

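However, if the curvature of the solution space is significant, then first order models are inadequate to predict the response of the system. In such cases, higher order models are used, typically second or third-order models like the second-order model:

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^{2} + \sum_{i<j} \beta_{ij} x_i x_j + e$$

where \beta_{ii} is the coefficient of the quadratic term in x_i.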

Regression is a collection of statistical techniques for empirical model building (Karley et al., 2004). The independent variables are also called predictor variables or regressors, and the coefficients are called regression coefficients; their values are estimated using the various regression methods. A number of equations estimating the results can be formulated: the larger the number of regressor variables, the larger the number of possible equations. Evaluating all possible equations can be computationally demanding, so methods have been developed for evaluating only a small number of subset regression models, which are built by either adding or deleting regressors one at a time. Some statistics that can help in selecting the best possible model out of the ones generated are discussed later in the section.
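For a balanced, orthogonal two-level design in coded units, the least-squares estimate of each regression coefficient reduces to the average of the response multiplied by the corresponding model column. The sketch below relies on that special case; it is not a general regression routine, and the term labels and response values are hypothetical.

```python
from itertools import product
from math import prod

# Model terms as tuples of factor indices; () denotes the intercept.
TERMS = {
    "Intercept": (),
    "A": (0,), "B": (1,), "C": (2,),
    "AB": (0, 1), "AC": (0, 2), "BC": (1, 2),
}


def model_column(term_indices, run):
    """Value of a model term for one coded run (empty product equals 1)."""
    return prod(run[i] for i in term_indices)


def fit_coded_model(runs, responses):
    """Least-squares coefficients, valid only for orthogonal +/-1 designs."""
    n = len(runs)
    return {
        name: sum(model_column(idx, run) * y for run, y in zip(runs, responses)) / n
        for name, idx in TERMS.items()
    }


# Hypothetical noise-free responses from a known model, to check the fit.
runs = list(product((-1, 1), repeat=3))
responses = [50 + 5 * a - 3 * b + 2 * c + 1 * a * b for a, b, c in runs]
coefficients = fit_coded_model(runs, responses)
```

For designs that are not orthogonal, or for uncoded variables, a general least-squares solver would be needed instead of this shortcut.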

The F-statistic is a value resulting from a standard statistical test used to determine whether the variances of two populations are significantly different; in regression analysis it compares the variance explained by the model with the residual variance. The t-statistic is the estimated coefficient divided by its own standard error. It is used to test the hypothesis that the true value of the coefficient is non-zero, in order to confirm that the inclusion of an independent variable in the model is significant.

The regression equation fitted to the collected data is given below and factor estimates are summarized in Table 5.

The coefficient of determination, R-squared, is a measure of the fraction of the total squared error that is explained by the model. By definition the value of R^{2} varies between zero and one, and the closer it is to one, the better. However, a large value of R^{2} does not necessarily imply that the regression model is a good one. Adding a variable to the model will never decrease R^{2}, regardless of whether the additional variable is statistically significant or not. Thus it is possible for models that have large values of R^{2} to yield poor predictions of new observations or estimates of the mean response. To avoid this confusion, an additional statistic called the Adjusted R-squared is needed; its value decreases if unnecessary terms are added. Used together, these two statistics can reveal the existence of extraneous terms in the computed model, which is indicated by a large difference, usually of more than 0.20, between the values of R^{2} and Adj-R^{2}. The amount by which the output predicted by the model differs from the actual output is called the residual. The Predicted Residual Error Sum of Squares (PRESS) is a measure of how well the model fits each point in the design. It is used to calculate the predicted R^{2}. Here, the "Pred R-Squared" of 0.9859 is in reasonable agreement with the "Adj R-Squared" of 0.9897. Adeq Precision measures the signal to noise ratio; a ratio greater than 4 is desirable. These statistics are used to prevent overfitting of the model. A summary of the statistics is given in Table 4.

| Statistic | Value | Statistic | Value |
| --- | --- | --- | --- |
| Standard Deviation | 2.34 | R-Squared | 0.9921 |
| Mean | 56.78 | Adjusted R-Squared | 0.9897 |
| Coefficient of variation | 4.13 | Predicted R-Squared | 0.9859 |
| PRESS | 234.22 | Adequate Precision | 63.594 |
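As a rough illustration of how R-squared, Adjusted R-squared, PRESS and Predicted R-squared relate to one another, the sketch below computes them for a one-factor straight-line fit. The data and the function names (`fit_line`, `fit_statistics`) are hypothetical and are not the feeder data behind Table 4; PRESS is obtained from the leverage shortcut for leave-one-out residuals, which is an assumption about how one might compute it rather than the procedure used in the study.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def fit_statistics(x, y):
    """Return R2, adjusted R2, PRESS and predicted R2 for a straight-line fit."""
    n, p = len(x), 2  # p = number of estimated coefficients (b0 and b1)
    b0, b1 = fit_line(x, y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    ss_res = sum(e ** 2 for e in residuals)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    # Leverage of each point; e / (1 - h) is the leave-one-out residual.
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    press = sum((e / (1 - h)) ** 2 for e, h in zip(residuals, leverage))
    return {
        "r2": 1 - ss_res / ss_tot,
        "adj_r2": 1 - (ss_res / (n - p)) / (ss_tot / (n - 1)),
        "press": press,
        "pred_r2": 1 - press / ss_tot,
    }


# Hypothetical data: one factor and a nearly linear response.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
summary = fit_statistics(x, y)
print(summary)
```

A large gap between `r2` and `adj_r2`, or between `adj_r2` and `pred_r2`, would be the warning sign discussed above.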

Multicollinearity, in which some variables add very little or even no new and independent information to the model, inflates the variances of the coefficient estimates, which is quite detrimental to regression (Belsley, Kuh & Welsch, 1980). One way of detecting multicollinearity is to use the Variance Inflation Factor (VIF).
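The VIF of a predictor is commonly defined as 1 / (1 - R_j^{2}), where R_j^{2} is obtained by regressing that predictor on all the others. For the special case of exactly two predictors, R_j^{2} is simply the squared correlation between them, which the minimal sketch below exploits. The function names and the data are hypothetical, and the two-predictor restriction is a simplifying assumption.

```python
import math


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two of them."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Orthogonal coded factors, as in a two-level factorial design: no inflation.
vif_orthogonal = vif_two_predictors([-1, 1, -1, 1], [-1, -1, 1, 1])

# Two nearly collinear (hypothetical) predictors: strong inflation.
vif_collinear = vif_two_predictors([1, 2, 3, 4, 5], [1.1, 2.0, 3.2, 3.9, 5.1])

print(vif_orthogonal, vif_collinear)
```

A VIF of 1 means the predictor is uncorrelated with the others; a frequently quoted rule of thumb treats values above about 10 as a sign of serious multicollinearity.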

A likelihood ratio test is a statistical test used to compare the fit of two models, the null model and the selected model. The likelihood ratio indicates numerically how much more likely the data are under the selected model than under the null model. Thus it can be used to reject the null hypothesis.

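The test statistic, D, is given by:

$$D = -2\ln\left(\frac{\text{likelihood of the null model}}{\text{likelihood of the selected model}}\right) = 2\left(\ln L_{\text{selected}} - \ln L_{\text{null}}\right)$$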

A model with more parameters will, by construction, have a log-likelihood at least as large as that of a simpler model nested within it. Whether the improvement in fit is statistically significant must therefore be established before making a decision.
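For least-squares fits with normally distributed errors, the statistic D reduces to n ln(RSS_null / RSS_selected), and for a single extra parameter it can be compared against a chi-squared distribution with one degree of freedom. The sketch below follows that route; the residual sums of squares are hypothetical numbers, and both the normal-error assumption and the one-extra-parameter restriction are assumptions made for the illustration.

```python
import math


def gaussian_log_likelihood(rss, n):
    """Maximized log-likelihood of a least-squares fit with normal errors."""
    return -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)


def likelihood_ratio_test(rss_null, rss_selected, n):
    """Return (D, p-value), valid when the selected model has ONE extra parameter."""
    d = 2 * (gaussian_log_likelihood(rss_selected, n)
             - gaussian_log_likelihood(rss_null, n))
    d = max(d, 0.0)  # guard against tiny negative round-off
    # Survival function of chi-squared with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(d / 2))
    return d, p_value


# Hypothetical residual sums of squares for n = 8 observations.
n_obs = 8
d_stat, p_val = likelihood_ratio_test(rss_null=167.8, rss_selected=0.195, n=n_obs)
print(d_stat, p_val)
```

A small p-value suggests that the extra term improves the fit by more than would be expected from the additional parameter alone.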

As has been mentioned earlier, for a set of experimentally observed data a number of regression models can be formulated. One approach through which the best model among all the possible models can be selected is through the evaluation of statistics like the Akaike Information Criterion (AIC) and Schwarz Criterion.

The Akaike information criterion is a measure of the relative goodness of fit of a statistical model (Akaike, 1974).

$$AIC = 2k - 2\ln(L)$$

Where,

k = number of parameters in the model,

L = maximized value of likelihood function of the model.

In case of a least-squares regression model,

$$AIC = n\ln\left(\frac{RSS}{n}\right) + 2k$$

Where, RSS is the residual sum of squares of the fitted model and n is the sample size.

In the least-squares form, the first term can be called the bias term and the second the variance term. From the formula it is clear that the value of the statistic decreases with goodness of fit, and increases with increasing number of parameters in the model. Because the two terms play opposing roles, the criterion addresses the concern that goodness of fit can always be improved simply by including more terms in the model. All possible models created using the same data set are ranked according to their AIC values in ascending order, and the best model is the one with the least value of AIC.

However, the scope of AIC should not be overestimated; it does not test for model validity, it only compares the models created, even if all of them are unsatisfactory in predicting the output of the system.

Similarly, the Schwarz Information Criterion, or Bayesian Information Criterion, also ranks the models created on the basis of their goodness of fit and the number of parameters included in the regression model, by assigning a penalty for increasing the number of parameters. The penalty for adding more terms to the model is greater for the Schwarz Criterion. It is given by:

$$SC = k\ln(n) - 2\ln(L)$$

Where, n is the sample size. The model having the smallest value of this statistic is recommended.
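The ranking of candidate models by these two criteria can be sketched as follows, using the standard least-squares forms n ln(RSS/n) + 2k for the Akaike criterion and n ln(RSS/n) + k ln(n) for the Schwarz criterion. The three candidate models, their residual sums of squares and their parameter counts are hypothetical values chosen only to show the mechanics.

```python
import math


def aic_least_squares(rss, n, k):
    """Akaike information criterion for a least-squares fit."""
    return n * math.log(rss / n) + 2 * k


def bic_least_squares(rss, n, k):
    """Schwarz (Bayesian) information criterion for a least-squares fit."""
    return n * math.log(rss / n) + k * math.log(n)


# Hypothetical candidates fitted to the same n = 16 runs: (RSS, parameters).
n_runs = 16
candidates = {
    "main effects": (40.0, 4),
    "main effects + interactions": (12.0, 6),
    "full model": (11.5, 8),
}

aic = {name: aic_least_squares(rss, n_runs, k) for name, (rss, k) in candidates.items()}
bic = {name: bic_least_squares(rss, n_runs, k) for name, (rss, k) in candidates.items()}

best_by_aic = min(aic, key=aic.get)
best_by_bic = min(bic, key=bic.get)
print(best_by_aic, best_by_bic)
```

In this made-up case the full model fits slightly better than the intermediate one, but not by enough to pay for its two extra parameters under either criterion.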

There may exist correlation between different values of the same variable measured at different times; this is called autocorrelation and is measured by the autocorrelation coefficient. It is one of the requirements of a good regression model that the error deviations remain uncorrelated. The Durbin-Watson statistic indicates the likelihood that the error deviation values for the regression are correlated, and thus tests the independence assumption. Autocorrelated deviations are indicators of a host of possible problems in the regression model, ranging from an incorrect degree of the regression model (e.g., a linear equation fitted to quadratic data) to non-minimum variance of the regression coefficients and underestimation of the standard error. If the true standard error is miscalculated, it results in incorrect computation of t-values and confidence intervals for the analysis. The value of the Durbin-Watson statistic, d, always lies between 0 and 4. The ideal value of d is 2, which implies no autocorrelation. If the value of d is substantially less than 2, it indicates positive correlation, and if d is substantially greater than 2, it indicates negative correlation. Values of d smaller than one are particularly alarming, as they indicate that successive error terms are close in value to one another.
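The Durbin-Watson statistic is conventionally computed as the sum of squared differences between successive residuals divided by the sum of squared residuals. A minimal sketch, with two hypothetical residual sequences taken in run order:

```python
def durbin_watson(residuals):
    """Durbin-Watson d for residuals listed in run (time) order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Residuals that drift slowly: successive values are close (positive correlation).
d_drifting = durbin_watson([-3, -2, -1, 1, 2, 3])

# Residuals that flip sign every run (negative correlation).
d_alternating = durbin_watson([1, -1, 1, -1, 1, -1])

print(d_drifting, d_alternating)
```

The drifting sequence gives a value well below one, while the alternating sequence gives a value above two, matching the interpretation given above.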

In Table 6, the best regression model amongst the ones formulated for a different system is shown for which a number of the statistics discussed have been evaluated.

An Analysis of Variance or ANOVA table provides statistics about the overall significance of the model being fitted. Table 6 displays the results of ANOVA for the system under observation. Here, Degrees of Freedom (DOF) stands for the number of independent values in the dataset and is obtained by subtracting the number of estimated parameters from the number of elements in the dataset. DOF plays a very important role in the calculation and comparison of variation. The individual Sums of Squares and the Total Sum of Squares have different degrees of freedom and cannot be compared directly. Each is therefore divided by its degrees of freedom, so that variation can be compared per degree of freedom.

For ANOVA, if N = total number of data points and M = number of factor levels, then:

Df (factor), corresponding to the between-factor variance = M - 1

Df (error), corresponding to the residual = N - M

Df (total) = N - 1 (which is the sum of the two above)

The F value statistic tests the overall significance of the regression model. It compares curvature variance with residual variance. It tests the null hypothesis which states that the regression coefficients are equal to zero. The value of this statistic can range from zero to an arbitrarily large number. If the variances are close to the same, the ratio will be close to one and it is less likely that curvature is significant. The Model F-value of 428.49 implies the model is significant. There is only a 0.01% chance that an F-value this large for the model could occur due to noise. Prob(F) values give the probability of seeing an F value at least as large as the observed one if the null hypothesis is true. Small probability values call for rejection of the null hypothesis that curvature is not significant. For example, if Prob(F) has a value of 0.01000, then there is 1 chance in 100 of obtaining such an F value when all of the regression parameters are zero. This low a value would imply that at least some of the regression parameters are nonzero and that the regression equation does have some validity in fitting the data.
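The degrees-of-freedom bookkeeping and the resulting F ratio can be illustrated with a one-way layout, in which the between-level mean square is divided by the residual mean square. The throughput values at the three factor levels below are hypothetical, and the one-way layout is a simplification of the factorial analysis discussed in the text.

```python
def one_way_anova(groups):
    """Return degrees of freedom, mean squares and F for a one-way layout."""
    n_total = sum(len(g) for g in groups)       # N
    n_levels = len(groups)                      # M
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = n_levels - 1
    df_within = n_total - n_levels
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "df_total": n_total - 1,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_value": ms_between / ms_within,
    }


# Hypothetical throughput readings at three levels of one factor.
anova = one_way_anova([[48, 50, 52], [55, 57, 59], [63, 65, 67]])
print(anova)
```

An F value far above one, as here, indicates that the variation between levels is much larger than the run-to-run variation within a level.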

The user-specified probability used to obtain the cut-off value from the F distribution is called the significance level of the test. The significance level for most statistical tests is denoted by α. The most commonly used value for the significance level is α=0.05, which means that the hypothesis of an adequate model will only be rejected in 5% of tests for which the model really is adequate. Values of Prob(F) less than 0.0500 indicate model terms are significant. In this case A, B, C, AC, BC are significant model terms.

#### 4.2.1. Testing of model assumptions using experimental data

Once a good regression model has been created for the system and its adequacy tested, it has to be ensured that the model does not violate any of the assumptions of regression. A variety of graphical and numerical indicators exist to examine how well the selected model conforms to the regression assumptions and how soundly the experimental data fit it. However, carrying out any one of these tests is not sufficient to reach a conclusion regarding the effectiveness of the model. No statistic or test is competent in itself to diagnose all the potential problems that may be associated with a certain model. For the purpose of testing model assumptions, graphical methods are preferred, as deviations and errors are easier to spot in visual representations. The task of assessing model assumptions leans heavily on the use of residuals. As already mentioned in the previous section, residuals are the differences between the observations and the fitted values. Studentized residuals, i.e., residuals divided by their standard errors, are rather popular for this purpose, as scaled residuals are easier to handle and provide more information.

The assumptions are:

Normality - the data distribution should lie along a symmetrical bell-shaped curve,

Homogeneity of variance or homoscedasticity - error terms should have constant variance, and

Independence - the errors associated with one observation are not correlated with the errors of other observations.

Additionally, the influence of observations on the regression coefficients needs to be examined. In some cases, one or more individual observations exert undue influence on the coefficients, so that removing such an observation significantly changes the coefficient estimates.

It has already been examined how well the experimental data fits the model via numerical statistics like R-squared and Adjusted R-squared. The plot of predicted responses versus actual responses performs the same function graphically, and also helps to detect the points where the model becomes inadequate to predict the response of the system. This is the simplest graph which shows that the selected model is capable of predicting the response satisfactorily within the range of the data set, as shown in Figure 5 (a).

To draw evidence for violations of the mean equal to zero and the homoscedastic assumptions, the residuals are plotted in many different ways. As a general rule, if the assumptions being tested are true, the observations in a plot of residuals against any independent variable should have a constant spread.

The plot of Residuals versus Predictions, shown in Figure 5 (b), tests the assumption of constant variance. The plot should be a random scatter. If the residuals are spread evenly around zero, it implies that the assumption of homoscedasticity is not violated. If there is a high concentration of residuals above zero or below zero, the variance is not constant and thus a systematic error exists. Expanding variance indicates the need for a transformation.

The linearity of the regression mean can be examined visually by plots of the residuals against the predicted values. A statistical test for linearity can be constructed by adding powers of fitted values to the regression model, and then testing the hypothesis of linearity by testing the hypothesis that the added parameters have values equal to zero. This is known as the RESET test. The constancy of the variance of the dependent variable (error variance) can be examined from plots of the residuals against any of the independent variables, or against the predicted values.

Random, patternless residuals imply independent errors. Even if the residuals are evenly distributed around zero and the assumption of constant variance of residuals is satisfied, the regression model is still questionable when there is a pattern in the residuals.

Residuals vs. Run: This is a plot of the residuals versus the experimental run order and is shown in Figure 6 (a). It checks for lurking variables that may have influenced the response during the experiment. The plot should show a random scatter. Trends indicate a time-related variable lurking in the background.

The normal probability plot indicates whether the residuals follow a normal distribution, in which case the points will follow a straight line. Definite patterns may indicate the need for application of transformations. A Normal Probability plot is given in Figure 6 (b).
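The coordinates of a normal probability plot can be generated by pairing the sorted residuals with the corresponding quantiles of a standard normal distribution. The sketch below uses the plotting positions (i - 0.5)/n, which is one common convention and an assumption here; the residuals are hypothetical.

```python
from statistics import NormalDist


def normal_plot_points(residuals):
    """Pair each sorted residual with its theoretical standard-normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    quantiles = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))


# Hypothetical studentized residuals from eight runs.
points = normal_plot_points([0.4, -1.3, 0.1, 1.6, -0.5, 0.9, -0.2, -0.8])
for theoretical, observed in points:
    print(f"{theoretical:7.3f}  {observed:6.2f}")
```

If the residuals are approximately normal, plotting the second column against the first gives points that lie close to a straight line.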

Leverage is a measure of how far an independent variable deviates from its mean. It is the potential for a design point to influence the fit of the model coefficients, based on its position in the design space. An observation with an extreme value on a predictor variable is called a point with high leverage. These high leverage points can have an unusually large effect on the estimate of regression coefficients. The leverage of a point can vary from zero to one, and leverages near one should be avoided. To reduce leverage, runs should be replicated, as the maximum leverage an experiment can have is 1/k, where k is the number of times the experiment is replicated. A run with leverage greater than 2 times the average is generally regarded as having high leverage. Figure 7(a) shows the leverages for the experiment.

Cook’s Distance is a measure of the influence of individual observations on the regression coefficients, and hence indicates how much the estimates of the regression coefficients change if that observation is not considered. Observations having high leverage values and large studentized residuals typically have large Cook’s Distance. Large values can also be caused by recording errors or an incorrect model. Figure 7(b) shows the Cook’s D for the investigation under discussion.
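Leverage, studentized residuals and Cook's Distance can be computed together, as the following sketch for a one-factor straight-line fit shows. The data are hypothetical, with the last run deliberately placed far from the others, and the closed-form leverage expression holds only for this simple one-predictor case.

```python
import math


def influence_diagnostics(x, y):
    """Leverage, internally studentized residuals and Cook's D for y = b0 + b1*x."""
    n, p = len(x), 2  # p = number of estimated coefficients
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    studentized = [e / math.sqrt(mse * (1 - h)) for e, h in zip(residuals, leverage)]
    cooks_d = [r ** 2 * h / (p * (1 - h)) for r, h in zip(studentized, leverage)]
    return leverage, studentized, cooks_d


# Hypothetical runs; the last setting lies far from the rest of the design.
x = [1, 2, 3, 4, 5, 6, 7, 20]
y = [2.0, 4.1, 5.9, 8.2, 9.9, 12.1, 14.0, 30.0]
leverage, studentized, cooks_d = influence_diagnostics(x, y)

# Flag runs whose leverage exceeds twice the average leverage (p / n).
average_leverage = 2 / len(x)
high_leverage_runs = [i for i, h in enumerate(leverage) if h > 2 * average_leverage]
print(high_leverage_runs, cooks_d)
```

The isolated run is flagged by the leverage rule and also has by far the largest Cook's Distance, which is the combination described above.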

Lack-of-fit tests can be used to supplement the residual plots if there remains any ambiguity about the information provided by them. The need for a model-independent estimate of the random variation means that replicate measurements made under identical experimental conditions are required to carry out a lack-of-fit test. If no replicate measurements are available, then there will not be any baseline estimate of the random process variation to compare with the results from the model.

#### 4.2.2. Choice of transformations

Data transformations are commonly used to correct violations of model assumptions and to simplify the model. If the residuals are not randomly and normally distributed, or if the variance is not constant, transformations are required to make the data suitable for statistically proper application of modelling techniques and for tests like the F-test, t-test, Chi-squared test etc. Significant violations of model assumptions result in increased chances of Type I and Type II errors (Osborne, 2010). While parametric tests like ANOVA and regression modelling assuredly benefit from improved normality, their results may not be significantly affected by minor violations of these assumptions. Thus, the decision to apply transformations should be taken judiciously, as their application makes the task of interpreting results more complicated.

The need for application of transformations can be identified through residual vs. predicted value plots, normal probability plots etc. Normal probability plots indicate whether the data is normal and, in case of non-normality, also give information about the nature of the non-normality, and thus help in the selection of appropriate transforms for correcting the violation of the normality assumption. To obtain the normal probability plot, the data are plotted against a theoretical normal distribution such that normally distributed data form an approximate straight line. Departures from this straight line indicate departures from normality. Other commonly used tests include the Shapiro-Wilk test, D’Agostino omnibus test, Kolmogorov-Smirnov test, Pearson Chi-squared test etc.

Some traditional transformations often used to correct the problems of non-normality and non-homogeneity of variance are:

Logarithmic Transformation: Logarithmic transformations can give rise to a class of transformations depending on the base used, base 10, base e or other bases. These transformations are considered appropriate for data where the standard deviation is proportional to the mean. These types of data may follow a multiplicative model instead of an additive model.

Square Root Transformation: It can be considered as a special case of power transformation, where the data is raised to the one-half power. This type of transformation is applied when the data consists of small whole numbers that follow a Poisson distribution, for which the mean is proportional to the variance. Data containing negative numbers can be transformed after adding a constant that makes all values non-negative.

Arc Sine Square Root Transformation: These transformations are appropriate for binomial data and involve taking arcsine of square-root of the values.

Inverse Transformation: Takes the inverse of the data, i.e., raises the data to the power -1. It transforms large values into small values and vice versa.
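The four traditional transformations listed above can be written as short helper functions. The function names and the sample values are hypothetical; the `shift` argument is one simple way of handling negative values before taking the square root.

```python
import math


def log_transform(values, base=math.e):
    """Logarithm of strictly positive data in the chosen base."""
    return [math.log(v, base) for v in values]


def sqrt_transform(values, shift=0.0):
    """Square root after adding a constant that makes every value non-negative."""
    return [math.sqrt(v + shift) for v in values]


def arcsine_sqrt_transform(proportions):
    """Arcsine of the square root, for proportions between 0 and 1."""
    return [math.asin(math.sqrt(p)) for p in proportions]


def inverse_transform(values):
    """Reciprocal of non-zero data."""
    return [1.0 / v for v in values]


print(log_transform([1, 10, 100], base=10))
print(sqrt_transform([-1, 3, 8], shift=1))
print(arcsine_sqrt_transform([0.0, 0.25, 1.0]))
print(inverse_transform([0.5, 4.0]))
```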

In their paper in 1964, Box & Cox suggested the use of power transformations to ensure that the usual assumptions of linear model hold. Box-Cox transformations have been shown to provide the best solutions and help researchers to easily find the appropriate transformation to normalise the data or equalise variance.

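The Box-Cox power transformation is defined as:

$$y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\ \ln y, & \lambda = 0 \end{cases}$$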

The transformation parameter, λ, is chosen such that it maximizes the log-likelihood function. The maximum likelihood estimate of λ corresponds to the value for which the squared sum of errors from the fitted model is a minimum. This value of λ is determined by fitting a model for various values of λ and choosing the value corresponding to the minimum squared sum of errors. It can also be chosen graphically from the Box-Cox normality plot. A value of λ = 1.00 indicates that no transformation is needed and produces results identical to the original data. A transformation parameter could also be estimated on the basis of enforcing a particular assumption of the linear model (Cochran & Cox, 1957), such as additivity.
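The search for λ can be sketched as a grid search over the profile log-likelihood. To stay self-contained, the sketch assumes the simplest possible model (a constant mean, with no factors), so the error variance is just the variance of the transformed data; in a real analysis the regression model would be refitted for each λ. The data are hypothetical and were constructed so that their logarithms are symmetric.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of strictly positive data."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1) / lam for v in y]


def box_cox_log_likelihood(y, lam):
    """Profile log-likelihood (up to a constant) for a constant-mean model."""
    n = len(y)
    z = box_cox(y, lam)
    z_bar = sum(z) / n
    variance = sum((v - z_bar) ** 2 for v in z) / n
    # The second term is the Jacobian of the transformation.
    return -0.5 * n * math.log(variance) + (lam - 1) * sum(math.log(v) for v in y)


def best_lambda(y, grid):
    """Grid value of lambda with the largest profile log-likelihood."""
    return max(grid, key=lambda lam: box_cox_log_likelihood(y, lam))


# Hypothetical positive, right-skewed data whose logarithms are symmetric.
data = [math.exp(u) for u in (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5)]
grid = [i / 4 for i in range(-8, 9)]  # -2.0, -1.75, ..., 2.0
lam_hat = best_lambda(data, grid)
print(lam_hat)
```

For these data the search selects λ = 0, i.e. the logarithmic transformation, which is what one would expect for data that are symmetric on the log scale.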

In 1969, Cox & Draper reported that in data that could not be transformed into normal, the use of Box-Cox transformations resulted in obtaining symmetric data. Many modifications were subsequently suggested by Manly (1971), John & Draper (1980), Bickel & Doksum (1981) and Yeo & Johnson (2000).

## 5. Optimization

Once the best possible empirical model, one that estimates the functioning of a system satisfactorily, has been formulated, the next step is to obtain the optimal region of operation for the system under consideration. From a mathematical point of view, searching for the optimal region of operation is akin to finding the factor levels at which the system response(s) is maximized or minimized, depending on the nature of the output being studied. Experimental optimization, as carried out under Response Surface Methods, is an iterative process; that is, the model fitted to the data from one set of experiments paves the path for conducting successive experimentation, which results in improved models. This process can be seen as searching for the optimal region of operation. However, it should be borne in mind that the fitted responses are local approximations of the curvature of the solution space. Thus, the empirical model generated in our illustrative example holds true only for the region in which it is generated, and hence the optimization procedure should also be restricted to that region. It is quite possible that outside the range investigated the behaviour of the system changes drastically and the empirical model and its predictions are no longer true. Hence, the screening experimentation should always be carried out over the maximum possible operating range.

Response Surface Methodology is a collection of mathematical and statistical techniques that is useful for modelling, analysis and optimization of systems (Box and Wilson, 1951). It is a sequential method, in which the first stage involves screening runs to identify the factors which have a significant effect on the target output. Then a first-order model consisting of the selected factors is formulated and examined. If the current settings do not place the system in the optimal region of operation, then the experimenter moves the system towards such a region by an iterative optimization technique called the steepest ascent technique. Once the system reaches the vicinity of the optimal region of operation, a second set of data is collected, and a model is fitted to this data. Typically, near the optimal region the model created is of second or higher order due to the inherent curvature of the solution space.
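The steepest ascent step can be illustrated with a first-order model in coded units: the path moves from the current settings in the direction of the fitted coefficients, and an experiment would be run at each point along it. The coefficients, intercept, step size and function names below are hypothetical.

```python
import math


def steepest_ascent_path(coefficients, start, step_size, n_steps):
    """Points along the direction of steepest ascent of a first-order model."""
    norm = math.sqrt(sum(b * b for b in coefficients))
    direction = [b / norm for b in coefficients]  # unit gradient vector
    return [
        [s + k * step_size * d for s, d in zip(start, direction)]
        for k in range(n_steps + 1)
    ]


def predict(intercept, coefficients, point):
    """Response predicted by the first-order model at a coded design point."""
    return intercept + sum(b * v for b, v in zip(coefficients, point))


# Hypothetical first-order model in two coded factors: y = 50 + 3*x1 + 4*x2.
b0, b = 50.0, [3.0, 4.0]
path = steepest_ascent_path(b, start=[0.0, 0.0], step_size=0.5, n_steps=3)
predicted = [predict(b0, b, point) for point in path]
print(path)
print(predicted)
```

In practice the predicted values only guide where to run the next experiments; the path is followed until the measured response stops improving, at which point a new design is centred there.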

### 5.1. Single and multi-response processes

Until now, the entire discussion has been focussed on a single response, namely throughput of a feeding system. In more complex systems and situations, engineers are often faced with problems requiring multi-response optimization. In addition to maximizing the feed rate, there can be other tasks like minimizing the cost. Single-response optimization is comparatively straightforward. On the other hand, in multi-response optimization it is not always possible to find operating conditions that simultaneously fulfil the process objectives.

Simultaneous optimization combines all the targeted response requirements into one composite requirement by assigning weights to the responses.

The Dual Response Surface Method (Meyers and Carter, 1973) is based on an algorithm for obtaining the optimal solution to this problem by assuming a primary response and a constraint response, both of which can be fitted by a quadratic model.

Desirability function approach (Harrington, 1965) is used to simultaneously optimize multiple objectives. It is one of the more popular methods used to tackle the problem of multi-objective optimization. Each response is assigned a desirability value between 0 and 1 and its value represents the closeness of a response to its ideal value. If a response falls outside the acceptable intervals, the desirability is 0, and if a response falls within the ideal intervals or the response reaches its ideal value, the desirability is 1. If the response falls within the tolerance intervals, but not the ideal interval, or when it fails to reach its ideal value the desirability lies between 0 and 1 (Raissi & Farsani, 2009). A composite desirability function is created that combines the individual desirability values and converts a multi-response problem into a single-response one.
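The desirability idea can be sketched with simple one-sided functions and a geometric mean as the composite. The piecewise power form used here is a commonly used choice and is an assumption; it is not necessarily the exact form proposed by the authors cited above. The response limits (feed rate to be maximized, cost to be minimized) are hypothetical.

```python
import math


def desirability_larger(y, low, target, weight=1.0):
    """Larger-is-better: 0 at or below `low`, 1 at or above `target`."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** weight


def desirability_smaller(y, target, high, weight=1.0):
    """Smaller-is-better: 1 at or below `target`, 0 at or above `high`."""
    if y <= target:
        return 1.0
    if y >= high:
        return 0.0
    return ((high - y) / (high - target)) ** weight


def composite_desirability(values):
    """Geometric mean; it is zero if any single response is unacceptable."""
    return math.prod(values) ** (1.0 / len(values))


# Hypothetical operating point: feed rate of 50 parts/min and a cost of 15 units.
d_rate = desirability_larger(50.0, low=40.0, target=60.0)
d_cost = desirability_smaller(15.0, target=10.0, high=20.0)
overall = composite_desirability([d_rate, d_cost])
print(d_rate, d_cost, overall)
```

Using a geometric rather than an arithmetic mean ensures that a setting which fails one requirement completely cannot be rescued by doing well on the others.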

### 5.2. Confirmation of optimum

In addition to theoretical validation of the goodness of the model in prediction of the response, when the analysis of the experiment is complete, one must verify that the predictions are good in practice. The experiments conducted for this purpose are called confirmation runs. This stage of conducting confirmation runs may involve a few random experiments, or it may involve the use of a smaller design to carry out systematic analysis, depending on the scale and complexity of the process.

Typically after the experiments have been conducted, analysis of data carried out and confirmation runs selected, there is a not insignificant probability that the value of system configurations and environmental factors may have changed. Thus it is very important for the experimenter to ensure that proper controls and checks are in place before confirmation runs are carried out.

The interpretation and conclusions from an experiment may include a "best" setting to use to meet the goals of the experiment. Even if this "best" setting was included in the design, it should be run again as part of the confirmation runs to make sure nothing has changed and that the response values are close to their predicted values. In an industrial setting, it is very desirable to have a stable process; thus multiple confirmation runs are often conducted.

The purpose of performing optimization of the feed rate is to maximize the throughput for given values of the input parameters. From the regression model and the related graphs and statistics, it can be deduced that the feed rate of a reciprocating fork-hopper feeder generally increases with the increase in population. Hence the population was kept at its maximum value. The optimization is carried out for both the speed and the frequency of strokes. While carrying out optimization for the speed, the frequency of strokes was kept at its maximum, since throughput increases with increasing frequency of strokes.

## 6. Conclusion

In this chapter, a general process for optimization of part feeding systems is demonstrated by taking a reciprocating fork-hopper feeder as an example to clarify specific techniques. Exhaustive testing of the system is cumbersome because there exist too many treatments to be tested. Statistical methods based on experimental design, regression analysis and optimization techniques can be used to carry out this task more effectively and efficiently. To this end, the implementation of a two-level full factorial design and its attendant analyses are shown and explained. This work should serve as a guideline for drawing statistically valid conclusions about the system under consideration.