Formulae for calculating statistics for weighted linear regression (WLR).
The aim of this chapter is to show how to check the underlying assumptions of a regression analysis (the errors are independent, have zero mean and constant variance, and follow a normal distribution), mainly when fitting a straight-line model to experimental data, via residual plots. Residuals play an essential role in regression diagnostics; no analysis is complete without a thorough examination of the residuals. The residuals should show a trend that tends to confirm the assumptions made in performing the regression analysis or, failing that, should not show a trend that contradicts them. Although there are numerical statistical means of verifying observed discrepancies, statisticians often prefer a visual examination of residual graphs as a more informative and certainly more convenient methodology. Graphical techniques are particularly useful when dealing with small samples. Several examples taken from scientific journals and monographs are selected, dealing with linearity, calibration, heteroscedastic data, errors in the model, transforming data, time-order analysis and non-linear calibration curves.
- least squares method
- residual analysis
- transforming data
“Since all models are wrong the scientist cannot obtain a “correct” one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena. Just as the ability to devise simple but evocative models is the signature of the great scientist so overelaboration and overparameterization is often the mark of mediocrity”.
(Box, G.E.P. Science and Statistics, J. Am. Stat. Ass. 1976, 71, 791–796)
The purpose of this chapter is to provide an overview of checking the underlying assumptions (errors normally distributed with zero mean and constant variance) in regression analysis, mainly when fitting a straight-line model to experimental data.
| Statistic | Formula |
|---|---|
| Explained sum of squares | SS_reg = Σ w_i (ŷ_i − ȳ_w)² |
| Weighted residuals | √w_i (y_i − ŷ_i) |
| Residual sum of squares | SS_E = Σ w_i (y_i − ŷ_i)² |
| Sum of squares about the mean | SS_T = Σ w_i (y_i − ȳ_w)² |
| Correlation coefficient | r² = SS_reg / SS_T |
| Standard errors | computed from s² = SS_E / (n − 2) |
The calculated regression line, ŷ = a₀ + a₁x, corresponds to the model y = β₀ + β₁x + ε. Note that the model error is given by ε = y − (β₀ + β₁x).
An assumption that the errors are normally distributed is not required to obtain the parameter estimates by the least squares method. However, for inferences and estimates (standard errors, confidence intervals and hypothesis tests), normality is required.
A standardized residual is the residual divided by the standard deviation of the regression line, e_i/s.
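As a minimal sketch (Python, with an ordinary least squares straight-line fit assumed; the data are hypothetical), the standardized residuals can be computed as follows:

```python
import math

def standardized_residuals(x, y):
    """Fit y = a + b*x by ordinary least squares and return each raw
    residual divided by s, the standard deviation of the regression
    (square root of the residual sum of squares over n - 2)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return [e / s for e in residuals]
```

Standardized residuals much larger than about 2 in absolute value are the usual flags for suspect points.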
The tendencies followed by the residuals should confirm the assumptions we have previously made, or at least should not contradict them. Remember the well-known dictum of Fisher [22, 23].
| Quotation | Source |
|---|---|
| ‘Most assumptions required for the least squares analysis of data using the general linear model can be judged using residuals graphically without the need for formal testing’ | Darken |
| ‘Graphical methods have an advantage over numerical methods for model validation because they readily illustrate a broad range of complex aspects of the relationship between the model and the data’ | NIST/SEMATECH |
| ‘There is no single statistical tool that is as powerful as a well-chosen graph’ | Huber |
| ‘Although there are statistical ways of numerically measuring some of the observed discrepancies, statisticians themselves prefer a visual examination of the residual plots as being more informative and certainly more convenient’ | Belloto and Sokoloski |
| ‘Eye-balling can give diagnostic insights no formal diagnostic will ever provide’ | Chambers et al. |
| ‘Graphs are essential to good statistical analysis’ | Anscombe |
| ‘One picture says more than a thousand equations’ | Sillen |
The main forms of representation of residuals are (i) overall; (ii) in time sequence, if the order is known; (iii) against the fitted values; and (iv) against the independent variable.
The following points can be verified in the representation of the residuals: (i) the form of the plot, (ii) the numbers of positive and negative residuals should be roughly equal (vanishing median), (iii) the sequence of residual signs should be randomly distributed between + and −, and (iv) spurious results (outliers) can be detected, as their magnitudes are greater than those of the rest of the residuals.
Residual plots appear more and more frequently [32–39] in papers published in analytical journals. In general, these plots, as well as those discussed in this chapter, are very basic and open to some criticism. For example, the residuals are not distributed totally independently of the fitted values, since the least squares constraints force them, among other things, to sum to zero when an intercept is included.
| Residual type | Main use |
|---|---|
| Standardized residuals | Identification of heteroscedasticity |
| Jack-knife residuals | Identification of outliers |
| Recursive residuals | Identification of autocorrelations |
Despite the frequency with which the correlation coefficient is referred to in the scientific literature as a criterion of linearity, this assertion is not free from reservations [1, 45–49] as evidenced several times throughout this chapter.
The study of linearity implies more than a graphic representation of the data. It is also necessary to carry out a statistical check, for example, analysis of variance [50–54], which requires repeated measurements. This implies the fulfilment of two requirements: the homogeneity (homoscedasticity) of the variances and the normality of the residuals. Incorporating replicates into the calibration offers the possibility of looking at the calibration not only in the context of fitting but also in that of the uncertainty of the measurements. However, if replicate measurements are not made, and an estimate of the mean square error (replication variance) is not available, the regression variance may be compared with the estimated variance around the mean of the responses by means of an F-test.
Several examples taken from scientific journals and monographs are selected in order to illustrate this chapter: (1) linearity in calibration methods: fluorescence data as an example; (2) calibration graphs: the question of intercept or non-intercept; (3) errors are not in the data, but in the model: the CO2 vapour pressure versus temperature dependence; (4) heteroscedastic data: the high-performance liquid chromatography (HPLC) calibration assay of a drug; (5) transforming data: preliminary investigation of a dose-response relationship [61, 62], the microbiological assay of vitamin B12; (6) the variable that has not yet been discovered: the solubility of diazepam in propylene glycol; and (7) no model is perfect: nickel(II) by atomic absorption spectrophotometry.
2. Linearity in calibration methods: fluorescence data as an example
Calibration is a crucial step, an essential part, the key element, the soul of every quantitative analytical method [38, 40, 63–69], and significantly influences the accuracy of the analytical determination. Calibration is an operation that relates an output quantity to an input quantity for a measuring system under given conditions (International Union of Pure and Applied Chemistry (IUPAC) definition). The topic has been the subject of a recent review focused on purely practical aspects, setting aside the mathematical and metrological ones. The main role of calibration is to transform the intensity of the measured signal into the analyte concentration as accurately and precisely as possible. Guidelines for calibration and linearity are shown in Table 4 [70–81].
| Scientific association or agency | Reference |
|---|---|
| International Union of Pure and Applied Chemistry (IUPAC) | Guidelines for calibration in analytical chemistry |
| International Organization for Standardization (ISO) | ISO 8466-1:1990; ISO 8466-2:2001; ISO 11095:1996; ISO 28037:2010; ISO 28038:2014; ISO 11843-2:2000; ISO 11843-5:2008 |
| LGC Standards Proficiency Testing | LGC/VAM/2003/032 |
| International Conference on Harmonization (ICH) | Guideline Q2A |
| Clinical Laboratory Standard Institute (CLSI) | EP6-A |
| Association of Official Analytical Chemists (AOAC) | AOAC Guidelines 2002 |
| European Union | EC 2002/657 |
Linearity is the basis of many analytical procedures. It has been defined as the ability (within a certain range) to obtain test results that are directly proportional to the concentration (amount) of analyte in the sample. Linearity is one of the most important characteristics for the evaluation of accuracy and precision in assay validation, and since a calibration curve is seldom perfectly linear, it is crucial to assess linearity during method validation. Such evaluation is also recommended in regulatory guidelines [78–81]. Although it may seem that everything has been said on the subject of linearity, it is still an open question and subject to debate. It is therefore not surprising that proposals are made from time to time to resolve this issue [54, 82–92].
However, in calibration, statistical linearity tests between the variables are rarely performed in analytical studies. When dealing with regression models, the most convenient way of testing linearity, besides a visual assessment, is plotting the residual sequence in the concentration domain. A simple non-parametric statistical test for linearity, known as ‘the sign test’ [9, 16, 28], is based on the examination of the signs of the residuals.
The residuals should be distributed in a random way. That is, when the variables are connected through a true linear relationship, the numbers of plus and minus residual signs should be equal, with the error symmetrically distributed (the null hypothesis of the test). The probability of obtaining a random pattern of residual signs is related to the number of runs in the sequence of signs. Intuitively and roughly speaking, the more randomly these changes are distributed, the better the fit. A run is a sequence of the same sign, regardless of its length. A pattern of residual signs of the kind [+ − − + + − + − + − +], from independent measurements, is considered random, whereas a pattern like [− − − + + + + + + − −] is not. Though a formal statistical test may be carried out with the information afforded by the residual plot, it requires a number of points greater than is usual in calibration measurements.
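The sign-and-runs count on which such a test is based can be sketched as follows (Python; the two sign patterns from the text are reused, and the formal critical values of the runs test are not included):

```python
def sign_pattern_runs(residuals):
    """Return the sign pattern of the residuals and the number of runs
    (a run is a maximal stretch of consecutive residuals with the same sign)."""
    signs = ['+' if e > 0 else '-' for e in residuals]
    runs = 1
    for prev, cur in zip(signs, signs[1:]):
        if cur != prev:
            runs += 1
    return ''.join(signs), runs

# The random-looking pattern from the text:
pattern, runs = sign_pattern_runs([1, -1, -1, 1, 1, -1, 1, -1, 1, -1, 1])
# pattern == '+--++-+-+-+', runs == 9

# The suspicious pattern (too few runs for 11 points):
_, runs2 = sign_pattern_runs([-1, -1, -1, 1, 1, 1, 1, 1, 1, -1, -1])
# runs2 == 3
```

Too few runs for the number of points suggests systematic, non-random deviations from the fitted line.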
The fluorescence in arbitrary units of a series of standards is shown in Table 5. To these data, which appear to be curved, a straight line may be fitted (Figure 1, top), which results in an evident lack of fit, even though the correlation coefficient (r) is close to unity.
|Concentration (μM)||Fluorescence (arbitrary units)||Concentration (μM)||Fluorescence (arbitrary units)|
The pattern of the signs of the residuals indicates that fitting the fluorescence data by a straight-line equation is inadequate; higher-order terms should possibly be added to account for the curvature. Note that the straight-line model is not adequate even though the reduced residuals are less than 1.5 in all cases. When an erroneous equation is fitted to the data [95–97], the information contained in the form of the residual plot is a valuable tool, which indicates how the model equation must be modified to describe the data in a better way. A curved calibration line may be fitted to a power series. The use of a quadratic (second-degree) equation is enough in this case to obtain a good fit: the scattering of the residuals above and below the zero line is similar, as shown in Figure 1 (middle). Then, when no obvious trends in the residuals are apparent, the model may be considered an adequate description of the data. The simplest model, or the model with the minimum number of parameters that adequately fits the data in question, is usually the best choice.
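This linear-versus-quadratic comparison can be sketched in Python with NumPy. The fluorescence values of Table 5 are not reproduced here, so a synthetic, mildly curved data set stands in for them:

```python
import numpy as np

# Synthetic curved "calibration" data standing in for Table 5
# (the actual fluorescence values are not reproduced here).
conc = np.array([0.0, 2.0, 4.0, 6.0, 8.0, 10.0])
signal = 1.5 + 2.0 * conc - 0.04 * conc ** 2   # mild downward curvature

# Straight-line fit: residuals show a systematic sign pattern
lin = np.polyfit(conc, signal, 1)
res_lin = signal - np.polyval(lin, conc)
# signs are [- + + + + -]: negative at the ends, positive in the middle

# Quadratic fit: residuals vanish here because the data are exactly quadratic
quad = np.polyfit(conc, signal, 2)
res_quad = signal - np.polyval(quad, conc)
```

With real (noisy) data the quadratic residuals would not vanish, but they should scatter randomly about zero if the model is adequate.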
In summary, when a correct relationship between the response and the independent variable(s) is assumed, the residual plot should resemble that of Figure 2 (left). All residuals should fall within the shaded band, with a non-discernible, that is, random, pattern. If the representation of the residuals resembles that of Figure 2 (right), where curvature is apparent, the model can probably be improved by adding a quadratic or higher-order terms to describe the required curvature.
Calibration curves with a non-linear shape also appear in analytical chemistry [99–104]. When the data are curved, calibration functions other than the straight line have been proposed for various techniques (Table 6).
| Calibration function | Technique |
|---|---|
| Beer law | Absorption spectrophotometry |
| Scheibe-Lomakin | Emission spectroscopy; ESI-MS; ELSD; CAD |
| Wagenaar et al. | Atomic absorption spectrophotometry; liquid chromatography/MS/MS |
| Andrews et al. | Atomic absorption spectrophotometry |
Quadratic curve fitting is more appropriate than straight-line regression [104, 109–117] for the calibration data of some quantification methods. Matrix-related non-linearity is typical of methods such as LC-MS/MS. In order to provide an appropriate validation strategy for such cases, the straight-line fit approximation has been extended to quadratic calibration functions. When such quadratic terms are included [10, 118–120], precautions should be taken because of the consequent multicollinearity problems.
However, the quadratic regression model is considered less appropriate, or is even viewed with suspicion, by some regulatory agencies and, as a result, is not often used in regulated analysis. In addition, the accuracy around the upper limit of quantitation (ULOQ) can be affected if the curve range is extended to the point where the top of the curve is flat.
Statistical tests may also be considered for probing linearity, like Mandel’s test, which compares the residual errors of the quadratic and linear regressions by means of an F-test.
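A sketch of Mandel’s fitting test follows (Python with NumPy): the F statistic measures the reduction in the residual sum of squares when going from the linear to the quadratic model, and would be compared with tabulated F(1, n − 3) values, which are not included here:

```python
import numpy as np

def mandel_F(x, y):
    """Mandel's fitting test: F statistic comparing the residual sums of
    squares of the straight-line and quadratic fits.  A large F (relative
    to F(1, n-3)) favours the quadratic model, i.e. evidence against
    linearity."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    rss1 = np.sum((y - np.polyval(np.polyfit(x, y, 1), x)) ** 2)
    rss2 = np.sum((y - np.polyval(np.polyfit(x, y, 2), x)) ** 2)
    return (rss1 - rss2) / (rss2 / (n - 3))
```

For clearly curved data the statistic is very large; for data that are linear apart from noise it stays small.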
3. Calibration graphs: the question of intercept or non‐intercept
Absorption spectrometry is an important analytical technique and, to be efficient, the calibration must be accomplished with known samples. Data for the calibration of an iron analysis, in which the iron is complexed with thiocyanate, are shown in Table 7. The absorbance of the iron complex is measured and plotted versus the iron concentration in ppm. The standard deviation of the regression line, together with the correlation coefficient, is used below to judge the quality of the fit.
|Concentration, Fe (ppm)||Absorbance||Concentration, Fe (ppm)||Absorbance|
The regression line is first computed by forcing it to pass through the coordinate origin (zero intercept).
A few high or low points can alter the value of the correlation coefficient to a great extent. Larger deviations present at larger concentrations tend to influence (weight) the regression line more than smaller deviations associated with smaller concentrations, and thus the accuracy at the lower end of the range is impaired. It is therefore very convenient [122–124] to analyse the plotted data and to make sure that they cover uniformly (approximately equally spaced) the entire range of signal response from the instrument (85). Data should be measured in random order (to avoid confusing non-linearity with drift). The individual solutions should be prepared from the same stock solution, thus avoiding the introduction of random errors from weighing small quantities for individual standards. Depending on the location of the outliers, the correlation coefficient may increase or decrease. In fact, a strategically situated point can make the correlation coefficient vary practically between −1 and +1 (Figure 4), so precautions should be taken when interpreting its value. However, points of influence (e.g. leverage points and outliers) (Table 8) are rejected only when there is an obvious reason for their anomalous behaviour. The effect of outliers is greater as the sample size decreases. Duplicate measurements, careful scrutiny of the data during collection and testing of discrepant results with available samples may help to solve problems with outliers.
| Point type | Description |
|---|---|
| Gross errors | Caused by outliers in the measured variable or by extreme leverage points |
| Golden points | Specially chosen points which have been very precisely measured to extend the prediction capability of the system |
| Latently influential points | Consequence of a poor regression model |
| Outliers | Differ from the other points in the values of the response (y) variable |
| Leverage points | Differ from the other points in the values of the independent (x) variable |
If the regression analysis is made without the 32.8 ppm influence point, still forcing the line to pass through the origin, the correlation coefficient reaches the value 0.999 91 (Figure 5, top left). This point was discarded because of its large deviation; the remaining values lie above the new line. Perhaps the problem observed with the 32.8 ppm point is due to the fact that sulphocyanide (thiocyanate) is not in sufficient excess to complex all the iron present. However, the inspection of the residuals (+ + + + −) shows systematic, non-random deviations (Figure 5, bottom left), which may indicate an incorrect or inadequate model. Systematic errors of analysis translate into (systematic) deviations from the fitted equation (negative residuals correspond to low estimated values, and positive residuals to high ones). An erroneous omission of the intercept term from the model may be the cause of this effect. The standard deviation of the regression line improves notably, from 0.0026 to 0.0017, when the intercept is introduced into the model (Figure 5, top right) (correlation coefficient equal to 0.999 97), the residual pattern now being random (− − + − +) (Figure 5, bottom right). The calibration is then appropriate and linear, at least up to 8 ppm. However, the intercept value, 0.0027, is of the same order of magnitude as the standard deviation of the regression line, so its statistical significance should be checked.
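The intercept question can be illustrated with a short sketch (Python with NumPy; the absorbance data are hypothetical, with a small genuine intercept, since Table 7 itself is not reproduced here):

```python
import numpy as np

# Hypothetical absorbance data with a small genuine intercept.
conc = np.array([1.0, 2.0, 4.0, 6.0, 8.0])
absb = 0.003 + 0.050 * conc   # exact values, for clarity

# Forced through the origin: slope b = sum(x*y) / sum(x*x)
b0 = np.sum(conc * absb) / np.sum(conc ** 2)
res_origin = absb - b0 * conc
# sign pattern is systematic: [+ + + - -]

# With an intercept term the model is adequate
slope, intercept = np.polyfit(conc, absb, 1)
res_full = absb - (slope * conc + intercept)
# residuals are essentially zero here, because the data are exact
```

The run of positive residuals followed by negative ones in the through-origin fit mirrors the (+ + + + −) pattern described above.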
Residual analysis of small sample sizes has some complications. Firstly, the residuals are not distributed totally independently of the x values; moreover, with few points, apparent patterns may arise by chance.
4. Error is not in the data, but in the model: the CO2 vapour pressure versus temperature case
The linear variation of physical quantities is not a universal rule, although it is often possible to find a coordinate transformation that converts non-linear data into linear ones. The vapour pressure, p, of CO2 as a function of the temperature, T, is considered here (Table 9).
|Temperature (K)||Vapour pressure||Temperature (K)||Vapour pressure|
This requires a transformation of the data. If we define y = ln p and x = 1/T, the Clausius-Clapeyron equation, ln p = A + B/T, takes the linear form y = A + Bx.
The resulting graph (Figure 6, middle, solid line), when examined, appears to be fine, as do the calculated statistics, and so there is at first no reason to suspect any problem. The results lead to a correlation coefficient of 0.999 988 76. This almost perfect adjustment is nevertheless very poor when attention is paid to the potential quality of the fit, as shown by the sinusoidal pattern of the residuals [+ + − + − + + + + + − −], which are incorporated in the figure together with the resulting least squares regression line. As the details of the measurements are unknown, it is not possible to test for systematic error in the experiments. The use of an incorrect or inadequate model is the reason that explains the systematic deviations in this case. The Clausius-Clapeyron equation does not exactly describe the phenomenon when the temperature range is wide. Results similar to those shown in Figure 6 are also obtained by applying weighted linear regression with weighting factors [6, 7, 127–129] chosen on the basis of the transformation used.
The error does not lie in the data then, but in the model. We may try to improve the latter by using a more complete form of the equation, for example by adding a temperature-dependent term such as C ln T.
The results now obtained (analysis by multiple linear regression), depicted in Figure 6 (bottom), are better than those obtained using the simple linear regression equation, with the residuals randomly distributed. Values of ln p calculated from the extended model agree closely with the experimental ones.
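Assuming an extended equation of the form ln p = A + B/T + C ln T (one common choice; the chapter’s exact extended model, its coefficients and the CO2 data are not reproduced here), the multiple linear regression step might look like this. The model is non-linear in T but linear in the parameters, so ordinary least squares applies directly:

```python
import numpy as np

# Illustrative only: ln p generated from an assumed extended equation
# ln p = A + B/T + C*ln(T); the coefficients are NOT the CO2 values.
T = np.array([220.0, 240.0, 260.0, 280.0, 300.0])
A, B, C = 20.0, -3000.0, -1.5
lnp = A + B / T + C * np.log(T)

# Design matrix with one column per parameter: [1, 1/T, ln T]
X = np.column_stack([np.ones_like(T), 1.0 / T, np.log(T)])
coef, *_ = np.linalg.lstsq(X, lnp, rcond=None)
residuals = lnp - X @ coef
```

Note that 1/T and ln T are strongly correlated over a narrow temperature range, so the individual coefficients are poorly determined (multicollinearity) even when the fitted curve itself is excellent.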
5. The heteroscedastic data: HPLC calibration assay of a drug
In those cases in which some of the experimental data to be used in a given analysis are more reliable than others [6, 61, 63], gross errors may be involved when the conventional method of least squares is applied directly, since the assumption of uniform (regular) variance of the observations does not hold. The weighted sum of squared residuals, Σ w_i (y_i − ŷ_i)², is minimized in the weighted least squares procedure. The idea underlying weighted least squares is to attribute the greatest worth [2, 40, 132, 66, 101, 135, 136] to the most precise data. The greater the deviation from homoscedasticity, the greater the benefit that can be extracted from the use of the weighted least squares procedure. The homoscedasticity hypothesis is usually taken for granted in analytical chemistry in the framework of calibration. However, when the range of abscissa values (concentrations) covers several orders of magnitude, for example, in the study of (calibration of) drug concentrations in urine or in other biological fluids, the precision of the measurements may correspond either to a constant absolute variance
or to a constant relative variance (radioactive counts, Poisson distribution). Photometric absorbances measured by Beer’s law over a wide concentration range, like chromatographic analyses under certain conditions, tend to be heteroscedastic. Inductively coupled plasma emission spectrometry coupled to mass spectrometry (ICP-MS) requires weighted least squares estimates even when the calibration covers a relatively small concentration range. The standard deviation (absolute precision) of the measurements generally varies with the concentration level.
It is possible to derive relationships between precision and concentration throughout the concentration range assayed, since chemical methods are applied to analytes present at varying concentrations. A number of different relationships [2, 139–142] have been proposed by different authors (Table 10), and ISO 5725 gives indications to assist in modelling precision as a function of concentration.
The advantages of the least squares method, despite it being a powerful tool, may be impaired if the appropriate weights are not included in the equations. The least squares criterion is highly sensitive to outliers, as we have seen in Figure 4. An undesirable paradox may often occur, consisting in the fact that the experimental data of worst quality contribute most to the estimation of the parameters. Although replication may be severely limited [15, 132], it has the advantage of providing a certain kind of robust regression. The most common way of performing a weighted regression is to use weights reciprocal to the corresponding variances, that is, w_i = 1/s_i².
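A minimal weighted straight-line fit with w_i = 1/s_i², as described above (Python with NumPy; the data in the usage example are illustrative):

```python
import numpy as np

def wls_line(x, y, s):
    """Weighted straight-line fit with w_i = 1/s_i**2, i.e. each point
    weighted by the reciprocal of its variance.  Returns (a, b) for
    y = a + b*x."""
    x, y, s = (np.asarray(v, dtype=float) for v in (x, y, s))
    w = 1.0 / s ** 2
    W = np.sum(w)
    xw = np.sum(w * x) / W          # weighted mean of x
    yw = np.sum(w * y) / W          # weighted mean of y
    b = np.sum(w * (x - xw) * (y - yw)) / np.sum(w * (x - xw) ** 2)
    a = yw - b * xw
    return a, b
```

With heteroscedastic data, the weighted line passes closer to the precise (low-s) points at the low end of the range, which is exactly where unweighted regression loses accuracy.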
The assumption of constant variance in the physical sciences may be erroneous [34, 149–157]. The data from a calibration curve (Table 11) relating the readings of an HPLC assay to the drug concentration in ng/mL in blood are shown in Figure 8. A straight line is a reasonable regression model for the mean values, but the scatter of the replicates increases with concentration.
The weighted least squares method requires a higher number of replicates than the conventional least squares method. The estimated minimum number of replicates varies between six and 20, according to different authors. In practice, it is often difficult to reach such a high level of replication [2, 15] for various reasons, such as the cost or availability of calibration standards and reagents, or the time demands of the preliminary operations and of recording the chromatograms.
In order to apply the weighted least squares analysis, it is mandatory to assign weighting factors to the corresponding observations. In fact, the weighting factor is related to the information contained in each observation: the smaller its variance, the larger its weight.
The general phenomenon of non-constancy of variance is called, as we have previously seen, heteroscedasticity. It can be addressed [2, 15] by the weighted least squares method. A second way of dealing with heteroscedasticity is to transform the response so that the variance of the transformed response is constant, and then proceed in the usual way, as in the following section.
6. Transforming data: preliminary investigation of a dose‐response relationship
The non-linear relationship between two variables may sometimes be handled as linear by means of a transformation. A transformation consists in the application of a mathematical function to a set of data. The transformation finally leading to a straight-line fit to the data can be carried out on one variable or on both. The transformation of data is sometimes regarded as a mere device used by statisticians, a conviction founded on the preconceived idea that the natural scale of measurement is somehow sacrosanct. This is not the case and, in fact, some measurements, for example, those of pH, are actually logarithmic, transformed values.
However much the analyst may want the mould of nature to be linear, the truth is often simply found in curves [118, 160]. Real-world systems sometimes do not fulfil the essential requirements for a rigorous, or even an approximate, validity of the method of analysis. In many cases, a transformation (change of scale) can be applied to the experimental data in order to carry out a conventional analysis. Although it may seem that the best way to estimate the coefficients of a non-linear equation is the direct use of a non-linear regression (NLR) program, NLR itself is not without drawbacks and problems.
The data of turbidimetric measurements of the growth response of the test microorganism to the dose in the microbiological assay of vitamin B12 are considered here. The transformation of the dose to its logarithm can be used (Figure 12). The inspection of Figure 12, however, suggests the existence of a marked curvature. The graph of the residuals, the deviation of each point from the model, indicates that the straight line is incorrect because of the observed systematic pattern: there is a tendency towards curvature, and the residuals are not randomly distributed around zero. It must be assumed that the model can be improved, requiring either additional higher-order terms or a transformation of the data.
If a second‐degree polynomial is fitted to the response data as a function of the logarithm of the dose, the adjustment to the naked eye seems adequate (Figure 13, top). The representation of the residuals as a function of the abscissa values (Figure 13, bottom), however, adopts a funnel shape. The non‐random pattern of residuals carries the message that the assumption of homogeneous (regular or constant) variance is not satisfied, which would require the application of the weighted least squares method, rather than simple linear regression.
A simple inspection of Figure 14 (top) now shows that the linear regression is valid throughout the entire range. Transformations to achieve homogeneity of variance and transformations to achieve normality (Tables 13 and 14) go hand in hand; fortunately, both postulates are often fulfilled simultaneously on applying an adequate transformation.
| Dependence of variance on the mean | Transformation |
|---|---|
| Poisson (counts), variance = mean | √y |
| Small counts | √(y + 3/8) or √y + √(y + 1) |
| Variance = (mean)² | ln y |
| Standard deviation ∝ (mean)^(3/2) | y^(−1/2) (inverse square root) |
The stabilization of the variance usually takes precedence over the improvement of normality, as pointed out by Acton.
Linear regression is a linear (in the parameters) modelling process. However, non-linear terms may be introduced into the linear mathematical context by performing a transformation [162, 163] of the variables (Table 15). Note that when a transformation is used, a transformation-dependent weight (Table 16) should be applied (in addition to any weight based on replicate measurements). When a non-linear function is capable of being transformed into a linear one, it is called ‘intrinsically linear’.
|Exponential growth model|
|Transformation||Weighting factor (*)|
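For instance, for an exponential growth model such as that of Table 15, y = a·exp(bx), taking logarithms linearizes the model, and a commonly used transformation-dependent weight for the logarithmic transformation is w = y² (from var(ln y) ≈ var(y)/y², assuming the raw responses have constant variance). A sketch with exact synthetic data:

```python
import math

# Exact synthetic data from y = a*exp(b*x) with a = 2, b = 0.5.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 * math.exp(0.5 * xi) for xi in x]

# Linearization: ln y = ln a + b*x, with weights w = y**2.
ln_y = [math.log(yi) for yi in y]
w = [yi ** 2 for yi in y]            # transformation-dependent weights

# Weighted straight-line fit of ln y on x
W = sum(w)
xw = sum(wi * xi for wi, xi in zip(w, x)) / W
yw = sum(wi * li for wi, li in zip(w, ln_y)) / W
b = sum(wi * (xi - xw) * (li - yw) for wi, xi, li in zip(w, x, ln_y)) \
    / sum(wi * (xi - xw) ** 2 for wi, xi in zip(w, x))
a = math.exp(yw - b * xw)
# recovers a = 2.0 and b = 0.5 for this exact data set
```

With noisy data, omitting the weight w = y² would let the low-response points (whose logarithms are the noisiest) dominate the fit.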
7. The variable that has not yet been discovered: the solubility of diazepam in propylene glycol
The study of the solubility of diazepam in mixed solvents requires the verification of Beer’s law for a set of data corresponding to the solubility of diazepam in propylene glycol. The experimental data are shown in Table 17.
The relationship obtained between absorbance and concentration is shown in Figure 15 (top).
These data can be used to corroborate the statement made previously that the correlation coefficient is not necessarily a measure of the suitability of the model. In spite of the high coefficient of correlation, the plot of the residuals against concentration (Figure 15, middle and bottom) reveals a systematic pattern, so the straight-line model is suspect.
When time is included in the model as an additional variable, Eq. (21) results, giving rise to an improved fit.
When the residuals are calculated for this model and plotted as a function of concentration, a graph similar to that of Figure 15 (middle) is obtained (Figure 16, top). However, if the residuals for this model are plotted as a function of time (which reflects the order in which the samples were measured), the pattern of Figure 16 (bottom) results, in which it is observed that the lack of independence of the errors has been accommodated (compare Figure 15, bottom) and the fit has improved, although it could probably be improved even further.
The time-order analysis demonstrates the significant fact that a plot of the residuals allows the observation of an effect of time that otherwise would not have been perceived. This is possible in the case of diazepam solubility because the researcher was careful to record the time at which the samples were measured.
The appearance of a pattern in the residuals as a function of time in a study of Beer’s law could indicate that some contaminant is interfering, that the light source is decaying, or perhaps that it has not yet warmed up. The pattern of the residuals indicates that there is a time-dependent variable, but not the reason for that dependence, which must be ascertained separately.
8. Nickel by atomic absorption: all models are wrong
Nickel(II) nitrate hexahydrate of analytical reagent grade (Merck) is used to prepare a standard solution of 1 g/L Ni. A 5.0058 g portion of the salt is weighed on the analytical balance and transferred to a 1-L volumetric flask with ultrapure water. From this solution containing 1000 mg/L, a working solution containing 125 mg/L is prepared. Appropriate volumes of this solution (in triplicate) are added to 25-mL volumetric flasks, diluting with ultrapure water, to obtain the calibration curve. The measurements are carried out in an ‘Analyst 200’ atomic spectrometer operating in absorption mode with a Cu-Fe-Ni multi-element Lumma lamp (Perkin Elmer), at 232 nm, with an air-acetylene flame. The absorbances obtained, given below, are higher than those described by Perkin Elmer. The measurements depend, for example, on the flow of the nebulizer system, which differs from one instrument to another.
The absorbance data (in arbitrary units) for triplicate aqueous solutions of Ni2+ in mg/L (ppm) are compiled in Table 18. Straight-line (Eq. (2)), third-degree and fourth-degree polynomial models have been fitted (Figure 17; left plots use mean values and right plots individual values), and it is observed that as the degree of the polynomial increases the goodness of fit increases, although the residuals still show a pattern. There are no perfect models, only models more appropriate than others [165, 166]. It is possible to use rational polynomial forms with the SOLVER function of Excel. Even so, the residuals show a pattern similar to that obtained when a fourth-degree polynomial is fitted to the data.
|Ni2+ (ppm)||Absorbance (arbitrary units)||Ni2+ (ppm)||Absorbance (arbitrary units)|
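The degree-by-degree comparison described above can be sketched as follows (Python with NumPy; the data are hypothetical curved calibration values, since the Ni absorbances of Table 18 are not reproduced here):

```python
import numpy as np

# Hypothetical curved calibration data standing in for Table 18.
conc = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
absb = np.array([0.00, 0.12, 0.22, 0.30, 0.36, 0.40, 0.43, 0.45])

# Residual sum of squares for polynomial fits of increasing degree.
rss = {}
for degree in (1, 2, 3, 4):
    p = np.polyfit(conc, absb, degree)
    rss[degree] = float(np.sum((absb - np.polyval(p, conc)) ** 2))

# The RSS can only decrease as the degree grows; the real question is
# whether the residual plot loses its pattern, not whether RSS shrinks.
for degree in (1, 2, 3, 4):
    print(degree, rss[degree])
```

A shrinking RSS alone never justifies a higher-degree model; the residual plot (and parsimony) must have the last word.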
9. Final comments
Calibration is an essential part of every quantitative analytical method, with the exception of primary methods of analysis (isotope dilution mass spectrometry, coulometry, gravimetry, titrimetry and a group of colligative methods). The correct performance of calibration is a vital part of method development and validation. Parameter estimation models are often employed to obtain information concerning chemical systems, forming in this way a fundamental part of analytical chemistry. In those cases in which a wrong equation is fitted to the data, the form of the residual plot contains useful information which helps to modify and improve the model in order to obtain a better explanation of the data. Examples extracted from the literature show how residual plots reveal violations of the assumptions severe enough to deserve correction. As a matter of fact, some authors [12, 25, 28, 59, 96] are in favour of using residuals graphically to evaluate the assumptions inherent in the least squares method.
If there is a true linear relationship between the variables, with the error symmetrically distributed, the signs of the residuals should be distributed at random between plus and minus, with an equal number of each. A plot of residuals allows checking for systematic deviation between data and model. Systematic deviations may indicate either a systematic error in the experiment or an incorrect or inadequate model. A curvilinear pattern in the residual plot shows that the equation being fitted should possibly contain higher-order terms to account for the curvature. A systematic linear trend (descending or ascending) may indicate that an additional term in the model is required. A ‘fan-shaped’ residual pattern shows that the experimental error increases with the mean response (heteroscedasticity), so the constant variance assumption is inappropriate. This phenomenon may be approached by the weighted least squares method or by transforming the response. Time-order analysis sometimes proves the more noteworthy fact that a residual plot permits the observation of a time effect that otherwise might not have become known. However, note that there are no perfect models, only models that are more suitable than others.
Many more sophisticated residuals have been devised (standardized, studentized, jack-knife, predicted and recursive residuals). However, in spite of their worth and importance, they are considered beyond the scope of this chapter, which is intended as a primer on residuals. The analyses presented in this chapter were mainly done using an Excel spreadsheet.