Abstract
The mathematical representation of milk production against time represents one of the most successful applications of mathematical modeling in agriculture. Models provide summary information, which is useful in making management and breeding decisions. Several empirical mathematical functions have been proposed to describe the lactation curve of dairy cattle differing in mathematical properties, in the number of parameters and in their degree of relationships with the main features of a typical lactation pattern, such as peak yield, time at peak and persistency. This review gives an overview of the parametric models used to fit of lactation curves in dairy cattle. Parametric models are those that found large application to fit the lactation curves, basically due to their limited mathematical complexity, and their abilities to fit a large kind of curves. Models to describe the lactation curve have been classified into two main groups: linear and nonlinear models. Nonlinear parametric functions have represented the preferred tools for modeling lactation curves with the main aim of predicting yields and parameters describing the shape of the curve in addition to important parameters such as peak yield and persistency. Nonlinear models need iterative techniques to be solved. Different iterative methods frequently employed in nonlinear regression models are Marquardt, Newton, Gauss and Dud. Wood model was the most popular parametric model with the largest application can be found in the immediate and easy understanding of relationships between its parameters and main curvatures of the lactation pattern.
Keywords
- mathematical modeling
- parametric models
- nonlinear regression
- goodness of fit
- shape of lactation curve
1. Introduction
The lactation curves provide useful information for selection programs and for developing suitable management decisions and production strategies at the farm level. So, the modeling of the lactation curve is not a new research topic, the first reference of a model of lactation curve is attributed to Brody et al. [1]. Because of computational difficulties, and the limitations of computer means, early models of lactation curves based on simple logarithmic transformations of exponentials, polynomials, and other linear functions were developed [2]. Mathematical models are able to predict milk yields. The application of these models on the first lactations data can provide important predictive information. Indeed, predicting the evolution of milk production at the individual or herd level is a powerful tool for managing herd performance. Linear and nonlinear methods were used to estimate the parameters of production peak and inclining and declining phases of milk production during lactation [3]. The incomplete gamma function or Wood function proved relatively powerful in fitting the observed daily milk yield [4]. According to Wood [5], knowledge of the parameters of lactation curves can predict total production from a single control, regarding the number of controls available for prediction. In recent years lactation curve fitting functions have been implemented in dairy farm management software [6]. Moreover, lactation curve modeling is a tool for monitoring individual yields, feeding planning, early detection of diseases before clinical signs appear, and selecting animals for breeding [7]. Another frequently advanced interest in curve production modeling is the measure of persistency. The selection of animals with higher persistency (low decrease of production during the second phase of lactation) is interesting. Thus, cows showing higher yield at the peak of production followed by rapid decline are undesirable and will be easily detected and identified using the adjusted lactation curve. The cost of milk production depends largely on the lactation persistency. The unexpected drop in production after the peak increases the cost of production, because of production inequitably along the lactation [8]. Economically, cows with flat lactation curves are more persistent and produce milk at lower cost [9]. Indeed, the incidence of metabolic and reproductive disorders, arising from the physiological stress of high milk production, would be lowered. The animal may have a more stable diet, favoring in particular the proportion of fodder in the ration and thus reducing production costs [8]. Thanks to these interests and utilities of the lactation curve, the choice of one or more appropriate mathematical functions capable of effectively describing the evolution of milk production throughout lactation is a crucial point. Thus, the interest of the study of the lactation curve is reflected by a multiplicity of mathematical models. Really, the mathematical representation of milk production during the lactation period is one of the most successful applications of mathematical modeling in agriculture [10]. The choice of a model as well as the quantity and the quality of the information necessary for its estimation must, therefore, be reasoned according to the desired use. For example, for studies of the effect of environmental factors on the shape of the lactation curve and the estimation of the classical parameters of the curve by adjusting the data sets classified by groups of animals according to a well-defined factor, a simple model with fewer parameters can meet these objectives. The selection of a mathematical function must, therefore, be based on the ease of parameter estimation, its versatility (possible modeling of the different constituents of the milk and not just the quantity of milk), and the quality of the adjustment. Guo and Swalve [11] recommend the use of the model with the lowest possible number of parameters. The availability of data collected through individual lactation and the development of genetic evaluation methods based on elementary controls, as well as the evolution of the specific requirements of the dairy cattle industry, have oriented the interest of modelers toward more linear functions flexible and general, such as polynomials or splines [12].
2. Description of the standard lactation curve
Milk production evolves during lactation following a cycle that is similar in all dairy females and usually characterized by two different phases: an ascending phase from parturition to peak production (the maximum production) and a downward phase, from this peak to the dry period. The slope of this phase represents the persistency of lactation [13]. Knight and Wilde [14] explain that this phenomenon is related to the exponential increase in the volume of secretory cells, during gestation due to the phenomenon of hyperplasia (proliferation of cells) and between calving and the peak of lactation by hypertrophy (intensification of their activity). The descending phase of lactation is the longest during which the milk secretion gradually decreases until dry up. This second phase is explained by the involution of secretory cells but especially by the fall of their numbers. The phenotypic expression of these biological processes is represented by standard or typical form (Figure 1) of the lactation curve, obtained by plotting on the abscissa the time elapsed since calving and on the ordinate the corresponding daily production [15, 16].

Figure 1.
Lactation curve for dairy cattle (observed and adjusted yields) [
The general appearance of this curve is relatively constant between many domestic species. The lactation curve can be characterized by a number of parameters [13]:
The peak of production is the highest yield of lactation, and its date is expressed in week (Wood, 1967) or in day. When a mathematical model of adjustment of the lactation curve is available, parameters
3. Mathematical modeling of lactation curve
The interest of the lactation curve is reflected by a variety of mathematical models proposed to describe or predict it. These models are appreciated and used because they have a simple biological or economic interpretation [19].
3.1 Empirical models
An empirical model has a theory that refers only to the level of reality for which the phenomenon being considered is expressed. These so-called empirical models (models based exclusively on experience and observation and not on theory), whether linear or not linear, represent the huge majority of studies published in the bibliography [12]. In fact, empirical models have found great application in different fields of animal science, mainly because of their limited mathematical complexity. Most of them permit to estimate certain classic characteristics of the curve (date and level of the peak of production, persistency, and total production). Many empirical mathematical functions have been proposed to describe the lactation curve of dairy cattle [16, 20]. These functions differ in their mathematical properties, the number of parameters, and their degree of relationship to the main characteristics of a typical lactation structure, such as yield at peak time, persistency, and total yield. Nonlinear parametric models have been a preferred tool for modeling mean curves of homogeneous animal groups [12].
3.2 Parametric curves
To describe the temporal evolution of milk secretion (lactation curve), scientists generally use parametric curves where the variation over time is modeled using linear or nonlinear functions. Most of the mathematical functions proposed to fit lactation curves in dairy cattle are primarily aimed at describing the phenomenon of milk secretion. Their basic assumption is that lactation is characterized by a continuous and deterministic component with an increasing phase to a maximum, followed by a decreasing slope [12]. The mathematical tool used in this approach can be represented by an analytic function of general time:
where
The use of parametric models has several advantages. Indeed, these models allow algebraically to calculate characteristic parameters of the curve. For example, the yield and the date of the peak of production are obtained, respectively, as ordinate and abscissa of the point where the first derivative of the function
Production between two dates is obtained by calculating the integral of
Another advantage of parametric models is that they summarize distribution characteristics through a small number of parameters (in the majority of cases, three parameters). Rekaya et al. [21] highlighted that a function with a minimum number of parameters and a significant biological interpretation is the most desirable. Although increasing the number of parameters in the model improves the quality of fit for some functions, interpreting the parameters of the most complex models is difficult, and, in many cases, it is impossible to connect them with the classic characteristics of the lactation curve [13].
3.3 Exponential functions
Among the parametric models, lactation curves adjusted by exponential functions or integrating an exponential component into the model formula were widely used. The first attempt to develop a mathematical model to describe the lactation curve dates back to 1923. Brody et al. [1] used an exponential function in the following form:
This model has been adjusted to monthly lactation yields. Its expression highlights the scaling factor for adjusting production to initial level
While this is a great improvement over the first model, Cobby and Le Du [23] reported that this model underestimates milk yield in the middle of lactation and overestimates milk yield at the end of lactation. This model was followed by an exponential parabolic function introduced by Sikka [24] for modeling milk yield:
This model provided a good fit for lactation curves of primiparous cows, but it was less effective for multiparous cows, resulting in a bell-shaped curve that does not respect the regularity of the lactation curve around peak production. Then, numerous proposals were made to improve these aspects. Fischer [25] attempted to improve model 3 by replacing the exponential decline integrated in this model by means of a linear decline:
This model underestimated the maximum milk yield and also resulted in a relatively early estimate of the peak date [26]. The individuality of this model is that the ratio
This model seems to be the first attempt to develop a model that varies both directly and exponentially over time. The disadvantage of this model is that it does not consider the initial evolution to the peak of production. Since the abovementioned exponential models do not correctly translate the ascendant phase of the lactation curve, Wood [28] proposed to adjust the entire curve by an incomplete gamma-type function:
The incomplete gamma function is probably the most popular parametric model of the lactation curve. It generates the standard form of the lactation curve as the product of a constant, a power function, and an exponential decline function [12]. The disjunction of the Wood equation in its components emphasizes the direct relation of its parameters with the main elements of the shape of the lactation curve. In this expression, the power function
Despite the limits reported, the Wood model remains the most used function for modeling lactation curves [35]. In addition, it has been used to describe traits other than milk yield, such as fatty acids [37], and it has also been used for adjustment of extended lactations [34, 36]. Many other models have been reported in the literature especially after the appearance of critics of Wood’s model. To anticipate these restrictions, many authors have proposed improvements to Wood’s model in order to increase its flexibility. As a result, several derivatives of Wood’s function have appeared while noting improvements in the modeling of the lactation curve. Beever et al. [20] summarizes the improvements made by these derivatives in greater flexibility in modeling curve shapes and in improving the mathematical properties of the model by decreasing the correlation between model parameters [32]. Cobby and Le Du [23] reported that milk yield after the peak of production declines at a constant rate and proposed a modification of Wood’s gamma function, substituting the power function (
The advantage of this model over that of Wood [28] is that declines in milk production are modeled exponentially [16]. This model allows a better adjustment of the initial phase of the lactation curve with a good estimate of peak production [26]. Rowlands et al. [26] compared models 3, 7, and 8 and concluded that model 8 describes the initial evolution of milk yield up to 5 weeks better than model 7. They also observed that models 7 and 8 slightly underestimated the initial yield and model 3 slightly overestimated the peak of lactation. Model 7 provided the best position of the peak yield date. According to Olori et al. [38], this model has a parameter that cannot be estimated by linear regression, and this limits its use in practice. Dhanoa [32] proposed a slightly different writing of Wood’s model:
Such model reduces the correlations between the parameters of the curve in many cases. In addition, this model provides, with parameter
where
Another modification of Wood’s function was attempted by Jenkins and Ferrell [40] (1984) through placing the exponent of
This model has its limits since curve fitting results have shown that evolution to maximum yield is relatively slow [41]. Based on the model of Cobby and Le Du [23], Wilmink [42] proposed a linear model to describe the lactation curve:
According to Wilmink [42], parameter
while this proposal is more complex than Wood?s equation, this model reduces the extent of underestimation at the beginning of lactation and overestimation at the final stage of lactation. Franci et al. [44] have successfully used to adjust the lactation curves of dairy ewes, characterized by a rapid increase in milk yield to the peak of lactation.
4. Linear adjustment models
Dairy production can be considered as a linear combination of the time since calving [13]. Gaines [18] developed a simple linear first-degree model to measure the persistency:
In this model, parameter
After the application of this simple linear model, several researchers have attempted to adapt the parabolic or quadratic model whose general form is as follows:
Dave [47] used a quadratic form for modeling the lactation curve of water buffalo in first lactations:
Sauvant and Fehr [48] sought to adjust lactation curves of dairy goats by a third-degree polynomial:
In this model, the interest of the presence of the term of degree 3 with respect to the parabolic model is the introduction of an asymmetry in the curve. The limits of this model are at the level of its parameters
Other higher-degree polynomials have been used to model milk production. With these models, the parameters remain simple to estimate. However, interpretation and biological significance remain a major difficulty. In addition, the adjustments obtained by some authors are satisfactory, but as indicated by Masselin et al. [13], they cover a portion of the lactation curve. Nelder [50] suggested that if biological responses over time were to be modeled with quadratic regression, then it was better to first perform an inverse data transformation. However, a polynomial of the second order is unbounded and tends to be symmetrical with respect to its stationary point, while the characteristic lactation curve is not symmetrical with respect to the asymptote. To obtain an asymmetry, it will be necessary to estimate more parameters. An inverse quadratic polynomial is bounded and generally nonnegative and has no integrated symmetry. As a result of these suggestions, Nelder [50] proposed a polynomial function (called the inverse function) to adjust the response curves in multifactorial experiments, particularly in the context of modeling lactation curves. Day t production is described as follows:
Yadav et al. [51] were the first to value this design to model the lactation curve of dairy cattle and found it appropriate. According to Batra [52], this model gave a good fit for low-yielding lactations with an early lactation peak. Based on the coefficient R2, the same author observed that the function 18 gives a better fit than the gamma function when weekly milk control data were used. This function was also greater to parabolic and exponential Wood models for modeling average lactation curves using weekly data from Hariana cows [53]. Olori et al. [38] showed that model 18 underestimates the milk yield around the peak and overestimates it immediately afterwards. An inverse transformation of data as proposed by Nelder [50] to obtain the properties will allow to model the lactation curve with a quadratic polynomial. Singh and Gopal [54] increased the number of parameters by including the term
and quadratic cum log model:
These authors indicated that these models were superior to the Wood models and the inverse polynomial when fitted to the bi-weekly controlled performance data. At the same time as one of the limits, these models are undefined at
where
Linear model has a greater number of coefficients that allow the adjustment of large forms, while its parameters do not have a technical and biological meaning [12]. Two mathematical models (Ali and Schaeffer and Wilmink) have been used successfully to adjust individual curves [2, 56]. Both models have also been implemented in earlier versions of random regression models [57, 58, 59]. A potential problem is raised by the author authors of model 21, and it is that the parameters of the regression model are strongly correlated, which can strongly limit its use. However, the results reported in several studies using this model are contradictory. According to Jamrozik et al. [60], this model gives results very similar to those obtained with the Wilmink model, despite the fact that the Ali and Schaeffer model includes additional parameters. Guo and Swalve [11, 61] have found that this model is less efficient than others, notably that of Wilmink. Concerning the limits of this model, Macciotta et al. [56] found very strong correlations in absolute values (from 0.85 to 0.99) of lactation curve coefficients with a standard form [62]. The correlations obtained by these authors are much higher than those reported by Ali and Schaeffer [55]. Olori et al. [38] compared different functions and showed that the function of Ali and Schaeffer was slightly better than that of Wilmink. Quinn et al. [63] reported that this model is inappropriate for the description of milk component lactation curve (percentages of fat and protein). Schaeffer and Dekkers [64] proposed to adjust the lactation curves by a logarithmic model:
Guo and Swalve [11] introduced the mixed logarithmic model:
This model can be considered as inspired by model 19 suggested by Singh and Gopal [54], substituting time
5. Other models
For a competitive model, Papajcsik and Bodero [66] have introduced trigonometric functions and combined functions with increasing variation such as
The authors compared the effectiveness of 20 different mathematical models, including these six models, and concluded that model 24 and the Wood model better fit the data of the Holstein cows. Although, this study is cited in most of the following studies as an advantage for Wood’s function over the model proposed, but we note that this work directs the thinking of modelers to the possibility of the use of complex mathematical models and particularly of trigonometric mathematical functions. An approach was introduced by Grossman and Koops [67] who proposed that lactation could be visualized as a multiphase biological process, that is, visualizing lactation as a two-step process divided in a slant until the peak of lactation is established as the first phase and the progressive decrease in production after the peak as a second phase. The suggested multiphasic logistic function determines the total milk yield by obtaining the sum of the yield resulting from each phase of lactation:
where
5.1 Methods applied for adjusting the lactation curve
Nonlinear and linear estimation methods have been used to adjust lactation curves, where the method employed is determined by the nature of the model to be used. Some models can be developed using a nonlinear and linear estimation at the same time, such as the polynomial regression model of Ali and Schaeffer [55]. Others can be transformed into linear models. Wood [28] has already noted that the gamma function can be converted into a simple linear regression model by performing a logarithmic transformation to determine the initial values of the parameters
To estimate the parameters following an iterative procedure, it is necessary to have those initial values, which will be subjected to successive iterations. These initial values are adjusted, and the sum of the squares of the residues is reduced significantly to each iteration. The process of estimating the parameters continues until a convergence criterion is met, accepting that from this point on, a negligible or no improvement in data fit is possible [17]. A major difficulty of this procedure is to enter the initial values of the model parameters. If the pre-estimates are not correct, the process may not converge to the minimum of the sum of error squares. Moreover, it is not always possible to know if the process converges toward the best estimate of the minimum of the sum of error squares [70]. The initial values should be reasonably close to the estimates of parameters that are still unknown if the optimization procedure cannot converge. The consequence of a bad choice of the initial parameters is the obtaining of low values of the coefficient of determination, the standard errors that are high [71], and consequently a poor quality of adjustment of the model to data. Fadel [71] discussed a technique for identifying appropriate estimates of initial parameter values using the nonlinear procedure of the statistical analysis system (SAS, PROC NLIN). This technique was illustrated via a segmented nonlinear model with four parameters to estimate (
The principle of this technique consists in using a network of values for
Different iterative methods such as Marquardt, Gauss-Newton, and Does not Use Derivatives (DUD) are frequently used in nonlinear regression models. The simplest method, known as the DUD, does not require the specification of the partial derivatives with respect to the parameters of the mathematical model. It is an algorithm that brings the derivatives closer by differences. However, it is important to note that this algorithm does not give good estimates especially when the parameters are strongly correlated. Another method commonly used in nonlinear regression is the iterative Gauss-Newton method also available in SAS. The algorithm of this technique is based on Taylor series development near the initial parameter values [67]. Generally, Marquardt’s nonlinear regression was the most commonly used method to adjust lactation curves using nonlinear models [34]. The Marquardt method, which follows an intermediate path between the Gauss-Newton (Taylor series method) and Newton (second derivative) methods, was often considered better to achieve convergence when the parameter estimates were strongly correlated [34].
6. Lactation curve of milk constituents
Lactation curves relating milk yield and milk constituents would be considered for influencing different of lactation curves. However, it is necessary to consider factors that influence milk yield and milk constituents, for example, different genetic, effect of climate, and nutrition. These factors will cause changes in milk compositions and milk yield which may be the data for prediction of the lactation curve and lactation persistency [72]. This link has already been studied at the phenotypic and genetic level. The genetic correlations of milk yield with negative percent of fat and protein were negative (0.25 and 0.27, respectively), and the same sign was observed at the phenotypic level (0.28 and 0.39, respectively) [72]. The ordinary description of milk secretion refers to the appearance of changes in milk composition during lactation, i.e., the decrease in milk yield is accompanied by an increase in fat and protein contents. Milk composition can be used as a diagnostic and monitoring tool in nutritional assessment [73]. Several studies have shown a correlation between energy levels and milk composition using different traits such as fat/protein ratio (FPR), protein/fat ratio, fat/lactose ratio [74]. Higher FPR values are associated with a decrease in dry matter intake and an increase in fat mobilization on the negative energy balance phase after calving [73]. Thus, FPR changes in milk may be an indication of a cow’s ability to adapt to the requirements of milk production and postpartum reproductive efficiency [75]. The richness of the milk (fat and protein contents) follows an inverse curve to that of the milk secretion, mainly because of the effect of the dilution. It decreases rapidly during the first weeks, stabilizes at a minimal level (nadir point), and rises again due to less dilution. The relative composition of milk constituents changes profoundly during the first days after parturition. The concentration of immunoglobulins decreases rapidly after parturition in favor of caseins. The nadir point of the fat concentration curve is reached approximately 3 weeks after peak lactation of milk yield, while that of protein is established near the peak of lactation [15]. As a result, milk fat and protein are often modeled with the same functions as those used to model milk production, provided that they can take a convex form. Most of the models generated for the description of the lactation curve focused only on milk yield, although the first reference found on the adjustment of lactation curve adapted to milk composition was that of Wood [28] who studied its seasonal variation. Figure 2 illustrates the shape of the lactation curve of milk yield and its fat and protein composition, expressed as a percentage and adjusted by the incomplete gamma function of Wood.

Figure 2.
Lactation curves for milk production, percent fat, and percent milk protein (Wood, 1976).
7. Conclusion
Mathematical models allow the lactation curve to be represented in terms of a set of parameters to be estimated. Various models have been used to study the lactation in dairy cattle. Each function has advantages and disadvantages. Parametric models such as Wood and Wilmink models have several advantages. Indeed, parametric models offer the possibility to calculate primary and secondary parameters, which have a biological meaning and are therefore easy to interpret. These parameters are key elements to study the effect of the environment factors on the shape of the lactation curves. Recently the increased availability of records per individual lactations and the genetic evaluation based on test day records has oriented the interest of modelers toward general linear functions, as polynomials or splines. But these functions present some computational difficulties especially at the level of the lactation curves parameters.