Summary statistics for parameter estimation of the simulated data of the model in (4.4) for different sample sizes.
The joint models for longitudinal data and time-to-event data have recently received numerous attention in clinical and epidemiologic studies. Our interest is in modeling the relationship between event time outcomes and internal time-dependent covariates. In practice, the longitudinal responses often show non-linear and fluctuated curves. Therefore, the main aim of this chapter is to use penalized splines with a truncated polynomial basis to parameterize the non-linear longitudinal process. Then, the linear mixed effects model is applied to subject-specific curves and to control the smoothing. The association between the dropout process and longitudinal outcomes is modeled through a proportional hazard model. Two types of baseline risk functions are considered, namely a Gompertz distribution and a piecewise constant model. The resulting models are referred to as penalized spline joint models; an extension of the standard linear joint models.
- survival data
- longitudinal data
- joint models
- time-dependent covariates
- random effects
The joint models for longitudinal data and time-to-event data are aimed to measure the association between the longitudinal marker level and the hazard rate for an event. The longitudinal data are collected repeatedly for several subjects. In this data, there are two types of covariates, namely, time-independent covariates and time-dependent covariates. Furthermore, there are also two different categories of time-dependent covariates, namely, external and internal covariates. In clinical studies, internal time-dependent longitudinal outcomes are often applied to monitor disease progression and failure time.
In modern survival analysis, Cox  has been considered as a very popular joint model to be used for time-independent covariates. These models measured the effect of time-independent covariates on the hazard rate for an event. Subsequently, the extended Cox model was developed for external time-dependent covariates. However, these latter models cannot handle longitudinal biomarkers. Therefore, Rizopoulos  introduced joint models for internal time-dependent covariates and the risk for an event based on linear mixed-effects models and relative risk models.
The basic assumption for the standard joint models proposed by Rizopoulos  is that the hazard rate at a given time of the dropout process is associated with the expected value of the longitudinal responses at the same time. The whole history of response has an influence on the survival function. Thus, it is crucial to obtain good estimates for the subject-specific trajectories in order to have an accurate estimation of the survival function. In addition, an important feature that we need to account for is that many observations in the sample often show non-linear and fluctuated longitudinal trajectories in time. Each observation has its own trajectory. Therefore, flexibility is needed for subject-specific longitudinal submodels in the joint models to improve the predictions.
There are several previous works to flexibly model the subject-specific longitudinal profiles in the joint models. Brown et al.  applied B-splines with multidimensional random effects. In particular, Brown et al.  assumed that both subject and population trajectories have the same number of basis functions. By doing this, the number of parameters in the longitudinal submodel is reasonably large. If we have to deal with the roughness of the fit for this model, the computational problems will increase especially when the dimension of the random effects vector is large. Ding and Wang  proposed the use of B-splines with a single multiplicative random effect to link the population mean function with the subject-specific profile. This simple model can gain an easy estimation for parameters, however may not be appropriate for many practical applications . Rizopoulos  considered more flexible models using natural cubic splines with the expansion of the random effects vector. The roughness of the fit is still not mentioned in these models.
In this chapter, we present new approaches to model non-linear shapes of subjects-specific evolutions for joint models by extending the standard joint models of Rizopoulos . In particular, we implement penalized splines using a truncated polynomial basis for the longitudinal submodel. Following this, the linear mixed-effects approach is applied to model the individual trajectories and impose smoothness over adjacent coefficients respectively. The ECM algorithm is used for parameter estimation. In addition, corresponding standard errors are calculated using the observed information matrix. However, as the matrices of random effects covariates in our models are different from the matrices of random effects covariates in the standard joint models, the JM package of Rizopoulos  cannot be used for our models. Therefore, a set of R codes are written for the penalized spline joint models to implement the proposed procedures on the simulated data and a case study respectively.
The chapter is organized as follows. Section 2 describes the penalized splines with truncated polynomial basis for the joint models. In this section, the two models are specified as penalized spline joint model with hazard rate at base line having Gompertz distribution (referred to as Model 1) and penalized spline joint model with a piecewise-constant baseline risk function (referred to as Model 2). The joint likelihood, score functions and the ECM algorithm for estimation are presented in Section 3. We then validate the proposed algorithm using extensive simulation studies and then apply it for AIDS data in Section 4. Finally, Section 5 gives concluding remarks.
2. The penalized spline joint models
In this section, we introduce the joint models using penalized spline with truncated polynomial basis. The proposed parametrization is based on the standard joint models of Rizopoulos  and the regression model of a longitudinal response using penalized spline.
Notations in this section are taken from Rizopoulos . Let be the true survival time and be the censoring time for the subject . denotes the observed failure time for the subject , which is defined as . If an subject is not censored, this means that we have observed its survival time, we will have . If an subject is censored, this means that we lose its follow up, or the subject has died from other causes, we will have . Furthermore, we define the event indicator as . The observed data for survival outcome are .
For a longitudinal response, suppose that we have subjects in the sample and the actual observed longitudinal data for each subject-at time point is . We measure the subject at time points. Thus, the longitudinal data consists of the measurements taken at time points We denote the true and unobserved value of the longitudinal outcome at time as . We assume the relation between and as
When survival function is assumed to have a specific parametric form associating with a longitudinal submodel, estimations for parameters of interest are usually based on the likelihood function . In the maximum likelihood method, there are different treatments for different types of covariates in the longitudinal submodel. Here, we present the two different categories of time-dependent covariates and the estimation techniques for these covariates will be introduced in the following sections. We let the time-dependent covariate for the subject at time be denoted by . We let denote the covariate history of the subject up to time . According to Kalbfleisch and Prentice , the exogenous covariates are the covariates satisfying the condition:
for all such that and . An equivalent definition is
On the other hand, endogenous time-varying covariates are the ones that do not satisfy the condition in (2.2). In particular,
In the penalized spline regression models [8, 9], the observed longitudinal covariate is modeled using the truncated power functions with a general power basis of degree . Moreover, the longitudinal response is also parameterized as a linear mixed-effects model to specify the individual curves and impose the amount of smoothing. As a result, the coefficients of the knots will be constrained to handle smoothing. In particular, the longitudinal submodel for the subject at time point is
Here, the set is known as the truncated power basis of degree . Moreover, are fitted knots, for which is chosen following Ruppert et al. , Chapter 5), Appendix D. The function is the smooth function which reflects the overall trend of the population. The function is the smooth function which reflects the individual curves. To constrain the coefficient of knots, the vector in the function is treated as random effects. Therefore, is a row vector of fixed effects and is a vector of random effects for the subject. The assumptions for the random effects for the subject are , , and they are independent of one another. We can now rewrite (2.4) as
We let and note that . In order to allow greater flexibility, we assume that , where . By doing this, the dimension of the vector of random effects, , decreases to . Consequently, the dimension of the multi-integrals in the log-likelihood function in (3.2) will also decrease. This presentation is crucial for reducing the computational problems while coding. The model in (2.5) now becomes:
The model in (2.6) can be rewritten in matrix notation as:
Here, is the matrix of observed longitudinal data; is the matrix of fixed effect covariates; is the matrix of random effect covariates and is the matrix of error.
Postulating a proportion hazard model, the penalized spline joint models for longitudinal and time-to-event data is defined by
where is the hazard at baseline and is a vector of baseline covariates (such as treatment indicator, gender of a patient, etc.). Furthermore, denotes the history of the true unobserved longitudinal process up to time point .
Using (2.7), the longitudinal submodel for the subject is given by
where the covariance matrix of random effects is given as
To complete the specification of the model in (2.8), we now need to define the form for the baseline risk function . Motivated by the fact that in real life, is usually unknown. Therefore, two options are adopted to determine the form of the function in this chapter. First, a standard option is to use a known parametric distribution for the risk function. For this option, the Gompertz distribution is chosen. Second, the piecewise constant model is chosen when is considered completely unspecified.
Therefore, the proposed penalized spline joint models considered in this chapter are as follows:
Model 1: Penalized spline joint model with hazard rate at base line having Gompertz distribution
Model 2: Penalized spline joint model with a piecewise-constant baseline risk function
where denotes a split of the time scale, with being larger than the largest observed time and denotes the value of the baseline hazard in the interval . In both models, , , , and are given in (2.7).
3. Parameter estimation
After defining the two penalized spline joint models, we now present the joint likelihood and score functions of the parameters in the models. The ECM algorithm is also presented in this section.
3.1. Likelihood and score functions
Following Rizopoulos , we assume that the vector of time-independent random effects underlies both the longitudinal and survival processes. This means that
where denotes the full parameter vector with denoting the parameter vector for the survival outcomes. Furthermore, is the parameter vector for longitudinal outcomes and is the vector-half of the variance matrix of random effects. In addition, we assume that the hazard rate at time conditional on the covariate path depends on the current value of longitudinal outcomes and the censoring mechanism is independent of the true event times and future longitudinal measurements. Under these assumptions, the log-likelihood formulation of the penalized spline joint models can be written as
where the conditional density for survival part has the form of
Here, is the survival function at time .
Moreover, the density for the longitudinal part with the random effects is given by
where denotes the dimensionality of the random effects vector.
We consider the log likelihood of the () over the unknown , and
The function for maximizing the log likelihood involves the density function of survival time and least squares with a penalty term, which is
According to Ruppert et al. , the term is called a roughness penalty and the variance components matrix defined as . Using a Lagrange multiplier argument, the variance components matrix is the condition to constrain the coefficients of the knots . These will restrict the influence of the variables and will lead to smoother spline functions.
Using (3.2), the score vector for the penalized spline joint models can be expressed as:
The requirement for numerical integration with respect to the random effects is one of the main difficulties in the joint models . The maximum likelihood estimation (MLE) is typically obtained using standard maximization algorithms such as expectation maximization algorithm or Newton-Raphson algorithm.
3.2. The ECM algorithm
The EM algorithm has been widely used in the joint models, such as for the standard joint model of Rizopoulos  and for the generalized linear mixed joint model . The ECM algorithm is a natural extension of EM algorithm for which the maximization process on the M-step is conditional on some functions of the parameters under estimation. It also can reduce computer time. The ECM algorithm will be used to obtain the maximum likelihood estimates of the penalized spline joint models following McLachlan and Krishnan  in this chapter.
In these models, the random effects are considered as missing data. Hence, it is difficult to estimate directly the parameter vector that maximizes the observed data log likelihood in (3.2). Alternatively, we can estimate the parameter vector that maximizes the expected value of the complete data log-likelihood which is , where is the parameter vector given at the iteration.
The following are the steps of this algorithm.
Step 1: Initialization
We first initialize the parameters. We assume that there are parameters in the models and the starting value of the parameter vector is . Based on these initial values, we calculate the log-likelihood using (3.2).
Step 2: The E-step for the penalized joint models
We fill in the missing data and replace the log-likelihood function of the observed data with the expected function of the complete data log-likelihood as follows:
Step 3: The conditional M-step for the penalized joint models.
This step will be implemented in four stages as follows:
3.1 Given the current value of the parameter vector at the iteration , we calculate the log likelihood at .
3.2 Propose the new value for the first parameter which maximizes . Then, we calculate the log likelihood at where .
3.3 Set if , otherwise set .
3.4 Similarly, based on the value of the parameter vector , we update the new value for the second parameter and continue updating for the last parameter, and set .
Step 4: Iterate among steps 2–3 until the algorithm numerically converges.
To update the new values for parameters in the conditional M-step, we have the closed-form estimates for the measurement of error variance and the covariance matrix of the random effects respectively by maximizing the expected function . Unfortunately, we cannot obtain closed-form expressions for the remaining of the parameters. We thus employ the one-step Newton-Raphson approach to get the updates for , , and respectively as detailed in Appendix B.
Following Louis , standard errors for the parameter estimates can be calculated by using the estimated observed information matrix
4. Empirical results
This section presents two simulation studies for Model 1, whereas Model 2 will be applied for a case study only. In Section 4.1, we simulate data from Model 1 with three internal knots in the longitudinal submodel and Gompertz distribution for the baseline risk function. In Section 4.2, we simulate data from Model 1 having Gompertz distribution for the baseline risk function and non-linear logarithm subject-specific trajectories. The ECM algorithm, written in R code, is applied to estimate the true values of parameters in both cases.
4.1. Simulation study 1
4.1.1. Data description
Recall the penalized spline joint Model 1 of (2.10) with three internal knots in longitudinal submodel and Gompertz distribution for the baseline risk function in the form of
where is the hazard function at baseline having Gompertz distribution, is baseline covariate and denotes the true and unobserved value of the longitudinal at time . The form of is given by
where is the vector of random effects and is assumed to have a normal distribution with mean zero and the diagonal covariance matrix . denote the three internal knots put into the model. The observed longitudinal value at time point for the subject is of the form
where the error variable is assumed to come from a normal distribution with mean zero and variance .
From this model, the vector of all parameters of the models in (4.1) and (4.2) is , where denotes the parameter vector for the survival outcomes. Furthermore, is the parameter vector for longitudinal outcomes and is the variance matrix of random effects.
To simulate the observed survival time of the joint model in (4.1), we applied the methods adapted by Bender et al. , Austin  and Crowther and Lambert  to generate the true survival time. We further assumed that the censoring mechanism is exponentially distributed with parameter . The observed survival time was the minimum of the censoring time and the true survival time. We generated the survival time for subjects with the parameters: ,and . Then we generated the longitudinal responses using (4.2). The simulated model is therefore
The sample of simulated data is presented in Appendix A. The curve of Kaplan-Meier estimate for the survival function of simulated data (left panel) and the longitudinal trajectories for the whole simulated sample (right panel) are presented in Figure 1. The dashed lines in the left panel correspond to 95% pointwise confidence intervals. It is clear from the plot of Kaplan-Meier estimator that the survival probability starts from 1 and decreases gradually until at the month of the study. After this, it is nearly zero after 6 months or so. The right panel is the longitudinal trajectories for the first 100 subjects reflecting the form as in (4.2).
4.1.2. Parameter estimation
The ECM algorithm, as described in Section 3.2, is now implemented to estimate all parameters in (4.4). The initial values of the parameters were set at , respectively. However, these initial values can also be set randomly. The traces of each of these parameters are presented in Figures 2 and 3, respectively. The traces of estimates show the way how the algorithm updates new values of the parameters. In addition, they also demonstrate the convergence of the algorithm after 10–20 iterations. In particular, the parameters and converge linearly to the true values while the parameters and oscillate before converging to the true values.
We now run the simulation for 30 independent samples with different sample sizes (and ). Then, we calculate the means, standard deviations (SD) and mean square error (MSE) of parameters as presented in Table 1. The point estimates of each parameter are reasonably close to the true values when the sample sizes are 300 and 500. This is also supported by the values of SD and MSE which decrease gradually when the sample size increases. In addition to this, we also compare the parameter estimates with different censoring rates (20% and 40%) for a sample size of 500 in 5, Appendix E. The result shows that when the sample size is large the censoring rate has little influence on the estimates.
4.2. Simulation study 2
4.2.1. Data description
We now perform a simulation study on proportional hazard model having Gompertz distribution at baseline and non-linear subject-specific trajectory. In particular, the model is in the form of
where is the hazard function at baseline having Gompertz distribution, is baseline covariate and denotes the true and unobserved value of the longitudinal at time . The observed longitudinal value at time point for the subject has the non-linear form
where . In the model of (4.6), the mean longitudinal response of the population is assumed to have a non-linear logarithm curve. Different subjects are assumed to have different intercepts and slopes. In particular, we assume that having a bivariate normal distribution with mean and covariance matrix . The true values of the other parameters we put in the model were , respectively. In addition, the censoring mechanism is assumed exponentially distributed with a parameter of .
Based on the model in (4.5) and the simulation study 1, we simulated survival times for 500 subjects with 35% censoring rate. In particular, the ending time for the study was 5 months and all subjects alive by the end of the study (i.e. time 5) were assumed to be censored. This design was also reflected of many clinical studies in real life. In this sample, there were 329 uncensored subjects and 1387 observations for 500 subjects. For each subject, 1–5 longitudinal measurements were recorded. On average, there were three longitudinal measurements per subject. In Figure 4, the Kaplan-Meier estimate for survival curve is presented for the simulated data of (4.5) with 95% pointwise confidence intervals in the left panel. Moreover, the subject-specific longitudinal profiles for six randomly selected subjects is drawn in the right panel. It can be seen that some of the subjects in this dataset showed non-linear evolutions in their longitudinal values. Each subject has its own trajectory.
4.2.2. Parameter estimation
We will be using Model 1 in (4.1) and in (4.2) to handle the non-linear longitudinal trajectory in the simulated data in (4.5). In this model, we put three internal knots at 25, 50 and 75%, respectively, of the follow up time. Then, the ECM algorithm, as explained in Section 3, is implemented once again to estimate all parameters in the model.
The results for parameter estimation are presented in Table 2. The means, standard deviations and 95% confidence intervals of parameter estimates are calculated for 30 independent samples. The point estimates for are reasonably close to the true values. Similarly, the 95% CIs include the true values of .
|Parameter||True value||Estimate||SD||95% CI|
Based on the estimated values of parameters, we generate back the estimated survival time by approximating values of random effects from linear mixed-effects function. The detail of the generation is explained in Appendix C. Then, we use the Kaplan-Meier estimate to compare between the survival function of the simulated dataset (the black solid line) and the estimated survival function (the dashed line) which are presented in the left panel of Figure 5.
Moreover, we also draw the smooth and predicted longitudinal profiles for 12 patients chosen randomly in the right panel of Figure 5. The dot points are the true observed longitudinal values from simulated data. The solid lines are the smooth longitudinal profiles of the true observed longitudinal values using the loess smoother and the dashed lines are the predicted profiles of 12 randomly selected individuals. It is clear that the Kaplan-Meier estimates from simulated data overlaps the Kaplan-Meier estimates based on the predicted value in the left panel of Figure 4. The penalized spline regression model in (2.10) was a good fit for subject-specific curves in the right panel of Figure 5.
In summary, simulation studies have shown the stability of the algorithm and the goodness of fit of the penalized spline models. From the simulation study 1, it is shown that the updating process through the ECM algorithm converges quickly to the true values of the parameters. In addition, the simulation study 2 shows that the model can well predict the survival function and individual trajectories respectively.
4.3. The AIDS data set
In the AIDS dataset, there were 467 patients with advanced human immunodeficiency virus infection during antiretroviral treatment who had failed or were intolerant to zidovudine therapy. Patients in the study were randomly assigned to receive either didanosine drug (ddI) or zalcitabine drug (ddC). CD4 cells are a type of white blood cells made in the spleen, lymph nodes and thymus gland and are part of the infection-fighting system. CD4 cell counts were recorded at the time of study entry as well as at 2, 6, 12 and 18 months thereafter. The detail regarding the design of this study can be found in Abrams et al. . By the end of the study, there were 188 patients died, resulting in about 59.7% censoring. There were 1405 longitudinal responses recorded.
Previously, Rizopoulos  used his standard joint model for the AIDS data which consider the variability between subjects mostly depend on the intercept. However, the model could not predict observed longitudinal data accurately. When the time unit is changed from month to year in the data, the variability between subjects depends not only on the intercept but also on the variable. In addition, the longitudinal trajectories plot also shows many non-linear curves as depicted in the right panel of Figure 6.
Given the non-linearity, it is appropriate to apply our models, Model 1 and Model 2, for the AIDS data. In particular, we use the two joint models in (2.10) and (2.11) with the four internal knots are placed at 20, 40, 60, 80%, respectively of the observed failure times for hazard rate at baseline. Then, the ECM algorithm is implemented to estimate all parameters in the two models. A summary of statistics for parameter estimation using Model 1 and Model 2 is presented in Table 3.
|Model 1||Model 2|
|Parameter||Estimate||Std. error||z-value||p-value||Parameter||Estimate||Std. error||z-value||p-value|
Following Rizopoulos , in Model 1 and Model 2, the univariate Wald tests are applied for the fixed effects in the longitudinal submodel, the regression coefficient and the association parameter respectively. The results from Table 3 show that the point estimates of are all statistically significant for both models at a significance level of 5%.
We conduct the Kaplan-Meier estimate of the survival function from the observed survival time (the light solid line) and the dot lines correspond to 95% pointwise confidence intervals in Figure 6 (left panel). The predicted survival function from Model 1 is the dashed line and the predicted survival function from Model 2 is the bold solid line. The plots show that Model 2 works very well in this case as shown in Figure 7. Moreover, Model 2 is also preferred in practice because usually is considered as unspecified in order to avoid the impact of misspecifying the distribution of survival times.
Based on the model of longitudinal regression in (4.2), we also draw the smooth and predicted longitudinal profiles for nine patients from the AIDS dataset as depicted in Figure 7 (right panel). The dot points are the true observed longitudinal values. The solid lines are the smooth longitudinal profiles using the loess smoother and the dashed lines are the predicted profiles of nine randomly selected individuals. Most of the predicted profiles are quite close to the observed ones.
In this chapter, two joint models using a penalized spline with a truncated polynomial basis have been proposed to model a non-linear longitudinal outcome and a time-to-event data. The use of a truncated polynomial basis gives us an intuitive and obvious way to model non-linear longitudinal outcome. By adding some penalties for the coefficients of the knots and using linear mixed-effects models, the smoothing is controlled and the individual curves are specified.
We have conducted a sensitivity analysis on the assumption of normality for either random effects or errors. The t-distribution with the degree of freedom 5 is applied for each of them. The results show that the estimates of parameters are sensitive when both of terms are not normally distributed.
The main findings we may derive from this chapter are, at least, threefold: (1) the ECM algorithm provides a reasonable quick convergence algorithm for the proposed models; (2) the fitted joint models are able to measure the association between the internal time-dependent covariates and the risk for an event and (3) the two models provide a good prediction for both the longitudinal and survival functions, as presented in empirical results.
The limitations of this study are, at least, threefold: (1) the number of internal knots is limited to three due to computational costs; (2) the polynomial power functions can form an ill-conditioned basis for the models and (3) the estimation results are sensitive when both random effects and error are not normally distributed.
Based on the limitations, our future work will focus on using new methods for approximating the integrals to reduce the computational problems or relaxing the normality assumption. Furthermore, we will apply a different basis for joint models, that is the penalized B-spline. In terms of parameter estimation, we are considering a different approach to estimate the parameters in the models using a Bayesian approach, via Markov chain Monte Carlo (MCMC) algorithms.
One sample of simulated data of the penalized spline joint model in (4.4) is presented in Table 4 for the first three patients. The subjects are measured bimonthly and the entry time is 0 for all subjects. Obstime variable includes the time points at which these measurements are recorded. Time variable includes the observed survival times when subject meets an event. is a time-constant binary random variable with parameter . Column y contains the longitudinal responses. Death variable is the event status indicator. This variable receives value 1 when the true survival time is less than or equal to the censoring time and 0 otherwise. We define the four random effects variables which are , , and . For the longitudinal process, there are 1902 of observations for 500 subjects. For each subject, 1-7 longitudinal measurements are recorded. On average, there are four longitudinal measurements per subject. For the event process, there are 297 subjects who meet for an event which is equivalent to 59.4% of the whole sample.
|Parameter||True value||Censored (20%)||Censored (40%)|
The integrals with respect to the random effects in (3.7) do not have closed-form solutions. Therefore, in this chapter, we implement the Gaussian-Hermite quadrature rule as in Rizopoulos  to approximate the integrals. In our simulation study and R coding, we use the Gaussian-Hermite quadrature rule with 10 quadrature points.
The updating formulas of the parameters in Step 3 have different forms for each parameter following Rizopoulos . We have the closed-form estimates for the measurement error variance in the longitudinal model and the covariance matrix of the random effects as follows:
where and . The updating formula for is
Unfortunately, we cannot obtain closed-form expressions for the fixed effects and the parameters of the survival submodel . We thus employ the one-step Newton-Raphson approach to obtain the updated , ,and . In particular, we have
where is the score vector corresponding to parameter and the score vector has the form of
There are four cases for simulating survival time of the model (4.1) as follows.
When the survival time , we calculate the cumulative hazard function . Based on the relation between the survival function , cumulative hazard function and cumulative distribution , we have
Following (5.4), we set
where is a random variable with . The survival time is the solution of the equation
The condition is equal to
When , we calculate the cumulative hazard function . The survival time is the solution of the equation
where is a value of . The condition is equal to
When , we calculate the cumulative hazard function . The survival time is the solution of the equation
where is a value of . The condition is equal to
When , the cumulative hazard function has the form . Survival time is the solution of the equation
In particular, Ruppert et al.  introduced a default choices for knot location and number of knots. The idea is to choose sufficient knots to resolve the essential structure in the underlying regression function. But for more complicated penalized spline models, there are computational advantages to keeping the number of knots relatively low. A reasonable default is to choose the knots to ensure that there are a fixed number of unique observations, say 4–5, between each knot. For large data sets, this can lead to an excessive numbers of knots; therefore, a maximum number of allowable knots (say, 20–40 total) are recommended.
According to Ruppert et al. , the choice for knot position is
sample quantile of the unique for .
The simple choice of K is
See Table 5.