Open access peer-reviewed chapter

Penalized Spline Joint Models for Longitudinal and Time-To-Event Data

Written By

Huong Thi Thu Pham and Hoa Pham

Submitted: December 1st, 2017 Reviewed: February 26th, 2018 Published: June 6th, 2018

DOI: 10.5772/intechopen.75975

Chapter metrics overview

965 Chapter Downloads

View Full Metrics

Abstract

The joint models for longitudinal data and time-to-event data have recently received numerous attention in clinical and epidemiologic studies. Our interest is in modeling the relationship between event time outcomes and internal time-dependent covariates. In practice, the longitudinal responses often show non-linear and fluctuated curves. Therefore, the main aim of this chapter is to use penalized splines with a truncated polynomial basis to parameterize the non-linear longitudinal process. Then, the linear mixed effects model is applied to subject-specific curves and to control the smoothing. The association between the dropout process and longitudinal outcomes is modeled through a proportional hazard model. Two types of baseline risk functions are considered, namely a Gompertz distribution and a piecewise constant model. The resulting models are referred to as penalized spline joint models; an extension of the standard linear joint models.

Keywords

  • survival data
  • longitudinal data
  • joint models
  • time-dependent covariates
  • random effects

1. Introduction

The joint models for longitudinal data and time-to-event data are aimed to measure the association between the longitudinal marker level and the hazard rate for an event. The longitudinal data are collected repeatedly for several subjects. In this data, there are two types of covariates, namely, time-independent covariates and time-dependent covariates. Furthermore, there are also two different categories of time-dependent covariates, namely, external and internal covariates. In clinical studies, internal time-dependent longitudinal outcomes are often applied to monitor disease progression and failure time.

In modern survival analysis, Cox [1] has been considered as a very popular joint model to be used for time-independent covariates. These models measured the effect of time-independent covariates on the hazard rate for an event. Subsequently, the extended Cox model was developed for external time-dependent covariates. However, these latter models cannot handle longitudinal biomarkers. Therefore, Rizopoulos [2] introduced joint models for internal time-dependent covariates and the risk for an event based on linear mixed-effects models and relative risk models.

The basic assumption for the standard joint models proposed by Rizopoulos [2] is that the hazard rate at a given time of the dropout process is associated with the expected value of the longitudinal responses at the same time. The whole history of response has an influence on the survival function. Thus, it is crucial to obtain good estimates for the subject-specific trajectories in order to have an accurate estimation of the survival function. In addition, an important feature that we need to account for is that many observations in the sample often show non-linear and fluctuated longitudinal trajectories in time. Each observation has its own trajectory. Therefore, flexibility is needed for subject-specific longitudinal submodels in the joint models to improve the predictions.

There are several previous works to flexibly model the subject-specific longitudinal profiles in the joint models. Brown et al. [3] applied B-splines with multidimensional random effects. In particular, Brown et al. [3] assumed that both subject and population trajectories have the same number of basis functions. By doing this, the number of parameters in the longitudinal submodel is reasonably large. If we have to deal with the roughness of the fit for this model, the computational problems will increase especially when the dimension of the random effects vector is large. Ding and Wang [4] proposed the use of B-splines with a single multiplicative random effect to link the population mean function with the subject-specific profile. This simple model can gain an easy estimation for parameters, however may not be appropriate for many practical applications [5]. Rizopoulos [5] considered more flexible models using natural cubic splines with the expansion of the random effects vector. The roughness of the fit is still not mentioned in these models.

In this chapter, we present new approaches to model non-linear shapes of subjects-specific evolutions for joint models by extending the standard joint models of Rizopoulos [2]. In particular, we implement penalized splines using a truncated polynomial basis for the longitudinal submodel. Following this, the linear mixed-effects approach is applied to model the individual trajectories and impose smoothness over adjacent coefficients respectively. The ECM algorithm is used for parameter estimation. In addition, corresponding standard errors are calculated using the observed information matrix. However, as the matrices of random effects covariates in our models are different from the matrices of random effects covariates in the standard joint models, the JM package of Rizopoulos [6] cannot be used for our models. Therefore, a set of R codes are written for the penalized spline joint models to implement the proposed procedures on the simulated data and a case study respectively.

The chapter is organized as follows. Section 2 describes the penalized splines with truncated polynomial basis for the joint models. In this section, the two models are specified as penalized spline joint model with hazard rate at base line having Gompertz distribution (referred to as Model 1) and penalized spline joint model with a piecewise-constant baseline risk function (referred to as Model 2). The joint likelihood, score functions and the ECM algorithm for estimation are presented in Section 3. We then validate the proposed algorithm using extensive simulation studies and then apply it for AIDS data in Section 4. Finally, Section 5 gives concluding remarks.

Advertisement

2. The penalized spline joint models

In this section, we introduce the joint models using penalized spline with truncated polynomial basis. The proposed parametrization is based on the standard joint models of Rizopoulos [2] and the regression model of a longitudinal response using penalized spline.

Notations in this section are taken from Rizopoulos [2]. Let Ti be the true survival time and Ci be the censoring time for the ith subject i=1n. Ti denotes the observed failure time for the ith subject i=1n, which is defined as Ti=minTiCi. If an ith subject is not censored, this means that we have observed its survival time, we will have TiCi. If an ith subject is censored, this means that we lose its follow up, or the subject has died from other causes, we will have Ti>Ci. Furthermore, we define the event indicator as δi=ITiCi. The observed data for survival outcome are Tiδi,i=1,,n.

For a longitudinal response, suppose that we have n subjects in the sample and the actual observed longitudinal data for each subject-i at time point t is yit. We measure the ith subject at ni time points. Thus, the longitudinal data consists of the measurements yij=yitijj=1ni taken at time points tij. We denote the true and unobserved value of the longitudinal outcome at time t as mit. We assume the relation between yit and mit as

yit=mit+εit,E1

where εitN0σε2.

When survival function St is assumed to have a specific parametric form associating with a longitudinal submodel, estimations for parameters of interest are usually based on the likelihood function [2]. In the maximum likelihood method, there are different treatments for different types of covariates in the longitudinal submodel. Here, we present the two different categories of time-dependent covariates and the estimation techniques for these covariates will be introduced in the following sections. We let the time-dependent covariate for the ith subject at time t be denoted by yit. We let Yit=yis0s<t denote the covariate history of the ith subject up to time t. According to Kalbfleisch and Prentice [7], the exogenous covariates are the covariates satisfying the condition:

PrsTi<s+dsTisYis=PrsTi<s+dsTisYit,E2

for all s,t such that 0<st and ds0. An equivalent definition is

PrYitYisTis=PrYitYisTi=s,st.E3

On the other hand, endogenous time-varying covariates are the ones that do not satisfy the condition in (2.2). In particular,

PrYitYisTisPrYitYisTi=s,st.

In the penalized spline regression models [8, 9], the observed longitudinal covariate is modeled using the truncated power functions with a general power basis of degree p. Moreover, the longitudinal response is also parameterized as a linear mixed-effects model to specify the individual curves and impose the amount of smoothing. As a result, the coefficients of the knots will be constrained to handle smoothing. In particular, the longitudinal submodel for the ith subject at time point tij is

yij=ftij+gitij+εtij,εitijN0σε2,ftij=β0+β1tij++βptijp+k=1KupktijKk+p,gitij=vi0+vi1tij+vi2tij2++viptijp+k=1KwipktijKk+p.E4

Here, the set 1tijtijptijK1+ptijKK+p is known as the truncated power basis of degree p. Moreover, K1,,KK are fitted K knots, for which K is chosen following Ruppert et al. [9], Chapter 5), Appendix D. The function f. is the smooth function which reflects the overall trend of the population. The function gi. is the smooth function which reflects the individual curves. To constrain the coefficient of knots, the vector up1upKT in the function f. is treated as random effects. Therefore, βT=β0βp is a p+1×1 row vector of fixed effects and biT=up1upKvi0vipwip1wipK is a p+2K+1×1 vector of random effects for the ith subject. The assumptions for the random effects for the ith subject are vi0vipTN0, upkN0σu2, wipkN0σw2 and they are independent of one another. We can now rewrite (2.4) as

yitij=ftij+gitij+εitij=β0+β1tij+β2tij2++βptijp+k=1K(upk+wipk)tijKk+p+vi0+vi1tij+vi2tij2++viptijp+εitij.E5

We let uipk=upk+wipk and note that uipkN0σu2+σw2. In order to allow greater flexibility, we assume that uip1uipKTN0D, where D=DiagD11DKK. By doing this, the dimension of the vector of random effects, biT=vi0vipuip1uipK, decreases to p+K+1×1. Consequently, the dimension of the multi-integrals in the log-likelihood function in (3.2) will also decrease. This presentation is crucial for reducing the computational problems while coding. The model in (2.5) now becomes:

yitij=ftij+gitij+εitij=β0+β1tij+β2tij2++βptijp+k=1KuipktijKk+p+vi0+vi1tij+vi2tij2++viptijp+εitij.E6

The model in (2.6) can be rewritten in matrix notation as:

y=Xβ+Zb+ε,E7

where

X=X1Xn,Z=X1000X2000XnZ1000Z2000Zn,
Xi=1ti1ti12ti1p1tinitini2tinip,Zi=ti1K1+pti1KK+ptiniK1+ptiniKK+p,
bT=v10v1pvn0vnpu1p1u1pKunp1unpK,βT=β0βp.

Here, y is the i=1nni×1 matrix of observed longitudinal data; X is the i=1nni×p+1 matrix of fixed effect covariates; Z is the i=1nni×p+K+1n matrix of random effect covariates and ε is the i=1nni×1 matrix of error.

Postulating a proportion hazard model, the penalized spline joint models for longitudinal and time-to-event data is defined by

hitMitwi=limdt0PrtTi<t+dtTitMitwi/dt=h0texpγTwi+αmit,E8

where h0t is the hazard at baseline and wi is a vector of baseline covariates (such as treatment indicator, gender of a patient, etc.). Furthermore, Mit=mis0s<t denotes the history of the true unobserved longitudinal process up to time point t.

Using (2.7), the longitudinal submodel for the ith subject is given by

mit=mit+εit,εitN0σε2yit=XiTtβ+XiTtvi+ZiTtui+εitviN0,uiN0D,E9

where the covariance matrix of random effects biT=vi0vipuip1uipK is given as

G=Covbi=00D.

To complete the specification of the model in (2.8), we now need to define the form for the baseline risk function h0.. Motivated by the fact that in real life, h0. is usually unknown. Therefore, two options are adopted to determine the form of the function h0. in this chapter. First, a standard option is to use a known parametric distribution for the risk function. For this option, the Gompertz distribution is chosen. Second, the piecewise constant model is chosen when h0. is considered completely unspecified.

Therefore, the proposed penalized spline joint models considered in this chapter are as follows:

Model 1: Penalized spline joint model with hazard rate at base line having Gompertz distribution

hitMitwi=λ1expλ2texpγTwi+αmitmit=XiTtβ+XiTtvi+ZiTtui.E10

Model 2: Penalized spline joint model with a piecewise-constant baseline risk function

hi(t|Mit,wi)=q=1QξqIνq1<tνqexpγTwi+αmitmit=XiTtβ+XiTtvi+ZiTtui,E11

where 0=ν0<ν1<<νQ denotes a split of the time scale, with νQ being larger than the largest observed time and ξq denotes the value of the baseline hazard in the interval νq1νq. In both models, Xi, Zi, β, vi and ui are given in (2.7).

Advertisement

3. Parameter estimation

After defining the two penalized spline joint models, we now present the joint likelihood and score functions of the parameters in the models. The ECM algorithm is also presented in this section.

3.1. Likelihood and score functions

Following Rizopoulos [2], we assume that the vector of time-independent random effects bi underlies both the longitudinal and survival processes. This means that

pTiδiyibiθ=pTiδibiθpyibiθpyibiθ=jpyitijbiθ,E12

where θ=θtTθyTθbTT denotes the full parameter vector with θt=γTαθh0TT denoting the parameter vector for the survival outcomes. Furthermore, θy=βTσε2T is the parameter vector for longitudinal outcomes and θb=vechG is the vector-half of the variance matrix of random effects. In addition, we assume that the hazard rate at time t conditional on the covariate path depends on the current value of longitudinal outcomes and the censoring mechanism is independent of the true event times and future longitudinal measurements. Under these assumptions, the log-likelihood formulation of the penalized spline joint models can be written as

lθ=lθTiδiyi=ilogbipTiδibiθtβpyibiθypbiθbdbi,E13

where the conditional density for survival part has the form of

pTiδibiθtβ=hTiMiTiwiθtβδiSTiMiTiwiθtβ=h0texpγTwi+αmitδiexp0Tih0sexpγTwi+αmisds.E14

Here, St is the survival function at time t.

Moreover, the density for the longitudinal part with the random effects is given by

pyibiθypbiθb=jpyitijbiθypbiθb=12πσε2ni2expyitijXiTtijβXiTtijviZiTtijui22σε2×2πqb2detG1/2expbiTG1bi/2,E15

where qb denotes the dimensionality of the random effects vector.

We consider the log likelihood of the (Ti,δi,yi,bi) over the unknown θt, β and bi

1loglθTiδiyibi=logpTiδibiθtβ+logpyibiβ+logpbiG.

The function for maximizing the log likelihood involves the density function of survival time and least squares with a penalty term, which is

logpTiδibiθtβyiXiβXiviZiuiTyiXiβXiviZiuiσε2biTG1bi.E16

According to Ruppert et al. [9], the term σε2biTG1bi is called a roughness penalty and the variance components matrix defined as F=σε2G1. Using a Lagrange multiplier argument, the variance components matrix is the condition to constrain the coefficients of the knots ui. These will restrict the influence of the variables tKk+p and will lead to smoother spline functions.

Using (3.2), the score vector for the penalized spline joint models can be expressed as:

Sθ=iθTlogpTiδibiθtβpyibiθypbiθbdbi=iθTlogpTiδibiθtβp(yibiθy)p(biθb).pbiTiδiyiθdbi.E17

The requirement for numerical integration with respect to the random effects is one of the main difficulties in the joint models [2]. The maximum likelihood estimation (MLE) is typically obtained using standard maximization algorithms such as expectation maximization algorithm or Newton-Raphson algorithm.

3.2. The ECM algorithm

The EM algorithm has been widely used in the joint models, such as for the standard joint model of Rizopoulos [2] and for the generalized linear mixed joint model [10]. The ECM algorithm is a natural extension of EM algorithm for which the maximization process on the M-step is conditional on some functions of the parameters under estimation. It also can reduce computer time. The ECM algorithm will be used to obtain the maximum likelihood estimates of the penalized spline joint models following McLachlan and Krishnan [11] in this chapter.

In these models, the random effects bi are considered as missing data. Hence, it is difficult to estimate directly the parameter vector θ that maximizes the observed data log likelihood lθ in (3.2). Alternatively, we can estimate the parameter vector θ that maximizes the expected value of the complete data log-likelihood which is ElogpTiδiyibiθTiδiyiθit, where θit is the parameter vector given at the ith iteration.

The following are the steps of this algorithm.

Step 1: Initialization

We first initialize the parameters. We assume that there are m parameters in the models and the starting value of the parameter vector is θ0=θ10θm0. Based on these initial values, we calculate the log-likelihood using (3.2).

Step 2: The E-step for the penalized joint models

We fill in the missing data and replace the log-likelihood function of the observed data with the expected function of the complete data log-likelihood as follows:

Qθθit=ilogpTiδiyibiθ.pbiTiδiyiθitdbi=ilogpTiδibiθ+logp(yibiθ)+logp(biθ).pbiTiδiyiθitdbi.E18

Step 3: The conditional M-step for the penalized joint models.

This step will be implemented in four stages as follows:

3.1 Given the current value of the parameter vector at the ith iteration θit=θ1itθ2itθmit, we calculate the log likelihood at lθit=ilogbipTiδiyibiθitdbi.

3.2 Propose the new value for the first parameter θ1prop which maximizes Qθθit. Then, we calculate the log likelihood at lθprop where θprop=θ1propθ2itθmit.

3.3 Set θ1it=θprop if lθproplθit, otherwise set θ1it=θit.

3.4 Similarly, based on the value of the parameter vector θ1it, we update the new value for the second parameter and continue updating for the last parameter, θmit and set θit+1=θmit.

Step 4: Iterate among steps 2–3 until the algorithm numerically converges.

To update the new values for parameters in the conditional M-step, we have the closed-form estimates for the measurement of error variance σ2 and the covariance matrix of the random effects respectively by maximizing the expected function Qθθit. Unfortunately, we cannot obtain closed-form expressions for the remaining of the parameters. We thus employ the one-step Newton-Raphson approach to get the updates for βit+1, γit+1, αit+1 and θh0it+1 respectively as detailed in Appendix B.

Following Louis [12], standard errors for the parameter estimates can be calculated by using the estimated observed information matrix

v̂arθ̂=Iθ̂1,

where

Iθ̂=i=1nSiθθθ=θ̂.
Advertisement

4. Empirical results

This section presents two simulation studies for Model 1, whereas Model 2 will be applied for a case study only. In Section 4.1, we simulate data from Model 1 with three internal knots in the longitudinal submodel and Gompertz distribution for the baseline risk function. In Section 4.2, we simulate data from Model 1 having Gompertz distribution for the baseline risk function and non-linear logarithm subject-specific trajectories. The ECM algorithm, written in R code, is applied to estimate the true values of parameters in both cases.

4.1. Simulation study 1

4.1.1. Data description

Recall the penalized spline joint Model 1 of (2.10) with three internal knots in longitudinal submodel and Gompertz distribution for the baseline risk function in the form of

hit=h0texpγxi+αmit=λ1expλ2texpγxi+αmit,E19

where h0t is the hazard function at baseline having Gompertz distribution, xi is baseline covariate and mit denotes the true and unobserved value of the longitudinal at time t. The form of mit is given by

mit=β0+β1t+ui1tK1++ui2tK2++ui3tK3++vi0,E20

where bi=u11u12u13vi0T is the vector of random effects and is assumed to have a normal distribution with mean zero and the diagonal covariance matrix D=DiagD11D22D33D44. K1,K2,K3 denote the three internal knots put into the model. The observed longitudinal value at time point t for the ith subject is of the form

yit=mit+εit,E21

where the error variable εit is assumed to come from a normal distribution with mean zero and variance σ2.

From this model, the vector of all parameters θ of the models in (4.1) and (4.2) is θ=θtTθyTθbTT, where θt=γαλ1λ2T denotes the parameter vector for the survival outcomes. Furthermore, θy=β0β1σε2T is the parameter vector for longitudinal outcomes and θb=D is the variance matrix of random effects.

To simulate the observed survival time Ti of the joint model in (4.1), we applied the methods adapted by Bender et al. [13], Austin [14] and Crowther and Lambert [15] to generate the true survival time. We further assumed that the censoring mechanism is exponentially distributed with parameter λ. The observed survival time was the minimum of the censoring time and the true survival time. We generated the survival time Ti for n=500 subjects with the parameters: β0=5,β1=2,λ1=0.1,λ2=0.5,γ=0.5,α=0.05,δ=2 and D=Diag2224. Then we generated the longitudinal responses mit using (4.2). The simulated model is therefore

hit=0.1exp0.5texp0.5xi+0.05mitmit=5+2t+ui1t1++ui2t2++ui3t3++vi0.E22

The sample of simulated data is presented in Appendix A. The curve of Kaplan-Meier estimate for the survival function of simulated data (left panel) and the longitudinal trajectories for the whole simulated sample (right panel) are presented in Figure 1. The dashed lines in the left panel correspond to 95% pointwise confidence intervals. It is clear from the plot of Kaplan-Meier estimator that the survival probability starts from 1 and decreases gradually until at the 5th month of the study. After this, it is nearly zero after 6 months or so. The right panel is the longitudinal trajectories for the first 100 subjects reflecting the form as in (4.2).

Figure 1.

Kaplan-Meier estimate of the survival function of the simulated data of (4.4) (left panel). Longitudinal trajectories of the first 100 subjects from the simulated sample of (4.4) (right panel).

4.1.2. Parameter estimation

The ECM algorithm, as described in Section 3.2, is now implemented to estimate all parameters in (4.4). The initial values of the parameters were set at β0=1,β1=1,λ1=0.05,λ2=0.1,γ=0.1,α=0.01,σ=1,D11=3,D22=3,D33=3,D44=3, respectively. However, these initial values can also be set randomly. The traces of each of these parameters are presented in Figures 2 and 3, respectively. The traces of estimates show the way how the algorithm updates new values of the parameters. In addition, they also demonstrate the convergence of the algorithm after 10–20 iterations. In particular, the parameters β0,β1,λ2,α,σ,D22 and D33 converge linearly to the true values while the parameters λ1,γ,D11, and D44 oscillate before converging to the true values.

Figure 2.

The traces of parameters β0,β1,λ1,λ2,γ,α for 100 iterations.

Figure 3.

The traces of parameters σ,D11,D22,D33,D44 for 100 iterations.

We now run the simulation for 30 independent samples with different sample sizes (n=200,300 and 500). Then, we calculate the means, standard deviations (SD) and mean square error (MSE) of parameters as presented in Table 1. The point estimates of each parameter are reasonably close to the true values when the sample sizes are 300 and 500. This is also supported by the values of SD and MSE which decrease gradually when the sample size increases. In addition to this, we also compare the parameter estimates with different censoring rates (20% and 40%) for a sample size of 500 in 5, Appendix E. The result shows that when the sample size is large the censoring rate has little influence on the estimates.

ParameterTrue valuen=200n=300n=500
EstimateSDMSEEstimateSDMSEEstimateSDMSE
β054.210.720.764.680.500.325.100.300.27
β121.690.750.571.860.750.282.100.570.18
λ10.10.120.130.000.120.120.000.110.100.00
λ20.50.500.150.020.570.140.010.480.140.02
γ0.50.500.170.030.490.120.040.510.090.01
α0.050.030.040.000.040.050.000.040.040.00
σ22.060.130.012.020.060.002.020.060.00
D1122.870.920.622.590.730.532.270.800.22
D2222.030.420.162.210.460.232.100.430.05
D3322.100.370.170.340.500.342.220.590.10
D4445.241.820.764.320.740.604.240.630.18

Table 1.

Summary statistics for parameter estimation of the simulated data of the model in (4.4) for different sample sizes.

4.2. Simulation study 2

4.2.1. Data description

We now perform a simulation study on proportional hazard model having Gompertz distribution at baseline and non-linear subject-specific trajectory. In particular, the model is in the form of

hit=h0texpγxi+αmit=λ1expλ2texpγxi+αmit,E23

where h0t is the hazard function at baseline having Gompertz distribution, xi is baseline covariate and mit denotes the true and unobserved value of the longitudinal at time t. The observed longitudinal value at time point t for the ith subject has the non-linear form

yit=mit+εit=5log1+t+bi1t+bi0+εit,E24

where εitN0σ2. In the model of (4.6), the mean longitudinal response of the population is assumed to have a non-linear logarithm curve. Different subjects are assumed to have different intercepts and slopes. In particular, we assume that bi=bi0bi1T having a bivariate normal distribution with mean μ=32 and covariance matrix D=Diag11. The true values of the other parameters we put in the model were λ1=0.01,λ2=0.1,γ=0.5,α=0.2,σ=2, respectively. In addition, the censoring mechanism is assumed exponentially distributed with a parameter of λ=0.25.

Based on the model in (4.5) and the simulation study 1, we simulated survival times Ti for 500 subjects with 35% censoring rate. In particular, the ending time for the study was 5 months and all subjects alive by the end of the study (i.e. time 5) were assumed to be censored. This design was also reflected of many clinical studies in real life. In this sample, there were 329 uncensored subjects and 1387 observations for 500 subjects. For each subject, 1–5 longitudinal measurements were recorded. On average, there were three longitudinal measurements per subject. In Figure 4, the Kaplan-Meier estimate for survival curve is presented for the simulated data of (4.5) with 95% pointwise confidence intervals in the left panel. Moreover, the subject-specific longitudinal profiles for six randomly selected subjects is drawn in the right panel. It can be seen that some of the subjects in this dataset showed non-linear evolutions in their longitudinal values. Each subject has its own trajectory.

Figure 4.

Kaplan-Meier estimate of the survival function of the simulated data of (4.5) (left panel). Longitudinal trajectories for the six randomly selected subjects of (4.6) (right panel).

4.2.2. Parameter estimation

We will be using Model 1 in (4.1) and in (4.2) to handle the non-linear longitudinal trajectory in the simulated data in (4.5). In this model, we put three internal knots at 25, 50 and 75%, respectively, of the follow up time. Then, the ECM algorithm, as explained in Section 3, is implemented once again to estimate all parameters in the model.

The results for parameter estimation are presented in Table 2. The means, standard deviations and 95% confidence intervals of parameter estimates are calculated for 30 independent samples. The point estimates for λ1,λ2,γ,α,σ2 are reasonably close to the true values. Similarly, the 95% CIs include the true values of λ1,λ2,γ,α,σ2.

ParameterTrue valueEstimateSD95% CI
β03.3990.673[3.158;3.640]
β14.3300.142[4.280;4.380]
λ10.010.0130.021[0.007;0.021]
λ20.10.0830.184[0.017;0.148]
γ0.50.6400.386[0.486;0.778]
α0.20.1860.142[0.136;0.237]
σ21.9930.061[1.971;2.015]
D110.9770.190[0.909;1.044]
D221.3650.183[1.300;1.430]
D331.9760.154[1.921;2.031]
D441.3930.196[1.322;1.463]

Table 2.

Summary statistics for parameter estimation of the simulated data of the model in (4.5) and (4.6).

Based on the estimated values of parameters, we generate back the estimated survival time by approximating values of random effects from linear mixed-effects function. The detail of the generation is explained in Appendix C. Then, we use the Kaplan-Meier estimate to compare between the survival function of the simulated dataset (the black solid line) and the estimated survival function (the dashed line) which are presented in the left panel of Figure 5.

Figure 5.

Kaplan-Meier estimate of the survival function from simulated failure times (the solid line) with 95% confidence intervals (dot lines), from Model 1 (4.5) (the dashed line) (left panel). Observed longitudinal trajectories (the solid line) and predicted longitudinal trajectories (the dashed line) for the twelve randomly selected subjects (right panel).

Moreover, we also draw the smooth and predicted longitudinal profiles for 12 patients chosen randomly in the right panel of Figure 5. The dot points are the true observed longitudinal values from simulated data. The solid lines are the smooth longitudinal profiles of the true observed longitudinal values using the loess smoother and the dashed lines are the predicted profiles of 12 randomly selected individuals. It is clear that the Kaplan-Meier estimates from simulated data overlaps the Kaplan-Meier estimates based on the predicted value in the left panel of Figure 4. The penalized spline regression model in (2.10) was a good fit for subject-specific curves in the right panel of Figure 5.

In summary, simulation studies have shown the stability of the algorithm and the goodness of fit of the penalized spline models. From the simulation study 1, it is shown that the updating process through the ECM algorithm converges quickly to the true values of the parameters. In addition, the simulation study 2 shows that the model can well predict the survival function and individual trajectories respectively.

4.3. The AIDS data set

In the AIDS dataset, there were 467 patients with advanced human immunodeficiency virus infection during antiretroviral treatment who had failed or were intolerant to zidovudine therapy. Patients in the study were randomly assigned to receive either didanosine drug (ddI) or zalcitabine drug (ddC). CD4 cells are a type of white blood cells made in the spleen, lymph nodes and thymus gland and are part of the infection-fighting system. CD4 cell counts were recorded at the time of study entry as well as at 2, 6, 12 and 18 months thereafter. The detail regarding the design of this study can be found in Abrams et al. [16]. By the end of the study, there were 188 patients died, resulting in about 59.7% censoring. There were 1405 longitudinal responses recorded.

Previously, Rizopoulos [2] used his standard joint model for the AIDS data which consider the variability between subjects mostly depend on the intercept. However, the model could not predict observed longitudinal data accurately. When the time unit is changed from month to year in the data, the variability between subjects depends not only on the intercept but also on the obstime variable. In addition, the longitudinal trajectories plot also shows many non-linear curves as depicted in the right panel of Figure 6.

Figure 6.

Kaplan-Meier estimate of the survival function of the AIDS data (left panel). Longitudinal trajectories for CD4 cell count of the first 100 patients for two groups (right panel).

Given the non-linearity, it is appropriate to apply our models, Model 1 and Model 2, for the AIDS data. In particular, we use the two joint models in (2.10) and (2.11) with the four internal knots are placed at 20, 40, 60, 80%, respectively of the observed failure times for hazard rate at baseline. Then, the ECM algorithm is implemented to estimate all parameters in the two models. A summary of statistics for parameter estimation using Model 1 and Model 2 is presented in Table 3.

Model 1Model 2
ParameterEstimateStd. errorz-valuep-valueParameterEstimateStd. errorz-valuep-value
β07.870.06127.07<0.001β07.810.07114.34<0.001
β1−1.690.11−14.77<0.001β1−1.620.12−12.99<0.001
γ0.220.112.060.039γ0.310.103.030.002
α−0.200.01−15.84<0.001α−0.240.01−18.15<0.001
λ11.680.07λ11.040.11
λ20.330.00λ21.790.23
σ2.360.36λ31.380.38
D112.180.14λ41.670.42
D221.040.07λ52.480.66
D330.850.06σ2.620.45
D4411.870.78D111.020.07
D220.970.06
D330.990.07
D4411.400.75

Table 3.

Summary statistics for parameter estimation of the AIDS data of Model 1 and Model 2 respectively.

Following Rizopoulos [2], in Model 1 and Model 2, the univariate Wald tests are applied for the fixed effects β=β0β1T in the longitudinal submodel, the regression coefficient γ and the association parameter α respectively. The results from Table 3 show that the point estimates of β0,β1,γ,α are all statistically significant for both models at a significance level of 5%.

We conduct the Kaplan-Meier estimate of the survival function from the observed survival time (the light solid line) and the dot lines correspond to 95% pointwise confidence intervals in Figure 6 (left panel). The predicted survival function from Model 1 is the dashed line and the predicted survival function from Model 2 is the bold solid line. The plots show that Model 2 works very well in this case as shown in Figure 7. Moreover, Model 2 is also preferred in practice because h0. usually is considered as unspecified in order to avoid the impact of misspecifying the distribution of survival times.

Figure 7.

Kaplan-Meier estimates of the survival function from observed failure times, from model 1 and from model 2 (left panel). Observed longitudinal trajectories (the solid line) and predicted longitudinal trajectories (the dashed line) for the 12 randomly selected patients (right panel).

Based on the model of longitudinal regression in (4.2), we also draw the smooth and predicted longitudinal profiles for nine patients from the AIDS dataset as depicted in Figure 7 (right panel). The dot points are the true observed longitudinal values. The solid lines are the smooth longitudinal profiles using the loess smoother and the dashed lines are the predicted profiles of nine randomly selected individuals. Most of the predicted profiles are quite close to the observed ones.

Advertisement

5. Discussion

In this chapter, two joint models using a penalized spline with a truncated polynomial basis have been proposed to model a non-linear longitudinal outcome and a time-to-event data. The use of a truncated polynomial basis gives us an intuitive and obvious way to model non-linear longitudinal outcome. By adding some penalties for the coefficients of the knots and using linear mixed-effects models, the smoothing is controlled and the individual curves are specified.

We have conducted a sensitivity analysis on the assumption of normality for either random effects or errors. The t-distribution with the degree of freedom 5 is applied for each of them. The results show that the estimates of parameters are sensitive when both of terms are not normally distributed.

The main findings we may derive from this chapter are, at least, threefold: (1) the ECM algorithm provides a reasonable quick convergence algorithm for the proposed models; (2) the fitted joint models are able to measure the association between the internal time-dependent covariates and the risk for an event and (3) the two models provide a good prediction for both the longitudinal and survival functions, as presented in empirical results.

The limitations of this study are, at least, threefold: (1) the number of internal knots is limited to three due to computational costs; (2) the polynomial power functions can form an ill-conditioned basis for the models and (3) the estimation results are sensitive when both random effects and error are not normally distributed.

Based on the limitations, our future work will focus on using new methods for approximating the integrals to reduce the computational problems or relaxing the normality assumption. Furthermore, we will apply a different basis for joint models, that is the penalized B-spline. In terms of parameter estimation, we are considering a different approach to estimate the parameters in the models using a Bayesian approach, via Markov chain Monte Carlo (MCMC) algorithms.

Advertisement

One sample of simulated data of the penalized spline joint model in (4.4) is presented in Table 4 for the first three patients. The subjects are measured bimonthly and the entry time is 0 for all subjects. Obstime variable includes the time points at which these measurements are recorded. Time variable includes the observed survival times when subject meets an event. x is a time-constant binary random variable with parameter p=0.5. Column y contains the longitudinal responses. Death variable is the event status indicator. This variable receives value 1 when the true survival time is less than or equal to the censoring time and 0 otherwise. We define the four random effects variables which are Z1=obstimeK1+, Z2=obstimeK2+, Z3=obstimeK3+ and Z4=1. For the longitudinal process, there are 1902 of observations for 500 subjects. For each subject, 1-7 longitudinal measurements are recorded. On average, there are four longitudinal measurements per subject. For the event process, there are 297 subjects who meet for an event which is equivalent to 59.4% of the whole sample.

IdObstimeTimexyDeathZ1Z2Z3Z4
10.04.9701.4110.00.00.01
10.54.9706.4510.00.00.01
11.04.9704.1010.00.00.01
11.54.9701.5010.50.00.01
12.04.9704.0711.00.00.01
12.54.9706.1611.50.50.01
13.04.9703.6012.01.00.01
13.54.9708.3212.51.50.51
14.04.9706.3213.02.01.01
20.02.7906.8110.00.00.01
20.52.7907.7710.00.00.01
21.02.7909.7510.00.00.01
21.52.79011.0410.50.00.01
22.02.7907.2011.00.00.01
30.01.900−1.8400.00.00.01
30.51.9001.1200.00.00.01
31.01.9000.7800.00.00.01

Table 4.

A snapshot of simulated data for penalized spline joint model in (4.4).

ParameterTrue valueCensored (20%)Censored (40%)
EstimateSDMSEEstimateSDMSE
β054.850.300.255.100.300.27
β121.860.450.202.100.570.18
λ10.10.130.120.000.110.100.00
λ20.50.520.070.000.490.140.02
γ0.50.480.100.000.510.090.00
α0.050.050.020.000.040.040.00
σ22.020.050.002.020.060.00
D1122.210.670.172.270.800.22
D2222.160.270.092.100.430.05
D3322.260.270.012.220.600.10
D4444.200.530.204.240.630.18

Table 5.

Summary statistics for parameter estimation of the simulated data of the model in (22) for different censoring rates.

The integrals with respect to the random effects in (3.7) do not have closed-form solutions. Therefore, in this chapter, we implement the Gaussian-Hermite quadrature rule as in Rizopoulos [5] to approximate the integrals. In our simulation study and R coding, we use the Gaussian-Hermite quadrature rule with 10 quadrature points.

The updating formulas of the parameters in Step 3 have different forms for each parameter following Rizopoulos [2]. We have the closed-form estimates for the measurement error variance σε2 in the longitudinal model and the covariance matrix of the random effects as follows:

Ĝit+1=1nibiTbipbiTiδiyiθitdbi=1Nivb˜iit+b˜iitb˜iitT,E25

where b˜i=EbiTiδiyiθ=bipbiTiδiyiθdbi and v˜bi=varbiTiδiyiθ=bib˜ipbiTiδiyiθdbi. The updating formula for σε2 is

σ̂εit+12=1niWTWpbiTiδiyiθitdbi,E26

where W=yiXiβXiuiZivi.

Unfortunately, we cannot obtain closed-form expressions for the fixed effects β and the parameters of the survival submodel γ,α,andθh0. We thus employ the one-step Newton-Raphson approach to obtain the updated βit+1, γit+1,αit+1 and θh0it+1. In particular, we have

Sθ=Qθθitθθ̂it+1=θ̂itSθ̂itθ1Sθ̂it,E27

where Sθ is the score vector corresponding to parameter θ and the score vector has the form of

Sθ=Qθθitθ=iθTlogpTiδibiθitp(yibiθit)p(biθit).pbiTiδiyiθitdbi.

There are four cases for simulating survival time Ti of the model (4.1) as follows.

When the survival time t<K1, we calculate the cumulative hazard function Hit=0thisds. Based on the relation between the survival function Sit, cumulative hazard function Hit and cumulative distribution Fit, we have

Sit=expHit=1Fit.E28

Following (5.4), we set

u=1FiTi,E29

where u is a random variable with uUni01. The survival time t is the solution of the equation

U=expHit=exp0thisds.

The condition t<K1 is equal to

logU<0K1hsds.

When K1t<K2, we calculate the cumulative hazard function Hit=0K1hisds+K1thisds. The survival time t is the solution of the equation

U=exp0K1hisds+K1thisds,

where U is a value of uUni01. The condition K1t<K2 is equal to

logU<0K1hisds+K1K2hisds.

When K2t<K3, we calculate the cumulative hazard function Hit=0K1hisds+K1K2hisds+K2thisds . The survival time t is the solution of the equation

U=exp0K1hisds+K1K2hisds+K2thisds,

where U is a value of uUni01. The condition K2t<K3 is equal to

logU<0K1hisds+K1K2hisds+K2K3hisds.

When K3t, the cumulative hazard function has the form Hit=0K1hisds+K1K2hisds+K2K3hisds+K3thisds. Survival time t is the solution of the equation

U=exp0K1hisds+K1K2hisds+K2K3hisds+K3thisds.

In particular, Ruppert et al. [9] introduced a default choices for knot location and number of knots. The idea is to choose sufficient knots to resolve the essential structure in the underlying regression function. But for more complicated penalized spline models, there are computational advantages to keeping the number of knots relatively low. A reasonable default is to choose the knots to ensure that there are a fixed number of unique observations, say 4–5, between each knot. For large data sets, this can lead to an excessive numbers of knots; therefore, a maximum number of allowable knots (say, 20–40 total) are recommended.

According to Ruppert et al. [9], the choice for knot position is

Kk=k+1K+2th sample quantile of the unique xi for k=1,,K.

The simple choice of K is

K=min14×number of uniquexi35.

See Table 5.

References

  1. 1. Cox DR. Regression models and life-tables. Journal of the Royal Statistical Society. Series B (Methodological). 1972;34(2):187-220
  2. 2. Rizopoulos D. Joint Models for Longitudinal and Time-To-Event Data with Applications in R. Chapman & Hall/CRC Biostatistics series; 2012
  3. 3. Brown ER, Ibrahim JG, DeGruttola V. A flexible B-spline model for multiple longitudinal biomarkers and survival. Biometrics. 2005;61(1):64-73
  4. 4. Ding J, Wang J-L. Modeling longitudinal data with nonparametric multiplicative random effects jointly with survival data. Biometrics. 2008;64(2):546-556
  5. 5. Rizopoulos D. Fast fitting of joint models for longitudinal and event time data using a pseudo-adaptive Gaussian quadrature rule. Computational Statistics and Data Analysis. 2011;56:491-501
  6. 6. Rizopoulos D. JM: An R package for the joint modelling of longitudinal and time-to-event data. Journal of Statistical Software. 2010;35(9):1-33
  7. 7. Kalbfleisch J, Prentice R. The Statistical Analysis of Failure Time Data. 2nd ed. New York: Wiley; 2002
  8. 8. Durban M, Harezlak J, Wand M, Carroll R. Simple fitting of subject-specific curves for longitudinal data. Statistics in Medicine. 2005;24(8):1153-1167
  9. 9. Ruppert D, Wand M, Carroll R. Semiparametric Regression. Cambridge: Cambridge University Press; 2003
  10. 10. Viviani S, Alfo M, Rizopoulos D. Generalized linear mixed joint model for longitudinal and survival outcomes. Statistics and Computing. 2014;24(3):417-427
  11. 11. McLachlan G, Krishnan T. The EM Algorithm and Extensions. Vol. 382. John Wiley & Sons; 2007
  12. 12. Louis TA. Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological). 1982:226-233
  13. 13. Bender, Augustin Blettner, Bender, Bender R, Augustin T, Blettner, M. 2005. Generating survival times to simulate cox proportional hazards models, Statistics in Medicine. 24(11): 1713‐1723
  14. 14. Austin PC. Generating survival times to simulate cox proportional hazards models with time-varying covariates. Statistics in Medicine. 2012;31(29):3946-3958
  15. 15. Crowther MJ, Lambert PC. Simulating biologically plausible complex survival data, Statistics in Medicine. 2013;32(23):4118‐4134
  16. 16. Abrams D, Goldman A, Launer C, Korvick J, Neaton J, Crane L, Grodesky M, Wakefield S, Muth K, Kornegay S. Comparative trial of didanosine and zalcitabine in patients with human immunodeficiency virus infection who are intolerant of or have failed zidovudine therapy. New England Journal of Medicine. 1994;330:657-662

Written By

Huong Thi Thu Pham and Hoa Pham

Submitted: December 1st, 2017 Reviewed: February 26th, 2018 Published: June 6th, 2018