Posterior estimates of regression coefficients with standard errors.
Abstract
Random effects models have been widely used to analyze correlated data sets, and Bayesian techniques have emerged as a powerful tool to fit the models. However, there has been scarce literature that systematically reviews and summarizes the recent advances of Bayesian analyses of random effects models. This chapter reviews the use of the Dirichlet process mixture (DPM) prior to approximate the distribution of random errors within the general semiparametric random effects models with parametric random effects for longitudinal data setting and failure time setting separately. In a survival setting with clusters, we propose a new class of nonparametric random effects models which is motivated from the accelerated failure models. We employ a beta process prior to tact clustering and estimation simultaneously. We analyze a new data set integrated from Alzheimer’s disease (AD) study to illustrate the presented model and methods.
Keywords
- beta process
- Dirichlet process mixture
- clustered data
- longitudinal data
- random effects
- survival outcome
- nonparametric transformation model
1. Introduction
Random effects models have been widely used as a powerful tool for analyzing correlated data [1, 2]. The model features a finite number of random terms acting as latent variables to model unobserved factors; see [3] for a comprehensive review. Some authors have further proposed semiparametric mixed effect models by allowing for infinite dimensional random effects [4, 5]. Most of the aforementioned works draw inferences using frequentist approaches, while Bayesian approaches have been largely ignored because of the lack of computational feasibility and expediency. With the advent of the “supercomputer” era, Bayesian analyses have recently sparked much interest in the setting of random effects models for clustered data or longitudinal settings. However, there is scarce literature that has systematically reviewed the Bayesian works in the area.
By extending the traditional random effects models, recent research focus has shifted to study heterogeneous random effects or nonparametric distributions of random effects, which arise because of skewness of data, missing covariates, or unmeasurable subject-specific covariates [6]. The extended random effects models, termed semiparametric random effects models, improve statistical performance with added interpretability. Bayesian techniques, which provide a convenient means to model non-Gaussian distributions, have recently been proposed for semiparametric random effects model in a variety of settings ([7, 8], among others). The discreteness of the Dirichlet process makes it impossible as a prior for estimating a density. However, as a remedy by convolving with a kernel, Dirichlet process mixture plays an important role [9].
For censored outcome data, transformation models, which transform the time-to-event responses using a monotone function and link them to the covariates of interest, have surged as a strong competitor of the Cox model [10]. Moreover, the transformation model framework is fairly general. The Cox model and the proportional odd model [11] can be viewed as nonparametric transformation linear models with some specific error terms; see [12, 13, 14]. For correlated data, the transformation model naturally extends the semiparametric random effects model by directly incorporating random effects to the transformation functions, treating them as realizations of an underlying random function. Bayesian analyses have found much use in this new area. For example, the beta process has been found to be a reasonable candidate for the prior of the monotone transformation function [15, 16, 17].
This chapter focuses on the Bayesian analysis of the transformed linear model with censored data and in a clustered setting. In many biomedical studies, the observations are naturally clustered. For example, patients in observational studies can be grouped in analysis according to a variety of factors, such as age, race, gender, and hospital, in order to reduce the confounding effects. Following Mallick and Walker [18], we explore using a mixture of beta distributions and the beta process as the candidates for the prior distribution of the random transformation function [17, 19, 20].
The rest of this chapter is structured as follows. Section 2 reviews the use of the Bayesian approach to infer parametric random effects models. In the setting of survival analysis, Section 3 proposes a beta process prior to fit random effects model with nonparametric transformation functions, and Section 4 applies the method to study the progression of Alzheimer’s disease (AD). Section 5 concludes the chapter with future research directions.
2. Dirichlet process mixture prior
In parametric random effects models, we considered the situation that the distribution form of the random error term is unknown. Dirichlet process mixture (DPM) is used as the prior for the baseline distribution in that error terms used to be continuous random variables in most situations.
2.1 Linear mixed effects model
With a longitudinal data set
where
where
Replacing the Dirichlet process by an equivalent Pólya urn representation, [8] employed an empirical likelihood approach with the moment constraints and developed a posterior adjusted Gibbs sampler for more precise estimation. The algorithm is computationally feasible.
2.2 Accelerated failure time model
We shift gears to study survival outcomes with a cluster structure. Denote the data set by
where
where
3. Beta process prior
We now present a nonparametric random effects model for the clustered survival data with nonparametric monotone link functions. We employ a beta process as the prior for the baseline function.
Let
Within each cluster,
where
To proceed, let the coefficient vector
where
Here
We develop a Bayesian inference procedure based on model (6). We assume that the regression coefficient
where
where
with
We are in a position to select an appropriate stochastic process used as the prior of
Teh et al. [20] showed that a sample from
where
In practice, we need to approximate samples of
with
The approximated posterior based on the truncated DP is
where
The samples for
4. An application to Alzheimer’s disease neuroimaging initiative
Alzheimer’s Disease Neuroimaging Initiative (ADNI) is a multisite cooperative study for the purpose of improving the prevention and treatment of Alzheimer’s disease. The subjects in the study fall into three groups, cognitively normal (CN) individuals, mild cognitive impairment (MCI) patients, and early AD patients. ADNI provides a rich array of patients’ information, including functional magnetic resonance imaging (fMRI), positron emission tomography (PET), longitudinal functional cognitive tests scores, blood samples, genetics data, and censored failure time outcomes. Details of the study can be found at http://adni.loni.usc.edu.
We focus on the MCI group. MCI is recognized as a transitional stage between normal cognition and Alzheimer’s disease. The failure time is defined to be the time that a MCI patient is diagnosed with AD, which will be censored if a MCI patient remains at the MCI stage at the end of the follow-up time. Wide heterogeneities are exhibited among the failure times, which may be due to demographics and a variety of functional clinical biomarkers, such as the brain areas of the hippocampus, ventricles, and entorhinal cortex. The goal of the analysis is to study the impact of risk factors on progression to AD.
Using the same data as analyzed by [14], we demonstrate our methodology by modeling the failure time (the observed time of AD diagnosis from MCI stage in year) of 281 MCI patients on gender (0 = female, 1 = male), years of education, the number of apolipoprotein E alleles (0, 1, or 2), and the baseline hippocampal volume.
As age is a strong confounder but the functional form of its impact has not reached consensus, we elect to model its impact nonparametrically. Specifically, we use age to form two strata (below and above the median age) and use model (6) to estimate the stratum-specific transformation functions and the effects of other covariates. For comparisons, we also fit model (6) with age as a continuous variable and with a common transformation function. That is, we do not assume the data are clustered. For both models, the regression errors
Figure 1
illustrates the estimated transformation function
RID | AGE | PTGENDER | PTEDUCAT | APOE4 | Hipp. | |
---|---|---|---|---|---|---|
PM | −0.9635 | 0.0069 | −0.1453 | −0.0231 | −0.1817 | 0.2710 |
SE | 1.3288 | 0.0841 | 1.2331 | 0.1835 | 0.8616 | 0.5333 |
Figure 2 illustrates the estimated transformation functions with age-stratified data, and Table 2 summarizes the posterior means and standard errors of the other regression coefficients.
RID | PTGENDER | PTEDUCAT | APOE4 | Hipp. | |
---|---|---|---|---|---|
PM | −0.6399 | −0.0706 | −0.0072 | −0.1349 | 0.1919 |
SE | 0.9273 | 0.8491 | 0.1267 | 0.6098 | 0.3716 |
The left curve is relatively flat, while the right curve has a sharper slope. This is consistent with the recognition that AD is an aging disease: elder people above a certain age threshold tend to progress faster from MCI to AD.
Both Tables 1 and 2 show that none of the biomarkers are significant, whereas they are statistically significant in the analysis of [14]. One possible conjecture is that our nonparametric transformation functions may have well captured the effects of unobserved confounders, which may leave little to be explained by the observed covariates. More thorough investigation is warranted.
5. Future directions
Following [12], we can extend the transformation model (6) by allowing the error function
where
In a slightly different context, we may also consider clustering observations by developing a new nested beta-Dirichlet process prior with companion MCMC algorithms. As there are limited works on functional random effects models that accommodate clustering structures observed, for example, from neural studies, we may propose a nested Dirichlet process [19] as the prior of Dirichlet process to cluster cumulative distribution functions successfully. We envision that such a nested Bayesian procedure will provide substantial computational expedience for practitioners and can certainly be applied to studies that cover beyond the neurodegenerative and aging diseases.
Acknowledgments
Shen’s research is partially supported by Beijing Natural Science Foundation 1192006 and National Natural Science Foundation of China; Liu’s research is partially supported by General Research Fund, Research Grants Council, Hong Kong, 15327216, and the Hong Kong Polytechnic University grant YBTR. Data collection and sharing for this project were funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research Development, LLC.; Johnson & Johnson Pharmaceutical Research Development LLC.; Lumosity; Lundbeck; Merck Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.
Data used in preparation of this article were obtained from the Alzheimers Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.
References
- 1.
Harville DA. Extension of the Gauss–Markov theorem to include the estimation of random effect. The Annals of Statistics. 1976; 4 :384-395 - 2.
Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics. 1982; 38 (2):963-974 - 3.
Li Y. Random effect models. In: Pham H, editor. Springer Handbook of Engineering Statistics. London: Spring-Verlag; 2006. pp. 687-704 - 4.
Zeger SL, Diggle PJ. Semiparametric models for longitudinal data with application to CD4 cell numbers in HIV seroconverters. Biometrics. 1994; 50 (3):689-699 - 5.
Li Y, Lin X, Müller P. Bayesian inference in semiparametric mixed models for longitudinal data. Biometrics. 2010; 66 (1):70-78 - 6.
Li Y, Mller P, Lin X. Center-adjusted inference for a nonparametric Bayesian random effect distribution. Statistica Sinica. 2011; 21 :1201-1223 - 7.
Li Z, Xu X, Shen J. Semiparametric Bayesian analysis of accelerated failure time models with cluster structures. Statistics in Medicine. 2017; 36 (25):3976-3989 - 8.
Shen J, Yu H, Yang J, Liu C. Semiparametric Bayesian analysis for longitudinal mixed effects models with non-normal AR(1) errors. Statistics and Computing. 2019; 29 (3):571-583 - 9.
Ghosal S, van der vaart A. Fundamentals of Nonparametric Bayesian Inference. Cambridge: Cambridge University Press; 2017 - 10.
Cox DR. Regression models and life tables (with discussion). Journal of the Royal Statistical Society: Series B. 1972; 34 (2):187-220 - 11.
Hanson T, Yang M. Bayesian semiparametric proportional odds models. Biometrics. 2007; 63 (1):88 - 12.
Horowitz JL. Semiparametric estimation of a regression model with an unknown transformation of the dependent variable. Econometrica. 1996; 64 (1):103-137 - 13.
Linton O, Sperlich S, Keilegom IV. Estimation of a semiparametric transformation model. Annals of Statistics. 2008; 36 (2):686-718 - 14.
Li K, Luo S. Functional joint model for longitudinal and time-to-event data: An application to Alzheimer’s disease. Statistics in Medicine. 2017; 36 (25):3560-3572 - 15.
Müller P, Mitra R. Bayesian nonparametric inference why and how. Bayesian Analysis. 2013; 8 (2):269-302 - 16.
Kalbfleisch JD. Non-parametric bayesian analysis of survival time data. Journal of the Royal Statistical Society. 1978; 40 (2):214-221 - 17.
Hjort N. Nonparametric bayes estimators based on beta processes in models for life history data. Annals of Statistics. 1990; 18 (3):1259-1294 - 18.
Mallick B, Walker S. A Bayesian semiparametric transformation model incorporating frailties. Journal of Statistical Planning and Inference. 2007; 112 (1):159-174 - 19.
Rodriguez A, Dunson DB, Gelfand AE. The nested Dirichlet process. Journal of the American Statistical Association. 2008; 103 (483):1131-1154 - 20.
Teh YW, Görür D, Ghahramani Z. Stick-breaking construction for the Indian buffet process. In: Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics; Vol. 2; 2007. pp. 556-563 - 21.
Geyer CJ. Introduction to Markov chain Monte Carlo. In: Brooks S, Gelman A, Jones G, Meng X-L, editors. Handbook of Markov Chain Monte Carlo. Boca Raton: Chapman and Hall/CRC; 2011. pp. 3-48