Simulation results of 10,000 replicates for λ = 0, .5, 1, 2, 5 and each prior.
Abstract
Dose-response models are applied to animal-based cancer risk assessments and human-based clinical trials, usually with small samples. For sparse data, we rely on a parametric model for efficiency, but posterior inference can be sensitive to the assumed model. In addition, when we utilize prior information, multiple experts may have different prior knowledge about the parameter of interest. When we make sequential decisions to allocate experimental units in an experiment, the outcome may depend on the decision rule, and each decision rule has its own perspective. In this chapter, we address three practical issues in small-sample dose-response studies: (i) model sensitivity, (ii) disagreement in prior knowledge and (iii) conflicting perspectives in decision rules.
Keywords
- dose-response models
- model-sensitivity
- model-averaging
- prior-sensitivity
- consensus prior
- Bayesian decision theory
- individual-level ethics
- population-level ethics
- Bayesian adaptive designs
- sequential decisions
- continual reassessment method
- c-optimal design
- Phase I clinical trials
1. Introduction
Dose-response modeling is often used to learn how the effect of an agent on a particular outcome changes with dose. It is widely applied to animal-based cancer risk assessments and human-based clinical trials. The sample size is typically small, so many statistical issues can arise from the limited amount of data. These issues include the impact of a misspecified model, prior sensitivity, and conflicting ethical perspectives in clinical trials. In this chapter, we focus on cases in which the outcome variable of interest is binary (a predefined event either happens or does not) when an experimental unit is exposed to a dose. The main ideas carry over to cases in which the outcome variable is continuous or discrete.
There are two different approaches to statistical inference. One is frequentist inference, in which we often rely on the sampling distribution of a statistic and on large-sample theory. The other is Bayesian inference. It is founded on Bayes’ Theorem, and it allows researchers to express prior knowledge independently of data. In a small-sample study, Bayesian inference can be more useful than frequentist inference because we can incorporate both the researcher’s prior knowledge and the observed data to make inference about the parameter of interest. Bayesian ideas are briefly introduced for dose-response modeling with a binary outcome in Section 2.
In a small-sample study, we often rely on a parametric model to gain statistical efficiency (i.e., less variance in parameter estimation), but our inference can be severely biased by the use of a wrong model. To account for model uncertainty, it is reasonable to specify multiple models and base inference on an average over them. In this regard, Bayesian model averaging (BMA) is a useful method to gain robustness [1]. The BMA method has a wide range of applications, and we focus on its application to animal-based cancer risk assessments in Section 3.
In clinical trials, study participants are real patients, and therefore we need to consider ethics carefully. There are conflicting perspectives of individual- and population-level ethics in early phase clinical trials. Individual-level ethics focuses on the benefit of trial participants, whereas population-level ethics focuses on the benefit of future patients, which may require some level of sacrifice from trial participants. We compare the two conflicting perspectives in clinical trials based on Bayesian decision theory, and we discuss a method that compromises between them in Section 4 [2, 3].
The sample size for an early phase (Phase I) clinical trial is often less than 30 subjects. With such sparse data, dose allocations for the first few patients and statistical inference for future patients depend heavily on the researcher’s prior knowledge. When multiple researchers have different prior knowledge about a parameter of interest, one compromise is to combine their prior elicitations and average them (i.e., a consensus prior) [4, 5]. When we average the prior elicitations, there are two different approaches to determining the weight of each prior elicitation: weights determined before observing data and weights determined after observing data. We discuss the operating characteristics of the two weighting methods in the context of Phase I clinical trials in Section 5.
2. Bayesian inference
In statistics, we address a research question by a parameter, which is often denoted by
The function
where
2.1. Example
Suppose we observe
It is known as the beta distribution with shape parameters
where
where
If the researcher fixed
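The conjugate update behind this example can be sketched in a few lines. The sketch assumes the setting is a binomial count with a beta prior on the event probability, which is what the beta posterior above suggests; the prior shape parameters and the data (3 events among 10 units) are hypothetical values chosen only for illustration.

```python
def beta_binomial_update(a, b, successes, trials):
    """Return the posterior Beta shape parameters after observing binomial data."""
    if trials < 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return a + successes, b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical illustration: uniform Beta(1, 1) prior, 3 events among 10 units.
prior_a, prior_b = 1.0, 1.0
events, units = 3, 10

post_a, post_b = beta_binomial_update(prior_a, prior_b, events, units)
post_mean = beta_mean(post_a, post_b)

# The posterior mean is a weighted average of the prior mean and the sample
# proportion; the weight on the prior shrinks as the sample size grows.
prior_weight = (prior_a + prior_b) / (prior_a + prior_b + units)
weighted_avg = (prior_weight * beta_mean(prior_a, prior_b)
                + (1.0 - prior_weight) * events / units)

print(f"posterior: Beta({post_a:g}, {post_b:g}), mean = {post_mean:.4f}")
```

The weighted-average form makes the small-sample role of the prior explicit: with only 10 units, a Beta(1, 1) prior still carries one sixth of the total weight.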
2.2. Example
This example is simplified from Shao and Small [6]. In dose-response studies, we model
which is known as a logistic regression model. It is commonly assumed that a dose-response curve increases with respect to dose, so we assume
To express prior knowledge about
with an arbitrarily large value of
where
In animal-based studies, one parameter of interest is the median effective dose, which is denoted by ED50. It is the dose that satisfies
and it can be shown that
In 1997, the International Agency for Research on Cancer classified 2,3,7,8-Tetrachlorodibenzo-p-dioxin (known as TCDD) as a carcinogen for humans based on various empirical evidence [8]. In 1978, Kociba et al. presented data on male Sprague-Dawley rats at four experimental doses: 0, 1, 10 and 100 nanograms per kilogram per day (ng/kg/day) [9]. In the control dose group, nine of 86 rats developed a tumor (known as hepatocellular carcinoma); three of 50 rats developed the tumor at dose 1; 18 of 50 rats developed the tumor at dose 10; and 34 of 48 rats developed the tumor at dose 100 [6]. Without loss of generality, we let
where
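As a rough companion to this example, the tumor counts quoted above can be fitted by maximum likelihood, which coincides with the posterior mode under a flat prior. This is only a sketch under stated assumptions: it uses the logistic model logit π(x) = β0 + β1·x on the raw dose scale (the chapter may rescale dose), and the function name `fit_logistic` is hypothetical. Under that assumed model, the ED50 is the dose at which β0 + β1·x = 0, that is, −β0/β1.

```python
import math


def fit_logistic(doses, events, totals, iterations=50, tol=1e-10):
    """Newton-Raphson MLE for logit(p) = b0 + b1 * dose with grouped binary data."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for x, y, n in zip(doses, events, totals):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = y - n * p          # observed minus expected events
            w = n * p * (1.0 - p)      # binomial variance at this dose
            s0 += resid                # score for b0
            s1 += x * resid            # score for b1
            i00 += w                   # Fisher information entries
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        d0 = (i11 * s0 - i01 * s1) / det   # Newton step: information^{-1} * score
        d1 = (i00 * s1 - i01 * s0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Data quoted in the text: doses in ng/kg/day, rats with tumor, rats per group.
doses = [0.0, 1.0, 10.0, 100.0]
events = [9, 3, 18, 34]
totals = [86, 50, 50, 48]

b0_hat, b1_hat = fit_logistic(doses, events, totals)
ed50_hat = -b0_hat / b1_hat   # dose at which the fitted probability equals 0.5

# Fitted number of events across all dose groups (matches the observed total at the MLE).
fitted_events = sum(n / (1.0 + math.exp(-(b0_hat + b1_hat * x)))
                    for x, n in zip(doses, totals))

print(f"b0 = {b0_hat:.4f}, b1 = {b1_hat:.5f}, ED50 = {ed50_hat:.1f} ng/kg/day")
```

A fully Bayesian analysis would replace the point estimate with posterior draws of (β0, β1) and summarize the induced distribution of −β0/β1.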
3. Bayesian model averaging
In a small sample, we borrow the strength of a parametric model to gain efficiency in parameter estimation. However, an assumed model may not describe the true dose-response relationship adequately, and the impact of model misspecification is not negligible, particularly under a poor experimental design. In such a limited practical situation, Bayesian model averaging (BMA) can be a useful method to account for model uncertainty. It is widely applied in practice, and in this section we focus on its application to cancer risk assessment for the estimation of a benchmark dose [1, 6, 10, 11].
Let
In Eq. (12), the posterior density function
In Eq. (13), the prior model probability
In the BMA method, all
3.1. Example
This example is continued from the example in Section 2.2. Recall
or equivalently
In practice,
with the restrictions 0 <
Let
However, we are not able to calculate the 5th percentile of the model-averaged posterior distribution based on the given statistics. In fact, we need to approximate the posterior distribution
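One way to approximate a percentile of the model-averaged posterior is to pool posterior draws from each model and weight them by the posterior model probabilities. The sketch below assumes the standard BMA relationship in which the posterior model probability is proportional to the prior model probability times the marginal likelihood. The function names are hypothetical, and the per-model draws are simulated placeholders rather than results from the TCDD analysis.

```python
import math
import random


def posterior_model_probs(log_marginals, prior_probs):
    """Posterior model probabilities, proportional to prior prob x marginal likelihood."""
    log_terms = [math.log(p) + lm for p, lm in zip(prior_probs, log_marginals)]
    shift = max(log_terms)                      # log-sum-exp shift for stability
    unnorm = [math.exp(t - shift) for t in log_terms]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def model_averaged_quantile(draws_by_model, model_probs, q):
    """Quantile of the model-averaged posterior from per-model posterior draws."""
    pooled = []
    for draws, prob in zip(draws_by_model, model_probs):
        for value in draws:
            # Each draw carries an equal share of its model's posterior probability.
            pooled.append((value, prob / len(draws)))
    pooled.sort()
    cumulative = 0.0
    for value, weight in pooled:
        cumulative += weight
        if cumulative >= q:
            return value
    return pooled[-1][0]


# Placeholder inputs (hypothetical): two models with simulated benchmark-dose draws.
rng = random.Random(1)
draws_model_1 = [rng.lognormvariate(1.0, 0.3) for _ in range(2000)]
draws_model_2 = [rng.lognormvariate(2.0, 0.5) for _ in range(2000)]

probs = posterior_model_probs(log_marginals=[-12.1, -12.9], prior_probs=[0.5, 0.5])
bmd_5th = model_averaged_quantile([draws_model_1, draws_model_2], probs, q=0.05)

print("posterior model probabilities:", [round(p, 3) for p in probs])
print(f"5th percentile of the model-averaged draws: {bmd_5th:.3f}")
```

Weighting the pooled draws avoids a second layer of Monte Carlo error from resampling, and it makes clear why the model-specific percentiles alone are not enough: the mixture percentile depends on the full shape of each posterior.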
4. Application of Bayesian decision theory to Phase I trials
In a Phase I cancer trial, the main objectives are to study the safety of a new chemotherapy and to determine an appropriate dose for future patients. Since trial participants are cancer patients, dose allocations require ethical considerations. Whitehead and Williamson discussed several Bayesian approaches to dose allocations [14]. One decision rule is devised from the perspective of trial participants (individual-level ethics), and another decision rule is devised from the perspective of future patients (population-level ethics). However, a decision rule devised from population-level ethics is not widely accepted in current practice [15]. Instead, some proposed decision rules compromise between the individual- and population-level perspectives [3, 16]. In this section, we discuss the two conflicting perspectives in Phase I clinical trials and a compromising method based on Bayesian decision theory.
Assume a dose-response relationship follows a logistic model
where
A choice of
4.1. Parameter of interest: maximum tolerable dose
Let
At the end of a trial (observing
4.2. Prior density function: conditional mean priors
A consequence of sequential decisions heavily depends on a prior density function
Suppose a researcher selects two arbitrary doses, say
Using the Jacobian transformation from
It is known as conditional mean priors under the logistic model [7].
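The construction can be illustrated by simulation: draw the event probabilities at the two chosen doses from independent beta distributions, then solve for the regression coefficients they imply. This is a sketch under assumptions: it takes the logistic model to be logit π(x) = β0 + β1·x, and the doses, beta shape parameters, and function names are hypothetical. It does not impose any monotonicity restriction the chapter may use.

```python
import math
import random


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def induced_coefficients(pi_low, pi_high, x_low, x_high):
    """Solve logit(pi) = b0 + b1 * x at two doses for (b0, b1)."""
    b1 = (logit(pi_high) - logit(pi_low)) / (x_high - x_low)
    b0 = logit(pi_low) - b1 * x_low
    return b0, b1


def sample_conditional_mean_prior(n_draws, x_low, x_high, beta_low, beta_high, seed=0):
    """Draw (b0, b1) induced by independent beta priors at two doses."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        pi_low = rng.betavariate(*beta_low)     # prior on the probability at x_low
        pi_high = rng.betavariate(*beta_high)   # prior on the probability at x_high
        draws.append(induced_coefficients(pi_low, pi_high, x_low, x_high))
    return draws


# Hypothetical elicitation: about 10% toxicity at dose 1 and about 50% at dose 3.
prior_draws = sample_conditional_mean_prior(
    n_draws=5000, x_low=1.0, x_high=3.0, beta_low=(1.0, 9.0), beta_high=(5.0, 5.0)
)
share_increasing = sum(1 for _, b1 in prior_draws if b1 > 0) / len(prior_draws)

print(f"share of prior draws with an increasing curve: {share_increasing:.3f}")
```

Eliciting probabilities at two doses is often easier for a clinician than eliciting β0 and β1 directly, and the share of draws with β1 > 0 is a quick check on whether the elicitation is compatible with an increasing dose-response curve.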
4.3. Posterior density function: conjugacy
For notational convenience, we let
where
4.4. Loss functions for individual- and population-level ethics
A loss function, which reflects the perspective of individual-level ethics, is as follows:
This loss function is analogous to the original continual reassessment method proposed by O’Quigley et al. [17]. The squared error loss attempts to treat a trial participant at
From the perspective of population-level ethics, Whitehead and Brunier proposed a loss function, which is equal to the asymptotic variance of the maximum likelihood estimator for
where
where
is the gradient vector, the partial derivatives of
where
with the weight defined as
4.5. Loss function for compromising the two perspectives
Kim and Gillen proposed to accelerate the compromising process by modifying
where
is an accelerating factor [3]. It has two implications. First, the compromising process is accelerated toward the individual-level ethics as the trial proceeds (i.e.,
4.6. Simulation
To study the operating characteristics of
Let
Let
Table 1 summarizes simulation results of 10,000 replicates for each prior. For all three priors, we observe similar tendencies. First,
Prior | λ | | | | | |
---|---|---|---|---|---|---|
1 | 0 | 0.0964 | 0.0019 | 0.0126 | 2.4353 | 0.4318 |
1 | 0.5 | 0.1034 | 0.0024 | 0.0118 | 2.0997 | 0.2298 |
1 | 1 | 0.1082 | 0.0028 | 0.0113 | 1.8969 | 0.1714 |
1 | 2 | 0.1100 | 0.0031 | 0.0112 | 1.6929 | 0.1211 |
1 | 5 | 0.1157 | 0.0035 | 0.0106 | 1.3128 | 0.0596 |
2 | 0 | 0.1665 | 0.0054 | 0.0065 | 4.1217 | 0.9889 |
2 | 0.5 | 0.1705 | 0.0056 | 0.0065 | 3.9598 | 0.9877 |
2 | 1 | 0.1727 | 0.0060 | 0.0068 | 3.9025 | 0.9670 |
2 | 2 | 0.1751 | 0.0066 | 0.0072 | 3.8707 | 0.9291 |
2 | 5 | 0.1763 | 0.0067 | 0.0073 | 3.8442 | 0.9068 |
3 | 0 | 0.2743 | 0.0048 | 0.0103 | 6.1875 | 0.1600 |
3 | 0.5 | 0.2673 | 0.0048 | 0.0093 | 6.3954 | 0.1430 |
3 | 1 | 0.2606 | 0.0046 | 0.0083 | 6.6194 | 0.1165 |
3 | 2 | 0.2562 | 0.0045 | 0.0077 | 6.8035 | 0.1020 |
3 | 5 | 0.2499 | 0.0044 | 0.0068 | 7.0274 | 0.0760 |
In summary, when we place more emphasis on population-level ethics, we have a smaller variance in the estimation for future patients (with a greater absolute bias, potentially due to Jensen’s Inequality), and the distribution of
5. Consensus prior
In Bayesian inference, researchers are able to utilize information that is independent of observed data. It allows researchers to incorporate any form of information, such as one’s experience and the existing literature, which may be particularly useful in a small-sample study. On the other hand, we are concerned about subjectivity and prior sensitivity with sparse data. Furthermore, it is possible to have disagreement among multiple researchers’ prior elicitations about a parameter
Suppose there are
where
For a prior weighting scheme, we denote
where
Samaniego discussed self-consistency when compromised inference is used through the prior weighting scheme
be the prior expectation, the mean of the prior density function
Self-consistency can be achieved under simple models. For example, let
where
5.1. Binomial experiment
Let
Let
for the
If we allow individual-specific prior elicitation
so the self-consistency is satisfied.
For the posterior weighting scheme given data
where
If we desire an equal strength from each researcher’s prior elicitation, we may fix
Whether or not self-consistency is satisfied, the practical concern is the quality of estimation, such as bias, variance and mean square error. Assuming
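The two weighting schemes can be compared directly in the binomial setting. The sketch below assumes each researcher holds a beta prior and that, under the posterior weighting scheme, each prior weight is updated in proportion to that prior's marginal likelihood (the beta-binomial probability of the observed count), as in the posterior of a mixture prior. The priors, data, and function names are hypothetical.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def posterior_means(priors, y, n):
    """Posterior mean of the event probability under each Beta(a, b) prior."""
    return [(a + y) / (a + b + n) for a, b in priors]


def posterior_weights(priors, weights, y, n):
    """Data-dependent weights: prior weight x beta-binomial marginal likelihood."""
    # The binomial coefficient is common to all priors, so it cancels.
    log_terms = [math.log(w) + log_beta(a + y, b + n - y) - log_beta(a, b)
                 for (a, b), w in zip(priors, weights)]
    shift = max(log_terms)
    unnorm = [math.exp(t - shift) for t in log_terms]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def consensus_estimate(priors, weights, y, n):
    """Weighted average of the individual posterior means."""
    return sum(w * m for w, m in zip(weights, posterior_means(priors, y, n)))


# Hypothetical elicitations from two researchers with equal prior weights.
priors = [(1.0, 3.0), (3.0, 1.0)]
prior_w = [0.5, 0.5]
y_obs, n_obs = 9, 10

post_w = posterior_weights(priors, prior_w, y_obs, n_obs)
est_prior_weighting = consensus_estimate(priors, prior_w, y_obs, n_obs)
est_posterior_weighting = consensus_estimate(priors, post_w, y_obs, n_obs)

print("posterior weights:", [round(w, 3) for w in post_w])
print(f"prior weighting:     {est_prior_weighting:.4f}")
print(f"posterior weighting: {est_posterior_weighting:.4f}")
```

With 9 events in 10 trials, the data shift weight toward the researcher whose prior favored a high event probability, so the posterior-weighted estimate moves closer to the sample proportion. The prior-weighted estimate keeps the weights fixed, which is the source of its smaller variance and potentially larger bias.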
5.2. Applications to Phase I trials under logistic regression model
In this section, we apply the prior weighting scheme and the posterior weighting scheme to Phase I clinical trials under the logistic regression model. We consider the three priors considered in Section 4.6. We denote Prior 1, 2 and 3 by
The prior means were
For the simulation study, we consider three scenarios with sample size
Table 2 provides the simulation results of 10,000 replicates for each scenario under the prior weighting scheme and under the posterior weighting scheme. Since the posterior weighting scheme adaptively updates
Scenario | Method | | | | | |
---|---|---|---|---|---|---|
1 | Prior weighting | 0.0967 | 0.0014 | 0.0121 | 1.1090 | 0.0398 |
1 | Posterior weighting | 0.1853 | 0.0073 | 0.0075 | 2.7304 | 0.5900 |
2 | Prior weighting | 0.2018 | 0.0059 | 0.0059 | 3.8432 | 0.9042 |
2 | Posterior weighting | 0.2048 | 0.0110 | 0.0110 | 4.2848 | 0.8920 |
3 | Prior weighting | 0.2929 | 0.0071 | 0.0157 | 7.1090 | 0.0568 |
3 | Posterior weighting | 0.1951 | 0.0133 | 0.0133 | 4.6036 | 0.8646 |
The simulation results are analogous to the simpler model in Section 5.1. When the true parameter is not well surrounded by prior guesses, the posterior weighting scheme is preferable with respect to mean square error due to smaller bias. When the true parameter is well surrounded by prior guesses, the prior weighting scheme is beneficial with respect to mean square error due to smaller variance.
As a final comment, we shall be careful about the strength of individual prior elicitations when we implement the posterior weighting scheme in Phase I clinical trials. The strength of individual prior elicitations depends on (i) the hyper-parameters
When researchers determine consensus prior elicitations before initiating a trial, the multiplicative term
6. Concluding remarks
In this chapter, we have discussed Bayesian inference with averaging, balancing, and compromising in sparse data. In the cancer risk assessment, we have observed that low-dose inference can be very sensitive to an assumed parametric model (Section 3.1). In this case, Bayesian model averaging can be a useful method. It provides robustness by using multiple models and posterior model probabilities to account for model uncertainty. In the application of Bayesian decision theory to Phase I clinical trials, we have observed that the sequential sampling scheme depends heavily on the loss function. A loss function devised from individual-level ethics focuses on the benefit of trial participants, and a loss function devised from population-level ethics focuses on the benefit of future patients. It is possible to balance the two conflicting perspectives, and we can adjust the point of balance with the tuning parameter (Sections 4.5 and 4.6). Finally, a weighted posterior estimate can serve as a compromise when two or more researchers disagree in their prior knowledge. We have compared the prior and posterior weighting schemes in a small-sample binomial problem (Section 5.1) and in a small-sample Phase I clinical trial (Section 5.2). The prior weighting scheme (data-independent weights) performs better when the prior estimates surround the truth, and the posterior weighting scheme (data-dependent weights) performs better when the truth is not well surrounded by the prior estimates. Neither method outperforms the other for all parameter values, so it is important to be aware of the bias-variance tradeoff between them.
References
1. Raftery AE, Madigan D, Hoeting JA. Bayesian model averaging for linear regression models. Journal of the American Statistical Association. 1997;92:171-191
2. Whitehead J, Williamson D. Bayesian decision procedures based on logistic regression models for dose-finding studies. Journal of Biopharmaceutical Statistics. 1998;8:445-467
3. Kim SB, Gillen DL. A Bayesian adaptive dose-finding algorithm for balancing individual- and population-level ethics in Phase I clinical trials. Sequential Analysis. 2016;35(4):423-439
4. Samaniego FJ. A Comparison of the Bayesian and Frequentist Approaches to Estimation. New York: Springer; 2010
5. Kim SB, Gillen DL. An alternative perspective on consensus priors with applications to Phase I clinical trials. Jacobs Journal of Biostatistics. 2016;1(1):006
6. Shao K, Small MJ. Potential uncertainty reduction in model-averaged benchmark dose estimates informed by an additional dose study. Risk Analysis. 2011;31:1156-1175
7. Bedrick EJ, Christensen R, Johnson W. A new perspective on priors for generalized linear models. Journal of the American Statistical Association. 1996;91(436):1450-1460
8. International Agency for Research on Cancer. IARC Monographs on the Evaluation of Carcinogenic Risks to Humans. Vol. 69. Lyon: IARC; 1997. ISBN 92-832-1269-X
9. Kociba RJ, Keyes DG, Beyer JE, Carreon RM, Wade CE, Dittenber DA, et al. Results of a two-year chronic toxicity and oncogenicity study of 2,3,7,8-tetrachlorodibenzo-p-dioxin in rats. Toxicology and Applied Pharmacology. 1978;46(2):279-303
10. Hoeting JA, Madigan D, Raftery AE, Volinsky CT. Bayesian model averaging: a tutorial. Statistical Science. 1999;14(4):382-417
11. Simmons SJ, Chen C, Li X, Wang Y, Piegorsch WW, Fang Q, Hu B, Dunn GE. Bayesian model averaging for benchmark dose estimation. Environmental and Ecological Statistics. 2015;22(1):5-16
12. Crump KS. A new method for determining allowable daily intakes. Fundamental and Applied Toxicology. 1984;4:854-871
13. EPA (US Environmental Protection Agency). Benchmark dose technical guidance, EPA/100/R-12/001, Risk Assessment Forum. Washington, DC: U.S. Environmental Protection Agency; 2012
14. Whitehead J, Williamson D. Bayesian decision procedures based on logistic regression models for dose-finding studies. Journal of Biopharmaceutical Statistics. 1998;8:445-467
15. O’Quigley J, Conaway M. Continual reassessment and related dose-finding designs. Statistical Science. 2010;25:202-216
16. Bartroff J, Lai TL. Incorporating individual and collective ethics into Phase I cancer trial designs. Biometrics. 2011;67:596-603
17. O’Quigley J, Pepe M, Fisher L. Continual reassessment method: A practical design for Phase 1 clinical trials in cancer. Biometrics. 1990;46:33-48
18. Whitehead J, Brunier H. Continual reassessment method: Bayesian decision procedures for dose determining experiments. Statistics in Medicine. 1995;14:885-893