Groundbreaking new drugs called direct-acting antivirals have been introduced recently for the treatment of chronic Hepatitis C virus infection. We introduce a mathematical model for Hepatitis C dynamics treated with the direct-acting antiviral drug telaprevir, alongside traditional interferon and ribavirin treatments, to understand how this combination therapy affects the viral load of patients exhibiting different types of response. We use sensitivity and identifiability techniques to determine which model parameters can be best estimated from viral load data. Parameter estimation with these best estimable parameters is then performed to give patient-specific fits of the model to partial virologic response, sustained virologic response and breakthrough patients.
- hepatitis C dynamics
- inverse problem
- subset selection
- sensitivity analysis
- identifiability analysis
- automatic differentiation
An estimated 200–300 million people worldwide are infected with the Hepatitis C virus (HCV), a virus that affects the liver and was discovered in 1989 . It is usually spread by blood-to-blood contact via intravenous drug use, poorly sterilized medical equipment and transfusions. Scarring of the liver and ultimately cirrhosis are among the more severe complications associated with HCV .
Six different genotypes of HCV exist due to its highly error-prone RNA polymerase, with the most common being genotype 1, which has the lowest levels of response to standard treatment [3, 4]. Genotype 1 patients have about a 50% chance of sustained virologic response (SVR), while non-genotype 1 patients have about an 80% chance of SVR . The clinical data used for this study were provided by the University of Sao Paulo, School of Medicine in Sao Paulo, Brazil and consist of genotype 1 patients.
One of the first treatments for HCV was 6–12 months of monotherapy with interferon glycoproteins. Interferon is naturally secreted by our bodies to fight off infection, and monotherapy treatment with it is associated with around 10% SVR . The addition of ribavirin (RBV), a drug believed to render some of the virus non-infectious, increased SVR to around 30% . RBV monotherapy is not recommended because it does not give a significant benefit to SVR . Until recently, the most common therapy was a combination of pegylated interferon (IFN) and RBV for 24–48 weeks that yielded about a 45% SVR [5, 6]. One of the major differences between IFN and standard interferon glycoproteins is that the pegylation allows the drug to stay in the body longer . There have also been clinical trials with RBV monotherapy before and after IFN + RBV therapy, as described in . Recently, new drugs called direct-acting antiviral agents (DAAs) have raised the chance for SVR for HCV patients.
DAAs give an increase to about an 80% chance of SVR for genotype 1 . According to the FDA, DAAs are drugs that interfere with specific steps in the HCV replication cycle by taking advantage of the biological makeup of HCV . HCV is a single-stranded RNA molecule that is several thousand nucleotides in length. During HCV’s life cycle, it is translated into a polyprotein that is processed into structural and nonstructural proteins that aid in replication. During post-translational processing, DAAs called protease inhibitors block a key protease in the replication process, hindering further infection [10, 12]. Among the protease inhibitors available are boceprevir, telaprevir and simeprevir. Simeprevir is recommended over telaprevir and boceprevir because of both improved efficacy and fewer side effects, but telaprevir continues to be used because of its cost efficiency [13, 14].
Integration of mathematical modeling of viral dynamics with clinical data has led to further understanding of how different treatment strategies dictate viral load dynamics. One of the first mathematical models, given by Neumann et al., attempted to describe HCV dynamics under interferon monotherapy . Improvements were made to Neumann’s model to better describe different mechanisms in the liver during treatment, including the regeneration of liver cells. Adjustments were also made to include the standard of care, IFN and RBV. Some of these modifications can be found in [5, 15]. In particular, Snoeck et al.  had data after the end of the treatment phase, so their model can give a more accurate representation of its prediction of SVR. The introduction of DAAs has ushered in more mathematical models that include this type of therapy . For example, mathematical models have been proposed using telaprevir monotherapy [17, 18, 19, 20] and in combination with IFN and RBV , the latter using Bayesian feedback to estimate the parameters in the model. A challenge that comes with modeling DAAs is that, since they are relatively new, there is not as much data available . In particular, it can be difficult to predict SVR because the recent approval of these drugs means there is a lack of data after the treatment phase ends.
This chapter introduces a novel approach for the development of a mathematical model describing HCV dynamics given the triple-drug combination treatment of IFN, RBV and the DAA telaprevir. In Section 2, we describe how we adapted a previously known HCV model to include telaprevir, and the available clinical data. Section 3 describes the a priori analysis of sensitivity and identifiability and its incorporation into the parameter estimation problem. Section 4 gives the parameter estimation results using patient-specific clinical data, including partial virologic response, sustained virologic response and breakthrough. Finally, concluding remarks are provided in Section 5.
2. Mathematical models of HCV dynamics
The original model for HCV dynamics in Neumann et al.  was frequently used to assess viral-load profiles after short-term treatment and is given by
\[
\frac{dT}{dt} = s - dT - (1-\eta)\,\beta V T, \qquad
\frac{dI}{dt} = (1-\eta)\,\beta V T - \delta I, \qquad
\frac{dV}{dt} = (1-\varepsilon)\,p I - c V, \tag{1}
\]
where $T$ and $I$ denote the concentrations of healthy and infected hepatocytes, and $V$ represents the viral concentration in the liver fluid. One of the key contributions of the model was the understanding of the mechanism of IFN: it was unknown whether the drug acted through $\eta$ (i.e., inhibiting the infection of healthy liver cells) or through $\varepsilon$ (i.e., reducing virion production in infected cells). In , it is determined that IFN acts through $\varepsilon$, which inhibits production of the virus. The drawback to (1) is that it cannot describe patients exhibiting breakthrough, relapse, and most importantly SVR. These responses are reasons that early viral response does not uniformly predict responses in the long term. Another important aspect is the handling of viral load measurements below the lower limit of quantification (LLOQ). Previous analyses omitted the data below the LLOQ, but these data can contain critical information regarding long-term treatment outcome. Snoeck et al.  present a mathematical model for the dynamics of HCV under the drug treatment combination of IFN and RBV that attempts to address both the long-term responses and the use of the LLOQ. The model described in  is given by the following system of nonlinear differential equations
\[
\begin{aligned}
\frac{dT}{dt} &= s + r\,T\left(1 - \frac{T+I}{T_{\max}}\right) - dT - \beta V_I T,\\
\frac{dI}{dt} &= \beta V_I T + r\,I\left(1 - \frac{T+I}{T_{\max}}\right) - \delta I,\\
\frac{dV_I}{dt} &= \big(1-\rho(t)\big)\big(1-\varepsilon(t)\big)\,p\,I - c\,V_I,\\
\frac{dV_{NI}}{dt} &= \rho(t)\big(1-\varepsilon(t)\big)\,p\,I - c\,V_{NI},
\end{aligned}
\tag{2}
\]
where $T$ (uninfected hepatocytes), $I$ (infected hepatocytes), $V_I$ (infectious virions) and $V_{NI}$ (noninfectious virions) are the natural states (international units, IU/mL). This model was adapted from a standard model of viral infection . The number of uninfected hepatocytes increases each day with production rate $s$ and regeneration rate $r$, and decreases each day as those hepatocytes die naturally at a rate $d$ or become infected at a rate $\beta$. The maximum number of hepatocytes per mL is $T_{\max}$. The number of infected hepatocytes increases when the healthy liver cells are infected and when the infected cells regenerate themselves, and decreases when they die off naturally at a rate $\delta$. Infected hepatocytes produce both infectious and noninfectious virions at a rate $p$, and virions are naturally cleared at a rate $c$. IFN inhibits virus production, while RBV renders some of the virus noninfectious. The drug efficacies of IFN and RBV are represented by $\varepsilon$ and $\rho$, respectively, with bounds $0 \le \varepsilon \le 1$ and $0 \le \rho \le 1$; the more effective the drug, the closer its efficacy is to 1. Snoeck uses data that extend beyond treatment for patients, so the time dependence of $\varepsilon(t)$ and $\rho(t)$ in (2) accounts for the exponential decay of the drug efficacies after treatment has ceased. The exponential decay of the drug efficacies is given by
\[
\varepsilon(t) = \varepsilon\, D(t), \qquad \rho(t) = \rho\, D(t), \tag{3}
\]
where $k$ is the efficacy decay rate, $t_{\mathrm{end}}$ marks the end of treatment, and
\[
D(t) = \begin{cases} 1, & t \le t_{\mathrm{end}},\\ e^{-k\,(t - t_{\mathrm{end}})}, & t > t_{\mathrm{end}}. \end{cases} \tag{4}
\]
The drug efficacies $\varepsilon$ and $\rho$ are related to the drug dosage levels by the following expressions
\[
\varepsilon = \frac{\mathrm{Dose}}{\mathrm{ED}_{50} + \mathrm{Dose}}, \tag{5}
\]
\[
\rho = \frac{\mathrm{Dose}_{\mathrm{RBV}}}{\mathrm{ED}_{50,\mathrm{RBV}} + \mathrm{Dose}_{\mathrm{RBV}}}, \tag{6}
\]
where Dose is the weekly subcutaneous dose of IFN and $\mathrm{ED}_{50}$ is the estimated weekly dose that causes 50% inhibition of virion production; $\mathrm{Dose}_{\mathrm{RBV}}$ represents the daily dose of RBV per kg of body weight, and $\mathrm{ED}_{50,\mathrm{RBV}}$ represents the estimated daily dose in mg/kg that renders 50% of the virions noninfectious. Biologically, all state variables and parameters are non-negative. Typical values for model parameters used by Snoeck et al.  are given in Table 1.
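As a concrete sketch, the dose-efficacy relation and the post-treatment decay can be coded as follows. The Emax-style form is one standard expression consistent with the ED50 description above (a dose equal to the ED50 yields 50% efficacy), and all names and values are assumptions for illustration, not the fitted quantities of the study:

```python
import math

def efficacy(dose, ed50):
    """Emax-type dose-efficacy relation: dose == ed50 gives 50% efficacy.
    (Illustrative form consistent with the ED50 description; the exact
    expression used in the study may differ.)"""
    return dose / (ed50 + dose)

def decayed_efficacy(eps0, k, t, t_end):
    """Constant efficacy eps0 during treatment; exponential decay at
    rate k per day once treatment ends at time t_end (cf. (3)-(4))."""
    return eps0 if t <= t_end else eps0 * math.exp(-k * (t - t_end))
```

For example, `efficacy(180.0, 180.0)` returns 0.5, and `decayed_efficacy` leaves the efficacy untouched before `t_end` while shrinking it exponentially afterward.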
2.1. HCV model with DAA
Snoeck’s model is adapted to incorporate the DAA telaprevir. Recall that a DAA targets specific parts of the genome of the virus to inhibit both replication and infection; hindering replication of the virus in the infected hepatocytes results in the virus not being produced by those cells. This means that the DAA could be implemented in the infection term of (2), to model inhibited infection, and in the viral production terms, to model inhibited replication. However, after simulations and analysis, it is concluded in this study that the obstruction of the infection and replication of the virus by telaprevir can be described solely as an amplifier mitigating the production of virions alongside IFN. With this assumption, the model in  is modified to include the triple drug combination of IFN, RBV and telaprevir as follows:
\[
\begin{aligned}
\frac{dT}{dt} &= s + r\,T\left(1 - \frac{T+I}{T_{\max}}\right) - dT - \beta V_I T,\\
\frac{dI}{dt} &= \beta V_I T + r\,I\left(1 - \frac{T+I}{T_{\max}}\right) - \delta I,\\
\frac{dV_I}{dt} &= \big(1-\rho(t)\big)\big(1-\varepsilon(t)\big)\big(1-\varepsilon_{\mathrm{TVR}}(t)\big)\,p\,I - c\,V_I,\\
\frac{dV_{NI}}{dt} &= \rho(t)\big(1-\varepsilon(t)\big)\big(1-\varepsilon_{\mathrm{TVR}}(t)\big)\,p\,I - c\,V_{NI},
\end{aligned}
\tag{7}
\]
where $\varepsilon_{\mathrm{TVR}}(t)$ represents the telaprevir efficacy, whose exponential decay after the end of treatment is defined in the same manner as for $\varepsilon(t)$ and $\rho(t)$ (see (3) and (4)). In , existence and uniqueness of solutions to this updated HCV dynamical model were established, and a steady-state stability analysis was also performed.
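The dynamics described in this section can be sketched with a simple forward-Euler simulation. The right-hand side below follows the prose description (logistic regeneration, infection, virion production partly blocked by the combined drug efficacies); every parameter name and value is purely illustrative, not a fitted clinical value:

```python
def rhs(state, p, eps_prod):
    """Right-hand side of a Snoeck-type system with an overall
    production-blocking efficacy eps_prod (IFN and telaprevir combined).
    All parameter names and values used here are illustrative only."""
    T, I, VI, VNI = state
    regen = p["r"] * (1.0 - (T + I) / p["Tmax"])   # logistic regeneration
    dT = p["s"] + regen * T - p["d"] * T - p["beta"] * VI * T
    dI = p["beta"] * VI * T + regen * I - p["delta"] * I
    dVI = (1.0 - p["rho"]) * (1.0 - eps_prod) * p["prod"] * I - p["c"] * VI
    dVNI = p["rho"] * (1.0 - eps_prod) * p["prod"] * I - p["c"] * VNI
    return (dT, dI, dVI, dVNI)

def simulate(state, p, eps_prod, days, dt=0.01):
    """Forward-Euler integration: crude, but enough for a qualitative sketch."""
    for _ in range(int(round(days / dt))):
        derivs = rhs(state, p, eps_prod)
        state = tuple(x + dt * dx for x, dx in zip(state, derivs))
    return state

# Toy parameter set and initial condition (hypothetical, not fitted values).
params = {"s": 10.0, "r": 0.1, "Tmax": 1e5, "d": 0.01,
          "beta": 1e-6, "delta": 0.2, "prod": 10.0, "c": 5.0, "rho": 0.3}
start = (1e4, 1e3, 1.4e3, 6e2)
treated = simulate(start, params, eps_prod=0.99, days=10.0)
untreated = simulate(start, params, eps_prod=0.0, days=10.0)
```

With a high production-blocking efficacy the simulated viral load falls well below the untreated trajectory, mirroring the qualitative effect of the triple therapy.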
2.2. Treatment schedule
The data in this research follow the treatment schedule timeline below (also summarized in Figure 1).
The patient is treated with the triple drug combination of IFN + RBV + telaprevir for the first 12 weeks.
If at 12 weeks the viral load is greater than 1000 IU/mL, then discontinue treatment. Otherwise, continue with a 12-week treatment of IFN + RBV.
If at 24 weeks the viral load is at or above the LLOQ (12–15 IU/mL), then discontinue treatment. Otherwise, continue with a 12-week treatment of IFN + RBV.
If at 36 weeks the viral load is at or above the LLOQ, then discontinue treatment. Otherwise, continue with a 12-week treatment of IFN + RBV.
End of treatment at 48 weeks.
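The decision points above can be collected into a small helper. The direction of each comparison (discontinue on a high or still-detectable viral load) is our reading of these futility-style stopping rules, so treat it as an interpretation rather than the trial protocol itself:

```python
LLOQ = 15.0  # IU/mL, the lower limit of quantification used in this study

def decision(week, viral_load):
    """Response-guided schedule sketched from the timeline above."""
    if week == 12:
        return "discontinue" if viral_load > 1000.0 else "continue IFN+RBV"
    if week in (24, 36):
        return "discontinue" if viral_load >= LLOQ else "continue IFN+RBV"
    if week >= 48:
        return "end of treatment"
    return "continue current regimen"
```

For example, a patient with 5000 IU/mL at week 12 is stopped, while one who is undetectable at week 24 continues on IFN + RBV.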
3. Subset selection
The forward problem refers to using a model to predict the future behavior of a system given a set of parameters. The inverse problem, on the other hand, is the parameterization of a model from empirical data [22, 23, 24]. There have been extensive studies about parameter selection while solving the inverse problem for biological models and other applications that can be found in [3, 22, 25, 26, 27] and references therein. In this study, we use a simple algorithm to choose a subset of parameters to be estimated from clinical data based on both sensitivity and identifiability as follows:
Start with the full parameter set.
Remove the parameters that are not locally sensitive.
From the remaining parameters, remove those that are not locally identifiable to obtain the sensitive and identifiable parameter set.
Since these are local analyses, this procedure is repeated over a large number of parameter sets, and the parameters that appear most often in the final sets are the parameters that are estimated. All other parameter values are fixed to values from the literature. A biological and structural explanation for some of the fixed parameters is given in the next section.
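The repeated screening procedure can be sketched as a tally. The two screens below are hypothetical placeholders standing in for the sensitivity and identifiability analyses described in this section:

```python
def select_parameters(param_sets, is_sensitive, is_identifiable):
    """Tally how often each parameter survives the sensitivity screen and
    then the identifiability screen, across many sampled parameter sets.
    Both screens are placeholders for the local analyses of this section."""
    counts = {}
    for theta in param_sets:
        kept = [name for name in theta if is_sensitive(name, theta)]
        kept = [name for name in kept if is_identifiable(name, kept, theta)]
        for name in kept:
            counts[name] = counts.get(name, 0) + 1
    return counts

# Toy screens (hypothetical): a parameter is "sensitive" if its sampled
# value exceeds a cutoff, and every sensitive parameter is "identifiable".
sets = [{"a": 1.0, "b": 10.0}, {"a": 2.0, "b": 20.0}]
tally = select_parameters(sets,
                          is_sensitive=lambda n, th: th[n] > 1.5,
                          is_identifiable=lambda n, kept, th: True)
```

The parameters with the highest tallies are the ones one would carry forward to estimation, fixing the rest from the literature.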
3.1. Fixed parameters
The assumptions for fixed parameters are the same as in . The maximum number of hepatocytes per mL is obtained by dividing the total number of hepatocytes in the liver by the volume of plasma and extracellular fluid over which HCV RNA is distributed. The natural death rate of hepatocytes follows from their turnover being roughly every 300 days, and the hepatocyte production rate can then be deduced from the steady state in the absence of liver disease. One rate parameter is always fixed because it appears only in products with other parameters, making the individual factors impossible to estimate uniquely. The rest of the parameters are considered in the sensitivity analysis.
3.2. Sensitivity analysis
A sensitivity analysis is the process of understanding how the model output is affected by changes in the parameters. Sensitivity analyses are used in many branches of mathematics such as statistics, partial differential equations (PDEs), and control design [28, 29]. The parameters that produce the most change in the output are said to be sensitive parameters. In the forward problem, this is important because it reveals which parameters carry the most useful information. For the inverse problem, a sensitivity analysis is usually performed to determine the sensitive parameters to estimate, while parameters with minimal impact are fixed to values from the literature. There are two different types of sensitivity analysis: global and local. A global sensitivity analysis heavily depends on the structure of the model and quantifies how uncertainties in outputs can be apportioned to uncertainties in inputs. We refer the reader to  for a more comprehensive discussion. Our study uses a local sensitivity analysis that depends on the prescribed values of the parameters.
3.2.1. Sensitivity equations
The sensitivity analysis presented in this section uses a derivative-based approach. Consider the general form of an ODE model and a function of its output,
\[
\frac{dx}{dt} = f(x, \theta), \qquad y = g(x, \theta), \tag{8}
\]
whereby the vectors $x$ and $\theta$ contain the variables and parameters of the model, respectively. Since we are concerned with how our model output, $y$, is influenced by changes to our parameters, $\theta$, we consider the partial derivative of $y$ with respect to $\theta$. One approach to computing this partial derivative is solving the associated sensitivity equations. Differentiating both sides of the output Eq. (8) with respect to the parameter $\theta$ yields
\[
\frac{\partial y}{\partial \theta} = \frac{\partial g}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial g}{\partial \theta}, \tag{9}
\]
since $y$ depends on $\theta$ both directly and through the states $x(t;\theta)$. The two components $\partial g/\partial x$ and $\partial g/\partial \theta$ can be directly calculated from $g$, but this can be cumbersome to do by hand depending on the complexity of the function $g$. Thus, one can employ automatic differentiation to evaluate these derivatives. Since any mathematical function can be decomposed into elementary functions, automatic differentiation numerically implements the chain rule and basic arithmetic operations repeatedly to compute the total derivative of a function with accuracy to working machine precision . This is achieved with table lookups and tabulating all the functional compositions [32, 33]. An automatic differentiation (AD) code developed by Martin Fink in MATLAB was employed . Finally, to calculate $\partial x/\partial \theta$, it is noted that $f$ is continuous in $x$ and $\theta$. Since $\partial x/\partial \theta$ exists, by taking the partial derivative of the state equations with respect to $\theta$ and reversing the order of differentiation , we obtain
\[
\frac{d}{dt}\left(\frac{\partial x}{\partial \theta}\right) = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial \theta}. \tag{10}
\]
Similar to $\partial g/\partial x$ and $\partial g/\partial \theta$, the derivatives $\partial f/\partial x$ and $\partial f/\partial \theta$ are calculated using automatic differentiation. From (10), the sensitivity equations are given by the following coupled system of differential equations:
\[
\frac{dx}{dt} = f(x, \theta), \qquad \frac{d}{dt}\left(\frac{\partial x}{\partial \theta}\right) = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial \theta}.
\]
Solving the sensitivity equations yields $\partial x/\partial \theta$, which, in turn, gives $\partial y/\partial \theta$ from (9).
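The two ingredients above, derivatives obtained by forward-mode automatic differentiation and integration of the sensitivity equation alongside the state, can be illustrated on a toy problem. This is a minimal sketch, not the MATLAB AD code used in the study, and the decay model dV/dt = −cV with its values is hypothetical:

```python
import math

class Dual:
    """Minimal forward-mode AD number carrying (value, derivative)."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der
    def _lift(self, other):
        return other if isinstance(other, Dual) else Dual(float(other))
    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__
    def __mul__(self, other):
        other = self._lift(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__
    def __neg__(self):
        return Dual(-self.val, -self.der)

def value_and_derivative(f, x):
    """Evaluate f and df/dx in one pass by seeding the derivative with 1."""
    out = f(Dual(x, 1.0))
    return out.val, out.der

def decay_sensitivity(c, v0, t_end, dt=1e-4):
    """Forward-Euler solve of the toy model dV/dt = -c V together with its
    sensitivity equation d/dt (dV/dc) = (df/dV)(dV/dc) + df/dc,
    with df/dV and df/dc obtained by automatic differentiation."""
    v, s = v0, 0.0                      # s approximates dV/dc; zero at t = 0
    for _ in range(int(round(t_end / dt))):
        _, dfdv = value_and_derivative(lambda V: -c * V, v)   # df/dV = -c
        _, dfdc = value_and_derivative(lambda cc: -cc * v, c) # df/dc = -v
        v, s = v + dt * (-c * v), s + dt * (dfdv * s + dfdc)
    return v, s
```

For c = 0.5 and v0 = 2, the computed sensitivity at t = 1 should approach the analytic value −t·v0·e^(−ct) ≈ −1.213, which checks the numerical machinery against a case solvable by hand.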
3.2.2. Model considerations and sensitivity results
The sensitivities of each parameter are ranked to determine which parameters are most sensitive. Since there is a large range of parameter and viral load values, each parameter $\theta_j$ is log scaled in association with the output $y$, that is,
\[
\frac{\partial \log y}{\partial \log \theta_j} = \frac{\theta_j}{y}\,\frac{\partial y}{\partial \theta_j} \tag{11}
\]
is considered instead of $\partial y/\partial \theta_j$. This allows a comparison of the sensitivities of each parameter using similar magnitudes. The $L^2$ norm is used to nondimensionalize the sensitivities over time, so the following sensitivity coefficient is considered for each parameter:
\[
S_j = \left\| \frac{\theta_j}{y}\,\frac{\partial y}{\partial \theta_j} \right\|_2. \tag{12}
\]
Eq. (12) is defined to be the relative ranking sensitivity of the output with respect to each individual parameter $\theta_j$.
Since the local sensitivity analysis depends on the values of the parameters, independent sets of parameters with a log-normal distribution are created from the population-based model fit in Snoeck et al. . That is, a sequence of independent parameter sets is generated from this distribution using the typical values from  as the mean. To determine pseudo-global sensitivities, a sensitivity coefficient $S_j^{(i)}$ is computed for each parameter $\theta_j$ in the $i$th parameter set. Then, if $N$ parameter sets are to be analyzed, an average over all the parameter sets is computed by
\[
\bar{S}_j = \frac{1}{N} \sum_{i=1}^{N} S_j^{(i)}. \tag{13}
\]
A cutoff is determined based on the ranking of the averages attained in (13), and the parameters above the cutoff are further examined in the identifiability analysis. This method is a version of what is referred to as Morris screening in . Similar to the work done here, the Morris algorithm  averages local derivative approximations to provide more global sensitivity measures, the difference being that the variance across the parameter sets is also considered. Here that variance would be given by
\[
\sigma_j^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left( S_j^{(i)} - \bar{S}_j \right)^2, \tag{14}
\]
with $\bar{S}_j$ the average from (13).
As explained in , while the mean (13) quantifies the individual effect of the input on the output, the variance (14) estimates the combined effects of the input due to nonlinearities or interactions with other inputs. The reader is referred to [30, 36] and references therein for a more detailed analysis of Morris screening. It is noted that only the marginal distributions are given in , so the computations are ignorant of any covariances between parameters. The data that were used contain only viral load observations, so the sensitivities of the viral load are of interest. Therefore, (8) is considered with the output taken to be the measured viral load.
Two different sets of time points are used during this analysis. The first and second sets of time points come from the partial virologic response (PVR) case and the Breakthrough case, respectively. This provides a better illustration of sensitivities given that the treatment efficacies decay in the Breakthrough case but not in the PVR case. The sensitivity rankings are given in Figure 2 for over 2000 (a) and 400 (b) parameter sets, respectively. Error bars that are two standard deviations from the mean are included. The parameters ranked above the cutoff for the PVR and Breakthrough time points, respectively, are the ones considered in the identifiability analysis. Note that the telaprevir efficacy is always included in the identifiability analysis because there is no literature value to which it can be fixed for this model; it is included to determine whether it affects the identifiability of other parameters.
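The pseudo-global averaging (13) and Morris-style variance (14) across sampled parameter sets can be sketched in a few lines; the dictionary layout is an assumption for illustration:

```python
def pseudo_global_stats(coeff_sets):
    """Mean and sample variance of the sensitivity coefficients across
    sampled parameter sets; coeff_sets is a list of dicts mapping a
    parameter name to its coefficient for one sampled set."""
    n = len(coeff_sets)
    stats = {}
    for name in coeff_sets[0]:
        vals = [cs[name] for cs in coeff_sets]
        mean = sum(vals) / n
        var = sum((v - mean) ** 2 for v in vals) / (n - 1)
        stats[name] = (mean, var)
    return stats
```

Ranking parameters by the mean (and inspecting the variance for interaction effects) then reproduces the cutoff-based screening described above.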
3.3. Identifiability analysis
After deciding which parameters are sensitive, consideration is given to understanding which sensitive parameters can be uniquely identified from the data. In this study, we employed a sensitivity-based approach for local identifiability analysis. To this end, we consider the parameter vector $\theta$ that minimizes the cost function
\[
J(\theta) = \sum_{i=1}^{n} \big( y(t_i;\theta) - d_i \big)^2, \tag{15}
\]
with $y(t_i;\theta)$ denoting the model output and $d_i$ denoting the corresponding data value at time point $t_i$ for $i = 1, \dots, n$, where $n$ is the number of data values. Similar to , let us assume that $\hat{\theta}$ is the minimum of this cost function. Then by using a Taylor series expansion around $\hat{\theta}$, we obtain
\[
y(t_i;\theta) = y(t_i;\hat{\theta}) + \nabla_{\theta}\, y(t_i;\hat{\theta})^{T} (\theta - \hat{\theta}) + \mathcal{O}\big(\|\theta - \hat{\theta}\|^2\big).
\]
If we only consider the first two terms of this expansion, under the assumption that $\|\theta - \hat{\theta}\|$ is small, and substitute this expression into the cost function, we find that
\[
J(\theta) \approx J(\hat{\theta}) + (\theta - \hat{\theta})^{T} S^{T} S\, (\theta - \hat{\theta}),
\]
where we used the fact that $\hat{\theta}$ is the minimum of the cost function, so that $\nabla J(\hat{\theta}) = 0$ and the first-order term vanishes. Let
\[
S = \left[ \frac{\partial y(t_i;\hat{\theta})}{\partial \theta_j} \right]_{i=1,\dots,n;\; j=1,\dots,p}
\]
be the $n \times p$ sensitivity matrix relating the sensitivities of the output to the parameters, where $p$ denotes the number of parameters. The cost function of (15) is rewritten in terms of this sensitivity matrix as
\[
J(\hat{\theta} + \Delta\theta) \approx J(\hat{\theta}) + \Delta\theta^{T} S^{T} S\, \Delta\theta,
\]
where $\Delta\theta = \theta - \hat{\theta}$. Rearranging, we formulate the change in the cost function in terms of $\Delta\theta$:
\[
J(\hat{\theta} + \Delta\theta) - J(\hat{\theta}) \approx \Delta\theta^{T} S^{T} S\, \Delta\theta.
\]
If we suppose that $v$ is an eigenvector of $S^{T} S$ with eigenvalue $\lambda = 0$, then we have
\[
J(\hat{\theta} + v) - J(\hat{\theta}) \approx v^{T} S^{T} S\, v = \lambda \|v\|^2 = 0.
\]
We note that if $v$ is an eigenvector with eigenvalue $\lambda = 0$, then to second-order approximation the least squares cost function does not change value when moving from $\hat{\theta}$ to $\hat{\theta} + \alpha v$, with $\alpha$ arbitrary. Thus, the parameters are locally unidentifiable at $\hat{\theta}$. Very small eigenvalues of $S^{T} S$ can likewise be a problem for parameter identification. There have been studies about how the Fisher Information Matrix $F = S^{T} S$ can be used for parameter identification [38, 39]. For example, in , all possible parameter combinations are searched and chosen based on the rank of the sensitivity matrix $S$ and on asymptotic standard error uncertainty. We use the following algorithm, as described in , to determine which of the parameters in our model will be unidentifiable.
Create the matrix $F = S^{T} S$, compute its eigenvalues, and order them such that $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0$.
If the smallest eigenvalue is less than some prescribed threshold, we say that there is a parameter that is unidentifiable.
The largest-magnitude component of the eigenvector associated with that smallest eigenvalue corresponds to the least identifiable parameter. Remove the corresponding column from $S$ and repeat from step 1.
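A minimal illustration of the eigenvalue criterion: when a sensitivity matrix has two proportional columns, the two parameters act on the output only through their combination, so the Gram matrix has a zero eigenvalue and the pair is flagged as locally unidentifiable. The 2×2 case can be solved without a linear algebra library:

```python
import math

def gram_eigenvalues_2x2(S):
    """Eigenvalues of F = S^T S for an n x 2 sensitivity matrix S,
    via the quadratic formula for the 2x2 symmetric case."""
    a = sum(row[0] * row[0] for row in S)          # F[0][0]
    b = sum(row[0] * row[1] for row in S)          # F[0][1] = F[1][0]
    c = sum(row[1] * row[1] for row in S)          # F[1][1]
    disc = math.sqrt((a - c) ** 2 + 4.0 * b * b)
    return (a + c + disc) / 2.0, (a + c - disc) / 2.0

# Second column is exactly twice the first: the columns are collinear,
# so the smallest eigenvalue of S^T S collapses to zero.
S = [[1.0, 2.0], [0.5, 1.0], [2.0, 4.0]]
lam_max, lam_min = gram_eigenvalues_2x2(S)
```

In the full analysis one would instead compute all $p$ eigenvalues, drop the column flagged by the eigenvector of the smallest one, and iterate as in the algorithm above.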
After performing this procedure, we now have a set of sensitive and locally identifiable parameters to estimate. The rest of the parameters are set to “typical values” found in the literature. The identifiability algorithm is applied to all the parameter sets of sensitive parameters obtained in the previous section for the PVR and Breakthrough time points. It is observed from Figure 3 that the parameters in each of the two resulting sets are identifiable at least 50% of the time. These are the parameters that will be estimated from the clinical data.
4. Parameter estimation
The parameters selected for the PVR and Breakthrough cases are estimated using the weighted sum of squares of errors (WSSE) given by
\[
\mathrm{WSSE}(\theta) = \sum_{i=1}^{n} w_i \big( d_i - y(t_i;\theta) \big)^2,
\]
where $w_i$ is the weight for the error term at time $t_i$, $d_i$ is the data measurement of viral load at the $i$th time point, and $y(t_i;\theta)$ is the model output with parameters $\theta$. We used both sampling- and gradient-based methods, implemented in MATLAB, to minimize this function. The model was fit to three data sets, namely PVR, ETR (end-of-treatment response) and Breakthrough. PVR represents the case when the patient has an initial positive reaction to the therapy, but the viral load then rebounds during treatment and never goes below detection. ETR represents the case when the viral load drops below detection and does not rebound. Breakthrough represents the case when the patient’s viral load drops below detection but then rebounds. In our data, the LLOQ is 15 IU/mL. When the data drop below the LLOQ, least squares estimation does not suffice as a statistically rigorous methodology. Instead, we employ expectation maximization (EM)  to compute maximum likelihood estimates of our patient-specific parameters. For a detailed description of the EM algorithm, we refer the reader to . The RBV dosage depends on the patient’s body weight and was sometimes modified during treatment due to different symptoms of the patients, such as blood thinning. The patients experiencing PVR and Breakthrough had a constant RBV dosage for the entire treatment, while the patient exhibiting ETR had a modified dosage. The RBV efficacy is fixed to the value from  for the PVR and Breakthrough patients. The efficacies for the ETR patient were modified based on the time in days since initial treatment and are presented in Table 2.
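A minimal sketch of the WSSE objective, paired with a brute-force (sampling-style) minimizer on a hypothetical one-parameter decay model; none of the names or values correspond to the clinical fits, and the censored-data EM handling described above is not reproduced here:

```python
import math

def wsse(theta, times, data, weights, model):
    """Weighted sum of squared errors; model(t, theta) stands in for the
    solved ODE output at time t."""
    return sum(w * (d - model(t, theta)) ** 2
               for t, d, w in zip(times, data, weights))

# Toy illustration: recover a decay rate by scanning candidate values,
# a crude stand-in for the sampling-based minimization mentioned above.
times = [0.0, 1.0, 2.0, 3.0]
data = [100.0 * math.exp(-0.7 * t) for t in times]   # synthetic "observations"
weights = [1.0, 1.0, 1.0, 1.0]
model = lambda t, c: 100.0 * math.exp(-c * t)
best_rate = min(((wsse(c / 100.0, times, data, weights, model), c / 100.0)
                 for c in range(1, 200)), key=lambda pair: pair[0])[1]
```

The scan recovers the rate used to generate the synthetic data, since the WSSE vanishes there.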
The parameters outside the selected sets are fixed to the values in Table 3 from [5, 22]. As in , the infected steady state is used for the initial conditions for (7) because the patients considered had chronic infection. The values in Table 4 are obtained after estimating the selected parameters. These estimates produce the model fits (graphs on the left) and residuals (graphs on the right) in Figures 4, 5 and 6. It is noted that in Figure 6, the ETR patient’s viral load goes to zero, and the residuals for censored data are set to zero.
In practice, the mathematical model is never exact (model misspecification), and the data contain noise (human errors, instrument errors). Hence, confidence and prediction intervals are used to understand the extent of uncertainty involved in estimating our parameters. In calculating these intervals, standard errors are computed from the model predictions using the estimated parameters. Moreover, 95% confidence intervals for the PVR parameters (attached as half-widths in Table 4), along with predictive confidence intervals and prediction intervals for the model output, are calculated using the asymptotic theory outlined in [22, 27, 30, 41, 42]. The predictive confidence intervals and prediction intervals are shown in Figure 7.
The higher values of the estimated mutation- and clearance-related parameters in the ETR patient lead us to believe that the immune response, along with the drugs, has a stronger impact on the mutation and clearance of the virus. It is known that the immune response is strongly correlated with the clearance of the virus. Since the initial conditions of (7) are at the infected steady state, introduction of the drugs could be a mechanism to jump-start the immune response. We note that even when the virus is not cleared, telaprevir still has a strong impact on viral load decay. This behavior corresponds with how powerful DAAs can be in reducing viral load even when it rebounds. The rebound could be caused by mutations, which are neglected in this model as stated earlier. There is a dip at around the 150th day in the Breakthrough response that is unquantifiable due to lack of information regarding the other three states or a dynamic immune response. However, this type of dip is observed in [5, 27], where data are available around this time. We conjecture that this dip is due to the immune response being stimulated by the spike in viral load and infection. The residuals in the PVR fit in Figure 4 appear to be i.i.d. because the errors are randomly distributed on both sides of the zero axis. This is unlike the Breakthrough fit in Figure 5, which has most of the residuals above the zero axis. The predictive confidence intervals and prediction intervals look almost the same because the variance is very small and the model fits the data very well. The reader is referred to  for further details on the differences between predictive confidence intervals and prediction intervals.
The missing data between weeks 12–24, 24–36 and 36–48 for the ETR and Breakthrough patients make parameter estimation challenging. The predictions would also be more robust if information concerning the other three states were available. These issues should be considered when interpreting the estimates and confidence measures. DAAs were introduced in 2011, so there is not as much data available, but in the future we hope for a larger quantity of data to make more precise estimations.
This chapter describes a model for patients with HCV who are treated with IFN, RBV, and telaprevir combination therapy. The development of this model was motivated by the desire for a model that can be validated and calibrated using sensitivity and identifiability techniques while simultaneously incorporating the new DAA, telaprevir. The model can be used to accurately describe patients exhibiting PVR, ETR, and Breakthrough.