Over the past decade, we have used the Veterans Affairs (VA) Decision Support System (DSS) database to identify medications that are associated with a reduced incidence and progression of dementia. Our studies identify two general classes of medications that are associated with reductions in the incidence of dementia, compared to a group of patients with similar cardiovascular risk profiles (Li et al., 2010; Wolozin et al., 2000; Wolozin et al., 2007). We have shown that simvastatin, a medication that blocks synthesis of cholesterol, is associated with a significant reduction in the hazard ratio for incident dementia (HR 0.46, CI 0.44 – 0.48, p<0.0001) (Wolozin et al., 2007). Simvastatin has two primary effects. One effect is to lower cholesterol; a second effect, which is observed only at higher doses of simvastatin, is to reduce inflammation (Wolozin et al., 2006). The second major class of medications associated with reduced progression of dementia falls in the general category of angiotensin receptor blocking medications, which includes candesartan and valsartan (Li et al., 2010). In this work, the reduction in dementia incidence (HR 0.81, CI 0.68 – 0.96, p=0.016) was not as striking as for statins, but we also examined nursing home admissions, where we observed a strong effect (HR 0.51, CI 0.36 – 0.72, p=0.0001). In addition, we observed additive effects when angiotensin receptor blockers (ARBs) were combined with angiotensin converting enzyme (ACE) inhibitors (Li et al., 2010). Finally, we have also examined the influence of coronary artery bypass on the incidence of dementia (Lee et al., 2005). This work was recently expanded to examine the putative role of gaseous anesthetics in the incidence of dementia; no evidence supporting such a hypothesis was observed (Vanderweyde et al., 2010).
This work derives from the immense size and richness of the DSS database, which is one of the largest integrated electronic health records in the world. The DSS began capturing data in October 2001 (fiscal year 2002), although some parts of the database go back to 1996 (Smith and Joseph, 2003). The database captures information on approximately 5 million subjects per year, with over 100 million medication prescriptions per year. The DSS database offers researchers a tool that is distinctly different from the classic cohorts developed at academic centers. Cohorts developed at academic centers tend to be smaller by a factor of 500 – 5000, often being in the range of 1000 – 5000 patients. These cohorts offer rigorous, uniform evaluations of well-characterized and carefully selected subjects using evaluators who follow rigidly defined protocols (Hayden et al., 2006; Launer et al., 2000; Seshadri et al., 2006). These types of cohorts are ideal for studying medications or other factors that are prevalent in the cohort (with a rate > 10%). However, such cohorts are unable to study most medications in a formulary because the prevalence of use is too low. The large size of the DSS database presents an outstanding resource for pharmaco-epidemiological studies, where rates of use for many medications are 1% or less. We have used the DSS database for such studies, and our findings have shown high concordance with those of other epidemiologists using other databases or cohorts.
The material in this chapter will explain the design and execution of pharmaco-epidemiological studies using the DSS database. We will define the elements available in the DSS database, and the strengths and weaknesses of the dataset. The DSS dataset is particularly suited for studies meeting criteria that will be defined. An important preliminary point is that the size of the DSS database, and the security concerns related to using it, create a unique set of requirements that must be considered and adhered to carefully. We will describe the elements that should be considered in defining optimal cohorts for the study, and the types of factors that should be considered in characterizing the demographic profiles of these cohorts. Cohorts, selection criteria and exclusion criteria all vary depending on the type of study, and the factors that should be considered in each of these decisions will be described. Studies can be performed with a focus on incidence or progression, and the considerations required for each such study will be described. Multiple models should be used for a particular study. The models can be varied based on the choice of covariates, exclusion criteria, comorbid illnesses allowed, concurrent medications allowed, as well as other elements. The considerations related to each of these choices will be discussed. The size of the DSS database offers the option for types of studies that are not feasible in smaller datasets. Medications can be compared by pharmacological properties, duration of exposure and medication switching. End points can include files in the DSS database, but increasingly can also include datasets not directly part of the DSS. For instance, the DSS data can be cross-referenced to Medicare, which can capture some off-plan elements such as nursing home utilization and use of health care providers outside the VA system. After acquisition of the data, statistical models that can be applied include logistic regression, hazard rate models and sensitivity tests. Statistical models can also inform the choice of subjects for cohorts, for instance by applying propensity analysis. Considerations for each of these choices will be examined.
The DSS database is also being continually upgraded and expanded. In the future, we can expect that computer based language algorithms will allow capture and quantification of written reports by the health care providers. Genetics will also become increasingly accessible. The VA is currently embarking on a plan to genotype DNA for 1 million subjects. Although this dataset will cover only a fraction of the subjects in the VA system, the large number of subjects analyzed will create a unique resource.
2. Experimental design: General considerations
The introduction above describes many of the inherent strengths of the DSS system. However, despite the apparent strengths, the DSS system also has some important weaknesses. The design of pharmaco-epidemiological studies of the VA DSS database must take into account these weaknesses. The VA pharmacy increasingly limits the medications that are used, trying to strongly encourage use of generic medications. This leads to two important considerations.
The first issue is that some medications commonly used outside the system are not used inside the system. For instance, among statin users, atorvastatin and rosuvastatin are used by many patients not in the VA system, but most patients in the system use simvastatin (Wolozin et al., 2007). This limits the breadth of medications that can be analyzed using the DSS system. Formulary restrictions also have a second, subtler impact. Changes in the types of medications on the formulary lead to changes in medication utilization by patients. Such changes alter the duration of time that patients might take a medication, which can lead to altered apparent rates of disease. For instance, if entire groups of patients switch medications, the resulting studies could make it seem as if a particular disease is less prevalent among patients taking that medication, although the real reason is that the duration of exposure might be shorter than for other medications that were used throughout the analytic period. These types of biases can be adjusted for using careful analytic designs, but such adjustments must be built into the analysis. Another large problem is that the clinical records have not been validated and are not uniform. The VA system comprises 23 VISNs, and multiple hospitals within each VISN. The size of the VA system leads to treatment diversity; physicians are trained differently. Diagnostic and/or treatment protocols can differ by VISN, as well as among physicians within a VISN. Such variations can be particularly important when evaluating outcomes such as dementia, for which no widely utilized, definitive biomarker exists.
Other considerations include the demographics and health status of the patient population. The VA population is about 95% male, which means that most of the data will represent a male cohort unless the study is designed to stratify by gender. VA patients tend to have more comorbid diagnoses than patients in other databases, such as Medicare or the Kaiser Permanente database (Whitmer et al., 2009). Finally, a significant fraction of patients also utilize health care providers outside of the VA system to supplement their health plans. Such “out-of-network” services are not captured by the DSS database, and must be considered. Utilization of services out of the network can lead to disparities among patient populations, since wealthier patients are more likely to use “out-of-network” providers. In addition, use of providers out of the network can cause patients to be lost to follow-up.
The analytic approaches that we have designed attempt to take these considerations into account. The first two approaches can be implemented using only the DSS system. The third approach requires utilization of an additional database to validate the observations from the DSS system.
2.1. Analysis using two parallel strategies
One strategy is to match cohorts using a propensity analysis in which subjects are selected and matched based on risk factors for cardiovascular disease, dementia and the number of concurrent medications utilized. Regression analysis is then performed to determine the appropriate weighting coefficient for each matching variable. A second strategy is to analyze cohorts based on descriptive or multivariate regressions using logistic and hazard rate models. For the descriptive analysis, one selects matched cohorts and characterizes the outcomes by odds ratios. For the multivariate cohort analysis, one identifies a cohort based on age, diagnosis and medication use, and follows the cohort prospectively employing a multivariate regression analysis that includes comorbid diagnoses and other medications (to control for poly-pharmacy).
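For the descriptive analysis, the odds ratio for a matched pair of cohorts can be computed from a 2x2 table of exposure by outcome. The sketch below is illustrative only: the function name is hypothetical, and the Woolf (log-based) 95% confidence interval is an assumed choice rather than the method used in the studies cited above.

```python
import math

def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Return (odds ratio, lower 95% limit, upper 95% limit) for a 2x2 table.

    Uses the Woolf (log) method for the interval; this is an assumed choice.
    """
    a, b, c, d = exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = 1.96  # normal quantile for a two-sided 95% interval
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper
```

For example, `odds_ratio(10, 90, 20, 80)` compares 10 cases among 100 users of a medication with 20 cases among 100 matched non-users (made-up counts).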
2.2. Using physiological responses to assess medication action
Epidemiological studies should derive from hypotheses grounded in a biological basis of action for the medications in question. However, patients differ in their response to medications due to genetic factors (e.g., polymorphisms in metabolic genes), usage patterns (medication compliance) and environmental factors (e.g., other chemicals modifying drug metabolism). The medication response for each patient can be assessed by examining biomarkers that are directly linked to the mode of action of the medication. For instance, in the case of antihypertensive medications, such as angiotensin receptor blockers, one can analyze patients by degree of biomarker response, using blood pressure lowering to assess anti-hypertensive action. The physiological response is then employed as a covariate in the analyses, to ensure that the outcome is tethered to the physiological activity of the medication.
3. Establishing the cohorts: Propensity analysis
3.1. Definition of cohorts
Medications to be analyzed are grouped by action and compared as a group to determine whether the group is associated with differential outcomes. For instance, in the study of angiotensin receptor blockers, ideal comparison cohorts are those corresponding to subjects with similar health problems taking medications that exert similar effects but through a different mechanism of action. Beta-blockers provide a good example of a comparison cohort because beta-blockers are anti-hypertensive agents that act through a different set of receptors than angiotensin receptor blockers. ACE inhibitors provide another comparison cohort because they act on a different molecular target than angiotensin receptor blockers; however, ACE inhibitors have the weakness that their ability to reduce angiotensin II levels would also reduce action at the angiotensin AT1 receptor (much like angiotensin receptor blockers). A more general cohort would be subjects taking cardiovascular medications other than angiotensin receptor blockers. If the group shows an effect, then it is possible to code the data in a manner that allows stratification within each group by individual medication, and then to quantify outcomes for each medication within each group. Comparing individual medications can be important because medications can differ in many important properties, such as brain penetration and off-target effects.
Regardless of the particular choice of medication cohort, it is important to consider poly-pharmacy. Subjects frequently take multiple medications. Poly-pharmacy can be taken into account by considering other medications as covariates, or by excluding particular medications from each cohort. The latter approach, though conceptually appealing, has the weakness that it frequently reduces the cohort size and therefore the power of the study.
3.2. Medication utilization
A critical issue in studying pharmaceutical action is to make sure that the subjects being studied actually use the medication as prescribed. Medication utilization can be assessed by requiring several criteria for inclusion of subjects in a cohort:
Subjects must exhibit utilization of the medication as demonstrated by 80% medication coverage over the first 6 months upon entry into the study, and for the 6 months preceding any event.
Subjects must be on each medication for at least 6 months prior to any outcome event.
Subjects must use the health care system at least once in the 6 months preceding an event or the end of the study; those who do not fulfill this requirement are considered lost to follow-up and removed from the study.
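The 80% coverage criterion above can be checked from pharmacy fill records. The following is a minimal sketch, assuming that coverage is measured as the proportion of days covered within a 182-day window and that each fill is available as a (fill date, days supplied) pair; the function names and the record layout are hypothetical and not part of the DSS.

```python
from datetime import date, timedelta

def proportion_of_days_covered(fills, window_start, window_days=182):
    """Fraction of days in the window on which the subject had medication on hand.

    fills: iterable of (fill_date, days_supply) pairs (assumed record layout).
    """
    window_end = window_start + timedelta(days=window_days)  # exclusive bound
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if window_start <= day < window_end:
                covered.add(day)  # a set avoids double counting overlapping fills
    return len(covered) / window_days

def meets_coverage(fills, entry_date, event_date, threshold=0.80, window_days=182):
    """True if coverage reaches the threshold in both the first window after
    entry and the last window before the event (or end of study)."""
    first = proportion_of_days_covered(fills, entry_date, window_days)
    last_start = event_date - timedelta(days=window_days)
    last = proportion_of_days_covered(fills, last_start, window_days)
    return first >= threshold and last >= threshold
```

Counting distinct covered days, rather than summing days supplied, keeps early refills from inflating coverage above what the subject could actually have taken.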
3.3. Components for propensity analysis: Co-variates and matching variables
Setting up a cohort requires assessment of key variables. In the case of our studies of angiotensin receptor blockers and of statins, we used the following criteria:
Age: A clear risk factor for dementia.
Co-morbidities: Major diseases that are known risk factors for AD. Subjects will receive an additional point for each ICD9 diagnosis: cardiovascular disease (ICD9 277.7, 429.2, 410-414, 428-429, 440, 444, 445), stroke (430-438) and diabetes (ICD9 250). Retinopathy (ICD9 362), neuropathy (ICD9 249.6, 250.6) and nephropathy (ICD9 249.4, 250.4) will also be included in the scoring because these are manifestations of microvascular disease. Note: hypertension (ICD9 401-405 and 459.3) is not used in the propensity scoring because it is present in all subjects taking anti-hypertensive medications and does not affect the outcome.
Number of prescription medications: Patients receive a point for each medication for which they have a prescription.
Cholesterol/blood pressure/HbA1C/GFR/BMI: For each laboratory value subjects are categorized into one of three groups for the propensity analysis: Normal (<1 Std. Dev. from Mean), High (1-2 Std. Dev. from Mean) and Very High (>2 Std. Dev. from Mean). Mean values for each subject will be determined based on values obtained during the first year in the study: HDL, LDL, triglyceride values, blood pressure values (diastolic and systolic as independent variables), HbA1C, BMI and estimated glomerular filtration rates (GFR, calculated as a measure of renal function based on the modification of diet in renal disease, MDRD, equation).
Smoking status: Smoking status is captured from VA/KP screening tools.
Prior healthcare utilization: Health utilization is determined for the year preceding entry into the study. For subjects who were already on anti-hypertensive medications prior to 2002 (the earliest year of the DSS database), health care utilization is assessed based on fiscal year 2002.
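The co-morbidity points and laboratory categories listed above can be derived mechanically from coded data. The sketch below is one possible reading, with several assumptions flagged: ICD9 codes are taken to be strings such as "250.60", one point is given per condition group rather than per individual code, and the laboratory bands use absolute distance from the mean. All names are hypothetical.

```python
def _category(code):
    """Three-digit ICD9 category as an int, or -1 for V and E codes."""
    head = code.split(".")[0]
    return int(head) if head.isdigit() else -1

def _has_prefix(code, prefixes):
    return any(code.startswith(p) for p in prefixes)

# One rule per condition group, transcribed from the code lists in the text.
# Hypertension (401-405, 459.3) is deliberately absent, as in the text.
CONDITIONS = {
    "cardiovascular": lambda c: _has_prefix(c, ("277.7", "429.2"))
        or 410 <= _category(c) <= 414
        or 428 <= _category(c) <= 429
        or _category(c) in (440, 444, 445),
    "stroke": lambda c: 430 <= _category(c) <= 438,
    "diabetes": lambda c: _category(c) == 250,
    "retinopathy": lambda c: _category(c) == 362,
    "neuropathy": lambda c: _has_prefix(c, ("249.6", "250.6")),
    "nephropathy": lambda c: _has_prefix(c, ("249.4", "250.4")),
}

def comorbidity_points(codes):
    """One point per condition group with at least one matching diagnosis
    (assumption: points are per group, not per code)."""
    return sum(1 for rule in CONDITIONS.values() if any(rule(c) for c in codes))

def lab_category(value, population_mean, population_sd):
    """Band a subject's mean laboratory value by its distance from the mean.

    Assumption: distance is absolute, so values far below the mean are
    banded the same way as values far above it.
    """
    z = abs(value - population_mean) / population_sd
    if z < 1:
        return "normal"
    if z <= 2:
        return "high"
    return "very high"
```

A subject coded with "250.60" and "412" would score three points here (diabetes, neuropathy, cardiovascular disease); whether overlapping diabetes codes should count twice is a design decision the text leaves open.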
3.4. Scoring method
Propensity scores are used to reduce selection-to-treatment bias. A matched-propensity score approach reduces the dimension of the confounding variables by providing an estimate of the probability of receiving treatment, essentially reducing the number of variables to one.
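As an illustration of how many covariates collapse into a single score, the sketch below fits a logistic model of treatment on covariates and then pairs subjects by score. It is a sketch under stated assumptions, not the procedure used in the cited studies: the fitting routine is a plain gradient ascent that expects standardized covariates, and the matching is a simple nearest-neighbour greedy match with a caliper of 0.05, both chosen here for illustration. All function names are hypothetical.

```python
import math

def _sigmoid(z):
    return 1 / (1 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit log-odds(treatment) = b0 + b.x by batch gradient ascent.

    X: list of covariate lists (assumed standardized); y: 1 = treated, 0 = not.
    Returns [b0, b1, ...].
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = yi - _sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity(beta, x):
    """Estimated probability of receiving treatment for covariates x."""
    return _sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))

def greedy_match(treated, controls, caliper=0.05):
    """Pair each treated subject with the nearest unused control.

    treated, controls: dicts mapping subject id to propensity score.
    Treated subjects with no control inside the caliper stay unmatched.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # match without replacement
    return pairs
```

In practice a statistics package would replace `fit_logistic`; it is written out here only so that the example is self-contained.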
The distribution of propensity scores is examined in the comparison samples to check for sufficient overlap and balance in the strongest predictors. If the balance is not sufficient, the samples may be further stratified, interactions and transformation of variables considered; if there is no overlap in some part of the distributions of propensity scores, patients with the non-overlapping propensity scores will be removed from the analyses (after noting why they are different in their likelihood to receive a particular treatment). A propensity score is the conditional probability that a patient receives a treatment. It is estimated from a logistic regression model: Expected(log-odds of treatment A compared to B) = b0 + bX where X is a vector of patient characteristics and conditions at the time of treatment assignment. If there are more than two treatments, three propensity scores and comparisons are made, A to B, A to C, and C to B. A greedy-match algorithm (
4. Outcomes: Incident dementia, nursing home admission and death
Each cohort is matched based on propensity to develop dementia, as described in the section above. The risk of dementia is compared across the cohorts using a Cox proportional hazards model for the outcomes listed below. The hazard rates are adjusted using the covariates listed below. The risk of AD or dementia, nursing home admission and death among the general population is studied excluding subjects who have a pre-existing diagnosis of AD or dementia (based on analysis of medical records in the year preceding entry into the study). We will also study the risk of nursing home admission and death specifically among subjects with an existing diagnosis of AD or dementia.
Outcomes for each of the cohorts are also plotted as log(-log(S(t))) versus log(time), where S(t) is the estimated survival function. Curves that are approximately parallel across the groups support the proportional hazards assumption underlying the Cox model. Other models, such as the Weibull model, can be tested as a sensitivity analysis to determine whether the results are corroborated.
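The log-minus-log check is conventionally computed from an estimated survival curve for each cohort. The sketch below assumes a Kaplan-Meier estimate and returns the points to plot; the function names are hypothetical, and the input is taken to be follow-up durations with an event indicator.

```python
import math
from collections import Counter

def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up durations; events: 1 if the outcome was observed,
    0 if the subject was censored. Returns [(time, survival)] at event times.
    """
    outcomes = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # everyone leaving the risk set at each time
    at_risk = len(times)
    survival, curve = 1.0, []
    for t in sorted(exits):
        d = outcomes.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve

def log_minus_log(curve):
    """Points (log t, log(-log S(t))), skipping values where the transform
    is undefined (S equal to 0 or 1, or t not positive)."""
    return [(math.log(t), math.log(-math.log(s)))
            for t, s in curve if t > 0 and 0 < s < 1]
```

Computing `log_minus_log(kaplan_meier(times, events))` separately for each cohort and overlaying the curves gives the diagnostic plot; roughly constant vertical separation between curves is what the proportional hazards assumption predicts.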
End of follow-up is defined as a particular date (for example, December 31, 2009), disenrollment from the health care system, nursing home admission, initiation of dialysis or death. Membership in the health care system can be determined from administrative databases. AD (ICD9 331.0) or dementia (ICD9 290, 294, 331.0) is ascertained based on ICD9 diagnoses. Nursing home admission is ascertained through the DSS and Medicare records. Note: the VA system recently established an agreement to allow purchasing of Medicare records and harmonization with the VA; this agreement is being finalized between the VA and the Centers for Medicare & Medicaid Services (CMS). Records are obtained to ensure that all nursing home admissions are captured. Death is ascertained from linkage to the VA mortality files (to be obtained through the VA HERC, which has the most complete records of VA mortality, combining previously validated mortality records from multiple sources that include the VA BIRLS files).
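The end-of-follow-up rule amounts to taking the earliest of several candidate dates for each subject. A minimal sketch, assuming each date has already been extracted and is None when the event did not occur (the function name and argument names are hypothetical):

```python
from datetime import date

def end_of_follow_up(study_end, disenrollment=None, nursing_home=None,
                     dialysis=None, death=None):
    """Return (date, reason) for the earliest event that ends follow-up."""
    candidates = {
        "study end": study_end,
        "disenrollment": disenrollment,
        "nursing home admission": nursing_home,
        "dialysis": dialysis,
        "death": death,
    }
    observed = {reason: when for reason, when in candidates.items()
                if when is not None}
    reason = min(observed, key=observed.get)
    return observed[reason], reason
```

Keeping the reason alongside the date makes it straightforward to separate administrative censoring from outcome events when the survival data are assembled.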
Establishing covariates increases the power of a study by identifying factors that could bias the outcomes, and then incorporating them into the model.
Age: Age is the major risk factor for dementia and must always be included as a covariate.
Health Index/Co-morbidities: The major disease risk factors that we will characterize are cardiovascular disease (ICD9 277.7, 429.2, 410-414, 428-429, 440, 444, 445), stroke (430-438) and diabetes (ICD9 250); hypertension (ICD9 401-405 and 459.3) will not be analyzed as a covariate because our preliminary studies indicate that it is present in all subjects taking anti-hypertensive medications and does not affect the outcome.
Microvascular disease: Retinopathy (ICD9 362), neuropathy (ICD9 249.6, 250.6) and nephropathy (ICD9 249.4, 250.4) are manifestations of microvascular disease.
Hospital utilization: Hospital utilization rates are calculated for each subject and included as co-variates.
Cholesterol/blood pressure/HbA1C/GFR/BMI (all are co-variates): Captured variables include: HDL, LDL, triglyceride values, blood pressure values (diastolic and systolic as independent variables), HbA1C and BMI. Estimated glomerular filtration rates (GFR) are also captured and calculated as a measure of renal function based on the modification of diet in renal disease (MDRD) equation.
Duration of Insulin use: Insulin use is commonly associated with late stage Type II diabetes, which is strongly associated with dementia.
Diuretic use: Anti-hypertensive medications are frequently used in conjunction with diuretics. We include diuretic use as a co-variate, coded as a binary measure with all diuretics treated as one group (± diuretic).
Location: VA centers (VISN groups) where treatment occurred are identified, and these regions are used as co-variates. In the event that a subject utilized multiple centers, the first center identified in the record is the designated center for the analyses.
Gender: The gender ratio is determined (by counting the number of male and female subjects), and used as a co-variate.
Smoking status: This is captured from VA screening tools on the basis of the CDC BRFSS items previously administered in the VA.
Medications: Other cardiovascular medications on the VA formulary are analyzed as covariates to control for poly-pharmacy.
Ethnicity: Ethnicities can differ in medication responsiveness. Between 5 – 20% of subjects in the DSS database have defined ethnicities other than Caucasian. Categories of defined ethnicity (e.g., Caucasian, Asian, African-American and Hispanic) are used as co-variates.
Prior healthcare utilization: Health utilization is determined for the year preceding entry into the study. For subjects who were already on anti-hypertensive medications prior to 2002 (the earliest year of the DSS database), health care utilization is assessed based on fiscal year 2002.
4.4. Outcomes: Secondary analyses
Should one observe differences among the medication groups, one can examine the specific groups using the secondary analyses described below:
Dose-response: The mean daily exposure for each medication is derived for each subject by dividing the total use of each medication (cumulative dose based on prescription records for each medication) by the cumulative number of days of use. Subjects are stratified into high dose and low dose groups around the median daily exposure. A Cox proportional hazard model is used to determine HRs for each medication, with a model utilizing multivariate regression involving co-variates.
Medication switching: One approach to validating effects is to examine medication switching. In such a study, one identifies subjects who are using only one class of anti-hypertensive medications (concurrent use of diuretics will be allowed and will be analyzed as a covariate), and show stable use of the medication over the first 6 months following entry into the study. Subjects are followed and the type of medication used at the end of the study will be determined. Subjects using one particular anti-hypertensive medication, and who showed stable use of the medication over the 6 months preceding the end of the study will be selected for further analysis. The data are analyzed using a Cox proportional hazard model to determine whether switching from one group of anti-hypertensive medications to another group is associated with a change in the HR.
Ethnicity: Each cohort can also be stratified by ethnicity. Ethnicities to be examined will include the major ethnicities (to allow sufficient power for the analysis): Caucasian, African American, Hispanic, Asian (see section E.4., Power Analysis, for a quantification of ethnic distributions).
Gender: Gender can be included as a co-variate. However, because the DSS database is predominantly male, a study using gender stratification will be restricted to the KP database.
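The dose-response stratification among the secondary analyses reduces to two small calculations: a mean daily exposure per subject and a split around the cohort median. The sketch below assumes each prescription is available as a (total dose, days supplied) pair and that subjects exactly at the median fall in the low dose group; both the record layout and the tie rule are assumptions, and the names are hypothetical.

```python
from statistics import median

def mean_daily_dose(prescriptions):
    """Cumulative dose divided by cumulative days of use.

    prescriptions: iterable of (total_dose, days_supply) pairs for one
    subject and one medication (assumed record layout).
    """
    prescriptions = list(prescriptions)
    total_dose = sum(dose for dose, _ in prescriptions)
    total_days = sum(days for _, days in prescriptions)
    if total_days <= 0:
        raise ValueError("subject has no recorded days of use")
    return total_dose / total_days

def stratify_by_median(daily_dose_by_subject):
    """Label each subject 'high' or 'low' relative to the cohort median.

    Assumption: a subject exactly at the median is labelled 'low'.
    """
    cut = median(daily_dose_by_subject.values())
    return {subject: ("high" if dose > cut else "low")
            for subject, dose in daily_dose_by_subject.items()}
```

The resulting label can then enter the Cox model as a stratum or as a binary co-variate.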
5.1. Incidence and progression
The approaches outlined above enable us to make assessments of the association of particular medications with measures of dementia, including incidence and progression. Population based epidemiological studies typically examine incidence of a disease, but employing outcomes that are commonly associated only with more severe forms of dementia allows us to also gain insight into progression. A key element in assessing progression is to use outcomes that are unequivocally associated with progression. These considerations led us to restrict our markers of progression to two very extreme outcomes, nursing home admission and death. Studies of incidence provide utility from the public health perspective, but do not necessarily help those who already suffer from disease symptoms (cognitive decline). Subjects typically present to the neurologist only when they are already experiencing cognitive decline. Hence, one would like to be able to gain insight into medications that specifically address this group of patients.
5.2. Pharmacological properties
The large size of the DSS database allows a depth of analysis that is typically not possible with smaller datasets. We have used this size to query whether the epidemiological results are consistent with known pharmacological properties of particular medications. For instance, classic studies of pharmacology employ dose response profiles. We tested the strength of the ARB findings by examining whether differences in ARB exposure were associated with corresponding differential outcomes (Li et al., 2010). These studies focused on dementia because the larger number of incident dementia cases allowed for sufficient statistical power to enable further stratification of the groups. The mean daily exposure to each ARB was determined for each subject by dividing the total use of ARBs (cumulative dose based on prescription records for each medication) by the cumulative number of days of use. Subjects were then stratified into high dose and low dose groups around the median daily exposure. A Cox proportional hazard model was used to determine HRs for each ARB. The model included age, stroke, diabetes and cardiovascular disease as covariates. Each of the major ARBs used by the VA system (candesartan, irbesartan, losartan and valsartan) showed a clear dose-response relationship, with lower rates of incident dementia at higher doses of medication.
We also examined whether subjects who switched from ACE inhibitors to ARBs, or vice-versa, adopted the dementia risk value for the subsequent medication (Li et al., 2010). We identified subjects without a prior diagnosis of dementia who were prescribed either ARBs or ACE inhibitors (but not both), and showed stable use of the medication over the first 6 months following entry into the study. The “ACE inhibitor” group included all of the ACE inhibitors on the VA formulary. Subjects were followed and the type of medication (ACE inhibitor or ARB) used at the end of the study (dementia or censoring) was determined. Subjects using either ARBs or ACE inhibitors (but not both), and who showed stable use of the medication over the 6 months preceding the end of the study were selected for further analysis. The data were analyzed using a Cox proportional hazard model adjusting for age, stroke, diabetes and cardiovascular disease. Subjects who were on ARBs throughout the study or who started on ACE inhibitors and switched to ARBs showed a significantly lower HR (protective) for incident dementia compared to the reference group (Li et al., 2010). These data show that the association between ARB use and risk of incident dementia is sensitive to dose of medication and medication switching.
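The grouping step in a switching analysis can be expressed as a comparison of the medication class used in the first and last 6-month windows. The sketch below is illustrative only: it assumes the classes dispensed in each window have already been extracted as lists of labels, and it excludes any subject who used no class or more than one class within a window. The names are hypothetical.

```python
def stable_class(classes_in_window):
    """Return the single class used in the window, or None if the subject
    used no class or more than one class."""
    unique = set(classes_in_window)
    return unique.pop() if len(unique) == 1 else None

def switching_group(first_window, last_window):
    """Label a subject by start and end class, or None if excluded."""
    start = stable_class(first_window)
    end = stable_class(last_window)
    if start is None or end is None:
        return None  # not on exactly one class in a window: excluded
    return f"{start} throughout" if start == end else f"{start} to {end}"
```

For example, `switching_group(["ACE", "ACE"], ["ARB"])` yields the label "ACE to ARB", which can then serve as the grouping variable in the Cox model.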
Another pharmacological characteristic that we examined was the correlation between effect and expected brain penetration of the medications based on the published literature. Our study of statins, in 2007, produced striking results in this respect (Wolozin et al., 2007). The statins differ in their lipophilicity, which leads to the observation that some statins cross the blood brain barrier well, while others do not (Reinoso et al., 2002). Lovastatin and (to some extent) simvastatin are lipophilic, which allows them to cross the blood brain barrier, while pravastatin and atorvastatin are hydrophilic and do not cross the blood brain barrier; despite these traits, both lipophilic and hydrophilic statins appear to modulate cholesterol metabolism in the CNS (Reinoso et al., 2002; Vega et al., 2003). In the 2007 study, we observed that simvastatin was associated with an effect size for incident dementia that was significantly stronger than that for atorvastatin or lovastatin (HR 0.46, CI 0.44-0.48, p < 0.0001 for simvastatin; HR 0.91, CI 0.80-1.02, p = 0.11 for atorvastatin; and HR 0.95, CI 0.86-1.05 for lovastatin) (Wolozin et al., 2007). Our study of ARBs also showed evidence that medications with greater brain penetration appear to be associated with better outcomes (Li et al., 2010). The rank order of effect size for the different ARBs was candesartan>irbesartan>losartan>valsartan (HRs: 0.73, 0.84, 0.82, 0.91 respectively), which corresponds strongly with brain penetration for each medication (Li et al., 2010).
The approaches and results presented above show a range of data-mining strategies that can be applied to the DSS database. Much of the strength of the DSS database comes from its large size and the availability of multiple types of data. The power of the database increases as the duration of time from the inception of the DSS database increases. The positive attributes of this database must be balanced against weaknesses that are inherent to any population based data source derived from dispersed centers. Despite these weaknesses, the studies performed by our group have shown good reproducibility in subsequent literature. Epidemiologists throughout the world have reproduced our observation of a reduction in incident dementia associated with statin use. Work on ARBs by other groups is only beginning; however, in collaboration with Rachel Whitmer (Kaiser Foundation), we have already reproduced major elements of this story and have presented this replication at major meetings (AAICAD, 2010, Whitmer et al).
What lies in the future? One of the most exciting prospects is that of pharmaco-genomics. The VA is collecting DNA from 1 million subjects, for use in genetic studies. As this genetic information becomes available, it will be possible to cross-reference it with the DSS database and investigate potential interactions between genetic polymorphisms and health care outcomes.
This work was supported by a grant award to BW from the Retirement Research Foundation.