BFI, Brief Fatigue Inventory; DDQ, Disability Days Questionnaire; GAD, generalized anxiety disorder; GAD-Q-IV, Generalized Anxiety Disorder Questionnaire-IV; HUI2/HUI3, Health Utilities Index Mark 2/Mark 3; HAM-A, Hamilton Anxiety Scale; MDD, major depressive disorder; MOS SS, Medical Outcomes Study Sleep Scale; PHQ, Patient Health Questionnaire; Q-LES-Q(SF), Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form; SIGH-A, Structured Interview Guide for the Hamilton Anxiety Scale; SCL-90, Symptom Checklist; SDS, Sheehan Disability Scale; SF-12, Short Form Health Survey. *US-based population. Questionnaires administered in the study (Hamilton 1959; Derogatis et al. 1974; Broadhead et al. 1990; Hays and Stewart 1992; Revicki et al. 1994; Feeny et al. 1996; Ware et al. 1996; Leon et al. 1997; Mendoza et al. 1999; Shear et al. 2001; Kroenke and Spitzer 2002; Newman et al. 2002; Horsman et al. 2003; Statistics Canada and US National Center for Health Statistics 2004; Ware 2007)
The quality of life (QOL) of persons affected by generalized anxiety disorder (GAD), a condition characterized by periods of excessive anxiety and worry (American Psychiatric Association 1994), is significantly impaired, with an established link between GAD and impairment in a variety of areas (Henning et al. 2007). GAD is associated with increased self-reported disability days and impairments in psychosocial functioning, role functioning, work productivity, and QOL (Massion et al. 1993; Kessler et al. 1999; Wittchen et al. 2000; Kessler et al. 2001; Kessler and Wittchen 2002; Wittchen 2002). Consequently, comprehensive evaluations of treatment for GAD must include both clinical endpoints (e.g., the Hamilton Anxiety Scale [HAM-A]) and assessments of patient-reported QOL and functioning. Moreover, it has been estimated that 92.1% of individuals with GAD also have another lifetime comorbid psychiatric disorder (Ruscio et al. 2007). Anxiety and depression often co-occur, and it has been proposed that a search for one condition should be accompanied by an assessment of the other (Kroenke et al. 2007). The inclusion of patient-reported outcomes (PROs) in clinical development programs for GAD treatments will provide useful information for clinicians and their patients about the benefits of treatment on patient functioning and well-being, and about the relationship between GAD and depression.
The Quality of Life Enjoyment and Satisfaction Questionnaire (Q-LES-Q), in both the long and short form (Q-LES-Q(SF)), is a widely used instrument for measuring QOL and satisfaction. Originally developed for use in clinical trials and among trial participants with a wide variety of mental and medical diseases or disorders (Endicott et al. 1993), it has been shown to offer high internal consistency, validity and reproducibility in non-psychiatric populations and in patients with a range of psychiatric illnesses (Endicott et al. 1993; Ritsner et al. 2005; Rossi et al. 2005; Endicott et al. 2006; Schechter et al. 2007; Mick et al. 2008; Revicki et al. 2008). Thus, the Q-LES-Q(SF) is a PRO measure that has the potential to extend and complement clinical efficacy endpoints and demonstrate the impact of alleviating GAD symptoms on patients’ everyday functioning. To date, few published studies have examined the validity of Q-LES-Q(SF) in GAD (Endicott et al. 2007; Demyttenaere et al. 2008; Revicki et al. 2008; Wyrwich et al. 2009; Matza et al. 2010; Wyrwich et al. 2011), and further studies are required to ascertain the sensitivity of the instrument to detect changes across the range of efficacy measures available to evaluate symptoms associated with GAD (including commonly observed co-occurring conditions). The availability of Q-LES-Q(SF) data from the Quality of life, Utilization of services and Effects of STress (QUEST) study (Revicki et al. 2008) provided the opportunity to contribute to existing evidence by further investigating the reliability, validity, and responsiveness of this instrument when administered via telephone as a measure of overall QOL and satisfaction related to various areas of functioning in patients with GAD.
This study was a post-hoc analysis of the QUEST study, which examined the treatment patterns, clinical and QOL outcomes, and direct and indirect costs associated with GAD in a US managed care organization; details of this study have been published elsewhere (Revicki et al. 2008). Briefly, this was a longitudinal study in which retrospective data were collected through an administrative claims database and prospective data were collected through telephone interviews. The study was approved by the institutional review board at Kaiser Permanente Northwest Region (KPNW Portland, Oregon), and complied with HIPAA requirements.
2.1. Study procedures
Patients who had 2 medical care encounters with diagnoses of GAD and/or anxiety state unspecified in the past 12 months (from 2003–2004) were recruited between June 2005 and June 2006. Eligible participants were identified through a review of the KPNW Data Warehouse. KPNW subsequently sent a memo and study fact sheet to the providers of each of these eligible participants asking for assistance in inviting their patients to participate. Interested potential participants, or those who did not invoke the initial refusal, were then contacted by telephone using a standardized screening script, during which time they were invited to participate in the study.
Other inclusion criteria included: a confirmed DSM-IV diagnosis of GAD (300.02) and/or anxiety state unspecified (300.00) based on the Structured Clinical Interview for DSM-IV-TR; age ≥18 years; the ability to speak and read English; and completion of a written informed consent. There were no treatment requirements; patients were assigned the standard of care. Patients with a diagnosis of psychosis, bipolar disorder, organic psychotic disorder or mental retardation within the past 12 months or who had current cognitive impairment (memory loss and temporal disorientation demonstrated during a telephone contact or reported by a family member) were excluded from the QUEST study. Retrospective data collection was conducted using administrative claims/encounter data to measure medical resource use and costs for the 12 months prior to the baseline survey, while prospective data were collected by following participants for a 6-month period, during which time 3 telephone interviews were conducted (at baseline, 3-month follow-up [10–14 weeks after baseline] and 6-month follow-up [22–26 weeks after baseline]) to administer several questionnaires. For purposes of this secondary analysis, the study population was limited to subjects with prospective data collected through the telephone interviews.
2.2. Measures administered
To evaluate overall QOL and satisfaction related to various areas of functioning, clinical symptoms associated with GAD, general health, fatigue, sleep, and disability, questionnaires administered in the study included the Q-LES-Q(SF), Patient Health Questionnaire Depression Questions (PHQ-8; (Kroenke and Spitzer 2002)), the Generalized Anxiety Disorder Questionnaire-IV (GAD-Q-IV (Newman et al. 2002)), the Structured Interview Guide for the Hamilton Anxiety Scale (SIGH-A (Hamilton 1959; Shear et al. 2001)), the Somatic Subscale of the Symptom Checklist (SCL-90 (Derogatis et al. 1974)), the Medical Outcomes Study 12-Item Short Form Health Survey, version 2 (SF-12v2 (Ware et al. 1996; Ware 2007)), Brief Fatigue Inventory (BFI (Mendoza et al. 1999)), Medical Outcomes Study Sleep Scale (MOS SS (Hays and Stewart 1992)), the Health Utilities Index Mark 2 (HUI2)/Mark 3 (HUI3 (Feeny et al. 1996; Horsman et al. 2003; Statistics Canada and US National Center for Health Statistics 2004)), the Sheehan Disability Scale (SDS (Leon et al. 1997)), and the Disability Days Questionnaire (DDQ (Broadhead et al. 1990; Revicki et al. 1994); Table 1). These were administered at baseline and at 3- and 6-month intervals, with the exception of the SIGH-A, which was only administered at baseline and 3 months.
|Measure||Description||Score range|
|Q-LES-Q(SF)||Participant-rated scale designed to measure the degree of enjoyment and satisfaction experienced by participants in their general activities of daily functioning. Composed of 14 general activity items (included in the score) and 2 additional items on medication satisfaction and overall life satisfaction. Higher scores indicate greater enjoyment and satisfaction||0 to 100|
|PHQ||9-item scale consisting of the DSM-IV criteria used to diagnose MDD. Suicide item excluded in this study (PHQ-8). Higher scores indicate greater depression severity||0 to 24|
|GAD-Q-IV||9-item self-reported revised diagnostic measure of GAD. Based on DSM-IV. Higher scores indicate greater anxiety severity||0 to 12|
|HAM-A (administered using SIGH-A)||Developed to evaluate the severity of anxiety symptoms. Administered via the Structured Interview Guide for the Hamilton Anxiety Scale (SIGH-A), the developer-approved interview guide for the HAM-A. Higher scores indicate greater anxiety severity||0 to 56|
|SCL-90||Comprises 12 items that identify distress occurring from perceptions of bodily dysfunction. Higher scores indicate more somatic symptom distress||0 to 48|
|SF-12v2||PCS and MCS scores used in this study. Scores < 50 represent below-average physical health or mental health||Norm-based scores with a mean of 50 and SD of 10*|
|BFI||9 items plus an introductory question used to measure fatigue in cancer patients. A single item on the BFI used in this study, which asks participants to rate their worst level of fatigue during the past 24 hours||0 (no fatigue) to 10 (as bad as you can imagine)|
|MOS SS||12-item self-reported questionnaire used to evaluate a participant’s sleep disturbances over the past 4 weeks. Sleep Problem Index II was used in this study. Higher scores indicate more sleep problems||0 to 100|
|HUI2/HUI3||Comprises the minimum number of questions (40 items) required to classify the health status of a broad range of participants (age 5+). Recall period for each item is the previous 4 weeks. Data on population norms are available for HUI2 and HUI3||0 (death) to 1.00 (best possible health)|
|SDS||Patient-reported 3-item questionnaire. Assesses mental health-related functional impairment. In this study, for participants who selected “not applicable” for the work item, the mean value of their social and family items was substituted for the work item when deriving the total scale score||0 to 30|
|DDQ||Consists of 4 questions on missed work, late for work, bed disability and restricted activity days in the past 3 months due to GAD||0 to 92 days (each item)|
The Q-LES-Q(SF) total score was derived by summing scores from the first 14 Q-LES-Q(SF) items, each scored on a response scale ranging from 1 (very poor) to 5 (very good). The raw total score, which can range from 14 to 70, was then expressed as a percentage of the maximum (or % maximum) total score possible (ranging from 0–100) for ease of interpretation, with higher scores indicating greater enjoyment or satisfaction.
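The scoring rule described here can be sketched in a few lines. The function name is hypothetical, and the rescaling formula (raw score minus the minimum of 14, divided by the raw range of 56) is an assumption chosen because it is the linear transformation that maps the stated raw range of 14 to 70 onto the stated 0 to 100 range; it is not taken from the study's own scoring code.

```python
def qlesq_sf_percent_max(item_scores):
    """Return the Q-LES-Q(SF) % maximum score from the 14 scored items.

    item_scores: the first 14 item responses, each from 1 (very poor)
    to 5 (very good). The 2 additional items are not part of the score.
    """
    if len(item_scores) != 14:
        raise ValueError("expected exactly 14 scored items")
    if any(not 1 <= s <= 5 for s in item_scores):
        raise ValueError("each item must be between 1 and 5")

    raw_total = sum(item_scores)  # ranges from 14 to 70
    # Assumed rescaling: linear map of the raw range 14..70 onto 0..100.
    return (raw_total - 14) / (70 - 14) * 100


# A participant answering 3 on every item has a raw total of 42.
print(qlesq_sf_percent_max([3] * 14))
```

Under this assumption, a respondent answering 1 on every item scores 0 and a respondent answering 5 on every item scores 100.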
2.3. Statistical analyses
The eligible baseline sample (N = 296) was used for all baseline analyses. For 3- and 6-month analyses, those with at least 1 of these follow-up assessments were eligible for the respective analysis. Data analysis was conducted using SAS Version 9.1.3 (Copyright (c) 2002-2003 by SAS Institute Inc., Cary, NC, USA).
Analyses of internal consistency from the baseline, 3-month, and 6-month data of the Q-LES-Q(SF) using Cronbach’s alpha were conducted to ensure that the measure had strong internal consistency, where α >0.70 was indicative of a strong relationship among the measure’s items (Cronbach 1951).
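For readers unfamiliar with the statistic, the following is a minimal sketch of Cronbach's alpha using the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of the total score). The study itself used SAS; this Python function and its name are illustrative only, and the toy data are invented for the example.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    responses: list of rows, one per respondent, each a list of k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")

    items = list(zip(*responses))  # transpose rows into item columns
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Invented toy data: 4 respondents, 2 items.
toy = [[1, 2], [2, 1], [3, 3], [4, 4]]
alpha = cronbach_alpha(toy)
print(round(alpha, 3), "meets 0.70 threshold:", alpha > 0.70)
```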
Convergent validity was examined by constructing and reporting the appropriate correlation coefficient (Pearson for continuous variables or Spearman for ordinal variables) between the Q-LES-Q(SF) and the PHQ-8, GAD-Q-IV, HAM-A, SCL-90, PCS, MCS, BFI, MOS SS, HUI2, HUI3, SDS, and the DDQ using the baseline data, and again using the 3- and 6-month data (with the exception of the HAM-A correlations, which were calculated only at baseline and 3 months) to gauge the strength of the cross-sectional relationships as weak (|r|< 0.30), moderate (0.30 ≤|r|< 0.60), or strong (|r| ≥0.60) (Hinkle et al. 1988). Moderate to strong relationships were hypothesized between the Q-LES-Q(SF) and all tested measures, except sleep and work-related measures, given the extreme nature of sleep and work when considered within overall QOL. Correlations of change scores across time (3 months – baseline and 6 months – baseline) were also analyzed to provide further support for the stability of the scale properties, and were hypothesized to be approximately the magnitude of the product of the cross-sectional correlations at each time point.
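The strength categories above translate directly into a small classifier. The sketch below pairs it with a hand-written Pearson coefficient so that it needs no external packages; both function names are hypothetical, and Spearman correlation (used in the study for ordinal variables) is not shown.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_strength(r):
    """Classify |r| as weak (< 0.30), moderate (0.30 to < 0.60), or strong (>= 0.60)."""
    magnitude = abs(r)
    if magnitude < 0.30:
        return "weak"
    if magnitude < 0.60:
        return "moderate"
    return "strong"


# The sign is ignored: a correlation of -0.69 is classified by its magnitude.
print(correlation_strength(-0.69))
```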
Analysis of variance (ANOVA) tests comparing the mean values on the Q-LES-Q(SF) scores between the participant groups listed below were conducted to examine discriminant (known groups) validity. Both unadjusted and adjusted comparisons were conducted using age, gender, and baseline Q-LES-Q(SF) as covariates when 3- and 6-month data were used, and statistical significance was set at the P<.05 level. Mean scores were compared for:
Those with HAM-A scores ≤24 points and those with HAM-A scores >24 points (25–40) using the baseline scores (Matza et al. 2010).
Those with GAD-Q-IV scores ≥5.70 and those with scores below the 5.70 cutoff (Newman et al. 2002).
Those with total scores of ≥10 on the PHQ-8 and those with scores <10 (Kroenke et al. 2001).
Those with SDS total scores ≥5 and those with SDS scores <5 (Leon et al. 1997).
Those classified as asymptomatic (HAM-A ≤9); mild (HAM-A = 10–15), moderate (HAM-A = 16–24); or severe (HAM-A ≥25) using the baseline and 3-month data (Matza et al. 2010).
Those with PHQ-8 total scores in the categories of 0–4 (minimal), 5–9 (mild), 10–14 (moderate), 15–19 (moderately severe) and 20–24 (severe) using the baseline, 3- and 6-month data (Kroenke et al. 2001).
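The dichotomous cut points and the four HAM-A severity bands listed above can be encoded as follows. This is an illustrative sketch only: the dictionary, function names, and group labels in code are hypothetical, while the thresholds themselves are the ones stated in the list.

```python
# Dichotomous known-groups cut points as stated in the list above.
# Each predicate returns True for the more impaired group.
CUT_POINTS = {
    "HAM-A": lambda score: score > 24,
    "GAD-Q-IV": lambda score: score >= 5.70,
    "PHQ-8": lambda score: score >= 10,
    "SDS": lambda score: score >= 5,
}


def in_more_impaired_group(measure, score):
    """True if the score falls in the more impaired known group."""
    return CUT_POINTS[measure](score)


def ham_a_severity(score):
    """Four-level HAM-A severity band used for the refined comparison."""
    if score <= 9:
        return "asymptomatic"
    if score <= 15:
        return "mild"
    if score <= 24:
        return "moderate"
    return "severe"


print(in_more_impaired_group("PHQ-8", 11), ham_a_severity(17))
```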
Mean values for each of the groups were also compared to the Q-LES-Q(SF) norming values described in Schechter et al. (Schechter et al. 2007). In that investigation, controls were classified as never mentally ill (NMI); minor mental disorders only (MMD); currently not mentally ill (CNMI) with a history of mental illness that did not meet criteria for the MMD category; and currently mentally ill (CMI) with other than 1 specific phobia. Q-LES-Q(SF) mean scores were 81.8, 83.4, 78.4, and 72.7 for the NMI, MMD, CNMI, and CMI groups, respectively.
ANOVA tests were also conducted comparing the mean values on the Q-LES-Q(SF) change scores (3 months – baseline and 6 months – baseline) between the participant change groups listed below to assess responsiveness. Both unadjusted comparisons and adjusted comparisons were conducted using age, gender, and baseline Q-LES-Q(SF) as covariates when change scores were used, and statistical significance was set at the P<.05 level.
2.4. HAM-A change over time
Changes in anxiety were assessed using changes in HAM-A scores over the 3 months of the study. First, HAM-A responders (≥50% reduction in HAM-A scores at 3 months) were compared with HAM-A non-responders. Second, HAM-A remitters (HAM-A scores ≤7 at 3 months) were compared with HAM-A non-remitters.
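The two anchor definitions can be written as simple predicates. The function names are hypothetical, and the handling of a baseline score of 0 (treated here as "not a responder", because a percentage reduction is undefined) is an assumption that the text does not address.

```python
def is_ham_a_responder(baseline, month3):
    """Responder: at least a 50% reduction in HAM-A score at 3 months."""
    if baseline <= 0:
        # Assumption: percentage reduction is undefined for a zero baseline.
        return False
    return (baseline - month3) >= 0.5 * baseline


def is_ham_a_remitter(month3):
    """Remitter: HAM-A score of 7 or less at 3 months."""
    return month3 <= 7


# A drop from 20 to 10 is exactly a 50% reduction, so it counts as response,
# but a 3-month score of 10 is above the remission threshold.
print(is_ham_a_responder(20, 10), is_ham_a_remitter(10))
```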
2.5. PHQ-8 change over time
Changes in depression classification were assessed using changes in PHQ-8 levels over the 6 months of the study. Mean change scores were calculated for participants who were at the minimal (0–4) level at baseline and either remained minimal at 6 months or moved to mild (5–9), moderate (10–14), moderately severe (15–19), or severe (20–24) depression at 6 months. The same analysis was carried out for participants at the mild, moderate, moderately severe, and severe levels at baseline, with change over the 6-month period classified into 1 of the 5 PHQ-8 groupings (minimal, mild, moderate, moderately severe, and severe).
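One way to picture this analysis is as a 5 × 5 grid of baseline-by-6-month PHQ-8 categories, with a mean Q-LES-Q(SF) change score in each occupied cell. The sketch below is a hypothetical illustration: the function names and the record layout are assumptions, and the example records are invented rather than study data.

```python
from collections import defaultdict

# Upper bound of each PHQ-8 category, as given in the text.
PHQ8_BANDS = [
    (4, "minimal"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (24, "severe"),
]


def phq8_band(score):
    """Map a PHQ-8 total score (0 to 24) to its depression category."""
    if not 0 <= score <= 24:
        raise ValueError("PHQ-8 total must be between 0 and 24")
    for upper, label in PHQ8_BANDS:
        if score <= upper:
            return label


def mean_change_by_transition(records):
    """Mean Q-LES-Q(SF) change per (baseline band, 6-month band) cell.

    records: iterable of (phq8_baseline, phq8_month6, qlesq_change).
    """
    cells = defaultdict(list)
    for baseline, month6, change in records:
        cells[(phq8_band(baseline), phq8_band(month6))].append(change)
    return {cell: sum(vals) / len(vals) for cell, vals in cells.items()}


# Invented records for illustration only.
example = [(3, 2, -1.0), (3, 4, -2.0), (12, 3, 18.0)]
print(mean_change_by_transition(example))
```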
Of the 296 participants in this study, 72.3% were female, with a mean age of 47.6 years (Table 2). The majority of participants identified themselves as white (92.9%), had at least some college education (75.3%), and most were employed on a full- or part-time basis (42.9 and 13.5%, respectively). At baseline, PHQ-8 mean scores corresponded with moderate depression, averaging 11.0 (possible range: 0–24; higher scores indicate greater depression severity), which is near the median of 12.5 reported in another GAD patient population (Kroenke et al. 2007). HAM-A scores indicated moderate levels of anxiety, averaging 16.7 (possible range: 0–56; higher scores indicate greater anxiety severity), which was lower than the range of 22.6–25.8 reported by Endicott et al. (Endicott et al. 2007) and the average of 25.54 reported by Wyrwich et al. in another GAD patient population (Wyrwich et al. 2009).
Mean SDS scores (13.7; possible range: 0–30; higher scores indicate greater impairment) were close to those seen in a primary care sample of GAD patients with an SDS score exceeding the cutoff of 5 or greater, which is an indicator of increased risk of mental health impairment (mean = 13.95) (Leon et al. 1997), and lower than the range of 14.3–17.5 seen at baseline in 3 independent studies among GAD patients (Endicott et al. 2007). Additionally, SF-12v2 PCS scores averaged 45.0 (possible range: 0–100; scores below 50 represent below average physical health), and SF-12v2 MCS scores averaged 43.1 (possible range: 0–100; scores below 50 represent below average mental health).
|Characteristic||N = 296|
|Age, y, mean (SD)||47.6 (13.7)|
|Female n (%)||214 (72.3)|
|White n (%)||275 (92.9)|
|Education n (%)|
|Elementary/primary school||10 (3.4)|
|Secondary/high school||62 (20.9)|
|Some college||113 (38.2)|
|College degree||70 (23.6)|
|Postgraduate degree||40 (13.5)|
|Employment status n (%)|
|Employed, full time||127 (42.9)|
|Employed, part time||40 (13.5)|
|Baseline scores, mean (SD)|
|SF-12v2 – PCS||45.0 (10.3)|
|SF-12v2 – MCS||43.1 (8.3)|
|MOS SS (Sleep Quantity)||7.0 (1.8)|
|MOS SS (Sleep Problem Index II)||46.9 (19.3)|
|DDQ (missed work days)||5.7 (13.8)|
Reliability of the Q-LES-Q(SF) at baseline, 3, and 6 months was excellent, with Cronbach’s alphas of 0.88, 0.90, and 0.90, respectively. These reliability estimates 1) exceeded the recommended cutoff of 0.70; 2) demonstrated little change over time in the correlations of the Q-LES-Q(SF) items with each other; and 3) support the 1-factor structure of the Q-LES-Q(SF).
3.2.1. Construct validity
Using Pearson or Spearman correlations (as appropriate), convergent validity was examined between the Q-LES-Q(SF) and the PHQ-8, GAD-Q-IV, HAM-A, PCS, MCS, SDS, DDQ, MOS SS, SCL-90, BFI, HUI2, and HUI3 using the baseline data, and again using the 3- and 6-month data to gauge the strength and stability of the baseline relationships (Table 3). With the exception of correlations with the MOS SS at baseline, and the DDQ late for work at 3 months, all Q-LES-Q(SF) correlations were statistically significant at the P<.05 level. Measures of anxiety (GAD-Q-IV, HAM-A) demonstrated moderate correlations (with moderate defined as 0.30 < |r| < 0.60) with values that were fairly consistent at baseline, 3 months and 6 months. The Q-LES-Q(SF) was most highly correlated with the PHQ-8, a measure of depression, with correlations of –0.69 at baseline, –0.73 at 3 months, and –0.72 at 6 months. Measures of general health (PCS, MCS) had moderate correlations and fairly consistent values at baseline, 3 months, and 6 months. As expected, the MCS had a higher correlation with the Q-LES-Q(SF) compared with the PCS. Measures of disability (SDS, DDQ bed days and kept from usual activity days) showed moderate correlations and consistency at baseline, 3 months, and 6 months. Sleep quantity, as measured through 1 MOS SS item, demonstrated a low correlation with the Q-LES-Q(SF) at baseline (r = 0.08), 3 months (r = 0.17), and 6 months (r = 0.13); however, the Sleep Problem Index II (also measured through the MOS SS) demonstrated moderate correlations of –0.49, –0.56, and –0.50 with the Q-LES-Q(SF) at baseline, 3 months, and 6 months, respectively. Distress due to perceptions of bodily dysfunction (SCL-90) and fatigue (BFI) also demonstrated moderate correlations and fairly consistent values at baseline, 3 months, and 6 months. 
Interestingly, data for measures of health utility (HUI2, HUI3) showed that the HUI3 demonstrated strong correlations (r ≥0.60) with the Q-LES-Q(SF) at 3 and 6 months, while the HUI2 came close to, but did not achieve, this level of association with the Q-LES-Q(SF). Change score correlations supported the stability of the scale properties (data not shown), and slightly exceeded the hypothesized magnitude of the product of the cross-sectional correlations at each time point.
3.2.2. Known groups validity of the Q-LES-Q(SF)
Using a severe anxiety definition of HAM-A scores >24 (Matza et al. 2010), statistically significant (P<.001) differences in Q-LES-Q(SF) unadjusted mean scores between anxiety severity groups were observed at baseline and at 3 months (58.50 vs. 40.83 and 62.11 vs. 38.65, respectively), with higher Q-LES-Q(SF) mean scores observed in the less severe group (Table 4). Moreover, these differences between the mean scores were greater than 1 standard deviation (SD) at both time points. Analysis using adjusted scores demonstrated similar mean scores and differences between mean scores in these severity groups that were also greater than 1 SD.
|Measure||Baseline r||Baseline N||3 Months r||3 Months N||6 Months r||6 Months N|
|SF-12v2 – PCS||0.37||296||0.37||246||0.42||251|
|SF-12v2 – MCS||0.56||296||0.65||246||0.62||251|
|MOS SS (Sleep Quantity)||0.08||296||0.17||246||0.13||251|
|MOS SS (Sleep Problem Index II)||–0.49||296||–0.56||246||–0.50||251|
|DDQa-Late for work||–0.25||171||–0.07||143||–0.26||147|
|DDQa-Kept from usual activities||–0.36||296||–0.38||245||–0.52||250|
Another known groups comparison of Q-LES-Q(SF) mean scores employed a different anxiety measure, the GAD-Q-IV, and a cutoff score of 5.70, a value that had previously been demonstrated to have optimal sensitivity and specificity for identifying individuals with GAD (Newman et al. 2002). Mean adjusted and unadjusted Q-LES-Q(SF) scores were similar (Table 4), with those below the 5.70 threshold reporting higher Q-LES-Q(SF) unadjusted mean scores of 61.12, 64.53 and 64.56 at baseline, 3 months, and 6 months, respectively, while those with scores at or above the threshold had unadjusted mean scores of 49.93, 49.66, and 49.48 at the baseline, 3-month and 6-month periods, with all differences between the 2 groups at the 3 time points nearing, if not exceeding, the 1 SD threshold. All of these groups’ mean Q-LES-Q(SF) scores were much lower than those of the NMI, MMD, CNMI, and CMI norming subgroups (Schechter et al. 2007), indicating the important burden on overall QOL and satisfaction related to various areas of functioning among patients with GAD.
|Time||HAM-A ≤24: N, Mean (SD)||HAM-A >24: N, Mean (SD)||P Value(a)|
|Baseline||250, 58.50 (15.0)||46, 40.83 (16.4)||< .0001|
|3 month||216, 62.11 (15.5)||31, 38.65 (14.8)||< .0001|
|Time||GAD-Q-IV <5.70: N, Mean (SD)||GAD-Q-IV ≥5.70: N, Mean (SD)||P Value(a)|
|Baseline||154, 61.12 (14.4)||142, 49.93 (16.6)||< .0001|
|3 month||156, 64.53 (14.8)||90, 49.66 (17.3)||< .0001|
|6 month||162, 64.56 (14.6)||88, 49.48 (16.1)||< .0001|
|Time||PHQ-8 <10: N, Mean (SD)||PHQ-8 ≥10: N, Mean (SD)||P Value(a)|
|Baseline||132, 65.64 (11.9)||164, 47.79 (15.3)||< .0001|
|3 month||142, 67.54 (13.5)||105, 47.85 (15.3)||< .0001|
|6 month||159, 66.77 (13.1)||92, 46.22 (14.1)||< .0001|
|Time||SDS <5.0: N, Mean (SD)||SDS ≥5.0: N, Mean (SD)||P Value(a)|
|Baseline||40, 68.98 (15.4)||256, 53.68 (15.7)||< .0001|
|3 month||56, 72.75 (14.4)||190, 55.06 (16.0)||< .0001|
|6 month||65, 71.38 (13.1)||185, 54.98 (15.8)||< .0001|
Known groups analysis by severity group was also conducted using the PHQ-8. Using a minimal or mild depressive-state threshold of PHQ-8 scores <10, statistically significant differences in Q-LES-Q(SF) unadjusted mean scores between those with minimal/mild depression and those with more severe depression were observed at baseline, 3 months, and 6 months (65.64 vs. 47.79, 67.54 vs. 47.85, and 66.77 vs. 46.22, respectively; P<.0001 for all comparisons), with higher Q-LES-Q(SF) mean scores consistently observed in the minimal/mild group (Table 4). Differences in mean scores between those with minimal/mild depression and those with more severe depression were greater than 1 SD at all 3 time points. Adjusted scores were similar to unadjusted scores and demonstrated equal levels of statistical significance. Again, all of these groups’ mean Q-LES-Q(SF) scores were much lower than those found among any Q-LES-Q(SF) norming subgroups (Schechter et al. 2007).
Additional known groups validity of the Q-LES-Q(SF) was demonstrated when known groups based on disability status with an SDS cutoff score of 5.0 were evaluated. Subjects with SDS scores ≥5.0 were classified as at least moderately impaired, and had unadjusted mean Q-LES-Q(SF) scores of 53.68, 55.06, and 54.98 at the baseline, 3-month, and 6-month periods compared with subjects with less than moderate impairment, who had mean scores of 68.98, 72.75, and 71.38 for the same periods (Table 4). Differences in mean scores between those with moderate or greater impairment and those with less than moderate impairment exceeded 1 SD at the 3- and 6-month periods, with statistically significant differences at all 3 time points (P<.0001). Adjusted scores remained similar to the unadjusted scores, and demonstrated the same levels of statistical significance (P<.0001), and all of these groups’ mean Q-LES-Q(SF) scores were much lower than any control subgroups (Schechter et al. 2007).
Unadjusted Q-LES-Q(SF) mean scores by 4 more refined HAM-A score severity groups (Fig. 1) and PHQ-8 severity levels (Fig. 2) were also similar to adjusted Q-LES-Q(SF) mean scores, with clear separation (>1/2 SD) of the sample in each score group for each time point when compared with adjacent groups (with the exception of the moderately severe vs. severe PHQ-8 comparison at 3 months). Statistically significantly (P<.0001) lower Q-LES-Q(SF) scores were observed as anxiety symptom severity increased, with pairwise comparisons that were statistically significant at baseline and 3-month follow-up (all unadjusted P<.01; all adjusted P<.001).
3.2.3. Responsiveness of the Q-LES-Q(SF)
We conducted 3 different analyses to investigate the ability of the Q-LES-Q(SF) scores to detect important changes in this population of persons with GAD. T-tests compared responders with non-responders using 2 different criteria: HAM-A responders (patients who had at least a 50% reduction in their HAM-A scores between baseline and the 3-month follow-up), and HAM-A remitters (patients whose HAM-A scores decreased to levels at or below 7 points). In the HAM-A responders analysis, responders (n=229) had a mean level of improvement of 4.31 points on the Q-LES-Q(SF) over 3 months, compared with mean declines of 8.19 Q-LES-Q(SF) points for non-responders (n=16; P=.0011). HAM-A remitters (n=40) demonstrated a mean improvement of 10.85 points compared with the mean change of 2.06 points among non-remitters (n=205; P=.0006).
A third analysis tested the ability of the Q-LES-Q(SF) scores to detect a change over 6 months using change categories defined by PHQ-8 change scores (Table 5). The bolded values in this table represent the mean change scores for patients who remained in the same PHQ-8 depressive category over the course of the study; on average participants who remained in the minimal, mild, or moderate depression categories demonstrated very little change over time (≤2 points). For those who demonstrated a small improvement in depressive symptoms, as demonstrated by movement to a better adjacent category (italicized mean values), mean change levels ranged from 4.4 points (moderate
6-Month Depression Category (PHQ-8 Score): Mean (SD), N
|Baseline Depression Category (PHQ-8 Score)||Minimal (0–4)||Mild (5–9)||Moderate (10–14)||Moderately Severe (15–19)||Severe (20–24)|
|Minimal (0–4)||–1.6 (10.1), 19||–4.3 (7.3), 11||–51.0 (n/a), 1|
|Mild (5–9)||9.2 (13.7), 33||–1.5 (11.5), 35||–0.2 (14.4), 14||–23.0 (12.7), 2|
|Moderate (10–14)||18.4 (10.4), 11||4.4 (12.3), 26||–1.8 (14.6), 22||–21.3 (7.1), 4||–10.7 (5.0), 3|
|Moderately Severe (15–19)||20.8 (15.1), 5||16.0 (14.8), 13||7.8 (12.6), 12||–7.2 (16.0), 11||0.0 (12.7), 2|
|Severe (20–24)||39.3 (26.0), 4||36.0 (7.1), 2||11.0 (7.9), 3||10.8 (12.3), 12||3.7 (8.3), 6|
There is a critical need for psychometrically sound measures of mental health–related impairment (Leon et al. 1997). This study investigated the reliability, validity, responsiveness, and interpretation of the Q-LES-Q(SF) scores among the members of a group-model health care delivery system with GAD. Our findings strongly support the psychometric properties of the Q-LES-Q(SF) and give additional support for its use as a PRO in this mental health condition. Reliability was consistently robust, with Cronbach’s alpha at 0.88 or higher at all time points (baseline, 3 months, and 6 months). High correlations (r ≥0.60) with other measures of anxiety and depression (PHQ-8 and HAM-A) at all time points, as well as the HUI3 at 3 and 6 months and SDS at the 3-month time point, support the construct validity of the measure. Although the Q-LES-Q(SF) scores demonstrated high correlations with these mental health severity measures, it is also important to note that the Q-LES-Q(SF) is not redundant with them; that is, there is at most 53% shared variance (r = –0.73; r² = 0.53) with the PHQ-8 at the 3-month measurement. Moreover, moderate (0.30 < |r| < 0.60) correlations with other measures associated with GAD (GAD-Q-IV, SCL-90, SF-12v2 PCS and MCS, BFI, HUI2, and disability days for missed work, bed days or kept from usual activities) were demonstrated at 1 or more of the time points.
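The shared-variance figure quoted above is simply the squared correlation coefficient. A short check, using the PHQ-8 correlations reported in the results (the function name is hypothetical):

```python
def shared_variance(r):
    """Proportion of variance shared by two measures with correlation r."""
    return r * r


# PHQ-8 correlations with the Q-LES-Q(SF) reported at each time point.
for label, r in [("baseline", -0.69), ("3 months", -0.73), ("6 months", -0.72)]:
    print(f"{label}: r = {r:+.2f}, shared variance = {shared_variance(r):.0%}")
```

The largest value occurs at 3 months, which is why that time point sets the "at most 53%" bound.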
In the current study, the Q-LES-Q(SF) mean baseline score was 55.8, similar to the baseline mean score of 51.2 reported by participants across 3 GAD clinical trials (Wyrwich et al. 2009). A number of correlations between the Q-LES-Q(SF) and other outcomes detected in the current analysis are also consistent with previous studies, namely the low to moderate correlation with sleep measures, which mirrors that observed with the Pittsburgh Sleep Quality Index in other GAD samples (Wyrwich et al. 2009), and the moderate correlation between Q-LES-Q(SF) and HAM-A reported in 2 other studies (r = –0.45 (Endicott et al. 2007) and r = –0.36 (Wyrwich et al. 2009)).
Known groups validity analyses found an effect size difference of 1 or more SDs between relevant groups using standard thresholds for classifying persons with GAD with HAM-A, PHQ-8, GAD-Q-IV, and SDS scores. Moreover, most of the relevant dichotomous cut points for the HAM-A (cut point of 24), PHQ (cut point of 10) and GAD-Q-IV (cut point of 5.70) yielded mean scores at similar levels, where the group with better health had a mean score of about 60–70 points, and the group with worse health averaged in the 40- to 50-point range on the Q-LES-Q(SF). As seen earlier, all of these groups’ mean Q-LES-Q(SF) scores were much lower than those found among any of the relevant subgroups of normal controls investigated by Schechter et al. (Schechter et al. 2007). Additional analyses comparing the change scores for HAM-A remitters and responders over 3 months, and PHQ-8 improvements in depressive states classifications over 6 months, yielded mean change scores in a consistent range corresponding to the responder threshold level established for persons with GAD in prior treatment studies.
In prior work we determined that the mean Q-LES-Q(SF) score change was 6.80 in patients experiencing minimal improvement reported by their clinicians using the Clinical Global Impressions-Improvement of Illness at 8 weeks (Wyrwich et al. 2009). In this post-hoc analysis, mean Q-LES-Q(SF) changes for the HAM-A responders and remitters were 4.31 and 10.85 points over 3 months, respectively. Similarly, small but possibly important changes on the PHQ-8 categories yielded mean change levels that ranged from 4.4–10.8 points over 6 months of observation, and these additional analyses appear to support the 6.80 point responder threshold using the novel anchors available in these data.
In considering potential methodological limitations of this study, it should be noted that retrospective data were collected through an administrative claims database, which may be subject to bias due to the inability to ensure coding accuracy. The fact that the Q-LES-Q(SF) does not represent a GAD-specific patient-reported measure of QOL is another potential limitation. However, the validity of the Q-LES-Q has been convincingly demonstrated across psychiatric disorders (Endicott et al. 1993; Ritsner et al. 2005; Rossi et al. 2005; Endicott et al. 2006; Schechter et al. 2007; Mick et al. 2008; Revicki et al. 2008). Because this was an exploratory secondary data analysis, no method to control the family-wise type I error rate arising from the multiple planned comparisons was incorporated beyond Scheffé’s method used for overall comparisons. Nonetheless, P-values were consistently less than .01 for all significant comparisons reported, which reduces the likelihood that any differences were the result of type I error. Finally, questionnaires used in this study were administered via telephone. Previous psychiatric studies have successfully employed the telephone interview method for the administration of questionnaires (Larson et al. 2008; Kroenke et al. 2009; Simon et al. 2009). Cacciola et al. suggested caution in assuming comparability between telephone and in-person Structured Clinical Interview for DSM-III (SCID) diagnoses, based on telephone data collected from 41 college-aged men with very limited psychiatric diagnoses (Cacciola et al. 1999); however, subjects in the QUEST study were diagnosed with GAD by their physicians prior to study entry, and the SIGH-A has demonstrated strong reliability among patients with GAD (Shear et al. 2001).
Although our focus in this investigation was on the psychometric properties of the Q-LES-Q(SF), it is important to note that these results also demonstrate the significant impairment to psychological well-being, physical functioning, work productivity, and additional disability associated with GAD. Despite estimated prevalence rates for GAD ranging from 2.7–5.4% in the general population in the United States and Europe (Massion et al. 1993; Kessler et al. 1999; Wittchen et al. 2000; Kessler et al. 2001; Henning et al. 2007), diagnosis and subsequent treatment are often missed (Kessler et al. 2005; Ruscio et al. 2007). Given the impact of this condition on health status and overall QOL (Revicki et al. 2008), this post-hoc study shows that Q-LES-Q(SF) constitutes a short, focused and psychometrically sound PRO that complements a range of outcome measures evaluating symptoms associated with patients with GAD seeking treatment for this condition.
5. Conflicts of interest
This study was supported by a research grant from AstraZeneca Pharmaceuticals LP, Wilmington, Delaware, USA.
The authors would like to acknowledge the editorial assistance of Eleanor Bull, PhD and Anusha Bolonna, PhD (PAREXEL). Financial support for this assistance was provided by AstraZeneca Pharmaceuticals LP.