Design-based multiple linear regression for the mean cigarette price per pack.

## Abstract

We discuss statistical methods suitable for comparing multiple populations versus one reference population and consider two common problems: (1) detecting all significant mean differences and (2) demonstrating that all mean differences are significant. The methods discussed include the Bonferroni approach (both problems), the Min test (problem 2), and the Strassburger-Bretz-Hochberg (SBH) confidence interval for estimating the smallest mean difference (problem 2). We illustrate the methods using the pooled 2010–2015 Tobacco Use Supplement to the Current Population Survey (TUS-CPS) data on the cigarette purchase price (per pack) reported by adult daily smokers (n = 34,728). The goal was to show that among the seven considered racial/ethnic groups of daily smokers, non-Hispanic (NH) Whites paid the least for cigarettes (on average). We used design-based multiple linear regression to derive the estimates and raw p-values. The Min test supported the study goal. Likewise, the SBH lower 95% confidence bound was $0.08, indicating that each of the other racial/ethnic groups of daily smokers paid at least eight cents more for a pack of cigarettes (on average) than did non-Hispanic Whites. However, the Bonferroni method (which was originally proposed for problem 1) failed to support the study goal. The study highlights the importance of choosing the right statistical method for a given problem.

### Keywords

- balanced repeated replications
- complex survey
- multiple comparisons
- statistical multiple-testing problems

## 1. Introduction

In this chapter, we discuss statistical methods for comparing multiple populations relative to one population (termed “reference”). These types of multiple comparisons commonly arise in behavioral science, for example, when multiple racial/ethnic groups are compared to non-Hispanic (NH) White smokers in terms of tobacco-use-related behaviors [1, 2, 3, 4]. When the statistical parameter of interest is the mean difference, the most common study goal is one of the following two. **Goal 1** is to detect all significant mean differences among the considered populations (versus the reference population), that is, to draw an individual conclusion regarding the significance of each mean difference.

**Goal 2** is to demonstrate that all mean differences among the considered ones are significant. Note that if one assesses Goal 1 and concludes that each mean difference is significant, then one has (indirectly) assessed Goal 2 as well. Other more intricate study goals, such as the ones arising in pharmaceutical statistics which involve a hierarchical structure among the primary and secondary end points, were addressed elsewhere and are outside of the scope of this chapter [5, 6, 7, 8, 9, 10, 11].

We discuss how Goals 1 and 2 can be assessed in a study of racial and ethnic disparities in which the Hispanic (H) population and five non-Hispanic populations, namely American Indian/Alaska Native (AIAN), Asian (ASIAN), Black/African American (BAA), Hawaiian/Pacific Islander (HPI), and Multiracial (MULT), are compared to the non-Hispanic White (W) population in terms of the mean differences.

where

Finally, let

Suppose the overall error rate for assessing Goal 1 (Goal 2) is fixed at α-level. Then to assess Goal 1 (to detect all significant mean differences), we should first rescale each p-value

To assess Goal 2 (to demonstrate that all differences are significant), one can use the Min test that is an intersection-union test [16, 17, 18, 19, 20, 21]. The p-value for the Min test, denoted by

If

Alternatively to the Min test, we can use the Strassburger-Bretz-Hochberg (SBH) confidence interval approach as follows [23, 24]. First, we compute the lower

Then the SBH lower

We note that one needs to identify the appropriate statistical method to compute the individual p-values and confidence bounds. The choice depends on the study design, probability distributions, and other statistical considerations. The Min test and the SBH interval have been discussed for parallel and factorial designs, where sample mean responses followed normal distributions with known variances or an unknown (common) variance, as well as binomial and several other distributions [20, 21, 23, 24, 25, 26, 27]. In addition, one needs to decide whether the analyses should adjust for explanatory factors, for example, sociodemographic characteristics [28, 29, 30]. Such adjustments may help reduce the effect of confounding factors and therefore improve estimation [31, 32]. For example, Golden et al. used data from the 2010–2011 Tobacco Use Supplement to the Current Population Survey (TUS-CPS) to examine how much smokers pay for a pack of cigarettes, on average, in the United States [1]. Among the several design-based multiple linear regression models for the mean purchase price per pack (PPP) used in that study, one model adjusted for smokers’ sociodemographic and smoking-related characteristics, cigarette purchase attributes, and the survey wave [1].

Despite their availability and benefits, the Min test and SBH interval have not received much attention in the behavioral sciences. We illustrate the benefits of these methods over the Bonferroni method and show how simple they are to apply. We consider a study of racial and ethnic disparities in cigarette purchase prices conducted to demonstrate that W daily smokers, on average, purchase cigarettes at lower prices than do AIAN, ASIAN, BAA, H, HPI, and MULT daily smokers in the United States. This goal was motivated by the results of a prior study revealing that BAA, H, and ASIAN/HPI (ASIAN and HPI combined) smokers paid higher PPP, on average, relative to W smokers in the United States in the period from 2010 to 2011 [1].

## 2. Methods and results

### 2.1 Using data to derive the p-values and lower confidence interval bounds

We used the pooled 2010–2011 and 2014–2015 TUS-CPS data for adult daily smokers (n = 34,728) who reported the price of the last self-purchased pack or carton of cigarettes. The reported prices were used to compute the (average) PPP. The overall cohort was representative of about 23,370,261 adult daily smokers, where 12% were 18–24 years old, 38% were 25–44 years old, and 50% were 45+ years old, and 54% were men and 47% were women. The racial/ethnic representation was as follows: 76% were W, 11% were BAA, 8% were H, 2% were MULT, 2% were ASIAN, 1% were AIAN, and less than 1% were HPI. All racial/ethnic groups were well represented in the sample: the smallest number of respondents (96) corresponded to HPI daily smokers. Additional sample characteristics have been described in a prior study of purchasing cigarettes on Indian reservations [33].

We fixed the overall error rate at α = 5% and fitted a design-based multiple linear regression (R² ≈ 30%, F(25, 160) ≈ 257, p < 0.0001) to model the mean PPP as a function of daily smokers’ characteristics, location of the purchase (on/off Indian reservation), survey mode (phone, in-person), and survey period (2010–2011, 2014–2015). The daily smokers’ characteristics included race/ethnicity, age, sex, marital status, education, employment record, region of residency (West, South, Midwest, and Northeast), metropolitan area of residency (metro, nonmetro), and heavy smoking indicator. The analysis incorporated statistical methods recommended in the methodological guidelines for analysis of the CPS and CPS supplements [34, 35]. Specifically, because the CPS incorporates complex sampling, we estimated variance using balanced repeated replications [36]. The main and 160 replicate weights for this approach have been made available for public use by the U.S. Census Bureau [34, 35]. The analysis was performed using SAS® 9.4 software [37]; the SAS® 9.4 Survey Package procedures suitable for analysis of TUS-CPS have been discussed elsewhere [38]. Table 1 presents the estimated model coefficients and their standard errors for all covariates. As shown in Table 1, smokers’ sex and survey mode (phone, in-person) were not significant.
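The replicate-weight variance estimation described above can be sketched in a few lines. This is a minimal illustration, not the SAS implementation used in the study: the function name `brr_variance`, the `fay_rho` parameter, and the toy replicate estimates are all assumptions made for the example. With `fay_rho = 0` the function gives the standard balanced repeated replication estimator; whether a Fay-type adjustment applies to a given set of replicate weights should be checked against the survey documentation [34, 35].

```python
from typing import Sequence


def brr_variance(full_estimate: float,
                 replicate_estimates: Sequence[float],
                 fay_rho: float = 0.0) -> float:
    """Replicate-based variance of a survey estimate.

    full_estimate: estimate computed with the main survey weight.
    replicate_estimates: the same estimate recomputed once per replicate weight.
    fay_rho: Fay coefficient (0 gives standard BRR); hypothetical parameter name.
    """
    if not 0.0 <= fay_rho < 1.0:
        raise ValueError("fay_rho must be in [0, 1)")
    n_reps = len(replicate_estimates)
    if n_reps == 0:
        raise ValueError("at least one replicate estimate is required")
    # Sum of squared deviations of replicate estimates from the full-sample estimate.
    squared_devs = sum((r - full_estimate) ** 2 for r in replicate_estimates)
    return squared_devs / (n_reps * (1.0 - fay_rho) ** 2)


# Toy numbers for illustration only (not TUS-CPS results).
toy_full = 1.0
toy_reps = [1.1, 0.9, 1.1, 0.9]
toy_se = brr_variance(toy_full, toy_reps) ** 0.5
print(f"toy standard error = {toy_se:.3f}")
```

In an analysis like the one described here, each regression coefficient would be re-estimated under each of the 160 replicate weights, and the resulting 160 values would be passed as `replicate_estimates`.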

Factor | Estimated coefficient | Standard error | p-Value* |
---|---|---|---|
Intercept | 3.64 | 0.12 | * |
Race/ethnicity (reference group is W) | | | |
AIAN versus W | 0.61 | 0.13 | * |
ASIAN versus W | 0.62 | 0.09 | * |
BAA versus W | 0.51 | 0.04 | * |
H versus W | 0.61 | 0.06 | * |
HPI versus W | 0.83 | 0.22 | 0.0002 |
MULT versus W | 0.22 | 0.08 | 0.0087 |
Age (reference group is 45+ years old) | | | |
18–24 years old versus 45+ years old | 0.19 | 0.04 | * |
25–44 years old versus 45+ years old | 0.20 | 0.02 | * |
Female versus male | 0.00 | 0.02 | 0.9052 |
Marital status (reference group is widowed/divorced/separated) | | | |
Married (living with a spouse) | 0.02 | 0.02 | 0.4878 |
Never married | 0.22 | 0.03 | * |
Education (reference group is some college/Bachelor’s degree) | | | |
Graduate degree | 0.11 | 0.08 | 0.1632 |
High school/equivalent | −0.14 | 0.02 | * |
Less than high school | −0.15 | 0.03 | * |
Employment (reference group is unemployed) | | | |
Employed (at work or absent) versus unemployed | 0.16 | 0.03 | * |
Not in labor force versus unemployed | −0.15 | 0.04 | * |
Purchase on Indian reservation (reference group is “yes”) | | | |
No versus yes | 1.57 | 0.12 | * |
Midwest versus West | 0.09 | 0.03 | 0.0091 |
Northeast versus West | 1.75 | 0.05 | * |
South versus West | −0.62 | 0.03 | * |
Metropolitan area versus nonmetropolitan area | 0.32 | 0.03 | * |
Heavy (20+ cigarettes per day) versus non-heavy smoker | −0.20 | 0.02 | * |
Personal interview versus phone interview | −0.01 | 0.02 | 0.5205 |
2010–2011 versus 2014–2015 | −0.38 | 0.02 | * |

Table 1 presents the individual p-values for comparisons of racial/ethnic populations of daily smokers versus W daily smokers (based on the model):

where

Figure 1 depicts the lower bounds, which we obtained from the *proc surveyreg* procedure with *lsmestimate* statements (with “cl,” “e,” “upper,” and “alpha = 0.05” options) when fitting the model using SAS software. Alternatively, we could use the *lsmeans* statement (with “adj = bon,” “cl,” and “alpha = 0.1” options), select the comparisons of interest out of all 21 pair-wise comparisons reported, and note the lower bound of the two-sided 90% confidence interval reported in the output.
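For readers without access to SAS, the individual one-sided lower bounds can be approximated directly from the estimates and standard errors in Table 1. The sketch below is only an approximation under stated assumptions: it uses a standard normal quantile in place of the reference distribution that the survey procedure would use, and it works from the rounded values printed in Table 1, so its results may differ from the reported bounds in the second decimal place. It computes the individual bounds only and does not implement the SBH rule itself.

```python
from statistics import NormalDist

ALPHA = 0.05

# (estimated coefficient, standard error) for each group versus W, from Table 1.
estimates = {
    "AIAN": (0.61, 0.13),
    "ASIAN": (0.62, 0.09),
    "BAA": (0.51, 0.04),
    "H": (0.61, 0.06),
    "HPI": (0.83, 0.22),
    "MULT": (0.22, 0.08),
}

# Assumption: normal approximation to the reference distribution.
z = NormalDist().inv_cdf(1.0 - ALPHA)

# One-sided lower (1 - alpha) confidence bound for each mean difference.
lower_bounds = {g: est - z * se for g, (est, se) in estimates.items()}

# Group whose lower bound is the smallest.
smallest_group = min(lower_bounds, key=lower_bounds.get)

for group, bound in lower_bounds.items():
    print(f"{group:5s} approximate lower 95% bound = {bound:.2f}")
print(f"Smallest lower bound belongs to: {smallest_group}")
```

Under these approximations every bound is positive and the smallest one belongs to the MULT versus W comparison, which is also the comparison with the largest raw p-value in Table 1.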

### 2.2 Demonstrating the study goal via the min test and SBH confidence interval

The p-value for the Min test is

If instead of the Min test we used the Bonferroni approach, then the adjusted p-values would be less than 0.0006 for four comparisons (AIAN versus W, ASIAN versus W, BAA versus W, and H versus W), 0.0012 for one comparison (HPI versus W), and 0.0522 for one comparison (MULT versus W). Therefore, we would conclude that only AIAN, ASIAN, BAA, H, and HPI daily smokers pay higher PPP, on average, than do W daily smokers; and would fail to demonstrate that all six considered racial/ethnic groups of daily smokers pay higher PPP, on average, relative to W daily smokers.
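The contrast between the two approaches comes down to simple arithmetic on the raw p-values in Table 1, sketched below. Table 1 reports four of the p-values only as “*”; the sketch assumes, consistent with the adjusted values quoted above, that each of them is below 0.0001 and uses 0.0001 as an upper bound. The variable names are illustrative.

```python
ALPHA = 0.05

# Raw p-values for each group versus W (Table 1).
# Assumption: entries shown as "*" are replaced by the upper bound 0.0001.
raw_p = {
    "AIAN": 0.0001,
    "ASIAN": 0.0001,
    "BAA": 0.0001,
    "H": 0.0001,
    "HPI": 0.0002,
    "MULT": 0.0087,
}

n_comparisons = len(raw_p)

# Bonferroni: multiply each raw p-value by the number of comparisons (capped at 1).
bonferroni_p = {g: min(1.0, n_comparisons * p) for g, p in raw_p.items()}
bonferroni_significant = [g for g, p in bonferroni_p.items() if p < ALPHA]

# Min test (intersection-union): the p-value is the largest raw p-value,
# compared with alpha without any adjustment.
min_test_p = max(raw_p.values())
all_differences_significant = min_test_p < ALPHA

for group in raw_p:
    print(f"{group:5s} raw = {raw_p[group]:.4f}  Bonferroni = {bonferroni_p[group]:.4f}")
print(f"Min test p-value = {min_test_p:.4f}; all significant: {all_differences_significant}")
```

The only comparison on which the two methods disagree is MULT versus W: its raw p-value is below α, but six times that value is not.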

## 3. Discussion

The choice of the reference group as “W daily smokers” was based on the study goal and prior studies of cigarette purchasing behaviors of smokers [1, 33]. The choice of the reference group, as well as the statistical methods, should always align with the study goal and should be made prior to the data analysis. Specifically, when examining racial/ethnic disparities, using “W” as the reference group could be appropriate in some studies but not in others. For example, if the study goal is to show that purchasing cigarettes on Indian reservations is most prevalent among AIAN smokers, then “AIAN smokers” should be chosen as the reference group. In addition, while both the Bonferroni method and the Min test are simple to use, only the Bonferroni method yields individual conclusions regarding each comparison. However, the Bonferroni method is less powerful than the Min test when applied to an intersection-union problem (to assess Goal 2) [6, 12].

The study indicated that W daily smokers paid significantly less for cigarettes, on average, than the other six racial/ethnic groups of daily smokers in the United States in the period from 2010–2011 to 2014–2015. The earlier reported finding (see model 6 in [1]) was that non-Hispanic White smokers, on average, paid significantly less for cigarettes than did BAA, AIAN, ASIAN/HPI (combined), and H smokers, and paid prices similar to those paid by “other non-Hispanic” smokers [1]. While the results might seem to disagree, direct comparisons between these two findings are problematic, because the studies concerned different populations of smokers (daily smokers in our study, and daily and occasional smokers in the prior study) and different time periods (2010–2011 and 2014–2015 combined in our study, and 2010–2011 in the prior study). Moreover, the authors of the prior study considered the union-intersection problem, which is conceptually different from the intersection-union problem addressed in our study, although they did not mention which method, if any, they used to adjust for multiple comparisons [1].

Our study has several potential limitations. First, we considered the population of daily smokers, and thus the results should not be generalized to other populations of smokers, such as occasional smokers. Indeed, daily and occasional smokers have very different cigarette purchasing behaviors. For example, daily smokers are more likely to purchase cigarettes in cartons rather than packs and to travel to another state or to Indian reservations to purchase cigarettes at lower prices [1, 39, 40]. Second, the analysis was based on a particular regression model in which the mean PPP was modeled as a function of smokers’ characteristics, location of the purchase, survey mode, and survey period. Another model could potentially lead to a different conclusion. For example, only two out of six models indicated a significantly higher mean PPP for AIAN smokers relative to W smokers [1]. Another potential limitation is the lack of a theoretical proof that the SBH interval for the smallest mean PPP difference indeed has a confidence level of

Future research may target the development and implementation of software procedures for the Min test and SBH interval. The software packages developed for the analysis of complex survey data currently offer just a few multiple comparison methods. For example, the SAS Survey Package offers a built-in procedure for Bonferroni adjustments but lacks procedures for the Min test (multiple testing) and the SBH interval (interval estimation). Availability of the “Min test” and “SBH interval” procedures would enable researchers to incorporate these methods directly in their analyses of complex survey data.

## 4. Conclusion

In our study, the results of the Min test (and SBH interval) differed from the results of the Bonferroni method. Specifically, using the Min test (and SBH interval), we demonstrated that all six racial/ethnic groups of daily smokers paid, on average, higher PPP relative to W daily smokers in the United States in the period from 2010–2011 to 2014–2015. However, using the Bonferroni method, we failed to demonstrate this claim. This discrepancy highlights the importance of choosing the appropriate statistical method for assessing the minimum among multiple mean differences (relative to one reference population). Availability of the “Min test” and “SBH interval” procedures in survey packages would help facilitate application of these methods in behavioral research.

## Acknowledgments

The authors are thankful to Richard Pack, B.S., for providing editing comments.

## Funding

Research reported in this publication was supported by the National Institute on Minority Health and Health Disparities of the National Institutes of Health under Award Number R01MD009718. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.