Meaning of Each Level of Recommendation Score
1. Introduction
Net Promoter Score (NPS) is a popular metric used in a variety of industries for measuring customer advocacy. Introduced by Reichheld (2003), NPS measures the likelihood that an existing customer will recommend a company to another prospective customer. NPS is derived from a single question that may be included as part of a larger customer survey. The question asks the customer to use a scale of 0 to 10 to rate their willingness and intention to recommend the company to another person. Ratings of 9 and 10 characterize so-called ‘promoters,’ ratings of 0 through 6 characterize ‘detractors,’ and ratings of 7 and 8 characterize ‘passives.’ The NPS is calculated as the percentage of respondents that are promoters minus the percentage of respondents that are detractors.
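The standard calculation just described can be sketched in a few lines of Python (a minimal illustration, not code from this chapter; the ratings shown are made up):

```python
# Sketch: computing NPS from 0-10 ratings, using the standard definition above.
def nps(ratings):
    """Return the Net Promoter Score, in percentage points, for 0-10 ratings."""
    n = len(ratings)
    promoters = sum(1 for r in ratings if r >= 9)    # ratings of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)   # ratings of 0 through 6
    return 100.0 * (promoters - detractors) / n

# Example: 4 promoters, 3 passives, 3 detractors out of 10 respondents.
scores = [10, 9, 9, 10, 8, 7, 7, 6, 4, 2]
print(nps(scores))  # 10.0
```

With a reduced 5-point scale, as in the case study below, the promoter and detractor cutoffs would be redefined accordingly.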
The idea behind the labels given to customers is as follows. Promoters are thought to be extremely satisfied customers who see little to no room for improvement, and consequently would offer persuasive recommendations that could lead to new revenue. The passive ratings, on the other hand, begin to hint at room for improvement, and consequently the effectiveness of a recommendation from a passive may be muted by explicit or implied caveats. Ratings at the low end are thought to be associated with negative experiences that might cloud a recommendation and likely scare off prospective new customers. Additional discussion of the long history of NPS can be found in Hayes (2008).
Some implementations of NPS methodology use reduced 5-point or 7-point scales that align with traditional Likert scales. However it is implemented, the hope is that movements in NPS are positively correlated with revenue growth for the company. While Reichheld’s research presented some evidence of that, other findings are not as corroborative (Keiningham et al., 2007). Regardless of whether there is a predictive relationship between NPS and revenue growth, implementing policies and programs within a company that improve NPS is an intuitively sensible thing to do [see, for example, Vavra (1997)]. A difficult and important question, however, is how to identify key drivers of NPS.
This chapter is an illustrative tutorial that demonstrates how a statistical classification model can be used to identify key drivers of NPS. Our premise is that the classification model, the data it operates on, and the analyses it provides could usefully form components of a Decision Support System that can not only provide both snapshot and longitudinal analyses of NPS performance, but also enable analyses that can help suggest company initiatives aimed toward lifting the NPS.
We assume that the NPS question was asked as part of a larger survey that also probed customer satisfaction levels with respect to various dimensions of the company’s services. We develop a predictive classification model for customer advocacy (promoter, passive or detractor) as a function of these service dimensions. A novelty associated with our classification model is the optional use of constraints on the parameter estimates to enforce a monotonic property. We provide a detailed explanation of how to fit the model using the SAS software package and show how the fitted model can be used to develop company policies that have promise for improving the NPS. Our primary objective is to teach an interested practitioner how to use customer survey data together with a statistical classifier to identify key drivers of NPS. We present a case study based on a real-life data collection and analysis project to illustrate the step-by-step process of building the linkage between customer satisfaction data and NPS.
2. Logistic regression
In this section we provide a brief review of logistic and multinomial regression. Allen and Rao (2000) is a good reference that contains more detail than we provide, and additionally has example applications pertaining to customer satisfaction modeling.
2.1. Binomial logistic regression
The binomial logistic regression model assumes that the response variable is binary (0/1). This could be the case, for example, if a customer is simply asked the question “Would you recommend us to a friend?” Let Y_i denote the 0/1 response of customer i, and let x_i denote the corresponding vector of covariates.
The binomial logistic regression model posits that P(Y_i = 1) = exp(α + x_i′β) / [1 + exp(α + x_i′β)], so that the log-odds of a favorable response, log[P(Y_i = 1)/P(Y_i = 0)] = α + x_i′β, is linear in the covariates.
Model fitting for the binomial logistic regression model entails estimating the parameters α and β by maximizing the likelihood function L(α, β) = ∏_i p_i^{Y_i} (1 − p_i)^{1 − Y_i}, where p_i = P(Y_i = 1),
and the maximum likelihood estimate (MLE) of (α, β) has no closed form; it is computed numerically, typically with the Newton-Raphson algorithm.
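The numerical maximization can be made concrete with a short sketch (Python with NumPy, not from the chapter; the data are synthetic with known parameters α = −0.5, β = 1.2):

```python
import numpy as np

# Minimal sketch: Newton-Raphson maximum likelihood for the binomial logistic
# model log[p/(1-p)] = alpha + beta*x, fitted to synthetic data.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
p_true = 1.0 / (1.0 + np.exp(-(-0.5 + 1.2 * x)))   # alpha = -0.5, beta = 1.2
y = rng.binomial(1, p_true)

X = np.column_stack([np.ones_like(x), x])          # design matrix [1, x]
theta = np.zeros(2)                                # starting values (alpha, beta)
for _ in range(25):                                # Newton-Raphson iterations
    p = 1.0 / (1.0 + np.exp(-(X @ theta)))
    grad = X.T @ (y - p)                           # score vector
    hess = X.T @ (X * (p * (1 - p))[:, None])      # observed information
    theta = theta + np.linalg.solve(hess, grad)

alpha_hat, beta_hat = theta
print(alpha_hat, beta_hat)
```

The iteration converges in a handful of steps for well-behaved data, which is essentially what statistical packages such as SAS PROC LOGISTIC do internally.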
2.2. Multinomial logistic regression
Suppose now that the outcome variable has more than two categories. For example, suppose the responses Y_i take one of the ordered values j = 1, …, J. A natural extension for ordered categories, and the form used later in this chapter, is the cumulative logit model, which posits that P(Y_i ≤ j) = exp(α_j + x_i′β) / [1 + exp(α_j + x_i′β)] for j = 1, …, J − 1,
where α_1 ≤ α_2 ≤ ⋯ ≤ α_{J−1} are category-specific intercepts and β is a vector of slopes common to all categories; the individual category probabilities follow by differencing, P(Y_i = j) = P(Y_i ≤ j) − P(Y_i ≤ j − 1),
and the MLE of (α_1, …, α_{J−1}, β) is again obtained by numerically maximizing the corresponding multinomial likelihood.
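The differencing step can be sketched as follows (Python with NumPy; the intercepts and linear predictor are hypothetical values chosen only to satisfy the ordering of the α_j, matching the parameterization used in the chapter's SAS code in Appendix B):

```python
import numpy as np

# Sketch: category probabilities from cumulative logits.
def category_probs(alphas, xb):
    """alphas: increasing intercepts alpha_1 <= ... <= alpha_{J-1}; xb = x'beta."""
    cum = 1.0 / (1.0 + np.exp(-(np.asarray(alphas, dtype=float) + xb)))  # P(Y <= j)
    cum = np.concatenate([[0.0], cum, [1.0]])
    return np.diff(cum)  # P(Y = j) for j = 1..J

# Hypothetical intercepts for J = 5 categories and a linear predictor of 1.0.
probs = category_probs([-4.0, -2.0, 0.0, 1.5], xb=1.0)
print(probs)
```

Because the intercepts are increasing and the slope vector is shared across categories, the differences are guaranteed to be valid (nonnegative) probabilities.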
3. Case study
Carestream Health, Inc. (CSH) was formed in 2007 when Onex Corporation of Toronto, Canada, purchased Eastman Kodak Company’s Health Group and renamed the business Carestream Health. CSH is a $2.5B-per-year company and a world leader in medical imaging (digital and film), healthcare information systems, dental imaging and dental practice management software, molecular imaging, and nondestructive testing. Its customers include medical and dental doctors and staff and healthcare IT professionals, in settings ranging from small offices and clinics to large hospitals and regional and national healthcare programs. A major company initiative is to create a sustainable competitive advantage by delivering the absolute best customer experience in the industry. Customer recommendations are key to growth in the digital medical space, and no competitor has managed to earn them consistently well. The foundation for taking advantage of this opportunity is to understand what is important to customers, measure their satisfaction and likelihood to recommend based on their experiences, and drive improvement.
While descriptive statistics such as trend charts, bar charts, averages and listings of customer verbatim comments are helpful in identifying opportunities to improve the Net Promoter Score (NPS), they are limited in their power. First, they lack quantitative measurements of the correlation between elements of event satisfaction and NPS. As a consequence, it is not clear what impact a given process improvement will have on a customer’s likelihood to recommend. Second, they cannot reveal multidimensional relationships: they are limited to single-factor inferences, which may not sufficiently describe the complex relationships between the elements of a customer’s experience and their likelihood to recommend.
This section summarizes the use of multinomial logistic regression analyses applied to 5056 independent customer experience surveys from Jan 2009 – Jan 2010. Each survey included a question that measured (on a 5-point Likert scale) how likely the customer would be to recommend that colleagues purchase imaging solutions from CSH. Five other questions measured the satisfaction level (on a 7-point Likert scale) of the customer with CSH services obtained in response to an equipment or software problem. Key NPS drivers are revealed through the multinomial logistic regression analyses, and improvement scenarios for specific geographic and business combinations are mapped out. The ability to develop a quantitative model measuring the impact on NPS of potential process improvements significantly enhances the value of the survey data.
3.1. CSH customer survey data
The 5-point Likert response to the question about willingness to recommend is summarized in Table 1 below. CSH calculates a company-specific net promoter score from the distribution of responses on this variable.
Recommendation | Interpretation
1 | Without being asked, I will advise others NOT to purchase from you
2 | Only if asked, I will advise others NOT to purchase from you
3 | I am neutral
4 | Only if asked, I will recommend others TO purchase from you
5 | Without being asked, I will recommend others TO purchase from you
Let Y denote a customer’s response to the recommendation question, taking the values 1 through 5 described in Table 1.
The demographic covariates include the (global) region code, country code, business code and the customer job title. The demographic covariates are coded using the standard dummy variable coding technique. For example, region code utilizes 7 binary variables, one indicating membership in each region.
Country code utilizes similar dummy variables, but because countries are nested within regions, the country dummy variables are indexed within region.
In the data set we have 23 countries represented across the regions.
Finally, job title utilizes 10 dummy variables, one for each job title category.
The customer satisfaction covariates are also coded using the dummy variable scheme. The data on these covariates are the responses to the survey questions identified as q79, q82a, q82b, q82d and q82f. These questions survey the customer satisfaction on ‘Overall satisfaction with the service event,’ ‘Satisfaction with CSH knowledge of customer business and operations,’ ‘Satisfaction with meeting customer service response time requirements,’ ‘Satisfaction with overall service communications,’ and ‘Satisfaction with skills of CSH employees,’ respectively.
Survey questions q82c and q82e, which survey satisfaction with ‘Time it took to resolve the problem once work was started,’ and ‘Attitude of CSH employees,’ were also considered as covariates, but they were not statistically significant in the model. Their absence from the model does not necessarily imply they are unimportant drivers of overall satisfaction with CSH, but more likely that their influence is correlated with the other dimensions of overall satisfaction that are in the model. Each customer satisfaction covariate is scored by customers using a 7-point Likert scale (where ‘1’ indicates the customer is “extremely dissatisfied” and ‘7’ indicates “extremely satisfied”), and thus each utilizes 7 dummy variables in the coding scheme, one indicating each response level.
Assembling all of the covariates together, we then have a total of 77 covariates in the model.
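The dummy coding scheme described above can be sketched simply (an illustration in Python, not part of the original analysis; the example customer and question are hypothetical):

```python
# Sketch: non-full-rank dummy coding, one indicator per level, as used in the
# chapter's initial PROC LOGISTIC fit (param=glm).
def dummy_code(value, levels):
    """Return one 0/1 indicator per level; `levels` is the ordered list of
    possible values for the covariate."""
    return [1 if value == lv else 0 for lv in levels]

# A hypothetical customer answering '5' on a 7-point satisfaction question:
print(dummy_code(5, levels=[1, 2, 3, 4, 5, 6, 7]))  # [0, 0, 0, 0, 1, 0, 0]
```

Applying this to each demographic and satisfaction covariate and concatenating the indicators yields the full covariate vector for each customer.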
3.2. Model fitting and interpretation
The SAS code for obtaining maximum likelihood estimates (MLEs) for the model parameters is given in Appendix A, and the resulting estimates are displayed in Table 2.
[Table 2. Maximum likelihood estimates of the intercept and slope parameters. The table is not reproduced here: the parameter labels and the signs of the estimates are not recoverable from the extracted text.]
The section of the PROC LOGISTIC output entitled ‘Type 3 Analysis of Effects’ characterizes the statistical significance of the covariates through p-values obtained by referencing a Wald chi-square test statistic to a corresponding null chi-square distribution. Table 3 shows the chi-square tests and the corresponding p-values; all covariate groups are highly significant contributors to the model.
One way to assess model adequacy for multinomial logistic regression is to use the model to predict Y and then examine how well the predicted values match the true values of Y. Since the output of the model for each customer is an estimated probability distribution for Y, a natural predictor of Y is the mode of this distribution. We note that this predictor assigns equal cost to all forms of prediction errors. More elaborate predictors could be derived by assuming a more complex cost model where, for example, the cost of predicting 5 when the actual value is 1 is higher than the cost of predicting 5 when the actual value is 4. Table 4, the so-called confusion matrix of the predictions, displays the cross classification of all 5056 customers based on their actual value of Y and the model-predicted value of Y.
Table 3. Type 3 analysis of effects.

Covariate Group | Degrees of Freedom | Wald Statistic | p-value
RC   |  6 | 41.2 | < .0001
CC   | 16 | 40.9 | < .01
BC   |  1 |  7.9 | < .01
JT   |  9 | 43.7 | < .0001
q79  |  6 | 84.8 | < .0001
q82a |  6 | 56.5 | < .0001
q82b |  6 | 34.4 | < .0001
q82d |  6 | 34.8 | < .0001
q82f |  6 | 39.9 | < .0001
Table 4. Confusion matrix of actual versus model-predicted Y.

Actual Y | Predicted Y = 1 | 2 | 3 | 4 | 5 | Total
1     | 3 |  2 |   7 |    4 |    4 |   20
2     | 3 |  8 |  48 |   22 |    7 |   88
3     | 2 |  3 | 342 |  486 |  133 |  966
4     | 0 |  0 | 126 | 1233 |  723 | 2082
5     | 0 |  0 |  39 |  705 | 1156 | 1900
Total | 8 | 13 | 562 | 2450 | 2023 | 5056
A perfect model would have a diagonal confusion matrix, indicating that the predicted value for each customer coincided identically with the true value. Consider the rows of Table 4 corresponding to Y=4 and Y=5. These two rows account for almost 80% of the customers in the sample. In both cases, the predicted value coincides with the actual value about 60% of the time. Neither of these two cases ever predicts Y=1 or Y=2, and only 4% of the time is Y=3 predicted. The mean values of the predicted Y when Y=4 and Y=5 are 4.28 and 4.59, respectively. The 7% positive bias for the case Y=4 is roughly offset by the 8.2% negative bias for the case Y=5.
Looking at the row of Table 4 corresponding to Y=3, we see that 86% of the time the predicted Y is within 1 of the actual Y. The mean value of the predicted Y is 3.77, indicating a 26% positive bias. Considering the rows corresponding to Y=1 and Y=2, where only about 2% of the customers reside, we see the model struggles to make accurate predictions, often overestimating the actual value of Y. A hint as to the explanation for the noticeable overestimation associated with the Y=1, Y=2 and Y=3 customers is revealed by examining their responses to the covariate questions. As just one example, the respective mean scores on question q79 (“Overall satisfaction with the service event”) are 3.8, 4.1 and 5.2. It seems a relatively large number of customers that give a low response to Y are inclined to simultaneously give favorable responses to the covariate questions on the survey. Although this might be unexpected, it can possibly be explained by the fact that the covariate questions are relevant to the most recent service event whereas Y is based on a customer’s cumulative experience.
Overall, Table 4 reflects significant lift afforded by the multinomial logistic regression model for predicting Y. By comparison, a model that utilized no covariate information would predict the same value (the marginal mode, Y=4) for every customer, so its confusion matrix would concentrate every row in a single column. In sum, we feel the accuracy of the model is sufficient to learn something about what drives customers to give high responses to Y, though perhaps not sufficient to learn as much about what drives customers to give low responses to Y.
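The summary rates quoted above can be recomputed directly from the counts in Table 4 (a plain Python/NumPy sketch, not part of the original analysis):

```python
import numpy as np

# The confusion matrix of Table 4 (rows: actual Y = 1..5, columns: predicted Y = 1..5).
cm = np.array([
    [3, 2,   7,    4,    4],
    [3, 8,  48,   22,    7],
    [2, 3, 342,  486,  133],
    [0, 0, 126, 1233,  723],
    [0, 0,  39,  705, 1156],
])
n = cm.sum()                 # total customers (5056)
accuracy = np.trace(cm) / n  # exact-match rate of the mode predictor
within_one = sum(cm[i, j] for i in range(5) for j in range(5) if abs(i - j) <= 1) / n
print(n, round(accuracy, 3), round(within_one, 3))
```

The exact-match rate is roughly 54%, while the prediction lands within one category of the truth about 96% of the time, which quantifies the lift over a no-covariate predictor.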
Figure 1 is a graphical display of the estimated slopes for each of the customer satisfaction covariates. The larger the coefficient value, the more detrimental the response level is to NPS; the y-axis is therefore labeled ‘demerits.’
In view of the ordinal nature of the customer satisfaction covariates, the slopes, which represent the effect of the Likert scale levels, should decrease monotonically. That is, the penalty for a ‘satisfied’ covariate value should be less than or equal to that of a ‘dissatisfied’ covariate value. As such, it would be logical to have the estimated values of the slopes display the monotone decreasing trend as the response level of the covariates ascends. Figure 1 shows that the unconstrained MLEs for the slopes associated with the customer satisfaction covariates nearly satisfy the desired monotone property, but not exactly. The aberrations are due to data deficiencies or minor model inadequacies and can be resolved by using a constrained logistic regression model introduced in the next section.
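A simple check of the monotone property can be written in a few lines (illustrative Python; the slope values below are hypothetical, with one aberration of the kind visible in Figure 1):

```python
# Sketch: check whether slope estimates decrease as the satisfaction level
# of an ordinal covariate ascends (the monotone property discussed in the text).
def is_monotone_decreasing(slopes, tol=0.0):
    """Return True if each slope is >= the next one, within tolerance `tol`."""
    return all(a >= b - tol for a, b in zip(slopes, slopes[1:]))

# Hypothetical unconstrained estimates with an aberration at the fourth level:
print(is_monotone_decreasing([2.0, 1.4, 0.8, 0.9, 0.2, 0.0]))  # False
# Hypothetical estimates satisfying the desired property:
print(is_monotone_decreasing([2.0, 1.4, 0.9, 0.8, 0.2, 0.0]))  # True
```

Such a check makes it easy to flag which covariates violate the property before deciding whether to impose constraints.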
3.3. Constrained logistic regression
Consider the situation where the slopes associated with the levels of an ordinal covariate are required to satisfy the monotone property discussed above, so that the estimated demerit can never increase as the satisfaction level ascends. Maximum likelihood estimation subject to such order constraints can be carried out with SAS PROC NLP, as illustrated in Appendix B.
In order to simplify our use of PROC NLP, it is convenient to work with a full-rank parameterization of the logistic regression model. Because countries are nested within regions, a linear dependency exists between the dummy variables corresponding to regions and countries within regions. We can eliminate the linear dependency by removing region from the model and specifying country to be a non-nested factor. The result of this model reparameterization is that instead of 6 degrees of freedom in the model for regions and 16 degrees of freedom for countries nested within regions, we equivalently have 22 degrees of freedom for countries. For the same purpose, we also redefine the dummy variable coding used for the other categorical and ordinal covariates by using a full-rank parameterization scheme in which the last level of each covariate serves as the reference level, so that its slope is implicitly zero.
Beginning with line 12 in the SAS code, PROC NLP is used to derive the MLEs of the parameters under the constrained parameter space. The ‘max’ statement (line 13) indicates that the objective function is the log-likelihood function of the model and that it is to be maximized. The maximization is carried out using a Newton-Raphson algorithm, and the ‘parms’ statement (line 14) specifies initial values for the intercept and slope parameters. The SAS variables bq1-bq6, ba1-ba6, bb1-bb6, bd1-bd6 and bf1-bf6 hold the slopes for questions q79, q82a, q82b, q82d and q82f, respectively, and the ‘bounds’ and ‘lincon’ statements constrain these slopes to be nonnegative and monotone in the response level.
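The same idea can be sketched outside of SAS. The following Python illustration (entirely synthetic data, not the chapter's analysis) fits a small cumulative-logit model in which, instead of explicit ‘bounds’/‘lincon’ constraints, the slopes are reparameterized as cumulative sums of exponentiated free parameters so that monotonicity holds automatically:

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data: one ordinal covariate with 4 levels, response Y in {1, 2, 3}.
rng = np.random.default_rng(1)
n = 800
level = rng.integers(1, 5, size=n)
true_beta = np.array([2.0, 1.0, 0.4, 0.0])   # monotone "demerits", level 4 = 0
true_alpha = np.array([-2.0, 0.5])           # increasing intercepts (J = 3)
xb = true_beta[level - 1]
cum = 1.0 / (1.0 + np.exp(-(true_alpha[None, :] + xb[:, None])))   # P(Y <= j)
u = rng.random(n)
y = 1 + (u > cum[:, 0]).astype(int) + (u > cum[:, 1]).astype(int)

D = np.column_stack([(level == l).astype(float) for l in (1, 2, 3)])  # level 4 = reference

def neg_loglik(theta):
    a1, d, g1, g2, g3 = theta
    alphas = np.array([a1, a1 + np.exp(d)])              # alpha_1 < alpha_2 guaranteed
    betas = np.cumsum(np.exp([g1, g2, g3])[::-1])[::-1]  # beta1 >= beta2 >= beta3 >= 0
    c = 1.0 / (1.0 + np.exp(-(alphas[None, :] + (D @ betas)[:, None])))
    p = np.column_stack([c[:, 0], c[:, 1] - c[:, 0], 1.0 - c[:, 1]])
    return -np.sum(np.log(p[np.arange(n), y - 1]))

fit = minimize(neg_loglik, x0=np.zeros(5), method="Nelder-Mead",
               options={"maxiter": 5000})
a1, d, g1, g2, g3 = fit.x
beta_hat = np.cumsum(np.exp([g1, g2, g3])[::-1])[::-1]
print(beta_hat)
```

The reparameterization trick trades explicit linear constraints for an unconstrained optimization; PROC NLP's ‘lincon’ approach is the more direct analogue of what the chapter does.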
[Table 5. Unconstrained and constrained MLEs of the slopes for the customer satisfaction covariates. Each row pairs an unconstrained estimate with its constrained counterpart; the last level of each covariate is a structural zero in the unconstrained fit and an implied zero in the constrained fit. The covariate labels and the signs of the estimates are not recoverable from the extracted text, so the table is not reproduced here.]
Table 5 provides a side-by-side comparison of the constrained and unconstrained MLEs for the slopes of the customer satisfaction covariates, and Figure 2 is a plot that shows the monotone behavior of the constrained estimates. There is very little difference between the unconstrained and constrained MLEs for the demographic covariates. Recall that for the unconstrained MLEs, the zero for the slope of the last level of each covariate is a structural zero resulting from the non-full-rank dummy variable coding used when fitting the model. In the case of the constrained MLEs, the slopes of the last levels of the covariates are implied zeros resulting from the full-rank dummy variable coding used when fitting the model. Table 5 shows that incorporating the constraints does not lead to a substantial change in the estimated slopes. In an indirect way, this provides a sanity check of the proposed model. We will use the constrained estimates for the remainder of the case study.
3.4. Model utility
A purely empirical way to compute NPS is to use the observed distribution of Y (based on all 5,056 survey responses). Alternatively, the fitted model can be used to predict the distribution of Y under hypothetical changes to the covariate distributions, which makes it possible to project the NPS impact of candidate improvement scenarios such as those described below.
We illustrated potential pathways to improve the overall NPS score, but this can also be done with specific subpopulations in mind. For example, if the first region were under study, then one could simply adjust the demographic covariates as illustrated in section 3.4.2 before implementing the scenario adjustments.
Scenario | Brief Description
1 | For each of q79, q82a, q82b, q82d and q82f, alter the distribution of responses by reassigning the probability of a neutral response (4) equally to responses (5), (6) and (7)
2 | Replace the response distribution for sub-elements q82a, q82b and q82d with what was observed for q82f (the sub-element with the most favorable response distribution)
3 | Make the response distribution for each of q82a, q82b, q82d and q82f perfect by placing all the probability on response (7)
4 | Improve the response distribution for each of q82a, q82b, q82d and q82f by placing all the probability equally on responses (6) and (7)
5 | Improve the response distribution for q79 by placing all the probability on response (7)
6 | Improve the response distribution for q82a by placing all the probability on response (7)
7 | Improve the response distribution for q82b by placing all the probability on response (7)
8 | Improve the response distribution for q82d by placing all the probability on response (7)
9 | Improve the response distribution for q82f by placing all the probability on response (7)
10 | Improve the response distribution for each of q79, q82a, q82b, q82d and q82f by distributing the probability of response (1) equally over responses (2)-(7)
11 | Improve the response distribution for each of q79, q82a, q82b, q82d and q82f by distributing the sum of the probability of responses (1) and (2) equally over responses (3)-(7)
12 | Improve the response distribution for each of q79, q82a, q82b, q82d and q82f by distributing the sum of the probability of responses (1), (2) and (3) equally over responses (4)-(7)
13 | Simulate making Business Code 2 as good as Business Code 1 by setting each customer's business code covariate to the value for Business Code 1
14 | Improve the response distributions of q79, q82a, q82b, q82d and q82f by replacing them with the average across the different Region Codes, excluding the worst Region Code
15 | Improve the response distributions of q79, q82a, q82b, q82d and q82f by replacing them with the average across the different Region Codes, excluding the two worst Region Codes
16 | Improve the response distributions of q79, q82a, q82b, q82d and q82f by replacing them all with the observed, respective distributions for Region Code 2 (the region with the most favorable response distribution)
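The arithmetic behind such scenario comparisons can be sketched as follows. The distributions below are hypothetical, and the promoter/detractor mapping used (response 5 as promoter, responses 1 and 2 as detractors) is an illustrative assumption, not CSH's company-specific formula:

```python
import numpy as np

# Sketch: compare the score implied by a baseline response distribution of Y
# against the distribution predicted under an improvement scenario.
def nps_from_dist(p):
    """p: probability distribution over Y = 1..5; returns a score in points.
    Assumes Y=5 counts as a promoter and Y=1,2 as detractors (illustrative)."""
    p = np.asarray(p, dtype=float)
    return 100.0 * (p[4] - (p[0] + p[1]))

baseline = [0.002, 0.003, 0.111, 0.485, 0.399]   # hypothetical observed distribution
scenario = [0.001, 0.002, 0.080, 0.450, 0.467]   # hypothetical model-predicted distribution
lift = nps_from_dist(scenario) - nps_from_dist(baseline)
print(lift)
```

In practice the scenario distribution would come from averaging the constrained model's fitted response probabilities over customers after applying the covariate adjustments listed in the table above.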
4. Discussion
Alternatives to NPS for measuring customer advocacy include customer satisfaction (CSAT) and the Customer Effort Score (CES) (Dixon et al., 2010). CES is measured on a 5-point scale and is intended to capture the effort required by a customer to resolve an issue through a contact-center or self-service channel. Dixon et al. (2010) compared the predictive power of CSAT, NPS and CES on service customers’ intention to do repeat business, increase their spending, and speak positively about the company. They concluded that CSAT was a relatively poor predictor, while CES was the strongest; NPS ranked in the middle.
The choice of which customer advocacy measure to use depends on many factors, such as the type of company-to-customer relationship, the degree to which recommendations (for or against a company) influence a purchase decision, and whether the measures will be complemented by other customer feedback. Gaining an in-depth understanding of customers’ experiences and how to improve them may require multiple indicators. In the end, it is the action taken to drive improvements that customers value that is most critical.
Our case study demonstrates the feasibility of using a multinomial logistic regression model to identify key drivers of NPS, and the same methodology could clearly be employed with alternative measures of customer advocacy. Improvement teams at CSH have used this model to prioritize projects relative to their expected impacts on NPS. A novel aspect of our model development was the implementation of monotone constraints on the slope parameters of the ordinal covariates. Our illustrative SAS code showing how to impose the constraints on the maximum likelihood estimates should be of significant help to practitioners interested in doing the same.
5. Appendix A
data indata;
infile 'C:\CarestreamHealth\indata.txt';
input RC CC BC JT Y q79 q82a q82b q82d q82f;
run;
proc logistic data=indata;
class RC CC BC JT q79 q82a q82b q82d q82f / param=glm;
model Y = RC CC(RC) BC JT q79 q82a q82b q82d q82f;
run;
6. Appendix B
data indata;
set indata;
array cc{23} cc1-cc23; do i=1 to 23; if CC=i then cc{i}=1; else cc{i}=0; end;
if BC=1 then bc1=1; else bc1=0;
array jt{9} jt1-jt9; do i=1 to 9; if JT=i then jt{i}=1; else jt{i}=0; end;
array q{6} q1-q6; do i=1 to 6; if q79=i then q{i}=1; else q{i}=0; end;
array a{6} a1-a6; do i=1 to 6; if q82a=i then a{i}=1; else a{i}=0; end;
array b{6} b1-b6; do i=1 to 6; if q82b=i then b{i}=1; else b{i}=0; end;
array d{6} d1-d6; do i=1 to 6; if q82d=i then d{i}=1; else d{i}=0; end;
array f{6} f1-f6; do i=1 to 6; if q82f=i then f{i}=1; else f{i}=0; end;
run;
proc nlp data=indata;
max loglik;
parms alp1=-7, alp2=-5, alp3=-2, alp4=-1, bcc1-bcc10=0, bcc12-bcc23=0, bbc1=0, bj1-bj9=0, bq1-bq6=0, ba1-ba6=0, bb1-bb6=0, bd1-bd6=0, bf1-bf6=0;
bounds 0 <= bq6, 0 <= ba6, 0 <= bb6, 0 <= bd6, 0 <= bf6;
lincon 0 <= alp4-alp3, 0 <= alp3-alp2, 0 <= alp2-alp1;
lincon 0 <= bq5-bq6, 0 <= bq4-bq5, 0 <= bq3-bq4, 0 <= bq2-bq3, 0 <= bq1-bq2;
lincon 0 <= ba5-ba6, 0 <= ba4-ba5, 0 <= ba3-ba4, 0 <= ba2-ba3, 0 <= ba1-ba2;
lincon 0 <= bb5-bb6, 0 <= bb4-bb5, 0 <= bb3-bb4, 0 <= bb2-bb3, 0 <= bb1-bb2;
lincon 0 <= bd5-bd6, 0 <= bd4-bd5, 0 <= bd3-bd4, 0 <= bd2-bd3, 0 <= bd1-bd2;
lincon 0 <= bf5-bf6, 0 <= bf4-bf5, 0 <= bf3-bf4, 0 <= bf2-bf3, 0 <= bf1-bf2;
tp=cc1*bcc1+cc2*bcc2+cc3*bcc3+cc4*bcc4+cc5*bcc5+cc6*bcc6+cc7*bcc7+cc8*bcc8+cc9*bcc9+cc10*bcc10+cc12*bcc12+cc13*bcc13+cc14*bcc14+cc15*bcc15+cc16*bcc16+cc17*bcc17+cc18*bcc18+cc19*bcc19+cc20*bcc20+cc21*bcc21+cc22*bcc22+cc23*bcc23+bc1*bbc1+jt1*bj1+jt2*bj2+jt3*bj3+jt4*bj4+jt5*bj5+jt6*bj6+jt7*bj7+jt8*bj8+jt9*bj9+q1*bq1+q2*bq2+q3*bq3+q4*bq4+q5*bq5+q6*bq6+a1*ba1+a2*ba2+a3*ba3+a4*ba4+a5*ba5+a6*ba6+b1*bb1+b2*bb2+b3*bb3+b4*bb4+b5*bb5+b6*bb6+d1*bd1+d2*bd2+d3*bd3+d4*bd4+d5*bd5+d6*bd6+f1*bf1+f2*bf2+f3*bf3+f4*bf4+f5*bf5+f6*bf6;
pi1=exp(alp1+tp)/(1+exp(alp1+tp));pi2=exp(alp2+tp)/(1+exp(alp2+tp));
pi3=exp(alp3+tp)/(1+exp(alp3+tp));pi4=exp(alp4+tp)/(1+exp(alp4+tp));
if Y=1 then loglik=log(pi1);
if Y=2 then loglik=log(pi2-pi1);
if Y=3 then loglik=log(pi3-pi2);
if Y=4 then loglik=log(pi4-pi3);
if Y=5 then loglik=log(1-pi4);
run;