Sequencing the ‘Dairy Mind’ Using Mind Genomics to Create an “MRI of Consumer Decisions”

We present the research methodology that generates an integrated database of the mind of a dairy consumer, regarding nine different dairy products. The set of studies deals with a variety of end products, presenting alternative messages about each product. Respondents rate combinations of messages, that is, vignettes, which are created using an advanced form of conjoint analysis. OLS (ordinary least-squares) regression is used to deconstruct the ratings at the level of the individual respondents, producing a coefficient value for each message that was tested. Cluster analyses revealed three distinct mind-sets around dairy products: a strong focus on flavor, a strong focus on health, and a strong focus on price. This chapter demonstrates how the science of Mind Genomics is further applied through a typing tool, known as PVI (personal viewpoint identifier). The PVI is able to identify the mindset of any individual that provides a binary response to six short questions. The chapter concludes with a vision for the future of the Mind Genomics research methodology in the fields of science and business.


Introduction
When one thinks of large-scale 'consumer research' in the world of products, such as dairy products, one is limited by that which exists, that which works, and of course that which one can afford. It should come as no surprise that the armory of knowledge about consumers and dairy comes both from observations of trends in the market and from large-scale segmentation studies, wherein the respondent is asked many questions about habits, practices, beliefs, and so forth; for dairy products, such data have been built into segmentation studies [1][2][3]. Together, the large-scale tracking of consumer behaviors, whether purchases or expressed attitudes, and the deep studies of attitudes and behaviors produce a bewildering array of numbers, statistics, points to be talked about in presentations, and indeed a panoply of what might be called interesting consumer information.
Whereas there is an ongoing focus on the bigger world in which dairy 'plays', there is a parallel world of scientists focusing on the product itself. These are typically so-called sensory scientists, who study the properties of foods.
The objective of a Mind Genomics study is to discover how the respondent 'weights' the different inputs. The higher weights, as discussed below, mean that the message or element in the test stimuli more strongly 'drives' the response toward a defined 'high' point, such as likely to purchase, likely to crave, likely to like the product, and so on. The lower and negative weights mean that the message or element in the test stimuli is irrelevant or may even drive the rating to the low end of the scale, such as would not purchase, not likely to crave, do not like the product, and so on. Mind Genomics studies typically focus on the positive coefficients only, that is, values higher than 0. Values of 0 or lower mean either that the respondent feels that the message drives the response to the lower anchor (viz., the negatives, such as 'do not like the product') or, more often, that the respondent feels that the element is simply irrelevant.
Beyond the creation of a database for each element showing how that element 'drives' the response, the Mind Genomics project focuses on the discovery of underlying mind-sets, that is, groups of individuals in the population who think about the product in the same way. Although we are 'taught' that one can divide people by WHO THEY ARE, such divisions are scarcely useful when it comes to understanding the preferences of people toward products, whether these preferences pertain to product features, product 'benefits', product 'packaging', and so forth. Until the development of Mind Genomics, there appeared to be no efficient, standard way to uncover the latent mind-sets.

Background to the studies presented in this chapter
During the first decade of the 21st century, from 2001 to 2005, author HRM was involved in the development of Mind Genomics as an integrated database of the mind [5,10]. During those 4 years, Moskowitz and colleagues created the It! studies, each It! study comprising 20-30 parallel studies of a food or beverage. The studies themselves were designed according to a common experimental design, typically comprising four basic questions (silo), and nine answers (element) to each basic question. Each respondent evaluated a total of 60 vignettes, each vignette comprising 2-4 elements, at most one element from each silo. Furthermore, each respondent evaluated every element the same number of times. One of the important innovations of the Mind Genomics approach is that each respondent evaluated the same number of vignettes, 60, but the 60 vignettes for each respondent differed from the 60 vignettes for every other respondent. Indeed, the Mind Genomics algorithm ensured that most of the vignettes in a study were seen at most two or three times across the several hundred respondents, and the several thousand vignettes.
The design, permuted orthogonal design [11], was the most important feature of the Mind Genomics approach, instantiated in these different sets of 60 vignettes each, one set for each respondent. The second innovation providing the statistical power, and the potential for deep understanding was that the 60 vignettes were arrayed according to an experimental design, at the level of the individual respondent. This is called the within-subjects design. Each respondent could be investigated alone, without the need of other respondents.
The mathematical structure of the vignettes, comprising individual, permuted experimental designs, means that the 36 independent variables, so-called 'elements', were likely to be statistically independent of each other, whether the data were considered in their original form for one respondent (4 questions, 9 answers per question), or the data combined the results from any combination of respondents.
The respondents evaluated each vignette on an anchored 9-point scale (1-9), a so-called category, or Likert, scale. The scale allows the respondent to act as a measuring instrument. However, for the subsequent analysis, the 9-point scale was divided into two parts. Ratings of 1-6 were converted to 0 to denote little or no level of the attribute being rated (e.g., desirability or crave-ability of the product denoted by the scale). Ratings of 7-9 were converted to 100 to denote a great deal of the attribute being rated. The conversion of ratings from the more granular 1-9 scale to the less granular 0/100 scale was done following the basic worldview of consumer researchers, wherein managers prefer all-or-none or yes-no answers. The binary transformed ratings, generating a 0/100 output, make it easy for the manager to understand and use the data. Some of the granularity is lost, however. A good practice is to work with at least 5--0 respondents with different patterns of response AFTER the binary transform has been done.
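The binary transform described above can be sketched in a few lines of Python. This is a minimal illustration, not the Mind Genomics software itself; the function name to_binary and the sample ratings are hypothetical.

```python
def to_binary(rating: int, cutoff: int = 7) -> int:
    """Map a 1-9 rating to 0 or 100; ratings of 7-9 become 100, ratings of 1-6 become 0."""
    if not 1 <= rating <= 9:
        raise ValueError(f"rating must be between 1 and 9, got {rating}")
    return 100 if rating >= cutoff else 0


# Hypothetical ratings from one respondent (illustration only).
ratings = [1, 6, 7, 9, 4, 8]
binary = [to_binary(r) for r in ratings]
print(binary)  # [0, 0, 100, 100, 0, 100]
```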
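The mathematics of the design allowed the researcher to use OLS (ordinary least-squares) regression to relate the presence/absence of the elements to the binary rating (Eq. (1)):
Binary rating = k0 + k1(A1) + k2(A2) + ... + k36(D9) (1)
Here k0 is the additive constant and k1 to k36 are the coefficients of the 36 elements (A1-D9), each element being coded 1 when present in the vignette and 0 when absent. The foregoing model allows the researcher to learn, quite quickly, which of the elements (A1-D9) are key to driving the rating.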
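To make the regression step concrete, the following is a minimal sketch of an OLS fit relating element presence/absence to 0/100 ratings. It assumes a toy design of two silos with two elements each (A1, A2, B1, B2) rather than the 4 x 9 design of the studies; the ratings are invented for illustration, and the helper names solve and fit_ols are hypothetical. It uses only the Python standard library; in practice a statistics package would be used.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(design, ratings):
    """Return [additive constant, coefficient 1, ..., coefficient k] via the normal equations."""
    rows = [[1.0] + [float(v) for v in row] for row in design]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ratings)) for i in range(k)]
    return solve(xtx, xty)


# Columns: A1, A2, B1, B2. Each vignette has at most one element per silo.
design = [
    (0, 0, 1, 0), (0, 0, 0, 1),
    (1, 0, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1),
    (0, 1, 0, 0), (0, 1, 1, 0), (0, 1, 0, 1),
]
# Invented 0/100 ratings for one respondent (illustration only).
ratings = [100, 0, 100, 100, 100, 0, 100, 0]

model = fit_ols(design, ratings)
for name, value in zip(["constant", "A1", "A2", "B1", "B2"], model):
    print(f"{name}: {value:.1f}")
```

Because the design is complete for the single respondent, the model can be estimated from that respondent's ratings alone, which is the property the individual-level analysis relies on.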
Moving beyond the general model, the Mind Genomics software created individual-level models, one model or equation for each respondent. This analysis is possible because the underlying experimental design was 'complete' for each respondent. That is, one needed only the ratings from each respondent to create an equation for that respondent. The individual models were then clustered [12], so that similar patterns of the 36 coefficients were put into the same cluster or group. Clustering itself is a form of exploratory data analysis. The objective of clustering is simply to identify, in the manner of a heuristic, groups showing generally distinct and interpretable patterns. The composition of the clusters is a function of the data itself, and of the form of clustering.
Each data set was subjected to the same clustering approach, whereby two clusters were generated first, and then three clusters. We chose the fewest number of clusters, subject to the requirement that the clusters could be interpreted, that is, that they told a coherent, seemingly reasonable story. Generally, the 'three-cluster solution' best fulfilled the joint goals of parsimony (fewer clusters are better) and interpretability (the clusters told a story, which made sense at a logical level).
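The two-then-three-cluster procedure can be sketched as follows. The chapter does not specify the clustering algorithm or distance measure, so this sketch assumes plain k-means with Euclidean distance and a deterministic farthest-point start; the coefficient vectors are invented, with three elements per respondent instead of 36. The choice between the two-cluster and three-cluster solutions remains a judgment by the researcher, which the code does not attempt.

```python
def dist2(a, b):
    """Squared Euclidean distance between two coefficient vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centers(points, k):
    """Deterministic start: first point, then repeatedly the point farthest from all centers."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(farthest))
    return centers


def kmeans(points, k, max_iter=100):
    """Return one cluster label per point (assumed k-means; not necessarily the authors' method)."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Invented individual-level coefficients: (flavor element, health element, price element).
coefficients = [
    (20, 2, 1), (18, 0, 3),   # respondents driven by flavor
    (1, 22, 0), (3, 19, 2),   # respondents driven by health
    (0, 2, 21), (2, 1, 17),   # respondents driven by price
]

two_clusters = kmeans(coefficients, 2)
three_clusters = kmeans(coefficients, 3)
print("two-cluster solution:  ", two_clusters)
print("three-cluster solution:", three_clusters)
```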

Understanding the results
We begin the analysis with a summary table showing how the Mind Genomics process identified the relevant mind-sets for 10 dairy products. Table 1 shows the different, emerging mind-sets for each product. The remainder of this chapter will discuss how these mind-sets were discovered and used for understanding how people think of dairy products, and how the minds of 41 students were 'sequenced' to identify the pattern of mind-sets for dairy for each student.
We now go in depth to show how these mind-sets were developed, and the rich data underlying the table. Table 2 shows the three clusters emerging from the clustering for healthful yogurt. The clustering was done on all 36 elements, in the original study, to generate either two or three different clusters (mind-sets). Author SD then selected 16 elements from the data to present. Keep in mind that these 16 elements were then associated with four questions, so that each 'question' was associated with four different, but related elements. This analysis was done AFTER the research was completed. The four questions and the four answers to each question, created on a post-hoc basis, do not affect the results at all, but simply represent an easy way to deal with the data for subsequent analyses. Table 2 presents the data in three columns, one column for each mind-set. The base size shows the number of respondents in the cluster. Note that the clustering program attempts to separate the respondents into either two or three groups based upon the pattern of the 36 coefficients. The base sizes do not have to be equal. Furthermore, the clustering program is 'agnostic' in terms of the 'meaning' of the elements and the reason for their membership in a cluster. The only consideration is the satisfaction of the mathematical criterion.
The additive constant is the estimated value of a vignette without any elements. Since all vignettes comprised elements, the additive constant is a purely estimated parameter. The OLS regression relating the presence/absence of the 36 elements to the binary response returns one number for each element. This number is the coefficient. The coefficients can be both positive and negative. For positive coefficients, the interpretation is that putting the element into a vignette sways an additional percent of the respondents to assign the rating of 7-9. Furthermore, the coefficient, whether positive or negative (see below), can be added to the constant to estimate the percent of times that the vignette would be assigned a value of 7-9 on a 9-point scale, when the vignette comprises the specific element(s).
For example, for Mind-Set 1, the additive constant is 30. A vignette comprising E1, E5, and E9 would be expected to receive ratings of 7-9 about 86% of the time (30 + 21 + 23 + 12 = 86). The same vignette would score far lower in Mind-Set 2 because two of the coefficients are either 0 or negative (not shown), and only one coefficient (E9) is positive.
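The arithmetic of this example can be written out as a short sketch. The additive constant of 30 and the coefficients 21, 23, and 12 come from the text; pairing them with E1, E5, and E9 in that order is an assumption, and the function name is hypothetical.

```python
def expected_percent_top3(constant, coefficients, elements):
    """Additive constant plus the coefficients of the elements present in the vignette."""
    return constant + sum(coefficients[e] for e in elements)


# Mind-Set 1 values quoted in the text; the element-to-value pairing is assumed.
mind_set_1 = {"E1": 21, "E5": 23, "E9": 12}
estimate = expected_percent_top3(30, mind_set_1, ["E1", "E5", "E9"])
print(estimate)  # 86
```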
Looking at mind-set 1, we see that the additive constant for healthful yogurt is 30. In the absence of elements, we expect 30% of the responses to be ratings of 7-9, and the other 70% of responses to be 1-6. Again, keep in mind that the additive constant is a purely theoretical parameter, computed by the OLS regression. Mind-set 2 is a bit more positive. The additive constant is 55, meaning that in the absence of elements, we expect to see 55% of the responses from the mind-set to be between 7 and 9. Finally, mind-set 3, with 89 respondents, shows an additive constant of 39, in the middle. This implies that in the absence of elements, we expect 39% of the responses from this mind-set to lie between 7 and 9.
Our first conclusions are that there are three interpretable mind-sets. The basic interest in healthful yogurt spans a range from low (mind-set 1) to reasonably high (mind-set 2). What we do not know is the nature of the mind-sets. The remainder of the table shows coefficients from 16 of the 36 elements, or 44% of the original data, the specific elements in the table being chosen because they are 'actionable', viz., describing the nature of the product. Some elements score very strongly across all mind-sets. An example is 'The delicious, classic fruit flavors like raspberry, strawberry banana, and blueberry.' Some elements score very strongly, but perhaps only with one mind-set. They may or may not even score positively among other mind-sets. A good example is 'Contains the essential nutrient choline … shown to improve memory and learning', performing well in mind-set 1, virtually irrelevant in mind-set 2, and perhaps totally irrelevant, and even damaging, in mind-set 3. Table 2 can be made more informative by sorting the table by mind-set; this can be done based upon the strong performing elements (coefficient of 8 or higher). The sorted table shows the strong performing elements for each mind-set, with elements performing strongly in two or three mind-sets appearing twice or thrice, once for each mind-set in which the element performs strongly. The duplicates are not important. Table 3 shows the sorted data.
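The sorting rule used to build such a table can be sketched as follows, assuming the threshold of 8 stated above. The coefficient values and the third element name are invented for illustration and are not the values in Table 2; the function name is hypothetical.

```python
STRONG = 8  # threshold for a 'strong performing' element, as stated in the text


def strong_by_mind_set(table, threshold=STRONG):
    """For each mind-set, list (element, coefficient) pairs at or above the threshold, highest first."""
    n_mind_sets = len(next(iter(table.values())))
    result = []
    for ms in range(n_mind_sets):
        strong = [(name, coefs[ms]) for name, coefs in table.items() if coefs[ms] >= threshold]
        strong.sort(key=lambda item: item[1], reverse=True)
        result.append(strong)
    return result


# Invented coefficients for mind-sets 1, 2, 3 (illustration only, not the data of Table 2).
table = {
    "Classic fruit flavors": (15, 12, 10),
    "Contains choline": (11, 1, -4),
    "Hypothetical element X": (2, 9, 3),
}

sorted_table = strong_by_mind_set(table)
for number, rows in enumerate(sorted_table, start=1):
    print(f"Mind-set {number}: {rows}")
```

An element that is strong in several mind-sets appears once in each of their lists, which mirrors the duplicates in the sorted table.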
When we look at the 10 tables of data, we see 10 different sets of mind-sets, generally three mind-sets for a study, but sometimes two mind-sets. We look at the mind-sets for the 4 × 4 matrices, and in our analysis develop a name for each mind-set, based upon the elements that perform most strongly. It is important once again to reiterate the fact that the clustering program does not name the mind-set. Rather, the researcher does. All that the clustering does is to create the different groups based upon statistical criteria.

Finding these mind-sets in the population using the PVI
Researchers are accustomed to working with mind-sets. The notion that people radically differ from each other in how they react to simple stimuli is an old one, embodied in aphorisms and folk wisdom. What is novel, however, is the rather unpleasant realization that there is generally no simple set of rules, which one can use to put a new person into a mind-set. There is the ever-present wish that people who are 'alike' in who they ARE (e.g., age, education, gender, residence, shopping behaviors, and so forth) will share similar mind-sets. Thus, the standard method of cross-tabulating individuals to search for clues to the potential membership in one or several mind-sets is to use the easy-to-collect information about the person. As we will see below, in a study of 41 students, similar in age, education, and so on, this is not the case. Birds of a feather may flock together, but they think disparately.
Many marketers and scientists have 'complained' that the mind-sets provide valuable information, but they need the mind-sets to be generalized. For reasons of cost, and simply because of the marginal knowledge imparted by each new respondent, most Mind Genomics studies comprise at most 300 respondents. A great lesson can be learned about mind-sets with as few as 40-50 respondents. The base size of 50-100 suffices to reveal the nature of the mind-set, and often to define it, but does not let the researcher or businessperson make full use of the mind-sets for other purposes. A method is needed to assign new people to mind-sets that have already been discovered.
One original method was to work with large samples of 300+ respondents, discover their mind-sets, and then, during the research, purchase a great deal of additional information about these same 300 respondents. A data analyst would then be hired to create ad hoc models attempting to relate mind-set membership to some combination of the purchased information. Occasionally the predictive methods worked, but most often the collection of the ancillary data was expensive, the number of variables to collect was unknown and subject to the many vagaries occurring when the data were collected, and the modeling required significant analytical effort.
During the past 5 years, author HRM and colleagues, especially author Gere, have collaborated to create an easier system based upon a Monte-Carlo method. The original summary data, showing, for example, the coefficients for the three mind-sets, are perturbed to create 'noisy' data. A decision tree is created to determine the assignment of a new respondent to one of the three mind-sets, based upon the perturbed data. At the end, a synthesized decision tree is created, comprising six of the 16 elements. The respondent rates each of these six elements on a 2-point scale. The pattern of the ratings assigns the respondent to one of the two or three mind-sets emerging from the study. Figure 1 shows an example of the first part of the PVI. The left-most rectangle shows the introductory information about the respondent. The respondent identity (name) is never collected, but there is an option to collect the respondent phone number and email address. This option must be accepted by the respondent who participates in the study, viz. so-called opt-in. Should the respondent refuse to provide the information when requested, the PVI is instructed to close, going no further, thus respecting the respondent's desire for privacy.
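The general idea of the Monte-Carlo step can be illustrated with a rough sketch. The chapter does not give the details of the PVI algorithm, so everything here is an assumption: the coefficients, the Gaussian noise, the cutoff that turns a perturbed coefficient into a yes/no answer, and the use of a simple pattern lookup table in place of the decision tree. All names are hypothetical.

```python
import random
from collections import Counter, defaultdict

# Invented mind-set coefficients for six elements (illustration only).
SUMMARY = {
    "flavor": [15, 12, 0, -2, 1, 0],
    "health": [0, 1, 14, 13, -1, 2],
    "price": [1, -2, 0, 2, 16, 12],
}


def build_lookup(summary, n_draws=500, noise_sd=3.0, cutoff=6.0, seed=0):
    """Perturb each mind-set profile many times and record which mind-set most often
    produces each pattern of six binary answers (a stand-in for the decision tree)."""
    rng = random.Random(seed)
    votes = defaultdict(Counter)
    for mind_set, coefs in summary.items():
        for _ in range(n_draws):
            pattern = tuple(int(c + rng.gauss(0, noise_sd) >= cutoff) for c in coefs)
            votes[pattern][mind_set] += 1
    return {pattern: counts.most_common(1)[0][0] for pattern, counts in votes.items()}


def assign(pattern, lookup):
    """Assign a respondent's six binary answers to a mind-set; patterns never seen in the
    simulation fall back to the closest simulated pattern by Hamming distance."""
    if pattern in lookup:
        return lookup[pattern]
    nearest = min(lookup, key=lambda p: sum(a != b for a, b in zip(p, pattern)))
    return lookup[nearest]


lookup = build_lookup(SUMMARY)
print(assign((1, 1, 0, 0, 0, 0), lookup))
print(assign((0, 0, 1, 1, 0, 0), lookup))
print(assign((0, 0, 0, 0, 1, 1), lookup))
```

With six binary questions there are only 64 possible answer patterns, which is why a short questionnaire can assign a new respondent in a few seconds.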
Each of the 10 studies generates six questions, based upon the elements in the study, but with the option to edit the elements, as well as edit the two-point rating scale. Not shown is the option to ask four simple questions for each product PVI, each question having up to four answers, one of which must be selected.
Each of the 10 studies is set up separately, and then added into the PVI tool. Thus, for this project with 10 different dairy products, the PVI comprised the information rectangle (left), and 10 columns, one column corresponding to each product in the set of studies. The researcher setting up the study can instruct the PVI to randomize the order of the studies when desired, to randomize the order of the questions within the study, when desired, and even to randomize the full set of 60 questions. The latter, full randomization, makes the task difficult for the respondent to 'game.' The time to complete the introductory panel is approximately 45 s. The evaluation for each panel takes approximately 15 s. Thus, for the introductory panel and for the 10 product panels, the total time is approximately 195 s, or about 3.25 min. The time suffices to 'sequence' the mind of the respondent on the 10 dairy products, that is, to discover what is important. The PVI typically takes about 3-4 min for 10 different products (as well as the information page). The researcher can set up the PVI to drive three additional steps, each of which is optional. Figure 2 shows the feedback immediately provided to the respondent regarding the assignment of the respondent to the proper mind-set for each product.
1. Provide information, that is, feedback, about the different mind-sets, those to which the respondent is assigned, and those to which the respondent is not assigned.
2. Path to a landing page to which the respondent can be automatically directed after being assigned to the proper mind-set. Only one landing page can be selected, however. The researcher must select the specific product which determines the landing page.
3. Path to a video to which the respondent can be automatically directed after being assigned to the proper mind-set.
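The timing estimate for a PVI session given earlier in this section (about 45 s for the introductory panel and about 15 s per product panel) can be checked with a short calculation. The function name is hypothetical.

```python
INTRO_SECONDS = 45   # introductory panel, as stated in the text
PANEL_SECONDS = 15   # one product panel, as stated in the text


def session_seconds(n_products, intro=INTRO_SECONDS, per_panel=PANEL_SECONDS):
    """Approximate PVI session length: one introductory panel plus one panel per product."""
    return intro + n_products * per_panel


total = session_seconds(10)
print(total, "s =", total / 60, "min")  # 195 s = 3.25 min
```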

Creating an integrated database from the set of PVIs
We conclude the empirical section of this chapter with the creation of an integrated database, comprising the information about each respondent who participates. The database comprises the information about the respondent herself or himself, such as age, gender, country, and other material collected at the start of the PVI. Figure 3 shows part of the database.
Each row of the database comprises the information about the respondent, the name of the individual study on which the respondent is being 'typed,' the mindset name, as well as other information not shown. The other information comprises the mind-sets, the feedback, the six questions and their answers, and the (up to) four questions and answers that could be asked for each product at the start of the PVI for that product.
The first objective of the database is to advance science. Tables 4 and 5 show the results from one small study conducted with 41 students at Ryerson University, who participated in a larger study, from which these data were abstracted. Had the study been limited to 10 products, each respondent would have seen the products in random order, the questions within the products in random order, and the entire sequence might have lasted less than 4-5 min. The actual study comprised the 'typing' by all 41 students on the full set of 67 products.
It is clear from Tables 4 and 5 that groups and individuals show a preponderance of the group of mind-sets encompassed by the term 'flavor seeker'. Yet there are other mind-sets, and a few respondents who fall into these other mind-sets. Table 5 shows the mind-set memberships for 10 of the 41 respondents. Most of the respondents fall into the group called 'flavor seeker'. In general, for these dairy products, about 60% of the time a respondent will fall into one of the groups that can be defined as 'flavor seekers.' The rest of the time, the respondent will fall into different groups, whether these be traditionalists, value seekers, health seekers, and so on. From this small sample, the hierarchy of memberships in the different mind-sets is not clear. That is, when the respondent does not fall into the 'flavor seeker' group, it is not clear which group the respondent is next most likely to fall into.
Total in 'flavor seeker': 70%, 80%, 60%, 60%, 50%, 80%, 80%, 50%, 60%, 60%
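As a quick check on the ten 'Total in flavor seeker' percentages reported above, their simple average can be computed directly. Treating the ten values as equally weighted is an assumption.

```python
# The ten 'Total in flavor seeker' percentages quoted in the text.
flavor_seeker_totals = [70, 80, 60, 60, 50, 80, 80, 50, 60, 60]

average = sum(flavor_seeker_totals) / len(flavor_seeker_totals)
print(average)  # 65.0
```

The average of these ten values, 65%, is of the same order as the figure of about 60% given in the text.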

Discussion
The 'emerging' science of Mind Genomics traditionally has focused on how people think about one product. The notion of creating a set of Mind Genomics studies appears to have been first attempted with the It! studies beginning in 2001. In those studies, the effort was made to identify fundamental mind-sets of respondents across 20-30 different foods and beverages [13].
These early studies opened the way to thinking both about a 'wiki' of the mind for a set of different foods, and the potential of typing a person on these different foods. The early thinking, however, was simply to discover a limited set of overarching categories. Thus, in the first study, the efforts revealed three groups of mind-sets for foods, based on one's desire for the food. The set of 30 foods was encompassed in the so-called Crave It! Study [14]. The three mind-sets were called Elaborates (focusing on the description of the food), Imaginers (focusing on the description of the ambience, and other ancillary factors), and Classics (focusing only on the food itself). These three mind-sets appeared in consecutive studies, albeit in different proportions.
Around 2008, when marketers began to think about using Mind Genomics to sell foods, the notion of typing the same person on a set of related foods began to emerge. The standard question was the same: Across different foods, is there a single mind-set segment which best describes a single individual? Thus was born the idea for this paper, namely, to create a typing tool, the PVI (personal viewpoint identifier), which could 'sequence' a person's mind, assigning the respondent to different and appropriate mind-sets for each of a set of identifiable products.
As noted in the introduction to this chapter, the evolution of Mind Genomics studies quickly revealed just how easy it was to dig deeply into the granularity of a person's mind on a specific topic. The simplicity, rapidity, and sheer efficiency of a Mind Genomics study soon made it less rewarding to investigate one product with excruciating thoroughness. One might consider that response to Mind Genomics to be more of an indication of personality than a description of the scientific project, but the reality is that it appeared possible to create powerful, granular data at an 'industrial level.' It was as easy to investigate 10, 20, 30, or more products or situations (e.g., insurance, anxiety, health issues) as it was to investigate one product or situation. One needed simply to create more studies, launch them in parallel with as many respondents as one wanted, and with as many of the types of respondents as were thought to be needed. The only constraint was money.
The question arose, however, about interconnecting these results, not at the general level, but at the level of the individual. If one could mind-type a person on 10, 20, or even 100 or more products or situations, was there any way to integrate the data? It was not feasible to run a person on 100 studies, each lasting 3-4 min, simply because of fatigue, boredom, and resistance. Working with 100 products, each study requiring 3-4 min, means that to do the original study at an industrial scale, we would require 300-400 min, or 5+ h. One could, however, create a simple typing tool, the PVI, with each part of the PVI lasting 15-30 s. The typing tool could be run in one long, relaxed, stretched session, lasting about 30-55 min.
The data for this study come from the typing of 41 students from Ryerson University, done by author SD as part of her senior capstone project. This chapter demonstrates the relative simplicity and power emerging from the researcher's ability to corral data from different studies, reshape the results, and use the resulting data to create a new data set, and in turn to create a new PVI. The PVI, whether for all the products or simply for the 10 dairy products, allows us to type new people in a reasonably short session, and to identify relations between who the person is and how the person thinks.
Looking backward at the effort as it applies to the knowledge of thinking, it now seems possible to erect large-scale databases of the mind literally from the 'bottom-up', in short spans of time, with efficiencies never before realized. One can imagine the power of science, whether food science, medicine, social science, legal science, and so on, when practitioners can create these large-scale structures. With the PVI attached, one can type literally millions of people, to understand the covariation of the mind with behavior, with health, and so on, almost ad infinitum.

Conclusion
To get a sense of how an investor might look at the value of mind-typing a person on a set of different products, consider this scenario, doable now, and most likely the case in the not-too-distant future. Imagine a store with 'beacons', receivers and senders of information. Imagine these beacons linked with computer screens, with the computer screens placed near different parts of the dairy case(s). A shopper who has gone through the PVI exercise, and had her or his mind 'typed,' whether for dairy alone or for many foods, would have a card in her or his bag or wallet. The information in the card would identify the mind-set of the person for the different types of items in the dairy case, or even for the different types of items in the entire store.
One might then imagine the beacon 'reading the card', to discover the mind-sets of the individual with that card. The identity of the person would not be the relevant information, and thus would remain private. All that would be required would be 'knowing' the mind-set of the person for the particular product. Privacy would not be an issue, and the massive computations to generate a cogent recommendation for this individual would certainly not be necessary. All the relevant information is stored on the card, that is, the relevant information about what to say to the shopper for the product to be sold. An offer about the product might be made, or the salient messages about the product would appear on the shopper's smartphone, or on electronic signage above the product. The scenario just painted means true individualization of the shopping experience, with the right words, cogent messages, and even electronic, storable coupons.