The short response time available in the event of a major earthquake poses unique challenges for earthquake early warning (EEW). Mobile phone apps may be one way to deliver such messages effectively. In this two-phase study, several hundred participants were first randomly assigned to one of eight experimental conditions. The results of phase one allowed the researchers to reduce the number of treatment conditions to four. Phase two consisted of five experimental conditions: four treatment conditions and one control condition. In each condition, a 10-second EEW was delivered via a phone app. The four treatment conditions were designed according to elements of the IDEA model. The control condition was based on the actual ShakeAlert EEW computer program message being used by emergency managers across the US west coast at the time. Results of this experiment revealed that EEW messages designed according to the IDEA model were more effective in producing desired learning outcomes than the ShakeAlert control message. Thus, the IDEA model may provide an effective content framework for those choosing to develop such apps for EEW.
- IDEA model
- risk communication
- crisis communication
- disaster warnings
- earthquake early warning
- communication and technology
Effective earthquake early warning messages can empower target populations to take appropriate actions for self-protection and, ultimately, save lives. The communication challenges facing those who wish to design warning messages involve both content and access. Content focuses on gaining attention and providing appropriate instructions for self-protection. Access depends on sending the messages through a channel or channels and a medium or media that can be retrieved quickly and easily. A team of researchers completed a project designed to develop such content and access. The project was based on previous warning message testing research. Specifically, researchers attempted to apply the IDEA model to create brief, easily accessible earthquake early warning messages via a mobile phone app.
The name of the IDEA model for effective instructional risk and crisis communication is an acronym that stands for internalization, distribution, explanation, and action. According to the IDEA model, such messages ought to include appeals to internalization (e.g., proximity, personal relevance, impact, timeliness), be distributed over multiple channels deemed appropriate based on crisis type and target audience(s), and offer cogent explanations about what is happening. These explanations should be offered by credible sources, and the scientific information provided in them should be both accurate and translated intelligibly for the target population(s). These messages also must include specific action steps receivers are to take (or not take) for self-protection [2, 3, 4, 5, 6, 7]. The following paragraphs describe the message design and testing project processes, results, and conclusions based on the timeline under which it unfolded.
2. The IDEA model message design and testing experiment
The design and testing process occurred in two phases. Thus, this section first describes the study design process followed by the results of the two-phase experiment. It closes with a discussion of the results as they may inform the design of effective EEW messages delivered via phone apps.
2.1 Designing the study
To launch the project, a multidisciplinary group comprised of seismologists, instructional risk and crisis communication scientists, graphic artists, and emergency managers from the US west coast states met in Pasadena, California to participate in a 3-day design storm focused on earthquake early warning messaging. This design storm was essentially a synergistic brainstorming session to formulate an ecologically valid plan based on a broad cross-section of expertise represented in crisis communication and earthquake science that would inform earthquake early warning message design.
Ultimately, the group agreed that message content distributed via a phone app would likely be a predominant interface for US west coast residents. Thus, message content (both visual and aural) would be designed for a phone app. The group also agreed that the content would need to be developed based on social science crisis communication best practices research.
Message content would address internalization components as follows. To test proximity, some conditions would include a map and others would not. To test timeliness, the conditions would include a countdown to when the strong shaking was expected to occur. Timeliness would also be tested by providing no more than 10 seconds for the entire message. Personal relevance would be addressed by focusing on “very strong shaking” (i.e., “7” or higher Intensity level shaking).
Message content would address explanation components as follows. To address source credibility, Dr. Lucy Jones’ voice was recorded and applied, as she is a known and credible earthquake expert among many throughout the US west coast states. The accurate science provided by seismologists was translated into simple, easily comprehended language. Also, intensity level was selected rather than magnitude because intensity is directly related to very strong shaking that can harm individuals who do not enact the appropriate actions for self-protection. Finally, some conditions used a verbal message (“very strong shaking”) and others a numerical message (“7”) to signal the kind of shaking to occur. This allowed researchers to test for potential lack of understanding regarding what “7” might mean. Existing research suggests that less numerate people (those who lack the ability to process mathematical concepts) tend to trust verbal risk information that they can comprehend more than numeric information that may be unintelligible to them and, consequently, make poorer decisions based on numerical data than highly numerate people. Thus, it seemed critical to test intensity comprehension based on numeric versus verbal reporting.
Message content would address action components through a visual graphic accompanied by a verbal message (“Drop! Cover! Hold on!”) reinforced orally by a speaker saying “Drop, take cover, hold on.” All experts in the design storm agreed that all conditions testing high intensity earthquake early warning messages should include this specific action statement in some way. Thus, all treatment conditions included this message.
The graphic artists created eight visual representations of a smart phone app screen, which the instructional risk and crisis team would test during the fall semester. The team would create an online survey to collect responses to the various versions and measure their effectiveness based on affective, cognitive, and behavioral learning outcomes. A snowball sample of participants would be invited via Lucy Jones’ Facebook page and the Shakeout website. The survey collected quantitative and qualitative data and employed a mixed methods analysis.
2.2 Message testing: phase one
The results for phase one of the project were collected using a survey distributed via an invitation on Lucy Jones’ Facebook page and the Shakeout website. Both outlets targeted users across the US west coast. Of the 469 surveys entered, 198 were completed in their entirety and, thus, usable for the analysis. The usable data resulted in 22–28 responses per condition for the eight varieties tested. The sample was comprised of 106 (56%) females and 92 (44%) males. A majority of the participants (80%) were Caucasians from Southern California (n = 153). Notably, 37% (n = 66) of the participants reported earning incomes over $100,000 annually.
Eight message conditions were manipulated in ways that indicated the location with or without a map, intensity in numerical or non-numerical form, and countdown in graphic or numerical form. All eight conditions used the same aural warning alert sound and voice command. All eight conditions offered actionable vocal instructions to “drop, take cover, and hold on.”
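The eight conditions described above can be read as a crossing of three two-level factors. The following is a minimal sketch of that reading, assuming the factors were fully crossed (the text states four map and four no-map conditions, but does not spell out the full design). The factor names, the `build_conditions` and `assign` helpers, and the round-robin assignment scheme are illustrative assumptions, not the researchers' actual procedure.

```python
import itertools
import random

# Hypothetical factor names and levels, based on the description in the text.
FACTORS = {
    "location": ("map", "no map"),
    "intensity": ("numerical", "verbal"),
    "countdown": ("graphic", "numerical"),
}


def build_conditions(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [
        dict(zip(names, levels))
        for levels in itertools.product(*factors.values())
    ]


def assign(participant_ids, conditions, seed=0):
    """Randomly assign participants to conditions in near-equal groups."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    # Deal the shuffled participants round-robin across the conditions.
    return {pid: i % len(conditions) for i, pid in enumerate(ids)}


conditions = build_conditions(FACTORS)      # 2 x 2 x 2 = 8 conditions
assignment = assign(range(198), conditions)  # 198 usable responses in phase one
```

Round-robin dealing after a shuffle keeps group sizes within one participant of each other. The reported 22–28 responses per condition suggest the actual assignment, or attrition after assignment, was less balanced than this sketch.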
Survey responses were examined using both quantitative and qualitative methods. Statistically significant quantitative results and dominant qualitative themes emerging in open-ended responses are reported here. The quantitative analysis produced five sets of meaningful results. The subsequent qualitative examination of open-ended responses provided additional insight to inform message refinement. In total, 87 participants (44%) offered open-ended responses to the prompt, “Please provide any additional feedback you believe would be helpful concerning the quality of the app.” These comments ranged from 3 to 328 words (M = 50). Exemplars from these responses are reported with the corresponding quantitative results. These sets of results focus on intensity, location, countdown, perceived helpfulness of visual images, and perceived helpfulness of aural components.
A chi-square test revealed that participants were more likely to recall earthquake intensity level correctly when the message included a numerical representation of intensity (χ² = 78.049, df = 7, p < 0.0001). In other words, participants more accurately recalled the numeral “7” than the phrase “strong shaking.” Qualitative responses indicated confusion, however, regarding what the number “7” actually means. Some noted that “the number was important,” but many also claimed that “many people don’t known what INTENSITY means.” One respondent wrote, for example, “I assume the 7 means 7 out of 10?” Thus, although respondents could recall seeing the number “7,” many did not know what it meant. Therefore, if a numerical representation is present in the warning message, the message must also somehow indicate its meaning and/or prior instruction may be necessary for it to be truly meaningful/effective.
Four of the eight conditions tested included a map indicating the epicenter of the earthquake in relation to well-known California cities and highways and four did not. Participants viewing a message without a map were more likely to recall the earthquake’s location incorrectly or not at all than those viewing a message that included a map (χ² = 43.831, df = 7, p < 0.0001). Qualitative analysis of open-ended responses confirmed the value of the map. Participants viewing the map reported, for example, that “the map showing the general area of the quake was important” and “the map helped me realize where the earthquake was occurring.” Moreover, some commented about the size of the map saying, “the map was far too small to be useful.” Those viewing a message without a map made queries such as: “Did I miss the location?” Based on these results, then, the prototype messages should include a map and the map should be large enough to see easily via a smart phone app.
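The degrees of freedom reported for both chi-square tests (df = 7) are what one would expect from an 8 × 2 table of condition by recall outcome, since (8 − 1)(2 − 1) = 7. The sketch below illustrates that calculation with the Python standard library only. The counts in `hypothetical_recall` are invented for illustration and are not the study's data; the function names are likewise assumptions.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi_square_p_value(stat, df):
    """Upper-tail chi-square probability for a positive integer df."""
    half = stat / 2.0
    # Base case: df = 1 for odd df, df = 2 for even df.
    k = 1 if df % 2 else 2
    q = math.erfc(math.sqrt(half)) if k == 1 else math.exp(-half)
    # Recurrence: Q(x | k + 2) = Q(x | k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


# Invented counts: rows are the eight conditions, columns are
# (recalled correctly, recalled incorrectly or not at all).
hypothetical_recall = [
    [20, 5], [19, 6], [21, 4], [18, 7],
    [8, 17], [7, 18], [9, 16], [6, 19],
]
stat, df = chi_square_independence(hypothetical_recall)
p_value = chi_square_p_value(stat, df)

# p-value implied by the statistic reported for intensity recall.
reported_p = chi_square_p_value(78.049, 7)
```

With the reported statistic of 78.049 on 7 degrees of freedom, the tail probability is far below the 0.0001 threshold given in the text.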
Four conditions provided countdowns represented numerically and four offered countdowns represented in graphic form. The quantitative analysis revealed no statistically significant differences in terms of message recall or effectiveness. The qualitative analysis of the open-ended responses provided insight as to why. When participants viewed the numerical countdown, some reported that the static image was confusing because it did not actually count down from 6 to 5, 4, 3, 2, and 1. Moreover, participants that viewed conditions that conveyed both the countdown and the intensity in numerical form were confused about the meaning of each one. When participants viewed the graphic countdown, they also indicated confusion due to the fact that it was static and did not actually move as the number of seconds to impact declined. Many commented as this participant did: “I’m not sure what the pie graph was supposed to represent.” Regardless of the version participants viewed, many suggested using a countdown clock (stopwatch image or digital clock face: 06, 05, 04 that actually ticked down with each second). As one participant noted, “a countdown clock would underscore the importance of acting quickly.” Clearly, additional message refinement using an active countdown was warranted.
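The active countdown that participants asked for can be sketched in a few lines. This is only an illustration of the idea, assuming zero-padded digital labels such as "06, 05, 04" as described in the responses; the function names and the injectable `show` and `tick` parameters are hypothetical and do not describe the app that was eventually built.

```python
import time


def countdown_frames(seconds):
    """Yield zero-padded labels from `seconds` down to 1, e.g. '06' ... '01'."""
    for remaining in range(seconds, 0, -1):
        yield f"{remaining:02d}"


def run_countdown(seconds, show=print, tick=time.sleep):
    """Show one label per second; `show` and `tick` can be swapped for testing."""
    for label in countdown_frames(seconds):
        show(label)
        tick(1)
```

Calling `run_countdown(6)` would print one label per second, in contrast to the static images used in phase one.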
2.2.4 Perceived helpfulness
Perceived helpfulness was measured using a Likert-type scale ranging from 1 to 5, where 1 was least helpful and 5 was most helpful. In addition, open-ended responses were collected and analyzed for recurring themes. Perceived helpfulness results were analyzed and reported for visual images and aural components.
An Analysis of Variance test revealed significant differences among conditions. Regarding visuals, messages that did not include maps and numerical intensity indicators were perceived as least helpful (F(7,188) = 7.789, p < 0.0001). These results support previous findings regarding location and intensity indicators.
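For readers less familiar with the notation, the degrees of freedom in F(7, 188) correspond to eight groups (8 − 1 = 7) and 196 ratings (196 − 8 = 188). A minimal sketch of a one-way ANOVA follows, using invented toy ratings rather than the study's data; `one_way_anova` is an illustrative helper, not the software the researchers used.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of rating lists."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Variation of individual ratings around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, means) for x in g
    )
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy helpfulness ratings for three hypothetical conditions.
toy_ratings = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova(toy_ratings)
```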
An Analysis of Variance revealed no significant differences in the perceived helpfulness of the app based on the alert sound or the speaker’s voice. Since this phase one experiment did not examine different alert message sounds or speaker voices, this result is not surprising. The qualitative analysis did reveal several dominant themes, however, regarding aural components that may inform future message design and testing. Regarding the alert sound, for example, participants responded it “was too light and high pitched” and “should also vibrate the phone.” Some also argued “it should be the same as other national emergency radio announcements.” Others contended that it should be the same as the sound used in Japanese earthquake warning messages. Regarding the speaker’s voice, some claimed that recognizing the voice of Lucy Jones provided a sense of credibility. In other words, the voice “has meaning because I recognize it is Dr. Lucy Jones I find her voice compelling and reassuring.” Thus, this qualitative analysis suggests that the familiar voice of a noted expert may be most important for fostering trustworthiness. These preliminary findings are consistent with the meta-analysis of hundreds of communication studies, which concluded that there are, in fact, negligible differences in perceived credibility and effectiveness based on sex and perceived gender.
2.2.5 Phase one summary and next steps
The results of phase one of this research project revealed several findings:
- Participants were significantly more likely to recall the location of the earthquake when the app included a map. They also perceived the apps that included a map to be most helpful.
- Participants were significantly more likely to recall the intensity level of the earthquake when a numerical indicator was included. However, a qualitative analysis of open-ended responses revealed a great deal of confusion about what this number means.
- No significant differences were found among apps that used numerical versus graphical countdown imaging. The qualitative analysis of open-ended responses revealed confusion because neither countdown approach actually counted down by seconds from 6, to 5, 4, 3, 2, and 1. Participants indicated a desire to see the seconds dropping via an image that represents a digital clock-face or stopwatch-type image.
- This phase one pilot project did not test different alert sounds or voice commands statistically as the project was already comprised of eight conditions. However, a qualitative analysis of open-ended responses revealed that participants believed the alert sound should be familiar (e.g., similar to the one used in the US for other warning messages or similar to the one being used already in Japan for earthquake warning messages).
Based on these results and input from risk and crisis communication specialists, seismologists, and emergency manager practitioners, the research team moved into phase two of the project. More specifically, the researchers used this information to refine the prototype IDEA model messages down from eight to four conditions and created a control message based on the existing ShakeAlert warning message computer program used by emergency managers throughout the US west coast states at the time.
2.3 Message testing: phase two
Based on the results of phase one message testing and focused feedback from crisis and risk communication subject matter experts, seismologists, and practitioners, the original eight conditions were reduced to four. These four treatment conditions were manipulated as follows:
- Japanese alert sound with numerical intensity display
- US alert sound with numerical intensity display
- Japanese alert sound with verbal intensity display
- US alert sound with verbal intensity display
The map either rotated with the numerical intensity display or with the verbal intensity display. All other elements were the same across the four conditions (map, countdown, action steps).
The phase two sample (N = 294) was 62.5% female and 37.5% male and 88% Caucasian, with a mean age of 47.5 years (SD = 14.04). Regarding socio-economic status, 52% of the sample reported an annual income of $70,000 or more, and 32% were living in southern California at the time. Of the 294 respondents, 133 provided comments in response to the prompt: “Please provide any additional feedback you believe would be helpful concerning the quality of the app.” Key findings from this round of message testing focus on perceived quality of the app overall, as well as intensity (verbal/numerical), location (map), and behavioral intentions (drop/take cover/hold on).
2.3.1 Perceived quality of the app
A series of stepwise regression analyses were conducted to examine the research question about perceived quality of the app. The single item asking about the quality of the app used a five-point Likert type response scale (1 = very effective to 5 = not effective). Overall, 75% of the participants across conditions rated the app as “effective” or “very effective” and only 2% rated the app as “not effective.” On the first block, demographic variables were entered in order to account for any variance attributable to respondent characteristics. These included sex, age, race/ethnicity, and income. The second predictor block included these variables, as well as experimental condition. The examination focused on significant models and predictors, as well as potential improvements based on the addition of experimental condition.
The results for the first predictor block indicate a significant model, F(4, 223) = 6.775, p < 0.001, R² = 0.108. Of the demographic variables, only sex (β = −0.249, p < 0.001) and age (β = −0.175, p < 0.01) were predictive of ratings of app quality. When experimental condition was added to the predictor block, a significant model was also produced, F(5, 222) = 4.32, p < 0.001, R² = 0.112. However, the change in variance accounted for was not significant (ΔR² = 0.004). Of the variables in the predictor block, only sex (β = −0.245, p < 0.001) and age (β = −0.176, p < 0.01) were predictive of ratings of app quality.
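The non-significant change in variance can be checked from the reported values alone. The sketch below applies the standard F-change formula for nested regression models to the two R² values and the residual degrees of freedom given above. It is an illustrative recomputation under the assumption that condition entered as a single added predictor (as the degrees of freedom suggest), not output from the original analysis, and `r2_change_f` is a hypothetical helper name.

```python
def r2_change_f(r2_reduced, r2_full, added_predictors, df_residual_full):
    """Return (delta R^2, F-change) for two nested regression models."""
    delta = r2_full - r2_reduced
    # F-change = (delta R^2 / predictors added) / ((1 - R^2_full) / residual df)
    f_change = (delta / added_predictors) / ((1.0 - r2_full) / df_residual_full)
    return delta, f_change


# Values reported in the text: block 1 R^2 = 0.108, block 2 R^2 = 0.112,
# with 222 residual degrees of freedom in the second model.
delta_r2, f_change = r2_change_f(
    0.108, 0.112, added_predictors=1, df_residual_full=222
)
```

The resulting F-change is roughly 1.0 on (1, 222) degrees of freedom, well below the value of about 3.9 needed for significance at the 0.05 level, which agrees with the conclusion that adding experimental condition did not improve the model.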
A t-test was conducted for the variables of sex and overall quality across conditions. Because lower scores on this scale indicate higher perceived effectiveness, women (M = 1.73, SD = 0.81) were more likely than men (M = 2.14, SD = 1.04) to rate the app as being of high quality, t(2) = 3.592, p < 0.001. Sex differences in perceptions of app quality were then broken down by each condition. Differences were found for condition 2, where women (M = 1.61, SD = 1.12) reported higher perceptions of app quality than men (M = 2.30, SD = 0.74), t(2) = 2.696, p < 0.01, and condition 5, where women (M = 1.70, SD = 0.65) reported higher perceptions of quality than men (M = 2.19, SD = 1.01), t(2) = 2.190, p < 0.05.
Perhaps most important here is that participants in all treatment conditions rated the quality of the app as high. Since all treatment conditions used similar content based on the IDEA model (i.e., alert sound, oral and visual countdown, intensity level, map, actionable instructions), it seems the appropriate content is being included. Moreover, a thematic analysis of the open-ended responses revealed that those viewing the control (ShakeAlert) condition were “overwhelmed by the visuals” and wanted to see and hear directions to “duck, cover, and hold on.” These themes suggest that (a) too much information, although accurate, can defeat the purpose of the warning and (b) specific action steps need to be included. In addition to perceived quality of the app, the researchers sought to learn more regarding numerical versus verbal intensity displays, the effect of the map in location cognition (proximity), and behavioral intentions to take appropriate self-protective actions.
Key findings from this round of message testing regarding intensity are as follows. First, there were no significant differences among conditions regarding intensity. However, an exploration of descriptive statistics shed additional light on this issue. When asked “how important is it to know the kind of shaking,” 76–87% reported it as very important across all conditions. Moreover, 77–85% of the respondents across conditions answered correctly (i.e., 10 seconds or less) when asked when the shaking would begin.
Important findings emerged when respondents were asked what kind of shaking would occur. It is encouraging to note that 77–93% of the respondents reported correctly that very strong shaking was going to occur. The researchers placed a screenshot at the start of the survey that summarized the meaning of the numerical intensity levels. When respondents that viewed the verbal intensity display were asked about the numerical intensity level (8), only 15 and 22.4% recalled the correct number. Of the respondents that viewed the numerical intensity display, 69 and 80% recalled the correct number. Of the respondents that viewed the control (ShakeAlert) message, only 35.5% recalled the correct number. This low percentage may be attributable to the amount of detailed information being displayed in the control message. So much information may be difficult to process in 10 seconds or less and, thus, may result in misunderstanding.
Subsequently, when asked how well they understand the meaning of intensity level numbers, 48.4 and 38.8% of those viewing the verbal display marked “very well.” Respondents that viewed the numerical intensity display reported knowledge comprehension of “very well” at 56.7 and 56.5%. Those viewing the control (ShakeAlert) message reported knowing the meaning very well at 45.9%. These results suggest the verbal intensity display is more meaningful than the numerical display. These results also suggest that displaying both (as in the control ShakeAlert message) appears to be too much information to process accurately in a short amount of time.
All conditions included a map identifying where the shaking was going to occur. There were no significant differences among the conditions regarding the importance of the map or for accurate location identification. Across conditions, 74–92% reported a map as “important” or “very important.” A somewhat troubling finding, however, was that when asked where the shaking was going to occur, only 33–55% answered correctly (Los Angeles area) across conditions. When the researchers drilled down to include only participants currently living in southern California, the results improved slightly among the four treatment conditions (64–74% correct). However, only 29% of the respondents that viewed the control (ShakeAlert) message answered correctly. Moreover, when asked how helpful the visual images were in conveying information about location, only 27.9–50% said “very helpful” across conditions. However, in all four treatment conditions, respondents reported more preference for the visual images (M = 1.90, SD = 1.82) than those in the control condition (M = 2.26, SD = 1.30), t(2) = −2.106, p < 0.05. Moreover, a thematic analysis of the open-ended comments revealed a desire for a simple map: one that merely shows a familiar city with a bullseye target or location flag would be more helpful than one showing both the epicenter and the location where shaking will occur. Taken together, these results suggest that a simple map highlighting the location may be more effective than a detailed one showing lots of information.
2.3.4 Behavioral intentions
A series of stepwise regression analyses were conducted to examine the research question regarding behavioral intentions. The composite measures were used to assess perceptions of behavioral intentions. The measure for behavioral intentions used nine items with a response scale of 1 = “Very Unlikely” to 5 = “Very Likely.” On the first block, demographic variables were entered in order to account for any variance attributable to respondent characteristics. These included sex, age, race/ethnicity, and income. The second block added experimental condition to these possible predictor variables. The analyses focused on significant models and predictors, as well as possible improvement to the model based on the addition of experimental condition.
The results for the first predictor block did not indicate a significant model, F(4, 227) = 0.989, p = n.s., R² = 0.017. None of the demographic variables were predictive of behavioral intentions. When experimental condition was added to the predictor block, the model did not improve, F(5, 229) = 0.788, p = n.s., R² = 0.017. None of the variables in the predictor block were predictive of behavioral intentions.
The fact that no significant model stood out as a better predictor for behavioral intentions, combined with the descriptive statistics, suggests that including the IDEA model components as we did in each condition may be effective for earthquake early warning messages delivered via a smart phone app. Although the means reported are encouraging, the pretest self-efficacy score (M = 4.44) may also point to a respondent pool comprised of members of a disaster sub-culture that is already pre-disposed to taking appropriate actions for self-protection.
Several promising conclusions may be drawn from these two rounds of message design and testing. First, a phone app can be designed in ways that employ the IDEA elements of effective instructional risk and crisis messages for earthquake early warnings in 10 seconds or less. Second, the elements of the IDEA model do appear to positively influence affective (perceived value/importance), cognitive (comprehension), and behavioral (efficacy and intention) learning outcomes.
Also based on these message testing results, however, more honing of some particulars is still warranted. For example, with regard to internalization, the design of the map (proximity) needs to be simplified to ensure accurate comprehension of location. Regarding explanation, it appears that verbal intensity displays are more effective than numerical displays unless a comprehensive educational campaign could be conducted to teach users what the different numbers mean.
The sample for both rounds of message testing was not representative of the entire population in southern California. Additional message testing targeting more representative demographic diversity and marginalized populations is warranted in order to be certain about ultimately launching the most effective warning app.