Effective management of flood events depends on a thorough understanding of regional geospatial characteristics, yet data visualization is rarely integrated effectively into the planning tools used by decision makers. This chapter considers publicly available data sets and data visualization techniques that can be adapted for use by community planners and decision makers. A long short-term memory (LSTM) network is developed to produce a univariate time series prediction of river stage that improves on the temporal resolution and accuracy of publicly available forecasts. This prediction is then tied to a corresponding spatial flood inundation profile in a geographic information system (GIS) setting. The intersection of the flood profile and affected road segments can be easily visualized and extracted. Traffic decision makers can use these findings to proactively deploy re-routing measures and warnings to motorists to decrease travel-miles and risks such as loss of property or life.
- trend extraction
- spatial and temporal trends
1. Introduction
Floods are the most frequently occurring natural disaster. A flood event occurs when stream flows exceed the natural or artificial confines at any point along a stream. This is often due to heavy rainfall, ocean waves coming on shore, rapid snow melting, or failure of manmade structures such as dams or levees. From 1998–2017, flood events affected more than two billion people globally. Disasters of this frequency and magnitude are typified by extreme costs to governments. In 2019, historic flooding across Missouri, Arkansas, and the Mississippi River basin resulted in an estimated cost of 20 billion dollars. These estimates typically do not reflect indirect costs such as added travel-miles and the subsequent loss of time. Further, floods are among the deadliest natural disasters. From 2010–2020, floods resulted in 1,089 fatalities in the United States. A majority of those killed were motorists. Therefore, urban planners such as traffic decision makers are tasked with proactively deploying resources that minimize motorist risk exposure. At present, traffic decision makers rely on static flash flood inundation profiles related to discrete rainfall events. These profiles are often created through multiagency cooperation efforts such as . Some studies have begun to generate dynamic flood inundation data visualizations based on these profiles. Additionally, integrated approaches that use machine learning and geographic information systems (GIS) to track changes in critical infrastructure over time are emerging as powerful decision support tools. However, there is limited use of state-of-the-art time series prediction models to generate dynamic data visualizations in a GIS setting for improved flood management. This book chapter explores the integration of publicly available data and machine learning models to address this gap in the literature.
Precise determination of when and where to deploy re-routing measures is a complex task. One approach that improves planning effectiveness is to integrate time series characteristics of river behavior and corresponding spatial flood profile. In this chapter, a univariate time series prediction of river stage is conducted that improves the temporal resolution and accuracy of publicly available forecasts. This prediction is then tied to a corresponding spatial flood inundation profile in a GIS setting. The resulting geospatial deep learning model provides a data visualization tool that traffic decision makers can use to proactively manage road closures in the event that a flood is likely to occur. The first section provides an overview of relevant river behavior that causes flooding. State-of-the-art trend extraction and prediction techniques are then presented and tied to geospatial use cases. The methodology section presents the data used, time series prediction model selected, and geoprocessing procedures required for data visualization using GIS software. Next, an illustrative example is provided for a frequently flooded intersection in Missouri. A discussion section is provided that positions the findings in the context of improving traffic management in the event of a flood. Lastly, a conclusion is given that summarizes the key findings and outlines model limitations and future work.
2. A geospatial deep learning approach
Two key characteristics of streams that relate to flood events are stream stage and streamflow. Stream stage refers to the height (ft) of the stream, and streamflow corresponds to discharge (ft³/s), also called volumetric flow rate. Typically, governmental organizations such as the United States Geological Survey maintain a network of sensors that monitor these characteristics over time for various stream segments. The National Weather Service classifies flood categories into four groups based on stream stage: Action Stage, Flood Stage, Moderate Flood Stage, and Major Flood Stage. These values vary for a given segment of stream based on analysis of previous floods, local topography, and underlying geological properties.
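As a minimal sketch of how a stage reading maps to one of the four categories, the function below compares a stage value against ascending thresholds. The threshold values are hypothetical placeholders rather than published values for any gauge, and the names `classify_stage` and `EXAMPLE_THRESHOLDS_FT` are illustrative assumptions.

```python
from bisect import bisect_right

# Hypothetical thresholds (ft) in ascending order; real values are gauge-specific.
EXAMPLE_THRESHOLDS_FT = [
    (10.0, "Action Stage"),
    (15.0, "Flood Stage"),
    (20.0, "Moderate Flood Stage"),
    (25.0, "Major Flood Stage"),
]


def classify_stage(stage_ft, thresholds=EXAMPLE_THRESHOLDS_FT):
    """Return the highest category whose threshold the stage meets or exceeds."""
    levels = [level for level, _ in thresholds]
    # bisect_right counts how many thresholds are <= stage_ft.
    count = bisect_right(levels, stage_ft)
    if count == 0:
        return "No Flooding"
    return thresholds[count - 1][1]
```

Because the thresholds are passed as an argument, the same function could be reused for any stream segment once its own category values are known.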
Given that stage is monitored over time, time series forecasting methods are appropriate for predicting stage values. Two modeling approaches are useful in this context: statistical and computational intelligence. Statistical models use historical data to identify underlying patterns and predict future values. Some commonly used techniques for flood forecasting include simple exponential smoothing, autoregressive moving average, and autoregressive integrated moving average. However, one shortcoming of these approaches is a lack of scalability as the quantity and complexity of data increase. An alternative approach that addresses this issue is computational intelligence. A key feature of computational intelligence approaches is the capacity to manage complexity and non-linearity without needing to understand underlying processes. In summary, statistical methods rely on precise underlying relationships and exhibit decreased performance as the number of variables increases, whereas computational intelligence approaches identify patterns using large amounts of training data to establish a model capable of accurate predictions. Some commonly used computational intelligence models for flood forecasting include support vector machines, artificial neural networks, and deep learning. Further, these models have demonstrated superior performance when compared to conventional statistical modeling approaches in flood prediction studies. LSTM models in particular have shown promising results in time series contexts. Therefore, LSTM models provide a state-of-the-art trend extraction and prediction technique for stream stage values.
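For comparison with the computational intelligence approach, the simplest of the statistical techniques named above can be sketched in a few lines. This is an illustrative implementation of simple exponential smoothing, in which each smoothed level is a weighted average of the newest observation and the previous level. The function name and the choice to initialize the level with the first observation are assumptions, not details from the chapter.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    level_t = alpha * x_t + (1 - alpha) * level_{t-1}, with level_0 = x_0.
    The last level serves as the one-step-ahead forecast.
    """
    if not series:
        raise ValueError("series must contain at least one value")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    smoothed = [level]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
        smoothed.append(level)
    return smoothed
```

A model of this kind has a single parameter and no memory beyond the current level, which illustrates why it can struggle as the data grow in quantity and complexity.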
Stream stage values are categorized based on resulting flood severity. The physical reality of these categories is the spatial extent of the flooding event, often referred to as a flood inundation map. These maps provide decision makers with a useful visual reference to determine what specifically has been affected by a flood event. An area of research, data visualization, and practical application that has not been fully investigated is the integration of computational intelligence stream stage predictions with geospatial flood inundation maps. The methodology provided in the following section addresses this gap.
3. Methodology
This section consists of three parts: LSTM prediction of stream stage, data required, and geoprocessing procedures. First, a brief overview of LSTM will be given. This will include explanatory figures and relevant mathematical formulas. Second, data required to conduct the LSTM prediction of stream stage will be procured. Flood inundation imagery and road network data will also be obtained. Lastly, data will be uploaded to a GIS software and processed for end use by traffic decision makers. An illustrative example is presented in the next section.
3.1 LSTM prediction of stream stage
Stream stage prediction is a time series forecasting procedure that depends on previous data to predict future values. As the quantity and quality of data continue to increase, more powerful computational approaches can be applied to prediction problems. The results of the literature review demonstrated that deep learning approaches, namely LSTM networks, are increasingly being applied to these problems.
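To make the dependence on previous values concrete, a univariate series is commonly reframed as input windows paired with the value to be predicted. The sketch below shows one way to do this. The function name and the `lookback` and `horizon` parameters are illustrative assumptions, and the chapter does not state which window length was used.

```python
def make_windows(series, lookback, horizon=1):
    """Split a series into (inputs, targets) for supervised learning.

    Each input holds `lookback` consecutive values; its target is the value
    `horizon` steps after the window ends.
    """
    if lookback < 1 or horizon < 1:
        raise ValueError("lookback and horizon must be positive")
    inputs, targets = [], []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        end = start + lookback
        inputs.append(list(series[start:end]))
        targets.append(series[end + horizon - 1])
    return inputs, targets
```

With 15-minute observations, for example, a `lookback` of 4 would give the model the previous hour of gauge heights for each prediction.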
Deep learning extends the conventional neural network by adding layers and layer types. Figure 1 provides a visual comparison of the two approaches. The simple neural network (left) consists of a single input layer, hidden layer, and output layer. In contrast, the deep learning neural network (right) has one input layer followed by three successive hidden layers that ultimately feed into a final output layer. This configuration has generated superior performance in capturing complex relationships.
However, neither approach retains previous time step information. Recurrent neural networks (RNNs) were introduced to address this limitation. LSTM networks are the deep learning variant of RNNs. All figures and mathematical formulation are borrowed from . The primary benefit of LSTM networks is the capacity to retain longer term information. This is accomplished by removing and adding information as determined by a series of ‘gates’ and vector operations. Figure 2 provides a visual representation of an LSTM cell. The first gate, illustrated in yellow, is the forget gate. It uses the current input (x_t) and the output from the previous step (y_{t-1}) to generate a value between 0 and 1 that determines how much information is passed on. A zero corresponds to no information transfer whereas a one represents a complete transfer.
The result of this procedure is presented mathematically in Eq. (1) as a sigmoid neural network layer, where U (weights) and W (recurrent connections) are matrices.
Next, a decision must be made regarding what information needs to be stored. This is accomplished by applying an additional sigmoid layer (red, i_t). New values to be added to the cell state are then generated using a tanh layer (green). Eqs. (2) and (3) present these procedures mathematically.
The line at the top of the cell is known as the cell state and interacts with all components. Information can be forgotten when the old state is multiplied by the output of the forget gate. The product of the second (red) and third (green) layers is then added, which provides new information to the cell state, as represented by Eq. (4).
Lastly, the output layer of the LSTM cell determines the forecast for the current time step. The outputs of a sigmoid layer (blue) and a tanh layer are multiplied to generate the cell output. This final step is represented by Eqs. (5) and (6).
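The numbered equations are not reproduced in this text. For orientation, the standard LSTM formulation that matches the description of Eqs. (1) to (6) is sketched below. The symbols f_t, C̃_t, C_t, and o_t and the gate superscripts are assumed notation, bias terms are omitted because only U and W are mentioned, and the cited source may differ in detail.

```latex
\begin{align}
f_t &= \sigma\left(U^{f} x_t + W^{f} y_{t-1}\right) \tag{1} \\
i_t &= \sigma\left(U^{i} x_t + W^{i} y_{t-1}\right) \tag{2} \\
\tilde{C}_t &= \tanh\left(U^{g} x_t + W^{g} y_{t-1}\right) \tag{3} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{4} \\
o_t &= \sigma\left(U^{o} x_t + W^{o} y_{t-1}\right) \tag{5} \\
y_t &= o_t \odot \tanh\left(C_t\right) \tag{6}
\end{align}
```

Here σ is the logistic sigmoid and ⊙ denotes element-wise multiplication, so each gate scales its operand by a value between 0 and 1.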
The result of this computational procedure is a time series forecast of future values. However, a large amount of data must be gathered to use as a model input. This data is presented in the next section.
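The gate-by-gate procedure can also be traced numerically. The sketch below runs a single LSTM cell on scalar inputs, which is enough to follow the data flow for a univariate series. The weight dictionaries, the gate names `f`, `i`, `g`, and `o`, and the omission of bias terms are assumptions for illustration. A trained network would use learned matrices, typically through a deep learning library.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, y_prev, c_prev, U, W):
    """One scalar LSTM step; U and W map gate names to input/recurrent weights."""
    f_t = sigmoid(U["f"] * x_t + W["f"] * y_prev)    # forget gate
    i_t = sigmoid(U["i"] * x_t + W["i"] * y_prev)    # input gate
    g_t = math.tanh(U["g"] * x_t + W["g"] * y_prev)  # candidate values
    c_t = f_t * c_prev + i_t * g_t                   # updated cell state
    o_t = sigmoid(U["o"] * x_t + W["o"] * y_prev)    # output gate
    y_t = o_t * math.tanh(c_t)                       # cell output
    return y_t, c_t


def run_sequence(inputs, U, W):
    """Feed a sequence through the cell, starting from zero output and state."""
    y, c = 0.0, 0.0
    outputs = []
    for x in inputs:
        y, c = lstm_step(x, y, c, U, W)
        outputs.append(y)
    return outputs
```

Because the output is a sigmoid value multiplied by a tanh value, it always lies between -1 and 1. Gauge heights would therefore need to be scaled before training and rescaled afterwards, a preprocessing step the chapter does not describe.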
3.2 Data required
Historic stream stage height for the location further explained in Section 4 must first be gathered. A total of 113,994 data points were procured, corresponding to 15-minute intervals from May 19, 2016 (5 PM) to September 1, 2019 (4 PM). Stage height is herein referred to as ‘gauge height’ to reflect the source of the data. This data is represented graphically in Figure 3.
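A quick completeness check compares the reported number of points with the number of 15-minute timestamps in the stated date range. The sketch below assumes both endpoints are included and ignores daylight saving time shifts. The function name is illustrative.

```python
from datetime import datetime, timedelta


def expected_readings(start, end, step=timedelta(minutes=15)):
    """Count timestamps from start to end (inclusive) at a fixed step."""
    if end < start:
        raise ValueError("end must not precede start")
    return (end - start) // step + 1


start = datetime(2016, 5, 19, 17)  # May 19, 2016, 5 PM
end = datetime(2019, 9, 1, 16)     # September 1, 2019, 4 PM
expected = expected_readings(start, end)
reported = 113_994                 # number of points stated in the text
missing = expected - reported
```

Under these assumptions the range holds 115,197 possible timestamps, so the reported data set is missing roughly 1,200 readings, or about 1% of the record. Gaps of this size are worth handling explicitly before building fixed-length input windows.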
Using USGS’ flood inundation mapper (FIM), these gauge heights can be tied to a specific flood inundation profile. The FIM is a publicly available tool that provides flood inundation maps for one-foot gauge height increments in image format (.tif). A sliding bar that accomplishes this is available on the online user interface and is presented in Figure 4.
An example of a flash flood inundation profile uploaded to GIS software is provided in Figure 5. Purple lines correspond to road network data derived from the National Transportation Dataset. Blue raster (grid of pixels) imagery denotes the depth of water at discrete locations, where darker blue reflects deeper water. Useful geoprocessing techniques that generate actionable decision support tools are presented in the next section.
3.3 Geoprocessing procedures
Traffic decision makers are tasked with identifying flood affected road segments. In Figure 5, it can be observed that the flood inundation profile overlaps certain road segments. Relying on visual inspection alone is time consuming and prone to human error. A solution to this issue is the application of two straightforward geoprocessing tools that are built into most GIS software: conversion and intersection.
Some tools do not allow raster and vector data layer interoperability. Therefore, it is necessary to convert one of the data layers to establish a consistent data type. One approach is to convert the raster layer into a vector layer using the conversion tool within ArcGIS. Figure 6 illustrates the result of this operation. The flood inundation profile has been converted into several points at 1-m increments. This spatial resolution can be modified by the user. The road network has been changed from its previous color to improve readability.
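The idea behind the raster-to-vector conversion can be sketched without GIS software. The function below turns each wet cell of a depth grid into a point at the cell center. It is a simplified stand-in for illustration and not the ArcGIS conversion tool itself. The grid layout, coordinate origin, and function name are assumptions.

```python
def raster_to_points(depth_grid, origin_x, origin_y, cell_size, nodata=None):
    """Convert wet raster cells to (x, y, depth) points at cell centers.

    depth_grid is a list of rows with row 0 at the top (north) edge, and
    (origin_x, origin_y) is the top-left corner of the grid.
    """
    points = []
    for row_idx, row in enumerate(depth_grid):
        for col_idx, depth in enumerate(row):
            # Skip empty, no-data, and dry cells.
            if depth is None or depth == nodata or depth <= 0:
                continue
            x = origin_x + (col_idx + 0.5) * cell_size
            y = origin_y - (row_idx + 0.5) * cell_size
            points.append((x, y, depth))
    return points
```

Setting `cell_size` to 1 with coordinates in meters would produce points at the 1-m spacing described above.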
Once the raster layer has been converted into vector format, it is eligible for use as an input layer for the intersection tool. The intersection tool generates a point at every location where there is an intersection between the input layers. In the next section, an illustrative example is provided to demonstrate the effectiveness of the methodology presented.
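A simplified version of the intersection step is sketched below. It flags a road segment as flood affected when any flood point lies within a chosen tolerance of the segment. This brute-force check is an illustrative approximation and not the algorithm used by any particular GIS intersection tool. The data structures, names, and tolerance are assumptions.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest distance from point (px, py) to the segment from a to b."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def flooded_segments(roads, flood_points, tolerance):
    """Return sorted ids of road segments within `tolerance` of any flood point.

    roads maps a segment id to ((ax, ay), (bx, by)); flood_points holds
    tuples whose first two entries are x and y.
    """
    hits = set()
    for seg_id, ((ax, ay), (bx, by)) in roads.items():
        for point in flood_points:
            if point_segment_distance(point[0], point[1], ax, ay, bx, by) <= tolerance:
                hits.add(seg_id)
                break
    return sorted(hits)
```

A tolerance of about half the point spacing is a reasonable starting value. For large road networks, a spatial index would avoid comparing every point with every segment.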
4. Illustrative example
Valley Park, Missouri is located at the intersection of I-44 and State Route 141. This location is the setting for the example figures presented previously. The Meramec River winds through this area and has regularly flooded in recent years. In 2017, the river exceeded its banks and caused significant damage to the surrounding area as seen in Figure 7. This location provides a suitable candidate to test the methodology presented given the extent of the flood event and data availability.
First, data is gathered from a nearby stream gauge. Figure 8 provides a geographical point of reference for the gauge, denoted by a green square, with respect to I-44 and State Route 141. The gauge height data presented in Figure 3 is then procured and used as an input for the LSTM network. Figure 9 presents the prediction results of the LSTM model superimposed on the actual data for May 19, 2016 to September 1, 2019.
The predictions for the training (orange) and testing (green) sets deviate little from the actual data (blue). This lack of discrepancy between the actual data and the predictions demonstrates the model’s effectiveness. Further, it is useful to determine how the prediction compares with publicly available forecasts for the same location. USGS provides a forecast every six hours. In contrast, the LSTM network, which operates on 15-minute observations, provides 24 predictions in the same period. Figure 10 provides a comparison of the prediction provided by USGS and the LSTM model for September 1, 2019 (6 PM) – September 3, 2019 (6 AM).
The red line represents the original data. Gauge height is initially observed at just above six feet. From there, it trends downward until it reaches the end of the dataset at less than 3.5 feet. The green line corresponds to the USGS prediction. This prediction initially overshoots the original data before briefly correcting and then diverging significantly from the observed trend. Lastly, the blue line represents the LSTM prediction. At first, this prediction captures the downward trend missed by the USGS prediction. Ultimately, the prediction flattens out and diverges from the original observations, but to a lesser extent than the USGS prediction. Root Mean Squared Error (RMSE) values for each of the predictions are provided to further demonstrate the difference in model performance. The RMSE value of 0.453 for the LSTM model represents superior accuracy compared to the value of 1.065 for the USGS prediction. Therefore, the LSTM model presented here improves on the accuracy of publicly available forecasts and can be used as an input for the flood inundation tool.
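For reference, RMSE is the square root of the mean squared difference between observed and predicted values, so it is expressed in the same units as the gauge height. A minimal sketch follows. The function name is illustrative and the chapter's own computation is not shown.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Taken at face value, the reported values of 0.453 and 1.065 correspond to an error reduction of roughly 57% over the comparison window.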
Valley Park has 43 flood inundation profiles available in one-foot increments from 11–54 feet. The highest stage value recorded at this location is 44.11 feet on December 31, 2015. Figure 11 provides the flood inundation profile for 45 feet to approximate this event. Note that 45 feet is used instead of 44. This is due to the flood inundation profile incremental limitation and opting for a rounding approach that provides a more conservative risk assessment. The inundation profile is then converted to point format and intersected with the road network as illustrated by Figure 12.
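The conservative rounding rule described above can be expressed as a small helper that links a predicted stage to an available profile. This is a sketch under stated assumptions: the 11 to 54 foot range is taken from the text, while the function name and the handling of out-of-range stages are illustrative choices.

```python
import math


def select_profile_ft(stage_ft, min_profile_ft=11, max_profile_ft=54):
    """Return the one-foot profile at or above the stage (conservative rounding).

    Returns None when the rounded stage is below the lowest available profile.
    """
    if stage_ft > max_profile_ft:
        raise ValueError("stage exceeds the highest available profile")
    profile = math.ceil(stage_ft)
    if profile < min_profile_ft:
        return None
    return profile
```

For the historic crest of 44.11 feet, the helper returns 45, matching the profile used in Figure 11.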
5. Discussion
At present, urban planners such as traffic decision makers rely on static flood inundation maps and post hoc planning to reroute traffic in the event that a flood occurs. This approach exposes motorists already in transit to rapidly changing road conditions. To address these risks, a field of research has emerged to provide decision makers with real-time decision-making tools. However, using time series prediction models that capture river characteristics and integrating them with flood inundation profiles has received limited attention. The methodology provided here addresses this gap.
Traffic decision makers can use the data visualization presented in Figure 12 as a powerful decision support tool. The flood affected road segments can be easily identified (orange) and rerouting measures can be promptly dispatched. With the improved temporal resolution and accuracy of the LSTM prediction of stage height, traffic decision makers can deploy resources proactively to avoid unnecessary risk to motorists and improve traffic flow. Concluding remarks, limitations, and future work are presented in the next section.
6. Conclusion
Flash floods are a frequent and devastating natural disaster. The responsibility for managing these events belongs to local decision makers who work in a resource-constrained environment. To improve their decision-making effectiveness, a framework was presented that integrates machine learning and geospatial data to extract spatial and temporal trends using publicly available data. An illustrative example was provided to demonstrate the effectiveness of the framework. Valley Park, Missouri is located near the intersection of I-44 and State Route 141. These roads are major traffic routes, and persistent flooding of the Meramec River has jeopardized the safety of motorists and the flow of commercial goods. Using 113,994 river stage observations procured from a nearby sensor, an LSTM network was developed to improve the accuracy of publicly available forecasts. The result was an improvement in both the frequency and accuracy of forecasts. Once the stage value is predicted, it can be tied to a spatial flood inundation profile using the publicly available FIM. Using the flood inundation profile for 45 feet at Valley Park as a proxy for the historic crest at this location, a data visualization of flood affected road segments was generated in a GIS setting. The key benefit of this output is the ease with which traffic decision makers can use the results to inform urban planning and guide real-time decision making when a river stage value is predicted to reach a flood stage for a specified river segment. Despite the usefulness of the findings, there remain a number of model limitations that represent areas of future work.
Model limitations can be divided into two categories: data gathering and model extension. Deep learning models depend on large amounts of data. Therefore, sensors that collect data need to be installed and active for an extended period. The cost to install and maintain an enlarged sensor network might be prohibitive for some locations. As a result, model implementation is limited to river locations where sensors are already installed. Additionally, FIM coverage is confined to a small number of locations nationwide. As with sensor coverage, the model cannot be applied to locations where flood inundation maps are not already available. Model extension includes options to improve the model in a material way. One recommendation would be to determine the best locations for road signage that will provide optimal re-routing to motorists given a finite amount of signage. Another approach would involve working with local decision makers to determine re-routing effectiveness based on how quickly resources are deployed given model predictions. Areas of future work not related to model extensions include alternative prediction approaches in river networks with no sensors and refinement of the model to account for flash floods. Each of these components represents a considerable opportunity for model enrichment that would further improve decision-making effectiveness for traffic management professionals.
The results presented here demonstrate the utility of using machine learning models and geospatial data to generate data visualization tools that key stakeholders can use to improve planning effectiveness. As data becomes increasingly available, comparably sophisticated methods can be applied to a suite of natural disaster phenomena. The outcome of such an undertaking will be the widespread use of data visualization tools that reduce the risk motorists are exposed to and mitigate the accompanying economic fallout.
Acknowledgments
This work was partially funded by the Missouri Department of Transportation, Award Number TR201912 and the Mid-America Transportation Center, Award Number 25-1121-0005-130.
Conflict of interest
The authors declare no conflict of interest.