Definitions and formulas of selected topographic metrics.
Landscape topography is a key parameter impacting soil properties on the earth surface. Strong topographic controls on soil morphological, chemical, and physical properties have been reported. This chapter addressed applications of topographical information for mapping spatial patterns of soil properties in recent years. Objectives of this chapter are to provide an overview of (1) impacts of topographic heterogeneity on the spatial variability in soil properties and (2) commonly used topography-based models in soil science. A case study was provided to demonstrate the feasibility of applying topography-based models developed in field sites to predict soil property over a watershed scale. A large-scale soil property map can be obtained based on topographic information derived from high-resolution remotely sensed data, which would benefit studies in areas with limited data accesses or needed to extrapolate findings from representative sites to larger regions.
- DTM-based model
- high-resolution remotely sensed data
- soil carbon
- principal component analysis
- factor analysis
Landscape topography is a key parameter influencing biogeochemical processes that occur in the near-surface layer of the earth . In particular, the topography plays an important role in soil formation through regulating soil hydrological regimes and controlling the gravity-driven soil movements [2, 3, 4, 5, 6].
Quantitative and qualitative topographic information is essential in understanding the heterogeneity of soil chemistry and physics. Before the 1990s, geographic maps were the main source to quantify landscape topography in soil science . Topographic variables, such as slope and plan and profile curvatures, were calculated manually from these maps to investigate their relationships with soil properties and to generate soil maps [8, 9, 10].
Along with the development in computer, aerial, space, and geographic technologies, the availability of high-resolution digital elevation models (DEMs) introduces a new technique in deriving digital terrain models (DTMs) and has been the main source for topographic information extraction in soil biogeochemical studies since the 1990s . A DEM is a digital representation of the terrain surface elevation referenced to a vertical datum. A DTM is an enhanced DEM that has been augmented with breaklines and other observations to describe the land surface geometry [1, 11, 12]. The application of DTMs enables effectively reconstruct topographic landscape over a large scale. Recently, there are two main applications of DTMs in soil science. One is analyses of topographic influences on soil formation and movement, which would be introduced in Section 2. The other is modeling of relations between soil properties and topography and using the results to predict soil properties, which would be discussed in Section 3.
The objective of this chapter is to provide an overview of how topographic heterogeneity causes the spatial variability in soil properties. This chapter starts with an introduction of DTMs applications, which is then followed by reviews of investigations on topographic impacts on soil formation and movements and modeling of soil morphological, chemical, physical properties based on DTMs. The last section presents a case study of DTM-based analysis on how land topography affects soil carbon (C) dynamics.
2. Impacts of topography on soil properties
DTMs are functions of morphometric variables that digitally represent the geometry of the land surface. Various techniques have been developed to generate different DTMs such as topographic metrics of slope, aspect, and curvature. Fifteen topographic metrics that have been reported to be highly correlated with soil properties, including slope gradient, slope aspect, profile curvature, plan curvature, general curvature, flow accumulation, topographic relief, topographic openness, upslope slope, flow path length, downslope index, catchment area, topographic wetness index, stream power index, and slope length factor, are introduced in this section (Table 1). Based on the spatial scope, the topographic metrics can be grouped into three main categories :
Local topographic metrics: variables describe the surface geometry at a given point on the land surface. Slope gradient, slope aspect, and curvature related (plan, general and profile curvatures) metrics belong to this category.
Nonlocal topographic metrics: variables consider relative positions of a selected point, including catchment area, upslope slope, downslope index, flow path length, flow accumulation, topographic relief, and topographic openness.
Combined topographic metrics: variables integrate local and nonlocal topographic metrics considering both local surface geometry and relative positions of a point on the land surface. This group of metrics includes topographic wetness index, stream power index, and slope length factor.
|Variables||Definition and formula|
|Slope gradient, G (radian)||An angular measure of the relation between a tangent plane and a horizontal plane|
|Slope aspect, A (radian)||Direction of slope measured clockwise with north as 0|
|Profile curvature, P_Cur (1/m)||Slope change rates in the vertical plane|
|Plan curvature, Pl_Cur (1/m)||Curvature in a horizontal plane|
|General curvature, G_Cur (1/m)||Curvature of the surface itself|
|Catchment area, CA (m2)||Upslope area contributing runoff to a given point on the land surface|
|Upslope slope, Upsl (radian)||Mean slope of upslope area|
|Downslope index, DI (radian)||Head differences along a flow path|
|Flow path length, FPL (m)||Maximum distance of water flow to a location in the catchment|
|Flow accumulation, FA (m2)||Land area that contributes surface water to an area in which water accumulates|
|Topographic relief, TR (m)||Elevation difference between the highest (hmax) point in an area and a given point (hi)|
|Positive topographic openness, PTO (radian)||Angular measure describing the relationship between surface relief and horizontal distance|
|Topographic wetness index, TWI||Frequencies and duration of saturated conditions|
|Stream power index, SPI||Erosive power of overland flow|
|Slope length factor, LS||Distance from flow origin to a point where deposition begins|
These nonlocal and combined topographic metrics often reflect important physics involved in water and soil mass transfer processes considered to have important impacts on soil property patterns.
2.1 Local topographic metrics
Slope gradient indicates the steepness of a line which directly influences the velocity of a gravity-driven flow . For example, a steep area drains quickly and retains less soil than a flat area [4, 13]. Therefore, negative soil redistribution rates with high erosion possibilities are often observed in steep areas. The erosion processes tend to remove fine particles which are usually enriched in soil organic carbon (SOC), leading to low SOC content in a steep area [4, 14]. Meanwhile, the slope gradient can impact soil water content [2, 7]. For relatively flat areas, soil water content commonly decreases with slope gradient due to increased lateral flow and depositional crusts that decrease infiltration; while as slope steepens, rills may occur that can disrupt the crusts and favor greater infiltration, and therefore lead to a positive relationship between soil water content and slope gradient [15, 16, 17, 18].
Slope aspect shows the direction that a slope faces. This metric influences soil-water balance by affecting insolation and evapotranspiration . Soil temperature and evapotranspiration tend to be lower, and soil water content tends to be higher in shady aspect areas. These environmental conditions can be favorable for slow decompositions of organic matter and high accumulations of soil C and nitrogen (N) content [20, 21, 22]. Soil water content impacted by slope aspect can further influence vegetation density, which may have impacts on runoff velocities and soil erosion rates .
Profile, plan, and general curvatures are important topographic factors controlling patterns of overland flow and soil water content. Profile curvature shows upwardly concave with positive values and upwardly convex with negative values (Figure 1a). This variable affects flow acceleration and deceleration and therefore influences soil redistribution and distribution patterns of SOC content [24, 25, 26]. A positive plan curvature value indicates a laterally convex surface and a negative value indicates a laterally concave surface  (Figure 1b). Water accumulates and soil water content decrease when flow diverges (positive plan curvature) and increase when flow converges (negative plan curvature) . General curvature is the curvature of the land surface and describes peaks with positive values and valleys with negative values. This metric enables more accurate estimation of overland flow paths than plan and profile curvatures, and can significantly correlate with patterns of soil erosion and deposition [4, 27].
2.2 Nonlocal topographic metrics
Catchment area and slope related nonlocal topographic metrics (upslope slope and downslope index) affect soil properties mainly through regulating soil water content. At a location, increases in water amount from upslope areas can increase water supply to the location and affect the water accumulation . Therefore, positive correlations have been observed between the catchment area and soil water content . Furthermore, as the catchment area increases, the chance for sediment deposition increases, and thus affects the soil C stocks . The upslope slope relates to slopes in upslope contributing areas. Overland flow velocities are usually less at positions with lower values of upslope slope [1, 30]. The downslope index is a metric including dispersal (downslope) controls on drainage . Since the drainage of a location is the balance between the water from a specific upslope contributing area and to a downslope area, this index usually shows a better representation of groundwater gradients and soil water content than slope gradient [31, 32].
The two flow-related nonlocal topographic metrics (flow path length and flow accumulation) reflect the impacts of soil hydrology on soil properties. The longer flow path length decreases overland flow velocity and increases infiltration [33, 34]. Increased erosion of fine particles can also be observed when flow path length increases [35, 36]. This metric has been widely used in soil erosion models, describing soil loss under flow divergence and convergence conditions [37, 38]. Flow accumulation mainly influences water conditions in soils. Flow volume and soil water content response positively to this metric, which in turn can influence the soil C stocks [39, 40].
For topographic relief, higher values suggest larger differences from the highest points, which would stimulate flow velocity, leading to more rapid downslope soil transport from low relief areas [4, 41]. Moreover, topographic relief influences landscape drainage characteristics. Tucker and Bras  found that drainage density was positively correlated with relief in semiarid and low-relief landscapes but negative related to relief in humid or high-relief landscapes. Areas with a broad range of relief may cover several altitudinal climatic zones with differences in vegetation types, further influencing weathering and denudation processes .
Topographic openness describes the distinction between relief and surrounding topographic features . Convex landforms often exhibit high positive topographic openness values, whereas concave landforms typically have high negative topographic openness values (Figure 2) [44, 45]. Therefore, soil water content may change with this variable . The low positive openness areas are more likely to be depressional areas with high soil water contents that provide suitable anaerobic environments for denitrification but impede aerobic SOC decomposition [4, 46].
2.3 Combined topographic metrics
Topographic wetness index combines a local topographic metric (slope) and a nonlocal topographic metric (upslope contributing area) . It is considered as an indicator effectively reflecting the spatial distribution of wetness conditions as the upslope contributing area would impact groundwater level and soil water content, and the slope would influence drainage processes . Areas with higher wetness index tend to be wetter. The topographic wetness index has been used to estimate the spatial distribution of hydrological and geochemical properties of soil, and significant correlations have been observed between this metric and soil C and N content [3, 4, 32, 46, 48, 49].
Stream power index takes into account both specific contributing area and slope. This metric is useful for characterizing potential erosive power of water flow . When the slope gradient and catchment area increase, the amount of water from contributing area and the velocity of water flow increase, and consequently enhancing the erosive power of water . Therefore, areas with larger stream power index values have greater potential to be erosive regions [4, 51]. Due to its impacts on erosive power, this metric can also be useful in understanding erosion-induced soil C and N dynamics [14, 52, 53].
Slope length factor includes the length and steepness of a slope and thus reflects the topographic impacts on erosion [54, 55]. As the slope length increases, the soil loss per unit area usually increases due to a greater runoff accumulation on a longer slope length that increases transport capacity of runoff; as slope steepness increases, soil loss also generally increases [54, 56]. This factor is essential in estimating soil transport and erosion by runoff [37, 38, 50, 56, 57].
3. DTM-based soil property prediction
In a DTM-based soil property model, the predictive variable could be the morphological, chemical, or physical property of soil. Development of DTM-based models follows two assumptions including that (1) the controls of topography on soil properties can be found through a relatively small set of soil samplings and topographic metrics, and (2) the statistical correlations between topography and soil properties are often strong. In this case, soil properties can be predicted based on the topographic metrics . Due to the recent availability of large-scale, high-resolution DEMs, DTMs over large-scales can be derived. The DTM-based models benefit investigations in regions with limited observations and can generate spatially continuous soil property maps based on extrapolation.
Methods of DTM-based soil property prediction could be grouped into two categories :
DTM-based models to predict quantitative soil properties based on statistical analyses. Multiple regression analysis, regression kriging, cokriging, and kriging with external drift are the widely used methods to predict quantitative soil properties.
DTM-based models to predict categorical soil properties. Statistical methods such as classification tree model, fuzzy logic, and discriminant analysis are usually employed in this category.
3.1 DTM-based methods to predict quantitative soil properties
Multiple linear regression (MLR) simulates relationships between two or more independent variables and a dependent soil property variable by fitting to a linear equation. The DTM-based MLR models have been applied to study spatial patterns of soil structures, horizonation, and soil water content [12, 39, 58, 59, 60, 61, 62, 63, 64, 65], to explore spatial variability of cation exchange capacity and pH [62, 65, 66, 67], and to derive continuous quantitative maps of SOC, C isotopes, and nutrients over large spatial scales [4, 6, 12, 14, 46, 52, 68, 69]. In some modeling investigations, Hybrid regression methods were used to improve the efficiency of soil property prediction. Li et al.  combined stepwise MLRs with principal component analysis (PCA) for SOC mapping. Results suggested that the combination of DTM-based MLRs with PCA outperformed regular stepwise MLRs in the prediction of SOC and soil redistribution rates at a watershed scale.
Regression kriging (RK) is a spatial prediction combining an MLR with kriging of the regression residuals. The RK acts as a MLR model if the data used in the model have low spatial structure, and the method reduces to Ordinary kriging (OK) if there are no linear statistic correlations between the dependent variable and the ancillary variables . Based on topographic and other environmental variables, numerous studies have applied RK for predicting spatial patterns of soil properties, such as soil horizon thickness [70, 71], soil structures [63, 70, 71], soil water content [72, 73], soil C content [63, 69, 74, 75], cation exchange capacity [66, 76], and soil hydraulic properties [77, 78]. Generally, this method is more accurate in soil property estimations than the OK, Cokriging, or MLR because residual values from kriging analysis were added to the regression [63, 70, 71, 73, 76, 77]. However, Zhu and Lin  reported that the RK performed worse than the OK for soil property prediction in relatively low relief areas.
Cokriging (CK) and kriging with external drift (KED) are also popular and practical spatial predicting techniques in digital soil mapping. The CK calculates soil properties by investigating topographic metrics in the kriging procedure and KED uses external ancillary topographic variables as kriging weights. Various studies have employed CK and KED to derive continuous maps of soil physical and chemical properties [65, 66, 70, 71, 76, 77, 78, 79, 80]. Some of these studies were also suggested that these techniques would be superior to OK in soil property estimation when the selected topographic metrics are highly correlated with the dependent variables [79, 81].
3.2 DTM-based methods to predict categorical soil properties
Classification tree models (CTMs) are a major type of Decision Tree method used in soil science, in which the target variable is a categorical soil variable. This model applies a set of rules that use explanatory variables to split data into homogeneous subsets. The explanatory variables can be either categorical, such as geological unit number, soil unit, etc., or continuous, such as slope, elevation, topographic wetness index, etc., [82, 83]. Compared to mathematic functions, the tree structure can provide a more visualized explanation of relationships between explanatory variables and the target variable. The CTMs can be used to derive efficient predictions of soil taxonomic classes from local to large spatial scales [82, 83, 84, 85, 86, 87, 88, 89, 90, 91]. Soil drainage can also be effectively classified using the CTMs with soil profiles and topographic metrics as predictors [92, 93, 94, 95, 96].
The basic idea of fuzzy logic (FL) is to show “degrees of truth” for a variable. Soils are continuums in both geographic and attribute spaces. As a result, using 0 and 1 or discrete categories cannot provide sufficient information about soil properties. The FL overcomes the limitation. If a variable belongs to a set, the model would take a value between 0 to 1 instead of 0 or 1. Several studies have used the DTM-based FL to improve soil taxonomic classes in soil mapping [97, 98, 99, 100, 101], soil texture and soil horizonation prediction [98, 102, 103, 104, 105], and soil vulnerability classification . Qi et al.  found that using the FL the accuracy of soil series name prediction increased 17% compared to the conventional soil survey. The FL was also combined with maximum likelihood regression to derive the prediction of some continuous soil properties .
Discriminant analysis (DA) is a type of supervised classification to assign objects to the most likely group among a lot of groups. It uses some observations (training dataset) to classify others. This method is applicable when correlations between soil property variables and independent variables are high . It has been applied to differentiate soil taxonomic classes and to generate soil texture maps using multiple ancillary variables including topographic metrics [107, 108, 109, 110]. Several studies also demonstrated the feasibility of using DA in deriving soil drainage classes based on its relationships to topography and soil electrical conductivity [111, 112, 113, 114].
3.3 New emerging methods to predict soil properties
Artificial intelligence (AI) or machine learning gives computers a degree of sophistication to act intelligently . Therefore, to be intelligence, computers should be able to learn from training datasets, correctly interpret external data, and apply learned knowledge to achieve specific goals. With increased computing power, massive sets of labeled data, and developed pre-trained models, increasingly researchers have applied AI to fields such as speech recognition, objective detection, visualization, machine translation, image processing and others . However, it is not until the recent decade that the potential applications of AI on soil property prediction have come into more common awareness by scientists.
Artificial neural networks (ANNs) are a representative AI technique that has been applied to solve complex machine learning problems (Figure 3a). The method has similar data processes as a biological neural with nonlinear mapping structures, which consists of a set of interconnected units (neurons) . The input neurons are predictors, linking to one layer of hidden neurons and finally linking to the output variables . To obtain accurate prediction results, the network model is trained first by a set of observations. The weights that connect neurons are adjusted iteratively using the training dataset. After training, the model is applied to predict areas with the same input variables. ANNs outperform traditional statistics in handling large datasets even when the input data are noisy with low levels of precision due to the ability to reduce bias by evenly distributing training data across classes . Various researchers have employed ANNs for efficient prediction of quantitative soil chemical and hydrological properties [118, 120, 121] and adequate mapping categorical soil taxonomic classes [122, 123, 124, 125, 126, 127, 128, 129] based on DTMs and environmental variables. Zhao et al.  also tested the feasibility of using ANNs for soil drainage classification and found an accuracy of 52% between field observations and digital classification.
Deep learning (DL) is considered as an advanced ANN (Figure 3b) that has been facilitated by recent advances in technology for highly parallel computing. In contrast to single hidden layer ANNs, DL algorithms allow the computer to learn on its own by multi-layer nonlinear transformations of the input training data . For instance, such algorithms can define edges within images by training on multiple examples and perform automatic feature extraction without human intervention. Therefore, massive quantities of representative learning data are the prerequisite for effective estimation from DL. The architectures of DL include Convolutional Neural Networks (CNNs), Deep Belief Networks (DBNs), and Recurrent Neural Networks (RNNs). CNNs are classic feedforward networks in which the hidden layers consist of convolutional layers, pooling layers, and fully-connected layers. The convolutional layers apply different convolution operations (filters) to pass results from local patches in the feature maps of the input or the previous layers to the next layers, enhancing certain features in the output. Neurons in the same convolutional layer share the same weight. The pool layers merge similar feature together, improving the robustness of features against noise and distortion. The convolutional and pooling layers are finally stacked to a fully-connected layer. The local connectivity in a convolution layer allows CNNs to achieve a better generation in output analysis and the shared weights increase the possibility to extract information of high complexity . Padarian et al.  applied CNNs to predict SOC at multiple depths using elevation, slope, topographic wetness index, temperature and rainfall as input data. The results suggested that the CNNs reduced errors by 25% for SOC predictions than the conventional Cubist model. The DBNs are considered as a composition of unsupervised sub-networks, which are trained to maximize the likelihood of training data. Each sub-network serves as a visible layer used for unsupervised training of the next layer . Song et al.  demonstrated the usefulness of DBNs in predicting soil water content in highly nonlinear forms over an irrigated field. RNNs have a “memory” called hidden state to remember all information that has been calculated, so the output of RNN loops connect to their past decision nodes based on the hidden state. The networks process an input sequence at one time, preserving the sequential information in the hidden state and producing the output sequence. Therefore, this model is especially useful for tasks containing sequential input . Researchers have demonstrated the feasibility of using RNNs for hydrological study, although no reports were found using this approach to map soil properties based on topography [135, 136].
Random forest (RF) is another emerging method of AI and consists of an ensemble of classification and regression trees for prediction (Figure 4). Each tree is a random subset of features and uses a random set of the training data (about 2/3 of the available observations), which increases the diversity of the forest and decreases the correlation of individual trees. RF commonly has high efficiency and low bias and variance since the output is the average or majority voting of a large number of trees . The method has been proved to be resistance to over-fitting because each tree is trained on a unique bootstrap subset and provide a reliable error estimate using Out-Of-Bag data (the remaining one-third of the observations) . Because of the above advantages, increasing scientists have used RF in soil mapping. Combining topographic metrics, environmental variables, climate variables, or/and land cover as input, RF can predict quantitative and category soil properties. For quantitative soil property prediction, the output is the average of individual tree outputs. RF has been successfully applied to investigate spatial patterns of soil organic matters [139, 140, 141, 142] and to estimate soil texture . Guo et al.  further developed soil organic matter prediction by combining RF with Residuals Kriging, for which the prediction accuracy increased dramatically (R2 = 0.86) compared to the method using RF only (R2 = 0.65). For categorical soil property classification, the output is obtained from voting by the majority on the correct classification. Several studies have demonstrated the feasibility of using RF for updating soil survey maps  and predicting soil classes in unmapped regions [146, 147, 148].
4. Case study: DTM-based models on SOC dynamics
In this section, a case study about DTM-based modeling of SOC and soil redistribution (SR) was discussed to understand the impacts of topography on SR and SOC dynamics. We also compared efficiencies of three types of DTM-based models in predicting the soil properties. Cesium-137 (137Cs) was used to trace the SR process, and high-resolution light detection and ranging (LiDAR) data were applied to derive DEMs for DTM extraction. Based on the DEM-derived topographic information and field measured SOC density and SR rates, the multiple linear regression (MLR), MLR combined with principal component analysis (MLR-PCA), and MLR combined with factor analysis (MLR-FAn) were developed and discussed.
The study was carried out in Walnut Creek watershed (WCW), which is located in Boone and Story counties, Iowa, USA (Figure 5a, 41°55′–42°00′N; 93°32′–93°45′W). It has a humid continental climate. The landscape of this watershed is relatively flat with a low topography relief (2.03 ± 1.62 m). The typical soils are poor-drained Nicollet and Webster soils in the lowlands and well-drained Clarion in the uplands. More than 86% area of the watershed is cropland. Chisel plowing in autumn and spring disking are the current primary tillage operations. Directions of tillage practices in the WCW are mostly north-south or east-west, depending on the management and field configurations. Detailed information on climate, soils, and farming practices can be found in Hatfield et al. .
Two field sites were selected for intensive sampling investigation. Each site is approximately 15 ha. Site 1 is in the WCW (Figure 5b) and Site 2 is located between Boone and Ames (Figure 5c), which is within 10 km of the closest watershed boundary. Similar to the WCW, low reliefs (<4.6 m) were observed for both sites. Tillage practices at these two sites were along the north-south direction.
4.2 Materials and methods
4.2.1 Field sampling and laboratory analyses
The SOC and 137Cs data used in this section have been reported in Ritchie et al.  and Li et al. [4, 52]. A total of 460 locations were randomly selected for WCW and 230 locations were selected for each site of Sites 1 and 2 (Figure 5). Topography information was extracted for all locations using the LiDAR-derived DEMs. For the watershed, 100 out of the 460 locations, including two 300-m transects, were chosen for field estimations of SOC content and 137Cs inventory in 2006. The field samplings for Sites 1 and 2 were collected in 2003. A 25 × 25 m grid was created for each site of Sites 1 and 2. The 230 samplings were obtained at grid nodes. At each location, we collected three samples that were located within a 1 m × 1 m quadrat from top 30 cm of soil using a push probe (3.2 cm diameter). At locations where sediment depositions were observed, deeper soils from the 30 to 50 cm layer were collected. Four reference soil samples for estimation of the baseline 137Cs inventory were collected from a local cemetery in WCW where no apparent soil redistribution had occurred from the 1950s. Trimble RTK 4700 global positioning system (GPS) was used to record the locations of sampling.
During laboratory analyses, bulk density was calculated after drying soil at 90°C for 48 hours based on the soil volume and dry mass weight. Then, the three samplings were mixed and sieved through a 2 mm screen. We ground subsamples that were taken from the composite soils to fine power with a roller mill and measured soil total C content by dry combustion at a temperature of 1350°C using an elemental analyzer (LECO CNS 2000, LECO Crop., St. Joseph, MI). Then, C content in CaCO3 was analyzed by dry combustion after the soil sample was baked in a furnace at 420°C for 16 hours. Estimates of SOC content (SOCcontent, %) were obtained from the differences between total C content and C content in CaCO3. SOC density (ρSOC, kg m−2) of the top 30 cm layer was calculated from the bulk density (ρbulk) and SOC content using the equation of ρSOC = SOCcontent × ρbulk × 0.3.
Measurement of 137Cs inventory used another subsample of the sieved soil sample and placed and sealed in a Marinelli beaker. The 137Cs concentration was estimated by Canberra Genie-2000 Spectroscopy System that receives input from Canberra high purity coaxial germanium crystals (HpC > 30% efficiency) into three 8192-channel analyzers through gamma-ray analysis. Analytic mixed radionuclide standard (10 nuclides) that follows the U.S. National Institute of Standards and Technology was applied for calibration the Spectroscopy System. The measurement precision is between ±4 and ± 6%. Unit of 137Cs concentration is in Becquerels per gram (Bq g−1) and was converted to 137Cs inventory in Becquerels per square meter (Bq m−2) using soil bulk density.
Calculation of SR rates based on 137Cs inventories was carried out by applying a Mass Balance Model II in a spreadsheet Add-in program . Before running the model, parameters of tillage depth, proportion factor, and relaxation depth were set to 0.25 m, 0.5, and 4 kg m−2, respectively. The baseline 137Cs inventory estimated from the mean of 137Cs inventory in Ref. sites was 2657 Bq m−2 for Sites 1 and 2 in 2003 and 2526 Bq m−2 for the WCW in 2006. Positive SR rates were obtained when 137Cs inventories were higher than the baseline and the sites were referred as depositional sites; while eroded sites were considered when negative soil SR rates were estimated under conditions of lower 137Cs inventories than the baseline. Details of soil sampling and laboratory analyses can be found in Ritchie et al.  and Li et al. .
4.2.2 Topographical analysis
Fifteen topographic metrics that were discussed previously were used as ancillary variables for the development of the DTM-based SOC and SR models. All metrics were derived from DEMs generated from high resolution (1 m horizontal and 0.1 m vertical resolutions) LiDAR data . Before generation of topographic metrics, inverse distance weighted interpolation was applied to produce 3 m spatial resolution DEMs after converting the raw LiDAR data to LAS files.
Topographic metrics were derived based on the 3 m DEMs after filtering twice by a 3-kernel low pass filter. Modules in an open-access software of the System for Automated Geoscientific Analysis (SAGA) v. 2.2.5 were applied to generated 14 of the selected topographic metrics including slope gradient (G), aspect (A), profile curvature (P_Cur), plan curvature (Pl_Cur), general curvature (G_Cur), flow accumulation (FA), positive topographic openness (PTO), upslope slope (Upsl), flow path length (FPL), downslope index (DI), catchment area (CA), topographic wetness index (TWI), stream power index (SPI), and slope length factor (LS). Topographic relief (TR) was calculated by the difference between a maximum elevation map within a specific area and the filtered 3 m DEMs. In order to reduce errors due to an arbitrary selection of the radius of the specific area, a series of maximum elevation maps with multiple radiuses including 7.5, 15, 30, 45, 60, 75, and 90 m, were used to generate TR maps with different spatial scales. Principal component analysis (PCA) and varimax rotated Factor Analysis (FAn) were used and converted the TR maps into two main topographic relief components (TRPC1 and TRPC2) and two topographic factors (TRFA1 and TRFA2). The detailed topographic metric processing can be found in Li et al. .
4.2.3 Statistical analysis
Spearman’s rank analysis was applied to understand the impacts of topographic metrics on SR and SOC distribution patterns. Due to high correlations between some of the topographic metrics, PCA and varimax rotated FAn were used to limit errors caused by collinearity between topographic variables. The PCA analyzed topographic metrics from the 460 locations of the WCW. Loadings for the first eight components that explained 90% of the variance of all metrics were selected and used to calculate topographic principal components (TPCs) at the field Sites 1 and 2. Similarly, eight topographic factors (TFAs) at field sites were also estimated based on loadings from the watershed using FAn with varimax rotation.
4.2.4 Model calibration and evaluation
The stepwise linear regression with “leave-one-out” cross-validation was applied for MLR, MLR-PCA, and MLR-FAn model development using the topographic metrics from two field sites. Akaike Information Criterion was used to select variables contained in each model. The SOC density and SR rates were log-transformed to meet the assumption of residual normality. Model efficiencies were assessed with the following three criteria. The first one is the adjusted coefficient of determination (), which adjusts coefficient of determination based on the number of predictors in the model. The second one is Nash-Sutcliffe efficiency (NSE). Ranging from −∞ to 1.0, the NSE estimates the ratios of residual variance to measured variance. The model performance was considered acceptable when the NES is in a positive value. The third one is the ratio of the root mean square error (RMSE) to the standard deviation of measured data (RSR). It standardizes RMSE. The smaller the RSR value is, the higher efficiency it indicates. Usually, the model performance is considered as satisfactory if the NSE value is larger than 0.5 and the RSR is <0.7 .
4.3 Results and discussion
4.3.1 Topographic impacts on soil properties
The high-resolution topographic metrics derived from LiDAR data presented detailed topographic information in the WCW. Take field Site 1 as an example, Figure 6 exhibited characteristics of each topographic metric in response to the elevation. Seven topographic metric maps, including catchment area (CA, Figure 6f), downslope index (DI, Figure 6h), flow path length (FPL, Figure 6i), flow accumulation (FA, Figure 6j), topographic relief component 1 (TRPC1, Figure 6k), topographic relief factor 1 (TRFA1, Figure 6m), and topographic wetness index (TWI, Figure 6p), showed high values in depressional areas and low values in sloping and ridge areas. Positive topographic openness (PTO, Figure 6o) had a reverse pattern compared to the above seven metrics. It showed high values in ridge areas where a wider view of a landscape can be seen. For slope gradient (G, Figure 6a), upslope slope (Upsl, Figure 6g), topographic relief component 2 (TRPC2, Figure 6l), topographic relief factor 2 (TRFA2, Figure 6n), stream power index (SPI, Figure 6q), and slope length factor (LS, Figure 6r), high values were observed in sloping areas, but low values were found in ridges and depressional areas.
Most topographic metrics showed significant correlations with SOC density and SR rates except A. The A was slightly correlated with SOC density (r = −0.097; P = 0.02) and insignificantly correlated with 137Cs inventory (P > 0.05) and SR rates (P > 0.05). Generally, stronger topographic controls on SOC density than 137Cs inventory and SR rates were observed (Table 2). TWI, TRFA1, TRPC1, CA, FPL, DI, FA, SPI, and TRFA2 were significantly positively correlated with SOC density and G, LS, PTO, Upsl, Pl_Cur, G_Cur, TRPC2, and P_Cur were significantly negatively correlated with SOC density. For both 137Cs inventory and SR rates, similar high related topographic metrics (|r| > 0.5) were observed, including TRPC1, TWI, TRFA1, G, and CA.
Topographic wetness index (TWI) was the most influential topographic factor with a correlation coefficient up to 0.735. The finding is consistent with the high TWI impacts on SOC in previous studies. The high impact of TWI suggests that soil water content distribution was an important driver of SOC dynamics in the WCW. In areas with high TWI and possibly elevated water content, litter decomposition rates decrease and plant productions increase, which increases SOC input and accumulations and results in high SOC density in the soil; while low soil water content areas provide an adequate environment for rapid aerobic decomposition of soil C, leading to a negative correlation between SOC and TWI [3, 4, 32, 46, 152].
Topographic relief (TR) was found to be the most important factor for 137Cs inventory and SR rates with correlation coefficients of 0.686 and 0.687, respectively. This metric was also highly correlated with SOC density. The strong effects of TR on soil properties may be due to its influence on flow velocity. The flow velocity reflects runoff shear stress, which would impact the sediment transport capacity of the runoff [153, 154]. Thus, as the TR increases, the flow velocity and runoff shear stress increases, leading to enhance in sediment transport capacity, which increases the transports of 137Cs and SOC-enriched fine fraction of sediments from low TR areas to the high TR areas.
Slope gradient (G) was another important factor for SOC density, 137Cs inventory, and SR rates with absolute correlation coefficients larger than 0.6. Our findings are consistent with those of other researches, reporting high erosion rates in areas with relatively steep slopes [151, 155]. In agricultural fields, the main erosion processes include both water and tillage erosion [156, 157, 158]. Soil and associated SOC are transported to downslope due to gravity-driven lateral transport by overland and concentrated flows. Tillage operations would also cause redistribution of soil by small downslope movements of soil associated with each operation. Furthermore, as discussed in Section 2, G increase could enhance runoff and decrease infiltration, reducing water content in soil in the flat watershed area [15, 17, 18]. The controls of G on water and tillage erosion and water content could be related to the high slope impacts on soil properties in agricultural areas.
4.3.2 DTM-based models on soil property predictions
Since slope aspect (A) showed a weak correlation with SOC and no significant correlations with 137Cs inventory and SR rate, we removed the A for the following DTM-based model development. Therefore, 17 topographic metrics, including slope gradient (G), curvature related metrics (P_Cur, Pl_Cur, and G_Cur), catchment area (CA), upslope slope (Upsl), downslope index (DI), flow path length (FPL), flow accumulation (FA), topographic relief principal components 1 and 2 (TRPC1 and TRPC2), topographic relief factors 1 and 2 (TRFA1 and TRFA2), positive topographic openness (PTO), topographic wetness index (TWI), stream power index (SPI), and slope length factor (LS) were used for building the MLR models. We only used TRPC1 and TRPC2 to represent topographic relief for MLR-PCA development and TRFA1 and TRFA2 for MLR-FAn.
The MLR, MLR-PCA, and MLR-FAn models were developed based on topographic and soil property data at the two field sites (Table 3). The MLR models showed the best simulations of SOC and SR rates with the highest and NSE values and the lowest RSR values over the three types of models. The two MLR models contained more than 7 predictors. The MLR-PCA model had a slightly lower efficiency than MLR-FAn model in simulating SOC density, but exhibited similar performance compared to the MLR-FAn model in SR rate simulations. The predictors included in the MLR-FAn were more than the MLR-PCA models. There were 6 and 5 factors included in MLR-FAn SOC and SR models, respectively; while only 4 and 5 components were contained in the MLR-PCA SOC and SR models, respectively.
|SOC||2.98 + 0.071TRFA1 − 4.23G-9.29G_Cur + 0.0004FPL + 0.030TRPC2 + 0.103Pl_Cur + 0.063DI†||0.723||0.727||0.522|
|SR||2.12 − 3.12G + 0.107DI + 0.019TRFA2 + 0.0002FPL + 0.915Upsl + 0.010TRFA1 − 0.002SPI-1.53G_Cur||0.655||0.659||0.584|
|SOC||2.94 − 0.060TPC2 − 0.024TPC3 + 0.051TPC7 + 0.037TPC1||0.684||0.686||0.560|
|SR||2.11 + 0.013TPC1 + 0.032TPC7 − 0.028TPC2−0.016TPC3 − 0.010TPC6||0.625||0.629||0.609|
|SOC||2.92 − 0.101TFA1 + 0.074TFA4 + 0.045TFA7 + 0.026TFA8 + 0.037TFA2 − 0.027TFA3||0.706||0.710||0.538|
|SR||2.10−0.047TFA1 − 0.011TFA3 + 0.026TFA7 + 0.034TFA4 + 0.025TFA8||0.620||0.624||0.613|
Although the MLR showed the best simulation performance for the two field sites, the MLR-PCA had the highest prediction efficiency when applying models to predict the spatial patterns of SOC and SR rate over the watershed (Figure 7a). The SOC predictions by MLR-PCA explained 60% of the variability in observed SOC in the WCW. The NSE value was larger than 0.5 (0.591) and RSR value was <0.7 (0.639), which suggested a satisfactory performance of SOC prediction by the MLR-PCA model. The prediction efficiencies of MLR and MLR-FAn models were lower than the MLR-PCA model with correlation coefficients of 0.39 and 0.49, respectively. Based on these results, the SOC map over the watershed was generated based on the MLR-PCA model (Figure 8). The derived SOC map captured the majority spatial variability in SOC density as reflected by consistent spatial patterns between observed and simulated SOC density. High values of SOC density were observed in depressions and low values were found in ridges and sloping areas.
Lower efficiencies were observed for SR rate than SOC density when compared the model predictions in the WCW (Figure 7b). The MLR-PCA SR model showed the highest correlation coefficient. However, the low NSE and high RSR values indicated that the model could not well predict SR rates when applying a model developed at field-scale for predictions at watershed-scale.
The better performance of MLR-PCA models relative to MRL models may be due to the exclusion of multicollinearity by PCA. High correlations (|r| > 0.8) were observed for some of topographic metrics, such as G and Upsl, and G_Cur and Pl_Cur. Uncertainty increases due to the high collinearity because models can be significantly influenced by small changes in the high collinearity predictors . Thus, the MLR models were less stable with lower efficiencies in predicting SOC density and SR rates when applying to different spatial scales. The use of PCA could eliminate the multicollinearity and increase the stability of model since the PCA converted the 15-dimension topographic dataset to eight mutually independent combinations (TPCs) [159, 160].
Furthermore, by analysis of TPC loadings, hidden relationships between topographic metrics were uncovered, which could be another advantage of using the PCA . For example, in this study, we selected TPCs 1, 2, 3, 6, and 7 for model development (Table 4). The high loading (|loading| > 0.35) topographic metric in TPC1 was G_Cur (−0.353), and thus, this component was associated with runoff divergence. G (0.475), TWI (−0.465), Upsl (0.419), and LS (0.396) were the high loading metrics for TPC2, which indicated that TPC2 was associated with soil water content. TPC3 were associated with runoff volume since the high loading metrics were FA (0.482), SPI (0.460), and CA (0.400). TPCs 6 and 7 were associated with runoff velocity and flow acceleration, respectively. Based on the TPCs, we can obtain a better understanding of the controlling components for SOC distribution and SR. For the low-relief agriculture watershed under study, the spatial patterns of SOC and SR rate were mainly impacted by soil water content (TPC2) and runoff divergence (TPC1), respectively, according to the priority of TPCs used in model development. This conclusion is also consistent with findings by Fox and Papanicolaou  that indicated flow divergence significantly influenced soil erosion from uplands in a low-relief watershed.
The lower efficiencies of MLR-FAn than MLR-PCA may be because the latter approach diminishes the risk of over-fitting the models. The difference between PCA and FAn is that PCA considers all of the variance in the matrix, including unique, error and shared variance; while FAn extracts and exhibits shared variance only. Although some studies were preferable to FAn because of its ability to understand the underlying structure by extracting latent shared variance [162, 163], others also proved that there were almost no practical differences between the two methods [164, 165]. In this study, we found that both methods had similar performance during model calibration in small-scales. However, including more predictors in MLR-FAn models may enhance the instability of models and increase uncertainties during extrapolating prediction points over a large-scale [159, 166].
This case study demonstrated the importance of topography on soil properties in the low-relief watershed. DTM-based models are feasible for SOC predictions at different spatial scales. By combining MLR with PCA, the model efficiencies increased during soil property prediction. The DTM-based mapping techniques can be improved by further refinement remotely sensed data, improvement of the topographic dataset, and development of modeling techniques such as including Hybrid Regression and Artificial Intelligence techniques. The large-scale soil property maps can provide a more sound scientific basis for understanding of the mechanisms underlying the topographic impacts on soil movement in agricultural landscapes and the fate of SOC at the watershed and regional scales.
This research was supported by the USDA Natural Resources Conservation Service in association with the Wetland Component of the National Conservation Effects Assessment Project (NRCS 67-3A75-13-177). All data used in this study has been published on the USDA Ag Data Commons and can be accessed using DOI:
Conflict of interest
No conflict of interest exists in the submission manuscript.