Mapping subsurface clay minerals is an important issue because their mechanical and hydrological behavior directly affects assets at the surface, such as buildings and houses. Clays have a direct impact on ground stability through their swelling capacity, and they constrain infiltration processes during flooding, especially when soil moisture is high. Detecting and characterizing clay minerals in soils therefore serves urban planning and improves risk reduction by helping to predict the impact of subsidence on houses and infrastructure. High-resolution clay maps are thus needed, with accurate indications of mineral species and abundances. Clay minerals, which belong to the phyllosilicates, are divided into three main species: smectite, illite, and kaolinite. The smectite group contributes strongly to the swelling behavior of soils, and because geotechnical soil analyses are expensive and time-consuming, there is an urgent need for new approaches to mapping the spatial distribution of clays with new technologies, e.g., ground spectrometers or remote hyperspectral cameras [0.4–2.5 μm]. These techniques are efficient alternatives to conventional methods. In this chapter, we present some recent results on characterizing clay species and their abundances by spectrometry, using either a ground spectrometer or hyperspectral cameras.
Soils are a complex environment, spatially and temporally dynamic in both structure and composition. They provide essential services to humanity, such as water storage and filtration, support for agriculture, carbon storage that helps regulate the climate, and physical support for buildings. Knowledge of soils, in particular of their clay mineral composition and its spatial distribution, is therefore necessary for decision-making in the management of many human activities. The study of clay minerals is most often motivated by the assessment of the risk associated with the shrinkage-swelling phenomenon that affects buildings; clays are sometimes also taken into account in flooding and infiltration studies and in the evaluation of vehicle mobility. It is important to specify that the term “clay” has two distinct definitions in geology. From a physical point of view, clay corresponds to a texture class, i.e., a class defined by the size of the mineral particles in the soil. In that classification, gravels are elements larger than 2 mm, sands have a grain size between 2 mm and 50 μm, silts have a grain size between 50 and 2 μm, and clays have a grain size smaller than 2 μm.
From a mineralogical point of view, montmorillonite (smectite group), illite, kaolinite, and interstratified minerals are the most common clay species involved in swelling and shrinking processes. In the following, the term clays refers to this second, mineralogical definition.
The shrinkage-swelling of soils causes considerable damage to houses built on soils containing smectite minerals. These so-called swelling clays are sensitive to soil moisture content: they shrink during periods of drought and swell after rain. Variations in water content change the soil volume, producing cracks in the soil structure and therefore differential vertical movements at the surface. In France, this damage accounts for 38% of natural disaster compensation costs, which makes it the second largest item after floods. For the period 1990–2014, the overall cost represents a little more than 9 billion euros, or 370 million euros per year. In Great Britain, the Association of British Insurers estimated the cost of shrinkage-swelling at more than 400 million pounds each year. In the USA, the economic cost of these claims is $15 billion annually. Population growth and projected climate change are expected to increase this risk at temperate latitudes, where drought will in the future affect areas that were previously untouched. The identification of soils affected by this phenomenon currently relies on specific mineral identification, e.g., by X-ray diffraction (XRD), carried out on soil samples and difficult to implement at large scale. In parallel, hazard maps (1:50,000) have been produced from geological data to identify clayey formations. Unfortunately, these maps cannot account for local spatial heterogeneities, from one meter to hundreds of meters. In addition, mapping clay texture is not sufficient to evaluate the swelling capacity of clayey soils. To address this issue, in situ and/or proximal sensors can be used.
Several authors have successfully quantified clay minerals in soils [6, 7, 8] by field spectroscopy and laboratory spectral measurements. In these studies, measurements were generally carried out on dry soils, to avoid spectral perturbations due to moisture, and under ideal illumination conditions, far from the conditions actually encountered in the field. Airborne hyperspectral imagery has also been used successfully to detect clays [9, 10, 11], despite the low spatial resolution of the sensors and the low signal-to-noise ratio in the spectral range where clays are active (1000–2500 nm). Recent advances in UAV platforms for hyperspectral imaging are expected to remove some of these limitations by improving the spatial resolution of the images, from meters for airborne sensors to centimeters for UAVs [12, 13]. These advances should provide more pixels of pure soil and thus improve the quantification of clay minerals. Indeed, quantifying clay species from spectral data requires taking into account the mixing of the spectral signatures of the minerals, simply because the minerals are mixed in the soil. Some studies assume a linear mixture of the soil mineral spectra (a “patchwork”), meaning that the spectral signature of each component is mixed in proportion to its abundance in the soil [6, 7, 8]. This is an approximation, because light scattering induces nonlinearities in the reflectance of an intimate mixture. The impact of this phenomenon needs to be clearly assessed in order to quantify clay species correctly.
In the following, we review these issues and describe the approaches available to quantify clay species from hyperspectral data. This overview is based on several studies carried out in the laboratory and in the field, with different instruments and several processing techniques.
2. Spectrometry experiments and data processing
Spectrometry is based on measuring the interaction between electromagnetic radiation and a given material at different frequencies. Applied to mineral characterization, this technique gives crystallo-chemical information on the material from its interaction with the incident radiation. Depending on the frequency of the radiation (ultraviolet, visible, infrared, etc.), the interaction involves different types of energy exchange. The response is represented as a spectrum, which is an intrinsic characteristic of the material. Infrared (IR) radiation is the part of the electromagnetic spectrum between 12,800 and 10 cm⁻¹ (0.78–1000 μm). Figure 1 shows the infrared spectrum, which can be divided into three parts: the near, the middle, and the far IR. For mineral characterization, the domains of interest are the near-infrared (NIR) and the shortwave infrared (SWIR), which extend from 0.75 to 1 μm and from 1 to 2.5 μm, respectively.
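As a quick check on the limits quoted above, wavenumber and wavelength are related by λ (μm) = 10⁴ / ν̃ (cm⁻¹). The short Python sketch below applies this conversion and labels a wavelength using the NIR and SWIR bounds given in the text. The function names are ours and purely illustrative.

```python
def wavenumber_to_wavelength_um(wavenumber_cm1: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in micrometres."""
    if wavenumber_cm1 <= 0:
        raise ValueError("wavenumber must be positive")
    # 1 cm = 1e4 micrometres, so lambda = 1e4 / wavenumber
    return 1.0e4 / wavenumber_cm1


def spectral_domain(wavelength_um: float) -> str:
    """Label a wavelength with the NIR/SWIR bounds used in this chapter."""
    if 0.75 <= wavelength_um < 1.0:
        return "NIR"
    if 1.0 <= wavelength_um <= 2.5:
        return "SWIR"
    return "other"


# The two IR limits quoted in the text
print(wavenumber_to_wavelength_um(12800.0))  # 0.78125 (micrometres)
print(wavenumber_to_wavelength_um(10.0))     # 1000.0 (micrometres)
print(spectral_domain(2.2))                  # SWIR
```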
When IR radiation interacts with a molecule, the molecule can absorb part of this radiation selectively, which modifies its vibrational and rotational energy. These energy losses produce absorption bands at specific wavelengths, corresponding to the frequencies at which the molecule is excited. The absorbed energy is therefore characteristic of each chemical bond in the molecule. In the case of clay minerals, absorption bands are mostly visible in the SWIR domain. The difficulty of working with absorption bands comes from the presence of water, which also produces numerous absorption features that mask large parts of the spectrum (Figure 2). To predict soil properties related to the presence of clay minerals, intensive research has been carried out in reflectance spectroscopy in the visible near-infrared (VNIR; 300–1100 nm) and SWIR wavelength domains.
Correctly interpreting the spectra that result from the interaction between SWIR radiation and clayey soils is thus not straightforward, because of atmospheric noise, the presence of water molecules, and the complexity of the mineralogical composition of soils.
When the number of samples is sufficiently high, various approaches can be used to predict the clay mineralogical composition of soils from measured spectra, e.g., multivariable regression analysis (MRA) or partial least squares regression (PLSR). For example, the smectite content of soils in the Colorado Front Range has been successfully estimated by a PLSR analysis of second-derivative reflectance spectra measured in the field. MRA has also been used successfully to quantify the clay content of soils, independently of the nature of the clay minerals [18, 19]. However, such approaches require a large number of samples, both to carry out the analysis and to validate the accuracy of the regression. They are also site dependent, meaning that calibration and validation need to be performed specifically for each studied site. To tackle this issue with minimal uncertainty, we propose to start with simple experimental setups, analyzing in the laboratory the spectral responses of pure clays and of mixtures of two or three clay species.
2.1 Making and testing a spectral database from synthetic mixtures and a spectrometer
The objective of this first approach is to prepare simple mixtures composed of pure clay minerals. They were prepared using the most common clays: montmorillonite, illite, and kaolinite, each obtained from commercial suppliers. The particle sizes of the minerals were measured with a VASCO-2 laser grain size analyzer and estimated at about 450 nm for the illite and the kaolinite and about 475 nm for the smectite. The pure clay minerals were mixed in an agate mortar to produce mixed powders. A total of 27 binary mixtures with 10/90, 20/80, 30/70, 40/60, 50/50, 60/40, 70/30, 80/20, and 90/10 mass-percent ratios of kaolinite/illite, illite/montmorillonite, and montmorillonite/kaolinite were produced, as well as 19 ternary mixtures of kaolinite/illite/montmorillonite (Figure 3).
All samples were dried and then equilibrated to the ambient humidity of the laboratory. The reflectance spectra were measured in the laboratory using an ASD FieldSpec Pro. This portable spectrometer covers the 350–2500 nm range of the electromagnetic spectrum. In the SWIR, its spectral resolution is 10 nm, with a 2 nm sampling interval. The mixtures were placed in Petri dishes, in contact with the probe. A standard white Spectralon panel (Labsphere) was used to calibrate the reflectance reference. To increase the signal-to-noise ratio, each final spectrum was computed as the average of 10 spectral measurements.
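The calibration and averaging step can be sketched as follows. This is a minimal illustration, assuming that raw radiance-like scans are available for both the sample and the white panel and that the dark current has already been subtracted; the function name and the `panel_reflectance` parameter are ours, and in practice the spectrometer software may deliver reflectance directly.

```python
import numpy as np


def mean_reflectance(sample_scans, white_scans, panel_reflectance=1.0):
    """Average repeated scans and ratio them to the white reference.

    sample_scans, white_scans: arrays of shape (n_scans, n_bands).
    panel_reflectance: assumed reflectance of the white panel
    (1.0 means a perfect reflector, which is an idealization).
    """
    sample = np.asarray(sample_scans, dtype=float)
    white = np.asarray(white_scans, dtype=float)
    # Averaging the repeated scans reduces random noise
    mean_sample = sample.mean(axis=0)
    mean_white = white.mean(axis=0)
    # Reflectance relative to the reference panel, band by band
    return panel_reflectance * mean_sample / mean_white
```

For uncorrelated noise, averaging N scans lowers the noise standard deviation by roughly a factor of √N, i.e., about 3 for the 10 measurements used here.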
Once the spectra are available for all the mixtures, a comparative analysis is used to relate a set of parameters derived from the shape of the spectrum to the mineralogical composition of the mixtures. Before this step, and in order to remove the broad, long-wavelength trend from each spectrum, a continuum removal is applied, as shown in Figure 4. This processing normalizes the reflectance spectra and highlights the absorption bands. The principle consists in connecting the local maxima of the spectrum to obtain an envelope (the continuum) fitted across the 350–2500 nm spectral domain; the spectrum is then normalized by this continuum, typically by division. After this processing, the continuum-removed spectrum has values between 0 and 1. Various geometrical parameters can then be measured on the spectral curve, as suggested in earlier work. This approach has the advantage of handling a small set of values to characterize a specific absorption band, rather than considering all the values of the curve. The geometrical parameters considered are the following:
The wavelength position corresponding to the minimum reflectance of the absorption band. In Figure 4, the positions are located around 1400 nm (P1400), 1900 nm (P1900), and 2200 nm (P2200).
The depth, which is the extent of the absorption feature along the reflectance axis. In Figure 4, the depth is measured for the bands around 1400 nm (D1400), 1900 nm (D1900), and 2200 nm (D2200).
The asymmetry of the absorption band, calculated as the ratio between the right width and the left width measured at half depth of the absorption band. In Figure 4, the asymmetry is measured for the bands around 1400 nm (A1400), 1900 nm (A1900), and 2200 nm (A2200).
The width of the absorption band, measured at half depth. In Figure 4, the width is measured for the bands around 1400 nm (W1400), 1900 nm (W1900), and 2200 nm (W2200).
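The continuum removal and the four geometrical parameters listed above can be sketched in a few lines of Python. This is an illustrative implementation under our own assumptions, not the code used in the studies reviewed here: the continuum is taken as the upper convex hull of the spectrum, the spectrum is divided by it, and the half-depth crossings are located by linear interpolation. The function names and the search window (`lo`, `hi`) are hypothetical.

```python
import numpy as np


def continuum_removed(wl, refl):
    """Divide a spectrum by its upper convex hull (wl strictly increasing)."""
    wl = np.asarray(wl, dtype=float)
    refl = np.asarray(refl, dtype=float)
    hull = []  # indices of the upper-hull vertices, left to right
    for i in range(len(wl)):
        while len(hull) >= 2:
            i0, i1 = hull[-2], hull[-1]
            cross = ((wl[i1] - wl[i0]) * (refl[i] - refl[i0])
                     - (refl[i1] - refl[i0]) * (wl[i] - wl[i0]))
            if cross >= 0:  # i1 lies on or below the chord i0 -> i
                hull.pop()
            else:
                break
        hull.append(i)
    continuum = np.interp(wl, wl[hull], refl[hull])
    return refl / continuum


def _crossing(wl, cr, inner, outer, level):
    """Wavelength where cr reaches `level` between two adjacent samples."""
    if inner == outer or cr[outer] < level:
        return wl[outer]  # no crossing found inside the spectrum
    frac = (level - cr[inner]) / (cr[outer] - cr[inner])
    return wl[inner] + frac * (wl[outer] - wl[inner])


def band_parameters(wl, cr, lo, hi):
    """Position, depth, width and asymmetry of the band found in [lo, hi]."""
    wl = np.asarray(wl, dtype=float)
    cr = np.asarray(cr, dtype=float)
    idx = np.flatnonzero((wl >= lo) & (wl <= hi))
    k = idx[np.argmin(cr[idx])]       # band minimum
    depth = 1.0 - cr[k]
    half_level = 1.0 - depth / 2.0
    left = k
    while left > 0 and cr[left] < half_level:
        left -= 1
    right = k
    while right < len(wl) - 1 and cr[right] < half_level:
        right += 1
    wl_left = _crossing(wl, cr, left + 1 if left < k else k, left, half_level)
    wl_right = _crossing(wl, cr, right - 1 if right > k else k, right, half_level)
    left_w, right_w = wl[k] - wl_left, wl_right - wl[k]
    return {
        "position": float(wl[k]),
        "depth": float(depth),
        "width": float(wl_right - wl_left),  # full width at half depth
        "asymmetry": float(right_w / left_w) if left_w > 0 else float("nan"),
    }
```

Calling `band_parameters(wl, continuum_removed(wl, refl), 2100, 2300)` would, for instance, return the P2200, D2200, W2200, and A2200 values of a spectrum, provided the window brackets the band of interest.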
As already reported in previous studies, the geometry of the absorption bands around 1900 and 2200 nm is directly linked to the clay mineralogical composition. In particular, these studies show that the depth parameter can be used efficiently to assess the clay composition. If we plot the distribution of the mixtures along three axes representing the depth parameter at the 1400, 1900, and 2200 nm positions, we can identify regions where kaolinite, illite, and montmorillonite are predominant, forming the three corners of a triangular 3D shape. Elsewhere, the kaolinite, illite, and montmorillonite contents of the mixtures decrease from their respective corner toward the opposite side of the triangle (Figure 5).
Although these results are promising, they are not accurate enough to be exploited in real conditions. In particular, a methodology able to statistically invert the abundances of the clay species in the mixtures from the absorption band parameters still needs to be tested. Such a study has been carried out, using more elaborate preprocessing of the spectral data and seeking a robust unmixing method to estimate the clay abundances in the mixtures.
2.2 Processing laboratory hyperspectral images of synthetic mixtures, unmixing issues
To obtain a statistical assessment of the spectral response measured on the mixtures, the spectrometer was replaced by a hyperspectral optical sensor. This setup, similar to one used in earlier work, comprises two cameras located 1 m from the sample, each with a lamp inclined at 35°. The reflected signal is recorded by two hyperspectral cameras (HySpex VNIR-1600 and SWIR-320m-e, Norsk Elektro Optikk). Only the SWIR camera data are used, with 256 spectral bands and a spectral resolution of 6 nm in the 1000–2500 nm range. The camera has a field of view of 240 mm (FOV 13.5°) and a spatial resolution of 0.75 mm. Between measurements, a white Spectralon® reference is used to correct for any drift of the instruments. Raw images show that the illumination is not uniform, because of edge effects. Experimental variograms computed on each band of the reflectance images made it possible to analyze this effect and to propose a masking protocol that removes pixels departing too far from the homogeneous behavior observed at the center of the images. The subsequent processing chain is based on (i) spectral preprocessing to transform the reflectance spectra into a standardized form and (ii) linear and nonlinear unmixing algorithms to derive the mineral abundances of each mixture (Figure 6). The preprocessing techniques were selected from the literature and are the following:
Standard normal variate (SNV) consists in centering and scaling the spectrum using its mean and standard deviation.
Continuum removal (CR) removes the continuum to normalize the reflectance spectrum.
Continuous wavelet transform (CWT) decomposes the signal into a sum of wavelets derived from a Gaussian function (e.g., the “Mexican hat”). The signal is broken down into 10 scales; the first scale (corresponding to noise) and the scales above 5 (global variations of the spectrum, i.e., the continuum) are suppressed.
Hapke’s model estimates the single-scattering albedo, considering that the medium is an isotropic mixture with the same particle size for all components.
First derivative (1st SGD) of the spectrum.
Transformation into pseudo-absorbance (log (1/R)), based on the correlation between the spectral absorption bands and the concentration of compounds.
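Three of the simpler transformations in this list can be sketched with NumPy alone. The sketch below is illustrative and rests on our own assumptions: the function names are hypothetical, the pseudo-absorbance uses a base-10 logarithm, and the first derivative is read here as a Savitzky-Golay derivative with an 11-point window and a second-order polynomial, none of which is specified in the text.

```python
import numpy as np


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum."""
    s = np.asarray(spectrum, dtype=float)
    # Population standard deviation; some implementations use n - 1
    return (s - s.mean()) / s.std()


def pseudo_absorbance(reflectance):
    """log10(1/R); assumes strictly positive reflectance values."""
    r = np.asarray(reflectance, dtype=float)
    return np.log10(1.0 / r)


def savgol_first_derivative(spectrum, step, window=11, order=2):
    """Savitzky-Golay first derivative on a regular wavelength grid.

    step: sampling interval (e.g. in nm); window must be odd.
    The first and last window // 2 points are affected by edge padding.
    """
    s = np.asarray(spectrum, dtype=float)
    half = window // 2
    x = np.arange(-half, half + 1) * step
    # Least-squares polynomial fit in each window: columns 1, x, x^2, ...
    design = np.vander(x, order + 1, increasing=True)
    # Row 1 of the pseudo-inverse gives the weights of the linear term,
    # i.e. the derivative of the fitted polynomial at the window centre
    weights = np.linalg.pinv(design)[1]
    padded = np.pad(s, half, mode="edge")
    return np.correlate(padded, weights, mode="valid")
```

The derivative suppresses additive offsets and slowly varying baselines, which is consistent with the reduction of intra-sample variability reported for this preprocessing.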
Once the spectra are preprocessed, several unmixing techniques can be tested to determine the abundances. Beforehand, it is necessary to compare the observed spectra with reference spectra, i.e., the spectra of the pure minerals (end-members) present in the mixture. If all the minerals present are known, spectral libraries available in the literature can be used. Otherwise, algorithms able to identify the purest end-members within the observed data can be used, such as SISAL or minimum-volume methods. Four linear and nonlinear unmixing algorithms were used to estimate the clay mineral abundances of the mixtures described in the previous section (Figure 7):
FCLS (fully constrained least squares) is the most popular linear unmixing method. It imposes a nonnegativity constraint (abundances must be greater than or equal to 0) and a sum-to-one constraint (the abundances of the end-members must sum to one).
MESMA (multiple end-member spectral mixture analysis) is similar to FCLS but takes into account the intra-class variability of each end-member.
The GBM (generalized bilinear model) method can take nonlinear effects into account by means of an additional parameter.
The multilinear model (MLM) method uses a parameter to manage nonlinearity; when this parameter is zero, the model reduces to the linear one.
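To make the two FCLS constraints concrete, the sketch below solves the constrained least-squares problem with a simple projected-gradient loop. This is our own minimal illustration, not the solver used in the studies reviewed here: the function names and the fixed iteration count are assumptions, and convergence can be slow when end-member spectra are highly correlated, as is often the case for clays.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection onto {a : a >= 0, sum(a) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def fcls(endmembers, spectrum, n_iter=5000):
    """Fully constrained least-squares abundances.

    endmembers: array (n_bands, n_endmembers), one column per pure mineral.
    spectrum:   array (n_bands,), the observed mixed spectrum.
    """
    E = np.asarray(endmembers, dtype=float)
    y = np.asarray(spectrum, dtype=float)
    n = E.shape[1]
    gram = E.T @ E
    rhs = E.T @ y
    # Step size 1/L, with L the largest eigenvalue of E^T E
    step = 1.0 / np.linalg.eigvalsh(gram)[-1]
    a = np.full(n, 1.0 / n)  # start from equal abundances
    for _ in range(n_iter):
        gradient = gram @ a - rhs
        a = project_to_simplex(a - step * gradient)
    return a


# Toy example with three made-up end-members over four bands
E = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
mixed = E @ np.array([0.2, 0.5, 0.3])
print(fcls(E, mixed))  # close to [0.2, 0.5, 0.3]
```

The projection step is what enforces both constraints at every iteration, so the returned abundances are always nonnegative and sum to one, even for a noisy spectrum.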
The results show that the performance of the unmixing methods depends on the mineralogy of the mixture, the difficulty arising when clay species have very similar spectra in the wavelength range considered. We also note that the linear and nonlinear methods perform similarly on these mixtures, so that the recommended method is the simplest one to use, i.e., FCLS. Finally, the benefit of spectral preprocessing is very important. CWT and the first SGD give some of the best unmixing performances, by decreasing the intra-sample variability.
2.3 From lab measurements to field observations
A good example of validation and comparison between laboratory models and field observations comes from a study carried out close to the city of Orléans (France), along the Loire River. The fluvial deposits there are mainly composed of sandy materials in a clay matrix that also contains pebbles and boulders. In this study, 332 soil samples were collected, spread over the various geological formations where a swelling risk is present. As in the laboratory study of Section 2.1, the spectra were decomposed into geometrical parameters, which are more suitable for quantitative analyses. As shown in Figure 8a, the ratios of the depth parameters of different absorption bands (D1400 over D2200 vs. D1900 over D2200) show that the montmorillonite and illite end-members can be identified in the scatter plot. This approach could be used to roughly evaluate the content of these clay species in the soil samples.
To evaluate the uncertainties related to this approach, 31 samples of the dataset were analyzed by X-ray diffraction, and the montmorillonite content measured by XRD was compared with the montmorillonite content estimated from spectroscopy. Although the points show a certain dispersion, the correlation, close to 0.84, confirms the potential of the geometrical characteristics of spectra for assessing the abundance of clay species.
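A comparison of this kind can be summarized with two simple statistics, sketched below. The text does not specify how the correlation was computed, so the Pearson coefficient used here is an assumption, as is the function name; the numbers in the usage example are invented for illustration and are not the study's data.

```python
import numpy as np


def validation_metrics(xrd_content, spectral_content):
    """Pearson correlation and RMS error between two abundance estimates."""
    x = np.asarray(xrd_content, dtype=float)
    s = np.asarray(spectral_content, dtype=float)
    pearson_r = np.corrcoef(x, s)[0, 1]
    rmse = np.sqrt(np.mean((s - x) ** 2))
    return float(pearson_r), float(rmse)


# Invented montmorillonite contents (%), for illustration only
xrd = [5.0, 12.0, 20.0, 33.0, 41.0]
spectro = [8.0, 10.0, 25.0, 30.0, 45.0]
r, rmse = validation_metrics(xrd, spectro)
print(f"r = {r:.2f}, RMSE = {rmse:.1f} %")
```

Reporting the RMS error alongside the correlation is useful because a high correlation alone does not rule out a systematic bias between the two methods.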
The geotechnical issues raised by swelling clays need to be addressed in order to evaluate the vulnerability of buildings and houses built on clayey soils. To reduce the cost of analyses, which classically consist of laboratory measurements (e.g., XRD), methodologies based on spectroscopy can be used. This chapter presents recent advances in evaluating the abundances of clay species, in particular montmorillonite, from spectroscopy or hyperspectral approaches in the SWIR domain.
A first step was the development of metrics to discriminate clay minerals from their spectral response. For this purpose, mixtures were prepared from pure clay minerals, and their spectra were systematically analyzed using geometrical parameters such as the depth of the different absorption bands. From this database, we showed that discrimination is possible, at least to obtain a qualitative estimate of the swelling capacity of the soils concerned. This result was validated in the field by comparing the abundances estimated from spectroscopy with those obtained by XRD. Another approach, based on hyperspectral image processing, was also presented. Different preprocessing algorithms and unmixing techniques were applied to the mixture dataset to evaluate their performance. The results are also conclusive, since the RMS errors between estimated and observed abundances are satisfactory.
This overview opens important perspectives in the domain. Since spectroscopy can evaluate the abundances of clay minerals in soils, in particular those with swelling capacity, the use of remote hyperspectral cameras for this purpose can be considered. The next step is thus to test this technique on field data in real conditions. Heterogeneous solar illumination; the presence of vegetation, calcite, or quartz pebbles; and possible moisture variations in soils are, for instance, the next issues to work on. Thanks to recent developments in UAVs, new possibilities may emerge for carrying hyperspectral cameras in the SWIR domain and acquiring data with a higher signal-to-noise ratio and better resolution. These advances should open new perspectives for the accurate and less expensive production of clay maps.
The authors would like to thank BRGM and ONERA for funding these studies. We also thank the students (C. Truche, G. Duffrechou, and E. Ducasse) who carried out a large part of the experiments, analyses, and processing, as well as the technicians involved in the laboratory tasks.