Chemometric Analysis of Wetlands Remnants of the Former Texcoco Lake: A Multivariate Approach

The former Texcoco Lake (TL) belongs to a closed basin (Basin of México) with a semiarid climate and Solonchak and Vertisol soils, which confer saline and alkaline properties on the waterbodies. Historically, this lake has been subject to the extraction of salt for agricultural and livestock use. As a result of population growth, the lake has lost more than 95% of its surface; currently, TL consists of permanent and temporary shallow wetlands, interconnected during the rainy season, which are used as buffer zones to prevent flooding of Mexico City and as receptors of wastewater discharges. As a consequence, the remaining wetlands of the former TL present a very diverse mosaic of conditions, and their quality assessment is therefore a complex task of interpretation and analysis. To perform it, it is important to have a series of indicators and analysis tools that take into account the state of contamination of the waterbodies in different periods of study. Additionally, it is imperative to have elements of interpretation that allow all the features of the analysis to be combined in a comprehensive scrutiny of the health status of these water bodies. The purpose of this research is to analyze the quality of water and sediments of the remaining wetlands located in the basin of the former TL through various physicochemical parameters and heavy metals, using multivariate methods and water and sediment quality indices.


Introduction
Due to constant and rapid human population growth, water bodies around the world are under threat [1]. Even large lakes are considered especially vulnerable because they are the final receivers of a number of pollutants; in most cases, owing to poor connectivity with other waterbodies, these pollutants are deposited indefinitely, altering the quality of water and sediment [2]. Among the main activities affecting lentic systems worldwide are overexploitation of water and other resources associated with water bodies, contamination from direct or indirect sources, agriculture, and industry [3]. It has been estimated that the coverage of lakes and ponds around the world ranges from 1.3 to 1.8%; although these systems are numerically dominated by small waterbodies, the largest contribution in terms of total area comes from the great lakes [4][5][6][7].
Another important aspect regarding the current status of epicontinental waterbodies is the distribution of freshwater lakes in relation to saline ones. According to [8], about half of the epicontinental water of the world is saline (in area and volume), while the other half is freshwater; an epicontinental waterbody is considered saline if its salt concentration exceeds 3 g/L without contact with the marine environment [9]. Regarding their location, epicontinental saline lakes are commonly found in endorheic basins, in arid or semiarid climates, and geographically within a latitudinal range of 20-40° in both hemispheres [6,10].
Mexico holds 0.1% of the world's epicontinental water, and more than half a million cubic meters is represented by lakes [8]. Saline lakes in Mexico are located mainly in the north and center of the country, in association with arid conditions and low rainfall [6,11]. In this sense, one of the most important saline lakes is the former Lake Texcoco, which has been a major biological and cultural reference point in Mexico; Texcoco Lake and the lakes of Zumpango, Xaltocan, Xochimilco, and Chalco formed a large lake complex where the ancient Aztec civilization was established [12].
The former Texcoco Lake belongs to a closed basin (Basin of México) with a semiarid climate and summer rains; maximum temperatures range from 30 to 32°C between April and June, while minimum temperatures range from 2 to 5°C between October and March. The soils in the lake area are Solonchaks and Vertisols, which give saline and alkaline properties to the waterbodies [10], favoring the concentration of ions of manganese, ammonium, nitrites, nitrates, sulfates, and metals [12]. According to [13], this lake is athalassohaline and alkaliphilic due to the high concentration of ions such as magnesium, sulfate, calcium, sodium, and bicarbonate derived from rocks and geological weathering. Historically, this lake has been used for the extraction of salt since the time of the ancient Aztec civilization; later, after the arrival of the Spaniards, it was continuously drained for agriculture, livestock, and the expansion of Mexico City, losing more than 95% of its surface [6,12,14].
Nowadays, this lake has become a fragmented aquatic ecosystem formed by the remaining temporary or semi-permanent wetlands, which are interconnected during the rainy season and whose function has been to prevent flooding of Mexico City, as well as to act as a wastewater receptor. Because of this, the remaining wetlands of the former Lake Texcoco present a very diverse mosaic of environmental conditions; thus, the assessment of water and sediment quality is a complex task of interpretation and analysis. To perform this task, it is important to use a series of indicators and analysis tools that account for the state of contamination of the waterbodies in different periods of study, using elements of multivariate analysis that allow the joint interpretation of the parameters analyzed for a comprehensive scrutiny of the health status of these water bodies. In this sense, the aim of this research is to analyze the current water quality of the remaining wetlands located in the basin of the former Texcoco Lake through numerous physical and chemical parameters and heavy metals, using multivariate methods and water quality indices.

Chemometric analysis and multivariate approach
One of the great challenges in the assessment of aquatic ecosystems is integrating the natural phenomena and the pollution that occur in rivers and lakes [15]. The elements of variation in a waterbody can be discrete or specific depending on the nature of the study site [16]; it is therefore crucial to have a data set that clearly reflects the main trends and sources of pollution or variation of physical, chemical, and biological properties in these environments.
In this sense, chemometric analysis can be defined as a discipline derived from chemistry that uses mathematics, statistics, and logic to design or select optimal experimental procedures and to extract the most relevant chemical information from data, in order to obtain knowledge about natural systems [17]. In conjunction with multivariate analysis, this tool is able to reduce the multidimensionality of the parameters in natural environments to graphical models or indices that provide insight into the current state of water bodies [18]. In recent years, the application of different multivariate statistical techniques, such as cluster analysis (CA), principal component analysis (PCA), factor analysis (FA), and discriminant analysis (DA), has contributed to identifying possible sources that influence waterbodies and has offered a valuable tool for the reliable management of water resources [19,20]. These techniques have been successfully used to classify water quality data and to detect similarities between samples and/or parameters in many research studies [21,22].

Study area
The area of the former Texcoco Lake is located between 19° 20′N, 98° 05′W and 19° 35′N, 99° 05′W (Figure 1). Vertisol, Solonchak, and saline soil types predominate in the area [23], which makes this one of the few saline lakes of Mexico. It is located at an average altitude of 2200 m, with an average annual temperature of 16°C and rainfall of 705 mm [24]. The waterbodies in the area of the former Lake Texcoco form a system of wetlands that are interconnected intermittently and ephemerally, with a heterogeneity of uses, including historical uses. Lago Nabor Carrillo is an artificial waterbody that receives treated water from local plants and from zones in the east of Mexico City. The evaporation tank (Depósito de evaporación) is an artificial water body constructed for the extraction of caustic soda, currently out of operation. In contrast, Laguna 16, Casa del Ermitaño (House of the Hermit), and Xalapango are natural water bodies that receive contributions from rivers coming from the northeast and southeast regions adjacent to the area of Texcoco Lake.
In relation to wells, the depth of static water level fluctuates from 40 to 130 m [25]. The shallowest values are recorded in the southwestern portion of the aquifer near Lago Nabor Carrillo, which gradually increase by topographic effect from north and east, where the foothills of the mountains that delimit the aquifer are located.

Sampling methods and analytical procedures
Water samples were collected from 19 study sites in nine waterbodies, as well as from a set of eight wells (Figure 1), in the area of the former Lake Texcoco in June, September, and November 2015. Water samples were taken at the surface using 2.5 L plastic containers free from metals and detergents, following the field sampling methods for waterbodies receiving wastewater and for water for human consumption proposed by [26].
All water samples were kept at 4°C in dark conditions for transportation for further laboratory analysis.
Water quality analyses included 25 physicochemical parameters (Table 1), of which only temperature, conductivity, pH, and dissolved oxygen were measured in situ. Likewise, ten metals were quantified in water, sediments, and wells. Analytical procedures were performed in accordance with the guidelines of Mexican standards [27,28].

Analysis and data interpretation
The trophic state index (TSI) [29] was calculated from the total phosphorus concentration. In this index, TSI(TP) denotes the trophic state index for total phosphorus and TP is the concentration of total phosphorus (mg m⁻³). The index expresses the trophic state on the following scale: <30, oligotrophic; 40-50, mesotrophic; 50-70, eutrophic; and 70-100 or greater than 100, hypereutrophic [29].
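The expression for TSI(TP) is not reproduced in this text. As an illustration only, the sketch below assumes that [29] refers to the widely used Carlson-type formulation, TSI(TP) = 14.42 ln(TP) + 4.15 with TP in mg m⁻³; this coefficient pair is an assumption, not a value taken from the source. The function names are hypothetical, and the classification follows the scale quoted above, which leaves the 30-40 interval unassigned.

```python
import math


def tsi_total_phosphorus(tp_mg_m3: float) -> float:
    """Carlson-type TSI for total phosphorus (assumed form; TP in mg/m^3)."""
    if tp_mg_m3 <= 0:
        raise ValueError("total phosphorus must be positive")
    return 14.42 * math.log(tp_mg_m3) + 4.15


def trophic_class(tsi: float) -> str:
    """Classify a TSI score using the scale quoted in the text."""
    if tsi < 30:
        return "oligotrophic"
    if tsi < 40:
        # The quoted scale does not assign a class to the 30-40 interval.
        return "unassigned (30-40)"
    if tsi < 50:
        return "mesotrophic"
    if tsi < 70:
        return "eutrophic"
    return "hypereutrophic"


if __name__ == "__main__":
    # Hypothetical concentration, chosen only to exercise the functions.
    score = tsi_total_phosphorus(1200.0)
    print(round(score, 2), trophic_class(score))
```

Under this assumed form, a TP of 48 mg m⁻³ gives a score close to 60, and every doubling of TP adds about 10 units.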
On the other hand, the Water Quality Index proposed by the National Sanitation Foundation (WQI-NSF) [30] generates results from 0 to 100, with 0 representing the worst quality and 100 excellent quality. The WQI-NSF aggregates weighted sub-index values, where p_i is the measured value of the ith parameter; T_i is the transformation given by the quality curve of each parameter, such that T_i(p_i) = q_i; and w_i is the relative weight of the ith parameter. The WQI-NSF represents water quality in general, giving different grades depending on the value obtained, but without incorporating qualifications for specific water uses, such as drinking water, recreational use, or suitability for agriculture.
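The aggregation formula itself is not reproduced in this text. The sketch below assumes the weighted arithmetic form, WQI = Σ w_i q_i, which is one common way of combining the sub-index values q_i = T_i(p_i) with weights w_i that sum to 1; a multiplicative form also exists, and the source does not say which was used. The parameter names, sub-index values, and weights are hypothetical and serve only to show the mechanics.

```python
def wqi_weighted_sum(sub_indices: dict, weights: dict) -> float:
    """Weighted arithmetic aggregation of sub-index values (assumed form).

    sub_indices: q_i values on a 0-100 scale, keyed by parameter name.
    weights: relative weights w_i, keyed by the same names, summing to 1.
    """
    if set(sub_indices) != set(weights):
        raise ValueError("sub-indices and weights must cover the same parameters")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    for name, q in sub_indices.items():
        if not 0.0 <= q <= 100.0:
            raise ValueError(f"sub-index for {name} is outside 0-100")
    return sum(weights[name] * sub_indices[name] for name in sub_indices)


if __name__ == "__main__":
    # Hypothetical sub-index values and weights, not taken from the study.
    q = {"dissolved_oxygen": 20.0, "pH": 60.0, "BOD5": 10.0, "turbidity": 30.0}
    w = {"dissolved_oxygen": 0.4, "pH": 0.2, "BOD5": 0.2, "turbidity": 0.2}
    print(wqi_weighted_sum(q, w))
```

Because the result is a weighted mean of values bounded by 0 and 100, it stays within the same range as the index described above.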
Finally, the Pollution Index (Cd) proposed by [31] was calculated, which determines the relative contamination by each metal separately and presents the sum of these components. This contamination index is calculated according to the following equation:

$$C_d = \sum_{i=1}^{n} C_{fi}, \qquad C_{fi} = \frac{C_{Ai}}{C_{Ni}} - 1$$

where C_fi is the contamination factor for the ith metal, C_Ai is the analytical result for that metal, and C_Ni is its maximum permissible limit. According to [32], pollution levels are classified from Cd as follows:

Cd < 1: low contamination
1 < Cd < 3: medium contamination
Cd > 3: high contamination

Additionally, in the case of sediments, [33] proposed the metal pollution index (MPI), a very simple index calculated as the geometric mean of the metal concentrations by the expression:

$$MPI = (M_1 \times M_2 \times \cdots \times M_n)^{1/n}$$

where M_i is the concentration of the ith metal and n is the number of metals. This index has the peculiarity that it does not use reference levels, safety data sheets, reference sites, or baselines. It does not classify sites as polluted or unpolluted, but it allows comparisons among aggregations (sites) and highlights those sites with high concentrations of metals.
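The two metal indices described above can be sketched as follows. This is a minimal illustration, assuming that Cd is the sum over metals of (C_A / C_N − 1) and that the MPI is the geometric mean of the concentrations; all function names, metal concentrations, and limits are hypothetical. Every metal is included in the sum, as the text describes; how boundary values (exactly 1 or 3) are classified is not stated in the source, so the choice made here is an assumption.

```python
import math


def contamination_degree(measured: dict, limits: dict) -> float:
    """Cd: sum of contamination factors C_f = C_A / C_N - 1 over all metals."""
    return sum(measured[m] / limits[m] - 1.0 for m in measured)


def cd_class(cd: float) -> str:
    """Classify Cd; boundary values are assigned to 'medium' by assumption."""
    if cd < 1.0:
        return "low"
    if cd <= 3.0:
        return "medium"
    return "high"


def metal_pollution_index(concentrations: list) -> float:
    """MPI: geometric mean of metal concentrations (all must be positive)."""
    if not concentrations or any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be non-empty and positive")
    # Averaging logarithms avoids overflow or underflow in the raw product.
    return math.exp(sum(math.log(c) for c in concentrations) / len(concentrations))


if __name__ == "__main__":
    # Hypothetical concentrations and limits in mg/L, not taken from the study.
    measured = {"Pb": 0.03, "Zn": 1.5, "Al": 8.0}
    limits = {"Pb": 0.01, "Zn": 3.0, "Al": 0.2}
    cd = contamination_degree(measured, limits)
    print(round(cd, 2), cd_class(cd))
    print(round(metal_pollution_index(list(measured.values())), 3))
```

In the hypothetical example, a single metal far above its limit dominates Cd, whereas the geometric mean in the MPI dampens the effect of that extreme value.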
The variation of each parameter over the periods and study sites was examined by factor analysis (FA) to identify which parameters are statistically significant for further analysis. From this information, a cluster analysis (CA) was performed using Euclidean distances and Ward's linkage method.
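To illustrate the clustering step, the sketch below implements agglomerative clustering with Ward's criterion on Euclidean distances in plain Python. It is not the XLSTAT implementation used in the study; the function name and the toy coordinates are hypothetical. At each step it merges the pair of clusters whose union gives the smallest increase in within-cluster sum of squares, n_a n_b / (n_a + n_b) times the squared Euclidean distance between centroids.

```python
from itertools import combinations


def ward_clustering(points: list, n_clusters: int) -> list:
    """Merge clusters by Ward's criterion until n_clusters remain.

    points: list of equal-length numeric vectors (one per site).
    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and the number of points")
    dim = len(points[0])
    clusters = [[i] for i in range(len(points))]

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members) for d in range(dim)]

    def ward_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * squared_distance

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: ward_cost(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


if __name__ == "__main__":
    # Hypothetical two-parameter profiles for five sites.
    sites = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0], [50.0, 50.0]]
    print(ward_clustering(sites, 3))
```

This naive version recomputes centroids at every step, which is adequate for a handful of sites; for larger data sets a library routine would be the more practical choice.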
The set of water quality indicators was analyzed by a PCA based on Pearson correlation, from a data matrix with the study sites as rows and the environmental parameters as columns. The data matrix was transformed by X = log(x + 1) to make evident the chemometric trends of the water bodies of the former Texcoco Lake. All statistical analyses were performed using the software XLSTAT-2015.
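The preprocessing and the first step of a correlation-based PCA can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the XLSTAT procedure: the base of the logarithm is not given in the text (base 10 is assumed here), the function names and data are hypothetical, and only the first principal component is extracted, by power iteration on the Pearson correlation matrix. The sketch also assumes that no column is constant after transformation.

```python
import math


def log_transform(matrix: list) -> list:
    """Apply X = log10(x + 1) to every entry (log base is an assumption)."""
    return [[math.log10(x + 1.0) for x in row] for row in matrix]


def pearson_matrix(matrix: list) -> list:
    """Pearson correlation matrix between columns (sites as rows)."""
    n, p = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(p)]
    deviations = [[v - sum(col) / n for v in col] for col in columns]
    norms = [math.sqrt(sum(d * d for d in dev)) for dev in deviations]
    return [[sum(a * b for a, b in zip(deviations[i], deviations[j]))
             / (norms[i] * norms[j]) for j in range(p)] for i in range(p)]


def first_component(corr: list, iterations: int = 500):
    """Leading eigenvector, eigenvalue, and explained share via power iteration."""
    p = len(corr)
    v = [1.0 / (i + 1) for i in range(p)]  # non-uniform starting vector
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p)) for i in range(p))
    # For a correlation matrix the total variance equals the number of variables.
    return v, eigenvalue, eigenvalue / p


if __name__ == "__main__":
    # Hypothetical index scores (rows: sites; columns: three indices).
    data = [[110.0, 40.0, 0.2], [170.0, 20.0, 0.9], [150.0, 25.0, 0.7], [120.0, 35.0, 0.3]]
    corr = pearson_matrix(log_transform(data))
    loadings, eigenvalue, share = first_component(corr)
    print([round(x, 3) for x in loadings], round(share, 3))
```

Because Pearson correlation is unchanged by rescaling a column, the choice of logarithm base does not affect the correlation matrix or the resulting components.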

Physico-chemical characterization of the waterbodies from the former Lake Texcoco
The water bodies of the former Texcoco Lake have a high conductivity, with values ranging between 932 and 12,266 mS/cm, mainly due to the high concentration of chlorides and metals such as Fe and Mg. In addition, the concentrations of total suspended solids, dissolved solids, and settleable solids are very high, with values of up to 88.39 mg/L at study sites S2 (Laguna Cola de Pato) and S5 (Laguna 14); these conditions are due to the shallowness of the waterbodies and to the fact that allochthonous materials entering them are easily deposited. Furthermore, the high content of dissolved material in the water increases BOD5 and COD, which reach values of up to 195.76 and 7,858 mg/L, respectively, at sites S2 and S5.

Factor analysis
The FA for the set of physicochemical parameters showed that the first factor, F1 (51.15% of the explained variance), includes the parameters related to the input of organic compounds, such as dissolved oxygen, BOD5, COD, total nitrogen, ammonia nitrogen, and organic nitrogen, as well as parameters that reflect ions and suspended materials, such as conductivity, total solids, fluorides, phosphates, sulfates, and MBAS. The second factor, F2 (20.62% of the explained variance), groups the parameters related to the obstruction of light penetration, such as suspended solids, turbidity, and oil and grease (Table 2).

Cluster analysis
The nine water bodies were separated into two groups and four independent units (Figure 2). The first group corresponds to study sites S3 (Laguna Evaporation), S1 (Lago Nabor Carrillo, the largest waterbody, which receives wastewater), and S4 (Laguna Colorada); these sites are in the south and southeast of the former Texcoco Lake. The second group corresponds to study sites S2 (Laguna Cola de Pato) and S5 (Laguna 14), which are located in the northwest region of the former Texcoco Lake. These waterbodies have the highest values of conductivity, solids (total, total suspended, total dissolved, and settleable solids), BOD5, and COD. Finally, the four independent units correspond to S6 (Depósito de evaporación), S7 (Laguna 16), S8 (Casa del Ermitaño), and S9 (Xalapango). These sites are in the north and northeastern portion of the lake.

Water quality index
The WQI-NSF scores ranged from 18.51 in S3 in September to 44.19 in S7 in June. All WQI-NSF scores therefore qualify the waters as being of poor quality (41 to 50) or very poor quality (0-40) (Figure 3a). Mean WQI values across all study sites were 32.39, 24.71, and 28.95 in June, September, and November, respectively. However, the dispersion of the data in September is wider, indicating both low and high values of WQI-NSF in some water bodies.
According to the records of the National Meteorological System [34], the rainy season of this region ends in September (96.6 mm/month), which could contribute to the depletion of WQI-NSF by the increased input of allochthonous materials and increased wastewater flow from the rivers of the eastern region of the former Texcoco Lake. Similarly, the runoff in a closed basin directly increases settleable and suspended solids in the waterbodies, causing further deterioration of water quality.
According to [14,35], the rivers from the eastern portion of the former Texcoco Lake deliver large amounts of organic matter and fecal contamination to these water bodies, reaching 49.75 million fecal coliforms per 100 mL. Such high coliform values not only exceed the maximum permissible limits of the Mexican standards (1000-2000 MPN per 100 mL), but also considerably lower the water quality index scores.

Trophic state index
The TSI values ranged from 106.19 at site S8 to 180.43 at site S4 during June. All sites in the three periods of study showed a hypereutrophic condition (70-100+). However, it is remarkable that waterbodies S2 and S5 showed the highest values during November, when the concentration of PO4 reached 156.8 mg/L and 107.98 mg/L, respectively, the highest values of this nutrient across all study sites and seasons (Figure 4).

Pollution Index
Results of the Pollution Index show that study sites S1, S3, S7, and S8 ranged from low pollution (Cd < 1) to medium pollution (1 < Cd < 3), which is consistent because site S1 receives treated wastewater, while sites S7 and S8 are small, isolated waterbodies without wastewater input (Figure 5). In contrast, the other sites qualify as highly polluted, with sites S2 (168.75), S5 (121.01), and S9 (80.43) in November reaching the highest values of metal contamination. In the case of S9, despite being a natural water body, it receives inputs from rivers located in the eastern portion of the former Texcoco Lake which, according to [14,35], carry a considerable amount of contaminants and heavy metals such as Pb, Zn, Ni, Cu, and Cd, although the concentrations do not exceed the maximum permissible guidelines.
The high metal concentrations, particularly in waterbodies of the central part of the lake, arise because many of these waterbodies are small and very shallow, with high variation in volume. In addition, continuous evaporation concentrates salts in the water; these waterbodies also have the highest values of conductivity (>20,000 mS/cm), chlorides (>120,000 mg/L), and total dissolved solids (>300,000 mg/L). Finally, being small and shallow, they are influenced by the Solonchak soil that predominates in the area. In these soils, aluminum salts (alumina) are predominant, which may contribute to the increase in the index scores, as the aluminum concentration exceeds the reference value more than fortyfold.

Metal pollution index
The MPI provides a geometric mean of the metal concentration for each waterbody, thus this index is an indicator of the level of contamination by these elements. There are metals whose presence in nature or polluting sources is very low, while other metals may be in abundance (either in nature or polluting sources). The assessment of geometric mean values contributes to the elimination of those extreme values that affect the correct interpretation of the data.
The set of geometric mean values was considered as the overall average for all the water bodies studied (Figure 6a). According to this index, sites S7, S8, and S1 had the lowest concentrations of metals during the three periods of study, which, in relation to the previous analysis, allows these waterbodies to be grouped as those with the best quality indicators among all water bodies studied.

Metal contamination in sediments
In the case of sediments, the MPI showed a maximum value of 0.54 for S2 in September, while the minimum value was 0.12 for S9 during September. The overall mean was 0.21 (geometric mean of all water bodies and periods of study) (Figure 6b). There is a remarkable increase of metals in sediments during September, which may be due to the rains that carry metals into the waterbodies, where they can settle with the transported material or be chelated and removed from the water column to the sediments. This analysis shows that there is a higher concentration of metals in water, reaching values up to 0.77-fold relative to the concentration in sediments.

Water quality of wells
The scores for the wells qualify as good water quality (Figure 3b). According to the National Water Commission [36], the static water level of the wells in the area of Texcoco fluctuates between 40 and 130 m, with the wells closest to Lago Nabor Carrillo having depths of 50 m [37]. The geological profile of the Texcoco area consists of clays with a high concentration of salts and silts up to 80 m deep [35], resulting in very poor permeability of surface water into the aquifer.
Furthermore, the MPI scores for wells reflect the low metal concentrations (Figure 6c). However, the high concentration of Mn (0.17-1.86 mg/L) is a distinctive feature of the geology of the area; this metal is removed by a local treatment plant that recovers water from this series of wells for public supply. This suggests that the quality of water in the aquifer in this area can be considered good, although the geological nature and soil type contribute metals and a large amount of salts.

Integration of the indices
In the present study, we used four indicators of pollution of water bodies:
• Water quality index (WQI-NSF).
• Trophic state index (TSI).
• Cd pollution index (indicator of contamination by heavy metals based on international safety standards for different uses of water).
• Metal pollution index (MPI).
The WQI-NSF is an indicator of water quality by different environmental parameters measured in the water and allows decisions about the use of the resource; in this case, all water bodies were classified into two categories: (a) poor quality and (b) very poor quality.
The TSI is an indicator of nutrient enrichment; it indicates that, due to the high concentration of nutrients, these waterbodies are susceptible to blooms of either microalgae or cyanobacteria, which can become potentially toxic.
Cd pollution index is based on comparative environmental standards. Its purpose is to ensure environmental safety standards regarding the use of water. In this case we used international criteria for aquatic life protection, protection of human health and public supply, of which the strictest criteria were selected.
The MPI is an indicator of the level of metals in water and sediments, offering an overall average metal content in each compartment. Based on the above, a principal component analysis (PCA) (Figure 7) was performed; to identify trends, three of the five indices were used: TSI, WQI-NSF, and MPI. The analysis thus considers nutrient enrichment, the state of water quality, and metal contamination. The use of all these indices allows the identification of different kinds of deterioration of the waterbodies.
The PCA showed a 72.50% explained variance, with 55.74% in the first component and 16.75% in the second. On the upper and lower right quadrants the vectors of MPI, TSI, and Cd were located and the three sampling seasons were associated with these vectors. Furthermore, the waterbodies S2, S4, and S5 were also associated with these vectors. These waterbodies, as was already indicated, have the highest concentrations of metals. Similarly, S2 and S5 reached the highest values of the TSI for the three periods of study.
On the upper left quadrant, the vector of WQI-NSF showed that sites S1, S7, S8, and S9 in June and September reached the highest values of WQI; in addition, these sites exhibit the lowest concentration of metals and lowest values of the TSI. In this sense, these water bodies are characterized by better water quality among all the study sites.

Conclusions
Water bodies within the former Texcoco Lake, assessed with different tools, show a high degree of contamination.
The WQI-NSF, an indicator of physical and chemical quality, indicates that the water bodies have poor to very poor water quality.
All water bodies have a high degree of nutrient enrichment and are in hypereutrophic conditions.
The heavy metal indices (Cd and MPI) showed that the 20 waterbodies are highly contaminated by metals, with a higher concentration in both water and sediments.
Water quality of the wells is good despite the high concentration of Mn from natural geological sources. Also, owing to the geological features of the study area, the pollution of the surface water does not compromise the quality of the aquifer, mainly because of the low permeability of the lithologic material in this area.
The integration of the WQI-NSF, TSI, Cd, and MPI indices and use of multivariate analysis identified those bodies of water in which metal contamination is more relevant than the deterioration of water quality by other physico-chemical parameters, as well as those waterbodies that have better conditions, despite their poor water quality.