Major ion composition in ground and surface water samples in the Poopó basin from the Bolivian Altiplano.
Abstract
Soil is a fundamental natural resource for the balance of ecosystems as well as for agriculture, food, and housing. Its structure is highly susceptible to change through contamination or degradation of anthropogenic origin. Therefore, its evaluation, whether for environmental purposes or as an agricultural or housing resource, must be thorough. This evaluation comprises the analysis of multiple physical, physicochemical and chemical-biological parameters, and handling so many parameters requires multivariate statistical methods. In this chapter, soil data were analyzed by Principal Component Analysis to reduce the number of dimensions and to support a better interpretation of the results. The method was applied to characterize and classify soil samples from the Bolivian Altiplano. The results show the potential of principal components for processing such data.
Keywords
- principal component analysis
- multivariate analysis
- reduction of dimensions
- Bolivian Altiplano
- contamination
1. Introduction
Soil and water are fundamental natural resources for the balance of ecosystems as well as for agriculture, food, and housing, and are thus fundamental for life.
Soils comprise two environments, biotic and abiotic. The first consists of microorganisms, while the second is composed of solid, liquid and gaseous phases. Together, the two environments characterize a particular soil and give it its uniqueness.
Soil and natural waters are highly susceptible to major changes in their structure and composition due to anthropogenic degradation or contamination. Therefore, their evaluation, whether for environmental purposes or as agricultural or housing resources, must be thorough. This evaluation comprises the analysis of multiple physical, physicochemical and chemical-biological parameters. For example, to assess soil fertility it is important to analyze parameters such as pH, electrical conductivity, organic matter, exchangeable cations and others. The evaluation is carried out by comparing these parameters with values established in agricultural or environmental regulations. However, the complexity, variety and large number of parameters can make this analysis difficult, so the use of multivariate statistical methods becomes imperative. Multivariate analysis applied to the characterization and classification of soils and natural waters, according to the intended field of study, is therefore growing in importance.
Principal Component Analysis offers an alternative for the characterization and classification of soils. The soil samples, or soil sampling points, constitute the elements of a system, and the physicochemical parameters measured in these samples are the variables. Since the sampling points and the variables are generally numerous, we have a system with multiple elements and variables. The data analysis can be quite complex because each sampling point has to be represented in a multidimensional space. Even though the variables could be represented in one-dimensional (each variable separately) or two-dimensional (two variables at a time) spaces, this is neither practical nor objectively informative.
Principal component analysis with dimension reduction offers precisely such an alternative. Since the set of variables can be reduced to a few composite variables, the analysis becomes feasible and the conclusions more objective.
However, not all data sets are amenable to Principal Component Analysis (PCA). They must meet certain requirements: for example, they should consist of numerical variables, and the correlations between the variables must be above an acceptable level. If these requirements are not met, a PCA can still be computed, but its results would not be valid. Compliance with these requirements is checked on the correlation matrix, where correlations must be observed. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy is a statistic that indicates the proportion of variance in the variables that can be explained with the PCA. It is somewhat similar to the coefficient of determination R² from a linear regression analysis. Kaiser proposed the following criteria for KMO [1]:
0.9≤KMO≤1.0 = Excellent sample adequacy.
0.8≤KMO<0.9 = Good sample adequacy.
0.7≤KMO<0.8 = Acceptable sample adequacy.
0.6≤KMO<0.7 = Regular sample adequacy.
0.5≤KMO<0.6 = Bad sample adequacy.
0.0≤KMO<0.5 = Unacceptable sample adequacy.
Therefore, the KMO must be at least 0.7 for the PCA to be acceptable. Bartlett’s test of sphericity is a statistical test whose null hypothesis is that the correlation matrix is an identity matrix. If the null hypothesis cannot be rejected (Sig. > 0.05), there is no evidence of correlations between the variables. The alternative hypothesis is that the correlation matrix is not an identity matrix; rejecting the null hypothesis in its favor (Sig. < 0.05) means that there are correlations and thus a PCA can be performed. The statistical evaluation was carried out with SPSS software.
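Both suitability checks can also be reproduced outside SPSS. The following is a minimal sketch, not the procedure used in the study: it assumes NumPy and SciPy, uses the standard textbook formulas (overall KMO from correlations and partial correlations, Bartlett's chi-square from the determinant of the correlation matrix), and runs on synthetic data. The function names `kmo`, `bartlett_sphericity` and `kaiser_label` are hypothetical helpers introduced here for illustration.

```python
import numpy as np
from scipy import stats


def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure from a correlation matrix R."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)          # partial (anti-image) correlations
    off = ~np.eye(R.shape[0], dtype=bool)    # mask selecting off-diagonal cells
    r2 = np.sum(R[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + p2)


def bartlett_sphericity(R, n):
    """Bartlett's test: chi-square, degrees of freedom and p-value (Sig.)."""
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi2, df, stats.chi2.sf(chi2, df)


def kaiser_label(value):
    """Map a KMO value to the adequacy classes listed above."""
    for bound, label in [(0.9, "Excellent"), (0.8, "Good"), (0.7, "Acceptable"),
                         (0.6, "Regular"), (0.5, "Bad")]:
        if value >= bound:
            return label
    return "Unacceptable"


# Synthetic data (assumption): 36 samples, 5 variables sharing one common factor.
rng = np.random.default_rng(0)
factor = rng.normal(size=(36, 1))
X = factor + 0.5 * rng.normal(size=(36, 5))
R = np.corrcoef(X, rowvar=False)

kmo_value = kmo(R)
chi2, df, p_value = bartlett_sphericity(R, n=X.shape[0])
print(kaiser_label(kmo_value), round(kmo_value, 3), round(chi2, 1), df, p_value)
```

The degrees of freedom depend only on the number of variables p, as p(p − 1)/2, which gives 55 for 11 variables and 120 for 16 variables.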
2. PCA in the analysis of soil samples
In a study carried out by Ramos Ramos et al. [2], water samples from the Bolivian Altiplano were analyzed. The parameters determined in the water are presented in Table 1.
Location | Sample ID | Water type | EC (μS/cm) | pH | Eh (mV) | HCO3 (mg/L) | F (mg/L) | Cl (mg/L) | SO4 (mg/L) | Ca (mg/L) | Mg (mg/L) | Na (mg/L) | K (mg/L) | Ionic balance (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Cayhuasi | CAP1 | Mg–Na–Ca– | 1,120 | 7.45 | 167 | 561.4 | 0.01 | 13.5 | 170.8 | 69.8 | 85.5 | 80.5 | 14.3 | 4.1 |
Soracachi 1 | SOP1 | Mg–Ca–Na– | 912 | 7.33 | 165 | 427.1 | 0.24 | 19.7 | 145.4 | 55.1 | 62.3 | 56.8 | 6.1 | –1.0 |
Soracachi 2 | SOP2 | Mg–Na–Ca– | 936 | 7.15 | 184 | 417.4 | 0.01 | 21.0 | 158.8 | 56.3 | 64.8 | 67.6 | 5.0 | 1.4 |
Paria | PAP1 | Na–Mg– | 2,120 | 7.41 | 173 | 717.6 | 1.04 | 198.2 | 242.8 | 72.5 | 58.6 | 291.5 | 42.1 | –0.7 |
Chusaqueri | CHP1 | Ca–Na–Cl | 1,480 | 7.52 | 160 | 153.8 | 0.03 | 294.9 | 85.6 | 156.3 | 20.4 | 88.3 | 8.2 | 1.9 |
Toledo 1 | TOP1 | Na–Ca–Cl– | 1,400 | 7.72 | 154 | 270.9 | 0.01 | 233.0 | 90.4 | 68.8 | 15.4 | 183.5 | 30.4 | 0.5 |
Toledo 2 | TOP2 | Na–Ca–Cl | 3,860 | 7.38 | 170 | 263.6 | 0.01 | 926.7 | 183.3 | 206.9 | 55.4 | 421.0 | 28.0 | –0.9 |
Kulliri | KUP1 | Na– | 850 | 7.75 | 160 | 297.8 | 0.11 | 61.9 | 97.0 | 27.2 | 3.4 | 135.0 | 19.1 | –4.3 |
Copacabanita | COP1 | Ca–Na– | 1,390 | 7.08 | 190 | 190.4 | 0.47 | 25.5 | 562.5 | 234.5 | 16.4 | 125.5 | 6.9 | 8.6 |
Tolaloma | TOLP1 | Ca– | 660 | 7.61 | 158 | 266.0 | 0.01 | 26.8 | 105.6 | 98.7 | 11.1 | 33.0 | 6.9 | 0.7 |
Andamarca | ANP1 | Na–Ca– | 1,190 | 7.69 | 169 | 483.3 | 0.01 | 74.1 | 142.8 | 114.7 | 13.1 | 179.0 | 18.5 | 7.3 |
Avaroa | AVP1 | Ca–Na– | 590 | 7.29 | 169 | 290.4 | 0.19 | 14.1 | 48.0 | 72.3 | 7.4 | 49.2 | 5.2 | 2.3 |
Pampa Aullagas | PAMP1 | Na–Ca–Cl– | 1,360 | 6.41 | 218 | 219.7 | 0.35 | 164.5 | 144.2 | 69.9 | 22.4 | 137.0 | 26.7 | 1.0 |
Quillacas 1 | QUP1 | Na–B–Cl– | 1,200 | 7.90 | 141 | 234.3 | 0.26 | 189.7 | 79.1 | 36.2 | 6.7 | 175.0 | 16.6 | –2.3 |
Quillacas 2 | QUP2 | Na–Cl | 810 | 6.55 | 210 | 61.0 | 0.20 | 93.7 | 66.8 | 11.5 | 4.9 | 123.8 | 21.4 | –2.0 |
Quillacas 3 | QUP3 | Na–Cl–B– | 410 | 6.74 | 205 | 36.6 | 0.13 | 35.9 | 36.8 | 10.0 | 3.6 | 44.6 | 17.5 | 7.0 |
Condo K2 | CONP2 | Na–Ca– | 660 | 7.55 | 162 | 234.3 | 0.32 | 42.0 | 55.6 | 52.0 | 8.2 | 109.4 | 9.9 | 13.0 |
Condo K4 | CONP4 | Na–Ca– | 570 | 7.55 | 165 | 227.0 | 0.29 | 30.4 | 48.7 | 43.1 | 7.4 | 51.0 | 8.9 | –3.9 |
Caraynacha | CARP1 | Ca–Na– | 400 | 7.48 | 168 | 172.1 | 0.13 | 20.9 | 30.5 | 42.9 | 6.2 | 44.4 | 13.9 | 9.6 |
Llapallapani | LLAP1 | Na–Ca– | 210 | 6.97 | 195 | 43.9 | 0.05 | 7.1 | 40.1 | 11.5 | 4.3 | 14.6 | 6.2 | –3.0 |
Challapata | CHAP1 | Ca–Na–Mg– | 650 | 6.84 | 197 | 135.5 | 0.15 | 47.0 | 35.7 | 55.3 | 15.3 | 44.3 | 6.5 | 10.0 |
Huancane | HUAP1 | Na–Ca– | 870 | 7.13 | 185 | 292.9 | 0.14 | 78.8 | 46.7 | 70.1 | 13.8 | 130.5 | 7.2 | 13.0 |
Irukasa | IRP1 | Na–Cl | 4,630 | 7.35 | 173 | 593.1 | 0.01 | 1,219.7 | 9.1 | 39.6 | 7.2 | 1,333 | 26.9 | 16.1 |
Realenga | REP1 | Na–Ca–Mg– | 770 | 6.81 | 206 | 178.2 | 0.30 | 37.1 | 176.6 | 56.0 | 23.1 | 101.0 | 5.8 | 8.9 |
Pazña | PAZP2 | Na–Ca– | 1,270 | 7.19 | 180 | 300.2 | 0.67 | 72.1 | 282.6 | 114.3 | 20.5 | 211.5 | 23.6 | 14.3 |
Totoral | TOTP1 | Na–Cl– | 1,450 | 6.87 | 198 | 98.8 | 0.55 | 168.8 | 131.1 | 51.3 | 10.4 | 208.0 | 22.8 | 0.5 |
Cayumalliri | CAYP1 | Ca–Na– | 640 | 6.70 | 212 | 205.0 | 0.07 | 33.9 | 72.1 | 64.5 | 15.1 | 35.0 | 3.0 | –1.1 |
Sora Sora | SORP1 | Al–Ca–Mg– | 4,500 | 3.79 | 378 | 0.0 | 4.06 | 28.1 | 1,020.2 | 130.7 | 63.6 | 45.5 | 6.7 | 1.6 |
Chapanar | CHAO1 | Ca–Mg–Na– | 280 | 7.60 | 157 | 109.8 | 0.01 | 9.5 | 44.2 | 20.8 | 8.8 | 16.4 | 3.9 | –7.7 |
Totoralr | TOR1 | Ca– | 2,580 | 3.88 | 369 | 0.0 | 0.51 | 104.0 | 1,546.9 | 333.5 | 40.4 | 28.2 | 9.2 | –2.1 |
Avicayar | AVR1 | Ca– | 2,530 | 3.10 | 417 | 0.0 | 1.09 | 106.9 | 1,261.1 | 255.3 | 36.8 | 56.7 | 6.6 | –2.1 |
Paznar | PAZR1 | Ca–Na– | 1,960 | 4.71 | 221 | 14.6 | 0.06 | 151.7 | 874.9 | 191.3 | 36.2 | 128.5 | 12.0 | 3.3 |
Table 2 shows the correlation matrix of the parameters. There are correlations between several variables, which is indicative of underlying structures. In principle, a PCA would be feasible.
Correlations | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | EC | pH | Eh | HCO3 | F | Cl | SO4 | Ca | Mg | Na | K |
EC | Pearson correlation | 1 | −.287 | .429* | .091 | .531** | .709** | .505** | .493** | .419* | .625** | .342 |
sig. (2-tailed) | .112 | .014 | .620 | .002 | .000 | .003 | .004 | .017 | .000 | .055 | ||
N | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | |
pH | Pearson correlation | −.287 | 1 | −.569** | .378* | −.393* | .120 | −.624** | −.361* | −.109 | .133 | .102 |
sig. (2-tailed) | .112 | .001 | .033 | .026 | .514 | .000 | .043 | .554 | .468 | .580 | ||
N | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | |
Eh | Pearson correlation | .429* | −.569** | 1 | −.462** | .582** | −.107 | .815** | .569** | .276 | −.145 | −.325 |
sig. (2-tailed) | .014 | .001 | .008 | .000 | .559 | .000 | .001 | .126 | .428 | .070 | ||
N | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | |
HCO3 | Pearson correlation | .091 | .378* | −.462** | 1 | −.228 | .307 | −.415* | −.262 | .313 | .475** | .456** |
sig. (2-tailed) | .620 | .033 | .008 | .210 | .087 | .018 | .148 | .081 | .006 | .009 | ||
N | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | |
F | Pearson correlation | .531** | −.393* | .582** | −.228 | 1 | −.128 | .523** | .225 | .358* | −.098 | −.033 |
sig. (2-tailed) | .002 | .026 | .000 | .210 | .486 | .002 | .215 | .044 | .594 | .860 | ||
N | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | |
Cl | Pearson correlation | .709** | .120 | −.107 | .307 | −.128 | 1 | −.085 | .123 | .033 | .897** | .497** |
sig. (2-tailed) | .000 | .514 | .559 | .087 | .486 | .643 | .501 | .857 | .000 | .004 | ||
N | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | |
SO4 | Pearson correlation | .505** | −.624** | .815** | −.415* | .523** | −.085 | 1 | .832** | .400* | −.151 | −.171 |
sig. (2-tailed) | .003 | .000 | .000 | .018 | .002 | .643 | .000 | .023 | .408 | .350 | ||
N | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | |
Ca | Pearson correlation | .493** | −.361* | .569** | −.262 | .225 | .123 | .832** | 1 | .353* | −.057 | −.107 |
sig. (2-tailed) | .004 | .043 | .001 | .148 | .215 | .501 | .000 | .048 | .756 | .559 | ||
N | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | |
Mg | Pearson correlation | .419* | −.109 | .276 | .313 | .358* | .033 | .400* | .353* | 1 | −.060 | .063 |
sig. (2-tailed) | .017 | .554 | .126 | .081 | .044 | .857 | .023 | .048 | .746 | .733 | ||
N | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | |
Na | Pearson correlation | .625** | .133 | −.145 | .475** | −.098 | .897** | −.151 | −.057 | −.060 | 1 | .512** |
sig. (2-tailed) | .000 | .468 | .428 | .006 | .594 | .000 | .408 | .756 | .746 | .003 | ||
N | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | |
K | Pearson correlation | .342 | .102 | −.325 | .456** | −.033 | .497** | −.171 | −.107 | .063 | .512** | 1 |
sig. (2-tailed) | .055 | .580 | .070 | .009 | .860 | .004 | .350 | .559 | .733 | .003 | ||
N | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 |
The results of the KMO measure and Bartlett's test of sphericity are shown in Table 3:
KMO and Bartlett’s Test | ||
---|---|---|
Kaiser-Meyer-Olkin Measure of Sampling Adequacy | | .478 |
Bartlett’s Test of Sphericity | Approx. Chi-Square | 349.409 |
| | df | 55 |
| | Sig. | .000 |
The KMO statistic is 0.478, which falls in the unacceptable range, although Bartlett's test of sphericity is significant (Sig. = 0.000). Correlations exist, but the sampling adequacy is too low, so these data are not suitable for a PCA.
Table 4 corresponds to an analysis of metals in soil samples from the Poopó Basin in the Bolivian Altiplano [2]. There are 36 sampling points at which 16 parameters were determined. Therefore, we have a system with 36 elements and 16 variables.
No | Code | Al | As | B | Cd | Cr | Cu | Fe | Mn | Mo | Ni | P | Pb | S | Si | Sr | Zn |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | mg/kg | mg/kg | mg/kg | mg/kg | mg/kg | mg/kg | mg/kg | mg/kg | mg/kg | mg/kg | mg/kg | mg/kg | mg/kg | mg/kg | mg/kg | mg/kg |
1 | COR 1H | 13181.1 | 13.8 | 11.3 | 0.5 | 12.9 | 20.6 | 16708.9 | 233.7 | 0.2 | 8.0 | 532.5 | 35.6 | 556.9 | 429.9 | 42.5 | 86.8 |
2 | COR 1C | 11823.0 | 17.8 | 4.5 | 0.5 | 15.2 | 21.1 | 23063.1 | 474.0 | 0.3 | 10.1 | 337.6 | 40.9 | 300.5 | 519.2 | 32.5 | 84.2 |
3 | COR 1P | 12705.8 | 15.0 | 6.3 | 0.3 | 14.8 | 18.1 | 20914.0 | 370.3 | 0.3 | 9.5 | 345.5 | 35.1 | 161.0 | 459.4 | 26.3 | 85.2 |
4 | COR 1AA | 30996.3 | 38.5 | 15.0 | 0.9 | 40.2 | 48.6 | 57156.4 | 1029.9 | 0.8 | 27.4 | 976.4 | 88.3 | 581.1 | 1183.3 | 55.5 | 232.5 |
5 | COR 2C | 7945.6 | 28.0 | 8.9 | 0.7 | 10.2 | 20.1 | 15650.0 | 262.1 | 0.3 | 4.4 | 298.4 | 61.3 | 241.5 | 509.1 | 30.1 | 77.8 |
6 | COR 2P | 11231.6 | 31.2 | 6.9 | 0.7 | 10.9 | 21.3 | 16403.3 | 299.2 | 0.5 | 7.5 | 285.8 | 104.4 | 424.9 | 98.4 | 31.4 | 76.3 |
7 | COR 2AA | 8736.8 | 26.1 | 12.1 | 0.5 | 10.0 | 20.9 | 15818.3 | 332.4 | 0.3 | 5.8 | 407.7 | 48.7 | 373.7 | 449.3 | 37.8 | 99.5 |
8 | COR 3H | 10296.9 | 27.8 | 13.1 | 0.6 | 10.5 | 17.6 | 16405.1 | 394.2 | 0.5 | 6.8 | 457.7 | 40.3 | 348.8 | 535.2 | 33.3 | 81.4 |
9 | COR 3C | 4372.9 | 16.2 | 4.7 | 0.2 | 5.9 | 8.7 | 10264.0 | 192.5 | 0.3 | 3.4 | 218.9 | 20.4 | 168.4 | 375.9 | 15.6 | 54.5 |
10 | COR 3P | 6899.8 | 20.8 | 5.4 | 0.4 | 8.9 | 12.8 | 14938.4 | 281.6 | 0.3 | 5.1 | 279.2 | 30.6 | 153.6 | 455.2 | 21.8 | 67.0 |
11 | COR 3AA | 6910.1 | 21.5 | 6.7 | 0.5 | 9.1 | 16.0 | 15030.0 | 303.4 | 0.2 | 5.1 | 274.4 | 37.2 | 181.8 | 504.6 | 22.7 | 77.8 |
12 | VM 1H | 42276.0 | 30.1 | 40.0 | 1.2 | 27.1 | 27.8 | 22425.4 | 546.1 | 2.8 | 7.7 | 910.0 | 85.9 | 506.3 | 983.3 | 51.2 | 159.4 |
13 | VM 1C | 25081.6 | 23.0 | 27.5 | 0.3 | 17.4 | 11.5 | 16043.0 | 365.0 | 0.4 | 4.7 | 381.1 | 26.0 | 264.1 | 985.4 | 44.1 | 58.0 |
14 | VM 2H | 43636.7 | 17.3 | 29.5 | 0.3 | 29.2 | 18.8 | 22690.3 | 448.1 | 0.3 | 10.3 | 469.2 | 28.0 | 227.9 | 1138.1 | 39.1 | 78.0 |
15 | VM 2P | 40756.0 | 19.3 | 27.5 | 0.3 | 28.2 | 18.7 | 22097.1 | 428.5 | 0.4 | 8.8 | 449.3 | 25.6 | 220.4 | 1042.0 | 39.3 | 72.3 |
16 | VM 3H | 5509.4 | 9.3 | 8.3 | 0.4 | 5.2 | 9.4 | 8763.7 | 144.1 | 0.5 | 3.7 | 198.2 | 15.8 | 170.1 | 558.5 | 18.5 | 95.6 |
17 | VM 3C | 11299.4 | 13.3 | 3.8 | 0.6 | 15.3 | 22.9 | 23362.0 | 454.6 | 0.5 | 10.8 | 352.7 | 35.4 | 162.2 | 460.5 | 27.1 | 81.4 |
18 | VM 3P | 12348.8 | 15.9 | 4.8 | 0.4 | 16.3 | 25.5 | 25623.8 | 497.5 | 0.3 | 11.5 | 393.5 | 41.5 | 180.9 | 458.4 | 28.9 | 91.7 |
19 | VM 4H | 12362.6 | 19.7 | 20.5 | 0.8 | 11.1 | 21.8 | 17023.4 | 292.1 | 0.6 | 8.0 | 533.1 | 39.5 | 424.4 | 1395.9 | 51.3 | 203.0 |
20 | VM 4C | 25008.0 | 40.2 | 14.5 | 0.7 | 30.6 | 41.1 | 46040.2 | 928.5 | 0.9 | 18.9 | 813.9 | 70.8 | 398.4 | 1109.0 | 63.3 | 146.4 |
21 | VM 4P | 8614.9 | 18.5 | 23.3 | 0.8 | 9.1 | 16.6 | 18128.2 | 627.7 | 0.4 | 7.5 | 457.9 | 32.4 | 584.8 | 617.7 | 35.2 | 179.4 |
22 | VM 5H | 6887.9 | 28.9 | 11.8 | 0.9 | 10.0 | 22.1 | 238.7 | 503.4 | 0.3 | 7.8 | 362.5 | 70.2 | 844.3 | 487.6 | 25.5 | 130.2 |
23 | VM 5C | 7839.3 | 28.4 | 13.3 | 0.9 | 10.8 | 22.5 | 21060.7 | 505.6 | 0.3 | 7.3 | 371.0 | 73.0 | 818.4 | 488.2 | 26.0 | 137.5 |
24 | VM 6C | 7941.5 | 19.0 | 9.3 | 0.4 | 8.9 | 14.8 | 16193.1 | 510.6 | 0.3 | 6.5 | 328.9 | 43.9 | 360.1 | 575.9 | 24.6 | 88.5 |
25 | VM 6P | 6511.7 | 27.2 | 8.6 | 1.1 | 9.3 | 21.1 | 21475.7 | 544.7 | 0.2 | 6.7 | 364.7 | 53.6 | 414.0 | 506.9 | 19.5 | 161.2 |
26 | POO 1C | 10751.3 | 19.8 | 5.7 | 0.9 | 12.3 | 20.3 | 19361.9 | 424.3 | 0.5 | 8.6 | 348.3 | 43.4 | 191.3 | 633.8 | 27.3 | 90.8 |
27 | POO 1P | 13660.9 | 20.6 | 7.1 | 0.5 | 14.4 | 21.0 | 22414.2 | 485.6 | 0.6 | 9.4 | 348.3 | 33.7 | 181.2 | 606.7 | 33.8 | 76.1 |
28 | POO 2C | 10290.1 | 15.7 | 13.5 | 1.0 | 10.3 | 14.0 | 14995.4 | 340.5 | 0.4 | 5.2 | 337.6 | 27.9 | 179.9 | 516.9 | 36.1 | 117.9 |
29 | POO 2P | 10725.5 | 19.3 | 6.7 | 0.4 | 12.2 | 15.7 | 17443.0 | 386.6 | 0.2 | 6.8 | 295.3 | 30.9 | 214.1 | 552.8 | 29.0 | 68.3 |
30 | POO 3C | 7681.3 | 27.0 | 8.4 | 2.1 | 9.9 | 19.1 | 17784.7 | 297.0 | 0.3 | 6.4 | 688.2 | 138.7 | 300.1 | 496.8 | 30.3 | 246.6 |
31 | POO 3P | 9274.6 | 21.2 | 12.1 | 1.5 | 11.0 | 20.8 | 17636.1 | 337.4 | 0.4 | 6.0 | 803.9 | 95.2 | 289.3 | 517.9 | 36.9 | 193.8 |
32 | POO 4C | 7099.8 | 14.3 | 9.9 | 0.3 | 10.9 | 12.2 | 14469.9 | 286.8 | 0.7 | 7.5 | 250.7 | 24.8 | 171.4 | 440.3 | 26.0 | 51.7 |
33 | POO 4C (Alta) | 9838.7 | 17.5 | 11.7 | 0.4 | 10.3 | 13.0 | 14625.4 | 277.9 | 0.3 | 5.1 | 331.2 | 24.4 | 182.9 | 503.2 | 46.3 | 47.7 |
34 | POO 4P | 10407.3 | 16.3 | 11.7 | 0.6 | 11.3 | 15.1 | 16412.7 | 359.6 | 0.3 | 6.5 | 348.0 | 38.3 | 292.0 | 664.4 | 33.9 | 69.1 |
35 | POO 4 AA | 25849.4 | 14.6 | 28.9 | 0.3 | 18.6 | 12.7 | 14794.3 | 311.3 | 0.3 | 5.6 | 315.7 | 27.1 | 583.4 | 1439.7 | 54.3 | 57.0 |
36 | PUÑ 4P | 22049.2 | 16.5 | 21.2 | 0.2 | 15.2 | 9.3 | 12327.1 | 348.1 | 0.5 | 4.8 | 317.3 | 16.6 | 232.0 | 1308.1 | 43.7 | 45.5 |
37 | Reference | 33277.5 | 18.4 | 23.7 | 3.9 | 74.5 | 356.4 | 26390.7 | 491.4 | 0.4 | 300.2 | 969.5 | 101.5 | 1841.5 | 501.3 | 81.0 | 744.7 |
Table 5 shows the correlation matrix, where correlations between the different variables can be seen. The KMO statistic is 0.704, which indicates that the sampling adequacy is acceptable for performing a PCA (Table 6). The Sig. of Bartlett's test is 0.000, so the null hypothesis is rejected and the correlation matrix is not an identity matrix. Together, these values indicate that the data can be subjected to a PCA, and we therefore proceed with it.
Correlations | |||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | Al | As | B | Cd | Cr | Cu | Fe | Mn | Mo | Ni | P | Pb | S | Si | Sr | Zn |
Al | Pearson Correlation | 1 | ,207 | ,796** | −,115 | ,876** | ,353* | ,452** | ,400* | ,512** | ,412* | ,505** | ,012 | ,092 | ,716** | ,651** | ,039 |
Sig. (2-tailed) | ,227 | ,000 | ,504 | ,000 | ,035 | ,006 | ,016 | ,001 | ,012 | ,002 | ,946 | ,595 | ,000 | ,000 | ,819 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
As | Pearson Correlation | ,207 | 1 | ,124 | ,449** | ,421* | ,726** | ,528** | ,635** | ,334* | ,508** | ,595** | ,700** | ,481** | ,106 | ,354* | ,503** |
Sig. (2-tailed) | ,227 | ,472 | ,006 | ,011 | ,000 | ,001 | ,000 | ,047 | ,002 | ,000 | ,000 | ,003 | ,539 | ,034 | ,002 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
B | Pearson Correlation | ,796** | ,124 | 1 | ,003 | ,528** | ,056 | ,083 | ,184 | ,530** | ,028 | ,414* | −,034 | ,327 | ,732** | ,642** | ,136 |
Sig. (2-tailed) | ,000 | ,472 | ,988 | ,001 | ,745 | ,632 | ,282 | ,001 | ,871 | ,012 | ,845 | ,052 | ,000 | ,000 | ,429 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
Cd | Pearson Correlation | −,115 | ,449** | ,003 | 1 | −,030 | ,355* | ,133 | ,189 | ,238 | ,099 | ,581** | ,823** | ,310 | −,118 | ,057 | ,826** |
Sig. (2-tailed) | ,504 | ,006 | ,988 | ,864 | ,033 | ,438 | ,271 | ,163 | ,567 | ,000 | ,000 | ,066 | ,491 | ,743 | ,000 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
Cr | Pearson Correlation | ,876** | ,421* | ,528** | −,030 | 1 | ,700** | ,788** | ,705** | ,443** | ,786** | ,660** | ,167 | ,154 | ,638** | ,685** | ,218 |
Sig. (2-tailed) | ,000 | ,011 | ,001 | ,864 | ,000 | ,000 | ,000 | ,007 | ,000 | ,000 | ,331 | ,368 | ,000 | ,000 | ,202 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
Cu | Pearson Correlation | ,353* | ,726** | ,056 | ,355* | ,700** | 1 | ,856** | ,834** | ,366* | ,898** | ,760** | ,553** | ,396* | ,225 | ,492** | ,587** |
Sig. (2-tailed) | ,035 | ,000 | ,745 | ,033 | ,000 | ,000 | ,000 | ,028 | ,000 | ,000 | ,000 | ,017 | ,187 | ,002 | ,000 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
Fe | Pearson Correlation | ,452** | ,528** | ,083 | ,133 | ,788** | ,856** | 1 | ,825** | ,272 | ,915** | ,665** | ,298 | ,091 | ,348* | ,521** | ,414* |
Sig. (2-tailed) | ,006 | ,001 | ,632 | ,438 | ,000 | ,000 | ,000 | ,109 | ,000 | ,000 | ,077 | ,596 | ,037 | ,001 | ,012 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
Mn | Pearson Correlation | ,400* | ,635** | ,184 | ,189 | ,705** | ,834** | ,825** | 1 | ,310 | ,857** | ,622** | ,309 | ,401* | ,336* | ,442** | ,470** |
Sig. (2-tailed) | ,016 | ,000 | ,282 | ,271 | ,000 | ,000 | ,000 | ,065 | ,000 | ,000 | ,067 | ,015 | ,045 | ,007 | ,004 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
Mo | Pearson Correlation | ,512** | ,334* | ,530** | ,238 | ,443** | ,366* | ,272 | ,310 | 1 | ,224 | ,564** | ,271 | ,152 | ,296 | ,413* | ,257 |
Sig. (2-tailed) | ,001 | ,047 | ,001 | ,163 | ,007 | ,028 | ,109 | ,065 | ,190 | ,000 | ,110 | ,376 | ,079 | ,012 | ,130 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
Ni | Pearson Correlation | ,412* | ,508** | ,028 | ,099 | ,786** | ,898** | ,915** | ,857** | ,224 | 1 | ,629** | ,299 | ,238 | ,321 | ,476** | ,421* |
Sig. (2-tailed) | ,012 | ,002 | ,871 | ,567 | ,000 | ,000 | ,000 | ,000 | ,190 | ,000 | ,077 | ,163 | ,056 | ,003 | ,010 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
P | Pearson Correlation | ,505** | ,595** | ,414* | ,581** | ,660** | ,760** | ,665** | ,622** | ,564** | ,629** | 1 | ,622** | ,352* | ,395* | ,648** | ,744** |
Sig. (2-tailed) | ,002 | ,000 | ,012 | ,000 | ,000 | ,000 | ,000 | ,000 | ,000 | ,000 | ,000 | ,035 | ,017 | ,000 | ,000 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
Pb | Pearson Correlation | ,012 | ,700** | −,034 | ,823** | ,167 | ,553** | ,298 | ,309 | ,271 | ,299 | ,622** | 1 | ,414* | −,147 | ,137 | ,701** |
Sig. (2-tailed) | ,946 | ,000 | ,845 | ,000 | ,331 | ,000 | ,077 | ,067 | ,110 | ,077 | ,000 | ,012 | ,392 | ,425 | ,000 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
S | Pearson Correlation | ,092 | ,481** | ,327 | ,310 | ,154 | ,396* | ,091 | ,401* | ,152 | ,238 | ,352* | ,414* | 1 | ,168 | ,284 | ,453** |
Sig. (2-tailed) | ,595 | ,003 | ,052 | ,066 | ,368 | ,017 | ,596 | ,015 | ,376 | ,163 | ,035 | ,012 | ,328 | ,093 | ,006 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
Si | Pearson Correlation | ,716** | ,106 | ,732** | −,118 | ,638** | ,225 | ,348* | ,336* | ,296 | ,321 | ,395* | −,147 | ,168 | 1 | ,737** | ,163 |
Sig. (2-tailed) | ,000 | ,539 | ,000 | ,491 | ,000 | ,187 | ,037 | ,045 | ,079 | ,056 | ,017 | ,392 | ,328 | ,000 | ,342 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
Sr | Pearson Correlation | ,651** | ,354* | ,642** | ,057 | ,685** | ,492** | ,521** | ,442** | ,413* | ,476** | ,648** | ,137 | ,284 | ,737** | 1 | ,253 |
Sig. (2-tailed) | ,000 | ,034 | ,000 | ,743 | ,000 | ,002 | ,001 | ,007 | ,012 | ,003 | ,000 | ,425 | ,093 | ,000 | ,136 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | |
Zn | Pearson Correlation | ,039 | ,503** | ,136 | ,826** | ,218 | ,587** | ,414* | ,470** | ,257 | ,421* | ,744** | ,701** | ,453** | ,163 | ,253 | 1 |
Sig. (2-tailed) | ,819 | ,002 | ,429 | ,000 | ,202 | ,000 | ,012 | ,004 | ,130 | ,010 | ,000 | ,000 | ,006 | ,342 | ,136 | ||
N | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 |
KMO and Bartlett’s Test | ||
---|---|---|
Kaiser-Meyer-Olkin Measure of Sampling Adequacy | | .704 |
Bartlett’s Test of Sphericity | Approx. Chi-Square | 725.706 |
| | df | 120 |
| | Sig. | .000 |
Four principal components were extracted; they are presented in Table 7. The table shows that these four components explain 85.13% of the total variance.
Total variance explained | |||||||||
---|---|---|---|---|---|---|---|---|---|
Component | Initial eigenvalues | | | Extraction sums of squared loadings | | | Rotation sums of squared loadings | | |
| | Total | % of variance | Cumulative % | Total | % of variance | Cumulative % | Total | % of variance | Cumulative % |
1 | 7.584 | 47.401 | 47.401 | 7.584 | 47.401 | 47.401 | 4.732 | 29.578 | 29.578 |
2 | 3.192 | 19.948 | 67.350 | 3.192 | 19.948 | 67.350 | 4.070 | 25.435 | 55.013 |
3 | 1.905 | 11.904 | 79.254 | 1.905 | 11.904 | 79.254 | 3.639 | 22.743 | 77.756 |
4 | .941 | 5.880 | 85.134 | .941 | 5.880 | 85.134 | 1.181 | 7.379 | 85.134 |
5 | .735 | 4.596 | 89.731 | ||||||
6 | .466 | 2.913 | 92.643 | ||||||
7 | .381 | 2.381 | 95.024 | ||||||
8 | .289 | 1.805 | 96.829 | ||||||
9 | .164 | 1.025 | 97.854 | ||||||
10 | .110 | .690 | 98.544 | ||||||
11 | .074 | .462 | 99.006 | ||||||
12 | .065 | .405 | 99.411 | ||||||
13 | .049 | .304 | 99.715 | ||||||
14 | .029 | .181 | 99.896 | ||||||
15 | .011 | .072 | 99.968 | ||||||
16 | .005 | .032 | 100.000 |
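The percentage columns of Table 7 follow directly from the eigenvalues: for a PCA on a correlation matrix the eigenvalues sum to the number of variables (16 here), so each percentage is the eigenvalue divided by that sum. A minimal sketch, assuming NumPy and using the initial eigenvalues copied from Table 7:

```python
import numpy as np

# Initial eigenvalues copied from Table 7 (16 standardized variables).
eigenvalues = np.array([7.584, 3.192, 1.905, 0.941, 0.735, 0.466, 0.381, 0.289,
                        0.164, 0.110, 0.074, 0.065, 0.049, 0.029, 0.011, 0.005])

# Share of total variance carried by each component, and the running total.
percent = 100 * eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(percent)

for k in range(4):
    print(f"PC{k + 1}: {percent[k]:.2f}% (cumulative {cumulative[k]:.2f}%)")
```

Small differences in the last decimal with respect to Table 7 come from the eigenvalues being rounded to three decimals.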
Table 8 shows the component matrix after Varimax rotation. This matrix determines which variables each principal component represents: each variable was assigned to the component with which it has the highest correlation (loading).
Rotated component matrix | ||||
---|---|---|---|---|
| | Component | | | |
| | 1 | 2 | 3 | 4 |
Al | .320 | .874 | −.056 | −.101 |
As | .523 | .083 | .547 | .314 |
B | −.116 | .945 | .057 | .197 |
Cd | −.011 | −.068 | .946 | .056 |
Cr | .719 | .647 | .016 | −.065 |
Cu | .861 | .132 | .412 | .125 |
Fe | .928 | .203 | .155 | −.104 |
Mn | .850 | .199 | .191 | .233 |
Mo | .099 | .609 | .432 | −.250 |
Ni | .956 | .142 | .115 | .046 |
P | .500 | .470 | .658 | .001 |
Pb | .220 | −.078 | .885 | .129 |
S | .119 | .161 | .328 | .874 |
Si | .210 | .825 | −.145 | .166 |
Sr | .376 | .746 | .093 | .163 |
Zn | .280 | .073 | .809 | .214 |
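The assignment criterion can be applied mechanically to Table 8. A minimal sketch, assuming NumPy and taking the largest absolute loading per variable as the criterion (the variable name `groups` is introduced here for illustration):

```python
import numpy as np

variables = ["Al", "As", "B", "Cd", "Cr", "Cu", "Fe", "Mn",
             "Mo", "Ni", "P", "Pb", "S", "Si", "Sr", "Zn"]

# Rotated loadings copied from Table 8 (rows: variables, columns: components 1-4).
loadings = np.array([
    [0.320, 0.874, -0.056, -0.101],
    [0.523, 0.083, 0.547, 0.314],
    [-0.116, 0.945, 0.057, 0.197],
    [-0.011, -0.068, 0.946, 0.056],
    [0.719, 0.647, 0.016, -0.065],
    [0.861, 0.132, 0.412, 0.125],
    [0.928, 0.203, 0.155, -0.104],
    [0.850, 0.199, 0.191, 0.233],
    [0.099, 0.609, 0.432, -0.250],
    [0.956, 0.142, 0.115, 0.046],
    [0.500, 0.470, 0.658, 0.001],
    [0.220, -0.078, 0.885, 0.129],
    [0.119, 0.161, 0.328, 0.874],
    [0.210, 0.825, -0.145, 0.166],
    [0.376, 0.746, 0.093, 0.163],
    [0.280, 0.073, 0.809, 0.214],
])

# For each variable, index of the component with the largest absolute loading.
dominant = np.argmax(np.abs(loadings), axis=1)
groups = {f"PC{k + 1}": [v for v, d in zip(variables, dominant) if d == k]
          for k in range(4)}

for name, members in groups.items():
    print(name, members)
```

Some variables, such as As, Cr and P, also load appreciably on a second component, so the grouping is a simplification rather than a strict partition.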
Table 9 shows the component score coefficient matrix, which contains the coefficients of the variables in each principal component. For example, for Principal Component 1 (PC1), the following Eq. (2) applies:
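Reading the coefficients from the first column of Table 9, and assuming they apply to the standardized variables (written here as $z$, a notation introduced for this sketch), the expression for PC1 would take the form:

```latex
\begin{aligned}
\mathrm{PC1} ={} & -0.007\,z_{\mathrm{Al}} + 0.078\,z_{\mathrm{As}} - 0.184\,z_{\mathrm{B}} - 0.125\,z_{\mathrm{Cd}} \\
 & + 0.145\,z_{\mathrm{Cr}} + 0.206\,z_{\mathrm{Cu}} + 0.259\,z_{\mathrm{Fe}} + 0.220\,z_{\mathrm{Mn}} \\
 & - 0.106\,z_{\mathrm{Mo}} + 0.278\,z_{\mathrm{Ni}} + 0.009\,z_{\mathrm{P}} - 0.041\,z_{\mathrm{Pb}} \\
 & - 0.051\,z_{\mathrm{S}} - 0.031\,z_{\mathrm{Si}} + 0.002\,z_{\mathrm{Sr}} - 0.034\,z_{\mathrm{Zn}}
\end{aligned}
```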
Component score coefficient matrix | ||||
---|---|---|---|---|
| | Component | | | |
| | 1 | 2 | 3 | 4 |
Al | −.007 | .236 | −.027 | −.140 |
As | .078 | −.050 | .079 | .192 |
B | −.184 | .308 | .020 | .145 |
Cd | −.125 | −.003 | .351 | −.124 |
Cr | .145 | .108 | −.066 | −.117 |
Cu | .206 | −.076 | .018 | .017 |
Fe | .259 | −.058 | −.043 | −.168 |
Mn | .220 | −.063 | −.084 | .166 |
Mo | −.106 | .200 | .213 | −.365 |
Ni | .278 | −.089 | −.093 | −.008 |
P | .009 | .093 | .193 | −.161 |
Pb | −.041 | −.043 | .282 | −.044 |
S | −.051 | .000 | −.057 | .803 |
Si | −.031 | .222 | −.099 | .150 |
Sr | .002 | .178 | −.032 | .101 |
Zn | −.034 | −.009 | .231 | .048 |
Similarly, the equations for the components PC2, PC3 and PC4 can be expressed.
According to this matrix of score coefficients and their corresponding equations, the score values of the main components are obtained for the 36 sampling points. These are shown in Table 10.
Sample | PC1 | PC2 | PC3 | PC4 |
---|---|---|---|---|
1 | −0.21118 | −0.0777 | −0.33467 | 0.80961 |
2 | 0.6651 | −0.67391 | −0.53521 | −0.20777 |
3 | 0.39744 | −0.60917 | −0.64339 | −0.954 |
4 | 4.07162 | 0.56719 | 0.76641 | 0.55046 |
5 | −0.34113 | −0.59646 | 0.24423 | −0.13876 |
6 | −0.07872 | −0.94299 | 0.7625 | 0.27764 |
7 | −0.18808 | −0.3646 | 0.02199 | 0.54976 |
8 | −0.16801 | −0.19284 | 0.04181 | 0.26754 |
9 | −0.79917 | −0.90177 | −0.74427 | −0.65343 |
10 | −0.33788 | −0.78329 | −0.45122 | −0.68191 |
11 | −0.27713 | −0.79168 | −0.32166 | −0.4073 |
12 | −0.67221 | 3.28014 | 2.51067 | −1.29964 |
13 | −0.61199 | 1.3241 | −0.81815 | 0.22432 |
14 | 0.33766 | 1.77831 | −1.07201 | −0.46932 |
15 | 0.23938 | 1.63463 | −0.98365 | −0.54454 |
16 | −1.18136 | −0.4418 | −0.4608 | −0.8548 |
17 | 0.68138 | −0.72139 | −0.37866 | −1.24838 |
18 | 1.03018 | −0.79119 | −0.53675 | −0.89173 |
19 | −0.59117 | 1.09146 | 0.47815 | 0.82755 |
20 | 2.92182 | 0.70194 | 0.35112 | 0.21752 |
21 | −0.32097 | 0.11616 | 0.21485 | 1.62766 |
22 | −0.52482 | −0.6866 | 0.55143 | 2.96642 |
23 | 0.00703 | −0.71324 | 0.53004 | 2.46939 |
24 | −0.07964 | −0.61843 | −0.43321 | 0.45935 |
25 | 0.13395 | −1.0284 | 0.67803 | 0.67771 |
26 | 0.16106 | −0.54494 | 0.12387 | −0.82888 |
27 | 0.54161 | −0.31881 | −0.42751 | −0.80632 |
28 | −0.6818 | −0.06185 | 0.23561 | −0.69627 |
29 | 0.06159 | −0.56759 | −0.68723 | −0.29851 |
30 | −0.80934 | −0.64774 | 3.33121 | −0.66866 |
31 | −0.59454 | −0.14172 | 2.187 | −0.70855 |
32 | −0.38801 | −0.34591 | −0.64539 | −0.98714 |
33 | −0.47597 | 0.01196 | −0.67739 | −0.40583 |
34 | −0.32046 | −0.14616 | −0.38936 | −0.05153 |
35 | −0.83523 | 1.86013 | −1.28317 | 1.80646 |
36 | −0.76102 | 1.34416 | −1.20524 | 0.07187 |
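Scores such as those in Table 10 are obtained by standardizing each variable and multiplying by the score coefficient matrix. The sketch below illustrates this step under stated assumptions: it uses NumPy, standardizes with the sample standard deviation (n − 1), and runs on hypothetical toy data rather than on the measurements of Table 4. The function name `component_scores` is a helper introduced here.

```python
import numpy as np


def component_scores(X, W):
    """Return component scores for data X (samples x variables)
    given a score coefficient matrix W (variables x components)."""
    # Standardize each column with the sample standard deviation (assumption).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    return Z @ W


# Hypothetical toy data: 4 samples, 2 variables, 1 component.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [4.0, 40.0]])
W = np.array([[0.5],
              [0.5]])

scores = component_scores(X, W)
print(scores.ravel())
```

With the 36 × 16 data of Table 4 as `X` and the 16 × 4 matrix of Table 9 as `W`, the same call would give a 36 × 4 array of scores; exact agreement with Table 10 is not guaranteed, because the tabulated coefficients are rounded.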
From these scores, which represent the new, reduced set of composite variables, different graphical representations and interpretations can be made, for example a plot of each principal component against the sampling points (Figure 1).
Figure 1a shows that PC1 is fairly homogeneously distributed across the sampling points, except for points 4 (sample code COR 1AA) and 20 (sample code VM 4C). These two points are high outliers in PC1 (Ni, Fe, Cu, Mn and Cr). In environmental terms, special attention is required at these sampling points, which are listed in Table 11:
No | Soil code | Cr (mg/kg) | Cu (mg/kg) | Fe (mg/kg) | Mn (mg/kg) | Ni (mg/kg) |
---|---|---|---|---|---|---|
4 | COR 1AA | 40.2 | 48.6 | 57156.4 | 1029.9 | 27.4 |
20 | VM 4C | 30.6 | 41.1 | 46040.2 | 928.5 | 18.9 |
Max | | 40.2 | 48.6 | 57156.4 | 1029.9 | 27.4 |
Min | | 5.2 | 8.7 | 238.7 | 144.1 | 3.4 |
Mean | | 14.3 | 19.3 | 19049.5 | 411.18 | 7.9 |
These samples have high values of these metals. In particular, sample 4 presents the maximum values and sample 20 is well above the mean values. A similar analysis can be done for the other principal components.
Regarding structure, samples 4 and 20 form a group with relatively high values of PC1, and the rest of the points form another group. Figure 1b represents the distribution of PC2 (B, Al, Si, Sr and Mo). Here too, atypical points form a group with relatively high values of this component: sampling points 12, 14, 15, 35 and 36. The rest of the points form a group with a homogeneous distribution in relation to PC2.
Figure 2a shows that the behavior of PC3 is also homogeneous, with the exception of sampling points 12, 30 and 31, which would be high outliers in Cd, Pb, Zn, P and As. Regarding structure, sampling points 12, 30 and 31 form a group with high values of PC3, and the rest of the sampling points form the other group. Figure 2b represents the distribution of PC4 (sulfur) across the sampling points. One group would be made up of sampling points 21, 22, 23 and 35, with relatively high sulfur values, and the other group of the rest of the points.
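The visual reading of the PC1 and PC3 plots can be cross-checked numerically. The sketch below assumes a cutoff of 2 standardized units, a choice made here for illustration and not a rule stated in the study, and applies it to the PC1 and PC3 scores copied from Table 10. The function name `flag_high` is a hypothetical helper.

```python
import numpy as np

# PC1 and PC3 scores for sampling points 1-36, copied from Table 10.
pc1 = np.array([-0.21118, 0.6651, 0.39744, 4.07162, -0.34113, -0.07872,
                -0.18808, -0.16801, -0.79917, -0.33788, -0.27713, -0.67221,
                -0.61199, 0.33766, 0.23938, -1.18136, 0.68138, 1.03018,
                -0.59117, 2.92182, -0.32097, -0.52482, 0.00703, -0.07964,
                0.13395, 0.16106, 0.54161, -0.6818, 0.06159, -0.80934,
                -0.59454, -0.38801, -0.47597, -0.32046, -0.83523, -0.76102])
pc3 = np.array([-0.33467, -0.53521, -0.64339, 0.76641, 0.24423, 0.7625,
                0.02199, 0.04181, -0.74427, -0.45122, -0.32166, 2.51067,
                -0.81815, -1.07201, -0.98365, -0.4608, -0.37866, -0.53675,
                0.47815, 0.35112, 0.21485, 0.55143, 0.53004, -0.43321,
                0.67803, 0.12387, -0.42751, 0.23561, -0.68723, 3.33121,
                2.187, -0.64539, -0.67739, -0.38936, -1.28317, -1.20524])


def flag_high(scores, threshold=2.0):
    """Return 1-based sampling point numbers whose score exceeds the threshold."""
    return [i + 1 for i, s in enumerate(scores) if s > threshold]


high_pc1 = flag_high(pc1)
high_pc3 = flag_high(pc3)
print("PC1:", high_pc1)
print("PC3:", high_pc3)
```

The groups described for PC2 and PC4 include points with scores below 2, so this particular cutoff would not reproduce them; a lower threshold or a clustering method would be needed there.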
Figure 3 is the representation of PC1 against PC2. There are three groups observed; the first group formed by sampling points 12, 13, 14, 15, 19, 35 and 36, with high values of PC2 (B, Al, Si, Sr and Mo) with respect to their values of PC1 (Ni, Fe, Cu, Mn and Cr); a second group made up of sampling points 4 and 20 which would have high values of PC1 in relation to PC2; and a third group made up of the rest of the sampling points which would have an homogeneous distribution in PC2 and PC1. Again, from the environmental point of view, the sampling points of the first and second groups should be analyzed more carefully.
Figure 4 shows the distribution of the samples in relation to the main components PC1 against PC3. While the PC3 in most of the sampling points does not show variability except for points 12, 30 and 31; PC1 is highly distributed with the greatest variability in the sampling points.
In the same way, graphical representations of the rest of the combinations of the main components can be made (Figures 5 and 6).
In the following example, two types of soils are considered. Their characteristics differ, which makes it possible to see the ability of the principal components to characterize and classify soils [4].
The following chemical parameters have been considered: pH in H2O, pH in KCl solution, electrical conductivity (EC), exchangeable acidity (H-Al), total nitrogen (N), organic matter (OM), assimilable phosphorus (P) and exchangeable cations (Ca2+, Mg2+, Na+, K+).
The first group of samples comes from the inter-Andean valley of the Municipality of Inquisivi – Yamora, located between 66°43′29″ and 67°17′58″ West longitude and 15°47′34″ and 17°18′20″ South latitude, at an average altitude of 2840 m (a.s.l.). The second group comes from the Municipality of Viacha in the Northern Altiplano, located between 68°16′56″ and 68°22′72″ West longitude and 16°32′39″ and 16°54′44″ South latitude, at an average altitude of 4070 m (a.s.l.). Both are in La Paz, Bolivia [4].
Ten soil samples were taken in the Yamora community and another ten in the Viacha community, and the 11 parameters listed above were analyzed in each. The evaluation does not take into account the environmental conditions of Yamora or Viacha; it is based only on the chemical parameters and assesses the fertility of the soils from the chemical point of view (Table 12).
Location | pH (H2O) | pH (KCl) | EC | H-Al | % OM | % N | Na | K | Ca | Mg | P |
---|---|---|---|---|---|---|---|---|---|---|---|
Yamora | 6.75 | 5.8 | 0.075 | 0.0329 | 3.4 | 0.28 | 0.128 | 0.688 | 17.761 | 2.548 | 273.916 |
Yamora | 6.76 | 4.98 | 0.075 | 0.0609 | 3.2 | 0.30 | 0.128 | 0.688 | 17.760 | 2.577 | 255.876 |
Yamora | 6.72 | 5.73 | 0.068 | 0.0339 | 3.3 | 0.31 | 0.134 | 0.688 | 18.331 | 2.636 | 250.994 |
Yamora | 6.76 | 5.89 | 0.074 | 0.0082 | 3.4 | 0.32 | 0.134 | 0.655 | 18.674 | 2.684 | 257.253 |
Yamora | 6.73 | 5.89 | 0.072 | 0.0329 | 3.2 | 0.30 | 0.134 | 0.655 | 17.455 | 2.518 | 246.810 |
Yamora | 6.79 | 5.92 | 0.068 | 0.0329 | 3.4 | 0.32 | 0.146 | 0.655 | 17.799 | 2.548 | 266.998 |
Yamora | 6.79 | 5.37 | 0.069 | 0.0391 | 3.4 | 0.3 | 0.128 | 0.688 | 17.874 | 2.587 | 253.086 |
Yamora | 6.8 | 5.84 | 0.073 | 0.0329 | 3.4 | 0.3 | 0.134 | 0.655 | 17.074 | 2.450 | 259.345 |
Yamora | 6.83 | 6.01 | 0.072 | 0.0349 | 3.4 | 0.3 | 0.134 | 0.655 | 18.027 | 2.606 | 261.420 |
Yamora | 6.82 | 5.95 | 0.072 | 0.0329 | 3.1 | 0.3 | 0.134 | 0.622 | 17.341 | 2.479 | 275.347 |
Viacha | 8.54 | 7.13 | 0.727 | 0.0934 | 0.7 | 0.086 | 4.663 | 0.459 | 5.385 | 4.008 | 16.010 |
Viacha | 8.72 | 7.16 | 0.732 | 0.0934 | 0.7 | 0.096 | 5.012 | 0.491 | 5.155 | 4.066 | 15.870 |
Viacha | 8.78 | 7.12 | 0.736 | 0.1054 | 0.7 | 0.101 | 4.605 | 0.426 | 5.042 | 3.988 | 18.241 |
Viacha | 8.74 | 7.11 | 0.735 | 0.1054 | 0.5 | 0.093 | 4.605 | 0.426 | 5.080 | 3.988 | 19.287 |
Viacha | 8.81 | 7.16 | 0.737 | 0.0934 | 0.6 | 0.092 | 4.663 | 0.426 | 5.118 | 4.027 | 19.845 |
Viacha | 8.78 | 7.12 | 0.738 | 0.1054 | 0.7 | 0.091 | 4.663 | 0.459 | 5.118 | 4.027 | 20.612 |
Viacha | 8.94 | 7.01 | 0.731 | 0.1091 | 0.7 | 0.093 | 4.605 | 0.426 | 5.080 | 3.959 | 17.683 |
Viacha | 8.69 | 6.78 | 0.733 | 0.0813 | 0.6 | 0.093 | 4.663 | 0.426 | 5.309 | 3.802 | 18.241 |
Viacha | 8.49 | 7.14 | 0.732 | 0.0818 | 0.7 | 0.093 | 5.129 | 0.491 | 5.233 | 3.978 | 18.311 |
Viacha | 8.81 | 7.14 | 0.780 | 0.0813 | 0.5 | 0.093 | 5.246 | 0.491 | 4.889 | 3.939 | 16.010 |
PCA was also performed and the correlation matrix is shown in Table 13.
Correlations | | | | | | | | | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Parameter | Statistic | pH (H2O) | pH (KCl) | EC | H-Al | OM | N | Na | K | Ca | Mg | P |
pH (H2O) | Pearson correlation | 1 | .944** | .996** | .945** | −.994** | −.992** | .991** | −.978** | −.996** | .991** | −.993** |
pH (H2O) | Sig. (2-tailed) | | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000 |
pH (H2O) | N (samples) | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
pH (KCl) | Pearson correlation | .944** | 1 | .947** | .836** | −.941** | −.941** | .948** | −.940** | −.947** | .947** | −.941** |
pH (KCl) | Sig. (2-tailed) | .000 | | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000 |
pH (KCl) | N (samples) | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
EC | Pearson correlation | .996** | .947** | 1 | .936** | −.998** | −.997** | .998** | −.971** | −.999** | .995** | −.998** |
EC | Sig. (2-tailed) | .000 | .000 | | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000 |
EC | N (samples) | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
H-Al | Pearson correlation | .945** | .836** | .936** | 1 | −.939** | −.942** | .925** | −.921** | −.943** | .936** | −.938** |
H-Al | Sig. (2-tailed) | .000 | .000 | .000 | | .000 | .000 | .000 | .000 | .000 | .000 | .000 |
H-Al | N (samples) | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
OM | Pearson correlation | −.994** | −.941** | −.998** | −.939** | 1 | .995** | −.996** | .975** | .997** | −.991** | .996** |
OM | Sig. (2-tailed) | .000 | .000 | .000 | .000 | | .000 | .000 | .000 | .000 | .000 | .000 |
OM | N (samples) | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
N | Pearson correlation | −.992** | −.941** | −.997** | −.942** | .995** | 1 | −.994** | .967** | .997** | −.990** | .994** |
N | Sig. (2-tailed) | .000 | .000 | .000 | .000 | .000 | | .000 | .000 | .000 | .000 | .000 |
N | N (samples) | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
Na | Pearson correlation | .991** | .948** | .998** | .925** | −.996** | −.994** | 1 | −.960** | −.996** | .993** | −.996** |
Na | Sig. (2-tailed) | .000 | .000 | .000 | .000 | .000 | .000 | | .000 | .000 | .000 | .000 |
Na | N (samples) | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
K | Pearson correlation | −.978** | −.940** | −.971** | −.921** | .975** | .967** | −.960** | 1 | .974** | −.962** | .969** |
K | Sig. (2-tailed) | .000 | .000 | .000 | .000 | .000 | .000 | .000 | | .000 | .000 | .000 |
K | N (samples) | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
Ca | Pearson correlation | −.996** | −.947** | −.999** | −.943** | .997** | .997** | −.996** | .974** | 1 | −.991** | .997** |
Ca | Sig. (2-tailed) | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000 | | .000 | .000 |
Ca | N (samples) | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
Mg | Pearson correlation | .991** | .947** | .995** | .936** | −.991** | −.990** | .993** | −.962** | −.991** | 1 | −.995** |
Mg | Sig. (2-tailed) | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000 | | .000 |
Mg | N (samples) | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
P | Pearson correlation | −.993** | −.941** | −.998** | −.938** | .996** | .994** | −.996** | .969** | .997** | −.995** | 1 |
P | Sig. (2-tailed) | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000 | .000 | |
P | N (samples) | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
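As a rough check on Table 13, individual Pearson coefficients can be recomputed from the columns of Table 12. The sketch below does this for pH in H2O, EC and OM only. The function name `pearson` and the variable names are illustrative; the data are copied from Table 12 with the ten Yamora samples first.

```python
from math import sqrt

# Columns copied from Table 12 (Yamora samples first, then Viacha).
ph_h2o = [6.75, 6.76, 6.72, 6.76, 6.73, 6.79, 6.79, 6.80, 6.83, 6.82,
          8.54, 8.72, 8.78, 8.74, 8.81, 8.78, 8.94, 8.69, 8.49, 8.81]
ec = [0.075, 0.075, 0.068, 0.074, 0.072, 0.068, 0.069, 0.073, 0.072, 0.072,
      0.727, 0.732, 0.736, 0.735, 0.737, 0.738, 0.731, 0.733, 0.732, 0.780]
om = [3.4, 3.2, 3.3, 3.4, 3.2, 3.4, 3.4, 3.4, 3.4, 3.1,
      0.7, 0.7, 0.7, 0.5, 0.6, 0.7, 0.7, 0.6, 0.7, 0.5]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r_ph_ec = pearson(ph_h2o, ec)   # Table 13 reports .996
r_ph_om = pearson(ph_h2o, om)   # Table 13 reports -.994
r_ec_om = pearson(ec, om)       # Table 13 reports -.998

print(f"pH-EC: {r_ph_ec:.3f}, pH-OM: {r_ph_om:.3f}, EC-OM: {r_ec_om:.3f}")
```

The very high absolute values arise largely because the two sites differ strongly in every parameter, so the between-site contrast dominates each correlation.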
The correlation matrix shows high correlations between the variables. The KMO value of 0.865 and the significance of Bartlett's test (reported as 0.000, that is, p < 0.001) indicate that dimension reduction by principal components is feasible and adequate (Table 14). Two principal components were therefore extracted (Table 15), and the rotated component matrix and the component score coefficients (Table 16) were obtained by applying Varimax rotation with Kaiser normalization.
KMO and Bartlett’s Test | | |
---|---|---|
Kaiser-Meyer-Olkin Measure of Sampling Adequacy | | .865 |
Bartlett’s Test of Sphericity | Approx. Chi-Square | 715.671 |
Bartlett’s Test of Sphericity | df | 55 |
Bartlett’s Test of Sphericity | Sig. | .000 |
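For readers who want to see how the two statistics in Table 14 are usually defined, the sketch below implements the textbook formulas for Bartlett's test of sphericity and the overall KMO measure from a correlation matrix. It assumes NumPy is available. The function names and the 3 x 3 toy matrix are illustrative; the chapter's own values were presumably produced by a statistics package, and the toy matrix is not the chapter's data.

```python
import numpy as np


def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for a p x p
    correlation matrix R estimated from n samples."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi2, df


def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)            # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)      # off-diagonal mask
    r2 = np.sum(R[off] ** 2)
    a2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + a2)


# Toy example: three variables with all pairwise correlations equal to 0.5.
R_toy = np.full((3, 3), 0.5)
np.fill_diagonal(R_toy, 1.0)

chi2_toy, df_toy = bartlett_sphericity(R_toy, n=20)
kmo_toy = kmo(R_toy)

# Degrees of freedom for the 11 parameters of Table 12.
_, df_11 = bartlett_sphericity(np.eye(11), n=20)
print(chi2_toy, df_toy, kmo_toy, df_11)
```

For 11 parameters the formula gives 11 x 10 / 2 = 55 degrees of freedom, which agrees with the df entry in Table 14.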
Total variance explained | |||||||||
---|---|---|---|---|---|---|---|---|---|
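Component | Initial eigenvalues: Total | Initial eigenvalues: % of variance | Initial eigenvalues: Cumulative % | Extraction sums of squared loadings: Total | Extraction sums of squared loadings: % of variance | Extraction sums of squared loadings: Cumulative % | Rotation sums of squared loadings: Total | Rotation sums of squared loadings: % of variance | Rotation sums of squared loadings: Cumulative % |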
1 | 10.707 | 97.338 | 97.338 | 10.707 | 97.338 | 97.338 | 5.673 | 51.572 | 51.572 |
2 | .166 | 1.511 | 98.848 | .166 | 1.511 | 98.848 | 5.200 | 47.276 | 98.848 |
3 | .063 | .569 | 99.417 | ||||||
4 | .038 | .342 | 99.759 | ||||||
5 | .011 | .103 | 99.862 | ||||||
6 | .006 | .056 | 99.918 | ||||||
7 | .004 | .041 | 99.959 | ||||||
8 | .003 | .024 | 99.983 | ||||||
9 | .001 | .012 | 99.995 | ||||||
10 | .000 | .003 | 99.998 | ||||||
11 | .000 | .002 | 100.000 |
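The percentage columns of Table 15 follow from the eigenvalues: for a correlation matrix of 11 standardized parameters the total variance is 11, so each percentage is the eigenvalue divided by 11. The sketch below reproduces this arithmetic from the rounded eigenvalues in the table. The variable names are illustrative, and the count of eigenvalues greater than 1 is included only as a common reference rule; the chapter does not state which retention criterion was used.

```python
# Initial eigenvalues copied from Table 15 (rounded to three decimals).
eigenvalues = [10.707, 0.166, 0.063, 0.038, 0.011, 0.006,
               0.004, 0.003, 0.001, 0.000, 0.000]

n_parameters = len(eigenvalues)  # 11 standardized parameters, total variance 11

percent = [100.0 * ev / n_parameters for ev in eigenvalues]

cumulative = []
running = 0.0
for pct in percent:
    running += pct
    cumulative.append(running)

# Reference only: number of components with eigenvalue greater than 1.
n_above_one = sum(ev > 1.0 for ev in eigenvalues)

print(f"PC1: {percent[0]:.2f} %, PC1+PC2: {cumulative[1]:.2f} %")
print(f"Eigenvalues above 1: {n_above_one}")
```

The small differences from the tabulated percentages come from the rounding of the eigenvalues. Only the first eigenvalue exceeds 1, while the chapter retains two components, which together account for about 98.8 % of the variance.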
Rotated component matrix | | |
---|---|---|
Parameter | Component 1 | Component 2 |
pH (H2O) | .711 | .699 |
pH (KCl) | .876 | .461 |
EC | .728 | .683 |
H-Al | .476 | .870 |
OM | −.717 | −.694 |
N | −.710 | −.699 |
Na | .739 | .667 |
K | −.725 | −.657 |
Ca | −.718 | −.694 |
Mg | .723 | .682 |
P | −.717 | −.693 |
Component score coefficient matrix | | |
---|---|---|
Parameter | Component 1 | Component 2 |
pH (H2O) | .011 | .123 |
pH (KCl) | 1.195 | −1.122 |
EC | .106 | .024 |
H-Al | −1.184 | 1.367 |
OM | −.041 | −.092 |
N | −.007 | −.128 |
Na | .185 | −.059 |
K | −.174 | .049 |
Ca | −.047 | −.086 |
Mg | .095 | .035 |
P | −.045 | −.087 |
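Varimax rotation with Kaiser normalization can be sketched in a few lines. The version below is a common SVD-based formulation and assumes NumPy is available. The unrotated loadings of the chapter are not reported, so the 4 x 2 matrix used here is purely hypothetical, and the function name `varimax` and its parameters are illustrative rather than the routine actually used for Table 16.

```python
import numpy as np


def varimax(loadings, normalize=True, max_iter=100, tol=1e-8):
    """Return (rotated loadings, orthogonal rotation matrix)."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    if normalize:
        # Kaiser normalization: scale each row to unit length first.
        h = np.sqrt(np.sum(L ** 2, axis=1))
        L = L / h[:, None]
    R = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ R
        column_ss = np.diag(np.sum(rotated ** 2, axis=0))
        gradient = L.T @ (rotated ** 3 - rotated @ column_ss / p)
        u, s, vt = np.linalg.svd(gradient)
        R = u @ vt
        new_criterion = np.sum(s)
        if new_criterion < criterion * (1 + tol):
            break  # no further improvement
        criterion = new_criterion
    rotated = L @ R
    if normalize:
        rotated = rotated * h[:, None]  # undo the row scaling
    return rotated, R


# Hypothetical unrotated loadings (4 parameters, 2 components).
unrotated = np.array([[0.8, 0.4],
                      [0.7, 0.5],
                      [0.6, -0.5],
                      [0.7, -0.4]])

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
```

Because the rotation is orthogonal, the communality of each parameter (the row sum of squared loadings) is unchanged; only the way the variance is shared between the components changes, as in the rotation sums of squared loadings of Table 15.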
The rotated component matrix reveals a structure. One group of parameters correlates positively with the principal components, Group 1 (PC1): pH in KCl, Na+, EC, Mg2+, pH in H2O and H-Al. Another group correlates negatively, Group 2 (PC2): N, OM, P, Ca2+ and K+. The two groups therefore compete in the soil. If Group 1 dominates Group 2, the soil has high pH and EC values, high Na+ and Mg2+ contents and low contents of OM, P and N, and the principal components take positive values; such a soil is unfavorable for agriculture. If Group 2 dominates Group 1, the soil is rich in OM, P and N, the principal components take negative values, and the soil is more suitable for agriculture.
The component score coefficient matrix (Table 16) supplies the coefficients of the functions that generate PC1 and PC2.
The score values are the following (Table 17):
Location | PC1 | PC2 |
---|---|---|
Yamora | −0.6039 | −0.7968 |
Yamora | −2.9276 | 1.6177 |
Yamora | −0.7392 | −0.6686 |
Yamora | 0.4893 | −2.0047 |
Yamora | −0.3933 | −0.9410 |
Yamora | −0.3570 | −1.0346 |
Yamora | −1.5191 | 0.1140 |
Yamora | −0.4907 | −0.8760 |
Yamora | −0.2710 | −1.0577 |
Yamora | −0.2547 | −1.0504 |
Viacha | 0.8095 | 0.5245 |
Viacha | 0.8482 | 0.5004 |
Viacha | 0.4159 | 1.0341 |
Viacha | 0.4046 | 1.0660 |
Viacha | 0.9214 | 0.4989 |
Viacha | 0.3742 | 1.0584 |
Viacha | 0.1010 | 1.3822 |
Viacha | 0.6979 | 0.5567 |
Viacha | 1.2224 | 0.0200 |
Viacha | 1.2719 | 0.0567 |
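The scores in Table 17 can be approximated from the data in Table 12 and the score coefficients in Table 16. The sketch below assumes that each parameter is standardized with its sample standard deviation before the coefficients are applied, a common convention for regression-type component scores that the chapter does not state explicitly. It also reads the Ca entry "5 155" as 5.155. The names `data`, `coefficients` and `component_scores` are illustrative.

```python
from statistics import mean, stdev

# Table 12, column order: pH(H2O), pH(KCl), EC, H-Al, OM, N, Na, K, Ca, Mg, P.
data = [
    [6.75, 5.80, 0.075, 0.0329, 3.4, 0.28, 0.128, 0.688, 17.761, 2.548, 273.916],
    [6.76, 4.98, 0.075, 0.0609, 3.2, 0.30, 0.128, 0.688, 17.760, 2.577, 255.876],
    [6.72, 5.73, 0.068, 0.0339, 3.3, 0.31, 0.134, 0.688, 18.331, 2.636, 250.994],
    [6.76, 5.89, 0.074, 0.0082, 3.4, 0.32, 0.134, 0.655, 18.674, 2.684, 257.253],
    [6.73, 5.89, 0.072, 0.0329, 3.2, 0.30, 0.134, 0.655, 17.455, 2.518, 246.810],
    [6.79, 5.92, 0.068, 0.0329, 3.4, 0.32, 0.146, 0.655, 17.799, 2.548, 266.998],
    [6.79, 5.37, 0.069, 0.0391, 3.4, 0.30, 0.128, 0.688, 17.874, 2.587, 253.086],
    [6.80, 5.84, 0.073, 0.0329, 3.4, 0.30, 0.134, 0.655, 17.074, 2.450, 259.345],
    [6.83, 6.01, 0.072, 0.0349, 3.4, 0.30, 0.134, 0.655, 18.027, 2.606, 261.420],
    [6.82, 5.95, 0.072, 0.0329, 3.1, 0.30, 0.134, 0.622, 17.341, 2.479, 275.347],
    [8.54, 7.13, 0.727, 0.0934, 0.7, 0.086, 4.663, 0.459, 5.385, 4.008, 16.010],
    [8.72, 7.16, 0.732, 0.0934, 0.7, 0.096, 5.012, 0.491, 5.155, 4.066, 15.870],  # Ca assumed 5.155
    [8.78, 7.12, 0.736, 0.1054, 0.7, 0.101, 4.605, 0.426, 5.042, 3.988, 18.241],
    [8.74, 7.11, 0.735, 0.1054, 0.5, 0.093, 4.605, 0.426, 5.080, 3.988, 19.287],
    [8.81, 7.16, 0.737, 0.0934, 0.6, 0.092, 4.663, 0.426, 5.118, 4.027, 19.845],
    [8.78, 7.12, 0.738, 0.1054, 0.7, 0.091, 4.663, 0.459, 5.118, 4.027, 20.612],
    [8.94, 7.01, 0.731, 0.1091, 0.7, 0.093, 4.605, 0.426, 5.080, 3.959, 17.683],
    [8.69, 6.78, 0.733, 0.0813, 0.6, 0.093, 4.663, 0.426, 5.309, 3.802, 18.241],
    [8.49, 7.14, 0.732, 0.0818, 0.7, 0.093, 5.129, 0.491, 5.233, 3.978, 18.311],
    [8.81, 7.14, 0.780, 0.0813, 0.5, 0.093, 5.246, 0.491, 4.889, 3.939, 16.010],
]

# Table 16, one (PC1, PC2) pair per parameter, same order as the columns above.
coefficients = [
    (0.011, 0.123), (1.195, -1.122), (0.106, 0.024), (-1.184, 1.367),
    (-0.041, -0.092), (-0.007, -0.128), (0.185, -0.059), (-0.174, 0.049),
    (-0.047, -0.086), (0.095, 0.035), (-0.045, -0.087),
]

columns = list(zip(*data))
means = [mean(col) for col in columns]
sds = [stdev(col) for col in columns]  # assumption: sample standard deviation (n - 1)


def component_scores(row):
    """PC1 and PC2 scores of one sample from its standardized parameters."""
    z = [(x - m) / s for x, m, s in zip(row, means, sds)]
    pc1 = sum(zj * c[0] for zj, c in zip(z, coefficients))
    pc2 = sum(zj * c[1] for zj, c in zip(z, coefficients))
    return pc1, pc2


all_scores = [component_scores(row) for row in data]
for label, (pc1, pc2) in zip(["Yamora"] * 10 + ["Viacha"] * 10, all_scores):
    print(f"{label}: PC1 = {pc1:.3f}, PC2 = {pc2:.3f}")
```

Under these assumptions the second Yamora sample receives the most negative PC1 score, in line with Table 17. Small differences from the tabulated values are to be expected because the coefficients are rounded to three decimals.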
The components for the Yamora and Viacha samples are represented in Figure 7. For both PC1 (Figure 7a) and PC2 (Figure 7b), positive values indicate that pH in KCl, Na+, EC, Mg2+, pH in H2O and H-Al dominate N, OM, P, Ca2+ and K+. A soil with positive values of PC1 and PC2 therefore has high pH values, a high Na+ concentration and high EC. A soil with negative values of the components, on the other hand, is rich in OM, P and N, which makes it much more suitable for agriculture.
For the Yamora samples, PC1 and PC2 are predominantly negative (Table 17); this soil is therefore rich in OM, P and N and much more suitable for agriculture. For the Viacha samples, PC1 and PC2 are positive; this soil is therefore less suitable for agriculture (Figure 8).
The principal components accurately classify the two types of soils. In addition, a correlation between the components can be observed for each type of soil (Figure 9).
The slopes of the two lines are approximately the same, and the soils are characterized by the intercept (Figure 9). Soils with more suitable characteristics for cultivation, in which N, OM, P, Ca2+ and K+ dominate pH in KCl, Na+, EC, Mg2+, pH in H2O and H-Al, tend towards smaller or even negative intercepts. In this case, the principal components classify and characterize the soils with high precision. Multivariate analysis is thus an important tool for classifying soils.
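The statement about slopes and intercepts can be checked with an ordinary least-squares line of PC2 on PC1 for each site, using the scores in Table 17. This is a minimal sketch; the function name `fit_line` is illustrative and the choice of PC2 as the dependent variable is an assumption about how Figure 9 was drawn.

```python
# (PC1, PC2) scores copied from Table 17.
yamora = [(-0.6039, -0.7968), (-2.9276, 1.6177), (-0.7392, -0.6686),
          (0.4893, -2.0047), (-0.3933, -0.9410), (-0.3570, -1.0346),
          (-1.5191, 0.1140), (-0.4907, -0.8760), (-0.2710, -1.0577),
          (-0.2547, -1.0504)]
viacha = [(0.8095, 0.5245), (0.8482, 0.5004), (0.4159, 1.0341),
          (0.4046, 1.0660), (0.9214, 0.4989), (0.3742, 1.0584),
          (0.1010, 1.3822), (0.6979, 0.5567), (1.2224, 0.0200),
          (1.2719, 0.0567)]


def fit_line(points):
    """Least-squares slope and intercept of y on x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope_yamora, intercept_yamora = fit_line(yamora)
slope_viacha, intercept_viacha = fit_line(viacha)

print(f"Yamora: PC2 = {slope_yamora:.2f} * PC1 + {intercept_yamora:.2f}")
print(f"Viacha: PC2 = {slope_viacha:.2f} * PC1 + {intercept_viacha:.2f}")
```

Both fitted slopes are close to -1, while the intercept is negative for Yamora and positive for Viacha, so the intercept is what separates the two soils.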
The principal components provide a starting point for the data analysis and should be complemented with other multivariate methods. In this case, for example, multivariate discriminant analysis can be applied [5].
The coefficients of the standardized canonical discriminant function indicate that the most appropriate parameters for the discriminant function are N, Na+, K+, Mg2+ and P. The parameters that are important for defining soil fertility are pH and OM. Other factors that intervene in soil formation are the presence of minerals containing exchangeable cations (Na+, K+, Mg2+ and Ca2+), decreases in soil acidification and the decomposition of minerals.
The general discriminant function obtained for the two types of soils is [6]:
While the discriminant functions by group are:
Applying the discriminant function to the samples from both places classifies all 20 samples correctly (100%). These functions therefore have a high probability of correctly classifying new soil samples. In this way, soils can be classified, and their chemical fertility determined, from five parameters and the discriminant function. This information complements the principal component analysis.
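To illustrate how a two-group discriminant function of this kind works, the sketch below fits Fisher's linear discriminant to the five parameters named above (N, Na, K, Mg and P), using the values in Table 12. It assumes NumPy is available. It is not the chapter's fitted function, whose coefficients are not reproduced here: the weights, the midpoint cut-off (equal weighting of the two groups) and the names `weights`, `cutoff` and `classify` are all assumptions of this example.

```python
import numpy as np

# Columns: N, Na, K, Mg, P (from Table 12).
yamora = np.array([
    [0.28, 0.128, 0.688, 2.548, 273.916], [0.30, 0.128, 0.688, 2.577, 255.876],
    [0.31, 0.134, 0.688, 2.636, 250.994], [0.32, 0.134, 0.655, 2.684, 257.253],
    [0.30, 0.134, 0.655, 2.518, 246.810], [0.32, 0.146, 0.655, 2.548, 266.998],
    [0.30, 0.128, 0.688, 2.587, 253.086], [0.30, 0.134, 0.655, 2.450, 259.345],
    [0.30, 0.134, 0.655, 2.606, 261.420], [0.30, 0.134, 0.622, 2.479, 275.347],
])
viacha = np.array([
    [0.086, 4.663, 0.459, 4.008, 16.010], [0.096, 5.012, 0.491, 4.066, 15.870],
    [0.101, 4.605, 0.426, 3.988, 18.241], [0.093, 4.605, 0.426, 3.988, 19.287],
    [0.092, 4.663, 0.426, 4.027, 19.845], [0.091, 4.663, 0.459, 4.027, 20.612],
    [0.093, 4.605, 0.426, 3.959, 17.683], [0.093, 4.663, 0.426, 3.802, 18.241],
    [0.093, 5.129, 0.491, 3.978, 18.311], [0.093, 5.246, 0.491, 3.939, 16.010],
])

mean_yamora = yamora.mean(axis=0)
mean_viacha = viacha.mean(axis=0)

# Pooled within-group covariance matrix.
n_y, n_v = len(yamora), len(viacha)
pooled = ((n_y - 1) * np.cov(yamora, rowvar=False)
          + (n_v - 1) * np.cov(viacha, rowvar=False)) / (n_y + n_v - 2)

# Fisher's direction and a cut-off halfway between the projected group means.
weights = np.linalg.solve(pooled, mean_yamora - mean_viacha)
cutoff = weights @ (mean_yamora + mean_viacha) / 2.0


def classify(sample):
    """Assign a sample (N, Na, K, Mg, P) to one of the two sites."""
    return "Yamora" if weights @ np.asarray(sample, dtype=float) > cutoff else "Viacha"


truth = ["Yamora"] * n_y + ["Viacha"] * n_v
predicted = [classify(s) for s in np.vstack([yamora, viacha])]
accuracy = sum(p == t for p, t in zip(predicted, truth)) / len(truth)
print(f"Correctly classified: {accuracy:.0%}")
```

Because the two soils differ so strongly in these parameters, this simple rule also classifies all 20 training samples correctly. A sample classified this way is only as reliable as the assumption that it comes from one of the two sampled soil types.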
3. Conclusions
Principal Components Analysis was applied to soil samples to reduce the dimensions of the data. The method is shown to be a fundamental and fully applicable tool, since it allows soil samples to be characterized and classified with precision, which leads to a better interpretation of the results.
Acknowledgments
Due acknowledgement is given to Springer Nature for permission to reproduce Table 1 from “Sources and behavior of arsenic and trace elements in groundwater and surface water in the Poopó Lake basin, Bolivian Altiplano” by Oswaldo Eduardo Ramos Ramos, Luis Fernando Cáceres, Mauricio Rodolfo Ormachea Muñoz, Prosun Bhattacharya, Israel Quino, Jorge Quintanilla, Ondra Sracek, Roger Thunvik, Jochen Bundschuh, Maria Eugenia Garcia, Environmental Earth Sciences, 66: 793 – 807, 2012.
Due acknowledgement is given to Revista Boliviana de Química for permission to reproduce the Table and the equation from “Chemometric evaluation of internal reference material (IRM) of agricultural soils in the two provincial municipalities of La Paz” by Rolando Mamani Quispe, Leonardo Guzmán Alegria, Jorge Chungara Castro, Oswaldo Eduardo Ramos Ramos, Revista Boliviana de Química, Vol. 39, No 4, pp 181 – 189, 2019; and “Análisis multivariable en la clasificación de suelos para la agricultura en el valle y Altiplano Boliviano” by Rolando Mamani Quispe, Oswaldo Eduardo Ramos Ramos, Jorge Chungara Castro, Leonardo Guzmán Alegría, Revista Boliviana de Química, Vol. 38, No 3, pp 126 – 132, 2021.
Due acknowledgement is given to José Antonio Bravo, Ph.D., Chief Editor of Revista Boliviana de Química, for the grammatical revision of the document.
References
1. Martín Q., Cabero M. T., de Paz Y. 2007. Tratamiento estadístico de datos con SPSS, pg. 328, España, Universidad de Salamanca. Ed. Thomson
2. Eduardo RRO, Fernando CL, Rodolfo OMM, Prosun B, Israel Q, Jorge Q, et al. Sources and behavior of arsenic and trace elements in groundwater and surface water in the Poopó Lake basin, Bolivian Altiplano. Environmental Earth Sciences. 2012;66:793-807
3. Eduardo Ramos Ramos Oswaldo. Geochemistry of trace elements in the Bolivian Altiplano – Effects of natural processes and anthropogenic activities. PhD Thesis, TRITA LWR PHD-2014:04. 2014
4. Rolando MQ, Leonardo GA, Jorge CC, Eduardo RRO. Chemometric evaluation of internal reference material (IRM) of agricultural soils in the two provincial municipalities of La Paz. Revista Boliviana de Química. 2019;39(4):181-189
5. Rolando MQ, Eduardo RRO, Jorge CC, Leonardo GA. Análisis multivariable en la clasificación de suelos para la agricultura en el valle y Altiplano Boliviano. Revista Boliviana de Química. 2021;38(3):126-132
6. Mongay FC. Quimiometría. España: Universitat de Valencia; 2005. p. 245 Ed. PUV