The isotopic analysis of precipitation provides useful information on a variety of hydrological and atmospheric processes. The dynamical characteristics of precipitation isotopes have been well investigated, but a systematic study of their statistical behavior seems to be lacking. We have performed the statistical analysis, basically the distribution characteristics of precipitation isotopes vis-a-vis rainfall data for specific regions. The probability distribution functions of precipitation isotopes have been calculated from local to global scales. It has been observed that the isotopic values, in general, followed a pattern that is similar to the normal distribution, though the rainfall distribution patterns are very different. Under certain circumstances, the isotopic distribution patterns closely resemble the normal distribution, implying a well-constrained moisture source contributing to precipitation. The distribution patterns of oxygen and hydrogen isotopes on continental and global scales show similar behavior. It was observed that the distribution patterns of primary isotopic variables (δ18O and δD) are not very sensitive to the outliers. On the contrary, the secondary parameter, d-excess, is very sensitive to outliers, which offers an effective means to quality control of the precipitation isotopic values.
- precipitation isotopes
- Indian monsoon
- probability distribution function
Systematic collection and analysis of the isotopic content of precipitation across the globe have been carried out under the GNIP (Global Network of Isotopes in Precipitation) framework to study the temporal and spatial variabilities of the environmental stable isotopes. A large number of precipitation isotope data, mostly on a monthly timescale, is available with GNIP. Various investigators have used the data in understanding a variety of atmospheric and hydrological processes [1, 2, 3, 4, 5, 6, 7, 8]. Specific statistical techniques have also been proposed to interpret the data [9, 10]. But one essential characteristic, viz. understanding the statistical behavior of the precipitation isotope data, did not receive much attention by the isotope hydrologists.
The probability distribution of atmospheric variables such as temperature, pressure, wind speed, precipitation, greenhouse gas concentrations, effective radiative forcing, aerosols etc. [IPCC 2013, ] is routinely used by the environmental scientists and operational meteorologists for different purposes. One of the primary applications is to assess the occurrence of extreme weather events or environmental conditions (see Chapter-1 of the IPCC Report ). Temperature variations usually exhibit a normal distribution pattern. Because of its symmetric behavior, it offers an intuitive method to assess the change in temperature characteristics for different periods or distinct spatial boundaries. For example, if the pattern shifts towards the right, say for a different time interval, then it would indicate an increase in mean temperature for that particular time interval (see Figure 1.8 in Ref: ). So the analysis of the distribution pattern of temperature over a long period provides an excellent means to assess the climate change.
Similarly, if the temperature distribution of a given region differs from the usual standard Gaussian pattern, then it may be concluded that the temperature in that particular region experiences different environmental controls. Unlike temperature, the moisture parameters such as rainfall, humidity, cloud size, cloud water content, etc. almost always deviate from normal distribution patterns . Their distribution characteristics provide useful information on weather and environmental conditions and hence are extensively studied by the meteorologists.
Several investigators studied the statistical distributions of rainfall over the Indian region as well as from other parts of the world. The use of the probability distribution function of precipitation was described by Sharma and Singh . Wither (2001).  and Nadarajah (2005).  examined the distribution characteristics of rainfall over sixteen locations in New Zealand and fourteen sites in Florida, respectively. They have also studied the maxima of rain throughout 1961–2001 for five areas in South Korea. Generalized extreme value distribution was fitted to those data to describe the anomalous distribution characteristic and, in turn, predict their future behavior. It has been shown that the annual maximum of daily rainfall in Japan closely followed the Weibull distribution pattern .
Ghosh et al.  studied the probability distribution of rainfall over Bangladesh. These authors observed that the generalized extreme value distribution provides a good fit to the rainfall data over Chittagong, Rajshahi, and Sylhet. However, the rainfall data from the Dhaka stations are best described in terms of Gamma distribution function .
In the Indian context, several people have reported the rainfall distribution characteristics. For example, Molay et al.  observed that the South West (S.W.) and North East (N.E.) monsoon rainfalls at individual representative stations in India followed the Gamma distribution pattern. Statistical analysis of rain for Uttarakhand, India, was carried out by Vikram and Jahangeer . These authors observed that the rainfall data covering 46% of all the districts is best described by the Weibull distribution function, followed by the Chi-squared and log-Pearson distribution patterns. Statistical properties of rainfall at Coimbatore in South India were discussed by Lavanya et al. .
The isotopic composition of precipitation, an essential attribute of the rainfall variability, is widely studied by the hydro-meteorologists for understanding a variety of processes. In the Indian context, one of the first comprehensive analysis was carried out based on the precipitation isotope data across the country by Bhishmkumar et al. . The local meteoric water lines were calculated for several locations. They show significant variations in slope and intercept across the continent. Several other works were undertaken to address a variety of issues related to hydrological and atmospheric processes [22, 23, 24, 25, 26]. One of the most intriguing aspects is the relation between rainfall amount and its isotopic values; the so-called amount effect has been examined for different regions and for varying time scales. Isotopic values are essentially derived from rainwater, so their physical properties are expected to mimic that of the rainwater. However, deviations were observed mostly in their dynamical behavior. For example, during the summer monsoon season (June to September) in India, the wind is primarily south-westerly. After the monsoon withdrawal in early October, the wind is pre-dominantly north-easterly. Because of this reversal in the wind and hence in the moisture source, the isotopic values of precipitation in southern peninsular India experience strong seasonality . But this characteristic is not well manifested in the rainfall pattern. One pertinent question may be posed: do the statistical distributions of precipitation isotopes also follow a similar pattern of the rainfall distribution? If it is not, what are the implications of having a different kind of distribution pattern? Do the distribution patterns show significant spatial variability? Since the isotopic values of proxy records are widely used for paleoclimatic investigation, especially the rainfall reconstruction, do the statistical distribution patterns of precipitation isotope have any implication in this regard? In this work, we characterize the probability distributions of precipitation isotopic values and endeavor to address the above-mentioned issues.
2. Data and methodology
Precipitation isotopic data on a daily scale has been collected from several locations in India. These sites are Port Blair in the Andaman Islands (data available: 2012–2018; [26, 28, 29, 30]), Minicoy Island in the Arabian Sea (2015–2018; ); Kolkata in east India (2015–2018; ); Darjeeling (2013–2018) and Tezpur in northeast India (2016–2018; ); Ahmedabad (2007)  and Pune in west India (2014–2018; [32, 33]), Bandipora in north India (July-2016 to June-2018; ) and Kozhikode in south India . Precipitation isotope data of a few sites in Bangladesh have also been used [29, 35]. Figure 1 shows the precipitation sampling sites and the core monsoon zone of India (shaded). Additionally, precipitation isotope data across the tropics have also been used to examine the statistical behavior of the isotopic data on a global scale. The data is available in .
Rain-gauge data have also been used to study the distribution pattern of rainfall. To have regional-scale rainfall variability, the TRMM (Tropical Rainfall Measuring Mission) satellite-derived gridded rainfall data available at 0.25° × 0.25° resolution  was used. We have also used the oxygen isotopic values of the speleothem that had been used to investigate the monsoon variability on the past timescale. Berkelhammer et al.  reported the oxygen isotopic time series of two speleothem samples belonging to the core monsoon zone (CMZ) of India. A composite δ18Ospeleothem record was prepared by them spanning from AD 625 to A.D. 2007. The CMZ consists of a wide area (approx. 18-27°N, 69-88°E; see Figure 1) in central India, and the rainfall in this region maintains a strong correlation with the all India summer monsoon rainfall . Hence the speleothem derived rain is believed to be representing the large scale rainfall pattern over India.
The isotopic ratio is expressed in δ notation and is defined as the relative difference between the heavy to light isotopic values (i.e., 18O/16O) of the sample and reference material. The difference is normalized with the isotopic ratio of the reference material and then multiplied by a factor of 1000. The isotopic value is denoted in δ notation and expressed in permil (‰). As an example, the oxygen isotopic value of precipitation is defined as follow: δ18O (‰) = [(18O/16O)precipitation − (18O/16O)reference]/[(18O/16O)reference] * 1000.
The reference material is the Vienna Standard Mean Ocean Water (VSMOW), whose value has been assigned to 0‰ (IAEA 2009) . In tropical regions, precipitation δ18O typically varies from 0 to −15‰.
A secondary parameter, d-excess  is commonly used to identify the source moisture. It is defined as follows: d-excess = δD − 8 * δ18O.
It is to be noted that the control mechanisms on δ18O or δD, and d-excess are not necessarily the same. Isotopic ratios (18O/16O and D/H) are controlled by the mass-dependent fractionation process. They typically show a strong correlation indicating that they are controlled by similar environmental processes . But, since d-excess is a difference between these two isotopic values, the individual fractionation effect is likely to be minimized in normal environmental conditions. The d-excess is mainly controlled by kinetic processes, say during evaporation of water from the seas, in which the environmental condition, such as relative humidity, plays a significant role. On the other hand, the evaporation of raindrops causes enrichment in heavier isotopes. But since the rates of escape of D and 18O differ, which is a function of their distribution coefficient, the d-excess would decrease. So intense evaporation would cause a significant change in d-excess values. Because of these physical processes, the distribution patterns of the isotopic values and d-excess may be different.
2.1 Skewness and kurtosis
Skewness (S) is a statistical parameter that is used to examine the symmetry of a distribution. A distribution is said to be symmetrical if the frequencies are symmetrically distributed about the mean . For a normal distribution, the S value is zero.
Kurtosis (K) is defined as the fourth standardized moment. It is a measure of the relative flatness of the distribution function. For a normal distribution K = 3, based on which excess kurtosis is used in which three is subtracted from the observed kurtosis . If the excess kurtosis is negative for a given distribution, then it implies that a particular distribution contains less number of extreme values compared to a dataset that is normally distributed. Analysis of kurtosis provides a useful means to identify extreme values. We will discuss how this parameter could be used to differentiate the extreme values (occurring naturally) from the outliers (the probable reason for an analytical artifact). To calculate S and K, both the rainfall and the isotopic data of all the years are separately clubbed together to form two combined time series. Separate analyses were done for local scale (Port Blair and Pune), regional scale (northeast India and Bangladesh), continental-scale (entire India and Bangladesh), and global scale (the whole tropics). The S and K values were calculated using the OriginPro software, where bin size is decided by default; Excel-based SKEW() and KURT() function were also used for this purpose.
The distribution pattern of the isotopic profiles is examined on a wide scale. Isotopic data from Port Blair were used to study the pattern on a local scale. Similarly, isotopic data from Pune and its surrounding region, representing a relatively larger spatial scale, were used for studying the distribution pattern. Precipitation isotopic values of a vast area in northeast India, including Bangladesh, were used to study the distribution pattern on a regional scale. Finally, a large dataset spanning approximately from A.D. 2013 to A.D. 2017 from around the tropics was used to examine the isotopic distribution pattern on a global scale.
3. Results and discussion
3.1 Distribution characteristics of precipitation isotopes
3.1.1 Site: Port Blair
Figure 2(a) and (b) show the probability distribution functions (PDF) of rainfall and its oxygen isotopic values, respectively. At the same time, Figure 2(c) and (d) show the PDFs of the hydrogen isotopic ratio (δD) and d-excess, respectively. The values of S, K, and the number of samples (N) for each distribution are shown in the respective panels. The normal distribution curve in each case, shown as a thin line, has been drawn using the mean and standard deviation of the respective distribution.
The Bay of Bengal region experiences cyclonic disturbances typically during the pre-monsoon and post-monsoon seasons. Heavy to very heavy rainfalls are received during these events, mostly during the post-monsoon season, which is associated with highly depleted oxygen isotopic values, occasionally registering as low as −17 to −18‰ . Hence, to avoid these extreme rainfall events and, in turn, the highly depleted isotopic values, isotopic distribution patterns for the summer monsoon season (May to September) have been redrawn. Figure 3 depicts the corresponding rainfall (a), δ18O (b), δD (c) and d-excess (d) distribution patterns.
The annual scale Port Blair δ18Oprecipitation shows a negatively skewed distribution pattern (Figure 2(b)). δ18Oprecipitation typically ranges from −14 to 2‰. A few high positive values up to +4‰ are observed, which mostly occur during the relatively dry environment prior to monsoon onset in May. But towards the negative side, low values up to −18‰ are observed, which make the distribution pattern left asymmetric, yielding a skewness of −1.16, and kurtosis: 2.78. Incidentally, the rainfall distribution pattern is highly positively skewed, with S and K values being 3.04 and 12.63, respectively (Figure 2(a)).
A high value of kurtosis means the occurrence of intermediate values (falling within ±1 sigma) is less likely, but the extreme values are more likely. Hence, in this case, the high positive kurtosis indicates a significant amount of high to very high rainfall (≥ 100 mm/day) events populating in the tail region of the distribution. According to the amount effect concept, high rainfall produces more depleted isotopic values. Hence the distribution patterns are expected to show opposite behavior, somewhat mirror image to each other.
We have also examined the distribution patterns during the monsoon season (May to Sep) in order to eliminate the effect of high rainfall events arising due to intense cyclonic activities. Figure 3 shows the distributions. In the case of rainfall distribution (Figure 3(a)), the K value decreases marginally (12.63 to 11.61). Still, the same for δ18Oprecipitation decreases significantly from 2.78 to 0.37, implying the effect of the heavy rainfall events on the isotopic value is eliminated. δD also shows a similar behavior, though the K values of d-excess in both cases are similar (2.43 and 2.58, respectively). The residual kurtosis in the d-excess cases differs only marginally, i.e., by −0.57 and −0.42, respectively, from the normal distribution. These imply the d-excess distribution pattern to have a close resemblance with a normal distribution and be less sensitive to the intense rainfall events, unlike δ18O and δD.
3.1.2 Site: Pune
Pune belongs to a semi-arid region in western India. Being in the leeward side of the Western Ghats mountain range, the area is characterized by rain shadow region. As a result, the raindrop evaporation is significant, which makes the isotopic values relatively enriched in heavier isotopes. The distribution patterns of the above mentioned four parameters are shown in Figure 4. The S and K values of precipitation for the Pune region also show high values, 3.38 and 13.69, respectively (Figure 4(a)) comparable to that of the Port Blair values, implying a positively asymmetric distribution pattern. δD and δ18O show moderate values of skewness and kurtosis (Figure 4(b) and (c), respectively).
3.1.3 Northeast India and Bangladesh
This region contains a large forest area and receives one of the highest rainfall amounts in the world. As a result, the climate is humid subtropical. December to January is dry season, and little rain is received during this time. However, a significant amount of rainfall is received during the pre-monsoon season (Mar to May) due to the thunderstorm activities. The monsoon onset takes place in early June and continues until early October. October to December is considered as the post-monsoon season. The precipitation isotope analysis for this region has been carried out by several investigators. We have used those data available since A.D. 2007. Additionally, we have analyzed samples from two sites, Darjeeling and Tezpur. The records are available from A.D. 2013 to A.D. 2018 for Darjeeling and A.D. 2015 to A.D. 2018 for Tezpur. All the data have been compiled, and frequency distribution plots of δ18O and d-excess were generated. Since δD has a distribution pattern very similar to the pattern of δ18O, their distribution patterns are not shown. Similarly, the rainfall distribution patterns are similar to other sites, and hence those plots are not shown here. Figure 5 shows the season-wise distribution patterns of δ18O. The plot for d-excess has been shown only for the pre-monsoon season.
The distribution patterns of δ18Oprecipitation for this region have been shown (Figure 5) for three seasons. Figure 5a shows the δ18Oprecipitation pattern for the pre-monsoon (Mar to May) season, including Jan and Feb. The low value of S (0.48) yields a near-normal distribution of δ18Oprecipitation. And a very low K value (0.002) means a near absence of extreme values. Similarly, d-excess also shows nearly a symmetric distribution (Figure 5(b)), and a low K value implies the frequency of extreme d-excess is negligible. Interestingly, other seasons, that is, monsoon (Figure 5(c)) and post-monsoon (Figure 5(c)) season, do not show such behavior in the distribution pattern. In both cases, the distribution patterns are negatively skewed. The reason for such kind of behavior may be explained as follow.
Northeast India receives a considerable amount of rainfall during the pre-monsoon time, mainly due to the thunderstorm activities. Choudhury et al.  demonstrated that a large amount of recycled water vapor, mainly generated by the evapotranspiration processes, contributes to rainfall during this time. The contribution of marine vapor during this time is negligibly small. Hence the moisture source is reasonably well constrained, implying that the isotopic variability of precipitation would be limited. Mean and standard deviation are −0.86 and 1.99‰, respectively. The extreme values −6 and + 5‰ are nearly equidistant from the mean value. Hence, a near-normal distribution of δ18Oprecipitation results in this season.
On the other hand, the d-excess for this season also shows a very similar behavior. It possesses low S (−0.03), and K (0.047) values, and the mean value (12.86‰) is equidistant from the extreme values of −10 and + 35‰; hence a near-perfect normal distribution plot is produced. During the advent of the monsoon season in early June, a huge amount of marine vapor sweeps across this region. With the progress of the monsoon, the moisture source is also believed to be shifted towards the South Bay of Bengal . These long-range transports cause further depletion in precipitation isotopic values. On the other hand, though the extent of recycled vapor is reduced considerably, it still contributes to some extent. Because of these multiple vapor sources, the isotopic values of precipitation suffer a wide variability.
Additionally, intense convective activities during the monsoon season cause large depletions in isotopic values. As a result, the precipitation isotopes may have values lower than −20‰ (Figure 5(c)). This characteristic feature drives the isotopic distribution pattern and makes it asymmetric. It is also known that intense convective activities cause high d-excess  due to the feeding of recycled moisture into the cloud system . The northeast region experiences such kind of convective activities quite often during the monsoon season but not much during the pre-monsoon season . This process is also likely to contribute to yielding a relatively high K value for the monsoon season. The post-monsoon season, on the other hand, gets moisture from multiple sources , which also makes the corresponding distribution pattern skewed in the negative direction.
3.2 Continental and global scales
Precipitation isotopic distribution patterns (δ18O and d-excess) on a continental scale covering India and Bangladesh are presented in Figure 6(a,b). The lower panel plots Figure 6(c,d) show the same approximately for the entire tropical region.
In this context, it is to be noted that the precipitation isotope data were obtained from nineteen sites across the tropics, including an extra-tropical area Nagoya, in Japan. Description of the sites and data are available in . When the K value of d-excess was calculated for the entire dataset, an unusually high value (117) was obtained. A careful analysis of individual sites revealed that one particular location, Mulu, in Malaysia, was mainly responsible for producing such a high value. Unrealistically high values of d-excess (>50) were detected in several instances. These values were not associated with extreme rainfall events; average rainfall during these events was 24.9 mm/day. A running kurtosis analysis was carried out on the d-excess dataset. A total of 16 events (out of 1091 data points) were found, which showed sharp changes in K value. When these anomalous data points were excluded, the K value dropped to 1.44. So the anomalous values may have occurred due to analytical error. Because of this reason, the Mulu site data set were excluded. Similarly, two events in Kuala Terengganu, Malaysia, and one event at Windhoek, Namibia, showed anomalous K values. Hence they were also excluded. Recalculation yielded a K value of 3.58. In this context, it may be noted that extreme rainfall events do not necessarily create extreme value in d-excess. This is apparent in the case of Port Blair data. When the Oct-Dec data series were excluded as this season is prone to intense cyclonic disturbances, the K value changed only from 2.78 to 2.58. Hence, the analysis of kurtosis may be used in separating the extreme events from the outliers caused by the experimental artifact. However, more study is required to ascertain this characteristic.
The continental-scale distribution pattern of δ18O precip (India and Bangladesh) and that of the global tropics show similar characteristics. In both cases, the δ18Oprecipitation shows a similar range, approx +8 to −20‰. But skewness and kurtosis in the former case (−1.06, 0.53) differ slightly relative to the global case (−0.64, 0.69). On the other hand, the d-excess ranges approximately from −20 to +30‰ in both cases. Low S values illustrate symmetrical distribution patterns. But the K values differ to some extent. The residual kurtosis in the case of India and Bangladesh (−0.34) is lower than that for the global tropics (+0.58). This means the d-excess distribution for the global case is relatively more populated by extreme values. Since d-excess primarily depends on the moisture source, a high proportion of extreme values means rapid changes in moisture sources, or in other words, rapid changes in circulation patterns. This is likely for the tropical region, as the tropics basically serve as the meeting point of the inter-hemispheric circulations, which is the inter-tropical convergence zone. On a shorter spatial scale, such as the Indian subcontinent, the effect is expected to be moderate.
3.3 Paleoclimatic implication
Here we discuss if there is any implication in analyzing the statistical behavior of the precipitation isotope variation. Isotopic analysis of certain natural archives is widely used to reconstruct the past monsoon rainfall variability. Speleothems or cave deposits have been extensively used for this purpose [37, 47, 48, 49, 50]. The basic premise of this approach is that the isotopic signal of the ancient rainwater is preserved in the oxygen isotopic composition of calcium carbonate of the speleothem samples. But, the rainwater that percolates through the bedrock may have a residence time of a few years, typically 7–8 years. Hence the rainfall signal that is obtained from speleothem isotopic analysis represents an integrated effect of a few years. So the large variability of precipitation isotope would be attenuated, and a mean signal would be obtained. Hence, the variability of δ18Ospeleothem is expected to be much less than that of the δ18Oprecipitation. Similarly, the extreme values in δ18Ospeleothem are also likely to be low because of the spatial and temporal integrations of the precipitation isotopic values. As a result, the δ18Ospeleothem would possess low skewness and kurtosis values to yield a near-normal distribution. We have examined this hypothesis using a speleothem isotopic record. Berkelhammer et al.  produced a 1400 years record of δ18Ospeleothem belonging to the core monsoon zone of India and demonstrated that this record was able to make monsoon variability for the last 1400 years. The distribution pattern of δ18Ospeleothem is shown in Figure 7(a). A near-zero skewness (0.025) means the distribution pattern is highly symmetric.
Similarly, a low kurtosis value indicates that the central and the intermediate values mainly populate the distribution, while the extreme values play a minor role. In other words, the mean rainfall pattern is well captured. Now, a question may be raised: what would be an ideal precipitation isotope distribution pattern of the CMZ that would reliably be recorded by the speleothem samples? Obviously, if the anticipated pattern possesses a near-normal distribution, the signal would be expected to be accurately recorded by the speleothem sample. But, if the same pattern deviates widely, meaning high values of skewness and kurtosis characterize it, then the signal recorded in the speleothem sample could be noisy. So to get a realistic rainfall reconstruction from a proxy sample, the isotopic values of the rainfall in the neighboring region are expected to have near-normal distribution patterns. We have collected precipitation samples at Sagar and Nagpur approximately for the time period of 2016 to 2019 and used precipitation isotope data of Ahmedabad , all of whom belong to the CMZ of India (Figure 1). The δ18Oprecipitation distribution pattern of these samples is shown in Figure 7(b). The low values of skewness and kurtosis imply that the pattern is indeed near normal. Hence the choice of speleothem sample from the CMZ seems to serve well as far as the rainfall reconstruction is concerned.
4. Summary and conclusions
Statistical analysis of precipitation isotopes was carried out on a large number of samples with a variety of spatial scales. Precipitation samples almost always showed a positively skewed pattern with high values of skewness and kurtosis, typically 2 to 3 and 11 to 13, respectively. On the other hand, the oxygen isotopic values of precipitation mostly showed small negative skewness (typically −0.5 to −1.0). The kurtosis value varied from 0 to 4. The secondary parameter d-excess is characterized by low negative skewness (−1 to 0), and its kurtosis ranges from 0.5 to 5. A comparison of these statistical parameters among precipitation and its isotopic values reveals that precipitation pattern is always skewed and contains a significant amount of extreme values. But the isotopic distribution patterns are near-symmetrical, and they are not very sensitive to extreme rainfall events. This may mean that the isotopic values are less constrained by environmental forcing in comparison to the precipitation. Hence the mean values of precipitation isotopes may be estimated/predicted with better precision than the precipitation. This may have application in paleoclimatic reconstruction.
It was observed that the kurtosis value of d-excess is very sensitive to outliers that may have occurred due to analytical error, but similar behavior was not found in the case of δ18O or δD. Hence, the distribution parameters may offer a means to better quality control of the precipitation isotope data.
One of the potential applications of this technique is to study the interannual variability of monsoon rainfall. If a significant change is found especially in K value of d-excess distribution pattern in the year to year rainfall, then that change could imply different dynamical control and hence the monsoon circulation.
IITM is fully supported by the Earth System Science Organization of the Ministry of Earth Science, The Govt. of India. We thank the Director, IITM and R. Krishnan, Executive Director, Centre for Climate Change Research, IITM, for their support and encouragement.
Conflict of interest
The authors declare no conflict of interest.