Theoretical and experimental values of the scaling law for the first order moment for different values of the cross-correlation.
Water is essential to all forms of life. The development of humanity is associated with the use of water, and nowadays the constant availability of water and the satisfaction of water demand are fundamental requirements in modern societies. Although water seems to be abundant on our planet, fresh water is not an inexhaustible resource and has to be managed in a rational and sustainable way. The demand for water is dynamic and influenced by various factors, from geographic, climatic and socioeconomic conditions to cultural habits. Even within the same neighbourhood, the user-specific water demand is elastic to price, condition of the water distribution system (WDS), air temperature, precipitation, and housing composition (regarding only residential demand in this case). On top of all these factors, demand varies during the day and the week.
Traditionally, for WDS modelling purposes, water demand is considered to be deterministic. This simplification worked relatively well in the past, since most studies on water demand were conducted only with the objective of quantifying global demands, both at present and in the long term. With the development of optimal operating schedules of supply systems, hourly water demand forecasting became increasingly important. Moreover, taking into consideration all the aforementioned factors that influence water use, it is clear that demand is not deterministic, but stochastic. Thus, more recently, in order to guarantee the requested water quantities with adequate pressure and quality, studies began to focus on instantaneous demands and their stochastic structure.
1.1. Descriptive and Predictive Models for Water Demand
The first stochastic model for (indoor) residential water demands was proposed by Buchberger and Wu. According to the authors, residential water demand can be characterized by three parameters: frequency, duration and intensity, which in turn can be described by a Poisson rectangular pulse (PRP) process. The adopted conceptual approach is similar to basic notions of queuing theory: a busy server draws water from the system at a random but constant intensity, for a random period of time. Residential demands were subdivided into deterministic and stochastic servers. Deterministic servers, including washing machines and toilets, produce pulses which are always similar. Stochastic servers, like water taps, instead produce pulses with great variability, and their duration and intensity are independent. The PRP process found to best describe water demand is non-homogeneous, i.e., its pulse frequency is not constant in time. Different authors used real demand data to assess the adequacy of the non-homogeneous PRP model, achieving good results. Moreover, the PRP model was confirmed to allow the characterization of the spatial and temporal instantaneous variability of flows in a network, unlike the traditional models that use spatial and temporal averages and neglect the instantaneous variations of demand. One drawback of the rectangular pulse based models is that the total intensity is not exactly equal to the sum of the individual intensities of overlapping pulses, due to the increased head loss caused by the increased flow. This problem can, however, be solved by introducing a correction factor. The daily variability of demand represents another drawback of the PRP model, since it can invalidate the hypothesis that pulses arrive following a time-dependent Poisson process.
One possible solution to this problem is to treat the time-dependent non-homogeneous process as a piecewise homogeneous process, by dividing the day into homogeneous intervals. Another solution consists in using an alternative demand model: the cluster Neyman-Scott rectangular pulse (NSRP) model, proposed by Alvisi. The model is similar to the PRP model, but the total demand and the frequency of pulses are obtained in different ways. In the PRP model the total water demand follows a Poisson process resulting from the sum of the single-user Poisson processes, with a single arrival rate. In the NSRP model, a random number of individual demands (or elementary demands) are aggregated in demand blocks. The origin of the demand blocks is given by a Poisson process, with a certain rate between subsequent arrivals. The temporal distance between the origin of each elementary demand and the origin of its demand block follows an exponential distribution with a different rate. The variation of these parameters during the day reflects the cyclic nature of demands. A good approximation of the statistical moments for different levels of spatial and temporal aggregation was achieved; however, the variance of demand is underestimated at higher levels of spatial aggregation.
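The rectangular pulse idea underlying both models can be illustrated with a minimal sketch of a homogeneous PRP process at a single user. The function name, the exponential distributions chosen for duration and intensity, and all parameter values are assumptions made for illustration only; they are not taken from the cited works. Overlapping pulses are simply summed, without the head-loss correction factor mentioned above.

```python
import numpy as np

def simulate_prp(rate_per_s, mean_duration_s, mean_intensity_lps, horizon_s, rng):
    """Return a 1-second demand series (L/s) from a homogeneous PRP sketch."""
    # The number of pulses in the horizon follows a Poisson distribution.
    n_pulses = rng.poisson(rate_per_s * horizon_s)
    # Given the count, arrival times of a homogeneous Poisson process are uniform.
    starts = rng.uniform(0.0, horizon_s, size=n_pulses)
    # Duration and intensity are drawn independently (exponential is an assumption).
    durations = rng.exponential(mean_duration_s, size=n_pulses)
    intensities = rng.exponential(mean_intensity_lps, size=n_pulses)

    demand = np.zeros(horizon_s)
    for start, duration, intensity in zip(starts, durations, intensities):
        first = int(start)
        last = min(horizon_s, int(np.ceil(start + duration)))
        # Overlapping pulses are summed (no head-loss correction).
        demand[first:last] += intensity
    return demand

rng = np.random.default_rng(0)
# Hypothetical values: one pulse every 10 minutes on average, 60 s long, 0.1 L/s.
demand = simulate_prp(rate_per_s=1 / 600, mean_duration_s=60.0,
                      mean_intensity_lps=0.1, horizon_s=3600, rng=rng)
print("mean demand (L/s):", demand.mean())
print("variance (L/s)^2:", demand.var())
```

A non-homogeneous variant could be approximated by calling the function separately on each homogeneous interval of the day with its own arrival rate, in line with the piecewise treatment described above.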
The aforementioned models are mainly descriptive. More recently, Blokker and Vreeburg developed a predictive end-use model, based on statistical information about users and end uses, which is able to forecast water demand patterns at small temporal and spatial scales. In this model, each end use is simulated as a rectangular pulse with specific probability distribution functions for the intensity, duration and frequency, and a given probability of use over the day. End uses are divided into different types (bath, bathroom tap, dish washer, kitchen tap, shower, outside tap, washing machine, WC). The statistical distribution of the frequency of each end use was retrieved from survey information from the Netherlands. The duration and intensity were determined partly from the survey and partly from technical information on water-using appliances. From the retrieved information, a diurnal pattern could be built for each user. Users represent a key point in the model and are divided into groups based on household size, age, gender and occupation. Simulation results were found to be in good agreement with measured demand data. The End-Use model has also been combined with a network solver, obtaining good results for the travel times, maximum flows, velocities and pressures.
The PRP and the End-Use models were compared against data from Milford, Ohio. The results showed that both models compare well with the measurements. The End-Use model performs better when simulating the demand patterns of a single-family residence, while the PRP model describes more accurately the demand pattern of several aggregated residences. The main difference between the models is the number of parameters they use: the PRP model is a relatively simple model that has only a few parameters, while the End-Use model has a large number of parameters. However, the End-Use model is very flexible with respect to the input parameters, which also have a clearer physical meaning and are hence more intuitive to calibrate. The PRP model describes the measured flows very well, and many mathematical deductions can be made from the analytic description it provides. Thus, one can classify the PRP model as a descriptive model with considerable potential to provide insight into some basic elements of water use, such as peak demands and cross-correlations. The End-Use model is a Monte Carlo type simulation that can be used as a predictive model, since it produces very realistic demand patterns. The End-Use model can be applied in scenario studies to show the result of changes in water-using appliances and human behaviour. Possible improvements to the model include the incorporation of leakage, the consideration of demands as a function of the network pressure and the application of the model outside the Netherlands. Li studied the spatial correlation of demand series that follow PRP processes. It was verified that while time-averaged demands that follow a homogeneous PRP process are uncorrelated, demands that follow a non-homogeneous PRP process are correlated, and that this correlation increases with spatial and temporal aggregation. A similar conclusion about the correlation was reached by Moughton from field measurements.
1.2. Uncertainty and reliability-based design of water distribution systems
The problem of WDS design consists in defining improvement decisions that optimize the system for given objectives. As mentioned above, in the earliest works on the optimal design of WDS, input parameters such as water demand were considered deterministic, often leading to under-designed networks. A robust design, allowing a system to remain feasible under a variety of values that the uncertain input parameters can assume, can only be achieved through a probabilistic approach. In a probabilistic analysis the input parameters are considered to be random variables, i.e., the single values of the parameters are replaced with statistical information that describes the degree of uncertainty about the true value of the parameter. The outcomes, like nodal heads, are consequently also random variables, which makes it possible to express the reliability of the network.
Uncertainty in demand and pressure heads was first explicitly considered by Lansey. The authors developed a single-objective chance-constrained minimization problem, which was solved using the generalized reduced gradient method GRG2. The results showed that higher reliability requirements were associated with higher design costs when one of the variables of the problem was uncertain.
Xu and Goulter proposed an alternative method for assessing reliability in WDS. The mean values of pressure heads were obtained from the deterministic solution of the network model. The variance values were obtained using the first-order second moment (FOSM) method. The probability density function (PDF) of nodal heads defined by these mean and variance values was used to estimate the reliability at each node. The approach proved to be suitable for demands with small variability. Kapelan developed two new methods for the robust design of WDS: the integration method and the sampling method. The integration method consists in replacing the stochastic target robustness constraint (minimum pressure head) with a set of deterministic constraints. For this purpose it is necessary to know the mean and standard deviation of the pressure heads. However, since pressure heads depend on the demands, it is not possible to obtain the values of the standard deviations analytically. Approximations of the standard deviations are obtained by assuming the superposition principle, which makes it possible to estimate the contribution of the uncertainty in demand to the uncertainty of pressure heads. The sampling method is based on a general stochastic optimization framework, that is, a double-loop process consisting of a sampling loop within an optimization loop. The optimization loop finds the optimal solution, and the sampling loop propagates the uncertainty in the input variables to the output variables, thus evaluating the potential solutions.
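The difference between a first-order second moment estimate and a sampling-based estimate can be sketched on a toy problem. The single-pipe head relation, the source head, the resistance coefficient and the demand statistics below are hypothetical values chosen for illustration; only the flow exponent 1.852 is the standard Hazen-Williams value. This is a minimal sketch, not the procedure of the cited authors, who worked with full network models.

```python
import numpy as np

H0 = 50.0          # hypothetical source head (m)
K = 0.02           # hypothetical pipe resistance coefficient
EXPONENT = 1.852   # Hazen-Williams flow exponent

def head(q):
    """Nodal head downstream of a single pipe carrying demand q (toy model)."""
    return H0 - K * q ** EXPONENT

def fosm(mean_q, std_q):
    """First-order second moment: linearize head() around the mean demand."""
    mean_h = head(mean_q)
    dh_dq = -K * EXPONENT * mean_q ** (EXPONENT - 1.0)
    return mean_h, abs(dh_dq) * std_q

def sampling(mean_q, std_q, n_samples, rng):
    """Sampling loop: propagate normally distributed demands through head()."""
    q = rng.normal(mean_q, std_q, size=n_samples)
    q = np.clip(q, 0.0, None)  # demands cannot be negative
    h = head(q)
    return h.mean(), h.std()

rng = np.random.default_rng(7)
fosm_mean, fosm_std = fosm(mean_q=20.0, std_q=2.0)
mc_mean, mc_std = sampling(mean_q=20.0, std_q=2.0, n_samples=100_000, rng=rng)
print("FOSM    :", fosm_mean, fosm_std)
print("Sampling:", mc_mean, mc_std)
```

With a small demand variability the two estimates stay close, which is consistent with the remark that the FOSM approach is suitable for demands with small variability; increasing `std_q` in this sketch makes the linearization progressively less accurate.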
The aforementioned optimization problems are formulated as constrained single-objective problems, resulting in only one optimal solution (minimum cost) that provides a certain level of reliability. More recently, these optimization problems have been replaced with multi-objective problems. Babayan formulated a multi-objective optimization problem considering two objectives at the same time: the minimization of the design cost and the maximization of the robustness of the system. Nodal demands and pipe roughness coefficients were assumed to be independent random variables following some PDF.
At this point, all the aforementioned models assume nodal demands to be independent random variables. However, in real life demands are most likely correlated: demands may rise and fall due to the same causes. Kapelan introduced nodal demands as correlated random variables into a multi-objective optimization problem. The authors verified that the optimal design solution is more expensive when demands are correlated than the equivalent solution when demands are uncorrelated. A similar conclusion was reached by Filion. These results indicate that assuming uncorrelated demands can lead to less reliable network designs. Thus, even though it increases the complexity of optimization problems, demand correlation should always be taken into account in the design of WDS.
The robust design of WDS has gained popularity in recent years. Researchers have been focusing on methods and algorithms to solve the stochastic optimization problems, and great improvements have been made in this respect. However, the quantification of the uncertainty itself has not been addressed. Values for the variance and correlation of nodal demands are always assumed, and no attention is paid to properly quantifying these parameters. The optimization problems could be significantly improved if more realistic values for the uncertainty were taken into account.
This work addresses the need to understand to what extent the statistical parameters depend on the number of aggregated users and on the temporal resolution at which they are estimated. It intends to describe these dependencies through scaling laws, in order to derive the statistical properties of the total demand of a group of users from the features (mean, variance and correlation) of the demand process of a single user. This work is part of the first author's PhD research, which aims at the development of descriptive and predictive models for water demand that provide insight into peak demands, extreme events and correlations at different spatial and temporal scales. In future stages, these models will be incorporated in decision models for design purposes or scenario evaluation. Through this approach, we hope to develop more realistic and reliable WDS design and management solutions.
2. Statistical characterization of water demand
Recent studies on uncertainty in water distribution systems (WDS) indicate that nodal demands are the most significant inputs in hydraulic and water quality models. The variability of water demand affects the overall reliability of the model, the assessment of the spatial and temporal distributions of the pressure heads, and the evaluation of water quality along the different pipes. These uncertainties assume a different importance depending on the spatial and temporal scales that are considered when describing the network. The degree of uncertainty becomes more relevant when finer scales are reached, i.e., when small groups of users and instantaneous demands are considered. Thus, for a correct and realistic design and management, as well as simulation and performance assessment, of WDS it is essential to have accurate values of water demand that take into account the variability of consumption at different scales. To this end, a thorough description of the statistical properties of the demand of the different groups of customers in the network, at specific temporal resolutions, is essential.
For a better understanding of this aspect, let us consider the distribution of the customers in a network. Figure 1 shows the network of a small town where the customers can be classified mainly as residential.
The most peripheral pipe serves the inhabitants of one single building. When moving upstream in the network, the number of customers increases, reaching a maximum of 1258 customers near the tank. As a consequence, the mean flow increases from the peripheral building to the tank. The increase of the variance of the flow is, however, less obvious. For larger networks and more densely populated towns, the difference between the number of customers close to and far from the tank, and consequently the variations of the mean and variance of the flow, are even more pronounced.
Another important aspect when modelling a network is the choice of an adequate temporal resolution. This choice depends on the characteristics of the available measurement instruments and on the type of analysis to be performed. When modelling the peripheral part of a network, characterized by a significant temporal variation of demand, it is important to adopt fine temporal resolutions, i.e., on the order of seconds. For the estimation of peak flows in design problems, Tessendorff suggests the use of different temporal resolutions in different sections of the network: a 15 second time interval for customer installation lines, two minutes for service lines, 15 minutes for distribution lines, and 30 minutes for mains and secondary feeders. The statistical properties of water demand are affected by the temporal resolution considered. The use of longer sampling intervals causes an inevitable loss of information about the signals, resulting in lower estimates of the variance [21, 22]. This aspect is particularly relevant at the peripheral pipes of the network which, as mentioned above, are characterized by large demand fluctuations. Therefore, understanding the spatial and temporal scaling properties of water demand is essential to build a stochastic model for water consumption.
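The loss of variance with longer sampling intervals can be illustrated numerically. The sketch below uses a synthetic auto-correlated signal (a first-order autoregressive process with a hypothetical coefficient of 0.9) as a stand-in for a 1-second demand record; it is not real demand data. The signal is averaged over intervals of 15 seconds, two minutes and 15 minutes, and the variance of each averaged series is computed.

```python
import numpy as np

def block_average(signal, step):
    """Average a 1-second series over consecutive blocks of `step` seconds."""
    n_blocks = len(signal) // step
    return signal[: n_blocks * step].reshape(n_blocks, step).mean(axis=1)

rng = np.random.default_rng(1)
phi = 0.9                       # hypothetical lag-1 auto-correlation
noise = rng.normal(size=86_400)  # one day at 1-second resolution
signal = np.empty_like(noise)
signal[0] = noise[0]
for t in range(1, len(noise)):
    # AR(1) recursion: each value keeps part of the previous one.
    signal[t] = phi * signal[t - 1] + noise[t]

# Variance of the series after averaging at each sampling interval (seconds).
variances = {step: block_average(signal, step).var() for step in (1, 15, 120, 900)}
for step, value in variances.items():
    print(f"sampling interval {step:4d} s -> variance {value:.3f}")
```

The rate at which the variance decays with the interval depends on the auto-correlation of the signal, which is why the correlation structure enters the scaling laws discussed later.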
Water demand can be described by a stochastic process whose value represents the water demand of a single user at a given time instant t. In order to estimate the statistical properties of water demand, it is necessary to have a historical series of observations, extended to a sufficiently large number of users of each type. From these data it is then possible to estimate the mean and variance of the process.
If the consumers are assumed to be of the same type, the properties of demand can be considered homogeneous in space, that is, independent of the particular consumer taken into consideration. Regarding the temporal variability, the stochastic process can only be assumed to be stationary in time intervals during which the mean stays constant. Once the length of this time interval, T, is established, it is possible to determine the temporal mean, μ, and variance, σ², of the demand signal of the single user, as follows:
For homogeneous and stationary demands, the expected values of the mean and variance, E[μ] and E[σ²], obtained from N observations, provide the mean and variance of the process.
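In an assumed notation, with q(t) denoting the demand signal of the single user (a symbol introduced here only for illustration), the temporal mean and variance over an interval of length T take the standard form:

```latex
\mu = \frac{1}{T}\int_{0}^{T} q(t)\,dt,
\qquad
\sigma^{2} = \frac{1}{T}\int_{0}^{T} \bigl[q(t)-\mu\bigr]^{2}\,dt
```

For a series sampled at discrete instants the integrals reduce to the usual sample mean and sample variance over the observations falling in the interval.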
2.1. Correlation between consumers
The definition of the mean and variance for each type of consumer is not enough for a complete statistical characterization of demand. In order to obtain a realistic representation of the demand loads at the different nodes in a network, which is essential for assessing the network performance under conditions as close as possible to the actual working conditions, the correlation between nodal demands cannot be ignored. This correlation can be expressed through the cross-covariance and cross-correlation coefficient functions.
The cross-covariance and the cross-correlation coefficient between a user of group A and a user of group B, during the observation period T, are expressed, respectively, as follows:
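For sampled series, these two quantities can be computed as in the following minimal sketch. The function names are illustrative, and the lag-zero, population (divide by N) form is an assumption; the synthetic series at the end are hypothetical and only serve to compare the result with NumPy's own `corrcoef`.

```python
import numpy as np

def cross_covariance(x, y):
    """Lag-zero cross-covariance of two equally long demand series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((x - x.mean()) * (y - y.mean()))

def cross_correlation(x, y):
    """Pearson cross-correlation coefficient: covariance over product of std."""
    return cross_covariance(x, y) / (np.std(x) * np.std(y))

rng = np.random.default_rng(3)
shared = rng.normal(size=3600)            # component common to both users
user_a = shared + rng.normal(size=3600)   # hypothetical demand of a user of group A
user_b = shared + rng.normal(size=3600)   # hypothetical demand of a user of group B

print("cross-covariance :", cross_covariance(user_a, user_b))
print("cross-correlation:", cross_correlation(user_a, user_b))
print("numpy corrcoef   :", np.corrcoef(user_a, user_b)[0, 1])
```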
As is well known, a WDS needs to guarantee minimum working conditions, that is, the minimum pressure requirements have to be satisfied at each node even under maximum demand loading conditions. If all the consumers in the network are of the same type, it seems reasonable to assume a perfect correlation between demands, and to simplify the analysis of the network by assigning the same demand pattern to all the consumers. The synchronism of demands is the worst scenario that can occur in a network, causing the widest pressure fluctuations at the nodes. The assumption of a perfect correlation for design purposes results in reliable networks, but it also requires larger pipe diameters, which consequently increases the cost of the network. In fact, as mentioned earlier, each consumer has their own demand pattern based on specific needs and habits, without knowing what other consumers are doing at the same time. This means that demand signals in real networks are correlated, but not synchronous. Thus, in order to obtain the optimum design of a network, it is essential to estimate the actual level of correlation between the consumers accurately. On the other hand, to estimate the spatial correlation between demands accurately, it is necessary to collect and analyse historical series, resulting in additional costs in the design phase. However, these additional costs will most likely be compensated by the resulting reduction in construction costs.
2.2. The scaling laws approach in modelling water demand uncertainty
Water demand uncertainty consists of both aleatory or inherent uncertainty, due to the natural and unpredictable variability of demand in space and time, and epistemic or internal uncertainty, due to a lack of knowledge about it. Hutton divides epistemic uncertainty into two types. The first type concerns the nature of the demand patterns, and the lack of knowledge about this variability when modelling WDS both in time and space. This uncertainty is defined as 'two-dimensional' uncertainty since it is composed of both aleatory and epistemic uncertainty. It can be reduced with extended and expensive spatial and temporal data collection or through the use of descriptive and predictive water demand models. The second type of epistemic uncertainty concerns the spatial allocation of water demand when modelling WDS.
Dealing with the 'two-dimensional' uncertainty when modelling WDS requires not only a complete statistical characterization of demand variability, but also the determination of the correlation among the different users and groups of users. The natural variability of demand can be expressed using probability density functions (PDF). A PDF is characterized by its shape (e.g. normal, exponential, gamma, among others) and by specific parameters like the population mean and variance. Thus, in order to represent uncertain water demand using a PDF, it is necessary to identify and estimate the values of these parameters. The consideration of different spatial and temporal aggregation levels induces changes in the PDF parameters, often leading to a reduction of the uncertainty. The auto-correlation and cross-correlation that characterize the water demand signals affect the extent to which the PDF parameters vary, and can introduce an additional sensitivity to the specific period of observation in question.
In order to understand the effects of spatial aggregation and sampling intervals on the statistical properties of demand, it is possible to develop analytical expressions for the moments (mean, variance, cross-covariance and cross-correlation coefficient) of the demand time series of n aggregated users, at a fixed sampling frequency, as a function of the moments of the single-user series sampled in the observation period T. These expressions are referred to as "scaling laws", and can be expressed as:
These expressions relate the expected value of the moment for n users in a given time interval to the expected value of the same moment for the single user in the same time interval, through the exponent of the scaling law, α, and a function that expresses the influence of both the sampling rate and the observation period.
The development of the scaling laws is based on the assumption that the demand can be described by a homogeneous and stationary process, which implies that the aggregated users are of the same type (residential, commercial, industrial, etc.), and that the statistical properties of demand, mean and variance, can be assumed constant in time. The scaling laws for the mean, variance, and lag-1 covariance were derived by Magini. The expected value of the total mean demand can be expressed as follows:
Here, E[μ1] is the expected demand value for the single user, or 'unit mean'. This expression shows that the mean demand increases linearly with the number of users, with a factor of proportionality equal to the expected value for the single user, and is independent of the sampling rate and observation period.
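One form consistent with the description of the scaling laws given above can be sketched as follows. The multiplicative structure, the generic moment symbol m, the sampling interval Δt and the function name f are assumptions introduced for illustration; the mean is the special case with exponent one and no dependence on sampling rate or observation period.

```latex
E\bigl[m_{n}\bigr] = E\bigl[m_{1}\bigr]\; n^{\alpha}\; f(\Delta t, T),
\qquad
E\bigl[\mu_{n}\bigr] = n\,E\bigl[\mu_{1}\bigr]
\quad (\alpha = 1,\; f \equiv 1)
```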
In order to estimate the expected value of the demand variance, it is necessary to consider the covariance function of the single-user demand at given spatial and temporal lags. The following expression is obtained (the mathematical passages can be found in the original derivation):
This expression involves the covariance function at a given lag and the space-time covariance function. It shows that the expected value of the sample variance of the n-users process depends on the correlation structure of the single-user demands. One term of the expression decreases as the period of observation increases, becoming negligible when the observation period is much longer than a parameter connected to the cross-correlation of the demands and similar to the scale of fluctuation for the auto-correlation of a single signal.
The term is independent from and, and assumes the following values:
Here, the quantities involved are the spatial cross-covariance between different single-user demands and the variance of the single user. For large values of the observation period T, equation (7) can be simplified into:
This equation represents the scaling law for the variance, neglecting the bias that can arise when short demand series (short observation periods) are used.
Introducing the Pearson cross-correlation coefficient, given by the covariance divided by the product of the standard deviations, and denoting by ρ the cross-correlation coefficient between each pair of single-user demands, the spatial covariance can be expressed as the product of ρ and the single-user variance, and equation (9) becomes:
If demands are perfectly correlated in space then ρ is equal to one, and equation (10) is simplified into:
If demands are uncorrelated in space then ρ is equal to zero, and equation (9) is simplified into:
Since the cross-correlation coefficient can assume values between 0 and 1, equations (11) and (12) represent the maximum and minimum expected values for the variance. Equation (9) can be simplified into a more generic form given by:
In conclusion, it can be stated that the variance of the consumption signal of a group of users, homogeneous in type, is proportional to the mean variance of the single user through a power of the number of users with exponent α, which varies between 1 and 2. The value of the scaling exponent depends on the structure of the spatial correlation, i.e., the correlation that exists between the different consumptions during the observation period: if demands are uncorrelated in space, the scaling law is linear; if demands are perfectly correlated in space, the scaling law is quadratic.
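The two limiting cases follow directly from the variance of a sum of n equally correlated variables. In the sketch below, q_i denotes the demand of user i and σ1² the single-user variance; these symbols are notation assumed here for illustration, and the bias term for short observation periods is neglected.

```latex
\begin{aligned}
\operatorname{Var}\Bigl(\sum_{i=1}^{n} q_i\Bigr)
  &= \sum_{i=1}^{n}\operatorname{Var}(q_i)
   + \sum_{i\neq j}\operatorname{Cov}(q_i,q_j) \\
  &= n\,\sigma_1^{2} + n(n-1)\,\rho\,\sigma_1^{2}
   = n\,\sigma_1^{2}\,\bigl[1+\rho\,(n-1)\bigr], \\
\rho = 1 &\;\Rightarrow\; n^{2}\,\sigma_1^{2},
\qquad
\rho = 0 \;\Rightarrow\; n\,\sigma_1^{2}.
\end{aligned}
```

For intermediate values of ρ the expression is not an exact power of n, so an exponent between 1 and 2 obtained by fitting a power law should be read as an effective value over the range of n considered.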
The variance function measures the reduction of the variance of the instantaneous signal when the sampling interval increases, as follows:
Here, the reference quantity is the variance of the instantaneous signal for the single user. Introducing the variance function in equation (13), the following is obtained:
Similarly, the expected value of the cross-covariance is given by:
Neglecting, as for the variance, the term that decreases with the observation period, the expected value of the cross-covariance between the demands of na aggregated users of group A and nb aggregated users of group B is given by:
Here, ρab is the expected Pearson cross-correlation coefficient between the single-user demands of the two groups, and σa and σb are the standard deviations of the single-user demands of groups A and B, respectively, at the sampling rate considered. The expected value of the cross-covariance increases with the product of the numbers of users of the two groups. In the particular case in which both groups have the same statistical properties, i.e., they belong to the same process, and assuming that na = nb = n, the scaling law of the cross-covariance becomes quadratic.
As a consequence, the expected value of the Pearson cross-correlation coefficient between the demands of aggregated users of group A and aggregated users of group B is given by:
This equation shows that the coefficient depends separately on the spatial aggregation levels of each group, na and nb, and not only on their product, as happens for the cross-covariance. If na = nb = n, this equation becomes:
From equation (19) it is possible to observe that the expected value increases with the number of users, reaching the following limit value as this number grows:
Since a correlation coefficient cannot exceed one by definition, the maximum value that the expected value of the cross-correlation coefficient between the single-user demands of groups A and B can assume is:
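Under the assumptions used for the variance (equally correlated users inside each group, long observation periods), the chain of results for two groups can be sketched as follows. The symbols Q_A and Q_B for the aggregated demands, and ρa, ρb, ρab for the within-group and between-group single-user correlations, are notation assumed here for illustration.

```latex
\begin{aligned}
\operatorname{Cov}(Q_A, Q_B) &= n_a\, n_b\, \rho_{ab}\, \sigma_a\, \sigma_b, \\
\rho_{AB} &= \frac{\operatorname{Cov}(Q_A, Q_B)}
                 {\sqrt{\operatorname{Var}(Q_A)\operatorname{Var}(Q_B)}}
           = \frac{\sqrt{n_a n_b}\;\rho_{ab}}
                  {\sqrt{\bigl[1+\rho_a (n_a-1)\bigr]\bigl[1+\rho_b (n_b-1)\bigr]}}, \\
n_a = n_b = n:\quad
\rho_{AB} &= \frac{n\,\rho_{ab}}
                  {\sqrt{\bigl[1+\rho_a (n-1)\bigr]\bigl[1+\rho_b (n-1)\bigr]}}
           \;\xrightarrow[\;n\to\infty\;]{}\;
           \frac{\rho_{ab}}{\sqrt{\rho_a\,\rho_b}}, \\
\rho_{AB} \le 1 &\;\Rightarrow\; \rho_{ab} \le \sqrt{\rho_a\,\rho_b}.
\end{aligned}
```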
From equation (21) it is also possible to observe that the Pearson cross-correlation coefficient between the aggregated users of group A and the aggregated users of group B depends on both the cross-correlations inside each group, ρa and ρb, and the cross-correlation between the groups, ρab. Therefore, it is interesting to investigate how these two aspects, one at a time, affect the expected value of the cross-correlation when the number of aggregated users increases, for a fixed sampling rate. To do so, let us first consider a fixed value of ρab and varying values of ρa and ρb. Figure 2 shows the graphical results for a fixed ρab and different pairs of ρa and ρb.
As expected, all the curves have a common starting point, since ρab is fixed. According to equation (19), a gradual flattening of the curves and a reduction of the shape ratio can be noticed when the product of ρa and ρb increases. Let us now consider a different case in which ρa and ρb are fixed and ρab varies. The results are shown graphically in Figure 3. The curves now have different starting points and equal shape ratios. Increasing ρab produces only an upward shift of the curves, extending their transient.
In the particular case in which both groups of users have the same statistical properties, i.e., they belong to the same process, and assuming na = nb = n, the scaling law for the cross-correlation coefficient, considering no differences in the sampling time intervals, is:
From equation (22) it is clear that the cross-correlation coefficient increases with the number of aggregated users, tending to one. The higher the cross-correlation coefficient between the single-user demands, the sooner this limit value is approached.
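The behaviour described in this section can be explored numerically with a small helper. The function below is a sketch that assumes equally correlated users inside each group and long observation periods; its name and the parameter names (`rho_ab` for the between-group correlation, `rho_a` and `rho_b` for the within-group correlations) are illustrative choices, not notation confirmed by the source.

```python
import numpy as np

def aggregated_cross_correlation(n_a, n_b, rho_ab, rho_a, rho_b):
    """Expected cross-correlation between two aggregated groups of users."""
    numerator = np.sqrt(n_a * n_b) * rho_ab
    denominator = np.sqrt((1.0 + rho_a * (n_a - 1)) * (1.0 + rho_b * (n_b - 1)))
    return numerator / denominator

users = np.array([1, 2, 5, 10, 50, 100, 1000])

# Case 1: fixed between-group correlation, different within-group pairs.
for rho_a, rho_b in [(0.1, 0.1), (0.2, 0.3), (0.5, 0.5)]:
    curve = aggregated_cross_correlation(users, users, 0.1, rho_a, rho_b)
    print(f"rho_a={rho_a}, rho_b={rho_b}:", np.round(curve, 3))

# Case 2: same process for both groups, single correlation value.
same_process = aggregated_cross_correlation(users, users, 0.1, 0.1, 0.1)
print("same process, rho=0.1:", np.round(same_process, 3))
```

All curves of the first case start from the between-group value at n = 1, while the same-process curve grows towards one as the number of aggregated users increases.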
3. Validation of the Analytical expressions
3.1. Synthetically generated signals: scaling laws for the mean and the variance
In order to confirm the analytical development reported in the previous section, the scaling laws were derived for groups of synthetically and simultaneously generated consumption signals. To this end, the Multivariate Streamflow model, with a normal probability distribution, was used. Each group was assumed to contain 300 consumption signals with 3600 realizations each, and the groups were distinguished by different values of the cross-correlation coefficient between their signals. The correctness of the procedure used to generate each demand series was tested by checking that the mean, the variance and the cross-correlation coefficient of the generated signals were equal to the input parameters of the model. Only small differences were observed (Table 1), which are explained by the fact that the generated demand series are realizations of a stochastic process and, consequently, their moments necessarily differ from the theoretical ones.
Once the single consumption signals of each group were generated, they were aggregated by randomly selecting one at a time, until a maximum of 100 aggregated consumption signals was reached. The first and second order moments, mean and variance, were calculated for each aggregation level. In order to obtain a result as general as possible, the same procedure was repeated 50 times, each time aggregating different users. The results are summarized in Tables 1 and 2, with reference to equation 5.
The results confirm the linear scaling for the first order moment and show that the variance increases with the spatial aggregation level according to an exponent that varies between 1 and 2. In theory, for spatially uncorrelated demands the scaling law is linear and for perfectly correlated demands the scaling law is quadratic.
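A comparable numerical experiment can be sketched in a few lines. The generator below is not the Multivariate Streamflow model used in the study: it is a simple common-factor construction, assumed here only because it produces normal signals with a prescribed pairwise correlation. The mean, the standard deviation and the correlation values are hypothetical, and the exponent is estimated by a log-log regression on a single random aggregation rather than on 50 repetitions.

```python
import numpy as np

def generate_signals(n_signals, n_steps, rho, mean, std, rng):
    """Normal signals with pairwise correlation rho (common-factor construction)."""
    common = rng.normal(size=n_steps)
    own = rng.normal(size=(n_signals, n_steps))
    z = np.sqrt(rho) * common + np.sqrt(1.0 - rho) * own
    return mean + std * z

def fit_variance_exponent(signals, max_users, rng):
    """Aggregate randomly chosen users one at a time and fit var ~ n**alpha."""
    order = rng.permutation(signals.shape[0])[:max_users]
    # Row k-1 of the cumulative sum is the total demand of k aggregated users.
    cumulative = np.cumsum(signals[order], axis=0)
    variances = cumulative.var(axis=1)
    users = np.arange(1, max_users + 1)
    slope, _intercept = np.polyfit(np.log(users), np.log(variances), 1)
    return slope

rng = np.random.default_rng(42)
alphas = {}
for rho in (0.0, 0.1, 1.0):
    signals = generate_signals(300, 3600, rho, mean=1.0, std=0.5, rng=rng)
    alphas[rho] = fit_variance_exponent(signals, max_users=100, rng=rng)
    print(f"rho = {rho}: fitted exponent = {alphas[rho]:.2f}")
```

The fitted exponent for an intermediate correlation depends on the range of aggregation levels used in the regression, so it should be compared only with values obtained over the same range.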
3.2. Synthetically generated signals: scaling laws for the cross-covariance
In this case pairs of aggregated consumption series, A and B, were obtained by randomly selecting among pairs of the previously generated groups of signals. Different values of the product, where is the number of signals in group A and the same number in group B, were considered, up to the maximum value. Each aggregation process was characterized by the cross-correlation value between the single signals in the same group and the cross-correlation value between the single signals of the two native groups. The cross-covariance was computed for the different aggregation levels and the scaling law were derived for each process. The results are summarized in Table 3 with reference to equation 17, considering and α as the exponent of the product
The results confirm that α is always equal to one. However, in this case the scaling does not refer to the number of aggregated users but to the product of the numbers of users in the two groups, and thus the law is not linear but quadratic in the number of users. A similar approach was also applied to the particular case in which all the consumptions are homogeneous.
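The reason why an exponent of one on the product corresponds to a quadratic law can be seen from the bilinearity of the covariance. The worked step below is a sketch under the assumption of homogeneous users. The symbols are introduced here for illustration and are not taken from the original equations: n_a and n_b are the numbers of users in groups A and B, σ_a and σ_b the single-user standard deviations, and ρ_ab the cross-correlation between a user of A and a user of B.

```latex
\operatorname{Cov}\!\left(\sum_{i=1}^{n_a} a_i,\ \sum_{j=1}^{n_b} b_j\right)
  = \sum_{i=1}^{n_a}\sum_{j=1}^{n_b} \operatorname{Cov}(a_i, b_j)
  = n_a\, n_b\, \rho_{ab}\, \sigma_a\, \sigma_b ,
\qquad
n_a = n_b = n \;\Rightarrow\;
\operatorname{Cov} = n^{2}\, \rho_{ab}\, \sigma_a\, \sigma_b .
```

The cross-covariance is therefore proportional to the first power of the product n_a n_b and, when both groups have the same size n, to n².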
3.3. Real consumption data: scaling laws for the mean and the variance
The parameters of the scaling laws were also derived for a set of real demand data. The indoor water use demand series of 82 single-family homes, with a total of 177 inhabitants, in a building belonging to the IIACP (Italian Association of Council Houses) in the town of Latina were considered [29, 30]. The apartments are inhabited by single-income families belonging to the same low socioeconomic class. The daily demand series of the 82 users on four different days (4 consecutive Mondays) were considered. For each user, the demand series of the different days can be regarded as different realizations of the same stochastic process. In this way the number of customers was artificially extended to about 300, while preserving the homogeneity of the sample. The temporal resolution of each time series is 1 second.
|Time||E[μ_1]||E[σ_1^2]||α_var|
|95% confidence limits||0.082||0.759||0.024|
The series were divided into time periods of 1 hour to guarantee the stationarity of the process. Table 5 reports the estimated expected values of the mean and the variance of the unit user, together with the exponent α of the scaling law for the variance. The corresponding exponents for the mean were always trivially equal to 1. The first six hours of the day and the last one were excluded from these results because consumption during the night hours is very small and its statistics are therefore of little significance. The mean was observed to scale linearly with the number of customers. In contrast, the variance shows a slight non-linearity with the number of users. The average daily value of the exponent α is 1.1, which indicates a very weak correlation between the considered users.
3.4. Real consumption data: scaling laws for the cross-covariance and cross-correlation coefficient
For consumption signals belonging to homogeneous users, equation 23 is valid and a quadratic scaling law for the cross-covariance should be expected. This behaviour was confirmed by the measured data for all the time intervals considered. Figure 4 shows the scaling law of the consumption signals between 11:00 and 12:00.
The cross-correlation coefficient between the single-user signals was low, always less than 0.05, but it increased noticeably with the number of aggregated users, as expected according to equation 22. For groups of 150 aggregated users the cross-correlation coefficient reached the values shown in Table 6. These results highlight the importance of evaluating the degree of cross-correlation at different levels of spatial aggregation. Even if the cross-correlation between single-user demand signals is relatively low and unlikely to affect the performance of a network significantly, it can increase greatly with the spatial aggregation of users and is no longer negligible at those larger scales.
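The growth of the cross-correlation with aggregation can be illustrated with a short calculation. The sketch below assumes that all users have the same variance, that users within a group share a pairwise correlation (rho_a or rho_b), and that users of different groups share a correlation rho_ab. The function name and the value ρ = 0.05 (the upper bound quoted above, used here for every pair) are assumptions for illustration, and the output is not meant to reproduce Table 6.

```python
import math


def aggregated_cross_correlation(n_a, n_b, rho_a, rho_b, rho_ab):
    """Cross-correlation between the summed demands of groups A and B.

    Assumes equal variance for every user, pairwise correlation rho_a
    inside group A, rho_b inside group B, and rho_ab between users of
    different groups. Variances are in units of the single-user variance.
    """
    var_a = n_a * (1.0 + (n_a - 1) * rho_a)
    var_b = n_b * (1.0 + (n_b - 1) * rho_b)
    cov_ab = n_a * n_b * rho_ab
    return cov_ab / math.sqrt(var_a * var_b)


rho = 0.05  # illustrative single-user cross-correlation
for n in (1, 10, 50, 150):
    print(n, round(aggregated_cross_correlation(n, n, rho, rho, rho), 3))
```

Under these assumptions the coefficient rises monotonically with the group size and approaches the limit rho_ab / sqrt(rho_a * rho_b), which is why a single-user value that looks negligible can become large at the nodal scale.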
4. Stochastic simulation of a network
To illustrate the effect of the uncertainty of water demands on the performance of a network, and in particular the effect of the level of correlation between consumptions on the resulting pressure heads, a simple network simulation was performed. The water distribution network of Hanoi (Fujiwara and Khang, 1990) was used for this purpose (Figure 5).
The data for the Hanoi network were taken from the literature (Fujiwara and Khang, 1990), and the pipe diameters were assumed to be those obtained by Cunha and Sousa (2001). The demand data from the literature were used to estimate the number of users at each node, assuming a single-user mean demand of 0.002 l/s. All the users in the network were assumed to be residential and to have the same characteristics. The standard deviation of demand was assumed to be 0.06 l/s. The Multivariate Streamflow model was used to generate synthetic stochastic demands with different levels of cross-correlation between the single users. The nodal demands were then introduced in the network and the performance of the network was simulated using EPANET. For each considered degree of cross-correlation between demands, 100 simulations were performed, resulting in a series of pressure heads for each node and for each correlation level.
The first aspect that emerges from the simulations is that the number of nodes that fail, i.e., that do not satisfy the minimum pressure requirements, increases with the degree of cross-correlation. Higher correlations imply more synchronous consumptions, which lead to pressure failures. Figure 6 illustrates this result.
Figure 6 makes clear that the cross-correlation between demands significantly affects the resulting pressure heads. The number of nodes that do not satisfy the minimum pressure requirements (counted over all nodes in the network and all 100 simulations) increases from 194 when the cross-correlation between demands is equal to 0.001 to 543 when the cross-correlation between demands is 0.999. In other words, the probability of failure increases from 6.3% to 17.5% between the minimum and maximum levels of cross-correlation that were considered.
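The quoted failure probabilities follow from dividing the failure counts by the number of node-simulation pairs. The check below assumes 31 demand nodes for the Hanoi benchmark (its 32 nodes minus the reservoir), a figure that is not stated in the text. The constant and function names are illustrative.

```python
DEMAND_NODES = 31   # assumption: Hanoi benchmark nodes, reservoir excluded
SIMULATIONS = 100   # simulations per correlation level, as stated in the text


def failure_probability(failed_node_count, nodes=DEMAND_NODES, runs=SIMULATIONS):
    """Share of node-simulation pairs below the minimum pressure requirement."""
    return failed_node_count / (nodes * runs)


low = failure_probability(194)   # cross-correlation 0.001
high = failure_probability(543)  # cross-correlation 0.999
print(f"{low:.1%} {high:.1%}")   # prints: 6.3% 17.5%
```

With 3100 node-simulation pairs, 194 failures correspond to about 6.3% and 543 failures to about 17.5%, in agreement with the values reported above.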
Another aspect that emerges from the simulations is the increase in the standard deviation of the pressure heads at each node of the network, which is illustrated in Figure 7.
The standard deviation of the pressure head observed at each node increases when the cross-correlation between demands increases. The average standard deviation of the pressure heads across the network is 1.35 m when the cross-correlation between demands is equal to 0.01, and 5.75 m when the cross-correlation is 0.99. This means that when the cross-correlation increases from 0.01 to 0.99, the standard deviation of the pressure heads increases by a factor of more than 4.
The obtained results clearly show that the level of cross-correlation between demands significantly affects the performance of a network and should, therefore, not be ignored when designing and managing WDS.
Understanding and modelling the stochastic nature of water demand is a challenging field for researchers. Stochastic modelling faces difficulties such as the scarce availability of data for calibration purposes, the high computational effort associated with simulations, and the complexity of the problem itself. Moreover, the statistical properties of water demand change with the spatial and temporal scales that are used, which makes it even more difficult to model the stochastic structure of demand accurately. The proposed scaling laws represent a step forward in understanding the relation between the parameters that describe probabilistic demands and the spatial and temporal scales at which demands are measured and at which they should be modelled for WDS design or management purposes. The use of scaling laws allows a more accurate quantification of statistical parameters such as variance and correlation, based on the real demand patterns, the number of users at each node and the sampling time that is used. The scaling laws also make it easy to change the scale of the problem, since the statistical parameters and levels of uncertainty can be derived for any desired temporal or spatial scale.
The scaling laws were derived analytically and validated using synthetically generated stochastic demands and real demand data from Latina, Italy. Good agreement was found between the theoretical expressions, the synthetic demand data and the real demand data. The results show that the mean increases linearly with the number of aggregated users. The variance increases with spatial aggregation according to an exponent that varies between 1 and 2. In theory, the scaling law is linear for spatially uncorrelated demands and quadratic for perfectly correlated demands. This behaviour is clearly verified by the synthetic data. The covariance between two groups of users scales with the product of the numbers of users in each group. The cross-correlation coefficient depends separately on the number of users in each group, and increases towards a limit value. Even for small values of cross-correlation between single-user demands, this parameter cannot be ignored, since it increases significantly with the aggregation of consumers.
The network simulation performed with stochastic demands at different pre-defined levels of correlation shows a clear influence of the degree of correlation on the resulting pressure heads: higher levels of correlation lead to larger fluctuations of the pressure heads and to more frequent pressure failures. At this point, the stochastic correlated demands were used only for simulation purposes. However, in future work a similar approach can be used for design and management purposes. The consideration of correlated stochastic demands will result in more realistic and reliable water distribution networks.
The participation of the first author in the study has been supported by Fundação para a Ciência e Tecnologia through Grant SFRH/BD/65842/2009.