Estimating the Photovoltaic Hosting Capacity of a Low Voltage Feeder Using Smart Meters’ Measurements

Maximizing the share of renewable resources in the electric energy supply is a major challenge in the design of the future energy system. At the low voltage (LV) level, the main focus is on the integration of distributed photovoltaic (PV) generation. Today, the lack of monitoring and visibility, combined with the uncoordinated integration of distributed generation, often leads system operators to an impasse. The numerous dispersed PV units cause distinct power quality and cost-efficiency problems that restrain the further integration of PV units. The PV hosting capacity is a tool for addressing such power system performance and profitability issues so that the different stakeholders can discuss them on common ground. The photovoltaic hosting capacity of a feeder is the maximum amount of PV generation that can be connected to it without resulting in unacceptable power quality. This chapter demonstrates the usefulness of smart metering (SM) data in determining the maximum PV hosting capacity of an LV distribution feeder. It introduces a probabilistic tool that estimates PV hosting capacity by using customer-specific energy flow data recorded by SM devices. The probabilistic evaluation and the use of historical SM data yield a reliable estimation that considers the volatile character of distributed generation and loads as well as technical constraints of the network (voltage magnitude, phase unbalance, congestion risk). As a case study, an existing LV feeder in Belgium is analysed. The feeder is located in an area with high PV penetration and large deployment of SM devices.


Introduction
Maximizing the share of renewable resources in the electric energy supply is a major challenge in the design of the future energy system. For the low voltage (LV) distribution system, this objective translates into increasing the self-sufficiency of LV feeders, based on local resources, while responding to climate change. In such feeders, distributed photovoltaic (PV) generation is the most common distributed energy resource (DER).
So far, most distributed PV units have been connected without prior planning or reinforcement of the network, while monitoring data in the residential and commercial sectors were absent almost everywhere in Europe. Given the lack of controllability in common LV networks, the uncoordinated integration of PV units often leads to distinct power quality issues. As a result, the connection of new PV units, and therefore the increase of the renewable energy share, slows down. Combined with the growing volatility of electricity consumption in the distribution network, this makes the adoption of a streamlined planning approach for the future energy system urgent.
In this evolving framework, distribution system operators (DSOs) are called to safeguard a stable and secure power supply in all possible demand conditions while fostering the massive integration of DER generation. In cost-efficiency terms, this highlights the necessity of leaving behind the deterministic worst-case planning approach. This traditionally applied approach focuses on the least favourable network operation states, which are very rare. Naturally, it leads to very restrictive decisions in terms of PV hosting capacity or to costly network reinforcements.
Given the current uncertainty of DSO costs and revenues, new planning tools are required for considering the constant variability of the energy network [1]. This argument becomes even stronger in view of the upcoming integration of electric vehicles and the development of flexibility services, both of which are seen as basic components of the future energy model. The large deployment of smart metering (SM) devices in the residential and commercial sectors will drastically enlarge the potential of cost-effective planning approaches. Indeed, customer-specific data will provide better insight into the power distribution system.
Considering the above facts and the probabilistic character of the EN 50160 technical standard [2], [3] (which addresses the LV network), this chapter presents a feeder- and customer-specific probabilistic method that estimates the DER hosting capacity of an LV feeder. In practice, it introduces a probabilistic tool that uses customer-specific energy flow data recorded by the SM devices installed in the studied feeder, over a period of 2 to 4 years depending on the customer. The use and probabilistic processing of long-term energy measurements (SM readings) yield a reliable estimation that considers the volatile character of distributed generation and loads as well as several operational metrics. Section 2 of this chapter outlines some of the existing scientific contributions that address this subject and presents the drivers for developing the proposed methodology and algorithm. Section 3 presents the overall structure of the developed algorithm and Section 4 describes in detail the essential contribution of customer-specific SM readings to this development. Section 5 explains the computation process of the maximum acceptable PV hosting capacity.
As a case study, a real LV feeder in Belgium is analysed. The feeder is located in an area with high PV penetration and large deployment of SM devices. When the probabilistic character of the EN 50160 standard's voltage limits is considered, the estimated PV hosting capacity proves to be much higher than the one obtained with a deterministic approach based on worst-case energy flow profiles. At the same time, the use of the SM readings verifies the computation of the technical constraints in the feeder that cannot be considered with a probabilistic approach (violation of the maximum current capacity of the lines).

Current Framework
In many regions worldwide, DER integration is hampered by slow or overly rigid hosting capacity review processes. As a result, customers who want to invest and play an active role in managing their energy usage are increasingly unable to do so in a timely and cost-efficient way. In this context, a streamlined approach together with the expansion of allowable DER integration approvals seems to be a necessity [4].
However, the expansion of allowable approvals depends heavily on DER admissible penetration levels, which are determined by local DSOs. In order to increase penetration levels while facilitating the application review process, DSOs should incorporate automated DER hosting capacity analyses. An example process flow for incorporating such analysis into the DER integration review process is outlined in Figure 1.  Currently, many energy utilities are adapting their DER hosting capacity review so as to remove or update restrictive maximum allowable limits [5]. To this end, the Electric Power Research Institute (EPRI) presents a set of models that could be used by DSOs or electric utilities [6], [7]. These feeder-based methodologies are very solid computation examples that take account of all steady state operational criteria.
Focusing on PV hosting capacity, the EPRI's report presents stochastic analysis as a highly appropriate tool for determining feeder hosting capacity for distributed PV units. The stochastic deployment concerns the position and size of future PV units, while the steady state estimation of the feeder is done with a deterministic approach. Indeed, the analysed state estimation scenarios are based on four worst-case load/PV generation profiles.
In the same vein, a set of studies addressing the European framework and the EN 50160 standard highlight the efficiency of stochastic and probabilistic analysis in determining hosting capacity or, more generally, the impact of PV generation in LV feeders [5]-[12].
Meanwhile, the European Photovoltaic Industry Association (EPIA) and the technical standard EN 50160 suggest that distribution networks should be designed on a probabilistic basis. For example, EN 50160 standard deals with the voltage characteristics of LV feeders in probabilistic terms. It gives recommendations that, for a percentage of measurements (e.g. 95%) over a given time, the voltage value must be within specified limits.
Most of the existing methodologies apply stochastic analysis to the size and position of PV units and not to the load/generation profiles of customers. However, the ongoing integration of SM devices in LV networks enlarges the potential of using feeder-specific or even customer-specific data for modelling energy flows. According to [9], performing long-term measurements in the LV network is highly valuable and strongly recommended, not only for estimating the maximum PV hosting capacity, but also for voltage coordination of the network in general. Measurements of the voltage magnitude for a large number of customers may be time-consuming and expensive; however, many countries have already been installing energy meters that also allow the recording of voltage, current, active power and other metrics.
Considering these facts, the EPRI's report [7] estimates PV hosting capacity using feeder-specific data to create either absolute worst-case scenarios (maximum recorded generation, minimum recorded load) or load/PV time-of-day coincident worst-case scenarios. Therefore, feeder-specific data are indeed used; however, the steady state estimation of the feeder is still done with a deterministic approach. Consequently, this approach does not consider the fact that the time of day at which worst-case values apply for a specific customer does not necessarily coincide with that of other customers connected to the same feeder. Yet the operational criteria of the feeder are determined both by the individual user's demand and by the simultaneous demands of other network users. Since the demands of every user and the degree of coincidence between them constantly vary, so does the operation of the feeder [3].
The above argument demonstrates that although customer-specific SM data are essential for creating reliable network models, there is another challenge that needs to be addressed. It lies in the fact that customers follow, volume-wise (kWh) or capacity-wise (kW), an almost stable daily pattern. However, this pattern does not necessarily remain the same on the time axis. In long-term decision making, profiles should be based on the recorded ones while considering all possible deviations. Those deviations could be introduced as random statistical errors, by making random possible combinations of the recorded values, or by combining both approaches.
Consequently, reliable models that use customer-specific real SM readings and take into account load/PV time- and customer-variability are necessary for applying a less conservative and more cost-effective hosting capacity review. Probabilistic approaches, and the Monte Carlo approach in particular, are very suitable for addressing this modelling challenge.

The PV Hosting Capacity Computation Tool
Hosting capacity is defined as the maximum amount of PV that can be accommodated in the feeder without impacting system operation (reliability, power quality, etc.) under existing control and infrastructure configurations [7]. This chapter presents a tool that uses probabilistic state estimation, 15-min customer-specific SM energy flow readings and feeder-specific technical parameters to estimate the PV hosting capacity of a given LV feeder.
The proposed methodology aims to address the central block of Figure 1 ("Feeder-specific hosting capacity review") by providing a detailed location- and customer-specific DER hosting capacity analysis. The analysis takes into account the EN 50160 standard operational criteria [2], [3]. In particular, the focus is on voltage magnitude and unbalance, which are the primary technical concerns in LV feeders with distributed PV generation. The maximum line capacity is also taken into account so as to address significant reverse power flows due to high PV injection.
Although the EN 50160 standard sets the same voltage limits in all European countries (except for cases where stricter limits are locally imposed), the maximum line capacity heavily depends on the respective DSOs. In certain countries, line sections are chosen based on a long-term strategy that aims at minimising voltage and congestion risk even if loads and generation increase significantly in the future. However, such an approach leads to a higher initial investment, which is not necessarily cost-effective. In other cases, line sections are chosen based on actual conditions or short-term future scenarios so that customised solutions are applied as soon as problems arise.
Apart from steady state constraint management, there are other considerations that could be accounted for, such as transformer aging factor, line losses, etc. Such criteria are usually considered in an overall cost-benefit analysis (CBA); however, at present they are not addressed by the EN 50160 standard. Depending on the country and the applied DSO tariff methodology ("cost-plus", "revenue cap", etc.), DSOs are incentivised to reduce certain operation costs that may or may not be integrated in their tariffs. Thus, the impact of such criteria on decision making varies depending on the distribution utility. For this reason, this chapter computes PV hosting capacity focusing on commonly adopted EN 50160 standard criteria and line capacity issues. Line losses in the feeder during PV injection hours are also addressed; however, their rise is not imposed as a constraint on the further increase of admissible hosting capacity.

Overview of the simulation tool
As previously said, this chapter presents a probabilistic algorithm that determines the PV hosting capacity of an LV feeder by elaborating feeder-specific SM measurements. The SM measurements are the necessary input for performing a reliable steady state analysis of various possible energy flow scenarios in the studied feeder. The flowchart in Figure 2 presents the structure of the simulation algorithm, which is entirely developed in MATLAB®. The energy exchange scenarios are generated by the Monte Carlo algorithm sampling from the historic SM data of the feeder [13], [14]. The power flow analysis is performed with the three-phase algorithm that is presented in [14] and outlined in Appendix A. Both balanced and unbalanced situations can be considered in this study.

Feeder model
The feeder model is constructed based on the technical parameters of the lines, the position of the customers, the installed PV power per node, the voltage at the MV/LV transformer secondary output and the respective set points and bandwidths in case voltage control algorithms are integrated. The feeder model also assigns the load/PV generation SM datasets to the respective customers. This necessary information is available to the DSO.
Regarding the PV hosting capacity computation, the possible future locations of the PV units have to be specified in the feeder model. This analysis is not based on stochastic random distribution of PV units along the feeder. A set of scenarios regarding the positions of future PV nodes is specified and each one of them is studied separately so as to focus on its specific impact on the feeder, which is possible thanks to the customer-specific SM datasets.
The technical constraints that must be respected for the current situation and for future scenarios are the ones specified in local, regional or national directives. However, these operational constraints can be determined in a more restrictive manner, depending on the case. In the EU framework, the steady state constraints are set by the EN 50160 standard. Regarding voltage magnitude and unbalance, 95-percentile limits are suggested. To account for this EU standard, the simulation tool verifies that the following criteria apply for the whole system (in current and future installed PV power scenarios):
$$\begin{aligned} P_{overvoltage} &= P\left(V_{i,j} > 1.10\,V_{nom}\right) \le 0.05 \\ P_{undervoltage} &= P\left(V_{i,j} < 0.90\,V_{nom}\right) \le 0.05 \\ P_{unbalance} &= P\left(VUF_{i} > 2\%\right) \le 0.05 \end{aligned} \tag{1}$$
where $P_{overvoltage}$, $P_{undervoltage}$ and $P_{unbalance}$ represent respectively the probability of having an overvoltage, an undervoltage or exceeding the phase voltage unbalance limit at node i (phase j) over a number M of simulated network states. The numerical limits are those of EN 50160 for LV networks (±10% of the nominal voltage $V_{nom}$ and 2% for the voltage unbalance factor). In $V_{i,j}$, i stands for nodes 1 to N (total number of nodes in the feeder) and j stands for phase a, b or c. $VUF_{i}$ stands for the Voltage Unbalance Factor at node i.
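The 95-percentile check can be sketched in a few lines of Python. This is only an illustration, not the MATLAB implementation of the chapter: the function names are hypothetical, voltages are assumed to be expressed in per unit of the nominal voltage, and the default limits (1.10 p.u., 0.90 p.u., 2% VUF, 5% tolerated violations) are the EN 50160 values assumed here.

```python
def violation_probabilities(voltages_pu, vuf_percent,
                            v_max=1.10, v_min=0.90, vuf_max=2.0):
    """Estimate (P_overvoltage, P_undervoltage, P_unbalance) for one node.

    voltages_pu: phase voltage of the node in each of the M simulated states
    vuf_percent: voltage unbalance factor of the node in the same states
    The default limits are assumed EN 50160 values, not taken from the tool.
    """
    m = len(voltages_pu)
    p_over = sum(v > v_max for v in voltages_pu) / m
    p_under = sum(v < v_min for v in voltages_pu) / m
    p_unbal = sum(u > vuf_max for u in vuf_percent) / len(vuf_percent)
    return p_over, p_under, p_unbal


def complies(probabilities, threshold=0.05):
    """True if every violation probability stays within the 5% allowance."""
    return all(p <= threshold for p in probabilities)
```

In a full study the same check would be repeated for every node and phase, and the feeder would be accepted only if all of them comply.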
The thermal limits of the cables are also considered in the computation. The current carrying capacities of the lines should not exceed the DSO requirements or the recommended values in technical standards such as [15]. The load flow analysis of each system state is performed with the three-phase algorithm that is explained in the Appendix [14].

Customer profiles and feeder state modelling based on historic SM datasets
The load/PV profiles of existing customers are created by using their respective SM recorded datasets. In practice, each dataset consists of values of 15-min PV energy injection to the grid ($E_{inj,grid}$), 15-min PV energy generation ($E_{inj,pv}$) and 15-min energy consumption ($E_{cons,grid}$), recorded at the respective customer. For the 15-min time step q, the net energy consumption of customer i is computed by introducing the respective SM recorded values in the following formula:
$$E_{load,i,q} = E_{cons,grid,i,q} + E_{inj,pv,i,q} - E_{inj,grid,i,q}$$
where $E_{inj,grid,i,q}$ is the total PV energy that was injected by customer i to the grid during the time step q, $E_{inj,pv,i,q}$ is the total PV energy that was generated by the PV unit of customer i during the time step q (a part of this energy is locally consumed by the customer) and $E_{cons,grid,i,q}$ is the total energy that was absorbed from the grid by customer i during the time step q. These three values are recorded by the SM device that is installed at customer i.
The two 15-min resolution datasets of $E_{inj,pv}$ and $E_{load}$ (energy exchanged at the coupling point of the customer with the feeder) are used to build two Typical Day Profiles (TDPs) for each customer; the "typical day" represents and characterizes a selected period (which can be a month, a season, a year and so on). Each TDP reflects the variation that the respective parameter can have at every individual quarter of an hour of a "typical day".
The TDP of energy consumption $E_{load}$ created for the month of April for one of the customers is graphically represented in Figure 3. If an SM device has been monitoring the energy flow of a user over a period of one month, each dot represents the total amount of net energy that was consumed by the user during time step q (q = 1:96, represented on the horizontal axis) in one of the days of the month. Therefore, for each time step q, D dots (D = number of days of the month) have been drawn on the diagram of Figure 3, each one representing the energy consumption of the user during the respective time step in each one of the D days. Similar graphs can be created for all feeder users that are equipped with an SM device, both for their net energy consumption $E_{load}$ and their PV energy generation $E_{inj,pv}$.
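The construction of a TDP can be illustrated with a short Python sketch. It assumes that the readings of the selected period are available as one chronological list of 15-min values covering whole days; the function name and this data layout are hypothetical.

```python
def build_typical_day_profile(readings, steps_per_day=96):
    """Group chronological 15-min readings by their time step of the day.

    Returns a list of `steps_per_day` lists; entry q holds one value per
    recorded day, i.e. the D dots drawn above time step q in the TDP plot.
    """
    if len(readings) % steps_per_day != 0:
        raise ValueError("readings must cover whole days")
    profile = [[] for _ in range(steps_per_day)]
    for index, value in enumerate(readings):
        profile[index % steps_per_day].append(value)
    return profile
```

For a 30-day month, each of the 96 entries of the returned profile contains 30 values.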
As a next step, the customer-specific TDPs are statistically transformed into 2 x 96 Cumulative Distribution Functions (CDFs) of probability for each one of the studied months: one for PV generation and one for energy consumption (= 2), multiplied by the number of quarter-hourly time steps in a day (= 96). This transformation is made by applying the basic statistical definition of the CDF to each 15-min dataset:
$$F_{X}(x) = P(X \le x)$$
Therefore, the "typical day" of each customer (for the respective month) can be illustrated by two diagrams like the one presented in Figure 4, one for PV injection and one for net energy consumption. Exactly the same methodology is applied to build TDPs for the r.m.s. voltage at the secondary output of the MV/LV transformer, using 15-min data recorded at the specific MV/LV substation.
The created statistical distributions (CDFs) of the time-varying parameters (PV injection, net energy consumption and voltage at the MV/LV transformer) are used for defining multiple possible network states corresponding to each 15-min time step. A Monte Carlo (MC) algorithm is applied for randomly sampling the values of the variable parameters at each node of the studied LV feeder, as explained in [13], [14]. The combination of the nodal sampled values defines each network state that will afterwards be analysed by the power flow algorithm:
$$S = \left\{ \left(E_{load,i},\ E_{inj,pv,i},\ t_{i}\right)_{i=1}^{N},\ V_{MV/LV} \right\}$$
where $E_{load,i}$ is the 15-min net energy consumption at node i, $E_{inj,pv,i}$ is the 15-min PV energy generation at node i, $t_{i}$ is the time repartition factor of the consumed or generated energy at node i and $V_{MV/LV}$ is the voltage at the MV/LV transformer node. If there are not sufficient data on intra-15-min variations of the energy flow, it can be assumed that the power flow is stable during the 15 minutes of each time step. This means that the time repartition factor $t_{i}$ corresponds to 15 minutes and is thus equal to 0.25 h.
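A minimal Python sketch of the empirical CDF and of inverse-transform sampling for one time step is given below. The helper names are hypothetical, and the independent draws per node are an assumption made for brevity; the sampling actually used by the tool is described in [13], [14].

```python
import random


def empirical_cdf(samples):
    """Return the sorted values and their cumulative probabilities P(X <= x)."""
    values = sorted(samples)
    n = len(values)
    probs = [(k + 1) / n for k in range(n)]
    return values, probs


def sample_from_cdf(values, probs, u):
    """Inverse-transform sampling: smallest recorded x with F(x) >= u."""
    for x, p in zip(values, probs):
        if p >= u:
            return x
    return values[-1]


def sample_state(load_samples, pv_samples, rng):
    """Draw one (E_load, E_inj_pv) pair per node for a single time step.

    load_samples / pv_samples: dict node -> recorded 15-min values for that
    time step. Nodes without a PV unit get a generation of 0.0.
    """
    state = {}
    for node, recorded in load_samples.items():
        values, probs = empirical_cdf(recorded)
        e_load = sample_from_cdf(values, probs, rng.random())
        e_pv = 0.0
        if node in pv_samples:
            pv_values, pv_probs = empirical_cdf(pv_samples[node])
            e_pv = sample_from_cdf(pv_values, pv_probs, rng.random())
        state[node] = (e_load, e_pv)
    return state
```

Calling `sample_state` M times for each of the 96 time steps yields the set of network states that is passed to the power flow analysis.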
The power flow analysis of the feeder requires considering each system state as instantaneous, and therefore the sampled energy values have to be transformed into instantaneous power values ($E_{inj,pv} \rightarrow P_{inj,pv}$ and $E_{load} \rightarrow P_{load}$). This means that for each node we can consider either power injection or power consumption, since both of them cannot be applied simultaneously at an instant. The instantaneous power value that represents the power flow at the point of common coupling (PCC) of each customer i with the feeder is therefore determined as follows:
$$P_{i} = P_{load,i} - P_{inj,pv,i} = \frac{E_{load,i} - E_{inj,pv,i}}{t_{i}}$$
If $P_{i}$ is positive, the respective customer i is instantaneously consuming power from the grid, whereas if $P_{i}$ is negative, the customer is instantaneously injecting power into the grid.
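As a worked example of this conversion, consider hypothetical sampled values of 0.20 kWh of net consumption and 0.50 kWh of PV generation for one customer during a time step, with the time repartition factor assumed equal to 0.25 h:

```latex
\begin{aligned}
P_{load,i}   &= \frac{E_{load,i}}{t_i}   = \frac{0.20\ \text{kWh}}{0.25\ \text{h}} = 0.8\ \text{kW} \\
P_{inj,pv,i} &= \frac{E_{inj,pv,i}}{t_i} = \frac{0.50\ \text{kWh}}{0.25\ \text{h}} = 2.0\ \text{kW} \\
P_i          &= P_{load,i} - P_{inj,pv,i} = 0.8 - 2.0 = -1.2\ \text{kW}
\end{aligned}
```

The negative sign indicates that, in this sampled state, the customer injects 1.2 kW into the feeder.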
The probabilistic deployment of this simulation tool relies on the principle that load/PV generation profiles of customers are highly time-varying. The generation of the system states is therefore based on a very large number of random combinations of customers' energy flow values. This time-variability induces another variability that concerns the time coincidence of the load profiles of various customers. Both aspects are very important when assessing the impact of PV generation on an LV network. Indeed, considering this variability, both on the time axis and regarding the coincidence between customers, makes the simulation of the network operation more realistic. Such an approach can lead to less restrictive and more cost-effective decisions that do not rely on rare extreme cases but on the most frequent ones.

Generation profiles of future PV nodes
A key component in accurately assessing the impact of future PV units is reliably representing their generation profiles. Based on the findings of several studies, geographically close customers are entirely correlated as far as their PV generation profiles are concerned [16]. For this reason, this study considers that the generation profiles of future PV customers will be very similar, along the time axis, to the ones of the existing PV units.
As previously explained, the load/PV generation profiles of customers with SM devices are made of 96 Cumulative Distribution Functions (CDFs) of probability built with the 15-min recorded datasets. Concerning PV generation, such CDFs are evidently not available for the future PV units. For this reason, the available SM datasets are used to create a reference CDF, based on the 15-min generation SM datasets of the existing PV owners [17]. This reference CDF is used to simulate the time-variability of PV generation at the future PV nodes. In reality, customers that are connected to the same LV feeder can have PV units of different sizes. Assuming an equivalent statistical distribution of their PV power profiles due to geographical proximity, the principle is to create a standardized reference CDF for PV generation in the specific feeder, based on the measurements of the available SM devices. Initially, the CDF for the 15-min PV energy generation $E_{inj,pv,j,q}$ of each existing PV node j is normalized by applying the following relation, for each time step q:
$$\bar{E}_{inj,pv,j,q} = \frac{E_{inj,pv,j,q}}{E_{tot,j}}, \quad j = 1:N_{SM}$$
where $N_{SM}$ is the number of users in the feeder that are equipped with an SM device, $\bar{E}_{inj,pv,j,q}$ are the normalized 15-min energy generation values of customer j during time step q, $E_{inj,pv,j,q}$ are the recorded 15-min energy generation values of customer j during time step q and $E_{tot,j}$ is the total yearly PV energy generation of customer j.
Once this is done, the 15-min CDFs of every user are aggregated, as graphically outlined in Figure 5, in order to create one reference CDF that can represent all PV owners in the specific feeder. To create the CDF of each particular future PV owner, this reference CDF should be scaled according to the owner's annual PV generation. For existing PV owners, such information is usually available to the DSO even if the customer is not monitored by an SM device. For future PV nodes, such information is evidently not available since no PV unit is connected yet. Consequently, the reference CDF is scaled with the annual PV generation of an existing PV unit (in the feeder or in proximity) multiplied by a reference factor f, as explained in the following section.
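The normalization and aggregation steps can be sketched as follows for a single time step q. The aggregation rule is not detailed in the text (it is only outlined in Figure 5), so pooling the normalized samples of all SM-equipped PV owners is an assumption of this sketch, as are the function names.

```python
def normalize_generation(samples_kwh, e_tot_kwh):
    """Divide recorded 15-min generation values by the yearly generation."""
    return [e / e_tot_kwh for e in samples_kwh]


def reference_samples(pv_samples_by_customer, e_tot_by_customer):
    """Pool the normalized samples of all monitored PV owners (one time step).

    The sorted pooled values define the empirical reference CDF. Pooling is
    an assumed aggregation rule, used here for illustration only.
    """
    pooled = []
    for customer, samples in pv_samples_by_customer.items():
        pooled.extend(normalize_generation(samples, e_tot_by_customer[customer]))
    return sorted(pooled)
```

Because each customer's values are divided by that customer's own yearly generation, units of different sizes contribute comparable, dimensionless samples to the reference distribution.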

PV hosting capacity computation
Practically, the algorithm starts with the probabilistic analysis of the current situation (existing PV units), by simulating a large number M of possible system states. It is important to note that although system states are based on 15-min resolution data, each one of them is considered as a possible instantaneous state of the system. Thus, the accuracy and reliability of the computation increase with the number of treated system states.
The probabilities $P_{overvoltage}$, $P_{undervoltage}$ and $P_{unbalance}$ are computed at every node, based on the analysis results. Compliance with the conditions set by (1) is verified for the whole feeder. Moreover, compliance with the maximum current capacity is verified in the entire feeder for all the studied system states. If both conditions are respected, the algorithm increases the installed PV power at the future (user-specified) PV nodes by the defined increase step. Consider an LV feeder that is simulated with a total number N of PV nodes. Some of the simulated N nodes may be existing PV nodes while the rest are the considered future PV nodes. If the total number of future PV nodes is equal to K (K ≤ N), the new installed power at each future PV node i is computed as follows:
$$P_{rated,l,i} = P_{rated,l-1,i} + P_{step,i}, \quad i = 1:K \tag{6}$$
where $P_{rated,l,i}$ is the new installed PV power at node i in the current configuration l that will be analysed by the algorithm (in step 5, Figure 2), $P_{rated,l-1,i}$ is the installed PV power at node i that was analysed (and accepted in terms of impact on the technical constraints) in configuration l-1 and $P_{step,i}$ is the increase step (defined by the user for the respective node). A small $P_{step}$ value (≈0.5-1 kVA for residential or small commercial customers) is recommended so as to make a more precise computation. Note that in several countries, for residential and small-business customers, the maximum admissible installed power per distributed PV unit in the LV network is equal to 10 kVA. In such cases, the condition $P_{rated,l,i} \le 10$ kVA should be integrated in step 5 of the algorithm.
Once relation (6) is applied, the new installed PV power $P_{rated,l,i}$ is defined at every new PV node before the algorithm performs the next "hosting capacity review" iteration (step 5, Figure 2). However, the reference CDF that represents the time-variability of generation at the new PV nodes needs to be scaled according to $P_{rated,l}$ at each node. To do so, the reference CDF could be scaled according to the annual PV generation of the PV unit. For existing PV owners, this information is usually available to the DSO even if the customer is not monitored by an SM device. For the new nodes, such information is not available since there are currently no PV units at the specified nodes. Consequently, the reference CDF is scaled with the annual PV generation of an existing PV unit (in the feeder or in proximity). Then, a reference factor f is introduced for scaling the resulting CDF according to $P_{rated,l}$. The factor $f_{i}$ is computed as follows:
$$f_{i} = \frac{P_{rated,l,i}}{P_{rated,ref}}, \quad i = 1:K$$
where $P_{rated,ref}$ is the installed PV power of the existing PV unit that has been used to scale the reference CDF.
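The scaling of the reference distribution for a future PV node can be illustrated with the short Python sketch below. It assumes that the reference values are dimensionless fractions of a yearly generation and that the scaled 15-min energy is obtained as the product of the factor f, the yearly generation of the reference unit and the normalized value; the function names are hypothetical.

```python
def scaling_factor(p_rated_new_kva, p_rated_ref_kva):
    """Reference factor f: ratio of the new unit's size to the reference unit's."""
    return p_rated_new_kva / p_rated_ref_kva


def scaled_generation(normalized_samples, e_tot_ref_kwh, f):
    """Convert normalized reference values into 15-min energies (kWh)."""
    return [f * e_tot_ref_kwh * x for x in normalized_samples]
```

For instance, a future 2.5 kVA unit scaled against an existing 5 kVA reference unit gets f = 0.5, so each sampled energy is half of what the reference unit would generate in the same state.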
Once the generation profiles have been set up for the future PV nodes, the algorithm repeats steps 2 and 3 for analysing the current configuration l. At this point, it is important to clarify that each "hosting capacity review" iteration l performs the power flow analysis of configuration l by applying a full MC simulation, similar to the one of step 2. This means that each "hosting capacity review" iteration l runs the same large number of MC iterations M that was analysed in step 2. Thus, in every iteration l, a very large number of system states is analysed (= M·96) so that the values of $P_{overvoltage}$, $P_{undervoltage}$ and $P_{unbalance}$ converge. Thanks to this procedure, the verification of compliance with equations (1) for each configuration l is assumed to be reliable. If the analysis of M system states, in configuration l, demonstrates that the operational constraints are not violated, the installed PV power is again increased at each future node. Then, the algorithm passes again to steps 4 and 5.
The described iterations stop as soon as the operational constraints are exceeded for the first time at one or more nodes. The PV size of some units could probably increase even more, given that the operational constraints at their PCC are not violated. However, this study treats the LV feeder as a whole, since the violation of limits at one node is always affected by the energy flow at all nodes. The $P_{rated,l,i}$ of the last configuration l that did not lead to a violation of the acceptable limits is the one considered as the maximum admissible hosting capacity per node.
The aggregated PV hosting capacity of the feeder is computed by adding $P_{rated,l,i}$ (existing and new) along the feeder:
$$P_{HC,feeder} = \sum_{i=1}^{N} P_{rated,l,i}$$
where N is the total number of PV nodes in the feeder. In order to make a more detailed computation, different increase steps could be applied per node depending on its position in the feeder. The voltage limits are usually more easily violated at the end of the line. Consequently, the PV power steps could be bigger for the nodes at the head of the line. However, this strategy could eventually result in an earlier (in terms of PV size) violation of the limits at the last nodes, which is not consistent with equitable treatment of end-users.
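The iterative review loop described in this section can be summarized by the following Python sketch. It is a simplified illustration, not the chapter's MATLAB tool: the callable `is_compliant` is a hypothetical stand-in for the full MC simulation and three-phase power flow of a configuration, and the sketch returns the last configuration that passed the check together with its aggregated installed power.

```python
def hosting_capacity(existing_kva, future_nodes, step_kva, is_compliant,
                     cap_kva=None, max_iter=1000):
    """Increase PV at the future nodes until the constraints are violated.

    existing_kva: dict node -> currently installed PV power (kVA)
    future_nodes: nodes where new PV units are assumed to be connected
    step_kva:     dict node -> increase step P_step (kVA)
    is_compliant: callable(dict node -> kVA) -> bool (placeholder check)
    cap_kva:      optional per-unit limit, e.g. 10 kVA where it applies
    """
    installed = dict(existing_kva)
    for node in future_nodes:
        installed.setdefault(node, 0.0)
    if not is_compliant(installed):
        raise ValueError("current configuration already violates the limits")
    for _ in range(max_iter):
        candidate = dict(installed)
        for node in future_nodes:
            new_value = candidate[node] + step_kva[node]
            if cap_kva is not None:
                new_value = min(new_value, cap_kva)
            candidate[node] = new_value
        # Stop when nothing can grow any more or the new configuration fails.
        if candidate == installed or not is_compliant(candidate):
            break
        installed = candidate
    return installed, sum(installed.values())
```

With a toy check such as `lambda cfg: sum(cfg.values()) <= 30.0`, two existing units of 5 and 10 kVA and two future nodes increased in 1 kVA steps, the loop stops at 7 kVA per future node, i.e. 29 kVA in total.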

Description of the simulation
This section describes the application of the previously described analysis tool for computing the PV hosting capacity of an LV feeder in Flobecq. Flobecq is a municipal area in Belgium with high penetration of distributed PV generation (≈25% of Flobecq LV network customers) and large deployment of SM devices. Thanks to an official research fellowship between the local DSO and the authors' affiliation, the technical parameters of the feeder and SM datasets of the respective customers have been communicated strictly for research purposes. The datasets that were used in this case study cover a total period of one year (2013). The topology of the simulated three-phase feeder is presented in Figure 6. Currently, four PV units are installed in the feeder, which supplies a total of 16 residential customers. These PV units are connected at nodes 4, 5, 12 and 14 by means of single-phase inverters, and their installed PV power is respectively 5 kVA, 10 kVA, 2.63 kVA and 5 kVA. A spatial correlation study had already been performed for the specific feeder and the generation profiles of the customers were shown to be entirely correlated [16]. This consideration is taken into account in this analysis, including the future PV nodes. In practice, this means that for every simulated system state, the randomly sampled probability value for defining PV generation is common to all PV units.
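The use of a common probability value for all PV units can be sketched in Python as follows. The helper names and the data layout (one list of recorded 15-min generation values per node for the considered time step) are hypothetical; the point illustrated is only that a single random number is drawn per system state and then mapped through the CDF of every PV node.

```python
import math
import random


def quantile(sorted_values, u):
    """Smallest recorded value whose empirical CDF is at least u (0 <= u < 1)."""
    n = len(sorted_values)
    index = max(math.ceil(u * n) - 1, 0)
    return sorted_values[index]


def sample_correlated_pv(pv_values_by_node, rng):
    """Sample PV generation at all nodes with one shared probability value."""
    u = rng.random()  # common to all PV units in this system state
    return {node: quantile(sorted(values), u)
            for node, values in pv_values_by_node.items()}
```

With this scheme, a state in which one unit is sampled near the top of its distribution places every other unit near the top of its own distribution as well, which reflects the full correlation assumed for the feeder.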
Concerning operational constraints, the ones of EN 50160 standard have been considered in the simulation. Therefore, compliance with the group of equations (1) has been verified for each system state, as far as voltage magnitude and unbalance are concerned. The maximum current capacity of the lines has been determined based on table [15]. The PV size increase step is defined equal to 1kVA and the power factor of all PV inverters is considered equal to 1, unless reactive power control is considered in the simulation.
A set of different scenarios have been simulated regarding the position and phase connection of future PV units as well as the action of voltage control schemes. The analysed scenarios are listed in Table 1.
Concerning scenarios A-D, only the on-off control scheme is considered, which is currently implemented by most DSOs in Europe. This control scheme enables a total cut-off of the PV unit (in most cases for 3 minutes) as soon as the voltage limit has been locally exceeded for a period longer than 10 minutes. This analysis considers each simulated state as instantaneous. Therefore, each violation of the 95-percentile limit of the EN 50160 standard is counted in the probabilities even though in reality it might have lasted less than 10 minutes. This means that the computed maximum PV hosting capacity is possibly slightly lower than the one that the feeder can really support. The control scheme applied in scenario F is the three-phase damping control scheme, which behaves resistively towards the negative- and zero-sequence voltage components, without modifying the injected power, so as to eliminate phase voltage unbalance [18]. This control scheme requires a three-phase PV inverter and is very promising in terms of voltage magnitude and unbalance mitigation. It is currently implemented in an EU pilot programme (FP7 INCREASE Project). The third control scheme, applied in scenario G, is reactive power control as implemented in the Italian distribution system [20] for new PV units in the LV network.
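The way violations are counted over the simulated states can be illustrated with a short sketch. It assumes that each state yields one per-unit voltage at the node of interest and that the admissible band is ±10% of the nominal voltage for at least 95% of the states; the band values and function names are assumptions for illustration, and the chapter's own conditions are those of equations (1).

```python
import numpy as np


def overvoltage_probability(v_pu, v_max=1.10):
    """Share of simulated states in which the upper voltage limit is exceeded."""
    v = np.asarray(v_pu, dtype=float)
    return float(np.mean(v > v_max))


def complies_95_percentile(v_pu, v_min=0.90, v_max=1.10, share=0.95):
    """True when at least `share` of the states stay inside the voltage band.

    Every state is treated as instantaneous, so each excursion is counted
    regardless of how long it would last in reality.
    """
    v = np.asarray(v_pu, dtype=float)
    within = (v >= v_min) & (v <= v_max)
    return bool(np.mean(within) >= share)


if __name__ == "__main__":
    # Synthetic voltages, for illustration only.
    rng = np.random.default_rng(seed=2)
    voltages = rng.normal(loc=1.04, scale=0.03, size=10_000)
    print(overvoltage_probability(voltages), complies_95_percentile(voltages))
```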

Comparing with a deterministic approach
One of the main purposes of this study is to investigate to what extent a probabilistic method based on customer-specific SM readings leads to a less restrictive computation of PV hosting capacity, compared to a deterministic approach. For this purpose, a deterministic approach has been implemented that simulates worst case energy flow profiles. The load profiles of all customers and the PV generation profiles of existing PV units have also been based on SM recorded data. The deterministic steady state analysis has been conducted for scenarios A-D, F and G. Scenario E is not included because, although SM readings are used, only 100-percentile limits are considered, which means that probabilities are not accounted for in the computation of hosting capacity. Thus, this scenario is in practice a deterministic scenario.
The following load/ PV generation profiles have been considered in the deterministic approach:

3. Minimum recorded load in the feeder during PV injection hours - coincident PV generation/load values for the other nodes.

Results and discussion
The results of the probabilistic hosting capacity review are illustrated in Figure 7 and listed in detail in Table 2. The aggregated maximum admissible PV hosting capacity in the feeder is presented for each individual scenario, considering separately the EN 50160 standard's voltage limits and the maximum current capacity of the lines. This separate presentation has been chosen because voltage limits are treated with a probabilistic approach in the EN 50160 standard, while congestion risk is treated by each DSO with a different approach. Most of them apply a deterministic approach that considers an upper (100-percentile) current limit. The violation that prevented a further increase of the PV hosting capacity is also presented and quantified for each scenario. The aggregated PV hosting capacity obtained with deterministic analysis is presented in Figure 7 and Table 3 for all treated scenarios (§5.2).
Firstly, one should note that the results considering the maximum current capacity coincide for the probabilistic and the deterministic computations, since this metric is not addressed in probabilistic terms. If the maximum value is exceeded at one point of the feeder during just one of the simulated states, the PV hosting capacity is not further increased in the respective simulation. Also, the result of scenario E (applying 100-percentile limits) is close to those of the deterministic scenarios A.I, A.II and A.III, which analyse the same topology as scenario E but with a deterministic approach. Based on these remarks, one can reasonably assume that the probabilistic computation covers (samples and analyses) almost the whole range of possible system states, including the ones recorded in reality (the combination of coincidently recorded values), which are treated in the deterministic scenarios A.II and A.III.
However, accounting only for voltage violation, the restrictive condition of scenario E, under which voltage limits must never be exceeded in any of the simulated states, results in a considerably lower admissible PV hosting capacity compared to scenario A (same topology as scenario E). Basically, in scenario E, PV hosting capacity could not increase further because the computed P overvoltage resulted equal to 99.99% (>95% is the condition in EN 50160). Therefore, if the admissible PV hosting capacity does not exceed 94.63kVA, the operational limits will most probably never be violated in the feeder, based on the elaboration of the available historic data. Otherwise, if the admissible PV hosting capacity increases up to 154.63kVA, as in scenario A, violation of the voltage limits will take place in less than 5% of total system states. Therefore, even with such an increase of the aggregated PV hosting capacity, the temporary cut-offs of the PV units due to overvoltage will be very rare. Thus, scenario A takes advantage of the probabilistic character of the EN 50160 standard (limit violation allowed during 5% of the week), which is not the case in scenario E or in the deterministic approach. However, EN 50160 defines that the limits should be respected in at least 95% of cases. Thus, the maximum PV power that can be added to the feeder, considering this configuration, is 132kVA (11kVA per new PV unit). The above arguments should be considered in a cost-benefit analysis (CBA) that compares costs for exceeding operational limits to losses due to an eventual penalty for low DER integration or loss of potential revenue for customers and energy utilities. Such considerations may allow a much more cost-effective PV integration strategy which also respects the criteria of the applied standard. At the same time, the identification of critical points regarding congestion risk should also be considered.
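The figures quoted above can be cross-checked with simple arithmetic, assuming twelve new PV units of equal size (the count implied by 132kVA at 11kVA per unit) and the four existing units listed in the description of the simulation:

```latex
\begin{align*}
P_{\text{existing}} &= 5 + 10 + 2.63 + 5 = 22.63\ \text{kVA} \\
\text{Scenario A:}\quad 22.63 + 12 \times 11 &= 154.63\ \text{kVA} \\
\text{Scenario E:}\quad 94.63 - 22.63 &= 72\ \text{kVA} = 12 \times 6\ \text{kVA}
\end{align*}
```

Under this reading, the 100-percentile condition of scenario E corresponds to 6kVA per new unit instead of 11kVA in scenario A.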
Both arguments highlight the usefulness of considering SM historic datasets in similar studies, so that critical points and probabilities are carefully mapped and quantified.
To highlight the cost-effectiveness of deploying long-term measurements in the LV network and analysing them with a probabilistic approach, a more detailed computation of line losses in the feeder was performed for scenario A. Assuming that the computed maximum admissible PV power is installed (154.63kVA if one considers only the voltage limits), the study focuses on the total energy losses along the lines of the feeder during hours of high PV injection in a typical day. The worst case approach considers only one system state, which is most likely to occur during the hours with the highest PV injection. Based on the available historic data for the feeder, this period is between 12:00 and 18:30 on a typical July day. The sum of energy losses has been computed along the feeder for the considered period, for each simulated day. Figure 10 illustrates the statistical distribution (CDF) of the computed line losses, obtained with the probabilistic approach. The probabilistic approach and the consideration of the SM measurements demonstrated that total energy losses in the feeder vary significantly, depending on the system state. In 95% of the simulated days, total energy losses during high PV injection hours (12:00 to 18:30) do not exceed 35kWh in a day. In the deterministic approach, which assumes that the worst case scenario persists throughout the high PV injection period, the respective energy losses amount to 148kWh. This important difference is due to the fact that the probabilistic approach accounts for the extremely low probability that worst case conditions occur simultaneously for all feeder users. Considering such probabilities, the DSO could adopt a less conservative and more cost-effective long-term strategy.
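The statistic read from the CDF can be reproduced in a few lines once the losses of each simulated day are available. The sketch below assumes 15-minute states (26 intervals between 12:00 and 18:30) and uses synthetic loss values; the function names and the data are illustrative assumptions, not results of the study.

```python
import numpy as np


def daily_energy_loss_kwh(loss_kw, interval_h=0.25):
    """Energy lost over one day's high-injection window.

    loss_kw: line losses (kW) of the successive states in the window,
             one value per interval of `interval_h` hours.
    """
    return float(np.sum(loss_kw) * interval_h)


def loss_at_probability(daily_losses_kwh, probability=0.95):
    """Loss level that is not exceeded in `probability` of the simulated days."""
    return float(np.quantile(daily_losses_kwh, probability))


if __name__ == "__main__":
    # Synthetic losses for 365 simulated days, for illustration only.
    rng = np.random.default_rng(seed=3)
    states_kw = rng.gamma(shape=2.0, scale=1.5, size=(365, 26))
    per_day = np.array([daily_energy_loss_kwh(day) for day in states_kw])
    print(loss_at_probability(per_day))
```

A worst case estimate would instead hold the single most unfavourable state constant over all 26 intervals, which is why it sits far in the tail of this distribution.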
Undoubtedly, the computed PV hosting capacity values depend on the load profiles of the customers located in the feeder. However, the results clearly indicate, in relative terms, that smaller distributed PV units have a much smoother impact than bigger ones concentrated in one small area of the feeder. This is demonstrated by the comparison of scenario A to scenarios C and D. Moreover, as previously mentioned, in several cases the maximum admissible installed power per PV unit connected to the LV network is equal to 10kVA. In such cases, scenarios C and D might not be appropriate based on the probabilistic simulation results. As a matter of fact, the admissible total installed power would have to be limited to 32.63kVA (for scenario D) although the network would be able to support 37.
Based on the results of scenario G, reactive power control does not result in higher PV hosting capacity compared to scenario A (on-off control). The voltage profile in the feeder is however improved compared to scenario A. As a matter of fact, voltage limits are not violated in scenario G, whereas the maximum current capacity limit is exceeded for the same amount of PV integration as in scenario A.
In the first two cases (scenarios A and B), comparing the probabilistic simulation results to the respective ones of the deterministic approach reveals an important difference in the aggregated admissible hosting capacity. At this point, it is important to mention that the violated parameter in the deterministic approaches is mainly the voltage magnitude and secondly the maximum current capacity of the lines. The aggregated PV hosting capacity computed with the probabilistic approach is 74-146% higher than the one obtained with the deterministic approach. The deterministic result is restricted by a violation that, according to the probabilistic elaboration of the historic SM dataset, took place in far fewer than 5% of the simulated system states. Indeed, based on Figure 8, the addition of 12 new PV units of 4kVA each (result of deterministic scenario A.I) generated an overvoltage risk that is lower than 1%.
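As a worked check for scenario A, assuming that the percentage is taken relative to the deterministic result and that the aggregated value includes the 22.63kVA already installed (the subscripts are illustrative labels):

```latex
\begin{align*}
P_{\text{det, A.I}} &= 22.63 + 12 \times 4 = 70.63\ \text{kVA} \\
\frac{P_{\text{prob, A}} - P_{\text{det, A.I}}}{P_{\text{det, A.I}}} &= \frac{154.63 - 70.63}{70.63} \approx 1.19
\end{align*}
```

That is roughly 119%, which falls inside the quoted 74-146% range.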
In such cases, the probabilistic analysis demonstrates that the simulated worst case scenarios are extremely unlikely to happen in the studied feeder. However, such a worst case approach is currently implemented by most DSOs when performing hosting capacity reviews. As a result, decisions on connecting new PV units in certain networks are very often extremely restrictive, with a big impact on the cost-efficiency of the network.
When it comes to scenarios C and D, one can note that if fewer but larger PV units are connected, the results of the probabilistic and the deterministic approach do not differ significantly. This result indicates that, in the case of many distributed PV units, an approach that considers all worst case customer profiles coinciding in time is overly conservative. A deterministic approach cannot accurately simulate the volatile character of PV generation and the random loading parameters of residential and small commercial customers.
A general remark would concern the design strategy of distribution feeders like the studied one. The studied feeder currently hosts 22.63kVA of distributed PV generation and supplies 19 residential customers. The analysis of the current conditions (based on the historic SM datasets) demonstrated that both voltage violation risk and congestion risk are very low. Moreover, the above probabilistic load-flow analysis demonstrated that congestion and voltage problems will only appear if 48kVA and 132kVA respectively of distributed PV generation (scenario A) are further integrated. This remark highlights the cost-efficiency of designing distribution networks based on the most frequent system states or on well-studied future scenarios. This approach can lead to customised solutions and help to avoid overdimensioning and costly initial investments for the DSO.
Based on the above analysis, certain renewable integration scenarios could significantly increase the self-sufficiency of feeders like the studied one. As a result, their dependency on big conventional power plants, connected at the transmission level, could be efficiently reduced. However, big conventional plants are important for maintaining grid stability. In a high DER integration scenario, without large and reactive storage facilities and/or flexibility services, the amount of RES should be carefully reviewed. To this end, costs induced by the use of grid services, including insurance against periods when it is not possible to consume self-generated electricity, should be considered and reflected in the bill of generator owners [1]. Reliable feasibility studies and comprehensive CBAs are necessary for evaluating various strategies in the decision making process.

The role of customer-specific SM data in PV hosting capacity reviews in LV networks
The above analysis is based on the use of customer-specific SM energy flow readings. Various maximum PV hosting capacity scenarios have been analysed by applying a probabilistic steady state analysis of the feeder on a 15-min time scale, sampling from the available SM data. In this way, the real probability of worst case scenarios has been accounted for and, as a result, a probabilistic view of several technical metrics has been enabled (voltage and current magnitudes, voltage phase unbalance, line losses). Even if DSO long-term planning is based on a deterministic approach, the use of SM datasets can validate the considered worst case scenarios. Besides, the wide deployment of SM devices can offer other possibilities, such as better coordination and control of technical parameters of the LV network as well as better visibility of and adaptability to the actual load and generation profiles of LV customers. The long-term planning of LV networks can become more customised to local conditions and therefore more cost-effective.

Conclusions
This chapter addresses the problem of determining the maximum PV hosting capacity that can be accommodated in an LV distribution feeder, while respecting local technical standards. For this purpose, a probabilistic simulation tool that uses customer-specific SM energy flow data and feeder-specific parameters as input is presented. A PV hosting capacity review for a municipal area in Belgium is used as a case study for evaluating the usefulness and reliability of the proposed tool. The study outcome demonstrates that it is in the interest of the DSO and of the grid users to deploy probabilistic analysis that considers the variability of load/PV generation, both over time and between different customers' profiles. This variability of the network state can be taken into account thanks to the deployment of long-term SM measurements in the studied network. Consequently, the further deployment of SM devices is strongly recommended for achieving more cost-effective long-term planning and coordination of the LV network.

Appendix: The power flow algorithm
The sequence admittance matrix for the main line is constructed with (A.1), where 0, 1, 2 represent respectively the zero-, positive- and negative-sequence, whereas the superscript Z stands for series admittance. The probabilistic algorithm samples the voltage value $V_n$ at the slack node (secondary output of the MV/LV transformer), which remains fixed throughout the forward/backward sweep process as in (A.2) [21]-[22]. Bold type denotes vectors and an underbar denotes complex numbers. The length of each vector $\underline{\mathbf{V}}_{initial,1}$, $\underline{\mathbf{V}}_{initial,2}$ or $\underline{\mathbf{V}}_{initial,3}$ is equal to the number N of nodes in the feeder.
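The principle of a forward/backward sweep with a fixed slack voltage can be illustrated on a simplified case. The sketch below is a single-phase version for an unbranched line with constant-power nodes; it is not the sequence-component algorithm of [21]-[23], and the function name, units (volts, ohms, volt-amperes) and sign convention (positive power means consumption) are assumptions made for illustration.

```python
import numpy as np


def sweep_chain(v_slack, z_line, s_load, tol=1e-9, max_iter=100):
    """Single-phase forward/backward sweep on an unbranched feeder.

    v_slack: complex slack voltage, kept fixed during the iterations.
    z_line:  complex impedance of the segment feeding each node.
    s_load:  complex power drawn at each node (negative real part = injection).
    """
    z_line = np.asarray(z_line, dtype=complex)
    s_load = np.asarray(s_load, dtype=complex)
    v = np.full(len(s_load), v_slack, dtype=complex)

    for _ in range(max_iter):
        # Backward step: nodal currents, then the current in each segment
        # as the sum of all nodal currents downstream of it.
        i_node = np.conj(s_load / v)
        i_segment = np.cumsum(i_node[::-1])[::-1]
        # Forward step: accumulate voltage drops starting from the slack node.
        v_new = v_slack - np.cumsum(z_line * i_segment)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    raise RuntimeError("forward/backward sweep did not converge")


if __name__ == "__main__":
    # Illustrative values only: three nodes, PV injection at the last one.
    z = np.array([0.1 + 0.05j, 0.1 + 0.05j, 0.1 + 0.05j])
    s = np.array([1500 + 300j, 800 + 100j, -5000 + 0j])
    print(np.abs(sweep_chain(230.0 + 0j, z, s)))
```

With a net injection at the end of the line the computed voltage magnitude rises towards the last node, which is the effect that limits hosting capacity in the scenarios above.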
in symmetrical components
The unbalanced laterals are solved with a forward-backward method in phase components, which gives the power injections S lat,x,i per phase (x = a, b or c) at each lateral's root node i (the total power transited by the respective lateral). In the described case the unbalanced laterals are 1-phase lines; therefore, S lat,x,i is computed per phase by a 1-phase algorithm. The computed S lat,x,i replaces each unbalanced lateral at its root node in the main line. During the backward step, the phase currents due to the nodal loads are computed for the nodes of the main line with (A.4). Active power values P x,i are defined by the MC algorithm, whereas reactive power values Q x,i are calculated with constant values cosφ load < 1 (in case the node consumes energy from the network) and cosφ inj = 1 (in case the node injects energy into the network). Once the I load,abc (phase components) matrix is constructed, it is transformed by means of the Fortescue transformation into the respective I load,012 (sequence components) matrix. The specified nodal loads for the positive sequence S load,1 are calculated with (A.7). At this point, the positive sequence nodal voltages V 1 are computed by applying the 1-phase load flow algorithm of [23].
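The chain from sampled active power to sequence currents can be sketched for a single node as follows. The default load power factor of 0.95, the lagging-load sign convention (positive Q for consumption) and all function names are assumptions for illustration; the source only states that cosφ load is below 1 and cosφ inj equals 1.

```python
import numpy as np

# Fortescue operator and analysis matrix (phase a, b, c -> sequence 0, 1, 2).
ALPHA = np.exp(2j * np.pi / 3)
FORTESCUE = np.array([[1, 1, 1],
                      [1, ALPHA, ALPHA**2],
                      [1, ALPHA**2, ALPHA]]) / 3


def nodal_power(p_w, cos_phi_load=0.95):
    """Complex power of one phase: Q from a constant power factor.

    Positive p_w means consumption; an injecting node is given Q = 0,
    i.e. a unity power factor.
    """
    if p_w > 0:
        q_var = p_w * np.tan(np.arccos(cos_phi_load))
    else:
        q_var = 0.0
    return complex(p_w, q_var)


def phase_current(s_va, v_volt):
    """Current drawn by a constant-power load at the given phase voltage."""
    return np.conj(s_va / v_volt)


def to_sequence(i_abc):
    """Transform phase currents (a, b, c) into sequence currents (0, 1, 2)."""
    return FORTESCUE @ np.asarray(i_abc, dtype=complex)


if __name__ == "__main__":
    # Illustrative node: load on phases a and b, PV injection on phase c.
    v_abc = 230.0 * np.array([1, ALPHA**2, ALPHA])
    p_abc = [1200.0, 600.0, -3000.0]
    i_abc = [phase_current(nodal_power(p), v) for p, v in zip(p_abc, v_abc)]
    print(np.abs(to_sequence(i_abc)))
```

For a balanced positive-sequence set of currents only the positive-sequence component is non-zero; single-phase PV injection, as in the example, produces zero- and negative-sequence components as well.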