This chapter describes effective algorithms that are often used in processing measurement data in Global Navigation Satellite Systems (GNSSs). An existing algorithm was developed for the detection and elimination of outliers from GNSS measurement data. It is based on searching for a so-called optimal solution, for which the standard deviation and the maximum absolute deviation of the measured data from the mean values do not exceed specified threshold values while the number of detected outliers is minimal. A modification of this algorithm with a complexity of ∼N log₂ N is discussed. A generalization of the existing algorithm to the case when the data series includes an unknown trend is presented; the trend is assumed to be described by an unknown function of time. The generalized algorithm includes the outlier detection algorithm and a trend search algorithm that has been tested using simulated data. A new algorithm is presented for cycle slip repair using the Melbourne-Wübbena linear combination formed from GNSS measurement data on two carrier frequencies. Test results for repairing data in the case of multiple (cascade) cycle slips in actual observation data are also presented in this chapter.
- global navigation satellite systems (GNSSs)
- GNSS measurements
- data screening
- optimal solution
- trend function
- Melbourne-Wübbena combination
- cycle slips
At present, five constellations of GNSS satellites are involved in the formation of observational data, which serve as a source for many applications related to navigation, geodesy, and geodynamics, as well as for solving many fundamental problems. These are the American Global Positioning System (GPS), the Russian Global Navigation Satellite System (GLONASS), the European Galileo, the Chinese BeiDou, and the Japanese Quasi-Zenith Satellite System (QZSS). The satellites of each of the operating systems transmit signals, as a rule, on two L-band carriers, which are received by GNSS receivers. A large number of stations equipped with GNSS receivers and located around the world are part of the International GNSS Service (IGS) network. These stations generate observation data files and transmit them to international databases in real time [1, 2], after which the data become available for use by many institutions and laboratories around the world. When solving applied problems, the measurement data go through various processing steps. A significant element of data processing is the detection of rough measurements and their removal from further processing. Even though many laboratories use software that is high-end with respect to accuracy, reliability, and robustness, the presence of rough measurements in the observational data excludes the possibility of obtaining an accurate final result. In order to obtain highly accurate results, the measurement data must be cleared of coarse measurements, or outliers. It should be noted that the concept of an outlier is key in measurement processing theory, and there is no general definition for it. In order to distinguish outliers from the rest of the measured data, in some cases the deviation of the data series values from some average value of the data is considered. If the deviation from the average exceeds a predetermined threshold value, the measured value is considered an outlier.
Such an approach has a significant disadvantage: the exact mean is generally unknown, and the estimate obtained by averaging the series may be very inaccurate due to the outliers themselves. Existing iterative procedures are also based on the idea of calculating deviations from the average and often result in the unjustified rejection of many observations. Reducing the amount of data involved in processing may, in turn, result in a loss of accuracy of the final result.
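To make this weakness concrete, here is a minimal Python sketch of the threshold-on-deviation screening just described; the function name and the sample values are illustrative only:

```python
import statistics

def naive_screen(values, threshold):
    """Keep only values whose deviation from the sample mean does not
    exceed a fixed threshold (the approach criticized above)."""
    mean = statistics.fmean(values)
    return [v for v in values if abs(v - mean) <= threshold]

# A single gross outlier drags the mean toward itself, so even the
# good points end up far from the (distorted) mean and are rejected.
data = [10.0, 10.2, 9.9, 10.1, 50.0]   # one gross outlier
kept = naive_screen(data, threshold=5.0)   # rejects everything
```

Here the mean of the contaminated series is 18.04, so every value, including the four good ones, deviates from it by more than the threshold and the whole series is discarded.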
This chapter describes outlier-cleaning algorithms for GNSS data. The proposed algorithms are based on the search for a so-called optimal solution with the minimum amount of invalidly rejected data. An algorithm for the accelerated detection of outliers in a large number of measurements has been developed, as well as an algorithm for detecting outliers in data containing an unknown trend. In conclusion, an algorithm for jump detection in the Melbourne-Wübbena combination [3, 4, 5], including the developed procedure for cleaning data from outliers, is considered.
In Section 2, the problem of searching for the so-called optimal solution is formulated. Section 3 provides a search algorithm whose total number of arithmetic operations does not exceed ∼N². Section 4 presents test results for actual measurements in global navigation satellite systems at two carrier frequencies; outliers were sought in the Melbourne-Wübbena combination. In Section 5, the assertions that form the mathematical prerequisites for justifying a fast outlier search algorithm are proved. In Section 6, a fast outlier detection algorithm with a number of arithmetic operations of ∼N log₂ N is proposed. Section 7 describes the case of data with an unknown trend. An iterative outlier search procedure is proposed, based on finding a suitable trend approximation in the class of polynomials. The idea of excluding coarse measurements is based on finding a so-called minimizing set of measurement data of a given length. This distinguishes the proposed algorithm from similar known procedures in which outliers are detected by exceeding a preset threshold. Test results with simulated data are given. Sections 8 and 9 discuss the problem of detecting jumps in the Melbourne-Wübbena combination. An algorithm is proposed that includes the outlier cleaning procedure based on the search for the optimal solution. Section 10 shows numerical calculations with real data for the algorithms presented in Sections 3–9. Section 11 concludes the chapter.
2. Statement of the problem: the Melbourne-Wübbena combination
Often, measurements taken at moments of time t_j can be presented in the form
y_j = f(t_j) + z + ξ_j, j = 1, …, N, (1)
where f(t_j) is a trend function that depends on the physical process and, as a rule, is unknown; the sum z + ξ_j is an unknown random variable imposed on the trend, with an unknown constant z and a centered random variable ξ_j.
Detection of outliers in a data series expressed by Eq. (1) with an unknown trend is an ill-defined task, since the concept of a rough measurement, or outlier, itself is uncertain in this case. In many cases, however, the trend function is known a priori.
For example, many data processing programs use different linear combinations formed from code and phase measurement data to eliminate unknown parameters. One such combination is the Melbourne-Wübbena combination, composed of both carrier phase and code observables, as described by Melbourne and Wübbena. This combination eliminates the effect of the ionosphere, the geometry, the clocks, and the troposphere, and it is often used to detect loss of carrier phase lock in the preprocessing stages. The Melbourne-Wübbena combination generated for a specific satellite-receiver pair can be presented as the sum of three terms. One of the terms includes the integer wide-lane ambiguity for the two carrier frequencies; the second accounts for the satellite and receiver instrumental delays; and the third is the measurement noise, including carrier phase and code multipath. Thus, during a time interval in which the integer wide-lane ambiguity does not change, the Melbourne-Wübbena combination can be written as formula (1) with a constant trend term.
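For reference, the combination can be computed from dual-frequency observables as in the following Python sketch. The GPS L1/L2 frequency values are standard; the function name and the convention that all observables are already expressed in meters are assumptions of this sketch:

```python
# GPS L1/L2 carrier frequencies (Hz) and the speed of light (m/s)
F1, F2, C = 1575.42e6, 1227.60e6, 299_792_458.0

def melbourne_wubbena(phi1_m, phi2_m, p1_m, p2_m):
    """Melbourne-Wubbena combination (meters) from carrier phase
    (phi1_m, phi2_m) and code (p1_m, p2_m) observables, all in meters.
    Geometry, clocks, troposphere, and first-order ionosphere cancel,
    leaving the wide-lane ambiguity term, instrumental delays, and noise."""
    wide_lane_phase = (F1 * phi1_m - F2 * phi2_m) / (F1 - F2)
    narrow_lane_code = (F1 * p1_m + F2 * p2_m) / (F1 + F2)
    return wide_lane_phase - narrow_lane_code

# Wide-lane wavelength (~0.86 m): dividing the combination by it
# expresses the series in wide-lane cycles.
LAMBDA_WL = C / (F1 - F2)
```

For identical phase and code observables the combination vanishes, which reflects the cancellation of the geometric and atmospheric terms.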
Another example is the satellite clock correction values derived from navigation message data, which can also be represented as in Eq. (1) with a linear trend, where
z is an unknown constant that cannot be determined in advance, because the random component is not known and may contain outliers.
In Sections 2–6, we consider the case where the trend is known a priori, that is, the data can be presented as Eq. (2). A problem with an unknown trend will be discussed in Section 7. In principle, the outlier detection procedure described below is not affected by the measurement format expressed in Eq. (1) or (2); it can be applied to any set of data measurements .
The preliminary processing task includes the rejection of rough measurements, or outliers, from the data series (2). In other words, it is necessary to find a set of measured values satisfying conditions (3)–(5),
where the quantities compared in Eqs. (3) and (4) are the standard deviation (SD) and the maximum absolute deviation from the mean, together with their associated specified threshold values; MINOBS is a parameter limiting from below the length of the required set of measured values (e.g., 10), and we will assume hereafter that MINOBS < N.
The values that are not included in the set are treated as coarse measurements and removed from further processing. Typically, the expressions in Eqs. (3)–(5) are the only conditions considered in processing programs when screening out rough measurements. Below, we formulate the problem of searching for a solution that complements the conditions expressed in Eqs. (3)–(5) with two extremal conditions.
1. First, we will require that the length of the set sought be the maximum, that is, the number of measurements deemed to be coarse is the minimum:
Note that for the predetermined threshold values, a solution satisfying the conditions in Eqs. (3)–(5) may not exist (e.g., when the series includes a trend, in particular when it is an arithmetic progression with a sufficiently large step). When a solution does exist, we denote the value of L at which the maximum of Eq. (6) is reached as the optimal length. Note that the condition expressed in Eq. (6) does not ensure the uniqueness of the set, because several sets of this length can be found that satisfy the conditions in Eqs. (3)–(5).
Let us define as follows:
Note that the value
Usually, iterative methods are used to find a solution to the problem expressed in Eqs. (3)–(5). For example, the algorithm implemented in the observation data smoothing program (see ) is designed to find a set Y satisfying the conditions given by Eqs. (3)–(5). This step-by-step algorithm is based on iterations (the iteration number is designated by an upper index in parentheses):
Step 1: Initialization: ; ; k = 0.
Step 2: Checking the length of the set ; if it is less than MINOBS, then the process comes to an end and a solution is not found.
Step 4: Checking the fulfillment of the inequality . If it is satisfied, the set is accepted as the solution, and the search process comes to an end. Otherwise, there is a transition to Step 5.
Step 5: Definition of for outlier detection:
In order to prevent an infinite loop of iterations, a required verification is carried out:
If this inequality is not satisfied, then the following is assumed:
Step 6: Definition of a new set to include those and only those for which
Step 7: Increasing k by 1: k++. Transition to Step 2.
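The iterative rejection scheme above (Steps 1–7) can be sketched as follows; the rejection level c·SD in Step 5 and the exact form of the loop guard are assumptions of this sketch, since the chapter leaves their definitions to the implementation:

```python
import statistics

def iterative_screen(values, sd_max, c=3.0, minobs=10):
    """Sketch of the classical iterative rejection scheme.  Returns
    the surviving values, or None when fewer than minobs remain."""
    current = list(values)
    while True:
        if len(current) < minobs:
            return None                        # Step 2: no solution
        mean = statistics.fmean(current)
        sd = statistics.pstdev(current)
        if sd <= sd_max:                       # Step 4: SD threshold met
            return current
        level = c * sd                         # Step 5: rejection level
        # Guard against an infinite loop: if the level would reject
        # nothing, shrink it just below the maximum deviation so that
        # at least the worst point is rejected.
        max_dev = max(abs(v - mean) for v in current)
        if level >= max_dev:
            level = max_dev * (1 - 1e-12)
        current = [v for v in current
                   if abs(v - mean) <= level]  # Step 6: reject outliers
```

As the chapter notes, such a scheme terminates but does not, in general, find the optimal solution.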
Note that the optimal solution cannot be found in such a manner, as confirmed by numerical calculations (see Section 4).
3. Algorithm for solving the problem
and, therefore => ,
and, therefore => .
In the first case, Case (a), we replace the corresponding value in the set; in the second case, Case (b), we replace the other one. In either case, we obtain another set of the same length for which the conditions expressed in Eqs. (3) and (4) are satisfied, but whose SD, as follows from Eq. (3), is smaller. Consequently, the original set is not optimal, since the requirement expressed in Eq. (7) is not satisfied. The contradiction thus reached proves Assertion 1.
Further, note that if is the optimal solution of the problem given by Eqs. (3)–(7), then an arbitrary permutation of the numbers will also be the optimal solution of the problem described by Eqs. (3)–(7). Thus, the optimal solution does not depend on the arrangement of given numbers . This allows us to arrange the numbers in the order most suitable for searching the optimal solution. Taking advantage of this important circumstance, we arrange the values in the ascending order and we will look for the optimal solution in the ordered sequence. For brevity, the ordered sequence will also be denoted by . Thus, . Moreover, for simplicity of logic, we will assume that all are different, that is,
Note that due to the formulated Assertion 1, if is the optimal set and and are its smallest and greatest values, respectively, then all values of from the interval belong to . Consequently, considering Eq. (10), we have and
Thus, the optimal solution should be sought in the ascending sequence among all possible sets of length L with k and L satisfying the following conditions:
Hence, instead of searching through all possible sets of various lengths, whose number is enormous, for the solution of the problem described by Eqs. (3)–(7), it is sufficient to vary just the two parameters,
Let be the length of an arbitrary set . We introduce the designations:
Note that the last two inequalities directly follow from Eq. (4), the monotonicity of y_j, and obvious inequalities.
The following recursive relationships are available, making it possible to find the new values through the previously calculated ones in seven arithmetic operations:
The values of the fractions may be computed in advance as elements of a one-dimensional array. Analogously, the following formulas can make it possible to express and through and :
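A standard incremental form of such recursions is shown below; it is equivalent in spirit to Eqs. (17)–(22), although the exact expressions here are the textbook one-pass update rather than necessarily the chapter's formulas:

```python
def extend_stats(mean_kL, s2_kL, L, y_next):
    """Update the mean and the (population) variance of the window
    {y_k, ..., y_(k+L-1)} when the next ordered value y_(k+L) is
    appended, in O(1) arithmetic operations."""
    delta = y_next - mean_kL
    mean_next = mean_kL + delta / (L + 1)
    # L * s2_kL is the sum of squared deviations of the old window;
    # adding delta * (y_next - mean_next) extends it by one value.
    s2_next = (L * s2_kL + delta * (y_next - mean_next)) / (L + 1)
    return mean_next, s2_next
```

Precomputing the fractions 1/(L + 1) as a one-dimensional array, as the text suggests, removes the divisions from the inner loop.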
The algorithm described below is based on the search over all possible pairs (k, L), where L denotes the length of the set to be checked and k is the index of the smallest of the values included in the set. Here, k and L must satisfy the conditions of Eqs. (11) and (12). The set corresponding to each such pair must be checked for fulfillment of conditions (15) and (16).
We organize this search according to the algorithm described below, at each step of which we check the validity of Eqs. (15) and (16) for all possible sets of a certain length. We start the examination with Step 1, where we check the set of maximum length N. With each subsequent step, we reduce the length of the sets to be checked by 1.
Step 1: We consider the set of length N. There is only one such set: . We check it for fulfillment conditions expressed in Eqs. (15) and (16). If they are satisfied, this set is selected as a solution, and further search stops. Otherwise, transit to Step 2.
Step 2: We consider the sets of length N − 1. There are two sets of length N − 1.
We test each of these sets for compliance with the conditions specified by Eqs. (15) and (16). If none of them satisfies conditions (15) and (16), then we transit to the next step. Otherwise, the following options are available:
Step N − L + 1: We consider the sets of length L. If L < MINOBS, then the search process stops, and a solution is not found. For L ≥ MINOBS, we examine N − L + 1 sets of length L:
We check each of these sets for fulfillment of conditions (15) and (16). If any of them does not satisfy these conditions, then we transit to the next step where we consider the sets of length (L − 1). Otherwise, two options are possible:
In accordance with the proposed arrangement, we calculate the mean and squared SD values in the first step of the algorithm using the formulas expressed in Eqs. (13) and (14).
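The search over pairs (k, L) described in this section can be sketched as follows. The function and threshold names are mine, and prefix sums replace the chapter's recursions (17)–(22) while giving the same O(1) cost per window and ∼N² cost overall:

```python
import math

def optimal_solution(values, sd_max, dev_max, minobs=10):
    """Sort the data, then scan window lengths L = N, N-1, ... and all
    start indices k, checking the SD and max-deviation thresholds.
    Returns the longest acceptable window (the optimal solution),
    or None when no window of length >= minobs qualifies."""
    y = sorted(values)
    n = len(y)
    s1 = [0.0] * (n + 1)          # prefix sums of y
    s2 = [0.0] * (n + 1)          # prefix sums of y**2
    for i, v in enumerate(y):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v
    for L in range(n, minobs - 1, -1):
        for k in range(n - L + 1):
            mean = (s1[k + L] - s1[k]) / L
            var = (s2[k + L] - s2[k]) / L - mean * mean
            sd = math.sqrt(max(var, 0.0))
            # in a sorted window the largest deviation is at an endpoint
            max_dev = max(y[k + L - 1] - mean, mean - y[k])
            if sd <= sd_max and max_dev <= dev_max:
                return y[k:k + L]
    return None
```

Because window lengths are scanned in decreasing order, the first acceptable window is automatically the longest one, that is, the one with the fewest rejected measurements.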
4. Results of calculations using algorithm presented in Section 3
We test the algorithms discussed above on real data obtained at the ONSA station, which is part of the IGS network. These data are included in the distribution kit of the installation software package and are available for use. To check the efficiency of the proposed algorithm described in Section 3, we consider measurement data received from the GPS satellite with system number PRN = 12 for day 207 of 2010. Figure 2 plots the values of the Melbourne-Wübbena combination over a time interval of 89.5 min.
We also provide similar results for data obtained at the TLSE station, which is also part of the IGS network. We consider measurement data from the GLONASS (Russia) satellite with system number PRN = 1 for day 207 of 2010. Figure 4 shows the values of the Melbourne-Wübbena combination over a time interval of 65.5 min.
5. Mathematical prerequisites for modifying the existing algorithm
Note that the number of arithmetic operations required to find the optimal solution according to the algorithm described in Section 3 depends on the length of the solution found. As can be seen from Eq. (24), the smaller the length of the found solution (i.e., the larger the number of detected outliers), the more arithmetic operations are required to find it. This number of arithmetic operations is estimated to be of the order of N multiplied by the number of detected outliers. Thus, if the number of outliers in the original data series is comparable to N, it will take ∼N² arithmetic operations to find the optimal solution. In particular, to make certain that there is no solution (e.g., in the case where the data contain a trend), ∼N² arithmetic operations will also be required. In Section 6, we will modify the existing algorithm and describe a fast outlier search algorithm that requires ∼N log₂ N arithmetic operations.
The necessary preparations are given in this section. Note that in this and the next sections we are dealing with the sequence arranged in the ascending order.
One of two cases is possible:
Suppose, for example, the case (a) holds. Let us show that in this case
At first, we will show that:
and the above inequality derived in Case (a) implies:
Substituting expression (17) here in place of the corresponding term and writing for brevity,
Transform the right-hand side of this inequality:
Here we take into account the equality: . Next, we have:
Substituting this expression in Eq. (31), we get inequality
We introduce the notation:
Since k is chosen arbitrarily, then for all L = MINOBS, …, N − 1 the following inequalities hold:
which proves Assertion 3.
Assertion 3 implies the following corollary.
In the above-described procedure for solving problem (3)–(7), it takes ∼N² arithmetic operations to make certain that the solution does not exist. Taking into account Assertion 3 and Corollary 1, the search procedure may instead begin by checking the conditions.
This will require approximately ∼N arithmetic operations. If none of these conditions is fulfilled, the solution search must stop because the solution does not exist. As a result, only ∼N arithmetic operations are required to make certain that there is no solution.
6. Fast outlier search algorithm
The search procedure proposed above consists in calculating the values using the recurrent formulas (17)–(22) and checking, for every k and L, the fulfillment of the inequalities (11) and (12). The algorithm complexity is estimated as ∼N multiplied by the number of outliers found. If it is known a priori that there are few outliers in the measurement data, then the optimal solution search algorithm described in Section 3 can be applied. If the measurement data contain a number of outliers comparable to N, the complexity of that algorithm will be about ∼N². It is precisely for such data that we describe below a modified outlier search algorithm with a complexity of about ∼N log₂ N.
First of all, note one property that is key to the construction of the fast outlier search algorithm: if inequality (15) holds for some set of length L + 1, then there exists a set of length L for which inequality (15) is valid as well. Indeed, assume that for some k the inequality holds. This inequality and Eq. (25) imply
From this, it follows that
This means that at least one of these sets and with length of L satisfies conditions expressed in Eq. (15).
However, this property does not hold when checking the conditions expressed in Eq. (16). In other words, if these conditions are fulfilled for a set of length L + 1, it may happen that none of the sets of length L satisfies them. This fact is a significant obstacle to increasing the rate of outlier detection, which is necessary when processing a large amount of data with a large number of rough measurements. To overcome this obstacle, we will weaken the condition expressed in Eq. (16).
First of all, note that if for some set , the both conditions expressed in Eq. (16) are fulfilled, then the following inequality will hold:
It is easily seen that the condition expressed in Eq. (36) for an arbitrary set of length L + 1 implies the fulfillment of this condition for each of its two subsets of length L. In fact, the fulfillment of condition (36) for the larger set means the fulfillment of an inequality from which, due to the monotonicity of the ordered series, the corresponding inequalities for both subsets follow.
Thus, we have established the validity of the following assertion.
Based on this statement, we can formulate the following:
Step 0: Consider the segment of the numerical axis with endpoints MINOBS and N. In this step, there is one set of length N. We check it for fulfillment of the conditions expressed in Eqs. (15) and (36) with k = 1, L = N. If they are satisfied, this set is the solution, and the search stops. Otherwise, we pass to considering the sets of length MINOBS. We check each of these sets for fulfillment of the conditions expressed in Eqs. (15) and (36). If none of them satisfies these conditions, then we stop the search process and conclude that the solution does not exist. Otherwise, once we find a set of length MINOBS satisfying conditions (15) and (36), we transit to Step 1.
Step 1: Step 1 is the same as Step k described below for k = 1
Step k: On the
The search process continues either until case (b) occurs or until the length of the segment is less than or equal to 1. In either case, the total number of steps will not exceed ∼log₂ N. Since ∼N operations are performed in each step, the search process is guaranteed to terminate in ∼N log₂ N operations.
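Under the subset property established above for conditions (15) and (36), the feasibility of a window length is monotone: if some window of length L + 1 qualifies, so does some window of length L. A binary search over lengths then works. The sketch below is an interpretation of this section's idea; the names and the use of the spread bound y_(k+L-1) − y_k ≤ 2·dev_max in place of Eq. (36) are assumptions:

```python
import math

def feasible(y, s1, s2, L, sd_max, dev_max):
    """Check whether some window of length L in the sorted series y
    satisfies the SD bound and the weakened spread bound; s1, s2 are
    prefix sums of y and y**2.  Costs ~N operations."""
    n = len(y)
    for k in range(n - L + 1):
        mean = (s1[k + L] - s1[k]) / L
        var = (s2[k + L] - s2[k]) / L - mean * mean
        if math.sqrt(max(var, 0.0)) <= sd_max and \
           y[k + L - 1] - y[k] <= 2 * dev_max:
            return True
    return False

def fast_longest_feasible(values, sd_max, dev_max, minobs=10):
    """Binary search on the window length: at most ~log2(N) lengths are
    probed at ~N operations each, giving ~N*log2(N) in total.
    Returns the maximal feasible window length, or None."""
    y = sorted(values)
    n = len(y)
    s1, s2 = [0.0] * (n + 1), [0.0] * (n + 1)
    for i, v in enumerate(y):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v
    if not feasible(y, s1, s2, minobs, sd_max, dev_max):
        return None                       # ~N operations: no solution
    if feasible(y, s1, s2, n, sd_max, dev_max):
        return n
    lo, hi = minobs, n                    # invariant: lo feasible, hi not
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if feasible(y, s1, s2, mid, sd_max, dev_max):
            lo = mid
        else:
            hi = mid
    return lo
```

Note that, unlike the ∼N² search of Section 3, this sketch returns only the maximal feasible length; recovering the window itself takes one more ∼N pass at that length.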
7. Searching for an unknown trend in the class of power polynomials
The need to process GNSS measurements that include a trend on which noise and outliers are superimposed arises at different stages of data processing. As already stated above, satellite clock corrections contain a linear trend. In some cases, the trend may not be known, and then one has to search for it, for example, by the least squares method. The presence of outliers in the measurement data is a significant obstacle to the accurate determination of the drift and offset parameters of satellite clocks. Other examples are linear combinations of code and phase data on two carriers. To obtain high-accuracy results, it is necessary to detect outliers against an unknown trend and remove them from further processing. This is the subject of this section.
7.1 Statement of the problem
Consider the problem of outlier detecting in data presented in the form of Eq. (1), recall that:
The procedure described above for finding the optimal solution in an ordered series of numbers may not produce an adequate result if applied to data containing an unknown trend. For example, there may be no solution, and all data will be classified as outliers. In order to detect outliers in a series of numbers with a trend using the algorithm described above, it is necessary to find a suitable approximation of the unknown trend function and subtract it from the data set. Searching for this approximation is usually done by selecting functions from some functional class. The choice of the functional class depends on the task: in some cases, these may be power polynomials; in other cases, trigonometric polynomials, etc. The presence of outliers in the measurement data makes it much more difficult to find such an approximation. In this section, we describe a general approach to solving the problem and look for suitable approximations of the trend in the class of power polynomials with real coefficients:
Thus, the problem consists in the creation of an algorithm for searching for the trend in the class of power polynomials and detecting outliers in the specified data series represented by Eq. (1).
7.2 Minimizing set of specified length L
Before we turn to the trend search algorithm construction, we will define the so-called minimizing set of given length L, which plays an essential role in the trend search. In addition, in Section 7.3, we will describe a search algorithm for such set based on the recurrent formulas (17)–(19) and (20)–(22).
Let an arbitrary set of length L be composed of the values of a numerical series; monotonicity is not assumed. The mean and the SD values for this set are calculated by the standard formulas:
According to this definition, we have
The minimum in Eq. (41) is taken over all sets of length L composed of numbers of the series.
Note that the numbers are not assumed to be in ascending order.
Next, we will formulate and prove a statement similar to Assertion 1, which will allow us, when searching for a minimizing set, to proceed from the original series to its ordered permutation.
a. and therefore subject to inequality , we have
b. and therefore subject to inequality , we have
In the first case, Case (a), we replace the corresponding value in the set; in the second case, Case (b), we replace the other one. We want to show that, after such a replacement, the value expressed in Eqs. (40) and (41) will decrease. This will mean that the original set is not a minimizing one.
Suppose Case (a). For brevity, we will write below z instead of and instead of . Denote and , the similar values obtained after replacement with . We have:
We want to show that
After simplification with taking into account Eq. (45), we get from here:
From (42), it follows (recall that we write z for brevity) that the expression in the square brackets is strictly less than zero. Thus, inequality (48) is proven. From this it follows that the set is not a minimizing one, because condition (41) is not met. Thus, we have arrived at a contradiction, which proves the validity of Assertion 6. Case (b) is considered similarly.
7.3 Minimizing set search algorithm
Assertion 6 is much like Assertion 1, which made it possible to pass from an arbitrary numerical series to an ordered one in the optimal solution search (see Section 3). Similarly, Assertion 6 makes it possible to pass to an ordered number series when searching for the minimizing set of a given length. Considerations similar to those presented in Section 3 apply here. Namely, if a set is the minimizing set of length L, then an arbitrary permutation of its numbers will also be a minimizing set. Thus, the search process does not depend on the arrangement of the given numbers. This allows us to arrange them in the order most suitable for searching for the minimizing set. We arrange the values in ascending order and look for the minimizing set in the ordered sequence. As in Section 3, for brevity, the ordered sequence will be denoted in the same way as the original one. For simplicity, we will assume that all values are different, that is, Eq. (10) holds. We rewrite it for convenience:
Note that due to Assertion 6, if is the minimizing set and and are its smallest and greatest values, respectively, then all values of from the interval belong to . Consequently, considering Eq. (10), we have and therefore
Thus, in order to find a minimizing set of length L, it is sufficient for us to check N − L + 1 sets
and choose from them the one that has the minimal SD value. In the minimizing set search procedure, we use the designations expressed by Eqs. (13) and (14) to denote the mean and squared SD for sets composed of the ascending-ordered values.
The calculation of and when searching can be carried out in the manner similar to that of the optimal solution according to the schematic, in which only the is shown:
At first, we carry out the transitions in the direction of the vertical arrows (see diagram in Figure 7) and calculate sequential values using formula (17)–(19). Finally, we go in the direction of the horizontal arrows and calculate values of using formula (20)–(22) and choose from them the minimal one.
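For sorted data, the minimizing set search of this subsection therefore reduces to scanning the N − L + 1 contiguous windows and keeping the one with the minimal SD. A prefix-sum sketch (function names mine):

```python
def minimizing_set(values, L):
    """Return the window of length L with minimal SD among the
    N - L + 1 contiguous windows of the ascending-ordered series."""
    y = sorted(values)
    n = len(y)
    s1, s2 = [0.0] * (n + 1), [0.0] * (n + 1)   # prefix sums of y, y**2
    for i, v in enumerate(y):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v
    best_k, best_var = 0, float("inf")
    for k in range(n - L + 1):
        mean = (s1[k + L] - s1[k]) / L
        var = (s2[k + L] - s2[k]) / L - mean * mean
        if var < best_var:                       # keep smallest variance
            best_k, best_var = k, var
    return y[best_k:best_k + L]
```

Minimizing the variance is equivalent to minimizing the SD, so comparing variances avoids the square root in the inner loop.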
7.4 Trend search algorithm
In order to correctly detect outliers in measurement data that include an unknown trend, it is necessary to find the trend and remove it from the original data. The problem in determining an unknown trend is to find a suitable profile for the measurement data by adjusting the fitting parameters. This implies, in turn, the need to select from the specified series the "right" reference values used for fitting. If the data do not contain coarse measurements, all N values of the original data can be treated as reference values and used for fitting. If the data contain rough measurements, the participation of all or some of those values in the fit will result in a fatal distortion of the trend function and, as a result, an incorrect determination of the outliers. Thus, when determining the trend, it is necessary to exclude rough measurements from the reference values by which the trend approximations are built. We have a vicious circle: to determine the outliers, it is necessary to find a proper trend approximation, and to find an accurate trend approximation, we need to find all outliers and remove them from the values used for fitting. Another vicious circle is obtained when trying to choose the number of values for fitting. If we take it too small, in order to minimize the possibility of outliers being included in the fitting subset, the fit may not be accurate enough for the rest of the data values outside the subset. If, on the contrary, we take a rather large number of points for fitting, outliers may be among them, and we will again get a bad approximation of the trend.
We describe herein a strategy for finding an unknown trend and detecting outliers. This strategy assumes that the number of outliers in the series presented by Eq. (1) does not exceed a certain value known a priori. Thus, we can suppose that the number of “right” values suitable for fitting in the series is not less than N − .
Below L is supposed to be fixed and associated with the number of the reference values of the series (1) used for fitting, and L ≤ N − .
Let us consider the following algorithm. It contains internal iterations, which we will denote with the upper index “s” in parentheses.
Step 0: n = 0. We set some L, satisfying the condition L ≤ N − .
Step 1: n++; s = 0; .
Step 2: s++. We fit polynomial to the data set and find fitting coefficients :
Step 3: Consider the values obtained after subtracting the polynomial with the coefficients found in Step 2 from the data:
Using the algorithm described in Section 7.3, we find from the numbers defined in Eq. (52) a minimizing set of length L. For this set, we calculate the corresponding values of the mean and SD according to the formulas (40) and (41) in which is replaced by . We redefine the reference points for the next fit process (Step 2) by marking them with :
We transit to Step 2 and repeat until convergence of the iterates, s = 1, 2, …, is achieved. Note that we consider the issue of convergence further in Section 7.5. After convergence is reached, we transit to Step 4 for outlier detection.
Step 4: Searching the optimal solution for data set , j = 1, …, N.
We find the optimal solution for values using the algorithm of Section 3 and determine the number of outliers detected. If it turns out that ≤, then solution is considered found; searching process stops: . Otherwise, if >, then we transit to Step 1.
We do this until we find a solution or until
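A minimal Python sketch of the inner fitting loop (Steps 2 and 3) for the simplest trend class, a straight line (degree-1 polynomial), is given below. The minimizing set is found on the sorted residuals as in Section 7.3; the fixed iteration count in place of a convergence test and all names are assumptions of this sketch:

```python
import statistics

def fit_line(ts, ys):
    """Least-squares straight line y = a + b*t (a degree-1 polynomial,
    the simplest case of the trend class considered above)."""
    n = len(ts)
    tm, ym = sum(ts) / n, sum(ys) / n
    b = sum((t - tm) * (y - ym) for t, y in zip(ts, ys)) / \
        sum((t - tm) ** 2 for t in ts)
    return ym - b * tm, b

def trend_iteration(ts, ys, L, iters=10):
    """One round of the trend-search loop: fit on the current reference
    points, subtract the trend, find the minimizing set of length L of
    the residuals, and refit on it.  Returns ((a, b), reference_indices)."""
    ref = list(range(len(ts)))                 # start from all points
    for _ in range(iters):
        a, b = fit_line([ts[j] for j in ref], [ys[j] for j in ref])
        resid = [(ys[j] - (a + b * ts[j]), j) for j in range(len(ts))]
        resid.sort()                           # ascending residuals
        # minimizing set: the contiguous block of L sorted residuals
        # with the smallest variance (cf. Section 7.3)
        best, best_var = 0, float("inf")
        for k in range(len(resid) - L + 1):
            var = statistics.pvariance([r for r, _ in resid[k:k + L]])
            if var < best_var:
                best, best_var = k, var
        ref = sorted(j for _, j in resid[best:best + L])
    return fit_line([ts[j] for j in ref], [ys[j] for j in ref]), ref
```

After convergence, Step 4 would apply the optimal solution search of Section 3 to the residuals to declare the final set of outliers.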
7.5 Convergence of iterations in trend search algorithm
This section explains the convergence of the iterations described in the trend search algorithm (see previous section).
and therefore converges.
Using the defined in Eq. (53), rewrite the last equality as follows:
We transit to Step 2;
From the definition of the extremum of functional, it follows:
Taking into account Eq. (58), we have:
The extremum condition (one of n + 1) of functional is as follows:
From here, we derive:
Taking into account the designation for given by Eq. (52), we can write this equality in the form.
We transit to Step 3. In this step, for sequence , we find a minimizing set of length L: . The SD square for this set does not exceed the same magnitude for any other set, particularly for :
that is, , which proves Assertion 7.
8. Algorithm for detection and repair of cycle slips in the Melbourne-Wübbena combination
In this and the following section, we describe a cycle slip repair algorithm for observables represented in the form of the Melbourne-Wübbena combination, which is often used in modern GNSS measurement data processing programs. Loss of carrier phase lock by the receiver results in jumps in the code and phase measurement data. In the absence of jumps, as already discussed in Section 2, the values of the combination consist of measurement noise superimposed on an unknown constant value that depends on the specific satellite-receiver pair.
In the case of a temporary loss of carrier phase lock by the receiver, jumps occur in the series of values representing the Melbourne-Wübbena combination. The procedure of detecting jumps and eliminating them from the values of the combination, called cycle slip repair, is one of the most important steps of GNSS data preprocessing. The main difficulty in detecting jumps is that neither the exact sizes of the jumps nor their epochs are known. A number of algorithms, descriptions of which can be found, for example, in Refs. [3, 6, 8], have been proposed for the detection of jumps. Although they differ in detail, they are based on a common idea: comparison of the SDs of the time series of measurement data taken from one of the bounds of the time interval up to an arbitrary moment (epoch). If the difference of the SDs corresponding to two adjacent epochs exceeds a predetermined threshold value, a jump is declared in one of these two epochs. A drawback of such algorithms is the frequent false detection of jumps at epochs containing rough measurements (outliers), since the values of outliers can exceed the size of the jump itself. On the other hand, an attempt to increase the threshold value leads to the opposite effect: an inability to recognize jumps that are small in magnitude.
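The common idea described above can be sketched as follows. This is an illustration of the comparison-of-SDs approach only, not the chapter's algorithm; the function name and the threshold value are ours.

```python
import numpy as np

def detect_jumps_running_sd(x, threshold):
    """Flag a jump wherever the SD of the series x[0..k], computed for
    every epoch k from the left bound of the interval, grows by more
    than `threshold` between two adjacent epochs. A single outlier can
    trigger a false detection, which is the drawback discussed above."""
    sds = np.array([np.std(x[:k + 1]) for k in range(len(x))])
    jumps = np.flatnonzero(np.diff(sds) > threshold) + 1
    return jumps
```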
Below, we propose a robust cycle slip repair algorithm that detects jumps and determines their sizes more reliably than similar known algorithms. The proposed algorithm is based on a search for so-called clusters consisting of epochs in which the values of the combination are grouped about corresponding predefined values. In addition, this algorithm implements the above-described method of cleaning data from outliers based on the search for the optimal solution. This method, combined with Springer's algorithm used in Ref. , allows for the reliable determination of multiple (cascade) jumps in the Melbourne-Wübbena combination.
The Melbourne-Wübbena combination can be presented as the sum of three terms :
where  is the formal wavelength (for all GPS satellites  ≈ 86.19 cm and for GLONASS satellites  = 84.0–84.36 cm);  is an unknown integer, the so-called wide lane ambiguity ;
where  is an unknown piecewise-constant function of a discrete argument, about which it is known only that it takes integer values and can undergo integer jumps; β is an unknown constant; and  is a random variable. The problem consists in developing an algorithm for automatic processing of data of the form of Eq. (66) that makes it possible to determine integer-valued jumps reliably against the background of the noise component and outliers.
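For reference, the standard form of the combination and the wide-lane wavelength can be computed as follows. This is a sketch in Python; the GPS L1/L2 carrier frequencies are the standard published values, and the inputs are assumed to be carrier phases and pseudoranges already expressed in meters.

```python
# GPS L1/L2 carrier frequencies in Hz and the speed of light in m/s
F1, F2 = 1575.42e6, 1227.60e6
C = 299_792_458.0

# wide-lane wavelength: about 86.19 cm for GPS
LAMBDA_WL = C / (F1 - F2)

def melbourne_wubbena(L1, L2, P1, P2):
    """Melbourne-Wübbena combination in meters: the wide-lane phase
    combination minus the narrow-lane code combination. Geometry,
    clocks, troposphere, and first-order ionosphere cancel, leaving
    the wide-lane ambiguity term plus biases and measurement noise."""
    phase_wl = (F1 * L1 - F2 * L2) / (F1 - F2)   # wide-lane phase
    code_nl = (F1 * P1 + F2 * P2) / (F1 + F2)    # narrow-lane code
    return phase_wl - code_nl
```

If all four observables are equal (no ambiguity, no ionosphere, no noise), the combination is zero, which reflects the cancellation of the geometric term.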
9. Description of the proposed algorithm
Let us present the proposed algorithm as the following sequence of steps.
Arrange the array in ascending order and renumber it accordingly. Denote the resulting array as . For simplicity, we suppose that all  are different; therefore, we have: . Denote the corresponding permutation of the values of the function  as .
Step 1: In this step, we look for the region of maximum density of the array values . We consider segments of length Δ and look for the one that contains the largest number of values. Here, only those indexes
Step 2: Let be the segment found in the previous step, containing values of (see Figure 11). We calculate the mean
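Steps 1 and 2 can be sketched as follows. This assumes that candidate segments are anchored at the sorted values themselves (a standard sliding-window density search), which is one way to realize the search described above; the function name and return values are ours.

```python
import numpy as np

def densest_segment(values, delta):
    """Among all segments of length `delta` anchored at the sorted
    values, find the one containing the largest number of values.
    Returns (count, left_edge, mean_of_values_inside_the_segment)."""
    v = np.sort(values)
    # for each left endpoint v[i], count values falling in [v[i], v[i] + delta]
    right = np.searchsorted(v, v + delta, side='right')
    counts = right - np.arange(len(v))
    i = int(np.argmax(counts))
    inside = v[i:right[i]]
    return counts[i], v[i], inside.mean()
```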
Step 3: Cluster searching. The values that are clustered about the mean
all points of the segment [k, l] are marked by the same value: = … = .
at the points j = k, l of the left and right boundaries of the segment, the following inequality, Eq. (69), is satisfied:
the number of points at which Eq. (69) is satisfied is no less than the preset value MINOBS;
the number of consecutive points at which Eq. (69) is not satisfied does not exceed the predefined value MAXGAP (e.g., 5);
the value of k cannot be reduced while maintaining the requirements of a–d; and
the value of l cannot be incremented while maintaining the requirements of a–d.
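The cluster conditions (a)-(f) can be sketched as follows. This is a minimal illustration: the function name, arguments, and the greedy boundary-extension strategy are assumptions for the sketch, not the chapter's exact procedure.

```python
import numpy as np

def find_cluster(d, mean, mark, tol, minobs, maxgap):
    """Find a segment [k, l] of unmarked epochs such that the inequality
    |d[j] - mean| <= tol holds at both boundaries (condition b), at least
    `minobs` epochs inside satisfy it (condition c), no run longer than
    `maxgap` consecutive epochs violates it (condition d), and the
    segment cannot be extended further (conditions e and f)."""
    ok = (np.abs(d - mean) <= tol) & (mark == 0)
    seeds = np.flatnonzero(ok)
    if seeds.size == 0:
        return None
    k = l = seeds[0]

    def extend(idx, step):
        # push a boundary outward while the gap constraint allows it and
        # the new boundary epoch itself satisfies the inequality
        while True:
            j, gap = idx + step, 0
            while 0 <= j < len(d) and not ok[j] and mark[j] == 0:
                j += step
                gap += 1
            if gap <= maxgap and 0 <= j < len(d) and ok[j]:
                idx = j
            else:
                return idx

    k, l = extend(k, -1), extend(l, +1)
    if ok[k:l + 1].sum() < minobs:       # condition (c)
        return None
    return k, l
```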
This definition is illustrated in Figure 12.
It is understood that for specified values of
and since the arrays and differ only by permutation, it follows from Eq. (68) that the inequalities
If a cluster was found, we mark all of its points as 1 and repeat the cluster search procedure. If at least one cluster has been found at this step, we return to Step 1. If no cluster has been found, the search for clusters is complete and we proceed to Step 4.
Step 4: If at least one cluster has been found, we proceed to Step 5; otherwise: (
Step 5: Search for individual jumps in clusters. Let us assume that
In each of the clusters found, outliers and jumps of size 1 are possible. This follows immediately from the inequalities in Eq. (69) and the preestablished value 1.2.
Substep 5.1: To detect a jump of size 1 in a cluster, we use a modified Springer algorithm (see Refs. [9, 10]) combined with the algorithm proposed in Section 3, which executes a search for the optimal solution with a minimum quantity of defective data.
Substep 5.2: We find all epochs and values (
Substep 5.3: We repair the data by the value of each found jump, using the formula
Substep 5.4: We rename: .
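Substeps 5.3 and 5.4 amount to subtracting each detected jump from the tail of the series. A minimal sketch follows, assuming the jumps are given as (epoch, size) pairs in the same units as the series (for the Melbourne-Wübbena combination, an integer number of wide lanes times the wide-lane wavelength); this interface is ours, for illustration only.

```python
import numpy as np

def repair_jumps(d, jumps):
    """For each detected jump (epoch, size), subtract its size from all
    epochs at and after it, so that the repaired series is jump-free.
    Cascade jumps are handled by applying the corrections in sequence."""
    d = d.astype(float).copy()
    for epoch, size in jumps:
        d[epoch:] -= size        # remove the jump from the tail of the series
    return d
```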
Step 6: Marking of points outside clusters. All points outside of the found clusters (if any) are marked as outliers.
Step 7: Ordering of clusters. We renumber the clusters so that they are placed left to right on the time axis. For the ordered set of clusters [
Step 8: Data screening within clusters and improving the mean values of .
Substep 8.1: In accordance with the algorithm proposed in Section 3, we screen for outliers in each of the
Substep 8.2: For each of the clusters cleaned of outliers, we determine the modified mean values .
Step 9: Jumps between clusters. It follows from the description presented above that the remaining jumps in the data are on boundaries between clusters. If only one cluster is found, the algorithm is terminated: no additional jumps are detected, and the data are cleaned of outliers. If more than one cluster is found (
Step 10: Repair data. We delete the jumps between clusters using formula analogous to (72):
We rename: . The algorithm is terminated.
10. Numerical calculations using the algorithms presented in Sections 8 and 9
We present here the results of testing the proposed algorithm using real data obtained by the JOZ2 station, which is part of the IGS network . These data are included in the installation distribution set of the software package . Testing was carried out on data from the GPS satellite with PRN = 13 for day 207 of 2010. Figure 13 shows the Melbourne-Wübbena combination values in the 107 min time interval (
Figure 14a and b presents the values of the deviations from the mean value
This chapter has presented several effective and stable algorithms for processing data received from GNSS receivers. These data form the basis of almost all engineering applications in computational geodynamics, navigation, and cadastral surveying, as well as of numerous fundamental research works. The accuracy of the results obtained is significantly influenced by the quality of the data used in the calculations. In particular, the presence of rough measurements (outliers) in the observation data can significantly reduce the accuracy of the calculations. One of the tasks at the preliminary stage of data processing is the reliable detection and removal of rough measurements from the series of measured data with a minimal amount of rejected data. The so-called optimal solution, introduced in this chapter, makes it possible to detect and eliminate outliers from observed data while minimizing the number of rejected measurements. In addition, the data may contain a trend in the form of an unknown function of time. The strategy for determining the trend depends on the physical process in question, under the assumption that the trend is a continuous function of time. The efficiency of the search is strongly influenced by the choice of the function class in which the trend is sought. In this chapter, we considered the class of power polynomials, but in some cases this choice may not lead to the expected result; a class of trigonometric functions, for example, may be required to find a suitable trend. The automatic search for the best functional class, together with a strategy for effectively finding an unknown trend against the background of random noise and outliers, is a complex task for future research.