Open access peer-reviewed chapter

# Probabilistic Model of Delay Propagation along the Train Flow

Written By

Vladimir Chebotarev, Boris Davydov and Kseniya Kablukova

Submitted: 31 October 2017 Reviewed: 15 February 2018 Published: 26 September 2018

DOI: 10.5772/intechopen.75494

From the Edited Volume

## Probabilistic Modeling in System Engineering

Edited by Andrey Kostogryzov

Chapter metrics overview

View Full Metrics

## Abstract

In this chapter, we propose a probabilistic model for train delay propagation. There are deduced formulas for the probability distributions of arrival headways and knock-on delays depending on distributions of the primary delay duration and the departure headways. We prove some key mathematical statements. The obtained formulas allow to predict the frequency of train arrival delays and to determine the optimal traffic adjustments. Several important special cases of initial probability distributions are considered. Results of the theoretical analysis are verified by comparison with statistical data on the train traffic at the Russian railways.

### Keywords

• train traffic
• stochastic model
• train delay propagation
• probabilistic modeling
• operative management

## 1. Introduction

The trains’ movement is subject to a variety of random factors which leads to unplanned delays. This causes the scattering of the arrival times, hence, the inconvenience to passengers and consignees. Knowledge of the arrival times’ distribution properties leads to the possibility of predicting the characteristics of the train traffic and making correct decisions on the transportation process management. This makes it possible to improve the punctuality of train traffic and save resources, in particular, electric power.

The properties of the arrival headways distributions allow us to estimate the probability of delays emergence and theirs characteristics, which are important from a practical point of view. Probabilistic modeling of the delay propagation process along the train flow is the main tool for solving this problem.

The models for the distribution of delays in a dense train flow are divided into two classes. These are deterministic and stochastic models. Stochastic models take into account the unpredictable nature of obstacles in the railway. A mathematical model, proposed in the present chapter, make it possible to determine the probability distributions of the arrival headways of two consecutive trains at the station. The distribution properties are analyzed for different scattering of input random variables (the primary delay and the initial headways). Comparison of theoretical distributions with real statistics of train traffic on the Russian railways is performed.

## 2. Literature review

A substantial volume of literature is devoted to study of the train delays effect on the railway functioning. Deterministic models for primary and knock-on delays description were proposed in [1, 2]. These models based on the application of graph theory allow adjust the train traffic schedule. However, such approach considering the different characteristics of train traffic (e.g., travel and dwell times, headways, etc.) as deterministic values does not take into account the uncertainties that arise in reality.

Stochastic modeling takes the influence of random factors (e.g., see [3, 4, 5, 6, 7, 8]) into account. Authors of [7] determine a probabilistic distribution of the arrival times. The problem of finding a distribution of arrival train delays is examined in [8]. It should be noted that in these papers, special cases of primary delay distribution are considered. It is supposed in [8] that the random duration of the primary delay corresponds to some generalization of the exponential law. The paper [7] employs discretization of the delay distribution.

Some of the researchers have analyzed statistical data on deviations of the train arrival times from the planned ones. In particular, the papers [9, 10, 11] show that scattering of these deviations correspond to the exponential distribution.

## 3. Description of models and analysis of the arrival headways distribution

### 3.1. The first model

Trains follow one path one after another in one direction from station A to station B with the same average speed v 0 . Let the total number of trains is n. The distance from the train j to the train (j − 1) is denoted by X j + s 0 , where j = 2, 3, …, n, s 0 > 0 is the minimal safe distance between trains, and X 2 , X 3 , …, X n are the random variables (without any assumptions about their distributions). All trains have the same destination station.

Let us also introduce the notations: μ j = X j / v 0 , t 0 = s 0 / v 0 . Suppose that train 1 departs from station A at the time t = 0 . Then, the moment T m of departure train m can be found as (as shown at Figure 1):

T m = j = 2 m μ j + m 1 t 0 ,   m = 2 , 3 , , n E1

Assume that at some point in time, train 1 makes unplanned stop. The duration of this stop is random value τ . The subsequent trains suffer knock-on delays, when the value τ is large enough. Following train stops when the distance to the front train is reduced to s 0 . It is assumed that as soon as the front train restore running, then the next one immediately follows it. The following problem is considered: to find out the probability distribution of the random arrival headway between the trains (k − 1) and k at the destination B (denote this headway as ν k ), assume that only the first train makes an unplanned stop. In other words, we need to find the (cumulative) distribution functions W k t = Ρ ν k < t , k = 2, 3, …, n. Call this problem by the first problem.

### 3.2. The second model

Suppose that train 1 was delayed at station A at the moment t = 0 and waited for a random time τ . If τ < μ 2 , then trains 2, 3, and so on, depart at the planned times: T 2 , T 3 , etc. If τ > μ 2 , then train 2 will be delayed and will depart at the time τ + t 0 > T 2 . Train 3 departs according to the same rule depending on the delay time of train 2, and so on. In this formulation, ν k is actual departure headway between the trains with numbers (k − 1) and k. It is required to determine the distribution functions W k t of random variables ν k , k = 2, 3, …, n.

Example 1. Let n = 5, μ k = 2 , k = 2 , 5 ¯ , t 0 = 1 . The moments of planned departures of trains satisfy the equalities T k = 3 k 1 , k = 1 , 5 ¯ . Figure 2 shows the process of headways ν k forming, k = 2 , 5 ¯ , depending on the six values of the interval τ . The dots represent real train departure times that result from the primary delay τ .

The basic model assumptions are follows: (1) only train 1 is exposed to primary delay τ . (2) T k T k 1 > t 0 , k = 2, 3, …, n.

Denote by R k the real departure time of the train with number k, which depends on τ and t 0 .

We suppose that the departure times of trains satisfy the following two rules. Let k be fixed, 2 k n . The first rule: if R k 1 T k t 0 , then R k = T k . The second rule: if R k 1 T k t 0 , then R k = R k 1 + t 0 . Obviously, R k T k .

In what follows, we use the notation I x A = 1 , if x A , 0 , if x R \ A , where A is an arbitrary set on the real line R.

Suppose that the total number of trains is equal to n 2 . Formally, we set ν k = 0 if k > n . Let us proceed to the formulation of the obtained results. We note that the proofs of the majority of the assertions are not given here due to the condition on the size. They take up a lot of space and will be published in our other work.

Theorem 1. 1. If τ < μ 2 , then ν 2 = μ 2 + t 0 τ , ν k = μ k + t 0 , 3 k n .

2. Let k be a fixed integer, 2 k n . If τ j = 2 k μ j , then ν 2 = = ν k = t 0 .

3. If j = 2 k μ j τ < j = 2 k + 1 μ j , then

ν k + 1 = I k + 1 n j = 2 k + 1 μ j + t 0 τ , E2

ν m = I m n μ m + t 0 ,   m = k + 2 , , n E3

Theorem 2. Let n 2 . For any k, 2 k n , the following formula holds

W k t = I t > t 0 Ρ μ k < t t 0 τ < j = 2 k 1 μ j + Ρ τ + t t 0 > j = 2 k μ j τ j = 2 k 1 μ j , E4
in particular,
W 2 t = I t > t 0 Ρ τ + t t 0 > μ 2 E5

Let us introduce the notations, G x = Ρ τ < x , G ¯ x = Ρ τ > x . Note that G x + G ¯ x + Ρ τ = x = 1 . We denote by g x the density function of τ in the case when it is absolutely continuous.

Further, some corollaries of Theorem 2 are formulated.

Corollary 1. Let μ j , 2 j n , be arbitrary positive numbers, then for 2 k n

W k t = I t 0 < t μ k + t 0 G ¯ j = 2 k μ j t + t 0 + I t > μ k + t 0 , E6
in particular,
W 2 t = I t > t 0 G ¯ μ 2 t + t 0 . E7

Example 2. Let the primary delay τ have exponential distribution, that is,

g t = I t 0 λ e λt ,   λ > 0 E8

As initial parameters, we take the following quantities.

λ = 0.4 ,   t 0 = 3 ,   μ 2 = 5 ,   μ 3 = 6 ,   μ 4 = 10 E9

Graphs of the functions W 2 t from Eq. (7), W 3 t and W 4 t from Eq. (6) with the parameters (Eq. (9)) are depicted in Figure 3.

It should be noted that in this and the subsequent examples, we use the following measures for the values: μ k , T k , t 0 , τ , τ k , ν k , T , b , Ε ν k (minutes, min); λ (1/min); D ν k (min2). The product αβ (as mean of μ k ), where α is a shape parameter, β is a scale parameter (in min).

Corollary 2. Let μ j = T , 2 j n , be a positive constant, then for 2 k n

W k t = I t 0 < t T + t 0 G ¯ k 1 T t + t 0 + I t > T + t 0 , E10
in particular,
W 2 t = I t > t 0 G ¯ T t + t 0 . E11

Example 3. Let τ has density (Eq. (8)). As initial parameters, we take the following quantities:

λ = 0.4 ,   t 0 = 4 ,   T = 8 . E12

Graphs of the functions W 2 t from Eq. (11), W 3 t and W 4 t from Eq. (10) with the parameters (Eq. (12)) are depicted in Figure 4.

Figures 3 and 4 show that in the case of constant μ j , the primary delay τ practically does not affect the fourth train and all subsequent ones. This is consistent with the equality lim k W k t = I t > t 0 + T which, as it is not difficult to verify, follows from Eq. (10).

Remark 1. It is known that the distribution of sum of the independent random variables is the convolution of their distributions. The convolution of distribution functions F 1 and F 2 is determined by the formula F 1 F 2 x = F 1 x y dF 2 y , where the integral sign means the improper Riemann-Stieltjes integral. We consider exceptionally piecewise-continuous distribution functions, then the indicated integral exists with the exception of the case when F 1 and F 2 have at least one common discontinuity point (e.g., [12]). The convolution operation is permutable. In the case, when F 1 = F 2 = = F m = F , we shall use the following notations: F 2 F F , F m F F m 1 , m 2 . By definition, we assume that F 1 F . The convolution f 1 f 2 x of densities f 1 and f 2 is defined as the improper Riemann integral f 1 x y f 2 y dy .

Corollary 3. Let μ j , 2 j n , be independent identically distributed random variables with a continuous distribution function Ψ x . Let τ be independent of μ j , 2 j n . Then

W 2 t = I t > t 0 G ¯ z t + t 0 d Ψ z , E13
W k t = I t > t 0 Ψ t t 0 + t t 0 G ¯ z + u t + t 0 d Ψ z d Ψ k 2 u ,   3 k n . E14

Corollary 4. Let μ j , 2 j n , be independent identically distributed random variables with a density function ψ x . Let τ be independent of all μ j and has a density function g x . Then

W 2 t = I t > t 0 z t + t 0 g x dx ψ z dz , E15
W k t = I t > t 0 t t 0 ψ z dz + t t 0 z + u t + t 0 g x dx ψ z dz ψ k 2 u du , E16
3 k n .

Remark 2. The integration limit “ ” can be replaced by 0 in Corollaries 3 and 4 if μ j 0 . On the other hand, we may consider in these corollaries the case when μ j takes values of different signs. From a practical point of view, such an approach is acceptable if the probability that these random quantities take negative values is small enough. This assumption allows to consider, for example, models in which the random variables μ j are normally distributed with a variance small enough and to use the property that the class of normal distributions is closed with respect to the convolution operation.

Example 4. Let τ has the density (Eq. (8)), and all μ j have the same gamma density

ψ t = I t > 0 e t / β t α 1 Γ α β α , E17

where α > 0 , β > 0 , Γ α = 0 x α 1 e x dx is gamma function. Put

λ = 0.3 ,   t 0 = 5 ,   α = 14 ,   β = 0.5 . E18

One can show that in the example under consideration it follows from Eqs. (15) and (16) that

W k t = I t > t 0 1 Γ α t t 0 + b / β Γ α + ae λ t t 0 + b 1 1 + λβ k 1 α Γ α 1 + λβ t t 0 + b / β Γ α ,

where Γ α y = y x α 1 e x dx is incomplete gamma function. Graphs of the distribution functions W k t , 2 k 5 , with the parameters (Eq. (18)) are depicted in Figure 5.

It is not difficult to verify that for W k t from Example 4 the following formula holds:

lim k W k t a = 1 , b = 0 = lim k I t > t 0 1 Γ α t t 0 / β Γ α + e λ t t 0 1 1 + λβ k 1 α Γ α 1 + λβ t t 0 / β Γ α = W t I t > t 0 1 Γ α t t 0 / β Γ α .

It can be seen from Figure 5, curves W 4 t , W 5 t and so on are practically merged. Hence, in the case under consideration, one can draw the following conclusion: primary delay τ affects to fifth and all successive trains approximately like on the fourth one.

Remark 3. We define the 0-fold convolution as a generalized function with the following property: the equality v t ψ 0 t dt = v 0 holds for any bounded continuous function v t . Then, Eq. (16) for k = 2 coincides with Eq. (15).

We do not give proofs for the statements of Section 3 because of limitations on the volume. We will make this in another work.

## 4. Some results on the knock-on delays

Denote by N the random number of knock-on delays (within the framework of the model under consideration).

Lemma 1. For each fixed integer m, 1 m n 1 ,

Ρ N m = Ρ τ > j = 2 m + 1 μ j . E19

Proof. Easily seen: N = 0 = t 0 τ + t 0 T 2 , N = m = T m + 1 < τ + mt 0 T m + 2 t 0 ,

m = 1, 2, …, n – 2, N = n 1 = τ + n 1 t 0 > T n . This implies that

Ρ N m = Ρ τ + mt 0 > T m + 1 = Ρ τ > j = 2 m + 1 μ j .

Here and below, the sign □ denotes the end of the proof.

The corollaries of this lemma are given below. Their proofs are simple and therefore we do not present them.

Corollary 5. If μ j = T is a constant value, 2 j n , then for every fixed integer m, 1 m n 1 , we have the equality Ρ N m = G ¯ mT .

Corollary 6. If μ j = T is a constant value, 2 j n , and τ is exponentially distributed with parameter λ , then for every fixed integer m, 1 m n 1 , the following equality holds,

Ρ N m = e λmT .

Corollary 7. If μ 2 , …, μ n are independent identically distributed random variables with a density function ψ , then for every fixed integer m, 1 m n 1 , we have the equality

Ρ N m = G ¯ u ψ m u du .

In what follows, τ 1 = τ is the delay duration of the first train, τ k , k = 2, …, n, is the knock-on delay of the k-th train. The problem is to find the distribution functions G k t = Ρ τ k < t , k = 2, 3, …, n. Note that the solution of this problem, which we call by the second problem, allows us to find the distribution of the deviations of the real arrival times from the planned ones.

In what follows, we will use the notation a b instead of max a b .

Theorem 3. The following formula holds:

τ k = τ k 1 μ k 0 ,   2 k n . E20

Corollary 8. The following formula holds:

τ k = τ j = 2 k μ j 0 ,   2 k n . E21

It should be noted that within the framework of our model the deviation of the real arrival time from the planned one for k-th train coincides with τ k , 1 k n . Figure 6 illustrates this statement.

The dotted lines (lines 1 ¯ and 2 ¯ ) represent the scheduled trajectories of trains 1 and 2, solid lines (1 and 2) depict the real trajectories taking into account the delays. It can be seen that the arrival time of the train 1 differs from the schedule at τ and the train 2 on the τ 2 .

Denote μ ¯ k = j = 2 k μ j , 2 k n . As it follows from the assumption that the random variables μ 2 , , μ k have the same distribution function Ψ t . They are mutually independent. The random variable μ ¯ k has the distribution function Ψ k 1 t .

Corollary 9. The distribution function of τ k has the following form:

G k t = I t > 0 Ρ τ μ ¯ k < t ,   2 k n . E22

The next Corollaries 10 and 11 follow from Corollary 9 in an obvious way.

Corollary 10. Let μ j > 0 , 2 j n be some constant values. Then

G k t = I t > 0 G t + μ ¯ k . E23

Corollary 11. Let μ j = T > 0 , 2 j n be a constant value. Then

G k t = I t > 0 G t + k 1 T . E24

Corollary 12. Let μ j , 2 j n be independent identically distributed random variables with a continuous distribution function Ψ t . Let τ be independent of μ 2 , , μ n . Then

G k t = I t > 0 G t + y d Ψ k 1 y . E25

Corollary 13. Let μ j , 2 j n be independent identically distributed random variables with a density function ψ t . Let τ be independent of μ j , 2 j n and has a density function g t . Then G k t = I t > 0 t + y g z dz ψ k 1 y dy .

## 5. Proof of Theorem 3 and its corollaries

Lemma 2. The following formula is valid:

τ 2 = τ μ 2 0 . E26

Proof. Let t > 0 be the time spent by the train on the path length (distance to the place, where an unplanned stop of the train 1 occurred). We show the equality τ 2 = 0 holds under the condition τ μ 2 . The departure time of the train 1 after stopping is t + τ . The time point when train 2 reaches s can be written as μ 2 + t 0 + t . The knock-on delay of train 2 will not occur, i.e., τ 2 = 0 , in the case, when the indicated time points are separated by the value r t 0 , i.e., r = μ 2 + t 0 + t t + τ t 0 , or, which is the same thing, τ μ 2 . The considered case is illustrated in Figure 7a.

The knock-on delay of the duration τ 2 = τ μ 2 will occur when τ > μ 2 . Indeed, since trains after a random stop depart simultaneously, then the equality t + τ = μ 2 + t 0 + t t 0 + τ 2 holds, i.e., τ = μ 2 + τ 2 . The case under consideration is illustrated in Figure 7b. Thus, the validity of Eq. (26) is shown. □

Proof of Theorem 3. We shall use the method of mathematical induction. The equality (Eq. (20)) for k = 2 is established by Lemma 2. Let Eq. (20) be satisfied. We show that:

τ k + 1 = τ k μ k + 1 0 , 2 k + 1 n . E27

It follows from the inductive hypothesis that τ k = 0 under the condition τ k 1 μ k . But if the delay of the k-th train is 0, then the next train does not undergo any delay, that is, τ k + 1 = 0 . The present case is illustrated in Figure 8.

In the case, when τ k 1 > μ k , a knock-on delay of the k-th train occurs and equals to τ k = τ k 1 μ k (according to the inductive hypothesis). Further, two cases are possible: either (1) a delay τ k entails a delay τ k + 1 , or (2) τ k + 1 = 0 .

Case 1. If the k-th train is delayed, then (k + 1)-th one will be delayed only if τ k > μ k + 1 , and its delay duration is τ k + 1 = τ k μ k + 1 (this fact follows from the equality of the moments of departure of the k-th and (k + 1)-th trains after an unscheduled stop: T k + t k 1 t 0 + τ k = T k + 1 + t kt 0 + τ k + 1 ). Case 1 is illustrated in Figure 9a.

Case 2. If the k-th train is delayed, then (k + 1)-th one will not be delayed ( τ k + 1 = 0 ) only if τ k μ k + 1 . Case 2 is illustrated in Figure 9b. Note that if the knock-on delay of the k-th train occurs, a conflict of the k-th train with (k + 1)-th is described similar to the interaction of trains 1 and 2 (see Lemma 2). All described cases lead to Eq. (20).□

Proof of Corollary 8. We indicate that Eq. (21) is similar to Eq. (20). According to the statement of Theorem 3, we have:

τ 2 = τ μ 2 0 ,   τ 3 = τ 2 μ 3 0 ,   τ k = τ k 1 μ k 0 . E28

Using the method of mathematical induction and taking into account that μ ¯ k 1 + μ k = μ ¯ k , we obtain Eq. (21) from Eq. (28).□

Proof of Corollary 9. It follows from Corollary 8 that τ k = 0 if τ μ ¯ k (see, e.g., Figure 8), and τ k = τ μ ¯ k if τ > μ ¯ k (see, e.g., Figure 9a). Using the law of total probability, we obtain the following chain of equalities:

G k t = Ρ τ k < t = I t > 0 Ρ τ k < t = I t > 0 Ρ τ k < t τ μ ¯ k Ρ τ μ ¯ k + Ρ τ k < t τ > μ ¯ k Ρ τ > μ ¯ k = I t > 0 Ρ τ μ ¯ k 0 + I t > 0 Ρ 0 < τ μ ¯ k < t = I t > 0 Ρ τ μ ¯ k < t .

Proof of Corollary 12. Apply the well-known assertion to Eq. (22): if Y 1 and Y 2 are independent random variables, then for any function of two variables f and any c R , the following equality holds: Ρ f Y 1 Y 2 < c = Ρ f y Y 2 < c dF 1 y , where F 1 is the distribution function of Y 1 . Consequently, G k t = I t > 0 Ρ τ y < t d Ψ k 1 y . This implies Eq. (25).□

Proof of Corollary 13. The assertion follows from Eq. (25).□

Note that the function G k t has a jump at zero which is equal to:

G k 0 + = G + y d Ψ k 1 y , where G + y = lim t 0 + G t + y .

In the case, when τ and μ j are absolutely continuous, it follows from Eq. (25) that

G k t = I t > 0 t + y g z dz ψ k 1 y dy , E29

where g and ψ are the density functions of τ and μ 1 , respectively, ψ j y is the j-fold convolution of the density ψ . In this case, we also have

g k t I t > 0 G k t = I t > 0 g t + y ψ k 1 y dy . E30

If we assume that τ 0 , then we deduce from Eq. (29) that

G k t = I t > 0 t 0 t + y g z dz ψ k 1 y dy , E31
and we get from Eq. (30),
g k t = I t > 0 t g t + y ψ k 1 y dy . E32

## 6. Corollary of Theorem 2 when the distribution of primary delay is a mixture of exponential and one-point distributions

Consider the cumulative distribution function of the following type:

G x Ρ τ < x = I x b 1 ae λ x b , E33

where 0 a 1 , b 0 , and λ > 0 are some parameters. Such distribution function is considered, for example, in [8]. It is easy to see that G x = 1 a G 0 x b + aG x b λ , where G 0 x is the distribution function of the degenerate distribution concentrated at the point x = 0 , G x λ = I x 0 1 e λx .

Let us find out the form of the distribution functions (Eqs. (13) and (14)) in the case of Eq. (33), when the function Ψ is continuous. In what follows, we mean that n 3 .

Lemma 3. Let the function G be defined by Eq. (33), and Ψ be continuous. Then

W 2 t = I t > t 0 Ψ t t 0 + b + ae λ t t 0 + b t t 0 + b e λz d Ψ z , E34
W k t = I t > t 0 Ψ t t 0 + ae λ t t 0 + b b e λu d Ψ k 2 u t t 0 e λz d Ψ z + b e λu t t 0 u + b e λz d Ψ z d Ψ k 2 u + b Ψ t t 0 u + b Ψ t t 0 d Ψ k 2 u , k 3 . E35

Proof. According to Eq. (33), one may conclude that function G ¯ x has a unique discontinuity point x = b . Hence, the integral G ¯ z t + t 0 d Ψ z exists for any continuous distribution function Ψ . Note that if Ψ z had a discontinuity point z = t 1 , then the function G ¯ z t + t 0 would also be discontinuous at the point z = t 1 for t = t 0 + t 1 b , and then the considered integral would not exist (see Remark 1). Since

G ¯ x = I x < b + I x b ae λ x b , E36
then
G ¯ z t + t 0 d Ψ z = Ψ t t 0 + b + ae λ t t 0 + b t t 0 + b e λz d Ψ z . E37

In accordance with Eq. (13), the relation (Eq. (34)) is proved.

Let k 3 . It follows from Eq. (14) that

W k t = I t > t 0 Ψ t t 0 + V u d Ψ k 2 u , E38

where V u = t t 0 G ¯ z + u t + t 0 d Ψ z . Given Eq. (36), it is easy to see that

V u = V 1 u + V 2 u , E39

V 1 u = ae λ u t + t 0 b t t 0 I z + u t + t 0 b e λz d Ψ z , V 2 u = t t 0 I z + u t + t 0 < b d Ψ z .

By using equalities

u z : u b z > t t 0 z t t 0 u + b = u z : u b z > t t 0 , u z : u < b z > t t 0 z t t 0 u + b = u z : u < b z t t 0 u + b ,
V 1 u d Ψ k 2 u = ae λ t t 0 + b b t t 0 e λ z + u d Ψ z d Ψ k 2 u + b t t 0 u + b e λ z + u d Ψ z d Ψ k 2 u . E40

Since u z : u b z > t t 0 z < t t 0 u + b = ,

u z : u < b z > t t 0 z < t t 0 u + b = u z : u < b t t 0 < z < t t 0 u + b ,
then
V 2 u d Ψ k 2 u = b t t 0 t t 0 u + b d Ψ z d Ψ k 2 u . E41

It follows from Eqs. (39)(41) that

V u d Ψ k 2 u = ae λ t t 0 + b b e λu d Ψ k 2 u t t 0 e λz d Ψ z + b t t 0 u + b e λz d Ψ z e λu d Ψ k 2 u + b Ψ t t 0 u + b Ψ t t 0 d Ψ k 2 u . E42

The equalities Eq. (38) and Eq. (42) entail Eq. (35).□

Below we give without a proof a corollary of Lemma 3 in the case when μ j are not random variables, and they are equal to the same constant.

Corollary 14. Let μ j = T > 0 , 2 j n , be a constant. Let function G be defined by Eq. (33). Then, for 2 k n , the following formula holds:

W k t = I 0 b k 2 T I 0 < t t 0 T ae λ k 1 T t + t 0 b + I t t 0 > T + I k 2 T < b < k 1 T I 0 < t t 0 k 1 T b ae λ k 1 T t + t 0 b + I t t 0 > k 1 T b + I b k 1 T I t > t 0 . E43

Furthermore,

Ε ν k = I 0 b k 2 T t 0 + T a λ e λ k 2 T b 1 e λT + I b k 1 T t 0 + I k 2 T < b < k 1 T t 0 + k 1 T b + a λ e λ k 1 T b 1 . E44

D ν k = I 0 b k 2 T a λ 2 e λ k 2 T b 2 1 e λT ae λ k 2 T b 1 e λT 2 2 λ Te λT + I k 2 T < b < k 1 T a λ 2 e λ k 2 T b 2 e λ k 2 T b e λT ae λ k 2 T b e λ k 2 T b e λT 2 2 λ k 1 T b e λT . E45

Example 5. Figure 10 depicts the graphs of the functions W k t defined by Eq. (43) with k = 2, 3 for the following parameters:

a = 1 ,   b = 0 ,   λ = 0.26 ,   t 0 = 4 ,   T = 7 . E46

We calculated the values of Ε ν k and D ν k using the formulas (44) and (45) (see Table 1).

k = 2 k = 3 k = 5 k = 8 k = 10
Ε ν k 7.77702 10.47779 10.98629 10.99994 10.99999
D ν k 5.68009 2.33067 0.068156 0.00029 7.63176 × 10−6

### Table 1.

The behavior Ε ν k and D ν k with growth of the parameter k.

Remark 4. It can be easily seen that the larger k, W k t from Eq. (43) is closer to W t I b 0 I t > t 0 + T . This agrees with Figure 10 and the formulas (44) and (45) due to which we have Ε ν k t 0 + T , D ν k 0 as k , and also with the results of calculations in Table 1.

Let the random variable τ be distributed with the density (Eq. (33)) with parameters a = 1 , b = 0 . Now, we find the condition on the parameter T, under which the probability that at least m of knock-on delays will occur would not exceed a given probability p. Note that the departure headway is equal to T + t 0 .

According to Corollary 6, it is necessary to solve the inequality exp λmT p . As a result, we obtain the desired condition:

T 1 / ln 1 / p E47

(see also [13]). Denote by T m p λ the minimal T satisfying the inequality (Eq. (47)).

Example 6. Let us fix λ = 0.26 . The behavior of T m p λ as a function of the continuous parameter m with p = 0.1 and p = 0.05 is shown in Figure 11a. Obviously, T m p λ is the decreasing function with respect to the argument p. Exact calculations can be made using the formula:

T m p λ = 1 / ln 1 / p . E48

Let p = 0.1 . The behavior of T m p λ as a function of the continuous parameter m with λ = 0.26 and λ = 0.15 is shown in Figure 11b. In accordance with Eq. (48), T m p λ is the decreasing function with respect to the argument λ . In the case of exponential density g t , we have Ε τ = 1 / λ . Therefore, the decrease of λ leads to increase in the average of primary delay and the departure headways (if we want to reduce the number of knock-on delays).

We also obtain the corollaries of Lemma 3 in the case when μ j are distributed according to the gamma-law with the density (Eq. (17)).

Corollary 15. If primary delay τ has an exponential distribution g t = I t > 0 λ e λt and μ k , 2 k n , has the density (Eq. (17)), then the following formulas are true:

G k t = I t > 0 1 e λt λβ + 1 k 1 α , E49
g k t = I t > 0 λβ + 1 k 1 α λ e λt . E50

Remark 5. The function g k t = I t > 0 G k t is not a density, in particular, because of g k t dt = 0 g k t dt = G k G k 0 + = 1 h k 1 , where h k is the jump of the function G k t at the origin. At the same time, the function g ˜ k t 1 1 h k g k t is a density.

Corollary 15 can be reformulated as follows.

Corollary 15*. Let primary delay τ is exponentially distributed with a parameter λ , and μ k , 2 k n , have the same gamma distribution with the density (Eq. (17)). Then, τ k has the distribution function of the form Eq. (33) with a = λβ + 1 k 1 α , b = 0, and, consequently,

Ρ τ k = 0 = G k 0 + = 1 λβ + 1 k 1 α .

Remark 6. Let Ρ τ 2 = 0 = p , 0 < p < 1 . Then by Corollary 15*, Ρ τ k = 0 = 1 1 p k 1 , 3 k n . Hence, Ρ τ k = 0 1 as k .

Example 7. Let μ 2 , μ 3 , be independent random variables having the same density function (Eq. (17)). We perform three series of experiments and investigate a behavior of distribution of the arrival time deviations τ k with various combinations of parameters: α , β , k. The results are presented in graphical form in Figures 12–15 The functions G k t are calculated by formula (49), and the functions g k t by formula (50). Note that product αβ is the mean of μ k . Parameter λ is equal to 0.25 and αβ = 7 as it observes in reality.

## 7. Comparison with statistics of real train traffic

Let us consider the following random variable: the deviation of the real moment of arrival at a certain station from the scheduled one. Denote it by ξ . Statistical analysis of data on this random variable, received from the Russian railways, has led to the conclusion that in many cases, they obey the modified exponential law with the distribution function of the form Eq. (33) with b = 0 . Using data on the suburban trains of the direction “Moscow-Tver” for the period: January, 11–15, February, 1–6, 2016, we obtained a sample from the distribution of ξ of the size n = 50 with the sample mean 1.44 and sample variance 2.7. We tested the hypothesis that ξ obeys distribution (Eq. (33)) with λ = 0.35 and a = 0.64 . To this end, we applied the Kolmogorov goodness-of-fit test with the significance level α = 0.05 and obtained the fit between the hypothesis and the sample data (see Figure 16).

Remark 7. It should be noted that in considered example the deviation ξ is nonnegative. But in reality, it can frequently be both positive and negative. Positive values are due to arisen delay. Negative values occur due to the fact that sometimes early arrivals take place.

Remark 8. Although the hypothetical distribution function from Figure 16 is constructed for deviations without any details about the train number k, it is well correlated with the graph of the function G 2 t with α = 0.5 from Figure 12.

This allows us to assume that the distribution of the deviation ξ is mainly determined by the distribution of the delay τ 2 .

Remark 9. It was verified that if the length of the random variables μ j have the same gamma distribution, any variation of the parameters of this distribution ( α and β ) has a rather small influence on behavior of output distribution (see Figures 1215).

Remark 10. Since the primary delay has a great influence on formation of the output distribution of deviations from the schedule ( τ k ), then a knowledge of the primary delay distribution in each particular situation allows to predict the distribution of knock-on delays.

One important practical effect of the considered model is that it enables us to estimate the standard deviation (SD) of the actual arrival delays at the destination station. As an example, we calculated this parameter for the suburban railway line. The data analyzed were collected at the Tver station in the period of January 2016 and February 2016.

Example 8. Due to statistical data, we can consider that τ has the exponential distribution with the parameter λ = 0.25 (i.e., τ has the distribution function (Eq. (33)) with λ = 0.25 , a = 1 , b = 0 ), and μ 2 has gamma distribution with the density function (Eq. (17)), where α = 0.6 , β = 11.7 . Using formulas (49) and (50) with k = 2 , we have:

SD 2 = t a 2 2 d G 2 t = t 2 d G 2 t a 2 2 = 0 t 2 g 2 t dt a 2 2 10.987 .

Here a 2 = t d G 2 t = 1 λ λ β + 1 0.6 1.763 , t 2 d G 2 t = 2 λ 2 λ β + 1 0.6 14.088 ,

0 t 2 g 2 t dt a 2 2 = 2 λ 2 λ β + 1 0.6 1 λ 2 λ β + 1 1.2 10.987 .

Thus, theoretical SD 10.987 3.315 min . This corresponds with the real statistics which shows the SD amount is 3.32 min for the mentioned station.

## 8. Conclusions

The mathematical model of train traffic proposed in the chapter allows us to find conditions on initial headways, which provide a smallness of frequency of a large number of delays. In other words, the formulas for the distributions of arrival headways obtained in the chapter enable to optimize the frequency of arriving train delays.

## Acknowledgments

The research is funded by the JSC Russian Railways (grant 2016 for development of the scientific school).

## References

1. 1. Müller-Hannemann M, Schnee M. Efficient timetable information in the presence of delays. In: Ahuja RK, Möhring RH, Zaroliagis CD, editors. Robust and Online Large-Scale Optimization. Springer; 2009;5868:249-272
2. 2. Goverde RMP. A delay propagation algorithm for large-scale railway traffic networks. Transportation Research Part C. 2010;18:269-287
3. 3. Carey M, Kwiecinski A. Stochastic approximation to the effects of headways in knock-on delays of trains. Transportation Research Part B. 1994;28:251-267
4. 4. Carey M, Kwiecinski A. Properties of expected costs and performance measures in stochastic models of scheduled transport. European Journal of Operational Research. 1995;83:182-199
5. 5. Yuan J. Stochastic modeling of train delays and delay propagation in stations [thesis]. The Netherlands: Technische Universiteit Delft; 2006
6. 6. Meester LE, Muns S. Stochastic delay propagation in railway networks and phase-type distributions. Transportation Research Part B. 2007;41:218-230
7. 7. Berger A, Gebhardt A, Müller-Hannemann M, Ostrowski M. Stochastic delay prediction in large train networks. In: Proceedings of the 11th Workshop on Algorithmic Approaches for Transportation Modelling, Optimization, and Systems (ATMOS’11); 8 September 2011; Saarbrücken, Germany. pp. 100-111
8. 8. Büker T, Seybold B. Stochastic modelling of delay propagation in large networks. Journal of Rail Transport Planning and Management. 2012;2(12):34-50
9. 9. Yuan J. Statistical analysis of train delays at The Hague HS. In: Hansen IA, editor. Train Delays at Stations and Network Stability (Workshop). Delft, The Netherlands: TRAIL; 2001
10. 10. Yuan J, Goverde RMP, Hansen IA. Propagation of train delays in stations. In: Allan J, Hill RJ, Brebbia CA, Sciutto G, Sone S, editors. Computers in Railways VIII. WIT Press; 2002. pp. 975-984
11. 11. Alexandrova NB. Distribution of the train delays duration of due to station malfunctions. In: Proceedings of the Regional Conference “Universities of Siberia and Far East for Transsib”. Novosibirsk. 2002. pp. 20-21. (in Russian)
12. 12. Fikhtengolts GM. Course of Differential and Integral Calculus. 8th ed. Vol. 3. M.: Fizmatlit; 2003. 728p
13. 13. Davydov B, Dynkin B, Chebotarev V. Optimal rescheduling for the mixed passenger and freight line. In: Proceedings of the 14th International Conference on Railway Engineering Design and Optimization (COMPRAIL 2014), Rome, Italy. 2014. pp. 649-661

Written By

Vladimir Chebotarev, Boris Davydov and Kseniya Kablukova

Submitted: 31 October 2017 Reviewed: 15 February 2018 Published: 26 September 2018