
## Abstract

In this chapter, we propose a probabilistic model for train delay propagation. We derive formulas for the probability distributions of arrival headways and knock-on delays as functions of the distributions of the primary delay duration and the departure headways, and we prove several key mathematical statements. The formulas make it possible to predict the frequency of train arrival delays and to determine the optimal traffic adjustments. Several important special cases of the initial probability distributions are considered. The results of the theoretical analysis are verified by comparison with statistical data on train traffic on the Russian railways.

### Keywords

- train traffic
- stochastic model
- train delay propagation
- probabilistic modeling
- operative management

## 1. Introduction

Train movement is subject to a variety of random factors that lead to unplanned delays. These delays scatter the arrival times and thus inconvenience passengers and consignees. Knowing the properties of the arrival time distribution makes it possible to predict the characteristics of the train traffic and to make sound decisions on the management of the transportation process. This, in turn, helps to improve the punctuality of train traffic and to save resources, in particular, electric power.

The properties of the arrival headway distributions allow us to estimate the probability that delays occur and their characteristics, which are important from a practical point of view. Probabilistic modeling of delay propagation along the train flow is the main tool for solving this problem.

Models of delay propagation in a dense train flow fall into two classes: deterministic and stochastic. Stochastic models take into account the unpredictable nature of disturbances on the railway. The mathematical model proposed in this chapter makes it possible to determine the probability distributions of the arrival headways of two consecutive trains at a station. The properties of these distributions are analyzed for different degrees of scattering of the input random variables (the primary delay and the initial headways). The theoretical distributions are compared with real train traffic statistics from the Russian railways.

## 2. Literature review

A substantial body of literature is devoted to the effect of train delays on railway operation. Deterministic models describing primary and knock-on delays were proposed in [1, 2]. These models, based on graph theory, make it possible to adjust the train traffic schedule. However, such an approach treats the characteristics of train traffic (e.g., travel and dwell times, headways) as deterministic values and therefore does not account for the uncertainties that arise in reality.

Stochastic modeling takes the influence of random factors into account (see, e.g., [3, 4, 5, 6, 7, 8]). The authors of [7] determine a probability distribution of the arrival times. The problem of finding the distribution of train arrival delays is examined in [8]. Note that these papers consider only special cases of the primary delay distribution: in [8], the random duration of the primary delay is assumed to follow a generalization of the exponential law, while [7] employs a discretization of the delay distribution.

Some researchers have analyzed statistical data on the deviations of train arrival times from the planned ones. In particular, the papers [9, 10, 11] show that the scattering of these deviations corresponds to the exponential distribution.

## 3. Description of models and analysis of the arrival headways distribution

### 3.1. The first model

Trains follow one path one after another in one direction from station *A* to station *B* with the same average speed *n*. The distance from the train *j* to the train (*j* − 1) is denoted by *j* = 2, 3, …, n,

Let us also introduce the notations: *A* at the time *m* can be found as (as shown in Figure 1):

Assume that at some point in time, train 1 makes an unplanned stop. The duration of this stop is a random value *k* − 1) and *k* at the destination *B* (denote this headway as *k* = 2, 3, …, *n*. We call this the first problem.

### 3.2. The second model

Suppose that train 1 was delayed at station *A* at the moment *k* − 1) and *k*. It is required to determine the distribution functions *k* = 2, 3, …, *n*.

**Example 1**. Let *n* = 5,

The basic model assumptions are as follows: (1) only train 1 is exposed to primary delay *k* = 2, 3, …, *n*.

Denote by *k*, which depends on

We suppose that the departure times of trains satisfy the following two rules. Let *k* be fixed,

In what follows, we use the notation *A* is an arbitrary set on the real line *R*.

Suppose that the total number of trains is equal to

**Theorem 1**. 1. *If* *, then* *,* *,*

2. *Let k be a fixed integer,*

3. If

**Theorem 2**. *Let* *. For any k,* *, the following formula holds*

*in particular*,

Let us introduce the notations,

Further, some corollaries of Theorem 2 are formulated.

**Corollary 1**. *Let* *,* *, be arbitrary positive numbers, then for*

*in particular*,

**Example 2**. Let the primary delay

As initial parameters, we take the following quantities.

Graphs of the functions

It should be noted that in this and the subsequent examples, we use the following measures for the values: ^{2}). The product

**Corollary 2**. *Let* *,* *, be a positive constant, then for*

*in particular*,

**Example 3**. Let

Graphs of the functions

Figures 3 and 4 show that in the case of constant

**Remark 1**. It is known that the distribution of sum of the independent random variables is the convolution of their distributions. The convolution of distribution functions
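The convolution property mentioned in Remark 1 can be illustrated numerically. The following is a minimal sketch, not the computation used in the chapter: it approximates the density of a sum of two independent random variables by sampling the densities on a uniform grid and convolving them with `numpy.convolve`. The exponential density, the rate `lam`, the grid step `dx`, and the helper name `density_of_sum` are all assumptions chosen purely for illustration.

```python
import numpy as np


def density_of_sum(f, g, dx):
    """Approximate the density of X + Y for independent X and Y.

    f and g hold density values sampled on the grid 0, dx, 2*dx, ...
    The discrete convolution multiplied by dx is a Riemann-sum
    approximation of the convolution integral.
    """
    return np.convolve(f, g) * dx


# Hypothetical example: two independent exponential variables with rate lam.
lam = 1.5
dx = 0.002
x = np.arange(0.0, 20.0, dx)
f = lam * np.exp(-lam * x)

# Density of the sum, restricted to the same grid as x.
h = density_of_sum(f, f, dx)[: len(x)]

# Known closed form: the sum of two Exp(lam) variables has the
# gamma (Erlang-2) density lam^2 * x * exp(-lam * x).
exact = lam**2 * x * np.exp(-lam * x)
max_abs_error = float(np.max(np.abs(h - exact)))
total_mass = float(np.sum(h) * dx)
```

Applying `density_of_sum` repeatedly gives an approximation of a *j*-fold convolution; the discretization error shrinks as `dx` decreases.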

**Corollary 3**. *Let* *,* *, be independent identically distributed random variables with a continuous distribution function* *. Let* *be independent of* *,* *. Then*

**Corollary 4**. *Let* *,* *, be independent identically distributed random variables with a density function* *. Let* *be independent of all* *and have a density function* *. Then*

**Remark 2**. The integration limit “

**Example 4**. Let

where

One can show that in the example under consideration it follows from Eqs. (15) and (16) that

where

It is not difficult to verify that for

It can be seen from Figure 5, curves

**Remark 3**. We define the 0-fold convolution as a generalized function with the following property: the equality

Because of space limitations, we omit the proofs of the statements of Section 3; they will be given in a separate work.

## 4. Some results on the knock-on delays

Denote by *N* the random number of knock-on delays (within the framework of the model under consideration).

**Lemma 1**. *For each fixed integer m,* *,*

*Proof*. Easily seen:

*m* = 1, 2, …, *n* – 2,

Here and below, the sign □ denotes the end of the proof.

The corollaries of this lemma are given below. Their proofs are simple and therefore we do not present them.

**Corollary 5**. *If* *is a constant value,* *, then for every fixed integer m,* *, we have the equality*

**Corollary 6**. *If* *is a constant value,* *, and* *is exponentially distributed with parameter* *, then for every fixed integer m,* *, the following equality holds,*

**Corollary 7**. If *, …,* *are independent identically distributed random variables with a density function* *then for every fixed integer m,* *, we have the equality*

In what follows, *k* = 2, …, *n*, is the knock-on delay of the *k*-th train. The problem is to find the distribution functions *k* = 2, 3, …, *n*. Note that the solution of this problem, which we call the second problem, allows us to find the distribution of the deviations of the real arrival times from the planned ones.

In what follows, we will use the notation

**Theorem 3**. *The following formula holds:*

**Corollary 8**. *The following formula holds:*

It should be noted that within the framework of our model the deviation of the real arrival time from the planned one for the *k*-th train coincides with

The dotted lines (lines

Denote

**Corollary 9**. *The distribution function of* *has the following form:*

The next Corollaries 10 and 11 follow from Corollary 9 in an obvious way.

**Corollary 10**. Let *,* *be some constant values. Then*

**Corollary 11**. *Let* *,* *be a constant value. Then*

**Corollary 12**. *Let* *,* *be independent identically distributed random variables with a continuous distribution function* *. Let* *be independent of* *. Then*

**Corollary 13**. *Let* *,* *be independent identically distributed random variables with a density function* *. Let* *be independent of* *,* *and have a density function* *. Then*

## 5. Proof of Theorem 3 and its corollaries

**Lemma 2**. *The following formula is valid:*

*Proof*. Let

The knock-on delay of the duration

*Proof of Theorem 3*. We shall use the method of mathematical induction. The equality (Eq. (20)) for *k* = 2 is established by Lemma 2. Let Eq. (20) be satisfied. We show that:

It follows from the inductive hypothesis that *k*-th train is 0, then the next train does not undergo any delay, that is,

In the case, when *k*-th train occurs and equals to

Case 1. If the *k*-th train is delayed, then the (*k* + 1)-th one will be delayed only if *k*-th and (*k* + 1)-th trains after an unscheduled stop:

Case 2. If the *k*-th train is delayed, then the (*k* + 1)-th one will not be delayed (*k*-th train occurs, a conflict of the *k*-th train with the (*k* + 1)-th is described similarly to the interaction of trains 1 and 2 (see Lemma 2). All described cases lead to Eq. (20).□
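The case analysis in this proof can be mirrored by a small simulation. The sketch below rests on an explicitly assumed propagation rule, which is one common reading of the cases above and is not quoted from the chapter: each following train absorbs a slack (taken here to be the departure headway minus a minimum admissible headway) and inherits only the remaining part of its predecessor's delay, so an undelayed train never delays its follower. The function names, the exponential primary delay, and the constant slack are hypothetical choices for illustration.

```python
import random


def knock_on_delays(primary_delay, slacks):
    """Propagate the primary delay of train 1 along the train flow.

    slacks[i] is the slack between train i+1 and train i+2 (assumed to be
    the departure headway minus a minimum admissible headway).
    Returns the delays of trains 2, 3, ..., n.
    """
    delays = []
    delay = primary_delay
    for slack in slacks:
        # Assumed rule: the follower absorbs the slack and inherits the rest.
        delay = max(0.0, delay - slack)
        delays.append(delay)
    return delays


def simulate_count_distribution(n, rate, slack, trials, seed=0):
    """Estimate P(N >= m) for m = 1, ..., n-1 by Monte Carlo.

    N is the number of knock-on delays; the primary delay is drawn from an
    exponential distribution with the given rate, and the slack is constant.
    """
    rng = random.Random(seed)
    counts = [0] * (n - 1)
    for _ in range(trials):
        tau = rng.expovariate(rate)
        delays = knock_on_delays(tau, [slack] * (n - 1))
        num_delayed = sum(1 for d in delays if d > 0)
        for m in range(1, num_delayed + 1):
            counts[m - 1] += 1
    return [c / trials for c in counts]
```

For instance, `knock_on_delays(10.0, [3.0, 3.0, 3.0, 3.0])` returns `[7.0, 4.0, 1.0, 0.0]`: the delay shrinks by the slack at each train until it is fully absorbed.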

*Proof of Corollary 8*. We note that Eq. (21) is similar to Eq. (20). According to the statement of Theorem 3, we have:

Using the method of mathematical induction and taking into account that

*Proof of Corollary 9.* It follows from Corollary 8 that

*Proof of Corollary 12.* Apply the well-known assertion to Eq. (22): if

*Proof of Corollary 13.* The assertion follows from Eq. (25).□

Note that the function

In the case, when

where *j*-fold convolution of the density

If we assume that

## 6. Corollary of Theorem 2 when the distribution of primary delay is a mixture of exponential and one-point distributions

Consider the cumulative distribution function of the following type:

where

Let us find out the form of the distribution functions (Eqs. (13) and (14)) in the case of Eq. (33), when the function

**Lemma 3**. *Let the function G be defined by* Eq. (33)*, and* *be continuous. Then*

*Proof*. According to Eq. (33), one may conclude that function

In accordance with Eq. (13), the relation (Eq. (34)) is proved.

Let

where

By using equalities

Since

It follows from Eqs. (39)–(41) that

The equalities Eq. (38) and Eq. (42) entail Eq. (35).□

Below we state, without proof, a corollary of Lemma 3 in the case when

**Corollary 14**. *Let* *,* *, be a constant. Let function* *be defined by* Eq. (33)*. Then, for* *, the following formula holds:*

Furthermore,

**Example 5.** Figure 10 depicts the graphs of the functions *k* = 2, 3 for the following parameters:

We calculated the values of

| | | | | |
|---|---|---|---|---|
| 7.77702 | 10.47779 | 10.98629 | 10.99994 | 10.99999 |
| 5.68009 | 2.33067 | 0.068156 | 0.00029 | 7.63176 × 10^{−6} |

**Remark 4**. It can be easily seen that the larger *k*,

Let the random variable *T*, under which the probability that at least *m* of knock-on delays will occur would not exceed a given probability *p*. Note that the departure headway is equal to

According to Corollary 6, it is necessary to solve the inequality

(see also [13]). Denote by *T* satisfying the inequality (Eq. (47)).
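An inequality of this kind can be solved in closed form once the probability of at least *m* knock-on delays is known explicitly. The sketch below assumes, purely for illustration, that with a constant slack `s` between consecutive trains and an exponential primary delay with rate `rate`, at least *m* knock-on delays occur exactly when the primary delay exceeds `m * s`, so that the probability equals `exp(-rate * m * s)`. This closed form, the slack parametrization, and the function name `min_slack` are assumptions and may differ from the parametrization of Eq. (47).

```python
import math


def min_slack(m, p, rate):
    """Smallest constant slack s such that exp(-rate * m * s) <= p.

    m    -- number of knock-on delays to guard against (positive integer)
    p    -- admissible probability, 0 < p < 1
    rate -- rate of the (assumed) exponential primary delay, rate > 0
    """
    if m < 1:
        raise ValueError("m must be a positive integer")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if rate <= 0.0:
        raise ValueError("rate must be positive")
    # exp(-rate * m * s) <= p  is equivalent to  s >= ln(1 / p) / (rate * m).
    return math.log(1.0 / p) / (rate * m)


# Hypothetical numbers: mean primary delay of 5 time units (rate = 0.2).
slack_for_three = min_slack(m=3, p=0.05, rate=0.2)
slack_for_one = min_slack(m=1, p=0.05, rate=0.2)
```

Under this assumption, the required slack decreases as *m* grows, because a long chain of knock-on delays needs a correspondingly long primary delay.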

**Example 6.** Let us fix *m* with *p*. Exact calculations can be made using the formula:

Let *m* with

We also obtain the corollaries of Lemma 3 in the case when

**Corollary 15**. *If primary delay* *has an exponential distribution* *and* *,* *, has the density (*Eq. (17)*), then the following formulas are true:*

**Remark 5**. The function

Corollary 15 can be reformulated as follows.

**Corollary 15***. *Let primary delay* *be exponentially distributed with a parameter* *, and* *,* *, have the same gamma distribution with the density (*Eq. (17)*). Then,* *has the distribution function of the form* Eq. (33) *with* *b = 0, and, consequently,*

**Remark 6**. Let

**Example 7.** Let *k*. The results are presented in graphical form in Figures 12–15. The functions

## 7. Comparison with statistics of real train traffic

Let us consider the following random variable: the deviation of the real moment of arrival at a certain station from the scheduled one. Denote it by

**Remark 7**. It should be noted that in the considered example the deviation

**Remark 8**. Although the hypothetical distribution function from Figure 16 is constructed for deviations without any details about the train number *k*, it is well correlated with the graph of the function

This allows us to assume that the distribution of the deviation

**Remark 9**. It was verified that if the length of the random variables

**Remark 10**. Since the primary delay has a great influence on the formation of the output distribution of deviations from the schedule (

One important practical benefit of the considered model is that it enables us to estimate the standard deviation (SD) of the actual arrival delays at the destination station. As an example, we calculated this parameter for a suburban railway line. The data analyzed were collected at the Tver station in January and February 2016.

**Example 8.** Based on the statistical data, we can assume that

Here

Thus, theoretical

## 8. Conclusions

The mathematical model of train traffic proposed in this chapter allows us to find conditions on the initial headways under which a large number of delays occurs only rarely. In other words, the formulas obtained for the distributions of arrival headways make it possible to optimize the frequency of train arrival delays.

## Acknowledgments

This research was funded by JSC Russian Railways (2016 grant for the development of the scientific school).