Daily close value of NSE.

## Abstract

In recent years, many forecasting methods have been proposed and implemented for stock market trend prediction. In this chapter, a trend analysis of the stock market is presented using a Hidden Markov Model (HMM) with the one-day difference in close value over a particular period. The probability values π, calculated for all observed and hidden sequences, give the trend percentage of the stock prices. This chapter helps decision makers decide under uncertainty on the basis of the probability percentages obtained from the steady-state probability distribution.

### Keywords

- stock market
- HMM
- TPM
- EPM
- trend prediction

## 1. Introduction

The fundamental idea behind a hidden Markov model is that there is a Markov process we cannot observe that determines the probability distribution of what we do observe. Thus a hidden Markov model is specified by the transition density of the Markov chain and the probability laws that govern what we observe given the state of the Markov chain. Given such a model, we want to estimate any parameters that occur in it. We also want to determine the most likely sequence of the hidden process. Finally, we may want the probability distribution of the hidden states at every location.

Let

Our model is then described by the sets of probability distributions

The best path is taken to be the one that has the maximum probability in the HMM, given the sequence

## 2. Hidden Markov model

A hidden Markov model (HMM) is a stochastic model whose underlying process is not directly observable; it describes observable events that depend on internal factors. The observable events are represented as symbols, while the invisible factor behind each observation is represented as a state. The system is assumed to be a Markov process with hidden states, and it gives better accuracy than the other models. Using the given input values, the parameters of the HMM (λ), denoted by A, B and π, are found. An HMM is defined as λ = (S,O,A,B,π), where S = {s1,s2,…,sN} is a set of N possible states, O = {o1,o2,…,oM} is a set of M possible observation symbols, A is an N × N state Transition Probability Matrix (TPM), B is an N × M observation or Emission Probability Matrix (EPM), and π is an N-dimensional initial state probability distribution vector. A, B and π should satisfy the following conditions (Figure 1):
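Assuming the standard stochastic constraints on an HMM are meant, with a_ij the entries of A, b_j(k) the entries of B and π_i the entries of π (index names chosen here for illustration), the conditions can be written as:

```latex
\begin{aligned}
a_{ij} &\ge 0, & \sum_{j=1}^{N} a_{ij} &= 1, & i &= 1,\dots,N,\\
b_{j}(k) &\ge 0, & \sum_{k=1}^{M} b_{j}(k) &= 1, & j &= 1,\dots,N,\\
\pi_{i} &\ge 0, & \sum_{i=1}^{N} \pi_{i} &= 1.
\end{aligned}
```

In words, every row of the TPM and of the EPM, as well as the initial vector π, is a probability distribution.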

### 2.1 Evaluation problem

Given the HMM λ = {A,B,π} and the observation sequence O = o1,o2,…,oM, the probability that the model λ generated the sequence O is calculated. This problem is often solved by the forward-backward algorithm [7, 8].
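As an illustration of the evaluation problem, the forward part of the recursion can be sketched in a few lines of Python. The two-state model and its numbers below are hypothetical and are not taken from this chapter; symbols are encoded as indices (0 for "I", 1 for "D"), and no scaling is applied, so the sketch is only suitable for short sequences.

```python
def forward_probability(A, B, pi, obs):
    """Return P(obs | lambda) using the forward recursion.

    A[i][j]: probability of moving from state i to state j
    B[i][k]: probability of emitting symbol k in state i
    pi[i]:   probability of starting in state i
    obs:     list of observation symbol indices
    """
    n_states = len(pi)
    # Initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][o]
            for j in range(n_states)
        ]
    # Termination: sum the final forward variables over all states
    return sum(alpha)


# Hypothetical two-state model with symbols 0 = "I" and 1 = "D"
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.5, 0.5]
p = forward_probability(A, B, pi, [0, 1, 0])
```

Summing this probability over every possible observation sequence of a fixed length gives 1, which is a convenient sanity check for an implementation.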

### 2.2 Decoding problem

Given the HMM λ = {A,B,π} and the observation sequence O = o1,o2,…,oM, calculate the most likely sequence of hidden states that produced this observation sequence O. This problem is usually handled by the Viterbi algorithm [7, 8].
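A minimal sketch of the Viterbi recursion for the decoding problem is given below. The model values are hypothetical, states and symbols are encoded as indices, and plain probabilities (rather than log-probabilities) are used, which is an assumption that only holds up for short sequences.

```python
def viterbi(A, B, pi, obs):
    """Return (best_path, best_probability) for the observation indices obs."""
    n_states = len(pi)
    # delta[i]: highest probability of any state path ending in state i so far
    delta = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    backpointers = []
    for o in obs[1:]:
        step_ptr, new_delta = [], []
        for j in range(n_states):
            # Best predecessor of state j at this step
            best_i = max(range(n_states), key=lambda i: delta[i] * A[i][j])
            step_ptr.append(best_i)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][o])
        backpointers.append(step_ptr)
        delta = new_delta
    # Trace back from the most probable final state
    last = max(range(n_states), key=lambda i: delta[i])
    path = [last]
    for step_ptr in reversed(backpointers):
        path.append(step_ptr[path[-1]])
    path.reverse()
    return path, delta[last]


# Hypothetical two-state model with symbols 0 = "I" and 1 = "D"
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.5, 0.5]
path, prob = viterbi(A, B, pi, [0, 1, 0])
```

The returned path lists one state index per observation, so it can be mapped back to state names such as S1 to S4.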

### 2.3 Learning problem

Given some training observation sequences O = o1,o2,…,oM and the general structure of the HMM (numbers of hidden and visible states), determine the HMM parameters λ = {A,B,π} that best fit the training data. The most common solution to this problem is the Baum-Welch algorithm [9, 10], which is considered the traditional method for training an HMM.
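To make the learning problem concrete, one re-estimation step of Baum-Welch for a single observation sequence might look as follows. This is a sketch under simplifying assumptions: no scaling, all parameters strictly positive so that no division by zero occurs, and a hypothetical two-state starting model that is not taken from this chapter.

```python
def forward(A, B, pi, obs):
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(n)) * B[j][o]
                      for j in range(n)])
    return alpha


def backward(A, B, obs, n):
    beta = [[1.0] * n]  # beta at the final time step
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(n))
                        for i in range(n)])
    return beta


def baum_welch_step(A, B, pi, obs):
    """One EM re-estimation of (A, B, pi); also returns P(obs | old model)."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta = forward(A, B, pi, obs), backward(A, B, obs, n)
    likelihood = sum(alpha[-1])
    # gamma[t][i]: probability of being in state i at time t, given obs
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)]
             for t in range(T)]
    # xi[t][i][j]: probability of the transition i -> j at time t, given obs
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k) /
              sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    return new_A, new_B, new_pi, likelihood


# Hypothetical starting model and observation sequence (0 = "I", 1 = "D")
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.5, 0.5]
obs = [0, 1, 0, 0, 1, 1, 0]
A1, B1, pi1, like0 = baum_welch_step(A, B, pi, obs)
_, _, _, like1 = baum_welch_step(A1, B1, pi1, obs)
```

Repeating the step until the likelihood stops improving gives the usual training loop; each step should leave the likelihood unchanged or higher.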

## 3. Results and discussions

In this chapter, the data were taken from Yahoofinance.com, and the NSE daily close values for January 2020 are considered for the analysis.

Two observation symbols are used: “I” for increasing states and “D” for decreasing states. If the difference in close value is greater than 0, the observed symbol is “I”; if it is less than 0, the observed symbol is “D”. Six hidden states are assumed, denoted S1, S2, S3, S4, S5 and S6, which indicate very low, low, moderate low, moderate high, high and very high, respectively. The states are not directly observable.
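The symbol rule can be sketched as follows. The tabulated differences in this chapter appear to be computed as the previous close minus the current close, so that convention is assumed here; the function names are illustrative, and a difference of exactly zero is left unlabelled because the text does not define it.

```python
# First five close values from the daily close table
closes = [41626.64, 41464.61, 40676.63, 40869.47, 40817.74]


def one_day_differences(values):
    """Previous close minus current close, rounded to two decimals."""
    return [round(prev - cur, 2) for prev, cur in zip(values, values[1:])]


def to_symbol(diff):
    """Map a difference to "I" (> 0) or "D" (< 0); zero is left undefined."""
    if diff > 0:
        return "I"
    if diff < 0:
        return "D"
    return None


diffs = one_day_differences(closes)
symbols = [to_symbol(d) for d in diffs]
```

For these five closes the sketch yields the differences 162.03, 787.98, −192.84 and 51.73, with the symbols I, I, D and I.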

The situations of the stock market are considered hidden. Given a sequence of observations, we can find the hidden state sequence that produced those observations. Table 1 shows the daily close value of the stock market.

S. no | Date | Close |
---|---|---|
1 | 01/02/2020 | 41,626.64 |
2 | 01/03/2020 | 41,464.61 |
3 | 01/06/2020 | 40,676.63 |
4 | 01/07/2020 | 40,869.47 |
5 | 01/08/2020 | 40,817.74 |
6 | 01/09/2020 | 41,452.35 |
7 | 01/10/2020 | 41,599.72 |
8 | 01/13/2020 | 41,859.69 |
9 | 01/14/2020 | 41,952.63 |
10 | 01/15/2020 | 41,872.73 |
11 | 01/16/2020 | 41,932.56 |
12 | 01/17/2020 | 41,945.37 |
13 | 01/20/2020 | 41,528.91 |
14 | 01/21/2020 | 41,323.81 |
15 | 01/22/2020 | 41,115.38 |
16 | 01/23/2020 | 41,386.40 |
17 | 01/24/2020 | 41,613.19 |
18 | 01/27/2020 | 41,155.12 |
19 | 01/28/2020 | 40,966.86 |
20 | 01/29/2020 | 41,198.66 |
21 | 01/30/2020 | 40,913.82 |
22 | 01/31/2020 | 40,723.49 |

Interval values:

S1 = −9500 to −551.

S2 = −550 to −251.

S3 = −250 to 249.

S4 = 250 to 8500.
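A minimal sketch of how a difference could be assigned to one of these intervals is shown below. The listed bounds leave small gaps (for example between −551 and −550), so the cut points −550, −250 and 250 used here are an assumption, as is the choice to place values beyond −9500 or 8500 in the outermost states.

```python
def to_state(diff):
    """Map a difference in close value to one of the states S1..S4."""
    if diff < -550:
        return "S1"
    if diff < -250:
        return "S2"
    if diff < 250:
        return "S3"
    return "S4"


# One-day differences taken from the close values of January 2020
examples = [162.03, 787.98, -634.61, -259.97]
states = [to_state(d) for d in examples]
```

Every integer bound in the list above lands in its stated interval under these cut points; only the fractional values that fall in the gaps depend on the assumption.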

S. no | Close value (CV) | Diff. in 1-day CV | Symbol | Diff. in 2-day CV | Symbol | Diff. in 3-day CV | Symbol | Diff. in 4-day CV | Symbol | Diff. in 5-day CV | Symbol | Diff. in 6-day CV | Symbol |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 41,626.64 | | | | | | | | | | | | |
2 | 41,464.61 | 162.03 | I | | | | | | | | | | |
3 | 40,676.63 | 787.98 | I | −625.95 | D | | | | | | | | |
4 | 40,869.47 | −192.84 | D | 980.82 | I | −1606.77 | D | | | | | | |
5 | 40,817.74 | 51.73 | I | −244.57 | D | 1225.39 | I | −2882.16 | D | | | | |
6 | 41,452.35 | −634.61 | D | 686.84 | I | −930.91 | D | 2156.3 | I | −4988.46 | D | | |
7 | 41,599.72 | −147.37 | D | −487.24 | D | 1173.58 | I | 2104.49 | I | 4260.79 | I | −9249.25 | D |
8 | 41,859.69 | −259.97 | D | 112.6 | I | −599.84 | D | 1773.42 | I | −3877.91 | D | 8138.7 | I |
9 | 41,952.63 | −92.94 | D | −167.03 | D | 279.63 | I | −879.47 | D | 2652.89 | I | −6530.8 | D |
10 | 41,872.73 | 79.9 | I | −172.84 | D | 5.81 | I | 273.82 | I | −1153.28 | D | 3806.18 | I |
11 | 41,932.56 | −59.83 | D | 139.73 | I | −312.57 | D | 318.38 | I | −44.56 | D | −1108.73 | D |
12 | 41,945.37 | −12.81 | D | −47.02 | D | −92.71 | D | 405.28 | I | −86.9 | D | 42.34 | I |
13 | 41,528.91 | 416.46 | I | 403.65 | I | −450.67 | D | 357.96 | I | 47.32 | I | −134.22 | D |
14 | 41,323.81 | 205.1 | I | 211.36 | I | 192.22 | I | −642.96 | D | 1000.92 | I | −953.6 | D |
15 | 41,115.38 | 208.43 | I | −3.33 | D | 214.69 | I | −22.4 | D | −620.56 | D | 1621.48 | I |
16 | 41,386.40 | −271.02 | D | 479.45 | I | −482.78 | D | 697.47 | I | −719.87 | D | 99.31 | I |
17 | 41,613.19 | −226.79 | D | −44.23 | D | 523.68 | I | −1006.46 | D | 1703.93 | I | −2423.8 | D |
18 | 41,155.12 | 458.07 | I | −684.86 | D | 640.63 | I | −116.95 | D | −889.51 | D | 2593.44 | I |
19 | 40,966.86 | 188.26 | I | 269.81 | I | −415.05 | D | 1055.68 | I | 938.73 | I | −1828.24 | D |
20 | 41,198.66 | −231.8 | D | 420.06 | I | −150.25 | D | −264.8 | D | 1320.48 | I | −381.75 | D |
21 | 40,913.82 | 284.84 | I | −516.64 | D | 936.7 | I | −1086.95 | D | 822.15 | I | 498.33 | I |
22 | 40,723.49 | 190.33 | I | 94.51 | I | −611.15 | D | 1547.85 | I | −2634.8 | D | 3456.95 | I |
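Judging from the tabulated values, each higher-order column seems to be obtained by differencing the preceding column again (previous entry minus current entry), rather than by subtracting closes several days apart. Under that assumption, the columns can be sketched as successive differences; the function name is illustrative.

```python
def successive_differences(values, depth):
    """Return [d1, d2, ..., d_depth]; each level differences the previous one."""
    levels, current = [], list(values)
    for _ in range(depth):
        # Previous entry minus current entry, rounded to two decimals
        current = [round(a - b, 2) for a, b in zip(current, current[1:])]
        levels.append(current)
    return levels


# First five close values of January 2020
closes = [41626.64, 41464.61, 40676.63, 40869.47, 40817.74]
d1, d2, d3 = successive_differences(closes, 3)
```

With these five closes, the second level starts with −625.95 and the third with −1606.77, which agrees with rows 3 and 4 of the table.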

The probability values of the TPM, EPM and π for the differences in one-day, two-day, three-day, four-day, five-day and six-day close values are calculated as given below (Table 2).

State | S1 (I) | S1 (D) | S2 (I) | S2 (D) | S3 (I) | S3 (D) | S4 (I) | S4 (D) |
---|---|---|---|---|---|---|---|---|
S1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
S2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
S3 | 0.071 | 0 | 0.071 | 0 | 0.1429 | 0.2857 | 0 | 0.4286 |
S4 | 0 | 0 | 0 | 0.8 | 0.2 | 0 | 0 | 0 |

Probability values of TPM, EPM, and π for difference in one day close value (Figure 2 and Table 3):

State | S1 (I) | S1 (D) | S2 (I) | S2 (D) | S3 (I) | S3 (D) | S4 (I) | S4 (D) |
---|---|---|---|---|---|---|---|---|
S1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.5 | 0.5 |
S2 | 0 | 0 | 0 | 0 | 0.5 | 0.5 | 0 | 0 |
S3 | 0 | 0.111 | 0 | 0 | 0.3333 | 0.2222 | 0.1111 | 0.2222 |
S4 | 0 | 0 | 0.3333 | 0 | 0.5 | 0 | 0.1667 | 0 |

Probability values of TPM, EPM, and π for difference in two day close value (Figure 3 and Table 4).

State | S1 (I) | S1 (D) | S2 (I) | S2 (D) | S3 (I) | S3 (D) | S4 (I) | S4 (D) |
---|---|---|---|---|---|---|---|---|
S1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
S2 | 0 | 0 | 0 | 0 | 0 | 0.75 | 0 | 0.25 |
S3 | 0 | 0 | 0.4 | 0.2 | 0.2 | 0 | 0 | 0.2 |
S4 | 0.5 | 0 | 0.2 | 0 | 0.2 | 0 | 0.2 | 0 |

Probability values of TPM, EPM, and π for difference in three day close value (Figure 4 and Table 5):

State | S1 (I) | S1 (D) | S2 (I) | S2 (D) | S3 (I) | S3 (D) | S4 (I) | S4 (D) |
---|---|---|---|---|---|---|---|---|
S1 | 0.1429 | 0.2429 | 0 | 0.1429 | 0 | 0.1429 | 0 | 0.4286 |
S2 | 0.5 | 0 | 0 | 0 | 0 | 0.5 | 0 | 0 |
S3 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
S4 | 0.4286 | 0 | 0.1429 | 0 | 0 | 0 | 0.4286 | 0 |

Probability values of TPM, EPM and π for difference in four days close value (Figure 5 and Table 6):

State | S1 (I) | S1 (D) | S2 (I) | S2 (D) | S3 (I) | S3 (D) | S4 (I) | S4 (D) |
---|---|---|---|---|---|---|---|---|
S1 | 0 | 0.1667 | 0 | 0 | 0 | 0.1667 | 0 | 0.6667 |
S2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
S3 | 0 | 0 | 0 | 0 | 0 | 0.6667 | 0.3333 | 0 |
S4 | 0.7143 | 0 | 0 | 0 | 0 | 0 | 0.2857 | 0 |

Probability values of TPM, EPM and π for difference in five days close value (Figure 6 and Table 7):

State | S1 (I) | S1 (D) | S2 (I) | S2 (D) | S3 (I) | S3 (D) | S4 (I) | S4 (D) |
---|---|---|---|---|---|---|---|---|
S1 | 0 | 0 | 0 | 0.2 | 0 | 0.2 | 0 | 0.6 |
S2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
S3 | 0.3333 | 0.3333 | 0 | 0 | 0.3333 | 0 | 0 | 0 |
S4 | 0.5 | 0 | 0 | 0 | 0.25 | 0 | 0.25 | 0 |

Probability values of TPM, EPM and π for difference in six days close value (Figure 7 and Table 8):

The transition probability values for the one-day to six-day differences in close value are displayed in Figures 2 to 7, respectively.

Optimum Sequence of States:

A random sequence of emission symbols and states is generated using the function “hmmgenerate”. The syntax in the MATLAB HMM toolbox is: [Sequence,States] = hmmgenerate(L,TPM,EPM), where L denotes the length of both the sequence and the states to be generated [11]. The fitness function used for finding the fitted value of the sequence of states is defined by
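For readers without MATLAB, the sampling step can be approximated in plain Python. The sketch below is not the toolbox implementation; it assumes the chain starts in the first state before the first transition (the convention MATLAB's documentation describes for this function), uses hypothetical matrix values, and returns zero-based indices rather than state names.

```python
import random


def hmm_generate(length, tpm, epm, rng=None):
    """Sample (symbols, states) index lists of the given length from an HMM."""
    rng = rng or random.Random()
    symbols, states = [], []
    state = 0  # assumed start state before the first transition
    for _ in range(length):
        # Move to the next state according to the current row of the TPM
        state = rng.choices(range(len(tpm)), weights=tpm[state])[0]
        # Emit a symbol according to that state's row of the EPM
        symbol = rng.choices(range(len(epm[state])), weights=epm[state])[0]
        states.append(state)
        symbols.append(symbol)
    return symbols, states


# Hypothetical two-state matrices; symbols 0 = "I" and 1 = "D"
tpm = [[0.7, 0.3], [0.4, 0.6]]
epm = [[0.9, 0.1], [0.2, 0.8]]
symbols, states = hmm_generate(4, tpm, epm, random.Random(0))
```

Passing a seeded `random.Random` makes a run reproducible, which helps when comparing sequences generated from different TPM and EPM pairs.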

Using the iterative procedure, we obtain an optimum sequence of states for each TPM and EPM constructed.

The length of the sequence is taken as L = 4. The optimum sequences of states obtained from all six differences with their TPM and EPM are given below, where ‘ε’ is the start symbol.

1. ε → I S4 → D S4 → I S3 → D S4
2. ε → D S1 → I S4 → I S4 → D S3
3. ε → I S4 → I S2 → D S3 → D S1
4. ε → D S1 → D S4 → I S3 → D S4
5. ε → I S3 → I S3 → D S2 → D S4
6. ε → I S4 → D S1 → I S3 → D S4

Here, the TPM and EPM of the one-day difference give the shortest path, so the best optimum sequence is found from the one-day difference in close value. Using the fitness function, we compute the fitness value for each of the optimum sequences of states obtained (Table 9).

S. no. | Comparison of the six optimum sequences of states | Calculated value | Fitness = 1 / calculated value |
---|---|---|---|
1 | (1,2) + (1,3) + (1,4) | 1 | 1 |
2 | (2,1) + (2,3) + (2,4) | 1.7 | 0.588 |
3 | (3,1) + (3,2) + (3,4) | 2.425 | 0.412 |
4 | (4,1) + (4,2) + (4,3) | 3.15 | 0.32 |

Column four gives the fitness value: the higher the value, the better the performance of the corresponding sequence.
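The tabulated fitness values are consistent with the reciprocal of the calculated value (for example, 1/1.7 ≈ 0.588). Assuming that relationship, the ranking can be reproduced with the short sketch below; the variable names are illustrative.

```python
# "Calculated value" column of Table 9, keyed by serial number
calculated = {1: 1.0, 2: 1.7, 3: 2.425, 4: 3.15}

# Assumed relationship: fitness is the reciprocal of the calculated value
fitness = {k: round(1.0 / v, 3) for k, v in calculated.items()}

# The sequence with the highest fitness is taken as the best performer
best = max(fitness, key=fitness.get)
```

Under this assumption the first sequence, with a fitness of 1, ranks highest, in line with the choice of the one-day difference as the best optimum sequence.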

## 4. Conclusion

Stock prediction is challenging because of the randomness of the market. A Hidden Markov Model can be used for stock prediction by finding hidden patterns. Here the Hidden Markov Model readily recognized four states of the stock market and was also used to predict future values. Among the optimum state sequences, the one with the highest fitness value performs best. Hidden states and sequences were generated to identify whether the next day's value is increasing, and whether the level of increase is moderate high, high or very high, or the level of decrease is moderate low, low or very low. This model will be useful for short-term as well as long-term investors.