## Abstract

This chapter presents variant maps for revealing latent features in ECG data sets. A variant map is a visualization method that differs from the traditional ECG trace. ECG data sets obtained through clinical ECG monitoring serve as the data source, and the corresponding variant maps are produced by the variant statistics method. The chapter mainly describes how this method converts ECG data into variant maps. The sample results exhibit various visual properties that warrant further exploration.

### Keywords

- variant maps
- ECG data
- visualization feature

## 1. Introduction

The incidence of cardiovascular disease remains high. Cardiovascular diseases attract worldwide attention [1], and research on them relies mainly on the detection of ECG signals. ECG signals are produced by a wide range of clinical ECG technologies. The electrocardiogram represents cardiac function as a graphic signal [2] and is an important means of diagnosing abnormal cardiac activity.

In the information age, signal acquisition, data processing, and information analysis have become central themes of scientific and technological development. In recent years, ECG signal research has made significant progress through methods such as machine learning [3], clustering [4], partial fractal dimension [5], and wavelet transform [6], as well as other methods for arrhythmia detection and classification [7]. Among the emerging methods, the most typical is the ECG scatter plot [8, 9, 10], which views the ECG signal from a new perspective and complements traditional ECG detection.

The variant method is an emerging method for dealing with phase changes in a signal. It now comprises variant theory, variant logic functions, and variant visualization methods. In the 1990s, the variant method was applied to binary image classification and conversion [11, 12]. In 2010, the method was improved [13, 14]. Since then, it has been continuously developed and applied to different data samples, including quantum sequences [15, 16], random sequences [17], noncoding DNA [18, 19, 20], bat echo signals [21], ECG signals [22, 23], and variant construction [24].

The variant method can process massive random sequences and extract statistical measurement features from them. The ECG sequence is a natural random sequence, so the variant method is well suited to measuring the statistical characteristics of massive ECG sequences. The main purpose of this chapter is to study the visual characteristics of ECG signals and to mine valuable information from them. The chapter introduces the overall architecture, module functions, and core algorithms of the variant measurement system. The results show that variant maps provide a new angle of observation for ECG feature detection and make differences in ECG data visually distinguishable.

Section 2 introduces the overall structure and workflow of the variant measurement system. Section 3 describes the functions and algorithms of the core modules. Section 4 presents the experimental data samples and results, and Section 5 summarizes the research.

## 2. Variant map for ECG

### 2.1 Overall structure

The variant measurement system consists of five modules: an input data source module, a variant processing module, a segmentation measurement module, a state statistics module, and an output variant map module. The structure of the system is shown in Figure 1.

As Figure 1 shows, each module has a specific function. The input data source module reads the ECG sequence. The variant processing module discretizes the continuous ECG sequence. The segmentation measurement module divides the sequence into segments. The state statistics module counts the states of the pseudogene sequence. The output variant map module draws the variant map from the resulting measures.

### 2.2 Workflow chart

The five modules in the variant measurement system are independent but connected in sequence. The workflow of the entire system is shown in Figure 2.

As Figure 2 shows, the five modules are arranged in order, and the output of each module is the input of the next. The input and output of each module are as follows:

Input data source module: The input is the data set, and the output is an ECG sequence of length N.

Variant processing module: The input is the ECG sequence of length N, and the output is a pseudogene sequence of length N.

Segmentation measurement module: The input is the segment length value “m” and the pseudogene sequence of length N, and the output is the pseudogene sequence divided into M segments, where N = M × m.

State statistics module: The input is the pseudogene sequence divided into M segments (N = M × m), and the output is the corresponding variant measure values.

Output variant map module: The input is the variant measures, and the output is the variant ECG scatter plot.

## 3. Core module

### 3.1 Variant processing module

The core function of this module is to convert the continuous ECG sequence into a discrete pseudogene sequence over four primitives.

The variant processing module includes three submodules: a parameter setting submodule, a data discretization submodule, and a variant processing submodule. The three submodules are closely related, and the workflow of the module is shown in Figure 3.

As the workflow chart shows, the input and output relationship of the module is:

**Parameter setting:**

Sliding window value “W” and threshold “R.”

**Input:**

The base sequence value of length N:

**Procedure:**

A conversion sequence of length N:

**Output:**

Pseudogene sequence of length N:

The above is the overall workflow of the variant processing module. The functions and core algorithms of its three submodules (parameter setting, data discretization, and variant processing) are described below.

#### 3.1.1 Parameter setting submodule

This submodule sets two parameters: the sliding window value “W” and the threshold “R.” Both parameters can be adjusted dynamically.

#### 3.1.2 Data discretization submodule

The discretization algorithm has three steps: the first calculates the average sequence corresponding to the base sequence, the second calculates the truncated average sequence, and the third calculates the conversion sequence. The three steps are as follows:

1. The first step calculates the average sequence corresponding to the base sequence. A sliding window of width W is moved along the base sequence from its first position, one element at a time, and the average of the values inside the window is calculated at each position. The calculation process is:

**Input:**

The base sequence value of length N is

**Processing:**

Here is an example of the process of calculating a sliding window. Suppose the base sequence in the sliding window value “W” is

**Output:**

The average sequence of length N is

2. The second step calculates the truncated average sequence corresponding to the base sequence. The same sliding window of width W is moved along the base sequence from its first position, one element at a time, and the truncated average of the values inside the window is calculated at each position. The calculation process is:

**Input:**

Base sequence value of length N:

**Processing:**

Here is an example of the process of calculating a sliding window. Suppose the base sequence in the sliding window value “W” is

**Output:**

Truncated average sequence of length N:

3. The third step calculates the conversion sequence corresponding to the base sequence:

**Input:**

Threshold “R,” where R is a positive number (R > 0);

the base sequence of length N is

the average sequence of length N is

the truncated average sequence of length N is

**Processing:**

For example, calculation of the i-th element

**Output:**

Conversion sequence of length N is
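The first two steps can be sketched in Python as follows. This is only an illustration under stated assumptions, because the formulas are not reproduced in the text above. It assumes that the window starting at position i covers elements i to i + W − 1 and is shortened near the end of the sequence so that the output keeps length N, and that the "truncated average" drops the single smallest and single largest value in each window. The function names are hypothetical. The third step is not sketched because its formula cannot be recovered from the description.

```python
from statistics import fmean


def sliding_windows(base, w):
    """Yield the window that starts at each position of `base`.

    Assumption: windows near the end are shorter than `w`, so that
    exactly len(base) windows are produced.
    """
    for i in range(len(base)):
        yield base[i:i + w]


def average_sequence(base, w):
    """Step 1: plain mean of each sliding window."""
    return [fmean(win) for win in sliding_windows(base, w)]


def truncated_average_sequence(base, w):
    """Step 2: mean of each window after dropping its min and max.

    Assumption: one value is trimmed from each end; windows with
    fewer than three elements are averaged without trimming.
    """
    result = []
    for win in sliding_windows(base, w):
        trimmed = sorted(win)[1:-1] if len(win) > 2 else win
        result.append(fmean(trimmed))
    return result


base = [1.0, 2.0, 9.0, 4.0, 5.0]
print(average_sequence(base, 3))
print(truncated_average_sequence(base, 3))
```

Other boundary conventions (for example, padding or a centered window) would change the values near the ends of the sequence but not the overall idea.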

#### 3.1.3 Variant processing submodule

The variant processing submodule converts the conversion sequence into the corresponding pseudogene sequence. Using the threshold, the conversion rule divides the number axis into four intervals, which correspond to the four primitives of the gene sequence: A, G, C, and T.

- When the conversion value is greater than or equal to the threshold, it is mapped to A.
- When the conversion value is greater than 0 and less than the threshold, it is mapped to G.
- When the conversion value is less than 0 and greater than the negative threshold, it is mapped to C.
- When the conversion value is less than or equal to the negative threshold, it is mapped to T.

The conversion rules are as follows:

**Input:**

A sequence of converted values of length N:

**Processing:**

For example, conversion rule between the i-th element

**Output:**

A pseudogene sequence of length N
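A minimal Python sketch of this threshold rule is given below. The function names are hypothetical. The text does not state which primitive a conversion value of exactly 0 receives, so the sketch assigns it to G purely as an assumption.

```python
def to_primitive(value, r):
    """Map one conversion value to a primitive using threshold r > 0."""
    if value >= r:
        return "A"
    if value <= -r:
        return "T"
    if value > 0:
        return "G"
    if value < 0:
        return "C"
    # Assumption: the text leaves value == 0 unspecified; treat it as G.
    return "G"


def to_pseudogene(conversion, r):
    """Convert a whole conversion sequence into a pseudogene string."""
    return "".join(to_primitive(v, r) for v in conversion)


print(to_pseudogene([1.2, 0.3, -0.3, -0.9, 0.85], 0.85))  # AGCTA
```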

### 3.2 Segmentation measurement module

The segmentation measurement module divides the pseudogene sequence into segments. This step is simple, but it is an essential preparation for the state statistics module. Note that segmentation differs from the sliding window used in the variant processing module. The sliding window advances one element at a time and measures W elements at each position, so consecutive windows overlap. Segmentation, by contrast, cuts the data sequence into consecutive, non-overlapping segments of the given segment length. For example, when the segment length is m, a pseudogene sequence of length N is divided into M segments, with N = M × m. The workflow of segmentation measurement is shown in Figure 4.

The input and output relationship of this module is:

**Parameter setting:**

The segment length value, recorded as “m.”

**Input:**

Segmentation length value “m”; pseudogene sequence of length N:

**Processing:**

The pseudogene sequence of length N is cut into consecutive segments of length m.

**Output:**

M groups of pseudogene sequences, each of length m
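Segmentation can be sketched in Python as below. The function name is hypothetical. The text assumes N = M × m exactly, whereas the sample sequences in Section 4 have length 10,254, which is not a multiple of m = 100. The sketch therefore drops an incomplete final segment, which is an assumption rather than a rule stated in the chapter.

```python
def segment(pseudogene, m):
    """Cut a pseudogene string into consecutive, non-overlapping groups of length m.

    Assumption: a trailing remainder shorter than m is discarded.
    """
    if m <= 0:
        raise ValueError("segment length m must be positive")
    usable = len(pseudogene) - len(pseudogene) % m
    return [pseudogene[i:i + m] for i in range(0, usable, m)]


print(segment("AGCTAGCTAG", 4))  # ['AGCT', 'AGCT']
```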

### 3.3 Variant state statistics module

This module analyzes the sequence statistically to reveal patterns in the data and relationships among the data. It calculates the measure values of the primitives A, G, C, and T in the pseudogene sequence: for each group of the grouped pseudogene sequence, it counts the occurrences of each primitive and records the resulting values as a state statistical sequence. The workflow is shown in Figure 5.

As Figure 5 shows, the input and output relationship of the module is:

**Input:**

Segment length “m”; grouped pseudogene sequence

**Processing:**

Processing includes variant conversion statistics and variant probability measurement.

**Output:** Probability measure sequence.

The rules for variant conversion statistics and variant probability measurement are defined as follows.

#### 3.3.1 Variant conversion statistics

The process of variant conversion statistics is illustrated by taking the i-th group in the grouped pseudogene sequence as an example:

Taking the i-th group as an example, the state measurement sequence

#### 3.3.2 Variant probability measurement

The following describes the process of probability measurement by taking the i-th group in the pseudogene sequence as an example:

Taking the i-th group as an example, the probability measurement sequence
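The two statistics for a single group can be sketched in Python as follows. The function names are hypothetical, and because the defining formulas are not reproduced above, the sketch assumes that the probability measure of a primitive is its count divided by the group length m.

```python
from collections import Counter

PRIMITIVES = "AGCT"


def state_counts(group):
    """Variant conversion statistics: count each primitive in one group."""
    counts = Counter(group)
    return {p: counts[p] for p in PRIMITIVES}


def probability_measure(group):
    """Variant probability measurement (assumed: count / group length)."""
    m = len(group)
    return {p: c / m for p, c in state_counts(group).items()}


print(state_counts("AAGC"))
print(probability_measure("AAGC"))
```

Under this assumption the four probability measures of a group always sum to 1.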

## 4. Sample results and brief analysis

### 4.1 Data source description

The ECG data samples come from the First People’s Hospital of Yunnan Province. This batch of data sets was initially analyzed by hospital experts. To facilitate the experimental research, an ECG database was established to classify the ECG data. Of these data, the normal ECG data amount to about 138 MB and the abnormal ECG data to about 362 MB. The collated data samples are shown in Figure 6.

As Figure 6 shows, ECG data are multivalued and have several different attributes, including the PR interval, QT interval, P wave, and QRS complex. In the medical field, the diagnosis of P-wave signals is both a key point and a difficulty in research, because the P wave is central to the diagnosis of arrhythmia. Figure 7 shows a normal ECG signal with the P wave marked.

Based on this background, this chapter selects the P-wave data from the ECG data provided by the First People’s Hospital of Yunnan Province for variant visualization analysis. To keep the experiment rigorous, normal and abnormal P-wave data of the same size were selected. Comparing the variant maps of the normal and abnormal P waves makes it possible to mine useful information from the ECG data.

### 4.2 Meaning of the selected variant map

**Input:**

The data source and the parameter values. The data source consists of a normal P-wave ECG sequence of length 10,254 and an abnormal P-wave ECG sequence of length 10,254. The parameters are the sliding window value “W,” the threshold “R,” and the segment length value “m.”

**Processing:**

The process is completed by the variant measurement system.

**Output:**

Variant maps: the X-axis represents the probability measure of G among the four primitives A, G, C, and T, and the Y-axis represents the probability measure of C. These are marked on the variant map as X = St(G) and Y = St(C).
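The coordinates of the scatter points can be sketched in Python as below. The function name is hypothetical, and St(G) and St(C) are assumed to be the fractions of G and C within each group. Each group of the segmented pseudogene sequence then contributes one point to the variant map.

```python
def variant_map_points(groups):
    """Return one (St(G), St(C)) point per non-empty pseudogene group.

    Assumption: St(X) is the count of primitive X divided by the group length.
    """
    points = []
    for group in groups:
        m = len(group)
        if m == 0:
            continue  # skip empty groups to avoid division by zero
        points.append((group.count("G") / m, group.count("C") / m))
    return points


print(variant_map_points(["GGCA", "CCCT"]))  # [(0.5, 0.25), (0.0, 0.75)]
```

The resulting list of points can be passed to any scatter plotting routine to draw the map, with one series for the normal P wave and one for the abnormal P wave.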

### 4.3 Visualization features

Figure 8 shows an example of the variant maps of the normal and abnormal P waves, obtained with sliding window value W = 24, threshold R = 0.85, and segment length value m = 100. The scatter clusters of the normal and abnormal P waves differ clearly in both shape and distribution. The normal P-wave points are concentrated mainly inside the quadrilateral with vertices (0.3, 0.4), (0.4, 0.1), (0.8, 0.4), and (0.8, 0.7). The abnormal P-wave points are concentrated mainly inside the triangle with vertices (0, 1), (0.4, 0.4), and (1, 1).

To display the variant features more fully, Figure 9 shows the visualization results under different values of “m.”

## 5. Summary

This chapter studies ECG signals using the variant measurement model, its processing method, and variant maps. To some extent, variant maps and the traditional clinical ECG can be compared:

The electrocardiogram is a characteristic map obtained by processing an individual ECG signal. Variant maps mainly target massive ECG signals; they can process individual ECG signals as well as clusters of ECG signals, providing visual analysis at the level of both points and surfaces.

The waveform features on an electrocardiogram are complex and require professional expertise to read, whereas variant maps present waveform features from another perspective, in the form of scatter clusters, and their visual features are simple and clear. Readers who are not ECG specialists can also see the difference between normal and abnormal ECG characteristics.

The experimental results in this chapter show the visual characteristics of the differences in ECG data in a simple and clear way, but the research still has shortcomings. Because of differences in detection instruments, in the period of data collection, and in data sources, and because no ECG diagnostic experts were available to provide guidance, the basic research in this chapter needs further improvement.

Further cooperation with hospital ECG experts is expected at a later stage. Combined with computational methods, this should make it possible to process more targeted ECG data, to refine the variant measurement system into a standard model, and to study quantitative evaluation criteria that correspond to pathological conditions.

It should be noted that the parameters used in the experiments in this chapter were chosen after a large number of trials, based on the integrity, usability, and stability of the image features in the visualization results.

## Acknowledgments

Thanks to the First People’s Hospital of Yunnan Province for the ECG data, and to the Key Project of Electric Information and Next Generation IT Technology of Yunnan (2018ZI002), the National Science Foundation of China NSFC (61362014), and the Overseas Higher-level Scholar Project of Yunnan for financial support of the project.