## 1. Introduction

Medical image segmentation is widely applied and studied in the medical research field, for example, in clinical diagnosis, pathological analysis, surgical planning, and computer‐aided surgery. The global incidence of cancer has been increasing in recent years, which makes the early diagnosis of cancer particularly important. Accurate segmentation is an important part of the computer‐aided analysis of blood cell images. However, blood cell images are characterized by touching cells, frequent severe adhesion, varying cell sizes, and unclear cell boundaries, so they are difficult to segment accurately. In particular, extracting the cell region from a complex background and achieving a good segmentation of adherent cells has become a hot and difficult research topic.

Li et al. [1] and Al‐Kofahi et al. [2] pointed out that cell touching is the most difficult problem in the field of cell segmentation. It easily leads to undersegmentation, in which several cells adhered together are detected and segmented as a single cell, eventually causing errors in cell density calculation, spatial distribution, and morphological analysis. In the field of concave point detection, Anand et al. [3] used color as the feature for segmenting adherent cells; the algorithm can segment highly irregular images with high accuracy. To segment fuzzy and touching cell images accurately, Micko et al. [4] used the fast radial symmetry transform (FRST) to extract target and background markers and proposed an improved watershed algorithm based on FRST for touching cell segmentation. Aymen et al. [5] put forward an improved watershed algorithm based on the gradient distance transform combined with concave point detection. It can split touching cells, and it partially alleviates oversegmentation.

Image segmentation methods based on graph theory have been widely used in recent years [6–10]. Zhang et al. [11] proposed an image segmentation method based on watershed and graph theory. Wang et al. [12] adopted a new image segmentation algorithm based on graph theory and mathematical morphology. Fabijanska et al. [13] used an improved algorithm based on the minimum spanning tree, which increases the speed of image segmentation by reducing the number of vertices in the graph. Song et al. [14] combined graph theory with a multiscale convolutional network (MSCN) to segment touching cervical cell images and achieved good results. Other methods such as Hough circle detection [15] and adaptive template matching [16] are also used for segmenting cell images, but they produce more localization errors and cannot effectively isolate touching cells.

These studies show that the segmentation accuracy of the above algorithms is not high. The main reasons may be the influence of the complex image background and the fact that the dividing lines between touching cells cannot be obtained accurately [17]. Hence, the most critical task is to split the touching cells.

In addition, Felzenszwalb and Huttenlocher [18] proposed an improved minimum spanning tree segmentation algorithm (the FH algorithm): when the internal differences of two regions are larger than the pixel differences between them, the two regions are judged to belong to one homogeneous region and are merged. The algorithm adapts to the characteristics of different images and works with high efficiency. It also has shortcomings: when the threshold is set too large, undersegmentation tends to appear, and when the threshold is set too small, oversegmentation is easily produced, so the segmentation scale is difficult to control.

Based on this, this paper studies a new blood cell image segmentation algorithm based on graph theory and concave point detection.

## 2. Image segmentation based on graph theory

### 2.1. Graph theory

Let *G* = (*V*, *E*) be an undirected graph with vertices *v*_{i} and *v*_{j}. In the case of image segmentation, the elements in *V* are pixels and the weight of an edge is some measure of the dissimilarity between the two pixels connected by that edge (e.g., the difference in intensity, color, motion, location, or some other local attributes).

When a minimum spanning tree (MST) is used to segment images, the image information can be grasped globally, the growth process of the MST preserves the details of each region, and the search for the smallest weight is adaptive, so the global behavior of the segmentation matches human visual characteristics. The algorithm generally guarantees a good segmentation result, is highly efficient, and has a simple data structure. This paper therefore uses an improved MST algorithm for cell image segmentation.

**Figure 2** shows the result of applying the MST algorithm to **Figure 1(c)**. When the threshold is greater than or equal to 51, the image is divided into two parts, the red region and the blue region, as shown in **Figure 2(a)**; when the threshold is 4, the image is divided into three parts, as shown in **Figure 2(b)**.

Thus, the selection of the threshold *K* is very important for the segmentation results.
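The effect of the threshold can be illustrated with a minimal sketch. It merges 4‐connected pixels in Kruskal order whenever the edge weight (here assumed to be the absolute gray difference) does not exceed the threshold. The toy image, the function names `segment_by_threshold` and `count_regions`, and the 4‐neighborhood are assumptions made for illustration; they are not the data of **Figure 1(c)** or the exact procedure of this paper.

```python
def segment_by_threshold(image, threshold):
    """Label pixels by merging 4-neighbors whose gray difference <= threshold."""
    rows, cols = len(image), len(image[0])
    parent = list(range(rows * cols))  # union-find forest, one node per pixel

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Build the edges of the 4-connected grid graph as (weight, u, v).
    edges = []
    for r in range(rows):
        for c in range(cols):
            u = r * cols + c
            if c + 1 < cols:
                edges.append((abs(image[r][c] - image[r][c + 1]), u, u + 1))
            if r + 1 < rows:
                edges.append((abs(image[r][c] - image[r + 1][c]), u, u + cols))

    # Kruskal order: process edges by ascending weight, stop above the threshold.
    for weight, u, v in sorted(edges):
        if weight > threshold:
            break
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[rv] = ru

    return [[find(r * cols + c) for c in range(cols)] for r in range(rows)]


def count_regions(labels):
    return len({label for row in labels for label in row})


# Hypothetical toy image with three gray levels (values chosen for illustration).
toy = [
    [10, 12, 60, 62],
    [11, 13, 61, 63],
    [10, 12, 115, 117],
]
print(count_regions(segment_by_threshold(toy, 4)))   # 3
print(count_regions(segment_by_threshold(toy, 51)))  # 2
```

On this toy image, a small threshold keeps three regions apart, while a larger one merges two of them, which mirrors the sensitivity to the threshold described above.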

### 2.2. Modified MST algorithm

In this paper, the image is mapped into a weighted graph *G* = (*V*, *E*), which is segmented with the Kruskal algorithm using a merging strategy. The algorithm mainly involves three parameters: the Gaussian filter parameter sigma; the threshold function parameter *K*, which controls the extent of segmentation; and the minimum size: if the size of two neighboring regions is less than the minimum size, the two regions are merged. The algorithm has a simple structure and high computational efficiency. The following improvements to the algorithm are proposed.

#### 2.2.1. Improved weight function of edges

In the Felzenszwalb and Huttenlocher algorithm (FH algorithm), the edge weights of the MST represent only the absolute difference of color information between two pixels, without considering their spatial positions. The farther apart two pixels are, the weaker their relevance generally is, so the edge weight should be increased accordingly. In the literature [19], only the edge weights of gray level images are redefined. Here, the weight function is redefined as follows.

For gray level images, the edge weight is defined as:

where

For color images, the edge weight is defined as:

Among them, *H*_{i}, *H*_{j}, *S*_{i}, *S*_{j}, *I*_{i}, and *I*_{j} are the pixel components.

#### 2.2.2. Improved difference function of internal and inter region

We redefine the internal difference, Int (*C*), such that it gives a more accurate description of component *C*. Formally,

where *N* is the number of MST edges in the component, namely *N* = |*C*| − 1. This definition reduces the sensitivity to noise to a certain extent, while the segmentation scale can still be controlled by adjusting the parameter *K*. It is more stable than the original definition and, more importantly, it does not increase the time complexity.

The definition of Diff (*C*_{1}, *C*_{2}) is as the following merge condition:

where Diff(*C*_{1}, *C*_{2}) is the difference between components *C*_{1} and *C*_{2}; Int(*C*_{1}) and Int(*C*_{2}) are, respectively, the internal differences of *C*_{1} and *C*_{2}; and *T*(*C*) = *k*/|*C*| is the threshold function. Parameter *k* controls the size of the components in the image segmentation.
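For orientation, the merge rule of the original FH algorithm [18] can be sketched as follows. This is a minimal sketch of the published FH criterion, in which Int(*C*) is the largest MST edge weight inside *C* and two components are merged when Diff(*C*_{1}, *C*_{2}) ≤ min(Int(*C*_{1}) + *T*(*C*_{1}), Int(*C*_{2}) + *T*(*C*_{2})). It does not implement the redefined Int(*C*) of this section, and the function name `fh_segment` and the toy graph are assumptions.

```python
def fh_segment(num_vertices, edges, k):
    """Original FH merge rule on a weighted graph.

    edges: iterable of (weight, u, v); returns the root label of every vertex.
    """
    parent = list(range(num_vertices))
    size = [1] * num_vertices
    internal = [0.0] * num_vertices  # Int(C): largest MST edge weight inside C

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for weight, u, v in sorted(edges):
        a, b = find(u), find(v)
        if a == b:
            continue
        # Threshold function T(C) = k / |C| added to each internal difference.
        limit = min(internal[a] + k / size[a], internal[b] + k / size[b])
        if weight <= limit:
            parent[b] = a
            size[a] += size[b]
            internal[a] = weight  # edges arrive in ascending order
    return [find(x) for x in range(num_vertices)]


# Hypothetical chain graph: two tight groups joined by one heavy edge.
chain = [(1, 0, 1), (1, 1, 2), (20, 2, 3), (1, 3, 4), (1, 4, 5)]
print(len(set(fh_segment(6, chain, k=3))))    # 2
print(len(set(fh_segment(6, chain, k=100))))  # 1
```

With the small *k*, the heavy edge is rejected and two components remain; with the large *k*, everything is merged, which illustrates how *k* controls the component size.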

#### 2.2.3. Improvement of threshold function and parameter *k*

Felzenszwalb et al. pointed out that a large *k* favors large regions, but they did not give a quantitative relationship between *k* and the region size. It is therefore difficult for users to choose a value of *k* that yields the expected component size. For example, two different *k* values, 150 and 300, are used, but it is not explained why these values were selected rather than others. Determining *k* by trial and error for each particular image is infeasible in real-time applications. Therefore, the expressions of the improved threshold and parameter *k* are as follows:

In formula (5), *k* is a constant. The larger *k* is, the more obvious the boundary between two regions has to be in order to be distinguished. Note that *k* does not set the number of regions in the segmentation; the larger *k* is, the larger the resulting regions are. Based on this, the stop‐merge condition for the component *C* becomes:

In the formula (6), for a given image, (*Diff*(*c*_{1}, *c*_{2}) - *Int*(*c*)) is not decreasing.

## 3. Separation of touching cells based on concave point detection

### 3.1. Determination of cell adhesion and the extraction of core coordinate

Cell touching can be divided into three types: parallel, series, and series‐parallel, as shown in **Figure 3**. In the parallel type, the cells are enclosed in a closed area, as shown in **Figure 3(a)**. In the series type, the cells are connected end to end, as shown in **Figure 3(b)**. The third type contains cells connected both in series and in parallel, as shown in **Figure 3(c)**.

#### 3.1.1. Principle of cell touching

When cells are stuck together, their boundaries become more complex, and concave regions usually appear in the touching areas. The shape factor describes the complexity of the cell boundary, and it is defined as follows:

In the formula, *C* is the perimeter of the object, and *A* is the area of the object.

The area of the target is the total number of pixels in the same labeled region, obtained by scanning the image. The perimeter of the target is the accumulated distance between adjacent edge points along the closed curve. The distance between two edge points adjacent in the horizontal or vertical direction is 1, while the distance between two edge points adjacent in the diagonal direction is √2.

The shape factor does not exceed 1. When the target is close to a circle, the shape factor is close to 1. If cells are stuck together, their boundaries are complex: for the same area, a target with concave parts has a larger perimeter than a target without them, which results in a smaller shape factor. A threshold *P*_{0} is determined through learning and training. When *P*_{E} > *P*_{0}, no cell touching exists; when *P*_{E} is less than or equal to *P*_{0}, cell touching exists. With this constraint, the shape factor can prevent erroneous separation.
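The touching test can be sketched in a few lines. The sketch assumes the common circularity measure *P*_{E} = 4π*A*/*C*², which is consistent with the properties stated above (at most 1, and close to 1 for a circle) but is an assumption rather than a formula taken from the text. The function names and the default `p0=0.5` are placeholders for illustration.

```python
import math


def chain_perimeter(boundary):
    """Perimeter of a closed, ordered, 8-connected boundary.

    Each horizontal or vertical step counts 1, each diagonal step sqrt(2).
    """
    total = 0.0
    for (r0, c0), (r1, c1) in zip(boundary, boundary[1:] + boundary[:1]):
        dr, dc = abs(r1 - r0), abs(c1 - c0)
        if max(dr, dc) != 1:
            raise ValueError("consecutive boundary points must be 8-adjacent")
        total += math.sqrt(2) if dr == 1 and dc == 1 else 1.0
    return total


def shape_factor(area, perimeter):
    """Assumed definition: P_E = 4 * pi * A / C**2."""
    return 4.0 * math.pi * area / perimeter ** 2


def is_touching(area, perimeter, p0=0.5):
    """Touching is reported when P_E <= P_0 (p0 is a placeholder threshold)."""
    return shape_factor(area, perimeter) <= p0


# Ideal circle of radius 5 versus a chain of three such circles (idealized values).
r = 5.0
print(is_touching(math.pi * r * r, 2 * math.pi * r))          # False
print(is_touching(3 * math.pi * r * r, 3 * 2 * math.pi * r))  # True
```

For the idealized chain of three circles, the area triples while the squared perimeter grows ninefold, so the shape factor drops to about 1/3 and the cluster is flagged as touching.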

#### 3.1.2. Extraction of the cell core coordinates

The core of a cell is its central pixel; for touching cells, each cell in the cluster has its own core. Once the touching cells are split into single cells, the problem reduces to calculating the core of a single cell. The algorithm flow chart is shown in **Figure 4**.

#### 3.1.3. Experimental results and analysis

Fu et al. [21] used drawing software to generate cell images. After repeated experiments and training, the best threshold of the shape factor was found to be *P*_{0} = 0.5. When *P*_{E} is greater than *P*_{0}, there is no cell adhesion, and when *P*_{E} is less than or equal to *P*_{0}, there is cell adhesion. However, for cells of complex shape, the shape factor threshold may cause misjudgment. Repeated experimental training shows that the value of *P*_{E} is generally distributed in the range of 0.4–0.6, so a spinner control is added, as shown in **Figure 5**.

The experimental results of core extraction for adherent cells are shown in **Figures 6**–**9**. These results show that the number of cell cores extracted by the algorithm in this paper is consistent with the actual number of cells, and the core positions are also basically accurate.

### 3.2. Principle and method of searching adhesion cell concave point

Angle and curvature are probably the most widely used features in concave point separation algorithms. However, they are susceptible to noise; especially when the cell image has a complex background and the region outside the nucleus is uneven, the separation will not produce the correct cell profile. Hence, for the sake of simplicity and robustness, the best concave points are found by detecting the concave points in the major concave regions. The search for and extraction of the main concave points on the edges are described as follows.

#### 3.2.1. Search for concave points

The concave point is a very important parameter in the study of cell shape. A large number of concave points indicates that there are many touching cells. If a single cell has a large number of concave points, the probability of cell mutation is higher. Studying the concave points is therefore very meaningful.

The contour of a touching cell cluster is generally concave. Let the pixel value of the image background be 0 and that of the foreground be 1. If no background pixel lies on the line connecting any two foreground pixels, the figure is convex; otherwise, it is concave. Therefore, the main problem of the algorithm is to determine the positional relationship between the line connecting two edge points and the cell region. Only local concave points can be concave points of the cell edge; local convex points cannot. Based on this, the algorithm first finds the local concave points of the cell edge and then selects one point from each group of concave points as the main concave point of the corresponding concave region.

As shown in **Figure 10**, let *L*_{i} be the edge of the adherent cell profile and *p*_{j} a point on *L*_{i}. *p*_{j−h} and *p*_{j+h} are the points located *h* pixels before and after *p*_{j} along the edge. Repeated experimental tests show that *h* = 10 gives better results. If at least 60% of the line connecting *p*_{j+h} and *p*_{j−h} lies outside the adherent cell, *p*_{j} is considered a concave point [22]. To enhance the robustness of the algorithm, concave regions that contain only two or fewer local concave points are discarded and only the main concave points are retained; the main concave point is the central point of the corresponding concave region.

Specific implementation steps are as follows:

1. Select a point *p*_{j} on the cell edge.
2. Determine whether the edge at *p*_{j} changes in the *J* (horizontal) direction or in the *I* (vertical) direction.
3. In accordance with the direction found in step 2, search the 8‐neighborhood of *p*_{j} for the adjacent point toward *p*_{j+h} or *p*_{j−h}; if no adjacent point is found, return to step 1.
4. Determine whether the *h*th point has been reached; if not, take *p*_{j+1} or *p*_{j−1} as the starting point and return to step 2; if so, go to step 5.
5. Connect *p*_{j−h} and *p*_{j+h} and compute the percentage of the connecting line located outside the adherent cells; if it is greater than or equal to 60%, *p*_{j} is a concave point; if it is less than 60%, *p*_{j} is not a concave point.
6. Determine whether all edge pixels have been processed; if not, return to step 1; if so, the algorithm ends.
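The chord test in the steps above can be sketched as follows. The sketch assumes that the edge has already been traced into an ordered, closed list of (row, column) points, so the neighbor search of steps 2 to 4 is replaced by simple index arithmetic. The function names, the sampling of the chord by linear interpolation, and the small L‐shaped test mask are assumptions for illustration; the toy example uses *h* = 2 only because the mask is tiny.

```python
def outside_fraction(mask, p, q):
    """Fraction of interior samples on the segment p-q that lie on background (0)."""
    (r0, c0), (r1, c1) = p, q
    steps = max(abs(r1 - r0), abs(c1 - c0))
    if steps < 2:
        return 0.0  # no interior samples between the two end points
    outside = 0
    for i in range(1, steps):
        r = round(r0 + (r1 - r0) * i / steps)
        c = round(c0 + (c1 - c0) * i / steps)
        if mask[r][c] == 0:
            outside += 1
    return outside / (steps - 1)


def local_concave_points(mask, contour, h=10, ratio=0.6):
    """Indices j whose chord p[j-h] -> p[j+h] lies at least `ratio` outside the object."""
    n = len(contour)
    found = []
    for j in range(n):
        before = contour[(j - h) % n]  # the contour is closed, so indices wrap
        after = contour[(j + h) % n]
        if outside_fraction(mask, before, after) >= ratio:
            found.append(j)
    return found


# Hypothetical 6x6 L-shaped object: foreground where col <= 2 or row >= 3.
mask = [[1 if (c <= 2 or r >= 3) else 0 for c in range(6)] for r in range(6)]
# Its boundary, traced clockwise by hand as an 8-connected closed contour.
contour = [
    (0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 3), (3, 4), (3, 5), (4, 5), (5, 5),
    (5, 4), (5, 3), (5, 2), (5, 1), (5, 0), (4, 0), (3, 0), (2, 0), (1, 0),
]
print(local_concave_points(mask, contour, h=2))  # [4, 5]
```

Only the two contour points next to the inner corner of the L shape are reported, since their chords run entirely through the background.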

#### 3.2.2. Extraction of the main concave point

After all the local concave points of the cell edge are extracted, the main concave points are selected from them. First, the local concave points are classified according to the concave regions they belong to. Because the local concave points within one concave region are relatively close to each other, a local distance threshold *D*_{h} is sufficient for this classification. The concave points on the cell edge are divided into *k* classes, the classes that contain two or fewer local concave points are removed, and finally the intermediate point of each remaining group is taken as the main concave point of the concave region.
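A minimal sketch of this grouping step is given below. It assumes that the local concave points are supplied in contour order and that two consecutive points belong to the same class when their Euclidean distance does not exceed *D*_{h}; wrap‐around at the start of the contour is ignored for brevity. The function name `main_concave_points` and the sample coordinates are hypothetical.

```python
import math


def main_concave_points(points, d_h, min_count=3):
    """Group consecutive local concave points and return one main point per group.

    points: local concave points as (row, col), ordered along the contour.
    d_h: distance threshold deciding whether a point joins the current group.
    min_count: groups with fewer points (two or fewer by default) are discarded.
    """
    groups = []
    for p in points:
        if groups and math.dist(groups[-1][-1], p) <= d_h:
            groups[-1].append(p)
        else:
            groups.append([p])
    # The intermediate (middle) point of each surviving group is the main point.
    return [g[len(g) // 2] for g in groups if len(g) >= min_count]


# Hypothetical local concave points forming groups of five, two, and three points.
points = [
    (0, 0), (0, 1), (0, 2), (0, 3), (0, 4),
    (10, 10), (10, 11),
    (20, 0), (21, 0), (22, 0),
]
print(main_concave_points(points, d_h=2))  # [(0, 2), (21, 0)]
```

The group with only two points is dropped, and the middle point of each remaining group is returned.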

#### 3.2.3. Design of adhesion cell separation method

Because of the diversity of the cells and the complexity of cell adhesion, designing the separation algorithm is difficult; the key point is how to find the separation points. When cells are stuck together, a pair of matched points can be found on the edge of the cell profile, and the straight line connecting the two points divides the touching cells into two parts. This pair of matching points satisfies the following property:

According to the concavity and convexity of the adherent cells, the concave regions are computed from the cell adhesion area, and the main concave points are taken as the separation points.

(a) Tandem cell separation

For cells connected in series, the separation points are all located on the edge of the touching region, because each series connection forms a pair of concave regions. Therefore, once the main concave points are found in the concave regions, connecting each pair of concave points separates the tandem cells reasonably. Assuming that the number of concave points is *A* and the number of cells is *M*, then:

If there are only two touching cells, the number of main concave points is 2, and they can be connected directly to split the touching cells. For three or more touching cells, however, the number of main concave points is greater than or equal to 4, and it is necessary to determine which two main concave points form a pair. As shown in **Figure 11**, the green dots are the main concave points of the cells, and the red dots are the centers of the cells.

According to the geometric relationship between the cell cores and the main concave points, a main concave point is close to the cores of the two cells it separates. As shown in **Figure 11**, the distances from *M*1 and *M*3 to *O*3 are much larger than their distances to *O*1 and *O*2.

In conclusion, the distances from *M*1 to *O*1 and *O*2 are minimal, and the distances from *M*3 to *O*1 and *O*2 are also minimal, so *M*1 is paired with *M*3, and *M*2 is paired with *M*4. The experimental results are shown in **Figure 12**.
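The pairing rule can be sketched as follows: each main concave point is assigned its two nearest cell cores, and two concave points that share the same two cores form a pair. The coordinates below are hypothetical values laid out like three cells in series, not measurements from **Figure 11**, and the function name `pair_concave_points` is an assumption.

```python
import math
from collections import defaultdict


def pair_concave_points(concave_points, cores):
    """Pair main concave points that share the same two nearest cell cores.

    concave_points, cores: dicts mapping a label to an (x, y) position.
    """
    by_cores = defaultdict(list)
    for name, point in concave_points.items():
        # Labels of the two cores closest to this concave point.
        nearest = sorted(cores, key=lambda c: math.dist(cores[c], point))[:2]
        by_cores[frozenset(nearest)].append(name)
    return sorted(tuple(sorted(names)) for names in by_cores.values() if len(names) == 2)


# Hypothetical layout: three cores in a row, with concave points at the two junctions.
cores = {"O1": (0, 0), "O2": (10, 0), "O3": (20, 0)}
concave_points = {"M1": (5, 4), "M3": (5, -4), "M2": (15, 4), "M4": (15, -4)}
print(pair_concave_points(concave_points, cores))  # [('M1', 'M3'), ('M2', 'M4')]
```

Each resulting pair can then be connected by a straight line to split the corresponding two cells.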

(b) Separation of parallel cells

For parallel cells, pairing the main concave points is relatively easy. Because the adhesion of parallel cells is located inside the adhesion area, the central point of the adherent cells can be connected with the main concave points to split the touching cells. Assuming that the number of parallel cells is *M* and the number of concave points is *A*, they should satisfy:

The experimental results are shown in **Figure 13**.

### 3.3. Experimental results and analysis

The blood cell images used in this paper come from the First Affiliated Hospital of Fujian Medical University; a total of 35 blood smear cell images of different types were collected. To verify the practicability of the algorithm, experiments were carried out on all 35 images, and some representative images were selected for further analysis. As shown in **Figure 14**, the algorithm can efficiently split the touching cells, and the segmentation result is stable and controllable.

## 4. Flow chart of new algorithm

Based on the above analysis, the general flow chart of the algorithm, which consists of two parts, segmentation based on graph theory and adhesion separation, is shown in **Figure 15**. The part labeled with the red number 1 is the improved image segmentation algorithm based on the MST, and the part labeled 2 splits the touching cells based on concave point detection.

## 5. Experiments and analysis

### 5.1. Experimental result analysis

The original gray cell image is shown in **Figure 16(a)**. The background is clear. In addition to red blood cells, there are some small particles and cell nuclei in the cells, and the gray value of the nuclei is relatively large. It differs greatly from the gray value of the cytoplasm, so the cells are difficult to segment with ordinary methods. The cells are of regular shape overall, except for a small number of touching cells. With the original MST algorithm, there are many rough edges in the segmentation result. Because of the defects of the algorithm, dyeing pollution, and particle noise, many redundant areas are produced and the effect is not ideal; the segmentation result is shown in **Figure 16(b)**. To control region merging, the FH algorithm introduces the size of the area into the construction process of the MST, which reduces the redundant regions in the segmentation results. The holes in the segmentation results are thus eliminated and the cell surface becomes smooth, as shown in **Figure 16(d)**. The watershed algorithm is intuitive, fast, and accurate, and it is widely applied in medical image segmentation. It is more effective for segmenting touching cell images, but it is prone to oversegmentation. The watershed segmentation results are shown in **Figure 16(e)**, where the oversegmentation is very obvious. The split results of FCM and mean shift are shown in **Figure 16(f)** and **(g)**; they are not ideal. For **Figure 16(d)**, the split result of the touching cells is shown in **Figure 16(h)**.

**Figures 17** and **18** are more complex than **Figure 16**: the gray value of the target is close to that of the background, there are more holes caused by uneven illumination, and there are many touching cells. In **Figure 17**, the contrast between the object and the background is relatively obvious, but there are also more holes caused by uneven illumination, so ordinary segmentation algorithms have difficulty segmenting this kind of image. In **Figures 16(b)**, **17(b)**, and **18(b)**, there are many rough edges in the segmentation results, and similar areas are not well merged. In **Figures 16(c)**, **17(c)**, and **18(c)**, the rough edges are reduced and the regions are merged well, but some redundancies remain. Compared with the results of Ref. [18], the results in **Figures 16(d)**, **17(d)**, and **18(d)** are closer to ideal. The split results of each algorithm show that, in comparison with several commonly used classical algorithms, the proposed algorithm gives a more satisfactory segmentation result.

The above segmentation results show that the proposed algorithm performs well on the adhesion separation part of medical cell images. It can also be used effectively to separate adherent targets in other kinds of images. **Figure 19** is a land flow particle image in which the object and the background are clearly distinguishable. Because the viewing distance is large, the rock surface information is vague. The contours are relatively clear, but some individual parts adhere to each other. In **Figure 19(b)**, the MST‐based segmentation separates the target from the background well, but it also increases the adhesion degree of the rock blocks. In **Figure 19(c)**, there are a lot of rough edges and holes. In **Figure 19(d)**, the rough edges are removed, and different regions are distinguished. As shown in **Figure 19(e)**, the watershed segmentation is able to handle some of the adherent parts, but it easily causes oversegmentation. The FCM‐based segmentation is unable to handle the adherent parts of the rock mass, as shown in **Figure 19(f)**. The result in **Figure 19(g)** is also not ideal.

### 5.2. Location analysis

In this paper, the proposed method is used to segment these images, and the objects in the final segmentation results are then located to verify the accuracy of the algorithm. The positioning results are shown in **Figures 20** and **21**.

### 5.3. Performance analysis

To illustrate the differences between the new algorithm and the others, comparative data are listed in **Table 1**. The data include the numbers of oversplit and undersplit objects and the total number of cells. The new algorithm has the smallest values in these statistics and therefore presents a better result than the other algorithms in **Table 1**. For this kind of touching cell image, the watershed algorithm performs better than the remaining algorithms, but it causes an oversegmentation problem.

## 6. Conclusion

To solve the segmentation problem of medical cell images with fuzzy and touching characteristics, this paper proposes an algorithm combining a modified MST with concave point detection. The MST method is improved in three aspects, namely, the regional difference function, the edge weight function, and the threshold function with its parameter *k*, which reduces the effect of noise on the segmentation result and improves the segmentation accuracy. However, the improved MST alone cannot solve the cell touching problem. To split the touching cells, concave point detection is adopted to find the separation points. In comparison with the results of several commonly used image segmentation algorithms, the segmentation results of the proposed algorithm do not contain many small areas, oversegmentation basically does not appear, and the touching cells can be split accurately, which helps to improve cell counting and recognition. A large number of tests show that, with the new algorithm, the numbers of undersegmented and oversegmented objects are smaller and the error rate is relatively low.