Comparison of point detection between Hough forest and proposed method.
Ultrasound image analysis and recognition techniques for improving workflow in diagnosis and treatment are introduced. Fully automatic techniques for applications of cardiac plane extraction, foetal weight measurement and ultrasound-CT image registration for liver surgery navigation are included. For standard plane extraction in 3D cardiac ultrasound, multiple cardiac landmarks defined in ultrasound cardiac examination guidelines are detected and localized by a Hough-forest-based detector, and by six standard cardiac planes, cardiac diagnosis is extracted following the guideline. For automatic foetal weight measurement, biparietal diameter (BPD), femur length (FL) and abdominal circumference (AC) are estimated by segmenting corresponding organs and regions from foetal ultrasound images. For ultrasound-CT liver image registration, initial alignment is obtained by localizing a corresponding portal vein branch from an intraoperative ultrasound and preoperative CT image pair. Then portal vein regions of the ultrasound-CT image pair are extracted by a machine learning method and are used for image registration.
- ultrasound image analysis
- automatic measurement
- machine learning
- image registration
Ultrasound imaging is widely used for examination and navigation in diagnosis and therapy procedures. However, ultrasound image quality is affected by speckle noise and physical characteristics of the patient. Moreover, ultrasound image acquisition highly depends on skills and experience of a user. As a result, high-accuracy measurement or navigation by using ultrasound imaging is difficult and time-consuming. Therefore, it is necessary to develop automatic ultrasound image analysis and recognition techniques to improve the efficiency and accuracy in workflow of diagnosis and treatment. In this chapter, automatic techniques for applications of cardiac plane extraction , foetal weight measurement and ultrasound-CT image registration for liver surgery navigation [2, 3] are introduced.
2. Standard plane extraction in 3D cardiac ultrasound
2.1. Overview of standard plane extraction
Cardiac ultrasound device is a necessary tool that can help clinicians evaluate, diagnose and treat cardiac diseases. In a routine cardiac examination, six standard planes, apical four chamber (A4C), apical two chamber (A2C), apical three chamber (A3C), parasternal short-axis mitral valve (PSX MV), parasternal short-axis papillary muscle (PSX PM) and parasternal short-axis apex (PSX AP)  are commonly used to evaluate the structure and function of the heart. However, diagnosis using ultrasound device still suffers from inefficiency problems, such as user/patient dependency and complex operational procedures. There are market needs of improving the workflow in ultrasound diagnosis. According to such needs, there is a trend that cardiac diagnosis is changing from 2D diagnosis to 3D diagnosis. In conventional 2D ultrasound diagnosis shown in Figure 1(a), 1D array probe is used for data acquisition. Clinicians have to change multiple positions to check different views of the heart. As a result, a large number of images are acquired. During the measurement, clinicians have to evaluate and diagnose cardiac diseases manually, which is inefficient. On the other hand, in 3D ultrasound diagnosis in the near future shown in Figure 1(b), a 2D array probe is used to acquire a whole heart volume. Measurement then can be applied automatically using the acquired volume, and it is much more efficient than the conventional 2D ultrasound diagnosis. Therefore, an efficient and robust method for automatic plane extraction in 3D cardiac ultrasound is significant for improving the cardiac examination workflow.
There are literatures on automatic detection and localization in medical volume data [5, 6, 7, 8]. However, large computational load is required since the standard planes were extracted in an independent way in most of these works. This chapter introduces a machine learning framework based on the cardiac-ultrasound examination guideline presented by the American Society of Echocardiography . The presented framework is shown in Figure 2. The procedures are as follows: (1) Feature point detection: The above guideline indicates that clinicians should firstly localize the A4C plane with image features of mitral annulus and apical in the examination. Accordingly, three anatomical points on the A4C plane are selected as targets of localization, and a Hough forest classifier [9, 10] is modified and applied for the feature point localization. As a result, the A4C plane is localized. (2) Plane initialization: The guideline indicates that there are anatomical regularities between A4C and the other five planes. Therefore, initial locations of the other five planes are estimated by using that of the A4C plane and the anatomical regularities. (3) Plane refinement: Plane location refinement is required due to individual differences. Here, a regression forest method with locations constraints is presented for the plane localization refinement.
2.2. Standard plane initialization
In the presented method, three anatomical points, including the apex, left mitral annulus (left MA) and right mitral annulus (right MA), are selected for the localization of the A4C plane. Feature point localization is performed by utilizing a Hough forest classifier, which is presented in Section 2.2.1. Moreover, on the purpose of improving the accuracy and speed, a multi-scale hierarchical searching approach is presented in Section 2.2.2. Next, initial locations of all six planes are determined by using the locations of the earlier feature points and anatomical regularities. The explanation is given in Section 2.2.3.
2.2.1. Hough forest
A Hough forest classifier is modified and applied for feature point localization. This approach can map from image patches to anatomical locations. In our work, Hough forest is extended from 2D to 3D by using 3D image features and 3D Hough voting.
Training process: In Hough forest, each tree T is constructed on the basis of a set of image patches , where is image appearance of the image patch in 3D, is class label (positive or negative) and is the offset from the patch location centre to the target location centre. For a background patch, is undefined. When the training converges, each image patch reaches a leaf node L of the constructed tree. The information of such patches is stored in the leaf node. Such information includes , which is the proportion between the object patches, the background patches and which is the list of the offset vectors. During training, before reaching the leaf node, each patch goes through binary tests of non-leaf nodes in each tree. The binary test is based on the appearance vector (image feature vector) of each patch where refers to one channel of the feature vector, such as pixel intensity, first or second derivative and so on. During training, each node needs to be constructed with a binary test. A key point of Hough forest is how binary test is evaluated. In order to achieve an optimal test, the uncertainties in both the class labels and the offset vectors should decrease towards the leaves. Class label uncertainty and offset uncertainty are defined as:
where is a set of patches, is the number of patches, is the proportion of patches with label in set and is the mean offset vector over all object patches. The node construction process is as follows . Given a training set of patches, we firstly generate a pool of pixel tests by uniformly choosing one feature channel and two pixel locations inside a patch. The threshold τ for each test is chosen randomly from the range of differences observed on the data. Then, the randomized decision is made whether the node should minimize the class label uncertainty or the offset uncertainty. The process can be represented as:
where * indicates the uncertainty measure for the node (or ), is the set of patches that satisfy the binary test and is the set of patches that satisfy the binary test .
Testing process: The detection process can be divided into regression and voting steps. The regression process is as follows. Step 1: for each pixel at a location , an image patch surrounding the pixel with a fixed size is extracted and input is fed to the trained forest (trees). Step 2: when passing through each node, the patch is split into either the left or the right child node according to the binary test of the trained node. All pixels in the image are going through the forest simultaneously in steps 1–2 until they reach the leaves. During the voting process, the information stored in the leaves is used to cast the probabilistic Hough votes to the location of the object centre. Leaf information consists of proportion and offset vectors , so that is defined as a weight value for a vote. Each pixel with its location votes to all locations (the candidate positions of the object centre) with a weight value . After Gaussian filtering of all the accumulated votes at all the candidate positions from each pixel, a Hough image can be obtained. The maxima position of the Hough image can be considered as the detected object centre, that is, the detected vessel branch.
2.2.2. Coarse-to-fine localization strategy
In order to improve the efficiency and accuracy of the landmark localization, a coarse-to-fine strategy is applied to both the training and test procedures. In the training procedure, images of training data are firstly down-sampled to a low resolution level and the corresponding Hough forest classifiers are trained. Then regions of interest (ROIs) centred at the landmark locations are extracted for the training of the fine resolution level. In the test (detection) procedure, the rough landmark location is estimated at the coarse level and the ROI centred at the detected location are extracted at the fine resolution level. By applying the fine-level classifiers, final detection result can be achieved.
2.2.3. Plane initialization with anatomical regularity
First, the A4C plane can be localized by the detection of the three feature points. The cardiac long axis can also be localized by point A and the centre of point B and point C, as shown in Figure 2. Next, the A3C and A2C planes can be localized by rotating the A4C plane along the long axis at angles of 53 degrees and 129 degrees, respectively. Finally, localization of the three short-axis planes (MV, PM and AP) starts from positions that are perpendicular to A4C. Then they can be localized by translating along the long axis with proportional intervals of 1/6, 3/6 and 5/6.
2.3. Plane refinement with regression forest
The refinement is divided into two parts: refinement of the long-axis planes and refinement of the short-axis planes. The long-axis planes include A4C, A3C and A2C. Because the location of A4C can be determined by three detected points, only the refinement of the other two long-axis planes (A3C and A2C) is considered. According to the guideline of standard plane extraction, the long-axis planes should all pass through the long axis. An example of plane A3C is shown in Figure 3. A geometry theorem defines that two lines (not collinear) will uniquely determine a plane. Therefore, the long-axis planes can be localized by the long-axis and an angle around the long axis . Here, the angle equals to a line with a point intersected at the long axis. Since the long axis is fixed by the detected feature points, during refinement, only the angle of the planes (A3C and A2C) needs to be optimized. A method based on regression forest [7, 8] is used for searching the optimized angles.
According to the guideline of standard plane extraction, the short-axis planes should be perpendicular to the long axis. An example of plane PM is shown in Figure 4(a). In addition, there is a geometry theorem defining that a plane is uniquely determined by a nonzero normal vector and a point. Therefore, the short-axis planes can be localized by the long axis (nonzero normal vector) and a point on each plane. Here, anatomical feature points are selected on each plane for refinement, as shown in Figure 4(b). Three points selected as feature points are (1) for plane MV, the centre of mitral valve (marked as point D), (2) for plane PM, centre of papillary muscle (the bottom one, marked as point E) and (3) for plane AP, the centre of apex (marked as point F). Point D, E and F are first detected from the 3D volume for the refinement of the short-axis planes. Each short-axis plane is then determined by the long axis and each detected point. The detection of point D, E and F also uses the method which is presented in Section 2.2.1. However, for the feature points on the short axis, only coarse-level detection is applied. Evaluation experiments demonstrated that the fine-level detection applied for these points led to a larger error. This is mainly because these three points are lacking of discriminative local features which are crucial for the success of the fine-level detection. Therefore, the whole volume of information of the heart is applied to predict the final location of the three feature points on the short-axis planes.
2.4. Experiments and discussions
The presented method was evaluated on a 3D cardiac ultrasound dataset that was available in the works of Tobon-Gomez et al. . The database includes 15 datasets from healthy volunteers. Only the end diastole frame from each volunteer is used in this experiment. A fivefold cross-validation scheme was performed in the evaluation. Moreover, data augmentation was performed by randomly rotating and scaling the original cardiac volume. As a result, 120 volumes were generated for training, and 30 volumes were generated for evaluation in each of the fivefold validations. Dimensions of the volumes were around 320 × 347 × 241, and image resolution is 0.5 × 0.5 × 0.5 mm3.
To measure the accuracy of plane extraction, that is, how close the extracted plane is from the ground-truth plane, two evaluation standards were presented by Lu et al. . Such standards are angle error and distance error. The angle error is defined as the angle between the normal vector of the ground-truth plane and that of the extracted plane. The distance error is defined as the distance of an anchor on one plane to the other plane, where the anchor is the LV centre. Evaluation results of the proposed method and the previous works by Lu et al. and Chykeyuk et al. [5, 7] are listed in Table 1. Average angle errors and distance errors of plane extraction were measured. Run time was measured as the total computational time of the six-plane extraction. The experiments were performed on an Intel® Core™ i7 3.6 GHz computer with 16 GB of RAM. Examples of standard plane extraction are shown in Figure 5. As shown in Table 1, plane extraction accuracy was improved by about 30%, while computational time was significantly reduced.
3. Automatic foetal weight measurement
Generally, growth diagnosis of foetus is used by length of foetal head, abdomen and femur in ultrasound image. Nowadays, it is measured by doctor or clinical examiner, and the measurement process is very complex and needs a long time. In this chapter, a fully automated method for estimated foetal weight (EFW) measurement is proposed. In Figure 6, measurement plane images and measurement positions in common ultrasound scanning are shown . Biparietal diameter (BPD) for foetal head, abdominal circumference (AC) for foetal abdomen and femur length (FL) for foetal femur are measured. Using the above three measurement results, foetal weight can be estimated by common formulas such as.
3.2. Automatic BPD measurement
In Figure 7, the procedures of BPD measurement are shown. At first, the input ultrasound image of the foetal head is transformed to that of polar coordinates, as shown in Figure 8. Second, line-route searching on pixels with high intensity in the polar transformed image is performed by using dynamic programming (DP). Third, the detected line route is transformed back to the original image coordinates. Then, ellipse fitting is performed on pixels with high intensity. Here, direct least squares fitting (DLSF)  is applied. According to the shape of foetal head, only the ellipses with aspect rates of higher than 0.65 are adopted as candidate results. Next, mid-line of foetal head is masked out, and outer-inner processing is performed to obtain the final BPD measurement result. Figure 9 shows example results of ellipse fitting and BPD measurement.
3.3. Automatic FL measurement
In Figure 10, the procedures of FL measurement are shown. First, as image preprocessing, intensity normalization and weighting is performed to prevent misdetection of tissues of placenta or uterine wall. Second, candidate areas of foetal femur are detected by using entropy binarization . Figure 11(a) and (b) shows example results of such detected areas. Third, area integration process is performed to extract appropriate range of femur area by using second moment of the candidate area. Figure 11(c) shows the integration result of the candidate detected areas in Figure 11(b). Finally, measurement points are detected by intersection points between searching line along the second moment angle and the femur area. Figure 12 shows the detection result of measurement.
3.4. Automatic AC measurement
In Figure 13, the procedures of AC measurement are shown. First, foetal abdomen area is detected by using an AdaBoost classifier. Detection scale is set as 150-500 pixels. Second, feature points representing abdomen contour are extracted by using Laplacian of Gaussian (LoG) filter. Third, ellipse fitting on the abdomen contour candidates is performed. Next, outliers are removed by using a  RANdom Sample Consensus (RANSAC-)based method. Finally, measurement points are detected. Figure 14 shows an example of obtained measurement points.
3.5. Experiments and discussion
In evaluation experiments, 32 foetal head images, 44 foetal femur images and 105 foetal abdomen images are collected. Measurement points for BPD, FL and AC are automatically detected by using the proposed method. Such results are compared with those of manual measurement. Success rate of measurement and running time (estimate running time in C++) are shown in Table 2. We can see that measurement points for foetal weight estimation can be automatically and accurately detected with the proposed method.
|Success rate (%)||90.6||90.9||93|
|Run time (ms)||34||5||79|
4. Ultrasound-CT liver image registration
The registration of intraoperative ultrasound image to preoperative planning data is an important issue in the area of surgery navigation. In this chapter, a fully automatic method for anatomical landmark localization, portal vein region classification and intraoperative-preoperative image registration is proposed. The registration framework is shown in Figure 15, and the system configuration is shown in Figure 16. Before registration, a 3D freehand ultrasound image is acquired with an ultrasound probe which can be tracked by a position sensor. In order to find an appropriate initialization to guide the registration, a specified vessel branch (here, the right and left portal vein branch, PB, is used) is recognized and localized from the ultrasound image and CT image, respectively. The branch localization is based on a Hough forest detector. Then, local 3D images around the detected branches are then extracted from the ultrasound image and CT image, respectively. These two local images are used for the following registration. As a result, a reliable initialization (translation) of the registration is achieved. Next, vessel candidate regions are extracted from both local images. According to the opinions of surgeons, portal vein regions should be used for ultrasound-CT liver image registration. Here, an AdaBoost-based method was proposed for recognizing the portal vein region. Since the coordinates of the extracted portal vein regions can be represented as point sets in 3D space, the vessel point sets of the ultrasound-CT local images are then registered by using iterative closest point (ICP) .
4.2. Portal vein branch localization
Hough forest detector mentioned in Section 2.2.1 is modified and applied for the portal vein branch localization. Hough forest provides a way to map from local image patches to anatomical locations. It is a combination of random forest and Hough transform. Hough forests are sets of decision trees learnt from the training data. Each tree in the Hough forest maps local appearance and anatomical locations of image patches to its leaves and each leaf provides a probabilistic vote in the Hough space.
4.3. Vessel candidate region extraction
When the vessel branch is detected, a local 3D image around the branch is cropped from the ultrasound image, and the voxel spacing is resized to 1.0 × 1.0 × 1.0 mm3. Vessel regions are then extracted from these local images. For the purpose of reducing computational time, a 2D image filtering and thresholding approach is applied. First, 2D image slices are extracted from the ultrasound volume. Then denoise processing by using median filter is executed. Next, image contrast enhancement is performed, and vessel candidate regions are segmented and binarized on each slice by using an adaptive mean filter. After that, the resulting binary images are constructed to a 3D binary volume. Finally, connected components which include the location of the detected bifurcation are extracted as vessel regions. Brief explanations of the image filtering and thresholding will be given as follows.
Contrast-enhanced filtering: A contrast-enhanced filter is applied to the 2D ultrasound slices as a preprocessing step of the vessel extraction. The filter is defined as:
where is the contrast-enhanced image, is a mean-filtered image, is a weighted mean-filtered image and is a coefficient that represents the degree of image contrast enhancement. Here, a mean filter and a Gaussian filter are used for the mean filtering and the weighted-mean filtering. Coefficient is set as experimentally.
Adaptive mean thresholding: After the contrast-enhanced filtering, image thresholding is applied to the 2D ultrasound images. If the intensity of an interesting pixel is smaller than the threshold value, the pixel is considered as a part of the vessel. Such a threshold value is determined by mean filtering on the local surrounding region of the pixel:
Here, is the output value of an mean filtering and is a coefficient that represents the standard deviation of the filtered region. In this work, and are determined empirically. The resulting binary images are then reconstructed to a 3D binary image. Since the portal vein volumes are considered to have relative large capacities in liver vessels, only eight of largest labelled volumes are extracted as candidate regions for the next procedure.
4.4. Portal vein region classification
As mentioned earlier, it is necessary to classify and extract portal vein regions from the above vessel candidate regions. In this chapter, a machine learning-based method is proposed for the portal vein region extraction, as shown in Figure 17. It is based on an object detector by using AdaBoost classifier . Figure 18 shows procedures of the proposed method. First, input vessel candidate regions. Second, randomly sample a number of image patches from the input candidate regions. Next, input the image patches to the AdaBoost classifier and perform detection of portal vein objects. Finally, according to the votes of detected portal vein objects, determine whether the vessel candidate regions belong to portal vein.
The AdaBoost classifier is composed of a set of weak learners with associated weights. Each weak learner uses a single image feature to produce a hypothesis. In training process, AdaBoost is used to find the best weak learners and the corresponding weights for these classifiers. In this chapter, portal vein objects are detected from image patches which are randomly sampled from the extracted vessel candidate regions. Figure 19(a) shows how the image patches are sampled for training data generation, and Figure 19(b) shows some examples of positive and negative image patches. Positive examples are sampled from the extracted regions which belong to portal vein, and negative examples are sampled from other extracted regions which do not belong to portal vein.
In the stage of portal vein region identification, image patches are randomly sampled from five of the largest vessel candidate volumes. The number of image patches sampled from each vessel candidate volume is the same. Next, such image samples are inputs to the trained AdaBoost-based portal vein object detector. Once more than one object of portal vein regions is detected, the image patch is identified as a portal vein sub-region. The number of detected sub-regions is measured on each vessel candidate volume. If this number exceeds a threshold value, the corresponding candidate volume is classified as a portal vein object. The resulted regions are used for ultrasound-CT point-set registration.
4.5. ICP registration of vessel point sets
Since image quality and characteristics of the ultrasound volumes differ greatly from those of preoperative modalities, intensity-based approaches are difficult to obtain a reliable registration result. On the other hand, coordinates of the segmented vessel regions can be considered as point sets in 3D space. Such point sets without intensity information are able to be utilized to a registration. Here, ICP method for point set registration is applied. Figure 20 illustrates an example of ultrasound-CT vessel point-set registration; the first row shows the point sets and the second row shows the corresponding cross-sections of CT and ultrasound images. For initial registration, a large scale of transformation is needed to be estimated. Fortunately, only rotation should be updated since vessel branch detection already provides rough translation. For the fine registration, a robust method of wrong-pair rejection is applied for removing vessel points outside the field of view (FOV) of an ultrasound image.
4.6. Experiments and discussion
Ultrasound-CT image registration accuracy was evaluated by using clinical data acquired in liver resection surgery. Ultrasound-CT images from 44 patients were used in the experiment. Since more than 1 ultrasound volume is acquired from each patient, 30 CT images and 114 ultrasound images of 30 patients were for training, and 14 CT images and 36 ultrasound images of 14 patients were for testing. Dimension sizes of the CT and ultrasound images are and , respectively. Image resolutions are mm3 and mm3, respectively. All the experiments were performed on a system with Intel® Core™ i7 CPU 3.70-GHz and 12-GB memory. Evaluation results are shown in Table 3. The proposed method was compared with our previous work . In , we set a searching area around the branch, set sample points inside the area and estimate and threshold the vesselness by using Hessian-based filtering. The connected region with the largest number of vessel points is considered as portal vein region. Registration that has an error less than 30 mm is considered as a succeeded one. Distance between the measurement points before registration was 137.6 ± 18.1 mm. After registration, errors of the proposed method and our previous work were 8.7 ± 4.4 and 10.2 ± 8.7 mm, respectively. Success rates were 94.4 and 77.8%, respectively. Running time of the proposed method was 9.3 s and that of the previous work was 8.1 s. It can be seen that the proposed method outperforms the previous work by improving both the registration accuracy and robustness. The running time is slightly longer than that of the previous work because AdaBoost-based classification is performed to select portal vein regions. The objectives of registration error, success rate and running time were achieved. The study was approved by the ethics committee of Hitachi group headquarters.
|Distance before registration (mm)||Registration error (mm)||Success rate (error < 30 mm)||Running time (s)|
|Proposed method||137.6 ± 18.1||8.7 ± 4.4||34/36 = 94.4%||9.3|
|Previous work ||10.2 ± 8.7||28/36 = 77.8%||8.1|
Fully automatic techniques for ultrasound imaging applications of cardiac plane extraction, foetal weight measurement and ultrasound-CT image registration for liver surgery navigation are introduced. With such techniques, automatic ultrasound examination and navigation can be achieved accurately and efficiently. Workflow in diagnosis and treatment by using ultrasound image analysis can be improved.