Computer Aided Diagnosis - Medical Image Analysis Techniques

Breast cancer is the second leading cause of death among women worldwide. Mammography is the basic screening tool for finding abnormalities at the earliest stage, and it has been shown to be effective in reducing mortality rates caused by breast cancer. Mammograms produced by low-radiation X-ray are difficult to interpret, especially in the screening context. The sensitivity of screening depends on image quality and on how clearly the evidence appears in the image. Because radiologists find digital mammography difficult to interpret, computer-aided diagnosis (CAD) technology helps to improve their performance by increasing the sensitivity rate in a cost-effective way. Current research is focused on designing and developing medical imaging and analysis systems that use digital image processing tools and artificial intelligence techniques to detect abnormal features, classify them, and provide visual evidence to radiologists. Computer-based techniques are well suited to mass detection in mammography, feature extraction, and classification. The proposed CAD system comprises several steps: preprocessing, segmentation, feature extraction, and classification. Though commercial CAD systems are available, identifying subtle signs for breast cancer detection and classification remains difficult. The proposed system presents some advanced techniques in medical imaging to overcome these difficulties.


Introduction
In the medical imaging field, computer-aided detection (CADe) or computer-aided diagnosis (CADx) is a computer-based system that helps doctors make decisions swiftly [1,2]. Medical imaging deals with information in images that medical practitioners and doctors have to evaluate and analyze for abnormality in a short time. Image analysis is a crucial task in medicine because imaging is the basic modality for diagnosing disease at the earliest stage, but image acquisition must not harm the human body. Imaging techniques like MRI, X-ray, endoscopy, and ultrasound provide good-quality images when acquired with high energy, but high energy harms the human body; hence, images are taken at lower energy and are therefore of poor quality and low contrast. CAD systems are used to improve image quality, which helps to interpret medical images correctly, and to process the images so as to highlight the conspicuous parts [3].
CAD is a technology that combines multiple elements, such as concepts of artificial intelligence (AI), computer vision, and medical image processing. The main application of CAD systems is finding abnormalities in the human body. Among these, tumor detection is the typical application, because a tumor missed in basic screening can progress to cancer [4].

Objectives of the CAD system
The main goal of CAD systems is to identify, at the earliest stage, abnormal signs that a human professional may fail to find. In mammography, these include identifying small lumps in dense tissue, finding architectural distortion, and predicting the mass type as benign or malignant from its shape, size, etc.

Significance of the CAD system
CADe is usually restricted to marking the visible parts or structures in an image, whereas CADx helps to evaluate the structures identified by CADe. Together, the two make CAD models more significant in identifying abnormalities at the earliest stage. For example, CAD highlights microcalcification clusters, the marginal structure of a mass, and highly dense tissue structures in mammography. This helps the radiologist draw a conclusion. Though CAD has been used for over 40 years, it still does not reach the expected outcomes. We agree that CAD cannot substitute for the doctor, but it definitely makes radiologists better decision makers. It plays a supporting and final interpretative role in medical diagnosis.

Applications of CAD system
CAD is used in the diagnosis of breast cancer, lung cancer, colon cancer, prostate cancer, bone metastases, coronary artery disease, congenital heart defect, pathological brain detection, Alzheimer's disease, and diabetic retinopathy.

CAD for breast cancer
Breast cancer ranks as the second leading cause of death in women worldwide. According to the American Cancer Society, about one in eight women will have breast cancer in her lifetime, and only 5-10% of breast cancers occur in women with a clearly defined genetic link [5]. Hence, early detection helps toward a better quality of life, economical treatment, and peace of mind for the patient and family. Using a low dose of X-ray, mammography is the most basic screening test for breast cancer and also records well-visualized internal details of the breast [6]. Mammography images usually contain many artifacts and much noise, which makes it difficult to detect and understand cancer at the primary stages [7]. Therefore, standardization of image quality and extraction of the region of interest (ROI) are essential to limit the search for abnormalities.
CAD systems fundamentally work on highly complex patterns found in images. For breast cancer, CAD is used in screening mammography. Mammography is a basic screening test for breast cancer: low-level X-ray imaging of the female breast. It helps in early detection of breast cancer [8,9], and it is mainly established in the Netherlands and the United States, in addition to the human evaluation conducted every year. The first CAD for mammography was developed at the University of Chicago as a research project. Today, CAD is offered commercially by iCAD, R2 image checker (version 3.8.17), and Hologic. Some non-commercial systems were also developed, such as the Alan Hshieh gradient-based software and the Ashita project. Some studies of CAD in mammography report a positive impact, but some show no improvement [10,11]. A systematic review of CAD in screening mammography concluded that it has no significant impact but undesirably increases false-positive rates. A CAD system that achieves high accuracy and sensitivity benefits both the diagnosis of mammograms and the patients. Normally, CAD systems are optimized on a number of images. These images are analyzed in many steps, as shown in Figure 1. Support vector machine (SVM) and principal component analysis (PCA) are further examples of the algorithms used. For classification of mass type in mammography, an SVM classifier is used. If a detected structure reaches a certain threshold level, it is marked by the radiologist; in some CAD systems, abnormalities are marked automatically and saved for later examination.

Evaluation of CAD systems
CAD systems are evaluated by two major measures, sensitivity and specificity, as they search for suspicious structures. CAD systems may not be 100% accurate, but their hit rate (sensitivity) can be up to 98% these days. The accuracy of a CAD system depends on the condition of the images used for training and on factors like retrospective design. Image quality, the conditions of the mammography examination, the radiologist's marks, the type of lesion, and the size and location of the mass are highly influential.

Dataset used
Researchers recommend many standard datasets for testing CAD algorithms for mammography, but most are not freely available. The most easily available are the Mammographic Image Analysis Society (MIAS) database and the Digital Database for Screening Mammography (DDSM). Besides these, the mini-MIAS database, B-SCREEN (Bayesian decision support in medical screening), AMDI (indexed atlas of digital mammograms), image retrieval in medical applications (IRMA), MammoGrid (a European federated mammogram database implemented on a grid structure), and the grid platform for computer-aided library in mammography (GPCALMA) datasets are available [12]. To test and analyze the CAD model, the MIAS mini-mammographic database (i.e., the mini-MIAS database of mammograms) [13] is used. The MIAS dataset was organized by a UK research group from films taken for the National Breast Screening Programme and digitized to 50-μm pixels. The dataset consists of 322 images of size 1024 × 1024, with radiologist marks where an abnormality exists.

Methodology
Breast cancer diagnosis requires systematic image analysis, characterization, and integration of numerous clinical and mammographic variables, which are difficult and error-prone tasks for physicians [14,15]. This leads to a low positive predictive value of imaging interpretation. Integrating computer models into the radiological interpretation process can increase the accuracy of image interpretation. Hence, CAD models help in early detection and accurate analysis of breast cancer. The CAD model presented here aims to detect abnormalities and identify their type. Figure 2 gives a detailed diagram of the steps carried out in the CAD system for breast cancer detection and classification.

Preprocessing
Preprocessing is the foremost task in medical imaging: it helps to identify abnormal parts that cannot be recognized by visually inspecting the image but can be detected by CAD systems. In preprocessing, image quality is enhanced by removing from the mammogram the unwanted artifacts marked in Figure 3.
Several methods for preprocessing mammography images have been reported since 1980 because of its influence on cancer detection. Techniques such as the adaptive median filter, mean filter, adaptive mean filter, histogram equalization, histogram-modified local contrast enhancement, breast region and pectoral muscle extraction, Contrast Limited Adaptive Histogram Equalization (CLAHE), and morphological operations have been discussed earlier [16][17][18]. A study of mammography preprocessing [19] found that the selection of significant parameters for quality improvement influences the efficiency of the CAD system [20]. Figure 4 shows the steps carried out in preprocessing.

Removal of background
The histogram is the traditional tool for removing the background. A threshold value is identified from the histogram and used to remove the background of the mammogram: the image is binarized with this threshold and its connected components are ordered by size, and the largest component indicates the breast profile [21].
Computer Aided Diagnosis - Medical Image Analysis Techniques http://dx.doi.org/10.5772/intechopen.69792
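The background-removal step (binarize at a threshold, keep the largest connected component as the breast profile, then mask the image) can be sketched as follows. This is a minimal illustration rather than the implementation used in this work: the function names, the 4-connectivity, and the list-of-lists image representation are assumptions made for the example.

```python
from collections import deque

def largest_component_mask(image, threshold):
    """Binarize `image` at `threshold` and keep only the largest 4-connected region."""
    rows, cols = len(image), len(image[0])
    binary = [[image[r][c] > threshold for c in range(cols)] for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) > len(best):
                best = component
    mask = [[0] * cols for _ in range(rows)]
    for y, x in best:
        mask[y][x] = 1
    return mask

def remove_background(image, threshold):
    """Multiply the image by the breast-profile mask so the background becomes 0."""
    mask = largest_component_mask(image, threshold)
    return [[p * m for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]
```

For example, `remove_background(image, 0.1)` on an image scaled to [0, 1] keeps only the largest bright region; small bright artifacts such as labels, which form separate components, are set to zero.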

Removal of pectoral muscle
Another challenging task in mammography preprocessing is removal of the pectoral muscle [21]. A modified region growing method is used: it first identifies whether the image is left oriented or right oriented. Once the orientation is identified, it selects the first top-corner pixel as the seed point and segments the pectoral muscle. The process continues until the complete muscle is marked.
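A plain region-growing pass from a top-corner seed might look like the sketch below. It is not the modified method of this work: the fixed intensity tolerance, the 4-connectivity, and the names `pectoral_seed` and `region_grow` are assumptions chosen to keep the example short.

```python
from collections import deque

def pectoral_seed(image, left_oriented):
    """Top-corner pixel on the side where the pectoral muscle is expected."""
    return (0, 0) if left_oriented else (0, len(image[0]) - 1)

def region_grow(image, seed, tolerance):
    """Return a 0/1 mask of pixels 4-connected to `seed` whose intensity
    differs from the seed intensity by at most `tolerance`."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    mask = [[0] * cols for _ in range(rows)]
    mask[seed[0]][seed[1]] = 1
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < rows and 0 <= nx < cols and not mask[ny][nx]
                    and abs(image[ny][nx] - seed_value) <= tolerance):
                mask[ny][nx] = 1
                queue.append((ny, nx))
    return mask
```

Subtracting the resulting mask from the breast profile suppresses the muscle region; the tolerance would need tuning per dataset.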

Image enhancement
The visual quality of the mammogram is improved by a median filter followed by CLAHE [22]. As stated earlier, the dataset consists of fatty, glandular, and dense tissue, and the preprocessing was most helpful for dense tissue.
Quality is evaluated with traditional image quality measures: root mean square error (RMSE), peak signal-to-noise ratio (PSNR), and image quality index (IQI). The RMSE and PSNR values are calculated by using Eqs. (1)-(3), respectively.
…N} are the original and test image signals, respectively. The IQI is measured as in Eq. (3).
There is a strong reason for using the Wiener filter and CLAHE for image enhancement: comparing the median filter, the adaptive min-max filter, and the Wiener filter, we obtained a high PSNR with the Wiener filter for all the images tested, as shown in Figure 5. Timely screening is the main means of reducing death rates from breast cancer, but traditional screening may miss an abnormality because of the low radiation. Hence, preprocessing is an essential component for detecting abnormalities at the earliest stage.
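The three quality measures can be computed as in the following sketch, which uses their commonly cited textbook definitions. The exact forms used in this work are not shown here, so these are assumptions: an 8-bit peak value of 255 for PSNR, and the universal image quality index of Wang and Bovik for IQI. Images are passed as flat lists of pixel values.

```python
import math

def rmse(x, y):
    """Root mean square error between original x and test y (flat pixel lists)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` assumes 8-bit images."""
    error = rmse(x, y)
    return math.inf if error == 0 else 20 * math.log10(peak / error)

def iqi(x, y):
    """Universal image quality index (assumed form): combines correlation,
    luminance distortion, and contrast distortion; equals 1 for identical images.
    Undefined when both images are constant (zero denominator)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return 4 * cov * mx * my / ((vx + vy) * (mx ** 2 + my ** 2))
```

Under these definitions, a lower RMSE always corresponds to a higher PSNR, so the two measures rank candidate filters identically; IQI adds structural information that the error-based measures ignore.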

Segmentation of mass
Segmentation is the process of partitioning the abnormal part from the normal part. Each identified region carries information about what it belongs to, along with structural elements that help differentiate the abnormality [23,24]. The main aim of segmentation in this CAD model is to segment the mass from the breast tissue, as shown in Figure 8. A mass is one of the signs used to identify abnormality in mammography. Usually, the abnormality of a mass is identified by its shape, margin, and intensity. Sometimes, circular objects with high intensity are likely to be ill-defined [25]. Training the system is very difficult in such cases; hence, CAD models still cannot reach 100% accuracy.
Many mass detection techniques have been developed for CAD systems [26][27][28]. The more recently used mass segmentation approaches are region growing, watershed, threshold-based, contour-based, and clustering methods.
In the proposed system, three different segmentation techniques are applied to extract the ROI: adaptive thresholding segmentation, modified marker-controlled watershed segmentation, and energy-based contour segmentation.

Adaptive thresholding segmentation techniques
Thresholding is a simple yet effective method of segmenting an image into different regions. In the proposed algorithm, the image is transformed with watershed and morphological operations before thresholding is applied [29]. Watershed was originally proposed by Digabel and Lantuejoul [30,31] and is one of the most useful concepts in image segmentation. Many modifications of the watershed algorithm have been made because it oversegments gray-scale images. The concept can be illustrated geographically by treating the image as a topographic surface. If the image is viewed as a landscape, water starts filling it from the minimum gray value in the region of interest [18]. When the water rising in two or more regions is about to merge, merging must be prevented by raising the margins of the basins up to the highest intensity. This is commonly done by building a dam at the points where water from two different regions meets. The regions are called catchment basins, and the dams are the watershed lines, which divide two different regions according to the similarity criterion of each region. Most of the time, the watershed oversegments the image; to control oversegmentation, mathematical, morphological, and logical operations are used. The proposed method is a threshold-based segmentation, modified to extract the ROI with the watershed transform and morphological operations. The threshold of the image is determined adaptively, and the ROI is extracted by iteratively selecting the threshold, as shown in Figure 9.

Mathematical morphological operations
Mathematical morphological operations probe an image with structuring elements and measure the shapes in it. They also help to refine the characteristics of the image while preserving the image data and its essential features [32]. This work uses two morphological operations, opening and closing, which are the most commonly used operators in morphology [33]. They are implemented with the basic operations of erosion and dilation. The morphological opening of an image A by a structuring element B is denoted by Eq. (4) and is achieved by erosion followed by dilation:
$$A \circ B = (A \ominus B) \oplus B \tag{4}$$
Morphological opening helps to smooth the edges and break weak connections. It also helps to remove unwanted regions that cannot contain the structuring element.
The morphological closing operation is denoted by Eq. (5) and is achieved by dilation followed by erosion:
$$A \bullet B = (A \oplus B) \ominus B \tag{5}$$
Geometrically, it is the complement of the union of all translations of B that do not overlap A.
Closing helps to join weak edges and fill breaks in the edges. It also helps to fill gaps and small holes that are smaller than the structuring element.
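Opening (erosion followed by dilation) and closing (dilation followed by erosion) can be sketched for binary images as below. The square structuring element, its default size of 3, and the border handling (only in-image neighbours are examined) are assumptions made for this illustration; they are not taken from the proposed system.

```python
def _neighbourhood(image, r, c, k):
    """Values of the in-image pixels inside the (2k+1) x (2k+1) window at (r, c)."""
    rows, cols = len(image), len(image[0])
    return [image[y][x]
            for y in range(max(0, r - k), min(rows, r + k + 1))
            for x in range(max(0, c - k), min(cols, c + k + 1))]

def erode(image, size=3):
    """Binary erosion: a pixel survives only if its whole window is foreground."""
    k = size // 2
    return [[int(all(_neighbourhood(image, r, c, k))) for c in range(len(image[0]))]
            for r in range(len(image))]

def dilate(image, size=3):
    """Binary dilation: a pixel is set if any pixel in its window is foreground."""
    k = size // 2
    return [[int(any(_neighbourhood(image, r, c, k))) for c in range(len(image[0]))]
            for r in range(len(image))]

def opening(image, size=3):
    """Erosion followed by dilation: removes specks smaller than the window."""
    return dilate(erode(image, size), size)

def closing(image, size=3):
    """Dilation followed by erosion: fills holes smaller than the window."""
    return erode(dilate(image, size), size)
```

In this sketch, opening a 3 × 3 block that has an isolated stray pixel nearby returns the block alone, while closing a 3 × 3 ring fills its one-pixel hole.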
After calculating the gradient, the proposed method finds the regional minima, on which the watershed transformation is applied. The watershed lines are ORed with the minimum values to obtain a mask, and the mask is then imposed on the gradient image. This, however, results in oversegmentation. Morphological opening and closing are therefore applied to reduce the number of regions and fill the gaps between the edges. Level-wise thresholding is then applied to select an appropriate threshold point [34], as shown in Figure 10. Compared with the Otsu thresholding method [35], the proposed method successfully extracted the ROI at the fourth level of thresholding, as shown in Figure 11.
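For reference, the Otsu method [35] used as the baseline in this comparison selects the single threshold that maximizes the between-class variance of the gray-level histogram. The sketch below is a generic implementation under stated assumptions (integer gray levels in the range 0 to `levels - 1`, a flat list of pixels); it is not the level-wise thresholding of the proposed method.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t maximizing between-class variance,
    where pixels <= t form one class and pixels > t the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]            # number of pixels in the lower class
        sum0 += t * hist[t]      # intensity sum of the lower class
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue             # one class is empty: variance undefined
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t
```

A binary ROI candidate is then obtained by keeping the pixels above the returned threshold. On dense mammograms a single global threshold of this kind often merges mass and glandular tissue, which is one motivation for multi-level schemes.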

Modified watershed segmentation
The traditional watershed method has the disadvantage of oversegmentation; hence, marker-controlled watershed segmentation is used to extract the mass from the breast profile. The modified watershed method works systematically, as shown in Figure 12. The gradient of the preprocessed image is computed with the Sobel operator to bring out the edges. The traditional watershed is then applied, which oversegments the image. Applying morphological opening followed by closing helps to find the regional maximum and minimum values on which the watershed segmentation is applied.
Stepwise results of the watershed segmentation technique are shown in Figure 13. This method does not work well on dense and low-contrast images: it either oversegments the image or misses the mass.
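The gradient step of the modified watershed method relies on the Sobel operator. A minimal gradient-magnitude sketch is given below; leaving the border pixels at zero and the name `sobel_magnitude` are assumptions for illustration only.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))   # responds to vertical edges
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))   # responds to horizontal edges

def sobel_magnitude(image):
    """Gradient magnitude at interior pixels; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = sum(SOBEL_X[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            out[r][c] = math.hypot(gx, gy)
    return out
```

The watershed transform is then run on this magnitude image, where region boundaries appear as ridges and homogeneous regions as basins.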

Contour-based segmentation technique
The contour-based segmentation algorithm works in five steps as follows:
Step 1: Read the preprocessed image as the input image.
Step 2: Perform the morphological operations and superimpose the abnormality on the original image.
Step 3: Apply the active contour technique to identify the suspicious lesions; the suspicious lesions are the peaks of the contour.
Step 4: Extract the peak of the contour by calculating the energy of each contour.
Step 5: Mark the extracted contour as the ROI.
The stepwise results are shown in Figure 14. The energy of a contour is calculated by summing the intensities of the pixels on that contour and taking the average. The averages of the contours are then compared to select the mass region.
The contour-based technique works well on all kinds of tissue (fatty, glandular, and dense), as shown in Figure 15. It also works on both high-intensity and low-intensity images.
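The energy computation in Step 4 can be sketched as follows. Contours are assumed to be already available as lists of (row, column) pixel coordinates, and the contour with the highest average intensity is taken as the peak; both points, like the function names, are assumptions for illustration, since the selection rule is described only as a comparison of averages.

```python
def contour_energy(image, contour):
    """Mean intensity of the pixels on one contour, given as (row, col) pairs."""
    return sum(image[r][c] for r, c in contour) / len(contour)

def select_mass_contour(image, contours):
    """Return (index, energy) of the contour with the highest mean intensity."""
    energies = [contour_energy(image, contour) for contour in contours]
    best = max(range(len(contours)), key=energies.__getitem__)
    return best, energies[best]
```

The selected contour is then marked as the ROI (Step 5).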

Feature extraction of mass ROI
Radiologists characterize masses by their shape, gray levels, and texture properties. The properties of the mass surroundings are important discriminators from the background tissue. From benign to malignant, the shape of a mass may be round, oval, lobular, or irregular, and its margin circumscribed, micro-lobulated, obscured, indistinct, or spiculated [36][37][38][39]. Figure 16 shows a schematic diagram of how mass shapes and boundary characteristics differ from benign to malignant. We also note that masses with spiculated and indistinct boundaries have a greater probability of malignancy than circumscribed masses.
Along with the mass margin and shape, gray-level intensity is one of the major features for classifying a mass. Hence, in this CAD system, different features are extracted: wavelet features, Gray Level Co-Occurrence Matrix (GLCM) features, and Segmentation-based Fractal Texture Analysis (SFTA) features.

Discrete wavelet transform (DWT)
The DWT is a wavelet transform that uses a discrete set of scales and translations obeying certain rules. To use a wavelet, it is necessary to discretize it with respect to the scale parameters, i.e., to sample it. The scale and translation parameters are given by S = 2^(-m) and T = n·2^(-m), where m and n range over the integers. Thus, the family of wavelets is defined in Eq. (6). The wavelet transform decomposes a signal x(t) into a family of wavelets as given in Eq. (7). For a discrete time signal x[n], the decomposition is given by Eq. (8). In the case of images, the DWT is applied to each dimension separately [40,41]. The basis of the wavelet transform is localized on the mother wavelet. Hence, in the proposed work, Haar, Daubechies (db2, db4, and db8), coiflet, and bi-orthogonal wavelets at decomposition level 4 are applied to the dataset, and the resulting feature vector is passed on for classification.
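To make the separable decomposition concrete, the sketch below performs one level of a 2D Haar transform, the simplest of the wavelets listed above. The 1/√2 normalization, the requirement of even image dimensions, and the function names are assumptions for this example; the sub-band labels follow the common convention of approximation (xA), horizontal (xH), vertical (xV), and diagonal (xD) details.

```python
import math

def haar_1d(signal):
    """One Haar analysis step: scaled pairwise sums (approximation)
    and scaled pairwise differences (detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def _transform_columns(block):
    """Apply haar_1d down every column and return (approx, detail) blocks."""
    pairs = [haar_1d(list(col)) for col in zip(*block)]
    approx = [list(row) for row in zip(*(a for a, _ in pairs))]
    detail = [list(row) for row in zip(*(d for _, d in pairs))]
    return approx, detail

def haar_2d(image):
    """One-level 2D Haar DWT of an image with even height and width.
    Returns the sub-bands (xA, xH, xV, xD)."""
    low, high = [], []
    for row in image:                 # filter along each row first
        a, d = haar_1d(row)
        low.append(a)
        high.append(d)
    xA, xH = _transform_columns(low)  # row low-pass, then column low/high
    xV, xD = _transform_columns(high) # row high-pass, then column low/high
    return xA, xH, xV, xD
```

Applying `haar_2d` again to `xA` gives the second decomposition level, and repeating this four times corresponds to the level-4 decomposition mentioned above. Features such as sub-band energies can then be computed from the coefficients; which statistics were used in this work is not specified here.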

GLCM features
GLCM features are widely used in texture analysis. The GLCM represents how frequently combinations of gray levels occur [42]. It is a second-order statistic that can be used to analyze texture features based on the number of pixel pairs in different combinations, as shown in Figure 17. The matrices are constructed at different gray levels, such as 1, 2, 3, 4, and so on, for different directions, such as 0°, 45°, 90°, 180°, and so on. Depending on the number of pixels combined, the statistics are measured as first-order, second-order, and higher-order features. Haralick et al. [43] initially defined 13 GLCM features; Soh and Tsatsoulis [44] and Clausi [45] later increased them to 21 features. In most CAD systems, these gray-level features are used to interpret the symptoms. In the proposed work, we have extracted 21 GLCM features that contribute to the discrimination of mass type.
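As an illustration of how a co-occurrence matrix is built, the sketch below counts gray-level pairs for one pixel offset and derives a single Haralick-style feature (contrast). The function names and the (row, column) offset convention are assumptions for this example: offset (0, 1) corresponds to the 0° direction at distance 1, and (-1, 1) to 45°.

```python
def glcm(image, levels, offset=(0, 1)):
    """Count how often gray level i (reference pixel) occurs together with
    gray level j (neighbour at `offset`), for integer levels 0..levels-1."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    matrix = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                matrix[image[r][c]][image[nr][nc]] += 1
    return matrix

def normalise(matrix):
    """Convert counts to joint probabilities p(i, j)."""
    total = sum(map(sum, matrix))
    return [[v / total for v in row] for row in matrix]

def contrast(p):
    """Contrast feature: sum of (i - j)^2 * p(i, j)."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))
```

Other features from the 21-feature set, such as energy, homogeneity, or correlation, are computed from the same normalized matrix in a similar way.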

SFTA features
Texture feature extraction with basic filters is a time-consuming process because of scale and time invariance. This problem is overcome by applying the SFTA algorithm proposed by Costa [46]. SFTA works by multilevel thresholding of the gray image. The purpose of using SFTA is to obtain a clear structure for the mass boundaries. The 21-element texture feature vector corresponds to texture information such as dimension, different gray levels, and area of the ROI. In addition, 21 region-based shape features are extracted from the ROI: area, orientation, bounding box, extent, perimeter, centroid, extrema, pixel_idx_list, convex area, filled area, pixel list, convex hull, filled image, solidity, convex image, sub_array_idx, eccentricity, major_axis_length, equi_diameter, minor_axis_length, and Euler number. Altogether, 73 features are extracted from the mass to train the CAD system to discriminate the mass type as benign or malignant [48].

Classification
Support vector machine (SVM) is a supervised learning technique that seeks an optimal hyperplane to separate two classes of samples. Kernel functions are used to map the input data into a higher-dimensional space, with the aim of obtaining a better distribution of the data. An optimal separating hyperplane can then be found easily in the high-dimensional feature space, as shown in Ref. [47]. An example of an optimal hyperplane is shown in Figure 18.
Figure 18. Optimum hyperplane for support vector machine.
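For readers who want the underlying optimization problem, the standard soft-margin formulation is given below as a reference sketch. The chapter does not state which variant or kernel was used, so the slack variables, the penalty C, and the RBF kernel with width parameter γ are assumptions taken from the textbook form. Here the x_i are feature vectors, y_i ∈ {-1, +1} are the class labels (for example benign and malignant), and φ is the kernel-induced mapping.

```latex
% Soft-margin SVM (primal form)
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{N} \xi_i
\quad \text{subject to} \quad
y_i \bigl( w^{\top} \phi(x_i) + b \bigr) \ge 1 - \xi_i, \qquad \xi_i \ge 0 .

% Decision function after solving the dual problem (multipliers \alpha_i)
f(x) = \operatorname{sign}\Bigl( \sum_{i=1}^{N} \alpha_i \, y_i \, K(x_i, x) + b \Bigr),
\qquad K(x_i, x) = \phi(x_i)^{\top} \phi(x) .

% One common kernel choice (RBF)
K(x_i, x) = \exp\bigl( -\gamma \lVert x_i - x \rVert^{2} \bigr) .
```

Only the training samples with non-zero α_i (the support vectors) contribute to the decision function, which is why the margin in Figure 18 is determined by the points closest to the hyperplane.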

Experimental results
The proposed algorithm was implemented in MATLAB13a and tested on the MIAS dataset; classification accuracy was measured with the confusion matrix shown in Table 1. MIAS contains a total of 322 mammograms of both breasts (left and right) of 161 patients.
Based on the definitions of true positive, true negative, false positive, and false negative given above, the equations for specificity (the accuracy on the negative class), sensitivity (the accuracy on the positive class), and accuracy (the ability to recognize both negative and positive classes) are defined as in Eqs. (9)-(11), respectively.
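These three measures follow directly from the confusion-matrix counts. The sketch below uses their standard definitions; the function names are illustrative, and any counts passed to them in examples are hypothetical rather than results from this study.

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)

def accuracy(tp, tn, fp, fn):
    """Fraction of all cases classified correctly: (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)
```

With hypothetical counts of 8 true positives, 2 false negatives, 9 true negatives, and 1 false positive, these functions give a sensitivity of 0.8, a specificity of 0.9, and an accuracy of 0.85.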
Classification performance was measured for different feature extraction techniques with contour-based segmentation and the SVM classifier, as shown in Table 2. The system was tested on 50 images, of which 37 are malignant cases and 13 are benign cases. The accuracy is highest with wavelet db4 features [50].
Though wavelet db4 gives high accuracy, it is important to consider texture-based and gray-level features as well to discriminate the mass type as benign or malignant. Hence, for the proposed CAD model, all features were passed together to measure the performance of the algorithm with the different segmentation techniques, namely the adaptive threshold-based technique, the modified watershed segmentation technique, and energy-based contour segmentation, as shown in Table 3. Of the three techniques, the energy-based technique gives the most accurate results, as shown in Figure 19.
The performance of the classifier is compared with previous work in Table 4; the combination of different features achieved higher accuracy than existing work.

Discussion
Early detection of breast cancer may reduce the death rate. Advances in technology are needed to detect all types of masses, both to increase sensitivity and to reduce the false-positive rate. Masses vary in size and shape, and the proposed segmentation and feature extraction techniques are well suited to measuring them. In the experiments based on individual feature sets, namely GLCM, wavelet, SFTA, and region-based statistical features, the accuracy was 92, 96, 94, and 90%, respectively, as observed in Table 2. With the same segmentation technique, accuracy increased when the combined set of features was passed to the SVM classifier, as shown in Table 3. The CAD system is compared with systems using different feature sets and different classifiers in Table 4. The comparison shows that, with fewer features and a simple classifier, the proposed system improves the accuracy of detection and classification with less complexity.

Conclusion
CAD systems help radiologists interpret medical images such as mammography, X-ray, ultrasound, and MRI, and radiologists use them as a second opinion. Improving CAD accuracy widens the treatment options and makes a cure more likely. Several commercial CAD systems have been reported, including those from R2 technology Inc, intelligent system software Inc. (ISSI), CADx medical systems, and iCAD. All of these commercial CAD systems perform better at detecting calcifications than masses, and architectural distortions remain a challenging task for all of them. A direct comparison between these systems and the present work is not possible because there is no common clinical dataset on which to study and compare performance. The proposed CAD model is more suitable for mass detection and classification. The results obtained show that the selection of suitable approaches for designing a CAD algorithm depends on accuracy, sensitivity, and false-positive identifications. Region growing and thresholding methods proved to be good for removing background noise and the pectoral muscle. The quality of the mammograms was enhanced by using CLAHE and the Wiener filter. Masses in the mammograms are extracted, with proper marking, by contour-based segmentation. The set of relevant features is provided to the SVM classifier to discriminate the mass type as benign or malignant. Finally, the outcomes of this study indicate that selecting an appropriate technique at each stage of medical image analysis is relevant and significant for designing a CAD model.

• Reduction of background artifacts (bugs in images)
• Removal of noise
• Filtering
• Enhancing the quality of the image by leveling and increased contrast for clearing the image's
1.4.2. Segmentation
• Disparity of different structures in the image, e.g., mass, microcalcification, and tissue
• Finding the ground truth from anatomic databank
1.4.2.1. Feature extraction
Detected region of interest is analyzed individually for special features (characteristics):
• Size, location, and border
• Gray levels analyzed in ROI
• Texture of the ROI
• Patterns found in ROI
• Architectural distortion of the ROI
1.4.3. Classification
After analysis of structure, every ROI is evaluated individually for scoring of the probability value for true positive (TP), false positive (FP), false negative (FN) and true negative (TN). The following procedures are examples of classification algorithms:
• Nearest-neighbor rule (e.g., k-nearest neighbors)
• Minimum distance classifier
• Cascade classifier
• Naive Bayesian classifier
• Artificial neural network
• Radial basis function network (RBF)

Figure 1. Block diagram of CAD model for breast cancer.

Figure 2. Detailed diagram of CAD model for breast cancer.

Figure 4. Steps carried out in preprocessing of mammography.
It is clear from this figure that the Wiener filter is suitable for noise removal in mammography images because it has a high PSNR compared with the min-max and median filters. It was tested with filter masks from [1 1] to [8 8] to select a significant filter mask for the Wiener filter, as shown in Figure 6. As the mask increased through [1 1], [2 2], and [3 3], the PSNR value increased, whereas the RMSE and IQI values decreased. Hence, the [3 3] mask was selected as the significant filter mask for the Wiener filter. When the mask is increased beyond this size, the PSNR still increases but the image becomes blurred. Similarly, for the contrast index (CI) of CLAHE, the default CI was compared with different CI values. At CI = 0.2, the PSNR increased and then continued with only a slight increase, while the RMSE and IQI decreased from CI = 0.2 onward. Hence, a contrast index of 0.2 was selected as the significant value. The stepwise results acquired in the preprocessing stage are shown in Figure 7.

Figure 6. Selection of significant filter mask for Wiener filter: (a) PSNR, (b) RMSE, and (c) IQI values of different filter masks from [1 1] to [8 8] for dense, fatty, and fatty glandular tissues. Selection of significant CI for CLAHE: (d) PSNR, (e) RMSE, and (f) IQI values of different CI from 0.1 to 0.8 for dense, fatty, and fatty glandular tissues; (g) results of different filter masks in Wiener filter; (h) results of different CI in CLAHE.

Figure 7. Experimental results of the proposed method: (a) original image, (b) binary image with threshold value 0.1, (c) breast part extracted, (d) multiplication of (a) and (c), which consists only of the breast part without background, (e) seed point marked for region growing, (f) pectoral muscle segmented, (g) pectoral muscle suppressed from original image, (h) Wiener filter, and (i) result of CLAHE.

Figure 10. Experimental results of threshold-based segmentation: (a) original image from the mini-MIAS database, (b) preprocessed image, (c) opening, (d) closing, (e) reconstructed from opening-closing, and (f) mass identified at the fourth level of thresholding.

Figure 12. Steps of modified watershed segmentation technique.

Figure 16. Morphological changes of mass in image from benign to malignant.

Figure 15. Mass segmented on different tissues using contour-based segmentation: (a) ground truth and (b) results of proposed work.

Figure 17. Example of GLCM: (a) four-level gray image, (b) direction of combination with single pixel distance, (c) co-occurrence matrix of four levels with direction 0° with single pixel distance, and (d) co-occurrence matrix of four levels with direction 45° with single pixel distance.
At the first level, the image X is decomposed into xA, xH, xV, and xD, the approximation, horizontal, vertical, and diagonal components, respectively. The xA component contains the low-frequency content and the remaining components contain the high-frequency content. Hence, X = xA + {xH + xV + xD}. The DWT is then applied to xA for the second-level decomposition. Hence, the wavelet provides a hierarchical framework for interpreting the image information.

Table 2. Samples used for performance evaluation.

Table 4, the combination of different features achieved more accuracy comparing with existing work.

Table 3. The performance measures of the SVM classifier with different similarity matrices.
Figure 19. Comparative analysis of accuracy rate for adaptive threshold, modified watershed, and energy-based contour segmentation techniques.

Table 4. Comparison of present algorithm with previous works reported.