Segmentation of Brain MRI

Introduction
Effective, precise and consistent brain cortical tissue segmentation from magnetic resonance (MR) images is one of the most prominent issues in many applications of medical image processing. These applications include surgical planning (Kikinis et al., 1996), surgery navigation (Grimson et al., 1997), multimodality image registration (Saeed, 1998), abnormality detection (Rusinek et al., 1991), multiple sclerosis lesion quantification (Udupa et al., 1997), brain tumour detection (Vaidyanathan et al., 1997), functional mapping (Roland et al., 1993), etc. Traditionally, the purpose of segmentation is to partition the image into non-overlapping, constituent regions (also called classes, clusters, subsets or sub-regions) that are homogeneous with respect to intensity and texture (Gonzalez & Woods, 1992). If the domain of the image is given by $\Omega$, then the segmentation problem is to determine the sets $S_k \subset \Omega$ whose union is the entire domain $\Omega$. Thus, the sets that make up a segmentation must satisfy
$$\Omega = \bigcup_{k=1}^{K} S_k \qquad (1)$$
where $S_k \cap S_j = \emptyset$ for $k \neq j$, and each $S_k$ is connected. Ideally, a segmentation method finds those sets that correspond to distinct anatomical structures or regions of interest in the image (Pham et al., 2000). For brain MR image segmentation, some studies aim to partition the entire image into sub-regions such as white matter (WM), grey matter (GM), and cerebrospinal fluid (CSF) spaces of the brain (Lim & Pfefferbaum, 1989), whereas others aim to extract one specific structure, for instance, a brain tumour (M.C. Clark et al., 1998), multiple sclerosis lesions (Mortazavi et al., 2011), or subcortical structures (Babalola et al., 2008).
Because the human cerebral cortex is complicated to segment, manual methods for brain tissue segmentation easily lead to errors in both accuracy and reproducibility (operator bias) and are exceedingly time-consuming. We thus need fast, accurate and robust semi-automatic (i.e., supervised classification, which explicitly needs user interaction) or completely automatic (i.e., unsupervised classification) techniques (Suri, Singh, et al., 2002b).

MR imaging (MRI)
MR imaging (MRI) was invented by Raymond V. Damadian in 1969 and was first performed on a human body in 1977 (Damadian et al., 1977). MR imaging is a popular medical imaging technique used in radiology to visualize detailed internal structures. It provides good contrast between different soft tissues of the body, which makes it especially useful in imaging the brain, muscles, the heart and cancers when compared with other medical imaging techniques, such as computed tomography (CT) or X-rays (Novelline & Squire, 2004). Depending on the signal weighting, which is set by particular values of the echo time (TE) and the repetition time (TR), three different images can be acquired from the same body: T1-weighted, T2-weighted, and PD-weighted (proton density).
In clinical diagnosis, a patient's head is examined in the three planes shown in Fig. 1 (a): the axial plane, the sagittal plane and the coronal plane. Brain MR images from these planes are shown in Fig. 1 (b), (c), and (d), respectively.

Difficulties in segmentation of brain MRI
Even though cortical segmentation has been developed for many years in medical research, it is still not regarded as an automated, reliable, and high-speed technique, owing to magnetic field inhomogeneities and the following related difficulties:
1. Noise: random noise associated with the MR imaging system, which is known to have a Rician distribution (Prima et al., 2001);
2. Intensity inhomogeneity (also called bias field, or shading artefact): the non-uniformity in the radio frequency (RF) field during data acquisition, resulting in a shading effect (X. Li et al., 2003);
3. Partial volume effect: more than one type of class or tissue occupies one pixel or voxel of an image. Such pixels or voxels are usually called mixels (Ruan et al., 2000).
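The first two difficulties can be imitated on a toy image, which is often useful when testing a segmentation method. The sketch below is only an illustration under stated assumptions: Rician noise is produced as the magnitude of a complex signal with independent Gaussian noise in both channels, and the bias field is a hypothetical linear gain ramp (real bias fields are smooth but not linear). The function names `add_rician_noise` and `apply_bias_field` are made up for this example.

```python
import numpy as np

def add_rician_noise(image, sigma, rng):
    """Return the magnitude of a complex signal whose real and imaginary
    parts carry independent Gaussian noise (this yields Rician noise)."""
    real = image + rng.normal(0.0, sigma, image.shape)
    imag = rng.normal(0.0, sigma, image.shape)
    return np.sqrt(real ** 2 + imag ** 2)

def apply_bias_field(image, strength=0.2):
    """Multiply the image by a hypothetical linear gain ramp across columns."""
    gain = np.linspace(1.0 - strength, 1.0 + strength, image.shape[1])
    return image * gain[np.newaxis, :]

rng = np.random.default_rng(0)
phantom = np.zeros((64, 64))
phantom[16:48, 16:48] = 100.0   # one bright "tissue" square on a dark background
noisy = add_rician_noise(apply_bias_field(phantom), sigma=5.0, rng=rng)
```

Note that the background of `noisy` is no longer zero on average: Rician noise is not zero-mean, which is one reason simple Gaussian noise assumptions can be inaccurate in dark regions.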

Evaluation of segmentation techniques
The evaluation of brain tissue classification is also a complex issue in medical image processing. Visual inspection and comparison with manual segmentation are very strenuous and are not reliable, since the amount of data to be processed is usually large. Tissue classification methods can also be validated by using synthetic data and real brain MR images. Simulated brain MR data with different noise levels and different levels of intensity inhomogeneity are provided by the Brainweb simulated brain phantom (Collins et al., 1998; Kwan et al., 1999) (http://www.bic.mni.mcgill.ca/brainweb/), and the ground truth for both the classification and the partial volumes within the images is also available, so that different methods can be assessed quantitatively. Real brain MRI datasets with expert segmentations can be obtained from the Internet Brain Segmentation Repository (IBSR) (http://www.cma.mgh.harvard.edu/ibsr/). A few surveys on this topic have been provided in (H. Zhang et al., 2008; Y.J. Zhang, 1996, 2001). Here, we describe three different measures for quantitatively evaluating segmentation results.
(1) The misclassification rate (MCR) is the percentage of misclassified pixels and is computed as follows (background pixels are ignored in the MCR computation) (Bankman, 2000):
$$\mathrm{MCR} = \frac{\text{number of misclassified pixels}}{\text{total number of all pixels}} \times 100\% \qquad (2)$$
(2) The root mean squared error (RMSE) quantifies the difference between the true partial volumes and the algorithm estimates. The RMSE of an estimator $\hat{\theta}$ with respect to the estimated parameter $\theta$ is defined as (Bankman, 2000):
$$\mathrm{RMSE}(\hat{\theta}) = \sqrt{E\left[(\hat{\theta} - \theta)^2\right]} \qquad (3)$$
(3) Let $N_{fp}$ be the number of pixels that do not belong to a cluster and are segmented into the cluster, $N_{fn}$ be the number of pixels that belong to a cluster and are not segmented into the cluster, $N_p$ be the number of all pixels that belong to a cluster, and $N_n$ be the total number of pixels that do not belong to a cluster. Three parameters in this evaluation system, namely under-segmentation (UnS), over-segmentation (OvS) and incorrect segmentation (InC), may now be defined as follows (Shen et al., 2005), where $N = N_p + N_n$:
$$\mathrm{UnS} = \frac{N_{fp}}{N_n} \times 100\%, \qquad \mathrm{OvS} = \frac{N_{fn}}{N_p} \times 100\%, \qquad \mathrm{InC} = \frac{N_{fp} + N_{fn}}{N} \times 100\%$$
The purpose of this chapter is to review existing segmentation techniques and the work we have done in the segmentation of brain MR images. The rest of this chapter is organized as follows: In Section 2, existing techniques for human cerebral cortical segmentation and their applications are reviewed. In Section 3, a new non-homogeneous Markov random field model based on fuzzy membership is proposed for brain MR image segmentation. In Section 4, image pre-processing steps, such as de-noising, the correction of intensity inhomogeneity and the estimation of the partial volume effect, are summarized. In Section 5, the conclusion of this chapter is given.
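As a rough illustration of these measures, the sketch below computes the MCR, the RMSE, and the four pixel counts described in words for the third measure. The function names, the choice of label 0 as background, and the tiny label arrays are assumptions made only for this example.

```python
import numpy as np

def misclassification_rate(pred, truth, background=0):
    """MCR in percent; pixels that are background in the ground truth are ignored."""
    mask = truth != background
    wrong = np.count_nonzero(pred[mask] != truth[mask])
    return 100.0 * wrong / np.count_nonzero(mask)

def rmse(estimated, true):
    """Root mean squared error between estimated and true partial volumes."""
    diff = np.asarray(estimated, float) - np.asarray(true, float)
    return float(np.sqrt(np.mean(diff ** 2)))

def cluster_counts(pred, truth, k):
    """Counts for cluster k: wrongly included, wrongly excluded,
    pixels in the cluster, pixels outside the cluster."""
    in_truth, in_pred = truth == k, pred == k
    n_fp = np.count_nonzero(in_pred & ~in_truth)
    n_fn = np.count_nonzero(~in_pred & in_truth)
    n_p = np.count_nonzero(in_truth)
    n_n = np.count_nonzero(~in_truth)
    return n_fp, n_fn, n_p, n_n

truth = np.array([[0, 1, 1], [2, 2, 1]])
pred = np.array([[0, 1, 2], [2, 1, 1]])
mcr = misclassification_rate(pred, truth)      # 2 of 5 non-background pixels differ
counts = cluster_counts(pred, truth, k=1)
error = rmse([0.5, 1.0], [0.0, 1.0])
```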

Thresholding
The simplest operation in this category is image thresholding (Pal & Pal, 1993). In this technique a threshold is selected, and an image is divided into groups of pixels with values less than the threshold and groups of pixels with values greater than or equal to the threshold. There are several thresholding methods: global thresholding, adaptive thresholding, optimal global and adaptive thresholding, local thresholding, and thresholds based on several variables (Bankman, 2000). Thresholding is a very simple, fast and easily implemented procedure that works reasonably well for images with very good contrast between distinct sub-regions. A typical example is the separation of CSF in heavily T2-weighted brain images (Saeed, 1998). However, the distribution of intensities in brain MR images is usually very complex, and determining a threshold is difficult. In most cases, thresholding is combined with other methods (Brummer et al., 1993; Suzuki & Toriwaki, 1991).
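A minimal sketch of global thresholding is given below. The threshold is chosen with a simple iterative "intermeans" heuristic (place the threshold midway between the means of the two groups and repeat), which is one common option and not a method prescribed by the text; the names `intermeans_threshold` and `global_threshold` and the synthetic two-class data are assumptions for illustration.

```python
import numpy as np

def intermeans_threshold(image, tol=0.5, max_iter=100):
    """Iteratively place the threshold midway between the two group means."""
    t = float(image.mean())
    for _ in range(max_iter):
        low, high = image[image < t], image[image >= t]
        if low.size == 0 or high.size == 0:
            break                      # one group is empty, keep the current threshold
        new_t = 0.5 * (low.mean() + high.mean())
        converged = abs(new_t - t) < tol
        t = float(new_t)
        if converged:
            break
    return t

def global_threshold(image, t):
    """Boolean mask of pixels with values greater than or equal to t."""
    return image >= t

rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(50, 5, 500), rng.normal(150, 5, 500)])
image = values.reshape(25, 40)         # two well-separated intensity classes
t = intermeans_threshold(image)
mask = global_threshold(image, t)
```

With two well-separated classes the heuristic settles near the midpoint of the class means; with the overlapping, multi-modal histograms typical of brain MR images it is far less reliable, which is the limitation noted above.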

Region growing
Region growing (or region merging) is a procedure that looks for groups of pixels with similar intensities. It starts with a pixel or a group of pixels (called seeds) that belong to the structure of interest. Subsequently, the neighbouring pixels with the same properties as the seeds (or those satisfying a homogeneity criterion) are appended gradually to the growing region until no more pixels can be added (Dubey et al., 2010). The object is then represented by all pixels that have been accepted during the growing procedure. The advantage of region growing is that it is capable of correctly segmenting regions that have the same properties and are spatially separated, and that it generates connected regions (Bankman, 2000). Instead of region merging, it is possible to start with some initial segmentation and subdivide the regions that do not satisfy a given uniformity test. This technique is called splitting (Haralick & Shapiro, 1985). A combination of splitting and merging adds together the advantages of both approaches (Zucker, 1976). However, the results of region growing depend strongly on the selection of the homogeneity criterion. Another problem is that different starting points may not grow into identical regions (Bankman, 2000). Region growing has been exploited in many clinical applications (Cline et al., 1987; Tang et al., 2000).
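The growing procedure can be sketched as a breadth-first search from the seed. The version below assumes 4-connectivity and a very simple homogeneity criterion (absolute difference from the seed intensity no larger than a tolerance); both choices, and the name `region_grow`, are assumptions for illustration only.

```python
from collections import deque

import numpy as np

def region_grow(image, seed, tol):
    """Grow a 4-connected region from `seed`, accepting pixels whose intensity
    differs from the seed intensity by at most `tol`."""
    rows, cols = image.shape
    grown = np.zeros(image.shape, dtype=bool)
    seed_value = float(image[seed])
    grown[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and not grown[nr, nc]:
                if abs(float(image[nr, nc]) - seed_value) <= tol:
                    grown[nr, nc] = True
                    queue.append((nr, nc))
    return grown

image = np.array([[10, 11, 50, 10],
                  [10, 12, 50, 10],
                  [50, 50, 50, 10]])
region = region_grow(image, seed=(0, 0), tol=5)
```

In this toy image the right-hand column has the same intensity as the seed but is not connected to it, so it is left out of the grown region.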

Edge detection techniques
In edge detection techniques, the resulting segmented image is described in terms of the edges (boundaries) between different regions. Edges are formed at the intersection of two regions where there are abrupt changes in grey level intensity values. Edge detection works well on images with good contrast between regions. A large number of different edge operators can be used for edge detection. These operators are generally named after their inventors. The most popular ones are the Marr-Hildreth or LoG (Laplacian-of-Gaussian), Sobel, Roberts, Prewitt, and Canny operators. Binary mathematical morphology and the watershed algorithm are often used for edge detection purposes in the segmentation of brain MR images (Dogdas et al., 2002; Grau et al., 2004). However, the major drawbacks of these methods are over-segmentation, sensitivity to noise, poor detection of significant areas with low contrast boundaries, and poor detection of thin structures (Grau et al., 2004).
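As an example of one of the operators listed above, the following sketch computes the Sobel gradient magnitude with plain NumPy slicing. Replicating the border pixels and the fixed threshold of 1.0 are assumptions chosen for this toy image, not recommendations.

```python
import numpy as np

def sobel_magnitude(image):
    """Gradient magnitude from the two 3x3 Sobel kernels (borders replicated)."""
    img = np.pad(np.asarray(image, float), 1, mode="edge")
    # Horizontal derivative: right column minus left column, rows weighted 1-2-1.
    gx = (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[1:-1, :-2] - img[2:, :-2])
    # Vertical derivative: bottom row minus top row, columns weighted 1-2-1.
    gy = (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[:-2, 1:-1] - img[:-2, 2:])
    return np.hypot(gx, gy)

image = np.zeros((5, 6))
image[:, 3:] = 1.0                      # vertical step edge between columns 2 and 3
edges = sobel_magnitude(image) > 1.0
```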

Classifiers
Classifier methods are known as supervised methods in pattern recognition, which seek to partition the image by using training data with known labels as references. The simplest classifier is the nearest-neighbour classifier (NNC), in which each pixel is assigned to the same class as the training datum with the closest intensity (Boudraa & Zaidi, 2006). Other examples of classifiers are the k-nearest neighbour (k-NN) classifier (Duda & Hart, 1973; Fukunaga, 1990), the Parzen window (Hamamoto et al., 1996), the Bayes classifier or maximum likelihood (ML) estimation (Duda & Hart, 1973), Fisher's linear discriminant (FLD) (Fisher, 1936), the nearest mean classifier (NMC) (Skurichina & Duin, 1996), and the support vector machine (SVM) (Vapnik, 1998). One weakness of classifiers is that they generally do not perform any spatial modelling. This weakness has been addressed in work extending classifier methods to segment images corrupted by intensity inhomogeneities (Wells III et al., 1996). Neighbourhood and geometric information was also incorporated into a classifier approach in (Kapur et al., 1998). In addition, classifiers require manual interaction to obtain training data, and acquiring a training set for each image can be time-consuming and laborious (Pham et al., 2000).
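A minimal sketch of an intensity-based k-NN classifier is shown below; with `k=1` it reduces to the nearest-neighbour classifier. The training intensities, the integer class labels and the function name `knn_classify` are invented for this example.

```python
import numpy as np

def knn_classify(train_x, train_y, query_x, k=1):
    """Label each query intensity by majority vote among its k nearest
    training intensities (class labels must be non-negative integers)."""
    train_x = np.asarray(train_x, float)
    train_y = np.asarray(train_y)
    query_x = np.asarray(query_x, float)
    dist = np.abs(query_x[:, None] - train_x[None, :])
    nearest = np.argsort(dist, axis=1)[:, :k]
    return np.array([np.bincount(train_y[idx]).argmax() for idx in nearest])

train_x = [20, 25, 30, 90, 95, 100, 160, 170]   # hypothetical training intensities
train_y = [0, 0, 0, 1, 1, 1, 2, 2]              # hypothetical class labels
labels = knn_classify(train_x, train_y, [22, 97, 150], k=3)
```

The classifier looks only at intensity, so two pixels with the same grey value always receive the same label regardless of where they lie in the image; this is the lack of spatial modelling mentioned above.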

Clustering
Clustering is the process of organizing objects into groups whose members are similar in certain ways; its goal is to recognize the structures or clusters present in a collection of unlabelled data. It is a method of unsupervised learning, and a common technique for statistical data analysis used in many fields.

K-means clustering
K-means clustering (or hard c-means clustering, HCM) (MacQueen, 1967) is one of the simplest unsupervised clustering methods. It aims to partition N samples into K clusters so that the within-cluster sum of squares is minimized. It starts with K initial cluster centers and keeps reassigning the samples to clusters based on the similarity between each sample and the cluster centers until a convergence criterion is met. Given a set of samples $(x_1, x_2, \ldots, x_N)$, where each sample is an M-dimensional real vector, let $N_k$ be the number of samples in cluster $k$, denoted by $C_k$, and $\mu_k$ be the mean value of these samples. The objective function is then defined as:
$$J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \| x_i - \mu_k \|^2$$
where $\| x_i - \mu_k \|$ is a distance measure between point $x_i$ and the cluster center $\mu_k$. Common distance measures are the Euclidean distance, chessboard distance, city block distance, Mahalanobis distance, and Hamming distance. The K-means algorithm has been used widely in brain MR image segmentation (Abras & Ballarin, 2005; Vemuri et al., 1995), because of its easy implementation and low time complexity. A major problem of this algorithm is that it is sensitive to the selection of the initial K cluster centers, and may converge to a local minimum of the criterion function (Jain et al., 1999). Many schemes have been proposed for selecting better initial cluster centers in order to approach the global minimum (Bradley & Fayyad, 1998; Khan & Ahmad, 2004).
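The reassignment loop can be sketched as follows, assuming the squared Euclidean distance and user-supplied initial centers; the function name `kmeans` and the one-dimensional toy data are assumptions for illustration.

```python
import numpy as np

def kmeans(x, centers, max_iter=100):
    """Plain K-means with squared Euclidean distance.
    x: (N, M) samples, centers: (K, M) initial cluster centers."""
    x = np.asarray(x, float)
    centers = np.asarray(centers, float).copy()
    for _ in range(max_iter):
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each center as the mean of its samples (keep it if the cluster is empty).
        new_centers = np.array([
            x[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment and within-cluster sum of squares for the returned centers.
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    objective = float(d2.min(axis=1).sum())
    return labels, centers, objective

samples = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
labels, centers, objective = kmeans(samples, centers=[[0.0], [5.0]])
```

Running the same function from different initial centers is an easy way to observe the sensitivity to initialization discussed above.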

Fuzzy c-means clustering (FCM)
Fuzzy c-means clustering (FCM) (Bezdek, 1981; Dunn, 1973) is based on the same idea as the K-means algorithm: finding cluster centers by iteratively adjusting their positions so as to minimize an objective function. In addition, it allows more flexibility by giving each sample fuzzy membership grades in multiple clusters. The objective function is defined as:
$$J_m = \sum_{i=1}^{N} \sum_{k=1}^{K} u_{ik}^{m} \, \| x_i - v_k \|^2$$
where $m > 1$ is a constant that controls the clustering fuzziness, generally $m = 2$. $u_{ik}$ is the fuzzy membership of $x_i$ in cluster $k$, satisfying $u_{ik} \in [0, 1]$ and $\sum_{k=1}^{K} u_{ik} = 1$. $x_i$ is the i-th sample in the measured data, $v_k$ is the cluster center, and $\| \cdot \|$ is a distance measure. Fuzzy partitioning is carried out through an iterative optimization of the objective function shown above, with the memberships and cluster centers updated by:
$$u_{ik} = \frac{1}{\sum_{j=1}^{K} \left( \dfrac{\| x_i - v_k \|}{\| x_i - v_j \|} \right)^{2/(m-1)}}, \qquad v_k = \frac{\sum_{i=1}^{N} u_{ik}^{m} \, x_i}{\sum_{i=1}^{N} u_{ik}^{m}}$$
This iteration stops when $\max_{i,k} | u_{ik}^{(p+1)} - u_{ik}^{(p)} | < \varepsilon$, where $\varepsilon$ is a termination criterion between 0 and 1, and $p$ is the iteration step (Kannan et al., 2010). Although clustering algorithms do not require training data, they do require an initial segmentation (or equivalently, initial parameters). Clustering algorithms do not directly incorporate spatial modeling and can therefore be sensitive to noise and intensity inhomogeneities. This lack of spatial modeling, however, can provide significant advantages for fast computation (Hebert, 1997). Some work on improving the robustness of clustering algorithms to intensity inhomogeneities in MR images has been carried out (Pham & Prince, 1999). Robustness to noise can be improved by incorporating spatial correlations in an image, based on a k-nearest neighbor model (R. Xu & Ohya, 2010) or Markov random field (MRF) modeling (Liu et al., 2005).
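The alternating update of cluster centers and memberships can be sketched as below. This is a minimal illustration assuming the Euclidean distance and a random initial membership matrix; the function name `fcm`, the seed handling and the toy data are assumptions, and the small floor on the distances is only there to avoid division by zero.

```python
import numpy as np

def fcm(x, n_clusters, m=2.0, eps=1e-4, max_iter=200, seed=0):
    """Fuzzy c-means on samples x of shape (N, M).
    Returns memberships u of shape (N, K) and cluster centers of shape (K, M)."""
    x = np.asarray(x, float)
    rng = np.random.default_rng(seed)
    u = rng.random((len(x), n_clusters))
    u /= u.sum(axis=1, keepdims=True)            # each row of memberships sums to 1
    for _ in range(max_iter):
        um = u ** m
        centers = (um.T @ x) / um.sum(axis=0)[:, None]
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)           # guard against zero distances
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < eps
        u = new_u
        if converged:
            break
    return u, centers

samples = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
u, centers = fcm(samples, n_clusters=2)
hard_labels = u.argmax(axis=1)                   # defuzzified segmentation
```

Which cluster index ends up with which group depends on the random initialization, so only the sorted centers are comparable between runs.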

Statistical models
Statistical classification methods usually solve the segmentation problem either by assigning a class label to a pixel or by estimating the relative amounts of the various tissue types within a pixel (Noe et al., 2001). A statistical model describes the data by a set of candidate probability distributions, and statistical inference enables us to make statements about which element(s) of this set are likely to be the true ones.

Expectation maximization (EM)
The expectation maximization (EM) algorithm (Dempster et al., 1977) is a method for finding the maximum likelihood or maximum a posteriori (MAP) estimate of the parameters of a probability distribution that involves hidden variables. EM is an iterative method which alternates between an expectation (E) step, in which each pixel is classified according to the current estimates of the posterior distributions over the hidden variables, and a maximization (M) step, in which the parameters are re-estimated by maximizing the likelihood function according to the current classification. These parameter estimates are then used to determine the distribution over the hidden variables in the next E step. Convergence is assured since the likelihood does not decrease after any iteration. The underlying model in the EM algorithm can be specified according to the specific requirements of the given task (Wells III et al., 1996; Y. Zhang et al., 2001). In spite of these achievements, EM has a few deficiencies: a good prior distribution and a known number of classes are required, and the computation is extensive.
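The alternation of E and M steps can be sketched for a one-dimensional Gaussian mixture of pixel intensities. The version below uses the standard soft assignment (posterior responsibilities) in the E step rather than a hard classification, and it assumes the number of classes and rough initial parameters are given; the function name `em_gmm_1d` and the synthetic data are assumptions for illustration.

```python
import numpy as np

def em_gmm_1d(y, mu, sigma, weights, n_iter=50):
    """EM for a 1-D Gaussian mixture. Returns the fitted parameters, the
    responsibilities and the log-likelihood recorded at every iteration."""
    y = np.asarray(y, float)
    mu = np.asarray(mu, float).copy()
    sigma = np.asarray(sigma, float).copy()
    weights = np.asarray(weights, float).copy()
    log_lik = []
    for _ in range(n_iter):
        # E step: posterior probability of each class for each sample.
        dens = weights * np.exp(-0.5 * ((y[:, None] - mu) / sigma) ** 2) \
            / (np.sqrt(2.0 * np.pi) * sigma)
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total
        log_lik.append(float(np.log(total).sum()))
        # M step: re-estimate the parameters from the responsibilities.
        nk = resp.sum(axis=0)
        mu = (resp * y[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (y[:, None] - mu) ** 2).sum(axis=0) / nk)
        sigma = np.maximum(sigma, 1e-6)          # avoid a collapsed component
        weights = nk / len(y)
    return mu, sigma, weights, resp, log_lik

rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(50, 5, 300), rng.normal(150, 5, 300)])
mu, sigma, weights, resp, log_lik = em_gmm_1d(
    y, mu=[40.0, 160.0], sigma=[20.0, 20.0], weights=[0.5, 0.5])
labels = resp.argmax(axis=1)                     # hard labels from the posteriors
```

The recorded `log_lik` values make the convergence property visible: they should never decrease from one iteration to the next.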

Markov random field model (MRF)
The Markov random field (MRF) model (S.Z. Li, 1995) is a statistical model that can be used within segmentation methods. MRFs model spatial interactions among neighboring or nearby pixels. In medical imaging, they are typically used because most pixels belong to the same class as their neighboring pixels (Pham et al., 2000). Let a finite lattice $I$ represent a 2D image and $i \in I$ be pixel $i$ in this image, which is denoted by $Y = \{y_i, i \in I\}$, where $y_i$ is the gray value of pixel $i$. For each pixel, the region type (or pixel class) that the pixel belongs to is specified by a class label $x_i$, with $X = \{x_i, i \in I\}$ (i.e., the image segmentation result). Here $x_i \in \Lambda$, where $\Lambda = \{1, 2, \ldots, K\}$ is the set of labels and $K$ is the number of classes. Thus $X$ (the label field) and $Y$ (the gray field) are random fields on the lattice $I$, and the purpose of the MRF model is to establish the relationship between $X$ and $Y$. The image model is defined as:
$$y_i = \mu_{x_i} + e_i$$
where $\mu_{x_i}$ is the gray mean value of class $x_i$, and $e_i$ is a random variable following a Gaussian distribution. If $x_i = k$, $k \in \Lambda$, and $e_i \sim N(0, \sigma_k^2)$, in which $\sigma_k^2$ is the variance of $e_i$ for class $k$, then the conditional probability density is defined as:
$$p(y_i \mid x_i = k) = \frac{1}{\sqrt{2\pi\sigma_k^2}} \exp\left( -\frac{(y_i - \mu_k)^2}{2\sigma_k^2} \right)$$
Subsequently, the prior model of the image segmentation result $X = \{x_i, i \in I\}$ is a 2D MRF.
According to the Hammersley-Clifford theorem (Hammersley & Clifford, 1971), the prior probability of an MRF follows a Gibbs distribution, and so the prior model is defined as:
$$P(X) = \frac{1}{Z} \exp\big( -U(X) \big), \qquad U(X) = \sum_{c \in C} V_c(X)$$
where $Z$ is a normalizing constant called the partition function and $V_c(X)$ denotes the potential function of clique $c \in C$, which only depends on $x_j$, $j \in N_i$. $C$ is the set of second order cliques (i.e. doubletons), and $N_i$ indicates the neighborhood of pixel $i$. If the multi-level logistic (MLL) model is adopted and only the second order neighborhood system and the dual potential function are considered, the energy function is defined as:
$$U(X) = \sum_{i \in I} \sum_{j \in N_i} V_c(x_i, x_j), \qquad V_c(x_i, x_j) = \begin{cases} -\beta, & x_i = x_j \\ \beta, & x_i \neq x_j \end{cases}$$
where $\beta$ is the parameter controlling the strength of spatial interaction. Note that the energies of singletons (i.e. pixel $i \in I$) directly reflect the probabilistic modeling of labels without context, while doubleton clique potentials express the relationship between neighboring pixel labels. On the basis of maximum a posteriori (MAP) estimation (Geman & Geman, 1993) and Bayes' theorem, the optimal solution $X^*$ is defined as:
$$X^* = \arg\max_X P(X \mid Y) = \arg\max_X P(Y \mid X)\, P(X) = \arg\min_X \big\{ U(Y \mid X) + U(X) \big\}$$
where $U(Y \mid X) = \sum_{i \in I} \left[ \frac{(y_i - \mu_{x_i})^2}{2\sigma_{x_i}^2} + \ln \sigma_{x_i} \right]$ is the likelihood energy obtained from the conditional density above. In this way, the segmentation problem in the MRF model is reduced to the minimization of the above energy function, which is usually computed by the iterated conditional modes (ICM) algorithm (Besag, 1986). The ICM method uses a 'greedy' strategy of iterative local minimization, and it typically converges to a local minimum after only a few iterations (Boudraa & Zaidi, 2006). By introducing spatial relations among pixels, the unsupervised and nonparametric MRF model can effectively decrease the influence of image noise and yield stable and satisfactory segmentation results for low SNR images. This model has been widely applied in human cerebral cortical segmentation (Held et al., 1997; Y. Zhang et al., 2001). However, a difficulty associated with MRF models is the proper selection of the parameters controlling the strength of spatial interaction (S.Z. Li, 1995). A setting that is too high can result in an excessively smooth segmentation and a loss of important structural details.
Some researchers have proposed several schemes for the estimation of MRF parameters (Descombes et al., 1999;Salzenstein & Pieczynski, 1997;R. Xu & Luo, 2009). In addition, MRF methods usually require computationally intensive algorithms (Pham et al., 2000).
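To make the energy minimization concrete, the following sketch runs ICM on a toy image. It assumes a Gaussian likelihood with known class means and standard deviations, an 8-pixel (second order) neighbourhood, and a doubleton potential of -beta for equal neighbouring labels and +beta otherwise; these are assumptions of this illustration, and the value of `beta` is arbitrary.

```python
import numpy as np

OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def icm_segment(y, mu, sigma, beta=1.0, n_sweeps=5):
    """Greedy ICM: repeatedly give every pixel the label with the lowest
    local energy (Gaussian data term plus pairwise smoothness term)."""
    y = np.asarray(y, float)
    mu = np.asarray(mu, float)
    sigma = np.asarray(sigma, float)
    # Data energy per pixel and class, shape (rows, cols, K).
    data = (y[..., None] - mu) ** 2 / (2.0 * sigma ** 2) + np.log(sigma)
    x = data.argmin(axis=2)                  # maximum-likelihood initial labels
    rows, cols = y.shape
    for _ in range(n_sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                prior = np.zeros(len(mu))
                for dr, dc in OFFSETS:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        prior += beta                  # +beta for every class ...
                        prior[x[nr, nc]] -= 2.0 * beta  # ... -beta for the neighbour's class
                best = int((data[r, c] + prior).argmin())
                if best != x[r, c]:
                    x[r, c] = best
                    changed = True
        if not changed:
            break
    return x

y = np.full((5, 6), 50.0)
y[:, 3:] = 150.0
y[2, 1] = 110.0                              # isolated outlier inside the dark region
labels = icm_segment(y, mu=[50.0, 150.0], sigma=[10.0, 10.0], beta=1.0)
```

On intensity alone the outlier pixel would be assigned to the bright class; the neighbourhood term pulls it back to the class of its surroundings. Increasing `beta` strengthens this effect, at the risk of over-smoothing noted above.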

Artificial neural networks (ANNs)
Artificial neural networks (ANNs) are parallel networks of processing elements or nodes that simulate biological neural networks. Each node in an ANN is capable of performing elementary computations. Learning is achieved through the adaptation of the weights assigned to the connections between nodes. The massively connectionist architecture usually makes the system robust, while the parallel processing enables the system to produce output in real time. To simulate a biological neural network, the neurons and connections in an ANN model comprise the components and variables shown in Fig. 2 (Kriesel, 2007). A thorough treatment of ANNs can be found in (J.W. Clark, 1991).

Advances in Brain Imaging
The most widespread application of ANNs in medical imaging is as a classifier (Gelenbe et al., 1996; Hall et al., 1992), in which the weights are determined from training data and the ANN is then used to segment new data. ANNs can also be used in an unsupervised fashion as a clustering method (Bezdek et al., 1993; Reddick et al., 1997), as well as for deformable models (Vilarino et al., 1998). Because of the many interconnections used in a neural network, spatial information can be easily incorporated into its classification procedures (Pham et al., 2000). However, the major disadvantage of ANNs is that they require training data. Large neural networks also require long processing times, because their processing is usually simulated on a standard serial computer.

Deformable models
Deformable models are physically motivated, model-based techniques for detecting region boundaries by using closed parametric curves or surfaces that deform under the influence of internal and external forces. To delineate an object boundary in an image, a closed curve or surface must first be placed near the desired boundary and then be allowed to undergo an iterative relaxation process. Internal forces are computed from within the curve or surface to keep it smooth throughout the deformation. External forces are usually derived from the image to drive the curve or surface toward the desired feature of interest (Pham et al., 2000). The original deformable model, called the snake model, was introduced in (Kass et al., 1988), in which the contour deforms to minimize a contour energy that includes the internal energy from the contour and the external energy from the image. A number of improvements have also been proposed, such as snake variations (Cohen, 1991; McInerney & Terzopoulos, 2000). The level set method is another important deformable contour method, and it was first proposed for image segmentation in (Malladi et al., 1995). Some researchers have applied the level set formulation with contour energy minimization to obtain better convergence (Siddiqi et al., 1998; Wang et al., 2004).
Deformable models are quite helpful for cerebral cortical segmentation in MR images (Davatzikos & Bryan, 1996). Their advantages are that they are capable of generating closed parametric curves or surfaces from images and that they incorporate a smoothness constraint that provides robustness to noise and spurious edges. Their disadvantage is that they require manual interaction to place an initial model and choose appropriate parameters. Progress in reducing the sensitivity to initialization has been made in (Cohen, 1991; Malladi et al., 1995). Standard deformable models can also exhibit poor convergence to concave boundaries. This difficulty can be alleviated somewhat through the use of pressure forces (Cohen, 1991) and other modified external-force models (C. . Another important extension of deformable models is the adaptivity of the model topology, obtained by using an implicit representation rather than an explicit parameterization (Malladi et al., 1995; McInerney & Terzopoulos, 1995). Several general reviews on deformable models in medical image analysis can be found in (He et al., 2008; Heimann & Meinzer, 2009; McInerney & Terzopoulos, 1996; Suri, Liu, et al., 2002).
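For reference, the contour energy minimized by the classical snake of Kass et al. (1988) is commonly written as below for a parametric contour $v(s) = (x(s), y(s))$, $s \in [0, 1]$. The weights $\alpha$ (elasticity) and $\beta$ (rigidity) are the snake's own parameters and are unrelated to the MRF parameter used elsewhere in this chapter; the edge-based external energy shown is one common choice and is given here only as an example.

```latex
E_{\mathrm{snake}} = \int_{0}^{1} \Big[ E_{\mathrm{int}}\big(v(s)\big) + E_{\mathrm{ext}}\big(v(s)\big) \Big] \, ds,
\qquad
E_{\mathrm{int}}\big(v(s)\big) = \frac{1}{2} \Big( \alpha(s) \, |v'(s)|^{2} + \beta(s) \, |v''(s)|^{2} \Big),
\qquad
E_{\mathrm{ext}}\big(v(s)\big) = - \big| \nabla I\big(v(s)\big) \big|^{2}
```

The first-derivative term penalizes stretching and the second-derivative term penalizes bending, which together give the internal forces that keep the contour smooth, while the external term is low where the image gradient is strong and so draws the contour toward edges.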

Atlas-guided approaches
Atlas-guided approaches are a powerful tool for medical image segmentation when a standard atlas or template is available. The whole idea of using the brain atlas was to provide a priori knowledge, which can help in grouping the segments into anatomical structures. This helps to obtain fully automatic cortical segmentation procedures. The standard atlas-guided approach treats segmentation as a registration problem. It first finds a one-to-one transformation that maps a pre-segmented atlas image to the target image. This process is often referred to as 'atlas warping'. The warping can be performed with linear transformations (Talairach & Tournoux, 1988), or nonlinear transformations (Collins et al., 1995;Davatzikos, 1996). Atlas-guided approaches have been applied mainly in brain MRI segmentation (Collins et al., 1995), as well as in extracting the brain volume from head scans (Aboutanos & Dawant, 1997). One advantage is that labels as well as the segmentation are transferred. They also provide a standard system for studying morphometric properties (Thompson & Toga, 1997). Atlas-guided approaches are generally better suited for segmentation of structures that are stable over the population of study. One method that helps model anatomical variability is to use probabilistic atlases (Thompson & Toga, 1997), but these require additional time and interaction to accumulate data. Another method is to use manually selected landmarks (Davatzikos, 1996) to constrain transformation.
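Once the transformation has been found, transferring the atlas labels to the target image is a simple resampling step. The sketch below assumes that a 2D affine transformation mapping target coordinates to atlas coordinates is already known (estimating it is the registration problem and is not shown) and uses nearest-neighbour sampling so that labels stay integer; the function name `warp_atlas_labels` and the toy atlas are assumptions for illustration.

```python
import numpy as np

def warp_atlas_labels(atlas_labels, affine, target_shape, outside=0):
    """Pull atlas labels onto the target grid by nearest-neighbour sampling.
    `affine` is a 2x3 matrix mapping target (row, col, 1) to atlas (row, col)."""
    rows, cols = np.indices(target_shape)
    coords = np.stack([rows.ravel(), cols.ravel(), np.ones(rows.size)])
    src = np.rint(np.asarray(affine, float) @ coords).astype(int)
    valid = ((src[0] >= 0) & (src[0] < atlas_labels.shape[0]) &
             (src[1] >= 0) & (src[1] < atlas_labels.shape[1]))
    out = np.full(rows.size, outside, dtype=atlas_labels.dtype)
    out[valid] = atlas_labels[src[0, valid], src[1, valid]]
    return out.reshape(target_shape)

atlas = np.zeros((6, 6), dtype=int)
atlas[1:3, 1:3] = 1                              # one labelled structure in the atlas
# Hypothetical transformation: the target structure sits 2 rows down and 1 column right.
shift = np.array([[1.0, 0.0, -2.0],
                  [0.0, 1.0, -1.0]])
target_labels = warp_atlas_labels(atlas, shift, (6, 6))
```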

Other techniques
Texture segmentation segments an image into regions according to the textures of the regions. In the early 1970s, Haralick et al. (Haralick et al., 1973) published an extensive paper on texture. Later, Peleg et al. (Peleg et al., 1984) and Cross and Jain (Cross & Jain, 1983) also published work on texture analysis applied to computer vision images. The application of texture in brain segmentation started in the early 1990s, when Lachmann and Barillot (Lachmann & Barillot, 1992) developed a method for the classification of WM, GM and CSF. This method, however, did not discuss validation schemes, so it was hard to judge the performance of the segmentation algorithm. Besides, it seemed sensitive to the initial textural properties, and no such discussion was carried out in the paper (Suri, Singh, et al., 2002b).
The self-organizing map (SOM), introduced by Kohonen in early 1981 (Kohonen, 1990), is a type of artificial neural network whose precursor is learning vector quantization (LVQ), also invented by T. Kohonen (Kohonen, 1997). Using unsupervised learning, it is able to convert complex, nonlinear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display. Applications of the SOM method can be found in (Y. Li & Chi, 2005; Tian & Fan, 2007). However, SOM algorithms are, firstly, highly dependent on the representativeness of the training data and on the initialization of the connection weights. Secondly, they become computationally very expensive as the dimensionality of the data increases (Y. Li & Chi, 2005).
The wavelet transform, first applied in medical imaging research in 1991 (Weaver et al., 1991), is a tool that cuts up data, functions or operators into different frequency components, and then studies each component with a resolution matched to its scale (Daubechies, 2004). Modern wavelet analysis is considered to have been proposed by Grossmann and Morlet in their milestone paper (Morlet & Grossman, 1984). In medical image segmentation, wavelet transforms have been employed in combination with texture analysis, edge detection, classifiers, statistical models, deformable models, etc. Many works benefit from using image features in the spatial-frequency domain after the wavelet transform to assist the segmentation (Barra & Boire, 2000; Bello, 1994).
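As a small illustration of how a wavelet transform splits an image into frequency components, the sketch below performs one level of the 2D Haar transform, the simplest wavelet. It assumes an image with even dimensions; the function name `haar2d` and the sub-band names are conventions chosen for this example.

```python
import numpy as np

def haar2d(image):
    """One level of the orthonormal 2D Haar transform (even-sized image).
    Returns the approximation band and three detail bands."""
    a = np.asarray(image, float)
    s = np.sqrt(2.0)
    lo = (a[:, 0::2] + a[:, 1::2]) / s           # low-pass along columns
    hi = (a[:, 0::2] - a[:, 1::2]) / s           # high-pass along columns
    ll = (lo[0::2] + lo[1::2]) / s               # approximation
    lh = (lo[0::2] - lo[1::2]) / s               # changes between adjacent rows
    hl = (hi[0::2] + hi[1::2]) / s               # changes between adjacent columns
    hh = (hi[0::2] - hi[1::2]) / s               # diagonal changes
    return ll, lh, hl, hh

rng = np.random.default_rng(0)
image = rng.random((8, 8))
ll, lh, hl, hh = haar2d(image)
flat = haar2d(np.full((4, 4), 3.0))              # a constant image has no detail
```

The detail bands respond to local intensity changes at this scale, which is why such coefficients are often used as texture or edge features for the segmentation methods described earlier.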
Multispectral segmentation is a method for differentiating tissue classes having similar characteristics in a single imaging modality by using several independent images of the same anatomical slice in different modalities (e.g., T1, T2, proton density, etc.). As a consequence of different responses of the tissues to particular pulse sequences, this increases the capability of discrimination between different tissues (Fletcher et al., 1993;Vannier et al., 1985). The most common approach for multispectral MR image segmentation is pattern recognition (Bezdek et al., 1993;Suri, Singh, et al., 2002b). These techniques generally appear to be successful particularly for brain MR images (Reddick et al., 1997;Taxt & Lundervold, 1994), but much work remains in the area of validation.

A new non-homogeneous Markov random field model
As we introduced in Section 2.6.2, Markov random field (MRF) theory (S.Z. Li, 1995) has been widely used in the field of medical image processing owing to its advantages, including unsupervised operation, good stability and satisfactory segmentation of images with low SNR. MRF theory provides a convenient and consistent way of modeling context among image pixels. This is achieved by characterizing the mutual influences among such entities using conditional MRF distributions. The practical use of MRF models is largely ascribed to the equivalence between MRF and Gibbs distributions established by Hammersley and Clifford (Hammersley & Clifford, 1971) and further developed by Besag (Besag, 1974) for the joint distribution of an MRF. This enables us to model vision problems by mathematically sound yet tractable means for image segmentation in a Bayesian framework (Geman & Geman, 1993; Grenander, 1983).
In the traditional MRF model, the Gibbs random field (GRF) uses the parameter $\beta$ to determine the spatial correlation among dependent image pixels. The greater the parameter is, the stronger the spatial correlation; the smaller the parameter is, the weaker the spatial correlation. Generally, the MRF model is assumed to be homogeneous, which means that the parameter $\beta$ is constant. Many previous studies have offered methods to estimate this parameter accurately, which improve the quality of image segmentation (Deng & Clausi, 2004; Descombes et al., 1999). Owing to the characteristics of medical images, the homogeneous MRF model often leads to over-segmentation and induces a higher misclassification rate. In this section, we propose a new non-homogeneous MRF model (called the Modified-MRF or M-MRF model) that uses fuzzy membership to estimate the parameter $\beta$ accurately, and the experimental results show that our model effectively reduces over-segmentation and enhances segmentation precision (R. Xu & Luo, 2009).

Fuzzy sets
Fuzzy sets are sets whose elements have degrees of membership. They were first proposed by L.A. Zadeh in 1965 (Zadeh, 1965) as an extension of the classical notion of a set. Classical set theory only describes precise phenomena, because an element either belongs to a classical set or does not: yes or no. By contrast, fuzzy set theory permits the gradual assessment of the membership of elements in a set; this is described with the aid of a membership function valued in the real unit interval [0, 1]. Fuzzy sets generalize classical sets, since the indicator functions of classical sets are special cases of the membership functions of fuzzy sets, namely those that only take the values 0 or 1 (DuBois & Prade, 1980). The fuzzy set is defined as follows: given a domain $X$ with elements $x$, the mapping is defined as $\mu_F: X \to [0, 1]$, $x \mapsto \mu_F(x)$, which means that $\mu_F$ determines a fuzzy set $F$ in domain $X$; $\mu_F$ is called the membership function of $F$ and $\mu_F(x)$ is the membership of $x$ in $F$. The greater the membership, the greater the degree to which an element pertains to the fuzzy set. As a consequence, $F$ is a subset of domain $X$ that does not have a sharply defined border.
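The difference between a classical indicator function and a fuzzy membership function can be illustrated with a short sketch. The triangular shape used here is just one simple, commonly used membership function, and the intensity values are arbitrary; both are assumptions for illustration.

```python
import numpy as np

def crisp_membership(x, low, high):
    """Indicator function of the classical set [low, high]: values are 0 or 1."""
    x = np.asarray(x, float)
    return ((x >= low) & (x <= high)).astype(float)

def triangular_membership(x, left, peak, right):
    """Fuzzy membership rising linearly from `left` to 1 at `peak`,
    then falling linearly back to 0 at `right`."""
    x = np.asarray(x, float)
    rising = (x - left) / (peak - left)
    falling = (right - x) / (right - peak)
    return np.clip(np.minimum(rising, falling), 0.0, 1.0)

intensities = [0, 5, 10, 15, 20]
crisp = crisp_membership(intensities, 5, 15)
fuzzy = triangular_membership(intensities, 0, 10, 20)
```

The crisp set jumps between 0 and 1 at its border, whereas the fuzzy set assigns intermediate grades to elements near the border.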

Modified non-homogeneous MRF model
In brain MR images, the spatial correlation of adjacent pixels varies with position in the image, which indicates that the parameter should be a variable that changes with the spatial site. Consequently, the corresponding MRF model should be considered non-homogeneous.

The parameter function based on fuzzy membership
Let y be the gray value of pixels, and x be the classification of pixels in image I. If pixel i is marked by class k (k = 1, …, K, each class having its own clustering center), the parameter at site i is a decreasing function of the membership of pixel i in class k. The smaller this membership, the lower the degree to which pixel i belongs to class k, which implies that the label of pixel i should be decided mainly by the state of its neighborhood. The larger this membership, the higher the degree to which pixel i belongs to class k, which implies that the label of pixel i should be decided mainly by its own gray value. Thus, the parameter is defined as a function of this membership (Eq. (18)).
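The qualitative behaviour described above can be sketched in a few lines. The linear form used here, the name `site_parameter` and the scale `beta_max` are hypothetical and serve only to illustrate a decreasing dependence on membership; they are not the chapter's own definition.

```python
def site_parameter(membership, beta_max=1.0):
    """Hypothetical decreasing map from a membership in [0, 1] to a site-wise weight.

    A low membership gives a large weight (the neighbourhood dominates);
    a high membership gives a small weight (the pixel's own gray value dominates).
    """
    if not 0.0 <= membership <= 1.0:
        raise ValueError("membership must lie in [0, 1]")
    return beta_max * (1.0 - membership)


# Weights for a weakly, moderately and strongly classified pixel.
weights = [site_parameter(u) for u in (0.1, 0.5, 0.9)]
```

Any other monotonically decreasing function would show the same trend; the linear choice simply keeps the example short.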

The modified MRF model (M-MRF model)
In the traditional MRF model (see Section 2.6.2), the parameter is used to calculate the energy function and the clique potentials over all possible cliques, which depend only on the neighborhood of pixel i. Using the function defined above, and considering the multi-level logistic (MLL) model, a second-order neighborhood system and the dual potential function, the energy function and clique potentials are modified so that the constant parameter is replaced by its site-dependent counterpart, which yields the new non-homogeneous MRF (M-MRF) model. The segmentation problem is thereby reduced to minimizing this energy function, which is generally solved by the iterated conditional modes (ICM) algorithm (Besag, 1986). The algorithm of the M-MRF model for image segmentation is designed as follows:
1. Initialize the number of classes K, the clustering centers, the error tolerance ε and the iteration counter.
2. Obtain the initial segmentation results via the KFCM algorithm (L. Zhang et al., 2002), and estimate the parameter by Eq. (18).
3. Segment the initial image based on the maximum-likelihood criterion and the M-MRF model, and calculate the global energy E of the whole image.
4. Calculate the local conditional energy of every pixel for all possible classifications by Eq. (19) and Eq. (21), and update the classification of every pixel so as to minimize the local conditional energy.
5. Calculate the global energy E of the whole image again from the new classification of every pixel, and advance the iteration counter.
6. If the absolute change in global energy between two successive iterations is smaller than ε, go to step 7; otherwise return to step 4.
7. Output the image segmentation results and stop.
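The iteration above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: standard fuzzy c-means memberships stand in for KFCM, the site-wise weight uses a hypothetical linear decreasing form, the data term assumes Gaussian classes with a common `sigma`, and all function names are invented for this example.

```python
def fcm_memberships(value, centres, eps=1e-9):
    """Standard fuzzy c-means memberships (fuzzifier 2) of one gray value."""
    inverse = [1.0 / ((value - c) ** 2 + eps) for c in centres]
    total = sum(inverse)
    return [w / total for w in inverse]


def neighbours(r, c, rows, cols):
    """Yield the second-order (8-connected) neighbours of site (r, c)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                yield rr, cc


def local_energy(image, labels, beta, centres, sigma, r, c, k):
    """Gaussian data term plus MLL pairwise term for giving class k to site (r, c)."""
    data = (image[r][c] - centres[k]) ** 2 / (2.0 * sigma ** 2)
    rows, cols = len(image), len(image[0])
    prior = sum(-1.0 if labels[rr][cc] == k else 1.0
                for rr, cc in neighbours(r, c, rows, cols))
    return data + beta[r][c] * prior


def icm_segment(image, centres, sigma=10.0, beta_max=1.0, tol=1e-6, max_iter=20):
    """Iterated conditional modes with a site-dependent smoothing weight."""
    rows, cols = len(image), len(image[0])
    classes = range(len(centres))
    labels = [[0] * cols for _ in range(rows)]
    beta = [[0.0] * cols for _ in range(rows)]
    # Initial labels and site-wise weights from the fuzzy memberships.
    for r in range(rows):
        for c in range(cols):
            u = fcm_memberships(image[r][c], centres)
            k = max(classes, key=u.__getitem__)
            labels[r][c] = k
            beta[r][c] = beta_max * (1.0 - u[k])  # assumed decreasing form

    def global_energy():
        # Sum of local energies; used only to monitor convergence.
        return sum(local_energy(image, labels, beta, centres, sigma, r, c, labels[r][c])
                   for r in range(rows) for c in range(cols))

    energy = global_energy()
    for _ in range(max_iter):
        # Update every pixel to the class with minimal local conditional energy.
        for r in range(rows):
            for c in range(cols):
                labels[r][c] = min(
                    classes,
                    key=lambda k: local_energy(image, labels, beta, centres, sigma, r, c, k))
        new_energy = global_energy()
        if abs(new_energy - energy) < tol:
            break
        energy = new_energy
    return labels


# Two flat regions (gray values 50 and 150) with one ambiguous pixel of value 105.
image = [[50.0, 50.0, 50.0, 150.0, 150.0] for _ in range(5)]
image[2][1] = 105.0
labels = icm_segment(image, centres=[50.0, 150.0])
```

In this toy image the ambiguous pixel has a low membership and therefore a large weight, so its neighbourhood decides its label, while the confidently classified pixels keep the label suggested by their own gray value.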

Smoothing of image
Owing to the complexity of brain MR images and the limitations of segmentation algorithms, segmentation results are often accompanied by burrs, specks, ragged edges, etc. By smoothing, isolated burrs and specks can be removed, the edges of regions can be smoothed and holes in areal objects can be filled. Consequently, the quality of the segmentation results can be further improved. In image smoothing, an n × n matrix template is moved across the image along rows and columns and compared with the segmentation results at each position. If the template matches, the segmentation result of the pixel at the center of the template is replaced by the segmentation result shared by the pixels around it.

Deburring
The deburring matrix in Fig. 3 (a) is frequently used, where a, b, x ∈ L (L is the set of labels) and 'x' is arbitrary, which means that the segmentation results at the sites marked 'x' are ignored. When the segmentation results within the template window match the deburring matrix in Fig. 3 (a), the 'b' at the center of the matrix becomes 'a'.

Smoothing of lines and filling of holes
The methods for smoothing lines and filling holes are the same as that for deburring; only the matrices differ. The matrix for smoothing lines in Fig. 3 (b) is used as a rule. In the same way, when the segmentation results within the template window match the matrix in Fig. 3 (b), the 'b' at the center of the matrix becomes 'a'.
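The template matching described in this section can be sketched generically. The matrices of Fig. 3 are not reproduced in the text, so the 3 × 3 template `DEBURR` below is a hypothetical stand-in, and the function names are likewise assumptions made for illustration.

```python
def match_template(window, template):
    """Return the label bound to 'a' if `window` matches `template`, else None.

    Template cells are 'a', 'b' or 'x': all 'a' cells must share one label,
    all 'b' cells must share a different label, and 'x' cells are ignored.
    """
    bound = {}
    for window_row, template_row in zip(window, template):
        for label, symbol in zip(window_row, template_row):
            if symbol == "x":
                continue
            if bound.setdefault(symbol, label) != label:
                return None
    if bound.get("a") == bound.get("b"):
        return None
    return bound.get("a")


def smooth_labels(labels, template):
    """Slide a square template over the label map; on a match the centre becomes 'a'."""
    half = len(template) // 2
    rows, cols = len(labels), len(labels[0])
    result = [row[:] for row in labels]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            window = [labels[r + dr][c - half:c + half + 1]
                      for dr in range(-half, half + 1)]
            new_label = match_template(window, template)
            if new_label is not None:
                result[r][c] = new_label
    return result


# Hypothetical deburring template: centre 'b' surrounded by 'a' on four sides.
DEBURR = [["x", "a", "x"],
          ["a", "b", "a"],
          ["x", "a", "x"]]

segmented = [[1] * 5 for _ in range(5)]
segmented[2][2] = 2  # isolated speck
cleaned = smooth_labels(segmented, DEBURR)
```

Because the matcher only interprets the symbols 'a', 'b' and 'x', line smoothing and hole filling would reuse the same two functions with different template matrices.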

Experimental results
In order to verify the effect of the M-MRF model in image segmentation, the KFCM algorithm (L. Zhang et al., 2002), the traditional MRF model (S.Z. Li, 1995) and the M-MRF model are applied to the segmentation of simulated brain MR images. In the experiments, brain MR images are divided into four regions: gray matter (GM), white matter (WM), cerebrospinal fluid (CSF) and background (BG). All experiments are run with VS.Net 2003 on a PC with an Intel® Core™2 CPU 6600 @ 2.40 GHz and 2 GB of memory.
The simulated brain MR images from Brainweb (http://www.bic.mni.mcgill.ca/brainweb/) are used in the experiments, and we treat them as the gold standard of image segmentation. Each data set is composed of 8 8 pixels, the slice thickness is 1 mm, weighted. Herein, the slice images used in the experiments are the . 's ones of the image sequences. Fig. 4 compares the segmentation results of several algorithms for a simulated brain MR image with 9% added noise. The experimental results demonstrate that, even for images with a lower signal-to-noise ratio (SNR), the M-MRF model achieves more satisfactory segmentation results.
In view of the characteristics of brain MR images, a new non-homogeneous MRF model (M-MRF model) is put forward to reduce over-segmentation, in which the parameter is estimated precisely from fuzzy membership, so that the spatial correlation among pixels is set up reasonably. The experimental results show that our model not only inherits the advantages of the traditional MRF model, e.g., being unsupervised, stable and robust for images with a low SNR, but also significantly enhances the accuracy of image segmentation. Meanwhile, the algorithm of this new model is simple and feasible, and it can readily be applied in clinical practice by combining it with a bias field correction model.

Image pre-processing
Due to the inherent technical limitations of the MR imaging process, uncertainties are introduced into MR images, including random noise, intensity inhomogeneity and partial volume effect. A more complete coverage of the sources of error inherent in MR images can be found in (Plante & Turkstra, 1991). The image pre-processing techniques reviewed here mainly focus on reducing the detrimental effects of these artifacts before segmentation methods are applied.
It is difficult to remove noise from MR images, which is known to have a Rician distribution (Prima et al., 2001), and the literature on noise removal is extensive. Methods vary from standard filters to more advanced filters, and from general methods to de-noising methods specific to MR images, such as linear filtering, nonlinear filtering, adaptive filtering, anisotropic diffusion filtering, wavelet analysis, total variation regularization, bilateral filtering, trilateral filtering and non-local means (NL-means) models. A useful survey of image de-noising algorithms can be found in (Buades et al., 2006).
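The Rician character of the noise follows from the way magnitude MR images are formed: the real and imaginary channels each carry Gaussian noise, and the displayed value is the magnitude of the complex signal. The sketch below simulates this for a few noise-free intensities; the function name, the seed and the noise level are assumptions chosen for illustration.

```python
import math
import random


def add_rician_noise(signal, sigma, rng=None):
    """Return magnitude values after adding Gaussian noise to both complex channels."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so that the example is repeatable
    noisy = []
    for s in signal:
        real = s + rng.gauss(0.0, sigma)   # noise in the real channel
        imag = rng.gauss(0.0, sigma)       # noise in the imaginary channel
        noisy.append(math.hypot(real, imag))
    return noisy


clean = [0.0, 50.0, 100.0]
noisy = add_rician_noise(clean, sigma=5.0)
```

Unlike additive Gaussian noise, the result is never negative and is biased upwards in dark regions such as the background, which is one reason why general-purpose filters are not always adequate for MR images.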
Intensity inhomogeneity (also called bias field or shading artefact) in MRI, which arises from imperfections of the image acquisition process, manifests itself as a smooth intensity variation across the image (Fig. 5). Because of this phenomenon, the intensity of the same tissue varies with the location of the tissue within the image. Although intensity inhomogeneity is usually hardly noticeable to a human observer, many medical image analysis methods, such as segmentation and registration, are highly sensitive to spurious variations of image intensities. This is why a large number of methods for the correction of intensity inhomogeneity in MR images have been proposed (Vovk et al., 2007). Early publications on MRI intensity inhomogeneity correction date back to 1986 (Haselgrove & Prammer, 1986; McVeigh et al., 1986). Since then, the sources of intensity inhomogeneity in MRI have been studied extensively (Alecci et al., 2001; Keiper et al., 1998; Liang & Lauterbur, 2000; Simmons et al., 1994), and the correction methods can generally be divided into two groups: prospective methods and retrospective methods. According to the classification proposed by Vovk et al. (2007), the prospective methods may be further classified into those based on phantoms, multi-coils and special sequences. The retrospective methods are further classified into filtering, surface fitting, segmentation-based and histogram-based methods. Additionally, several valuable reviews of this topic can be found in (Arnold et al., 2001; Belaroussi et al., 2006; Hou, 2006; Sled et al., 1997; Velthuizen et al., 1998; Vovk et al., 2007).
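The smooth intensity variation described above is commonly written as a multiplicative field acting on the true signal. The model below is a widely used convention rather than a formula taken from this chapter, and the symbols $v$, $u$, $b$ and $n$ are introduced here only for illustration.

```latex
v(\mathbf{x}) = u(\mathbf{x})\, b(\mathbf{x}) + n(\mathbf{x}),
\qquad
\log v(\mathbf{x}) \approx \log u(\mathbf{x}) + \log b(\mathbf{x})
\quad \text{(noise neglected)}
```

Here $v$ is the observed image, $u$ the inhomogeneity-free image, $b$ the slowly varying bias field and $n$ the noise. Under this assumption, retrospective methods estimate $b$ from the image itself, and the logarithmic form shows why a smooth additive component can be separated from the tissue signal by filtering or surface fitting.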

Partial volume effect (PVE) refers to artefacts that occur where multiple tissue types contribute to a single pixel, resulting in a blurring of intensity across boundaries; it is common in medical images, particularly in 3D MRI data. Fig. 6 illustrates how the sampling process can result in PVE, leading to ambiguities in structural definitions. In Fig. 6 (Right), it is difficult to determine the boundaries of the two objects precisely. The most common approach to addressing the partial volume effect is to produce segmentations that allow regions or classes to overlap, called soft segmentations. Standard approaches use 'hard segmentations' that enforce a binary decision on whether a pixel is inside or outside the object. Soft segmentations, on the other hand, retain more information from the original image by allowing for uncertainty (such as a membership for every pixel) in the location of object boundaries. Generally, membership functions can be derived by fuzzy clustering and classifier algorithms (Herndon et al., 1996; Pham & Prince, 1999) or by statistical algorithms, in which case the membership functions are probability functions (Wells III et al., 1996), or they can be computed as estimates of partial volume fractions (Choi et al., 1991). Soft segmentations based on membership functions can easily be converted to hard segmentations by assigning each pixel to the class with the highest membership value (Pham et al., 2000). Growing attention has been given to estimating the partial volume effect (Choi et al., 1991; Gage et al., 1992; Gonzalez Ballester et al., 2002; Roll et al., 1994; Soltanian-Zadeh et al., 1993; Thacker et al., 1998; Tohka et al., 2004).
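The conversion from a soft to a hard segmentation mentioned above amounts to taking, for every pixel, the class with the highest membership. A minimal sketch follows; the function name `harden`, the class order (GM, WM, CSF) and the membership values are assumptions made up for this example.

```python
def harden(memberships):
    """Convert per-pixel membership vectors into hard class indices (argmax)."""
    return [[max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
            for row in memberships]


# A 2 x 2 soft segmentation; each pixel holds memberships for (GM, WM, CSF).
soft = [
    [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1]],
    [[0.1, 0.3, 0.6], [0.34, 0.33, 0.33]],
]
hard = harden(soft)
```

The last pixel shows what is lost in the conversion: its three memberships are almost equal, which signals a partial volume pixel, yet the hard result records only a single class.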

Conclusion
A great number of medical image segmentation techniques have been used for the analysis of MRI data of the human brain. Their performance is affected by the characteristics of MRI data, which include a number of artifacts such as random noise, intensity inhomogeneity and partial volume effect. On the other hand, the inherent multispectral character of MRI gives it a distinct advantage over other imaging techniques. Many of the approaches described here explore ways to correct the artifacts in MRI and to fully exploit the multispectral character of this imaging modality. In this chapter, we have given a brief introduction to the fundamental concepts of these techniques and presented our work on brain MR image segmentation, and we have also described pre-processing steps such as de-noising, the correction of intensity inhomogeneity and the estimation of partial volume effect.
Future research on the segmentation of human brain MRI will focus on improving the accuracy, precision and execution speed of segmentation methods, as well as reducing the amount of manual interaction. Accuracy and precision can be improved by incorporating prior information from atlases and by fusing different methods. To improve execution efficiency, multi-scale processing, graphics processing unit (GPU) techniques and parallelizable methods such as neural networks are promising. To raise the acceptance of segmentation methods in routine clinical applications, extensive and efficient validation is required. Furthermore, one must be able to demonstrate a significant performance advantage (e.g. more accurate diagnosis or earlier detection of pathology) over traditional methods to justify the cost of training and equipment. Automated methods cannot replace physicians, but they are likely to become crucial elements in medical image analysis.

Acknowledgment
Special thanks go to the group of Ohya Laboratory, Global Information and Telecommunication Studies (GITS), Waseda University, Japan, and the group of the Laboratory of Image Science and Technology (LIST), School of Computer Science and Engineering, Southeast University, China, for their contributions and discussions on various aspects and projects associated with image segmentation. The authors would like to thank the reviewers for their valuable suggestions for improving this manuscript.