Microcalcification Detection in Digitized Mammograms: A Neurobiologically-Inspired Approach

Introduction
A Computer-Aided Diagnosis (CAD) system is a set of automatic or semi-automatic tools developed to assist radiologists in the detection and/or classification of abnormalities present in diagnostic images of different modalities. Although CAD systems were criticized by some computer scientists during the early phase of their research and development, current experimental evidence indicates that the success rates of radiologists increase significantly when they are assisted by these systems. In mammography, researchers have reported results from prospective studies on a large number of screenees regarding the effect of CAD on the detection rate of breast cancer. Although there is a large variation in the results, it is important to note that all of these studies indicated an increase in the detection rates of breast cancer with the use of CAD; consequently, using CAD contributes to a decrease in cancer-related deaths through the early detection of cancer signs.
The idea of developing computer systems to assist physicians in the detection of diseases has been a challenging matter in recent years, specifically with regard to reducing the number of missed diagnoses and the time taken to reach a diagnosis across the different diagnostic imaging modalities. Moreover, the recent development of full-field digital imaging and picture archiving and communication systems (PACS) has been a catalyst for the growth of such computer systems in developed countries.
Because of the emphasis on screening programs in almost every country, the number of mammograms to be analyzed by radiologists is enormous, but only a small portion of them are related to breast cancer (Oliver et al., 2010). In addition, a mammographic image is characterized by a high spatial resolution, which is adequate to detect subtle fine-scale signs such as microcalcifications. Consequently, the analysis of mammographic images is a complex and cumbersome task that requires highly specialized radiologists.
In recent years, the number of papers related to CAD has grown owing to the increased interest in improving disease diagnosis using different imaging modalities. As far as the evidence indicates, it appears reasonable to use CAD for screening examinations, given that a large fraction of them give normal results and the task of diagnosis therefore becomes both cumbersome and time-consuming. In addition, the current performance of commercial CAD systems has shown a substantial gain in detection rates as well as an important increase in recall rate, not to mention the overall performance of such systems for the detection of disease signs (e.g., 98% sensitivity at 0.25 false positives per mammographic image for one of the latest commercial CAD systems) (Doi, 2007).
As far as the literature shows, there seems to be only one attempt to integrate CAD systems into a multi-organ, multi-disease system incorporating all the diagnostic knowledge (Kobatake, 2007). On the other hand, the current status of single-purpose, single-organ CAD systems shows some good examples of commercial, functional CAD systems for practical clinical use. In mammography, chest radiography and thoracic CT, a number of commercial systems are available. The mammography systems include the detection and differential diagnosis of masses and microcalcifications. In chest radiography and thoracic CT, CAD schemes include the detection and differential diagnosis of lung nodules and interstitial lung diseases, and the detection of cardiomegaly, pneumothorax and interval changes (Doi, 2007). Researchers have reported an important reduction in the mean age of patients at the time of detection when CAD was used, along with an increase in the detection rates of breast cancer (Cupples et al., 2005); similar results were achieved for the detection rates of lung cancer, colon diseases and intracranial aneurysms, among others (Doi, 2007).
Microcalcification detection has been extensively studied. Yu and Guan, 2000, developed a technique for the detection of clustered microcalcifications. The first part of the algorithm addresses the extraction of features based on wavelet decomposition and gray-level statistics, followed by a neural-network classifier. The detection of individual objects relies on shape factors, gray-level features, and a second neural network as a classification scheme. The algorithm was tested using a set of 40 mammograms, and the reported sensitivity was 90% at 0.5 false positives per image. Christoyianni et al., 2002, proposed a neural classification scheme for different kinds of regions of suspicion (ROS) on digitized mammograms; in this approach, the Mini-MIAS database was used to perform the feature extraction and classification stages. The feature extraction stage was based on independent component analysis, in order to find a set of regions that generates the observed mammograms. The recognition accuracy was 88.23% for the detection of abnormalities and 79.31% for distinguishing between benign and malignant regions. El-Naqa et al., 2002, used support vector machines to detect microcalcification clusters. The algorithm was tested using 76 mammograms containing 1120 microcalcifications, and it outperformed several well-known methods for microcalcification detection, with a sensitivity of 94% at one false positive.
Vilarrasa, 2006, proposed a variety of visual processing and classification schemes to detect and classify mammary tissue. This group of algorithms employs standard segmentation procedures such as the Tukey outlier test, region growing and segmentation via the watershed transformation; additionally, a neural classifier is proposed to distinguish between healthy and calcified mammary tissue. The initial results were poor (and were not reported for that reason); nevertheless, a morphological filter was used to increase the success rates of the classifier, and the system finally reached 84% sensitivity, 64% specificity and 77.2% accuracy. Verma et al., 2009, used a novel soft cluster neural network technique for the classification of suspicious areas in digital mammograms. The main idea of the soft clusters is to increase the generalization ability of the neural network; this network used a set of six features and was trained and tested using the DDSM benchmark database, and the results showed an accuracy between 79% and 94%. Wei et al., 2009, proposed a microcalcification classification scheme assisted by content-based mammogram retrieval. The algorithm was tested using 200 different mammographic images from 104 cases. This approach used an adaptive support vector machine (Ada-SVM) as the classifier, which outperformed the classification accuracies of other classifiers owing to the incorporation of proximity information; the reported classification accuracy was 0.82 in terms of the area under the ROC curve. Tsai et al., 2010, proposed an approach in which suspicious microcalcified regions are separated from normal tissue by wavelet layers and Renyi's information theory. Subsequently, several statistical shape-based descriptors are extracted; principal component analysis (PCA) is used to reduce the dimensionality of the feature space, and the data classification is performed by a standard MLP neural network. The maximum performance achieved by this approach was 97.1 at 0.08 false positives.

Visual cortex mechanisms: Neurobiological considerations and potential for CAD
To date, microcalcification detection has been widely studied alongside the development of computer vision algorithms. Many computational approaches have addressed the problem with reasonable cost-effectiveness. Nonetheless, neurobiologically-inspired approaches have been rather neglected, because the relation between cogent neurobiological principles and their potential for the development of computer vision systems has been poorly established.
The primate visual cortex is capable of interpreting dynamic scenes in clutter, even though it relies on several serial visual processes, as attention shifting and saccadic eye movements suggest. As purely parallel processing of visual inputs becomes obscure and cumbersome for the visual cortex machinery, it deals with this task by selecting circumscribed regions of visual information to be processed preferentially and by changing the processing focus over time. To date, there are several approaches to the dynamic routing of visual stimuli and information flow through the visual cortex. These approaches account for competitive interactions and dynamical modifications of the neural activity in the ventral and dorsal pathways, and for the consequent biasing of these interactions in favor of certain objects of the space under scene-dependent (bottom-up) and/or task-dependent (top-down) strategies (Itti & Koch, 2000). The interactions between these two visual processes have been addressed by many researchers (Fix et al., 2010; Navalpakkam & Itti, 2005; Navalpakkam & Itti, 2002; Walther & Koch, 2006; Serre et al., 2006).
Objects in the visual field must compete for processing within more than 30 different visual cortical areas. As the ability to screen out objects during visual search tasks is contextual and primates often detect a single target in an array of non-targets, detection depends, for all practical purposes, largely on the correlation between targets and non-targets. According to this biased competition model, the targets and non-targets of a scene compete for processing spaces during visual search. There may be biases towards sudden appearances of new objects in the visual field and towards objects that are larger, brighter, faster moving, etc. (Desimone & Duncan, 1995). Many computational models of human visual search have embraced the idea of a saliency map to accomplish preattentive selection. This representation contains the overall neural activity elicited by the objects and non-objects of the space, which compete for processing spaces in the visual search according to primary visual features such as intensity, orientation, color and motion. The formation of feature maps is a consequence of the highly structured receptive fields of cells in the lateral geniculate nucleus (LGN) and, notably, V1. Some well-established neurobiological evidence supports the existence of this neuronal map; other evidence, however, rejects the idea of a topographical representation standing for the overall saliency of visual stimuli and instead attributes selectivity to interactions among feature maps, each encoding the saliency of objects in a specific feature (Itti & Koch, 2000).
Modeling visual attention mechanisms appears to hold reasonably high promise, and its application to microcalcification detection is the main topic and purpose of this chapter. In this approach we perform pre-processing and post-processing stages using several computer vision algorithms. This allows us to identify the potential of the neurobiologically-inspired model of visual mechanisms as part of a CAD scheme. We also give some relevant comparisons with our previous approach (Ramirez-Villegas et al., 2010).

The proposed algorithm
The algorithm proposed in this book chapter is illustrated in Figure 1. The overall procedure is divided into six stages: (1) Mammographic images were taken from the Mini-MIAS Database of Mammograms (see sub-section 3.1 for a detailed description of the data); (2) The region of interest (ROI) was cropped using the information available in the description section of the database; specifically, we took into account the location and the approximate radius of the circle enclosing the abnormalities (microcalcifications); (3) Adaptive histogram equalization and the so-called top-hat algorithm were applied as pre-processing steps in order to enhance the traces of the microcalcifications; (4) A pre-attentive bottom-up visual model was implemented in order to preliminarily distinguish between calcified and non-calcified tissue; (5) Tukey outlier test-based segmentation was used to perform the final segmentation of sub-regions from the simulated gaze allocation outcomes obtained in the previous step; (6) Finally, a Self-Organizing Map (SOM) neural network was implemented in order to topologically adjust the microcalcifications and to provide a final visual output.
Fig. 1. Overview of the proposed approach.

Mammographic database
In this work, a total of 23 mammographic images containing microcalcified tissue were taken from the Mini-MIAS Database of Mammograms (Suckling, 1994), which is widely used by researchers in the area of CAD of breast cancer to carry out and benchmark their research work. We have used this database in our previous research (Ramirez-Villegas et al., 2010; Ramirez-Villegas & Ramirez-Moreno, 2011). The database provides appropriate details of the pathologies and general characteristics of the mammograms: the MIAS database reference number; the character of the background tissue (fatty, fatty-glandular or dense-glandular); the class of abnormality (calcification, well-defined/circumscribed masses, spiculated masses, ill-defined masses, architectural distortion, asymmetry or normal); the severity of the abnormality (benign or malignant); the (x, y) image coordinates of the centre of the abnormality; and the approximate radius (in pixels) of a circle enclosing the abnormality. The resolution of the images is 200 micron pixel edge, and every image is 1024 x 1024 pixels. The images are centered in the matrix.
All ROIs (calcified tissue samples) were selected using the reference given in the description of the database.
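The cropping step can be sketched as follows. This is only an illustration under stated assumptions: the image is held as a list of rows with row 0 at the top, the annotated y coordinate is assumed to be measured from the bottom edge of the image (check the database documentation for your copy), and the function name `crop_roi` and its `margin` parameter are hypothetical.

```python
def crop_roi(image, cx, cy, radius, margin=0):
    """Crop a square ROI around an annotated abnormality.

    image  : list of rows (row 0 is the top of the image)
    cx, cy : annotated centre; cy is ASSUMED to be measured from the bottom edge
    radius : approximate radius (pixels) of the circle enclosing the abnormality
    margin : hypothetical extra border, not part of the database description
    """
    height, width = len(image), len(image[0])
    row_c = (height - 1) - cy            # flip y to a top-left origin
    half = radius + margin
    top = max(0, row_c - half)           # clip the window to the image
    bottom = min(height, row_c + half + 1)
    left = max(0, cx - half)
    right = min(width, cx + half + 1)
    return [row[left:right] for row in image[top:bottom]]
```

Away from the borders the ROI is a square of side `2 * (radius + margin) + 1` centred on the annotation; near a border it is simply clipped.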

Mammograms enhancement
Enhancement algorithms have been employed for the improvement of contrast features and the suppression of noise (Papadopoulos, 2008). They are commonly used to increase the radiologist's detection effectiveness or as pre-processing stages of CAD schemes. In the pre-processing module, the significant features of the mammogram are enhanced, recovering most of the hidden characteristics and improving the image quality. According to recent findings (Papadopoulos, 2008), the pre-processing module makes a clear contribution to the detection ability of the CAD system. Consequently, the final outcome of the CAD scheme depends largely on the pre-processing steps.
The pre-processing stage of the current approach is divided into two parts: (1) contrast enhancement and (2) microcalcification enhancement by the so-called top-hat algorithm. The usefulness of these methods is reported in the literature, along with their potential to enhance the signs present in mammographic images.

Adaptive Histogram Equalization (AHE)
In our previous work (Ramirez-Villegas et al., 2010; Ramirez-Villegas & Ramirez-Moreno, 2011), we implemented Adaptive Histogram Equalization (AHE) as a pre-processing stage. According to our findings, this technique can be applied to enhance the high-frequency components of the image, i.e., microcalcifications, owing to the computations applied to the central and contextual region pixels. To avoid noise amplification, especially in homogeneous areas, a contrast-limited equalization can be performed. This method exhibits improvements over Local-Area Histogram Equalization (LAHE), which presents a high computational load and noise magnification because a standard histogram equalization is computed for each pixel taking into account its neighborhood (contextual region).
To decrease the computational load, the equalization can be computed only for some pixels (and their contextual regions) by dividing the image into a mosaic; the mapping is computed for the central pixel of each tile, and the values of the remaining pixels are obtained using a standard interpolation method. In this way, the equalization of each contextual region influences a spatial zone twice its side length.
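The final value of each pixel is obtained by applying a pixel mapping that interpolates the mappings $N$ of the surrounding contextual regions (the mapping of the upper-left area, the mapping of the lower-left area, and so on), weighted according to the position of the pixel relative to those regions.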
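The interpolation idea can be illustrated with a short sketch. It assumes bilinear weighting of four tile mappings, which is a common choice for AHE but is not confirmed by the text; the function names, the lookup-table representation and the offsets `fx`, `fy` are hypothetical.

```python
def equalization_mapping(tile_pixels, levels=256):
    """Histogram-equalization lookup table for one contextual region."""
    hist = [0] * levels
    for p in tile_pixels:
        hist[p] += 1
    total = len(tile_pixels)
    lut, cumulative = [], 0
    for count in hist:
        cumulative += count
        lut.append(round((levels - 1) * cumulative / total))
    return lut


def interpolate_mappings(value, fx, fy, m_ul, m_ur, m_ll, m_lr):
    """Blend the mappings of the four surrounding tiles (ASSUMED bilinear).

    fx, fy : horizontal / vertical offsets in [0, 1], measured from the
             centre of the upper-left tile towards the lower-right tile
    m_*    : lookup tables of the upper-left, upper-right, lower-left and
             lower-right contextual regions
    """
    top = (1 - fx) * m_ul[value] + fx * m_ur[value]
    bottom = (1 - fx) * m_ll[value] + fx * m_lr[value]
    return (1 - fy) * top + fy * bottom
```

A pixel located exactly at a tile centre (`fx = fy = 0`) receives that tile's mapping unchanged, while pixels in between receive a smooth blend, which avoids visible tile boundaries.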

Top-hat algorithm
Background removal and microcalcification enhancement are considered necessary procedures in many CAD applications, given the poor initial visibility and detectability of such mammographic signs. Morphological operations can be employed to enhance mammographic images at a reasonable computational load. A large class of filters can be represented by mathematical morphology using two simple operations: erosion and dilation. When the gray levels of both the signal and the background of an image are constant, a standard image thresholding procedure can be performed to detect objects. Nonetheless, the top-hat algorithm becomes a very good choice when the gray levels of the background are highly non-uniform, as is the case for the mammary tissue in a mammographic image.
The top-hat algorithm consists of a standard pixel-by-pixel subtraction of the opened version of the image from the original image. The image opening is defined as the erosion of the image followed by its dilation. Erosion is the morphological operation in which the pixel located at the center of the structuring element is replaced by the minimum value of the pixels in the neighborhood. Hence, this operation removes bright regions that are smaller than the structuring element. Dilation is the opposite morphological operation; in this case, the pixel located at the center of the structuring element is replaced by the maximum value of the pixels in the neighborhood. Consequently, this operation enlarges the regions of the image with high gray levels which did not disappear as a result of the erosion step.
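The top-hat algorithm can be formulated as follows:

$$T(x,y) = A(x,y) - (A \circ B)(x,y)$$

where

$$A \circ B = (A \ominus B) \oplus B$$

is the opening of the image $A(x,y)$ by a structuring element $B(x,y)$, and $\ominus$ and $\oplus$ denote erosion and dilation, respectively.

As images are functions mapping a Euclidean space $E$ into $\mathbb{R} \cup \{\infty, -\infty\}$, where $\mathbb{R}$ is the set of real numbers, the grayscale erosion and dilation of $A(x,y)$ by $B(x,y)$ are given, respectively, by:

$$(A \ominus B)(x,y) = \min_{(x',y') \in B} \left[ A(x+x',\, y+y') - B(x',y') \right]$$

$$(A \oplus B)(x,y) = \max_{(x',y') \in B} \left[ A(x-x',\, y-y') + B(x',y') \right]$$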
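A minimal sketch of the top-hat operation follows, assuming a flat, square structuring element of side `2k + 1` and windows that are simply truncated at the image border; both choices, and the function names, are illustrative assumptions rather than the settings used in this work.

```python
def _window_filter(img, k, pick):
    """Apply `pick` (min or max) over a (2k+1) x (2k+1) window, clipped at borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - k), min(h, y + k + 1))
                      for i in range(max(0, x - k), min(w, x + k + 1))]
            row.append(pick(window))
        out.append(row)
    return out


def erode(img, k):
    """Grayscale erosion with a flat square structuring element."""
    return _window_filter(img, k, min)


def dilate(img, k):
    """Grayscale dilation with a flat square structuring element."""
    return _window_filter(img, k, max)


def top_hat(img, k):
    """Original image minus its opening (erosion followed by dilation)."""
    opened = dilate(erode(img, k), k)
    return [[a - o for a, o in zip(row_a, row_o)]
            for row_a, row_o in zip(img, opened)]
```

With this definition, a bright spot smaller than the structuring element survives the subtraction while a flat background is mapped to zero, which is the behaviour one wants for microcalcification-like structures.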

Background suppression (revised method)
The enhancement stage must be sensitive enough to emphasize small low-contrast objects, while having the specificity required to suppress the background. The background usually corresponds to smooth fractions of the image that arise from the tissue characteristics and the image acquisition process; consequently, these areas are softened regions of the image which, in many cases, give no relevant information about pathologies. In our previous work (Ramirez-Villegas et al., 2010), the suppression was performed using difference of Gaussians filters according to Eq. (6):

$$I'(x,y) = I(x,y) - I(x,y) * \mathrm{DoG}(x,y) \qquad (6)$$

where $I(x,y)$ is the input image, and the additional term of the convolution is the filter function. In this way, the convolution term corresponds to a smoothed version of the input image. The DoG (Difference of Gaussians) is a linear filter used in several artificial vision tasks, which works by subtracting two Gaussian blurs of the image corresponding to different function widths.
The enhancement process with the DoG works in both the spatial and the frequency domain. The performance of the filter is conditioned by the parameters $\sigma_n$ and, in one case, by the estimation of the peaks $A_n$. In Eq. (7), the standard deviation $\sigma_n$ is related to the lateral inhibition of the filter, while the term that follows the peak $A_n$ normalizes the sum of the mask elements to unity. Typically these parameters are determined heuristically, according to the desired performance and to the general characteristics of the microcalcifications and the image. Nevertheless, as a reference method for this research, there are some mathematical expressions (Ochoa, 1996) for determining the DoG parameters according to the average width of the microcalcifications and Marr's ratio (Marr, 1982). For reference, an example of the DoG processing is shown in Figure 2.
As background suppression using DoG filters is a well-known method, it provides a baseline against which to compare the performance of the current approach and, consequently, to express where it stands relative to the existing literature.
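For illustration, a DoG mask can be built as the difference of two unit-sum Gaussian masks. This is a sketch under assumptions: the function names and the odd mask size are hypothetical, each Gaussian is normalized to sum to one, and the surround width is set to 1.6 times the center width, a value commonly associated with Marr's ratio but not stated in the text.

```python
import math


def gaussian_mask(size, sigma):
    """Square Gaussian mask (odd `size`) whose elements sum to one."""
    half = size // 2
    mask = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]
    total = sum(sum(row) for row in mask)
    return [[v / total for v in row] for row in mask]


def dog_mask(size, sigma_center, ratio=1.6):
    """Difference of a narrow (center) and a wide (surround) Gaussian mask.

    ratio : surround/center width ratio; 1.6 is an ASSUMED default.
    """
    center = gaussian_mask(size, sigma_center)
    surround = gaussian_mask(size, ratio * sigma_center)
    return [[c - s for c, s in zip(row_c, row_s)]
            for row_c, row_s in zip(center, surround)]
```

Because both Gaussians are normalized, the resulting mask sums to approximately zero, so flat background regions produce a near-zero response while small bright structures comparable in size to the center Gaussian are emphasized.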

Bottom-up processing in visual cortex
The selection of a part of the available sensory information before a detailed processing stage by intermediate and high visual centers is an ability of the visual system of primates. Koch and Ullman (Koch & Ullman, 1985) introduced the idea of a saliency map to accomplish preattentive selection. A saliency map can be defined as a two-dimensional representation that topographically encodes the saliency of objects in the visual field. The competitive behavior of the neurons in this map gives rise to a single winning location, which corresponds to the most salient object. Subsequently, the next most conspicuous locations are attended in order of decreasing saliency, given the prior inhibition of already attended locations. Microcalcifications are low-contrast conspicuous locations in a background of distractors (surrounding mammary tissue and noisy regions). The competitive behavior of the neurons in the early stages of visual processing guarantees a biased competition in favor of certain objects of the space, based on the characteristics which make them 'unique'.
But how do unique features attract attention? Experimental evidence shows that neural structures in the Lateral Geniculate Nucleus (LGN) and primary visual cortex (V1) are responsive to features which are common to all objects of the visual field, e.g., intensity, orientation, color opponency, motion and stereo disparity, among others. In this work, we assume that visual input is represented in the form of iconic, topographic feature maps. In order to construct such representations, we use center-surround computations in every feature at different spatial scales, together with within-feature spatial competition (Itti & Koch, 2000). All the information contained in these maps is combined to obtain a single representation, i.e., the saliency map. This part of our approach computes saliency using two features studied by Itti et al., 1998, for the visual attention model originally proposed by Koch & Ullman, 1985: intensity and orientation. These features are organized into 30 maps (6 for intensity and 24 for orientation; a detailed explanation is given further on). These maps are combined using across-scale sums in order to obtain the conspicuity maps, which provide input for a unique saliency map (central representation). Figure 3 gives an overview of this processing step.
This model is limited to selective attention driven by the properties of the visual stimuli; consequently, it does not involve any volition-dependent process (top-down visual processing). Low-level visual features are extracted directly from the input image over different resolution scales using pyramid-like linear filters, i.e., the so-called Gaussian pyramids. This approach consists of successive filtering and subsampling of the input image (Burt & Adelson, 1983), as illustrated by the following equations:

$$G_0(i,j) = I(i,j)$$

$$G_l(i,j) = \sum_{m=-2}^{2} \sum_{n=-2}^{2} w(m,n)\, G_{l-1}(2i+m,\, 2j+n)$$

where $0 < l < N$, $0 \le i < C_l$ and $0 \le j < R_l$; $N$ is the number of levels of the pyramid, and $C_l$ and $R_l$ are the dimensions of the image at the $l$th level. Finally, $w$ is defined according to Eq. (9) and Eq. (10):

$$w(m,n) = \hat{w}(m)\, \hat{w}(n) \qquad (9)$$

where $\hat{w}$ is a normalized and symmetric function:

$$\sum_{m=-2}^{2} \hat{w}(m) = 1, \qquad \hat{w}(m) = \hat{w}(-m) \qquad (10)$$

Typically the value of $a$ is 0.4 and the value of $b$ is 0.25; consequently, the values of $\hat{w}(x)$ are given by Eq. (11):

$$\hat{w}(0) = a = 0.4, \qquad \hat{w}(\pm 1) = b = 0.25, \qquad \hat{w}(\pm 2) = \tfrac{1}{4} - \tfrac{a}{2} = 0.05 \qquad (11)$$

Nine spatial scales are created using the Gaussian pyramid scheme. This approach yields horizontal and vertical image reduction factors from 1:1 (scale zero) to 1:256 (scale eight) in eight octaves.
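The pyramid construction can be sketched as below, using the five-tap generating kernel with a = 0.4 and b = 0.25 in the style of Burt & Adelson. Border pixels are handled by clamping indices to the image, and odd dimensions are rounded up when halving; these two choices and the function names are assumptions made for the example.

```python
# Separable generating kernel: w_hat(0) = 0.4, w_hat(+-1) = 0.25, w_hat(+-2) = 0.05
W_HAT = [0.05, 0.25, 0.4, 0.25, 0.05]


def reduce_level(img):
    """Low-pass filter with w(m, n) = w_hat(m) * w_hat(n) and subsample by two."""
    h, w = len(img), len(img[0])
    out = []
    for i in range((h + 1) // 2):
        row = []
        for j in range((w + 1) // 2):
            acc = 0.0
            for m in range(-2, 3):
                for n in range(-2, 3):
                    y = min(max(2 * i + m, 0), h - 1)   # clamp at borders (assumption)
                    x = min(max(2 * j + n, 0), w - 1)
                    acc += W_HAT[m + 2] * W_HAT[n + 2] * img[y][x]
            row.append(acc)
        out.append(row)
    return out


def gaussian_pyramid(img, reductions):
    """Return [G_0, G_1, ..., G_reductions], with G_0 the input image."""
    pyramid = [img]
    for _ in range(reductions):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid
```

Calling `gaussian_pyramid(img, 8)` on a sufficiently large image yields nine scales, from 1:1 down to 1:256.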
Subsequently, each feature is calculated using the center-surround scheme, which is closely related to visual receptive fields. Such center-surround differences are calculated between fine and coarse resolution scales in every feature: the receptive center corresponds to a pixel at resolution level $c \in \{2, 3, 4\}$ in the pyramid, and the surround is the corresponding pixel at level $s = c + \delta$, with $\delta \in \{3, 4\}$. As a result of the combinations of center and surround resolution levels, we obtain a total of six feature maps.
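Intensity contrast is extracted by standard band-pass filtering to calculate center-surround differences between the established resolution levels:

$$I(c,s) = \left| I(c) \ominus I(s) \right|$$

where $I(c)$ is the center intensity signal, $I(s)$ is the surround intensity signal and the symbol "$\ominus$" is termed across-scale subtraction, i.e., the point-by-point subtraction of images of different resolutions after interpolation to the finer scale (Greenspan et al., 1994). Accordingly, orientation contrast is defined as:

$$O(c,s,\theta) = \left| O(c,\theta) \ominus O(s,\theta) \right|$$

where

$$O(\sigma,\theta) = \left\| I(\sigma) * G_{0}(\theta) \right\| + \left\| I(\sigma) * G_{\pi/2}(\theta) \right\|$$

where $\sigma$ is the resolution level, and

$$G_{\psi}(x,y,\theta) = \exp\left( -\frac{x'^2 + \gamma^2 y'^2}{2\delta^2} \right) \cos\left( \frac{2\pi x'}{\lambda} + \psi \right)$$

with $\psi = 0$ and $\psi = \pi/2$ are the even and odd Gabor filters, respectively, with aspect ratio $\gamma$, standard deviation $\delta$, wavelength $\lambda$, phase $\psi$, and coordinates rotated by $\theta$:

$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta$$

Once we obtain the 30 feature maps (6 for intensity and 24 for orientation), feature maps of the same type are linearly combined and, consequently, we obtain two conspicuity maps (one for each feature):

$$\bar{I} = \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} N\big( I(c,s) \big)$$

$$\bar{O} = \sum_{\theta} N\left( \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} N\big( O(c,s,\theta) \big) \right)$$

where "$\bigoplus$" denotes the across-scale sum. The purpose of the function $N(\cdot)$ is to normalize each conspicuity map. The simplest procedure to achieve such normalization is to adjust the dynamic range of the maps. However, it is also possible to obtain a normalized map in an iterative or trained way (Itti & Koch, 2000).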
All conspicuity maps are linearly combined into one saliency map according to Eq. (21).
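The bookkeeping of the feature maps and the final combination can be sketched as follows. The center levels {2, 3, 4} and scale offsets {3, 4} follow the scheme of Itti et al., 1998, and are assumed here; the normalization is the simple dynamic-range adjustment, and the equal weighting of the two conspicuity maps is also an assumption. All function names are hypothetical.

```python
def center_surround_pairs(centers=(2, 3, 4), deltas=(3, 4)):
    """Enumerate (center, surround) pyramid levels; defaults are ASSUMED."""
    return [(c, c + d) for c in centers for d in deltas]


def normalize_range(m):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in m for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / (hi - lo) for v in row] for row in m]


def saliency_map(intensity_conspicuity, orientation_conspicuity):
    """Average the two normalized conspicuity maps (equal weights ASSUMED)."""
    ni = normalize_range(intensity_conspicuity)
    no = normalize_range(orientation_conspicuity)
    return [[0.5 * (a + b) for a, b in zip(row_i, row_o)]
            for row_i, row_o in zip(ni, no)]
```

With three center levels and two offsets there are six center-surround pairs, which gives 6 intensity maps and, with four orientations, 24 orientation maps, matching the 30 maps mentioned in the text.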
Finally, as the objects in the space compete for processing spaces during visual processing, the locations in the saliency map compete for the highest saliency value under a winner-take-all (WTA) strategy. This means that the next location to be attended, $(x, y)$, is the most salient one in the saliency map; subsequently, the saliency map is inhibited at that location by means of the so-called inhibition of return mechanism, allowing the model to simulate a visual scan path over the whole content of the image.
WTA models have been widely implemented for decision making from a neurobiologically-inspired perspective (Koch & Ullman, 1985; Itti et al., 1998; Walther & Koch, 2006). It should be noted that, in a neuronally plausible implementation, the saliency map could be modeled as a layer of leaky integrate-and-fire neurons, as a backwards WTA selection mechanism (Walther & Koch, 2006) or as a layer of neurons with logistic profiles implemented in the form of mean field equations (Ramirez-Moreno & Ramirez-Villegas, 2011). In the case of leaky integrate-and-fire neurons, when a threshold potential is reached, a prototypical spike is generated and the capacitive charge of the neuron is shunted to zero (note that the neurons here are RC circuit-based models). The synaptic interactions among the units therefore ensure that only the most active location of the saliency map remains, while the potentials elicited by other locations are suppressed. Similarly, using the mean field approach in a network of neural populations, the WTA behavior emerges directly from the competitive behavior of the units: inhibitory and local excitatory connections among the neurons of the same layer cause the most active location to rise above the others (Ramirez-Moreno & Ramirez-Villegas, 2011).
As the main aim of the current approach is not to reproduce the brain dynamics in a one-to-one implementation, we simply select the most active location in the saliency map $S$ in order to define the position where the model should attend; hence, we define the most salient location as follows:

$$(x_w, y_w) = \arg\max_{(x,y)} S(x,y)$$

Under this strategy, the focus of attention (FOA) is shifted to the location of the winner neuron. Local inhibition must then be applied in an area around the location of the FOA, in order to allow the system to determine a new winning location and thus produce a new attentional shift. To reproduce this inhibition of return mechanism, when the most active location in the map is selected, a small excitation is activated in the surround of the FOA (Koch & Ullman, 1985). Consequently, the shape of the FOA can be approximated by a disk whose radius is fixed according to the average width of the microcalcifications (in this work, we compared the performance obtained using radii of 2, 3 and 5 pixels); subsequently, this location is inhibited by setting its activity to zero.
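The simplified winner-take-all selection with inhibition of return can be sketched as below. The sketch only implements the arg-max selection and the zeroing of a disk-shaped FOA; it does not model the small excitation in the surround, and the stopping rule (no positive saliency left) and the function name are assumptions.

```python
def scan_path(saliency, n_fixations, radius):
    """Return up to n_fixations FOA centres (x, y) in order of decreasing saliency.

    saliency : list of rows with non-negative values (left unmodified)
    radius   : radius in pixels of the disk inhibited around each FOA
    """
    s = [row[:] for row in saliency]          # work on a copy
    h, w = len(s), len(s[0])
    foas = []
    for _ in range(n_fixations):
        # Winner-take-all: pick the most active location
        value, y, x = max(((s[j][i], j, i) for j in range(h) for i in range(w)),
                          key=lambda t: t[0])
        if value <= 0:                        # nothing left to attend (assumption)
            break
        foas.append((x, y))
        # Inhibition of return: set the activity inside the FOA disk to zero
        for j in range(max(0, y - radius), min(h, y + radius + 1)):
            for i in range(max(0, x - radius), min(w, x + radius + 1)):
                if (i - x) ** 2 + (j - y) ** 2 <= radius ** 2:
                    s[j][i] = 0.0
    return foas
```

A larger `radius` suppresses more of the neighborhood of each winner, so nearby secondary peaks are skipped; this is why the choice of FOA radius relative to the microcalcification width matters.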

Serial segmentation procedure
The collected images frequently vary in quality (from satisfactory to poor), which gives each image its own gray-level contrast characteristics (Ramirez-Villegas et al., 2010), determined by the tissue characteristics and the image acquisition process. Furthermore, regions of images such as mammograms are amenable to several segmentation algorithms. The image segmentation procedure must be specific enough to avoid false positives introduced by the enhancement process.
In statistical analysis, when outliers are present, the estimates of the data are distorted. Consequently, these estimates are not suitable for making inferences about the data. In this case, the erroneous values should be eliminated before subsequent analysis.
The Tukey outlier test (Hoaglin et al., 1983) makes no assumption about the specific distribution of the data series. This method is based on the supposition that any distribution has a group of typical values surrounded by atypical data (i.e., outliers) that stretch the range of the histogram. The larger the sample size, the higher the probability of getting at least one outlier. The Tukey outlier test is based on at least two assumptions: (1) that the central part of the distribution contains most of the information about the genuine reference values; and (2) that outliers may be detected as values lying outside certain limits, taking into account the statistical properties of this central part.
In our work, we implemented this outlier detector as a serial segmentation algorithm using the FOAs determined by the saliency-based bottom-up approach described in Section 3.4. The upper and lower limits of the test are given by:

$$U = q_3 + 1.5\,(q_3 - q_1)$$

$$L = q_1 - 1.5\,(q_3 - q_1)$$

where $q_1$ and $q_3$ denote the first and third quartiles of the sample, respectively. Once these limits are obtained, pixels above $U$ and below $L$ are considered outliers of the distribution. As microcalcifications at this stage of the approach appear as very bright regions with atypical gray level values (outliers in the distribution of the resulting processed image), the threshold used to segment them is equal to $U$.
From a neural networks perspective, the segmentation procedure proposed in this work can be seen as a node with a hard-limit transfer function thresholded at $U$. Under this scheme, the typical gray values of the distribution are discarded (set to zero) and the others are transferred to the next processing step (the SOM neural network).
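The thresholding step can be illustrated with a short sketch. It assumes the conventional fence multiplier of 1.5 and quartiles computed by linear interpolation between order statistics; other quartile conventions give slightly different limits. The function names are hypothetical, and retained pixels keep their gray value here, which is one possible reading of the hard-limit description.

```python
def quartiles(values):
    """First and third quartiles by linear interpolation between order statistics."""
    v = sorted(values)

    def quantile(p):
        pos = p * (len(v) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(v) - 1)
        return v[lo] + (pos - lo) * (v[hi] - v[lo])

    return quantile(0.25), quantile(0.75)


def tukey_fences(values, k=1.5):
    """Return (L, U); k = 1.5 is the conventional multiplier (assumption)."""
    q1, q3 = quartiles(values)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def segment_patch(patch, k=1.5):
    """Keep only pixels above the upper fence U; set all others to zero."""
    flat = [p for row in patch for p in row]
    _, upper = tukey_fences(flat, k)
    return [[p if p > upper else 0 for p in row] for row in patch]
```

Applied to the gray values inside a FOA, only the atypically bright pixels survive, and these are the candidates passed on to the SOM stage.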

Self-organizing map (SOM) neural network
The final stage of the approach reported in this chapter, is the implementation of a SOM neural network in order to topologically adjust the microcalcifications and show the final outcome for diagnosis purposes.Figure 4 illustrates the architecture of the neural network with the saliency map as input.
Self-Organizing Maps (SOM) have been widely used for a plethora of tasks, much like other neural networks, e.g., pattern recognition, vision systems and signal processing, among others. In SOM-like neural networks, neighboring cells compete through mutual lateral interactions and develop adaptively into specific detectors of different signal patterns (Kohonen, 1990). Each point of the input data shaping the structure of an N-dimensional space determines the spatial location of the weight of a cell in the network. Consequently, the network is capable of giving a categorization of the input space. Let $X_i = [x_{i1}, x_{i2}]^T$ be a two-dimensional input vector in the segmented saliency map. The weight vector of node $j$ in the SOM layer is therefore denoted by $w_j = [w_{j1}, w_{j2}]^T$. We define an analytical measure of match between $X_i$ and $w_j$. The simplest way to define the match may be the inner product $X_i^T w_j$; however, the Euclidean distance gives a better and more convenient matching criterion (Kohonen, 1990). The Euclidean distance between the input pattern and the vector of weights is defined as:

$$d_j = \lVert X_i - w_j \rVert = \sqrt{\sum_{k=1}^{2} \left(x_{ik} - w_{jk}\right)^2}$$

The minimum Euclidean distance defines the winner neuron at the current iteration. Hence, a single neuron $c$ is chosen such that:

$$\lVert X_i - w_c \rVert = \min_j \lVert X_i - w_j \rVert$$

Lateral interactions among the units are enforced by defining a neighborhood set $n'$ around the winner unit. At each learning step the cells within the neighborhood are updated. Depending on the neighborhood function, the cells outside $n'$ are left intact or almost intact. This function defines the adaptation strength among the neurons of the map: the closer a node is to the winner, the stronger its adaptation. In our work we used an elliptical Gaussian function, which according to our experimentation gave a robust solution to the topologic adjustment task. The parameters of the function (the Gaussian widths) define the size of the neighborhood. Typically, this size changes according to a monotonically decreasing function throughout the whole training procedure. In the current implementation, this function is defined for either Gaussian width in terms of $t$, the current training cycle, and $T$, the total number of training cycles. The initial $\sigma_{n,0}$ and final $\sigma_{n,f}$ neighborhood sizes can be estimated according to the map size (the neurons' distribution), the segmented saliency map size, or in a heuristic way. It should be noted that a wide initial neighborhood first induces a rough global order in the $w_j$ values, after which narrowing the neighborhood improves the spatial resolution of the map.
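The winner selection and the neighborhood function described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names are hypothetical, the elliptical Gaussian is written with one width per grid axis, and the exponential width schedule is an assumed form of a monotonically decreasing function, not necessarily the one used in this work.

```python
import math

def find_winner(x, weights):
    """Index of the weight vector closest to x in Euclidean distance."""
    return min(range(len(weights)), key=lambda j: math.dist(x, weights[j]))

def elliptical_gaussian(grid_pos, winner_pos, sigma_row, sigma_col):
    """Adaptation strength of a node given its grid offset from the winner."""
    dr = grid_pos[0] - winner_pos[0]
    dc = grid_pos[1] - winner_pos[1]
    return math.exp(-(dr ** 2 / (2 * sigma_row ** 2)
                      + dc ** 2 / (2 * sigma_col ** 2)))

def width_schedule(t, total, sigma_0, sigma_f):
    """Assumed exponential decay from sigma_0 (t=0) to sigma_f (t=total)."""
    return sigma_0 * (sigma_f / sigma_0) ** (t / total)

# Hypothetical weights of three nodes and one input point.
weights = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
winner = find_winner((4.0, 6.0), weights)
strength_at_winner = elliptical_gaussian((2, 3), (2, 3), 1.0, 2.0)
print(winner, strength_at_winner, width_schedule(250, 500, 5.0, 0.5))
```

With different widths per axis, nodes at the same grid distance from the winner receive different adaptation strengths, which is what makes the neighborhood elliptical rather than circular.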
Finally, the updating process of the weights is given by the following equation:

$$w_j(t+1) = w_j(t) + \alpha(t)\, h_{cj}(t)\, \left[X_i - w_j(t)\right]$$

where $h_{cj}(t)$ is the neighborhood function evaluated between the winner $c$ and node $j$, and $\alpha(t)$ is the so-called adaptation gain, which is related to the rate at which the network learns the topology of the input space. Typically, this parameter is also described by a monotonically decreasing function. In our work, it is defined in terms of the initial $\alpha_0$ and final $\alpha_f$ learning rates, which must be small values.

For illustrative purposes, a topology adjustment example by a SOM network is given in Figure 5. In this example, the input space is a square-shaped random distribution of points, while the network is (initially) a circle-shaped array of interconnected units. Note that the weight vectors tend to approximate the density function of the input vectors after a few training cycles (the blue edges indicate that the neurons are neighbors in the grid).
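To make the update rule concrete, the sketch below applies one SOM step (winner search followed by a neighborhood-weighted move of every weight towards the input) and wraps it in a small training loop. It assumes the standard Kohonen update with a Gaussian neighborhood; the exponential schedules, the default rates and widths, and all names are hypothetical choices for illustration.

```python
import math

def som_step(weights, grid, x, alpha, sigmas):
    """One update: w_j <- w_j + alpha * h_cj * (x - w_j) for every node j."""
    c = min(range(len(weights)), key=lambda j: math.dist(x, weights[j]))
    for j, w in enumerate(weights):
        dr = grid[j][0] - grid[c][0]
        dc = grid[j][1] - grid[c][1]
        h = math.exp(-(dr ** 2 / (2 * sigmas[0] ** 2)
                       + dc ** 2 / (2 * sigmas[1] ** 2)))
        weights[j] = [wk + alpha * h * (xk - wk) for wk, xk in zip(w, x)]
    return c

def decay(t, total, start, end):
    """Assumed exponential schedule from start (t=0) towards end (t=total)."""
    return start * (end / start) ** (t / total)

def train(weights, grid, samples, cycles, alpha=(0.1, 0.01), sigma=(2.0, 0.5)):
    """Run the given number of training cycles over all samples (in place)."""
    for t in range(cycles):
        a = decay(t, cycles, *alpha)
        s = decay(t, cycles, *sigma)
        for x in samples:
            som_step(weights, grid, x, a, (s, s))
    return weights

# Hypothetical 1x2 map and a single step towards the input (2, 0).
demo_weights = [[0.0, 0.0], [10.0, 10.0]]
demo_grid = [(0, 0), (0, 1)]
demo_winner = som_step(demo_weights, demo_grid, [2.0, 0.0], 0.5, (1.0, 1.0))
print(demo_winner, demo_weights)
```

In the single step shown, the winner moves halfway towards the input, while its grid neighbor moves by a smaller fraction set by the neighborhood function.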

Results
As mentioned above, we tested our approach using the Mini-MIAS Database of Mammograms (Suckling, 1994), which is widely used by researchers in the area of CAD of breast cancer to carry out and evaluate their work. From this database, a total of 23 mammographic images (those containing microcalcifications) were included in our study. The background and tissue character in the images enabled us to test the algorithm under a variety of conditions. ROIs were extracted according to the specifications of the database in the form of square regions enclosing the microcalcification clusters. In some cases calcifications were widely distributed throughout the mammogram rather than concentrated at single sites; in these cases several ROIs containing microcalcifications were extracted from the image. Subsequently, all the processing steps were performed according to Figure 1.
In this section we present the main outcomes of the proposed methodology. In Section 4.1 we give some relevant examples to illustrate how the proposed CAD application operates during mammogram inspection. In Section 4.2 we present comparative Free-response Receiver Operating Characteristic (FROC) curves to test the outcome of our methodology when varying the radius of the FOAs in the saliency-based bottom-up model. We also compare the proposed algorithm with the DoG approach, and compare the performance obtained in the detection of benign microcalcification signs with that obtained in the detection of malignant ones.

Experimental results
The analysed ROIs containing the microcalcifications in the mammograms vary in radius from 8 to 93 pixels. The SOM neural network was trained for 500 cycles, and the locations of the possibly pathological regions are given as its output.
In Figure 6 and Figure 7, examples of microcalcification detection are presented. Note that after the preprocessing stages (the image histogram equalization and the top-hat algorithm), the saliency-based bottom-up approach reveals the locations of the image to which the visual system should attend. In this case, the visual processing model biases the competition among the different locations of the image in favour of certain objects of the space. The attended conspicuous objects in this case are the microcalcifications present in the mammograms. As the degree of conspicuity of the microcalcifications on the preprocessed images varies, the saliency map activity is somewhat heterogeneous. This illustrates that the neural responses elicited by the objects and the competitive interactions among certain locations in the maps induce one target to rise above the others at a given time instant. In addition, our model incorporates the iterative normalization strategy described by Itti & Koch (2000), which consists of iteratively convolving the feature maps with a 2D DoG filter, adding the result to the original image and setting the negative results to zero after each iteration. We tested the model with a reduced number of iterations (a maximum of 3 iterations) and a small inhibition factor (between 0 and 1) in order to avoid undesired overcompetitive behaviour among the neurons of the map.
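The iterative normalization step can be sketched as follows: at each iteration the map is filtered with a 2D DoG kernel, the response is added to the map, and negative values are set to zero. This is a plain-Python illustration under stated assumptions; the kernel widths, the excitation and inhibition factors (`c_ex`, `c_inh`), the kernel size and the toy feature map are hypothetical values, not the parameters used in this work.

```python
import math

def gaussian_kernel(sigma, half):
    """Normalized 2D Gaussian on a (2*half+1) x (2*half+1) grid."""
    size = range(-half, half + 1)
    k = [[math.exp(-(i * i + j * j) / (2 * sigma ** 2)) for j in size]
         for i in size]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]

def dog_kernel(sigma_ex, sigma_inh, c_ex, c_inh, half):
    """Narrow excitatory Gaussian minus broad inhibitory Gaussian."""
    ex = gaussian_kernel(sigma_ex, half)
    inh = gaussian_kernel(sigma_inh, half)
    return [[c_ex * e - c_inh * i for e, i in zip(re, ri)]
            for re, ri in zip(ex, inh)]

def convolve_same(img, ker):
    """Zero-padded 2D filtering, same-size output (kernel is symmetric)."""
    rows, cols, half = len(img), len(img[0]), len(ker) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i, krow in enumerate(ker):
                for j, kv in enumerate(krow):
                    rr, cc = r + i - half, c + j - half
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += kv * img[rr][cc]
            out[r][c] = acc
    return out

def iterative_normalization(feature_map, ker, iterations=3):
    """Repeat: map <- max(0, map + map filtered by the DoG kernel)."""
    current = [row[:] for row in feature_map]
    for _ in range(iterations):
        response = convolve_same(current, ker)
        current = [[max(0.0, m + d) for m, d in zip(mrow, drow)]
                   for mrow, drow in zip(current, response)]
    return current

# Hypothetical 9x9 feature map with one strong and one weak peak.
ker = dog_kernel(sigma_ex=1.0, sigma_inh=3.0, c_ex=0.5, c_inh=0.5, half=4)
fmap = [[0.0] * 9 for _ in range(9)]
fmap[4][4] = 1.0
fmap[1][1] = 0.3
normalized = iterative_normalization(fmap, ker)
```

Keeping the number of iterations and the inhibition factor small limits how strongly distant locations suppress each other, which is the overcompetitive behaviour the text seeks to avoid.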
Figure 6(e) and Figure 7(e) show the topologic adjustment of microcalcifications performed by the SOM neural network. This result was obtained by training the network over 500 cycles, in which the locations of the possibly pathological regions were given. Although the topological adjustment made by the SOM network is accurate and suitable for the application, some microcalcifications were not associated with any neuron because the number of neurons placed in the input space was limited due to computational load constraints. Further research will be needed to evaluate other schemes for the topologic adjustment task. Additionally, Figure 8 illustrates the FOA shifting for the first six attended locations in the saliency map in the example of Figure 6. In this case, the FOA is represented by a disk with a radius of 2 pixels. Depending on its size, a processed ROI required approximately 16 to 300 shifts (with overlapping) to cover all the possible saliency map locations. Larger ROIs took somewhat longer to be analyzed by the algorithm; in the slowest case the processing steps took approximately one minute. Furthermore, since mammographic regions can be considered highly cluttered scenes, the current state of this model reproduces many classical results in psychophysics.
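The serial shifting of the FOA can be illustrated with a winner-take-all loop: the most salient location is attended first, a disk around it is suppressed, and the next most salient location is selected. The suppression step (inhibition of return), the stopping rule and the names below are assumptions made for this sketch, and the toy saliency map is hypothetical.

```python
def attend_serially(saliency, radius, max_shifts):
    """Return FOA centers in order of decreasing saliency.

    After each winner-take-all pick, a disk of the given radius around
    the winner is set to zero (inhibition of return, assumed here).
    """
    sal = [row[:] for row in saliency]  # work on a copy
    rows, cols = len(sal), len(sal[0])
    centers = []
    for _ in range(max_shifts):
        r, c = max(((i, j) for i in range(rows) for j in range(cols)),
                   key=lambda p: sal[p[0]][p[1]])
        if sal[r][c] <= 0:
            break  # no salient location is left
        centers.append((r, c))
        for i in range(max(0, r - radius), min(rows, r + radius + 1)):
            for j in range(max(0, c - radius), min(cols, c + radius + 1)):
                if (i - r) ** 2 + (j - c) ** 2 <= radius ** 2:
                    sal[i][j] = 0.0
    return centers

# Hypothetical 5x5 saliency map with three peaks, two of them adjacent.
saliency_map = [[0.0] * 5 for _ in range(5)]
saliency_map[1][1] = 0.9
saliency_map[1][2] = 0.8
saliency_map[3][4] = 0.7
foa_centers = attend_serially(saliency_map, radius=1, max_shifts=10)
print(foa_centers)
```

Because the peak at (1, 2) falls inside the disk of the first attended location, it is covered by the first FOA and attention then shifts directly to (3, 4).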

Performance evaluation
The performance of the proposed system is shown by FROC curves in Figure 9 and Figure 10. A FROC (Free-response Receiver Operating Characteristic) curve is the plot of the lesion localization fraction versus the non-lesion localization fraction as the threshold to report a finding is varied; FROC curves are mainly used to objectively evaluate and analyse image processing algorithms, such as imaging CAD algorithms. Increasing the sensitivity of the algorithm so that subtle signs are detected can also increase the number of false positives. The experiments were directed at assessing how the proposed algorithm can improve the diagnosis of pathological signs (in this case, microcalcifications). When a microcalcification (or microcalcification cluster) is detected at the approximate position given in the database specifications, we count a true positive (TP). Otherwise, if a microcalcification (or microcalcification cluster) is detected outside the approximate radius indicated in the database, we count a false positive (FP). Furthermore, the malignancy of the pathologies in diagnostic images of different modalities should be one of the main topics in CAD evaluations, as it provides information about how specific the techniques or approaches are in the detection of pathologies; these pathologies can be characterized by powerful descriptors such as the size of the signs, the character of the background tissue, the type of abnormality (e.g., single or clustered microcalcifications) and the approximate radius of the pathology in each image.
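The TP/FP counting rule above can be turned into FROC points with a short routine. The sketch below assumes that each detection carries a confidence score and that the ground truth lists a center and an approximate radius per lesion; the data layout, names and numbers are hypothetical and serve only to illustrate the counting.

```python
import math

def froc_points(detections, lesions, n_images, thresholds):
    """Return (mean false positives per image, sensitivity) per threshold.

    detections: list of (image_id, x, y, score)
    lesions:    list of (image_id, x, y, radius) from the ground truth
    """
    points = []
    for thr in thresholds:
        hit = set()       # indices of lesions found at this threshold
        false_pos = 0
        for img, x, y, score in detections:
            if score < thr:
                continue
            matches = [k for k, (li, lx, ly, lr) in enumerate(lesions)
                       if li == img and math.hypot(x - lx, y - ly) <= lr]
            if matches:
                hit.update(matches)   # detection inside a lesion radius: TP
            else:
                false_pos += 1        # detection outside every radius: FP
        points.append((false_pos / n_images, len(hit) / len(lesions)))
    return points

# Hypothetical ground truth (two images, one lesion each) and detections.
truth = [(0, 50, 50, 10), (1, 20, 20, 5)]
found = [(0, 52, 49, 0.9), (0, 90, 90, 0.6), (1, 21, 22, 0.4), (1, 70, 10, 0.3)]
curve = froc_points(found, truth, n_images=2, thresholds=[0.8, 0.5, 0.2])
print(curve)
```

Lowering the threshold moves along the curve towards higher sensitivity at the cost of more false positives per image.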
Figure 9 illustrates the FROC curves for different FOA radii. Note that as the FOA becomes narrower, the overall performance of the proposed neurobiologically-inspired algorithm increases. This is an expected effect that emerges from the visual system's features: as the circumscribed region to which attention is directed shrinks, the sensitivity of the system increases. Furthermore, this is a convenient strategy when the scenes are too cluttered and, consequently, difficult to analyse. The maximum performance reached by the proposed algorithm was a sensitivity of approximately 92.0% at one false positive and 100.0% at 1.5 false positives, using a 2-pixel FOA radius. Note that at this stage, the visual cortex model described in this work is limited to the bottom-up control of attention. We have followed this strategy because our main concern is the localization of the stimuli to be attended, not their identification.
From the FROC curves in Figure 10, and specifically from the true positive ratios and the average number of false positives per image, it can be seen that the pre-attentive bottom-up model outperforms the DoG-based approach. In general, DoG kernels exhibit a medium specificity, which allows the use of a single filter to enhance all the microcalcifications under certain conditions. This means that in order to make the system more robust and make microcalcifications of all sizes detectable, a bank of filters would be needed. Although the performance of the DoG filters could be somewhat limited, DoG filters adequately attenuate (to some extent) the low frequencies, which is highly desirable in some of the processing stages. On the other hand, the proposed approach adapts better to all mammographic conditions given the multi-resolution processing strategy of the visual attention model and the center-surround interactions; this makes the model selective enough to enhance the conspicuous locations and suppress low-frequency components as well as some high-frequency ones in a band-pass-like strategy.

Fig. 9. FROC curves illustrating the performance of the proposed approach for different FOA radii (in pixels).

Fig. 10. FROC curves illustrating the performance of (a) the proposed approach; and (b) the DoG approach reported by Ramirez-Villegas et al., 2010, for the malignant and benign cases.

Visual cortex mechanisms
Many computational principles regarding bottom-up and top-down visual processing have emerged from experimental and modeling studies. Different features contribute to perceptual saliency and their weighting can be influenced by top-down modulation (Deco & Rolls, 2004; Peters et al., 2005; Serre et al., 2006). There is experimental evidence of strong interactions among different visual modalities such as color and orientation for certain visual locations; these interactions are subject to top-down modulation and training. On the other hand, the most important issue regarding bottom-up processing is the contrast among features rather than the absolute intensity of each feature. However, the primary visual neurons are not only tuned to some kind of local spatial contrast, but also to neural responses elicited by context, in a structure that extends the range of the classical receptive field.
It is likely that the relative weight of the features which contribute to the most general representation is modulated by higher cortical centers. In this way, the attention process selects the information necessary to discriminate between the distractors and the target, as both bottom-up and top-down processes are carried out to analyze the same scene. Here the top-down process is related to previously acquired knowledge that biases the neural processing competition among the objects; recognition is therefore performed by selecting the next eye movement that maximizes the information gain. The computational challenge, then, lies in the integration of bottom-up and top-down cues, such as to provide coherent control signals for the focus of attention, and in the interplay between attentional orienting and scene or object recognition (Itti & Koch, 2001). Although the current approach models only bottom-up processes, its structure is compatible with the integration of top-down processes. Integrating such processes would likely raise the overall performance of the mammography CAD system.

Could this approach be extended to mass detection? Towards multi-sign detection
Another important issue is the possibility of multi-sign detection, i.e., the detection of multiple signs on the same image modality. Current image processing techniques make the detection of primitive breast abnormalities easier (Verma, 2008); nevertheless, the detection of these abnormalities leads to many false detections, the number of which depends on the robustness of the vision system (Vilarrasa, 2006; Ramirez-Villegas et al., 2010). On the other hand, the modular architectures of most existing systems make it necessary to create separate algorithms for detecting different kinds of cancer signs, e.g., microcalcifications and masses. As these two types of abnormalities are remarkably different in several ways, many researchers have addressed these two diagnostic tasks separately; consequently, the difficulty of detecting cancer rises in direct proportion to the number of algorithms implemented for such tasks (in this case, at least two different processing pathways). Nevertheless, visual attention modeling could be an important step towards the development of a fully comprehensive CAD system for mammographic image analysis. To the best of the knowledge of the authors of this chapter, the potential of such models in the analysis of mammographic images has not yet been addressed or identified. Theoretically, given the features of the visual processes intended to be modeled, any visual cortex-like model would be capable of helping (to some extent) in the analysis of any diagnostic image.

Perspectives: Towards multi-organ and multi-disease CAD
There are many computational approaches which have addressed the problem of diagnosis at a reasonable cost-effectiveness ratio; however, they are commonly single-purpose systems, i.e., their target is the detection of only one disease in one organ. Such schemes are referred to as abnormality-dependent approaches. These approaches work well in the case of a single-purpose CAD; nonetheless, it is not preferable to apply them to a multi-disease CAD, given the disproportionately large number of algorithms that would be necessary to detect every single disease (at least one per disease). The future integration of CAD systems into multi-disease and multi-organ ones would enable their use in diagnosing a large number of target diseases with a fully comprehensive and integral architecture. Conversely, the diversity of acquisition conditions and the features of different kinds of diagnostic images pose additional challenges for well-known CAD processing steps such as segmentation, registration and classification, not to mention that the characteristics of abnormal regions in these images depend largely on the type of disease. Therefore, it is desirable to integrate the diagnostic knowledge of various types of diseases into a universal dictionary of features for diagnosis. Some research efforts have been made to improve multi-organ and multi-disease CAD. As a matter of fact, there is a rising interest in integrating such systems into multi-purpose ones. One reason for this is that cancer, for instance, can spread to other organs in the body. Therefore, even if a single-disease CAD can detect cancer, it would be of quite limited use for predicting metastasis and the complications related to such cancer; moreover, it would also be of little use for detecting cancer in other organs once metastasis has occurred.
Conventional single-disease CAD approaches have addressed as many diseases as the number of computer vision algorithms involved in detecting them. As a matter of fact, the wide range of conditions and characteristics of images is the most cumbersome issue that abnormality-dependent approaches have to face. Moreover, from the viewpoint of computational efficiency, it is not desirable to have as many diagnosis algorithms as the number of target diseases. It has become clear that a more straightforward strategy will be needed to overcome the problem of integrating and understanding all the diagnostic knowledge by processing images and extracting candidate regions whose structures and/or characteristics are not normal. Furthermore, the integration and processing of such a wide variety of multi-modal medical images need to be assembled and implemented as part of picture archiving and communication systems (PACS) in order to be used in clinical situations.
For instance, in clinical situations, it is important that the sensitivity of the system be maintained as high as possible, which is achievable using a complex and strongly wired computational model (such as cortex-like models), and not with an unnecessarily increased number of different and complementary CAD schemes. Furthermore, although a fully comprehensive CAD scheme can be seen as highly robust and integral software, processing images from so many modalities is time consuming; it is therefore likely that diagnosis would not be performed in real time and that image analysis would instead be performed offline. As a first goal towards the development of a multi-organ and multi-disease CAD, the integration of multi-modal medical images and intelligent assistance in the diagnosis of multi-dimensional images has been of great interest (Kobatake, 2007). This task poses additional challenges from the viewpoint of computational efficiency and of the trade-off between the processing efficacy on large datasets and the time taken to reach the diagnosis that supports the decision of the physician. Moreover, the analysis of structures is another important issue to consider for the diagnosis of multiple diseases. For example, the thoracic structure contains at least nine different areas of high diagnostic interest (including the lung area, trachea and pulmonary vessels) related to at least eight pathology-related signs (large lesions, pulmonary nodules attached to vessels and isolated pulmonary nodules, among others). Therefore, the integrated multi-disease and multi-organ CAD system, in this particular case, would extend the standard lung-cancer detection CAD system, irrespective of the methods used to detect the disease signs.
Finally, beyond the quite simple technical predictions of the authors of this book chapter, comprehensive CAD systems and the potential of cortex-like mechanism modeling to overcome detection problems and to raise the sensitivity of disease sign detection will contribute substantially to the development and improvement of the current capabilities of CAD systems.

Fig. 3. Overview of the visual cortex-like bottom-up processing step.

$O(c, \theta)$ and $O(s, \theta)$ are the center and surround orientation signals, respectively. The local orientation maps $O(c, \theta)$ and $O(s, \theta)$ are computed by convolving the levels of the intensity pyramid with standard Gabor filters (note that this procedure can be performed either in the frequency or spatial domain).

Fig. 4. Scheme of the neural network implemented in this work.

Fig. 6. Example illustrating the processing steps of the proposed approach: (a) Equalized ROI; (b) output of the top-hat algorithm; (c) saliency map; (d) segmented image; (e) topologic adjustment of microcalcifications by SOM neural network.

Figure 6(d) and Figure 7(d) illustrate the results of the serial segmentation. Note that the specificity of the bottom-up processing increases with the pattern discrimination obtained after the serial calculation of the Tukey outlier test. For these examples the radius of the FOA was 2 pixels. The white locations in Figure 6(d) and Figure 7(d) were those for which the statistical procedure detected at least one outlier. Furthermore, as in many other relevant situations, our results indicate that it is hard to find an algorithm that can handle all possible scenarios and all mammographic image conditions. In addition, regardless of the distribution of the FOA, in the absence of outliers (microcalcifications), the Tukey statistical test provided a low rate of false detections (high specificity). We performed extensive experiments to evaluate the serial segmentation algorithm by limiting the maximum number of locations attended by the saliency-based bottom-up model; limiting the number of attended locations did not produce large variations in the algorithm's outcome compared with shifting attention across the whole saliency map.

Fig. 7. Example illustrating the processing steps of the proposed approach: (a) Equalized ROI; (b) output of the top-hat algorithm; (c) saliency map; (d) segmented image; (e) topologic adjustment of microcalcifications by SOM neural network.

Fig. 8. Example of the operation of the visual saliency model with the mammographic ROI in Figure 6. Note that once the visual machinery model combines the information of the topographic conspicuity maps into the saliency map, the most salient locations of the scene are attended serially (the black arrows indicate the spatial shifts of the FOA).

Figure 11 illustrates an example of how the saliency-based bottom-up model operates for a mammographic image containing a mass.

Fig. 11. Saliency-based bottom-up approach in the detection of masses: (a) Equalized image; (b) Saliency map; (c) First attended location by the WTA approach.