Colorectal cancer is the third most frequently diagnosed cancer worldwide. The American Cancer Society estimates that there will be almost 100,000 new patients diagnosed with colorectal cancer and that around 50,000 people will die as a consequence of the disease in 2016. Increasing life expectancy and the growing number of diagnostic tests conducted have had a great impact on the number of cancers being detected. Among diagnostic tools, colonoscopy is the most prevalent. To help endoscopists cope with the increasing number of tests that have to be carried out, there is a need to develop automated tools that aid diagnosis. The characteristics of the colon make pre-processing essential to eliminate artefacts that degrade the quality of exploratory images. The goal of this chapter is to describe the most common issues of colonoscopic imagery as well as the existing methods for their detection and correction.
- medical image pre-processing
- specular reflections
- inhomogeneous illumination
- black borders
The unceasing increase in the incidence of colorectal cancer (CRC) in recent decades has led to a rise in the number of medical tests being carried out; in the case at hand, colonoscopies. Specialists consequently have a greater workload and find themselves overwhelmed. As a result, numerous investigations in recent years have focused on developing tools to help with diagnoses, thereby supporting specialists. Development of algorithms for the automatic analysis of colonoscopy imaging requires preliminary pre-processing of the images in order to rectify the multiple factors that detract from their quality.
The objective of this chapter is to shed light on the most common problems encountered in colonoscopic imaging, while also presenting the solutions most frequently used in the scientific community. The aim is thus to supply useful information for developing automatic algorithms, which may then be implemented in robots that automate tasks currently requiring manual interaction.
2. What is a colonoscopy?
Colonoscopy is the reference method for diagnosing and treating colonic diseases, and it is essential to both colorectal screening and monitoring. This examination enables the large intestine to be viewed in its entirety, biopsies to be taken and tumours to be removed.
It has been proven that carrying out this procedure reduces the colon cancer mortality rate. Before undergoing the procedure, the patient must go through a preparation phase so that there is no solid waste in the colon. The procedure is performed by inserting a colonoscope (a flexible tube with a camera at the end) through the anus (see Figure 1). In some cases, a sedative is used so that the procedure can be carried out without causing discomfort. It is the best means of detecting CRC, since it enables lesions to be localised and, in the majority of cases, removed immediately.
3. Main problems of colonoscopies
The principal difficulties in obtaining colonoscopy images are described below. In many cases, they are the result of the equipment used or of the imaging environment.
Black mask: the lenses used in the colonoscopy image capturing system have a black frame around the edge. In many cases, the mask is used to convey information pertaining either to the patient or to the test being carried out. This black frame hinders the development of digital image processing algorithms, since it creates false borders and enlarges the area to be analysed without yielding useful information. For these reasons, it becomes necessary to apply techniques that eliminate its effects. Figure 2 shows the black mask in colonoscopy imaging.
Ghost colours: the problem of ghost colours (see Figure 3) is linked to a lack of synchronisation of the colour channels. It appears because most colonoscopy equipment uses monochromatic cameras, in which the R, G and B components are obtained at different times. This reduces the quality of the image and makes the subsequent development of digital image processing algorithms difficult.
Interlacing: interlacing allows twice the number of frames per second to be taken without consuming additional bandwidth. It is used in standard formats such as National Television System Committee (NTSC) or phase alternating line (PAL), and shows half of the horizontal lines in each iteration. Each frame is divided into two fields: the first contains the odd-numbered lines and the second the even-numbered lines. Owing to persistence of vision, the brain merges both fields of the interlaced frame and perceives them as one image. Interlacing causes false outlines to appear in the images (see Figure 4), which complicates the development of algorithms. It is therefore necessary to implement techniques that reduce its effects.
Specular highlights: specular highlights (see Figure 5) are points of high intensity in the image due to the illumination of shiny objects. When a source of light shines directly on an object, the light is reflected and captured by the camera. This process generates heavily saturated areas in the image, which can lead to unwanted outlines and make subsequent processing of the image difficult. This effect is extremely important in the detection of polyps, which are generally rounded and similar to tumours. Owing to their shape, they reflect light and generate specular highlights when illuminated, which can cause the developed algorithms to malfunction.
Uneven lighting: variations in the intensity and direction of lighting are decisive in the appearance of objects in digital images. The illumination of the colon in a colonoscopy is variable, which, because of the colon’s three-dimensional shape, causes shadows to appear, accentuating or diminishing certain aspects of the image. Varying degrees of illumination cause the same object to be represented differently, which makes this variability in lighting undesirable. In the literature, there are numerous publications that address this problem. Figure 6 shows an example of uneven lighting in colonoscopic imaging.
4. What is the pre-processing of colonoscopic images?
Every image capturing process is affected in some way by factors that reduce the quality of the image to some degree. Colonoscopic imaging is no exception, so it is necessary to implement techniques that help to improve the quality and thereby obtain a better visual representation.
Any technique whose objective is to contrast, highlight, accentuate or remove unwanted effects from the image is considered a method of improvement. This is a process of vital importance in medical imaging, in which the limitations of the image capturing system—in the case at hand, colonoscopies—cause unwanted effects which need to be removed. It is crucial to point out that by improving imaging:
No new information is added to the image; the image is only highlighted so as to be used more efficiently by the algorithms that are to be developed.
There is no exact criterion for quantifying the degree of improvement; in many cases, it is based on subjective opinions and/or on the result obtained by the developed algorithms.
Below is an outline of the applicability of pre-processing colonoscopic imagery in robots which may be able to automate tasks that are vital in a colonoscopy.
5. Applicability of pre-processing colonoscopic imagery in robots
Faced with the growing number of diagnostic tests for colon cancer being carried out, it has become necessary to rely on support tools for medical diagnoses. These tools support the specialist by providing objective data, thereby enabling more accurate diagnoses.
The main functions that endoscopists require are related to the automatic detection of polyps and the evaluation of the quality of the test being carried out.
In the case of detecting polyps, having tools available that enable their automatic detection will mean a reduction in the number of missed tumours, which, in many cases, lead to interval cancers. Interval cancers are those that appear between two scheduled diagnostic tests and, in most cases, are due to a polyp or tumour that was not detected by the specialist during the procedure. In this context, publications such as [1–3] have made important contributions to the scientific community.
Moreover, the quality evaluation of the procedure is a necessity, since many of the metrics are currently based on the specialist’s interpretation and are therefore subjective, impeding correct comparison among different health centres with the intention of improving the process. The European Guidelines for Quality Assurance in Colorectal Screening and Diagnosis  provide a series of metrics that evaluate different aspects of the colonoscopy. In this regard, publications such as [5–8] make valuable contributions to the scientific community.
All research studies focused on the development of automated tools for the assistance of medical diagnoses share the need for the availability of an image pre-processing system. The availability of tools to improve the quality of the images is a necessity, as can be observed in investigation .
All the methods for pre-processing imagery outlined in this chapter can be implemented in robots and colonoscopes in such a way as to enable the development of various automated tools, which allows for significantly higher reliability of colonoscopies.
6. Pre-processing colonoscopic imagery
Here, we describe the techniques most frequently used in the scientific community for removing the most common artefacts in colonoscopic imagery. Solutions that have been proposed in the literature are outlined, and the most appropriate approach for each problem is also proposed.
6.1. Removal of black borders
In the literature, there are three approaches to black border removal: restoration of the image, thresholding and cropping of the black mask. A brief explanation of each method follows.
Removal of the black mask through restoring the image: this involves replacing the pixels of the black mask using the median value of the pixels in a certain neighbourhood. This approach has been used in investigation , obtaining satisfactory results.
Removal of the black mask using thresholding: a threshold is set to detect the real frame of the image, removing the black mask. On many occasions, this approach does not manage to completely remove the black mask, leaving residual lines, which makes it necessary to apply techniques such as the Hough transform  to remove them. This technique was used in investigation .
Removal of the black mask through cropping the image: this is the simplest approach, in which an area of the image is selected and the rest is removed. This method yields a smaller image that aims to retain as much information as possible from the original image, but it runs the risk of losing valuable information.
In this section, an alternative approach to the removal of black borders is presented. Depending on the model of colonoscope used, the black borders that are generated vary (see Figure 7), which makes pre-processing difficult. In many cases, the borders are used to provide information about either the patient or the procedure being carried out (see Figure 7(b)). This frame makes the development of digital image processing algorithms difficult, since it generates false borders and enlarges the area to be analysed without providing useful information. For these reasons, it is necessary to apply techniques that remove its effects.
The literature describes several methods that address this problem: reconstructing the borders by restoring them , the use of thresholding for their detection  and the cropping of the black mask. In this pre-processing design, a method combining the existing solutions was chosen. This technique involves detecting the black mask using thresholding, followed by cropping and reconstruction. Figure 8 shows the process by which this task is carried out.
The following is a description of the steps to remove the black borders using the proposed method:
Conversion to Hue, Saturation, Value (HSV): in order to address the automatic detection of the black mask in colonoscopic images, it is necessary to convert them from the RGB colour model (the original colour model for colonoscopic imagery) to the HSV colour model. This is because the RGB model makes certain colour specifications difficult, whereas this is one of the HSV model’s strengths. As a result, the thresholding described in the next step becomes much simpler.
Channel V thresholding: once the conversion from the RGB colour model to HSV is complete, the image is ready for thresholding. Thresholding offers a wide range of intensity values from which to choose, allowing us to define among them those objects that we want to be detected automatically. In this chapter, channel V thresholding is proposed, in which values of 0.03 and lower are attributed to the black mask. This method enables the separation of useful content in the colonoscopic image from the black borders. This process can be observed in Figure 9, in which Figure 9(a) shows the process of channel V thresholding and Figure 9(b) presents the result generated.
Depending on the model of colonoscope used to capture the images, the black borders may be different. This is a problem, since when thresholding is carried out to detect the black borders, the information shown in the borders will remain visible over the image. In order to remove it, an additional step is necessary, which involves applying a morphological opening with a disk-shaped structuring element of size 5 to the detected black mask. In this way, all the information shown on the black border is removed, leaving it clean. This process can be observed in Figure 10, in which Figure 10(a) shows the detected black mask with leftover information and Figure 10(b) shows the result of the morphological opening for its removal.
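The conversion and thresholding steps can be illustrated with a minimal sketch. It assumes that the image is held as rows of (R, G, B) triples scaled to [0, 1] and uses plain Python lists to stay dependency-free; the function name `black_mask_from_rgb` is hypothetical, and only the 0.03 threshold comes from the text.

```python
import colorsys


def black_mask_from_rgb(image, v_threshold=0.03):
    """Return a boolean mask that is True where a pixel is attributed to the black mask.

    image: list of rows, each row a list of (r, g, b) floats in [0, 1] (assumed layout).
    v_threshold: V values at or below this are treated as black mask (0.03 in the text).
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            # colorsys.rgb_to_hsv returns (h, s, v); only the V channel is needed here.
            _, _, v = colorsys.rgb_to_hsv(r, g, b)
            mask_row.append(v <= v_threshold)
        mask.append(mask_row)
    return mask


example = [
    [(0.0, 0.0, 0.0), (0.5, 0.2, 0.1)],
    [(0.02, 0.01, 0.0), (0.04, 0.0, 0.0)],
]
mask = black_mask_from_rgb(example)
```

Because V is the largest of the three RGB components, a pixel is assigned to the mask only when all of its channels are very dark.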
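The morphological opening can be sketched as an erosion followed by a dilation with the same disk. The sketch below makes several assumptions that the text does not state: "size 5" is read as a radius of 5 pixels, the opening is applied to the binary image in which border pixels are False and content pixels are True (so that small bright items such as overlay text are removed), and neighbours outside the image are ignored. All function names are hypothetical.

```python
def disk_offsets(radius):
    """Offsets (dy, dx) of all pixels inside a disk of the given radius."""
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= radius * radius]


def _morph(mask, offsets, combine):
    """Apply `combine` over each pixel neighbourhood (all = erosion, any = dilation).

    Neighbours that fall outside the image are ignored.
    """
    h, w = len(mask), len(mask[0])
    return [[combine(mask[y + dy][x + dx]
                     for dy, dx in offsets
                     if 0 <= y + dy < h and 0 <= x + dx < w)
             for x in range(w)]
            for y in range(h)]


def opening(mask, radius=5):
    """Morphological opening: erosion followed by dilation with a disk."""
    offsets = disk_offsets(radius)
    eroded = _morph(mask, offsets, all)
    return _morph(eroded, offsets, any)


# A 20 x 20 block of content plus a two-pixel fragment standing in for overlay text.
content = [[False] * 30 for _ in range(30)]
for y in range(5, 25):
    for x in range(5, 25):
        content[y][x] = True
content[1][1] = content[1][2] = True
opened = opening(content)
```

In this example the two-pixel fragment disappears while the large block survives, although its corners are rounded by the disk.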
Once the thresholding of the image is complete, it is possible to proceed to the removal of the black borders.
Black border removal: the process of black border removal comprises two steps: cropping and reconstructing. The following is a detailed description of both:
Detection of the upper central point not belonging to the black mask: starting from the pixel in position (max(X)/2, 1) and searching southwards, the first pixel that does not belong to the black mask is selected.
Detection of the lower central point not belonging to the black mask: starting from the pixel in position (max(X)/2, max(Y)) and searching northwards, the first pixel that does not belong to the black mask is selected.
Detection of the centre-left point not belonging to the black mask: starting from the pixel in position (1, max(Y)/2) and searching eastwards, the first pixel that does not belong to the black mask is selected.
Detection of the centre-right point not belonging to the black mask: starting from the pixel in position (max(X), max(Y)/2) and searching westwards, the first pixel that does not belong to the black mask is selected.
Once the four positions of the sought pixels have been obtained, a rectangle containing them is generated; it determines the dimensions of the image with the black borders cropped out. Figure 11 shows a visual example of this process. The next step in removing the black borders is the reconstruction of the leftover black borders, which is addressed next.
Reconstruction of the remnants of the black mask: Figure 11 shows that the final area of the image, highlighted in orange, still contains remnants of the black borders. The final task for their removal is to reconstruct them. To do so, a restoration is applied that replaces the pixels of the black mask with the median value of the pixels in a certain neighbourhood. This operation is repeated until the difference between the values of the neighbouring pixels used in the reconstruction falls below a predetermined amount.
Image without black borders: once all the procedures designed for black border removal have been performed, an image with reduced dimensions and reconstructed black borders is obtained. The result can be seen in Figure 12, in which Figure 12(a) shows the original, unedited image and Figure 12(b) shows the result of this process.
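The search for the four points and the resulting crop can be sketched as follows. The sketch assumes a boolean black mask (True for mask pixels) stored as a list of rows and uses 0-based [row][column] indexing instead of the 1-based (X, Y) positions of the text; `find_crop_box` and `crop` are hypothetical names, and the sketch assumes that the central column and row each contain at least one pixel outside the mask.

```python
def find_crop_box(black_mask):
    """Return (top, bottom, left, right), the inclusive bounds of the crop rectangle."""
    h, w = len(black_mask), len(black_mask[0])
    cx, cy = w // 2, h // 2  # central column and central row

    # Upper central point: search southwards along the central column.
    top = next(y for y in range(h) if not black_mask[y][cx])
    # Lower central point: search northwards along the central column.
    bottom = next(y for y in range(h - 1, -1, -1) if not black_mask[y][cx])
    # Centre-left point: search eastwards along the central row.
    left = next(x for x in range(w) if not black_mask[cy][x])
    # Centre-right point: search westwards along the central row.
    right = next(x for x in range(w - 1, -1, -1) if not black_mask[cy][x])
    return top, bottom, left, right


def crop(image, box):
    """Keep only the rows and columns inside the inclusive box."""
    top, bottom, left, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# 6 x 8 mask whose useful content occupies rows 1..4 and columns 2..6.
demo_mask = [[True] * 8 for _ in range(6)]
for y in range(1, 5):
    for x in range(2, 7):
        demo_mask[y][x] = False
demo_image = [[(y, x) for x in range(8)] for y in range(6)]
box = find_crop_box(demo_mask)
cropped = crop(demo_image, box)
```

When the useful area is not rectangular (for instance octagonal), the rectangle through these four points still contains mask pixels in its corners, which is why a reconstruction step follows the crop.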
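A minimal sketch of such an iterative median restoration is given below for a greyscale image. Several details are assumptions, since the text does not fix them: the neighbourhood is a 3 × 3 window (`radius=1`), only pixels that are already known or already reconstructed contribute to the median, and the stopping rule is read as "the largest change between two consecutive passes falls below `tol`". The function name and parameters are hypothetical.

```python
from statistics import median


def reconstruct(image, mask, radius=1, tol=1e-3, max_iter=100):
    """Fill masked pixels with the median of their valid neighbours, pass after pass.

    image: list of rows of floats (greyscale); mask: True where a pixel must be rebuilt.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    valid = [[not m for m in row] for row in mask]
    targets = [(y, x) for y in range(h) for x in range(w) if mask[y][x]]

    for _ in range(max_iter):
        # Compute all new values from the state at the start of the pass.
        updates = {}
        for y, x in targets:
            neighbours = [out[j][i]
                          for j in range(max(0, y - radius), min(h, y + radius + 1))
                          for i in range(max(0, x - radius), min(w, x + radius + 1))
                          if (j, i) != (y, x) and valid[j][i]]
            if neighbours:
                updates[(y, x)] = median(neighbours)

        biggest_change, newly_filled = 0.0, False
        for (y, x), value in updates.items():
            if valid[y][x]:
                biggest_change = max(biggest_change, abs(value - out[y][x]))
            else:
                valid[y][x] = True
                newly_filled = True
            out[y][x] = value

        # Stop once nothing new was filled and the estimates have settled.
        if not newly_filled and biggest_change < tol:
            break
    return out


corner = [[0.0, 0.5, 0.5], [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
corner_mask = [[True, False, False], [False, False, False], [False, False, False]]
restored = reconstruct(corner, corner_mask)
```

The fill progresses inwards from the edge of the masked region, so wide remnants need several passes before every pixel has received a value.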
6.2. Removal of specular highlights
There are numerous methods to detect specular highlights. The following is a brief summary of the most important of these:
Park et al.  propose the detection of specular highlights using a search of saturated areas and small regions with high contrast. The saturated areas are detected by applying adaptive thresholding to the image’s intensity histogram. The value of the threshold is predetermined as the region that surrounds the maximum value of the histogram. The smaller regions with high contrast are detected using the method proposed in Ref. , which applies a top-hat filter followed by a reconstruction and erosion operation with a disk-shaped structuring element of size 5.
Bernal et al.  assume that the intensity value of specular pixels is greater than that of the non-specular pixels in their vicinity. Furthermore, they indicate that non-specular pixels which neighbour specular pixels will have higher intensity values than non-specular pixels far from the reflective areas. Specular highlights are detected from the difference between the original image and its median. Once this has been done, specular highlights can be detected through the use of thresholding.
Gross et al.  detect specular highlights in the HSV colour space. Specular highlights show low saturation and high brightness, which makes their detection simple.
The method put forward in Ref.  for the detection of specular highlights uses two different colour spaces. In the first, it is necessary to observe the borders generated by the changes in texture and specular highlights. In the second, only the borders generated by the textures need to be seen. Subtraction of these two colour spaces enables the detection of specular highlights. This method has been used in investigation  with satisfactory results. Therein, the detection of specular highlights based on low saturation of the colour of the highlights is suggested.
Having reviewed the techniques used in various studies for the removal of specular highlights, we now propose a method for their elimination. Figure 13 shows the steps involved. A description of each of the modules that comprise it follows.
Conversion to greyscale: in order to commence the process of specular highlight removal, it is necessary to convert the borderless image from the original colour model (RGB) to greyscale. This operation is necessary for subsequent detection of specular highlights, which is described in the next step.
Detection of specular highlights: the method used for the detection of specular highlights has been proposed by the authors of the study . To this end, a system comprising four blocks has been designed, which is shown in Figure 14. In the following steps, there is a detailed description of the process for specular highlight removal proposed for this investigation.
Calculation of the threshold value (U): to detect specular highlights automatically, it is vital to set a threshold value (U) that distinguishes between normal values in the image and specular highlights. To this end, the median value (μ) of the original greyscale image is calculated and then multiplied by a weight (W) which, by default, has a value of 0.3. In this way, the value required for addressing the next phase in the detection of specular highlights is calculated.
Subtraction of the threshold value from the original greyscale image: once the threshold value (U) has been calculated, it is subtracted from the original greyscale image. In this way, a matrix with the same dimensions as the greyscale image is obtained, in which values above 0.75 belong to specular highlights.
Thresholding: having calculated the matrix with the values pertaining to the subtraction between the original image in greyscale and the threshold value (U), a binary mask will be generated in which values surpassing the threshold (U) are given a value of 1, and everything else a value of 0, thereby obtaining an image that only shows the positions of the specular highlighting that has been detected.
Mask with specular highlights: as a result of this process, a mask is obtained which will be used in the next step and will deal with the reconstruction of the highlighting.
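The four detection blocks can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the greyscale image is a list of rows of floats in [0, 1], U is computed as W times the median of all pixels, and the binary mask marks the positions where the difference image exceeds 0.75 (the text mentions both this value and U in connection with the final comparison, so the use of 0.75 here is an interpretation). The function name is hypothetical.

```python
from statistics import median


def specular_mask(gray, weight=0.3, cutoff=0.75):
    """Return a boolean mask that is True where a specular highlight is detected.

    gray: list of rows of floats in [0, 1] (assumed range).
    weight: W in the text (0.3 by default).
    cutoff: value above which the difference image is treated as specular
            (0.75 in the text; using it as the final threshold is an assumption).
    """
    # Block 1: threshold value U = W * median of the greyscale image.
    u = weight * median(v for row in gray for v in row)
    # Blocks 2 to 4: subtract U from every pixel, threshold, and return the mask.
    return [[(v - u) > cutoff for v in row] for row in gray]


gray_demo = [
    [0.2, 0.2, 0.2],
    [0.2, 1.0, 0.2],
    [0.2, 0.2, 0.9],
]
highlights = specular_mask(gray_demo)
```

Because U scales with the median brightness of the frame, a darker frame lowers U and therefore makes the detection slightly more permissive.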
Reconstruction of the image: once the dilation of the specular highlighting mask has been carried out, we can begin to reconstruct the regions of the image indicated by the mask through the following steps:
Step 1: the damaged section is filled in using information from the rest of the image.
Step 2: the structure of the area surrounding the deteriorated part is filled in towards its centre, extending the lines that reach the border.
Step 3: the numerous regions that are generated inside the damaged area from the extension of the contour lines are filled in with the colour of the corresponding bordering region.
Step 4: finally, the small details are coloured in to maintain uniformity.
The algorithm repeatedly carries out steps 2 and 3 until the desired quality is achieved. Having carried out this process, an image free of specular highlights is obtained. The result is highly effective, as shown in Figure 15.
6.3. Lighting normalisation
In the scientific literature, there are numerous publications that deal with uneven lighting in imaging. A brief summary of the most relevant works is presented here, followed by a proposed alternative for normalising lighting in colonoscopic imagery. Investigation  presents a contrast operator built by means of two primitives involving Weber’s law, thereby improving the contrast of the image. Study  reduces the effects of uneven lighting through local normalisation of the image’s brightness: each pixel is divided by the maximum value in its neighbourhood, which in that publication was 13 × 13 pixels. Finally, in investigation , the background of the greyscale image was equalised, thereby strengthening the contrast of the different structures and removing the lighting variation in the image.
The following procedure is proposed to solve the issue of inhomogeneous lighting in colonoscopic imagery. The proposed design is shown in Figure 16, and the blocks comprising it are described below: obtaining the subtraction value, subtracting the subtraction value from the image, and the image with normalised lighting.
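The local normalisation mentioned above, in which each pixel is divided by the maximum of its neighbourhood, can be sketched as follows. The sketch assumes a greyscale image stored as rows of non-negative floats and truncates the window at the image edges; the function name and the handling of all-zero neighbourhoods are assumptions, and only the 13 × 13 window size comes from the text.

```python
def local_max_normalise(gray, size=13):
    """Divide each pixel by the maximum value found in its size x size neighbourhood."""
    half = size // 2
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            local_max = max(gray[j][i]
                            for j in range(max(0, y - half), min(h, y + half + 1))
                            for i in range(max(0, x - half), min(w, x + half + 1)))
            # An all-black neighbourhood is left at 0 to avoid dividing by zero.
            row.append(gray[y][x] / local_max if local_max > 0 else 0.0)
        out.append(row)
    return out


normalised = local_max_normalise([[0.2, 0.4, 0.8, 0.4]], size=3)
```

After this operation, the brightest pixel of every neighbourhood maps to 1, so slowly varying illumination is largely cancelled while local contrast is preserved.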
Obtaining the subtraction value: in order to achieve a more uniform illumination in the images, it is a fundamental requirement to calculate a subtraction value for each of the windows into which the image has been divided (20 × 20 pixels). This value is obtained by calculating the median value of each channel inside that window and multiplying it by a weight (0.3 by default).
Subtraction of the subtraction value from the window: once the subtraction values of the different channels have been calculated, these are subtracted from the corresponding channel of the window. In this way, the effects of the peaks of intensity that the uneven lighting causes are mitigated.
Image with normalised lighting: as an output of the lighting normalisation module, an image is obtained with a range of much more uniform colour intensities, which aids its subsequent analysis. After this step, the colonoscopic images are ready to be used by algorithms that evaluate the quality of the preparation of the colon, using the Boston Bowel Preparation Scale (BBPS), and by automatic polyp detection algorithms. Figure 17 shows the result obtained through the normalisation of lighting: Figure 17(a) shows an image without lighting normalisation and Figure 17(b) shows the result obtained through this process.
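The window-based normalisation described in the blocks above can be sketched as follows. The sketch assumes an image stored as rows of [R, G, B] lists with values in [0, 1], non-overlapping windows that are truncated at the image edges, and clamping of negative results to 0; the clamping and the function name are assumptions, whereas the 20 × 20 window and the 0.3 weight come from the text.

```python
from statistics import median


def normalise_lighting(image, window=20, weight=0.3):
    """Subtract weight * (per-window, per-channel median) from every pixel."""
    h, w = len(image), len(image[0])
    channels = len(image[0][0])
    out = [[list(pixel) for pixel in row] for row in image]

    for top in range(0, h, window):
        for left in range(0, w, window):
            rows = range(top, min(top + window, h))
            cols = range(left, min(left + window, w))
            for c in range(channels):
                # Subtraction value of this window and channel.
                sub = weight * median(image[y][x][c] for y in rows for x in cols)
                for y in rows:
                    for x in cols:
                        # Clamp at 0 so that dark pixels do not become negative (assumption).
                        out[y][x][c] = max(0.0, image[y][x][c] - sub)
    return out


tile = [
    [[0.5, 0.5, 0.5], [1.0, 1.0, 1.0]],
    [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]],
]
flattened = normalise_lighting(tile, window=2)
```

Brightly lit windows have a larger median and therefore lose more intensity than dark ones, which is what evens out the illumination across the frame.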
6.4. Removal of interlacing effects
The adverse effects of interlacing are common when working with videos or with images extracted from video frames. The removal of these effects has been addressed in numerous investigations, which achieve very accurate results. The most relevant publications that propose a solution to this problem are outlined below.
Studies [18–20] address the removal of the effects of interlacing through deinterlacing. The procedure is based on keeping one in every two horizontal lines, which decreases the vertical size of the image. To maintain the size proportion of the original image, they apply vertical resizing by a factor of 0.5.
Figure 18 shows the results of applying these techniques for removing interlacing effects. As can be observed, the interlacing effects are removed effectively.
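The line-selection part of deinterlacing can be sketched as follows, assuming that a frame is stored as a list of rows. The function names are hypothetical, and the subsequent resizing step is deliberately left out because it depends on the resampling method chosen.

```python
def split_fields(frame):
    """Split an interlaced frame into its two fields.

    With 1-based line numbering, the first field holds lines 1, 3, 5, ...
    and the second field holds lines 2, 4, 6, ...
    """
    return frame[0::2], frame[1::2]


def deinterlace(frame):
    """Keep one of every two horizontal lines (the first field)."""
    first_field, _ = split_fields(frame)
    return first_field


frame = [[1, 1], [2, 2], [3, 3], [4, 4]]
single_field = deinterlace(frame)
```

Since each field is captured at a single instant, keeping only one of them removes the comb-like false outlines at the cost of half the vertical resolution.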
6.5. Removal of ghost colours
This problem has been addressed in the literature in study , which proposes channel equalisation together with estimation and compensation of the movements of the camera. Channel equalisation aims to obtain a histogram with a more uniform distribution, i.e. approximately the same number of pixels for each grey level in the histogram of a monochrome image. The estimation and compensation of the camera movements are obtained using the motion vectors of the MPEG video standard. These enable an estimate of the deviation affecting each colour channel when the image is obtained, allowing the errors produced to be corrected. This same solution has been addressed in study . The application of this technique corrects the effect very accurately, failing solely in images of very low initial quality. The result obtained using this solution is shown in Figure 19.
The benefits derived from the tools described in the present chapter lie in the improvement of colonoscopic images. Specifically:
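The equalisation of a single channel can be illustrated with the standard cumulative-histogram mapping. This sketch is not taken from the cited studies: it assumes a non-empty channel stored as rows of integers in the range 0 to `levels - 1`, and the function name is hypothetical. The motion estimation and compensation step is not covered.

```python
def equalise(channel, levels=256):
    """Histogram-equalise one integer channel using its cumulative histogram."""
    histogram = [0] * levels
    for row in channel:
        for value in row:
            histogram[value] += 1

    # Cumulative distribution: number of pixels with a level <= each index.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    denom = total - cdf_min
    if denom == 0:  # every pixel has the same level, nothing to spread
        return [row[:] for row in channel]

    # Map each level to its rescaled cumulative count, rounded to the nearest integer.
    lut = [max(0, ((c - cdf_min) * (levels - 1) + denom // 2) // denom) for c in cdf]
    return [[lut[value] for value in row] for row in channel]


equalised = equalise([[1, 1], [2, 3]], levels=4)
```

In a colour image the same mapping would be applied to each channel separately, which is one plausible reading of "channel equalisation" in this context.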
The scientific community is provided with information about the origin and characteristics of the most prevalent artefacts that corrupt colonoscopic images, thus allowing for their identification, detection and removal.
The techniques that have to be applied to the images in order to increase their quality are described, as well as the methodology that has to be used to apply them.
The scientific community is also given a useful guide to a system of medical diagnosis aid based on colonoscopic images, thus making it possible to offer tools better suited to the needs of the patients.
Since systems to aid diagnosis are constantly on the rise and are likely to remain so in the immediate future, we consider this chapter a necessary resource for specialists in the area.