MATLAB® (The MathWorks, Natick, MA, USA) is a software package for numerical computing that can be used in various scientific disciplines such as mathematics, physics, electronics, engineering and biology. More than 40 toolboxes are available in the current release (R2013b released in September 2013), which include numerous built-in functions enhanced by access to a high-level programming language.
Since images can be represented by 2D or 3D matrices and the MATLAB processing engine relies on matrix representation of all entities, MATLAB is particularly suitable for implementation and testing of image processing workflows. The Image Processing Toolbox™ (IPT) includes all the necessary tools for general-purpose image processing incorporating more than 300 functions which have been optimised to offer good accuracy and high speed of processing. Moreover, the built-in Parallel Computing Toolbox™ (PCT) has recently been expanded and now supports graphics processing unit (GPU) acceleration for some functions of the IPT. However, for many image processing applications we still need to write our own code, either in MATLAB or, in the case of GPU-accelerated applications requiring specific control over GPU resources, in CUDA (Nvidia Corporation, Santa Clara, CA, USA).
The first part of this chapter is dedicated to some essential tools of the IPT that can be used in image analysis and assessment as well as in extraction of useful information for further processing and assessment. These include retrieving information about digital images, image adjustment and processing as well as feature extraction and video handling. The second part is dedicated to GPU acceleration of image processing techniques either by using the built-in PCT functions or through writing our own functions. Each section is accompanied by MATLAB example code. The functions and code provided in this chapter are adapted from the MATLAB documentation, unless otherwise stated.
2. Image processing on CPU
2.1. Basic image concepts
2.1.1. Pixel representation
A digital image is a visual representation of a scene that can be obtained using a digital optical device. It is composed of a number of picture elements, pixels, and it can be either two-dimensional (2D) or three-dimensional (3D). Images of different bit-depths can be found, but the most common ones in scientific image processing are the 1-bit binary images (pixel values 0 or 1), the 8-bit grey-scale images (pixel range 0-255) and the 16-bit images (pixel range 0-65535). Figure 1 shows the grey-scale variation, from black to white, for an 8-bit image.
2.1.2. MATLAB pixel convention
MATLAB uses one-based indexing, where the first pixel along any dimension has index 1, whereas many other platforms are zero-based and consider the first index to be 0. By convention, counting of pixel indices starts from the top-left corner of an image with the first and second indices increasing down and towards the right, respectively. Figure 2 visualises the way that MATLAB indexes a 512×512 pixel image. This information is particularly important when the user intends to apply a transformation to a specific pixel or a neighbourhood of pixels.
2.1.3. Image formats
Various image formats are supported by MATLAB including the most commonly used ones, such as JPEG, TIFF, BMP, GIF and PNG. Images can be read, processed and then saved in a format other than their initial one. Various parameters such as the resolution, the bit-depth, the compression level or the colour-space can be adjusted according to the user’s preferences.
2.1.4. Digital image processing
Digital image processing refers to the modification of a digital image using a computer in order to emphasise relevant information. This can be achieved by both revealing latent details and suppressing unwanted noise. The process is usually designed according to the desired final outcome which can include simple image enhancement; object detection, segmentation or tracking; parameter estimation; or condition classification. Moreover, when dealing with images intended for inspection by people, the structure and function of the human visual system may be a critical factor in designing any such technique as this determines what can be perceived as an easily distinguishable feature.
2.2. Image pre-processing
Image pre-processing is a procedure that provides initial information about the digital condition of a candidate image. To obtain such information, we need to load the image into the software platform and examine its type and pixel values.
2.2.1. Image input and output
Just as any other data set in MATLAB, images are represented by a variable. If we consider an image file with name ‘image’ and format ‘tiff’, it can be loaded into a variable using the function imread and saved back to disk, in the same or a different format, using imwrite.
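A minimal sketch of reading, displaying and re-saving an image (the file names are hypothetical):

```matlab
I = imread('image.tif');    % load the image into a variable
imshow(I);                  % display it
imwrite(I, 'image.png');    % save it in a different format
```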
2.2.2. Image type conversions
Image type conversion is a useful tool as the user can convert the input image into any desired type. A special and often useful conversion is the transformation of an unsigned-integer image into a double-precision representation. Subsequent image processing may thus produce more accurate results, since this conversion increases the precision with which intensities are represented. The range of the resulting image is 0.0 to 1.0, with MATLAB maintaining about 15 significant decimal digits. The following commands are examples of image conversions.
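A sketch of typical type conversions, assuming a hypothetical 8-bit input image:

```matlab
I    = imread('image.tif');   % 8-bit unsigned-integer image
Id   = im2double(I);          % double precision, range 0.0 to 1.0
Iu8  = im2uint8(Id);          % back to unsigned 8-bit integers
Iu16 = im2uint16(I);          % 16-bit unsigned-integer representation
```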
2.2.3. Pixel information
A histogram is a useful intensity representation as it reveals the pixels’ intensity distribution. It can be obtained using the function imhist.
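For example (file name hypothetical):

```matlab
I = imread('image.tif');
figure; imhist(I);          % plot the grey-level histogram
counts = imhist(I, 256);    % or retrieve the counts for 256 bins
```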
2.2.4. Contrast adjustment
One of the main pre-processing techniques is contrast adjustment since this process can enhance desired features whilst suppressing other, unwanted ones. MATLAB has various tools for varying the image contrast. The function imadjust, for example, maps the intensity values of the input image to a new range specified by the user.
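A minimal sketch of contrast adjustment with imadjust; the input and output ranges given here are example values only:

```matlab
I = imread('image.tif');             % hypothetical file name
J = imadjust(I, [0.2 0.8], [0 1]);   % map intensities in [0.2, 0.8] to the full range
```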
Figure 3 presents an original grey-scale image and its contrast adjusted counterpart using the parameters specified in the previous example.
Other techniques that can affect contrast are histogram-based ones. A histogram represents an image’s grey-level intensity distribution or probability density function. Such knowledge can assist in further processing by helping the user choose the right tools. Histogram stretching can be performed through the imadjust function, whereas histogram equalisation can be performed using histeq.
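A sketch of both histogram-based operations (file name hypothetical):

```matlab
I  = imread('image.tif');
Js = imadjust(I, stretchlim(I), []);   % histogram stretching to the full range
Je = histeq(I);                        % histogram equalisation
```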
2.2.5. Arithmetic operations
Arithmetic operations refer to addition, subtraction, multiplication and division of two images or an image and a constant. Images subject to arithmetic operations need to have the same dimensions and grey-scale representation. The resulting image will have the same dimensions as the input. When a constant value is added or subtracted (instead of a second image), this constant is added or subtracted to each pixel’s intensity, increasing or reducing the image luminosity. Most often, such operations are used for detail enhancement or suppression of unnecessary information.
In functions such as imadd and imsubtract, the second input parameter can be either a second image of the same size or a scalar constant.
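A sketch of the basic arithmetic operations (file names hypothetical; the constant 50 is an example value):

```matlab
A  = imread('image1.tif');
B  = imread('image2.tif');   % same size and class as A
S  = imadd(A, B);            % pixel-wise addition
D  = imsubtract(A, B);       % pixel-wise subtraction
Ab = imadd(A, 50);           % adding a constant increases luminosity
```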
2.2.6. Miscellaneous transformations
Other useful image transformations include cropping, resizing and rotation. Cropping can be used if the user is interested only in one particular part of the input image. The user can define a specific region of interest and apply any transformation only to this part. Resizing can be applied in order either to expand or reduce the image size. Image size reduction can be especially useful in speeding up a process in case of larger images or large data sets. Rotation can be particularly useful when an image includes features of a particular directionality. The user can specify the applied interpolation method out of nearest neighbour, bilinear and bicubic. Inversion of grey-scale intensities can be useful when the interesting objects have intensities lower than the background. The following functions perform these processes.
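A sketch of these transformations; the crop rectangle, scale factor and rotation angle are example values:

```matlab
I  = imread('image.tif');             % hypothetical file name
Ic = imcrop(I, [100 100 255 255]);    % region of interest: [xmin ymin width height]
Ir = imresize(I, 0.5, 'bilinear');    % reduce to half size, bilinear interpolation
Iq = imrotate(I, 45, 'bicubic');      % rotate by 45 degrees, bicubic interpolation
In = imcomplement(I);                 % invert grey-scale intensities
```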
2.3. Image processing
2.3.1. Thresholding
Thresholding is one of the most important concepts in image processing as it finds application in almost all projects. Thresholding can be manual or automatic, global or local. In manual mode, the user defines a threshold value, usually based on inspection of the image (several trials may be needed). In automatic mode, a more detailed understanding of the image is required in order to select the correct method. The IPT provides the function graythresh, which computes a global threshold according to Otsu’s method; the threshold can then be applied using im2bw.
This method can be easily extended to multi-level thresholding by using the IPT function multithresh, which computes multiple thresholds, again according to Otsu’s method.
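A sketch of global and multi-level Otsu thresholding (file name hypothetical):

```matlab
I    = imread('image.tif');
lvl  = graythresh(I);         % global Otsu threshold, normalised to [0, 1]
bw   = im2bw(I, lvl);         % binarise the image
lvls = multithresh(I, 2);     % two thresholds, giving three classes
seg  = imquantize(I, lvls);   % label each pixel with its class
```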
2.3.2. Edge detection
Edge detection is an essential part of image processing as it usually emphasises objects’ outline or internal structure. An edge is a representation of discontinuity in an image and may characterise a surface that separates two objects or an object from the image background. Boundaries can be characterised by single pixels or connected sequences of pixels. Such a feature can assist in further object recognition and, thus, edge detection is applied in many image processing sequences. The outcome of edge detection is a binary image with edges presented by white pixels.
2.3.3. First-order edge detection operators
The IPT includes the standard first-order edge detectors such as the Roberts, Sobel and Prewitt detectors, all available through the function edge. The Roberts edge detector relies on 2×2 masks whereas the other two rely on 3×3 masks. An optional threshold argument allows the user to tune the sensitivity of the detection.
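A sketch of the three detectors applied through edge; the threshold passed to the Prewitt detector is an example value:

```matlab
I  = imread('image.tif');        % hypothetical file name
Er = edge(I, 'roberts');         % 2x2 masks
Es = edge(I, 'sobel');           % 3x3 masks
Ep = edge(I, 'prewitt', 0.05);   % optional threshold argument
```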
Other first-order edge detectors can also be designed. Examples are the Kirsch and the Robinson masks which are not included in the IPT but are easy to design. They are examples of directional edge detectors which scan the image from different directions in order to detect edges with various orientations. A single kernel is used which, through rotations from 0° to 315° in steps of 45°, creates eight different masks. The image is convolved with each mask and the pixels in the final image are assigned the highest edge detection magnitude obtained from any of the masks. The following code presents these two edge detectors, respectively.
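A sketch of the Kirsch detector; the eight masks are generated by circularly shifting the base kernel’s outer ring, and the Robinson detector is obtained analogously from its own base kernel (file name hypothetical):

```matlab
I = im2double(imread('image.tif'));
k = [-3 -3 5; -3 0 5; -3 -3 5];       % Kirsch base kernel
% r = [-1 0 1; -2 0 2; -1 0 1];       % Robinson base kernel (used analogously)
ring = [1 4 7 8 9 6 3 2];             % linear indices around the centre of a 3x3 mask
g = -inf(size(I));
for r = 0:7                           % eight orientations in 45-degree steps
    m = k;
    m(ring) = circshift(k(ring), [0 r]);        % rotate the outer ring
    g = max(g, imfilter(I, m, 'replicate'));    % keep the strongest response
end
```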
The point detector, another example of an edge detector, detects bright points based on the intensity difference between a central pixel and its neighbours. A point detector can be specified by the following code.
2.3.4. Second-order edge detection operators
In addition to first-order edge detectors, second-order edge detectors find wide application. Such detectors are, for example, the Canny, zero-cross and Laplacian-of-Gaussian (LoG; also called Marr-Hildreth) detectors. The Canny method uses the derivative of a Gaussian filter for finding the gradient of the original image, after which it looks for local maxima of the resulting gradient magnitude. The zero-cross method searches for zero crossings after an arbitrary filter has been applied. Finally, the LoG method searches for zero crossings after the LoG transformation has been applied. Useful code for such detectors follows.
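A sketch of the three methods through edge; the standard deviations and the Laplacian filter are example choices:

```matlab
I  = imread('image.tif');            % hypothetical file name
Ec = edge(I, 'canny', [], 1.5);      % Canny with Gaussian sigma 1.5
hz = fspecial('laplacian');          % arbitrary filter for the zero-cross method
Ez = edge(I, 'zerocross', [], hz);
El = edge(I, 'log', [], 2);          % LoG with sigma 2
```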
In this case, an optional threshold argument can be supplied and, for the Canny and LoG detectors, the standard deviation of the underlying Gaussian filter can also be specified, whereas the zero-cross method accepts an arbitrary filter.
2.3.5. Image filtering
Spatial filtering is one of the most important processes in image processing as it can extract and process specific frequencies from an image while other frequencies can be removed or transformed. Usually, filtering is used for image enhancement or noise removal. The IPT includes the standard tools needed for image filtering. The function fspecial creates a variety of predefined 2D filter kernels, which can then be applied to the image using imfilter.
Edge detectors can also be applied by filtering the image with the edge operator. An example follows with the application of the previously mentioned point edge detector.
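A sketch of this approach using the point-detection mask introduced earlier (file name hypothetical):

```matlab
I = im2double(imread('image.tif'));
w = [-1 -1 -1; -1 8 -1; -1 -1 -1];   % point detector mask
J = imfilter(I, w, 'replicate');     % filter the image with the edge operator
```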
Apart from user-designed filters, the IPT includes filters that can be directly applied to the image. Such examples are the median filter (medfilt2), the averaging and Gaussian filters (created with fspecial) and the 2D order-statistic filter (ordfilt2).
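A sketch of these built-in filters; neighbourhood sizes and the Gaussian standard deviation are example values:

```matlab
I  = imread('image.tif');                          % hypothetical file name
Im = medfilt2(I, [3 3]);                           % 3x3 median filter
Ig = imfilter(I, fspecial('gaussian', [5 5], 1));  % Gaussian smoothing, sigma 1
Io = ordfilt2(I, 9, ones(3));                      % order-statistic filter (3x3 maximum)
```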
Figure 7 shows examples of Gaussian and order statistics filtered images.
2.3.6. Morphological image processing
Morphological image processing refers to the extraction of descriptors which describe objects of interest and, thus, their morphology determines the tools that are used. Structuring elements are used to probe the image. The function strel creates structuring elements of various shapes and sizes, which can then be used with morphological operators such as imerode, imdilate, imopen, imclose and imtophat.
Figure 8 presents examples of dilation and top-hat transformation with a ‘disk’ structuring element of radius 10.
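A sketch of the two operations shown in Figure 8 (file name hypothetical):

```matlab
I  = imread('image.tif');
se = strel('disk', 10);     % disk-shaped structuring element of radius 10
Id = imdilate(I, se);       % dilation (grey-scale or binary images)
It = imtophat(I, se);       % top-hat: the image minus its morphological opening
```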
The distance transform is usually applied to binary images and represents the distance between a white pixel and its closest zero pixel. Pixels in the new image obtain higher values with larger distance from a zero pixel. This transformation can act as a segmentation function and it is often used in the segmentation of overlapping disks.
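A sketch of the distance transform, computed with bwdist, used as a segmentation function for touching objects (file name hypothetical):

```matlab
bw = imread('mask.tif') > 0;   % hypothetical binary image
D  = bwdist(~bw);              % distance from each foreground pixel to the background
L  = watershed(-D);            % watershed on the negated transform separates overlapping disks
```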
2.3.7. Colour image processing
Colour images are subject to processing in many scientific fields, as different colours can represent different features. The most commonly used colour representation is RGB (Red-Green-Blue). Transformation of RGB images into grey-scale intensity or extraction of a specific colour can be done using the following code:
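A minimal sketch (file name hypothetical):

```matlab
rgb  = imread('colour.tif');   % RGB image
grey = rgb2gray(rgb);          % grey-scale intensity image
red  = rgb(:,:,1);             % extract the red channel only
```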
2.4. Feature extraction
Feature extraction is the process through which recognised objects are assessed according to some geometrical criteria. The first step of this process is to ‘label’ the objects of a binary image using the function bwlabel; the labelled objects can then be measured with regionprops.
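A sketch of labelling and measurement (file name and feature list are examples):

```matlab
bw = imread('mask.tif') > 0;                       % hypothetical binary image
L  = bwlabel(bw);                                  % label connected objects
st = regionprops(L, 'Area', 'Perimeter', 'Centroid');  % measure standard features
```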
Apart from the standard features that are included in the IPT, custom-defined features can be measured either by using already calculated features or by introducing completely new measurements. An example could be the measurement of an object’s standard geometric characteristics as well as the thinness ratio and compactness (or irregularity) using a simple loop over the objects returned by regionprops.
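A sketch of such custom features, derived from Area and Perimeter; the thinness ratio is 4*pi*A/P^2 (equal to 1 for a perfect circle) and compactness is defined here as its reciprocal:

```matlab
bw = imread('mask.tif') > 0;                          % hypothetical binary image
st = regionprops(bwlabel(bw), 'Area', 'Perimeter');
for k = 1:numel(st)
    A = st(k).Area;
    P = st(k).Perimeter;
    st(k).ThinnessRatio = 4*pi*A / P^2;               % 1 for a perfect circle
    st(k).Compactness   = 1 / st(k).ThinnessRatio;    % measure of irregularity
end
```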
Measured features are stored in structure arrays. Usually, further processing of features requires transforming structure arrays into matrices. MATLAB cannot perform such transformations without the application of an intermediate step: the structure arrays have first to be transformed into cell arrays which in turn can be converted into matrices.
Notice that the transpose of the cell array is needed so that, after conversion, each row of the resulting matrix corresponds to a single object and each column to a feature.
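A sketch of the two-step conversion for scalar-valued features (file name hypothetical):

```matlab
bw = imread('mask.tif') > 0;                          % hypothetical binary image
st = regionprops(bwlabel(bw), 'Area', 'Perimeter');   % scalar-valued features only
c  = struct2cell(st);                                 % features-by-objects cell array
M  = cell2mat(c');                                    % one row per object, one column per feature
```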
2.5. Processing series of images
In many cases, handling of multiple images can be a laborious task unless an automated process can be established. Assuming we have a batch of 100 images that we would like to process, using a for loop we can generate each file name automatically and apply identical processing to every image.
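A sketch of such a batch loop; the naming scheme and the processing step (here imadjust) are example choices:

```matlab
for k = 1:100
    fname = sprintf('image%03d.tif', k);        % hypothetical naming scheme
    I = imread(fname);
    J = imadjust(I);                            % any processing step
    imwrite(J, sprintf('processed%03d.tif', k));
end
```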
2.6. Video handling
2.6.1. Video to frames
An interesting application for image processing is handling video data. In this case, the video file has to be divided into single frames. The function VideoReader creates an object through which the frames of a video file can be read and saved as separate images.
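A sketch of splitting a video into frames (file names hypothetical):

```matlab
v = VideoReader('movie.avi');
for k = 1:v.NumberOfFrames
    frame = read(v, k);                          % extract frame k
    imwrite(frame, sprintf('frame%04d.png', k)); % save it as a single image
end
```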
2.6.2. Frames to video
Since every frame is stored as a single image it can be processed accordingly, either one by one or in batch mode. A possible next step in this process can be to combine the processed images into a single video again. If a new movie is to be created from the processed frames, the function VideoWriter can be used.
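A sketch of reassembling processed frames into a video; the file names, frame rate and frame count are example values:

```matlab
w = VideoWriter('new_movie.avi');
w.FrameRate = 25;
open(w);
for k = 1:100
    frame = imread(sprintf('processed%04d.png', k));
    writeVideo(w, frame);                        % append the frame to the movie
end
close(w);
```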
3. Image processing on GPU in MATLAB
Large amounts of image data are produced in many technical and experimental situations, in particular where images are repeatedly acquired over time or when dealing with images of higher dimensionality than two. Time-lapse imaging and video recording can be mentioned as examples of the former, whereas the latter can be represented by any of the many tomographic imaging modalities present. 4D computed tomography (CT), where 3D CT images are acquired at regular intervals to monitor internal patient motion, is an example of an application pertaining to both categories. It is often desirable or even critical to speed up the analysis and processing of such large image data sets, especially for applications running in or near real-time. Due to the inherently parallel nature of many image processing algorithms, they are well suited for implementation on a graphics processing unit (GPU), and consequently we can expect a substantial speedup from such an implementation over code running on a CPU. However, despite the fact that GPUs are nowadays ubiquitous in desktop computers, only 34 out of the several hundred functions of the IPT are GPU-enabled by the PCT in the current MATLAB release (2013b). In this sub-chapter we will explore the possibilities available for someone either wanting to harness the computing power of the GPU directly from MATLAB or to incorporate external GPU code into MATLAB programs. The focus will be on image processing applications, but the techniques presented can with little or no effort be adapted to other applications.
In the first section of this part we look at how to use the built-in, GPU-enabled image processing functions of the PCT. Following this, we explain how pixel-wise manipulations can be carried out using the GPU-enabled version of arrayfun. The subsequent sections then show how custom CUDA kernels and GPU-based MEX functions can be called from MATLAB.
3.1. gpuArray and built-in GPU-enabled functions
For the examples in this part to work, we need a computer equipped with an NVIDIA GPU of CUDA compute capability 1.3 or greater which is properly set up and recognised by the PCT. The functions gpuDeviceCount and gpuDevice can be used to find the number of available GPU devices and to select and query their properties.
To be able to process an image on the GPU, the corresponding data first have to be copied from main CPU memory over the PCI bus to GPU memory. In MATLAB, data on the GPU are accessed through objects of type gpuArray. The command G = gpuArray(I) creates a new gpuArray object G by copying the image I to GPU memory, whereas I2 = gather(G) copies the data in the gpuArray G back into a standard MATLAB array I2 in CPU memory. Data can also be created directly on the GPU through the static constructors of gpuArray; for example, R = gpuArray.randn(3072, 'single') creates a 3072 × 3072 array of normally distributed pseudorandom numbers with zero mean and standard deviation one. As with the corresponding standard MATLAB functions, the last function argument specifies the array element type (in this case single), and gathering R creates a standard MATLAB array of the same type in CPU memory.
Once our image data are in GPU memory, we have two options for manipulating them: either we can use the sub-set of the built-in MATLAB functions (including some functions available in toolboxes) that run on the GPU, or we can write our own functions using only element-wise operations and launch them through the GPU-enabled version of arrayfun.
A list of the GPU-enabled MATLAB functions available on the current system, together with all static constructors of gpuArray, can be obtained by calling methods('gpuArray').
(where a variance of 0.09 equals a standard deviation of 0.3). Another, potentially more useful, GPU-enabled function from the IPT is imfilter.
filters the image using a horizontal Sobel filter. Note that the filter kernel itself is an ordinary MATLAB array created on the CPU; because the image is a gpuArray, the filtering is nevertheless performed on the GPU and the result is returned as a gpuArray.
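A sketch of GPU-based filtering along these lines (file name hypothetical):

```matlab
G  = gpuArray(im2single(imread('image.tif')));  % copy the image to the GPU
h  = fspecial('sobel');      % horizontal Sobel filter, created on the CPU
E  = imfilter(G, h);         % executed on the GPU since G is a gpuArray
I2 = gather(E);              % copy the result back when needed
```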
The second option for manipulating GPU arrays directly from MATLAB is by calling our own functions through the built-in function arrayfun.
Having defined a function containing only element-wise operations, we execute it on the GPU by calling arrayfun with a handle to this function together with the required gpuArray arguments.
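A minimal sketch; the function and its exponent argument are hypothetical examples:

```matlab
G = gpuArray(im2single(imread('image.tif')));  % hypothetical file name
f = @(x, c) x.^c;            % element-wise operations only
J = arrayfun(f, G, 0.5);     % the whole function runs as one GPU kernel
```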
The benefit of writing functions and launching them through arrayfun, rather than performing the equivalent operations one by one on whole arrays, is that the entire function is compiled into a single GPU kernel, which reduces launch and memory overheads and typically yields better performance.
Before moving on to the next part we should stress that since GPUs are built to process large amounts of data in parallel, there is no guarantee that running code on a GPU instead of a CPU will always result in a speedup. Although image processing algorithms provide good candidates for substantial speedups, this characteristic of the GPU means that vectorisation of code and simultaneous processing of large amounts of data (i.e. avoiding loops wherever possible) becomes even more crucial than in ordinary MATLAB programs. Further, GPU memory latency and bandwidth often limit the performance of GPU code. This can be alleviated by ensuring that, as far as possible, data that are operated on at the same time are stored nearby in memory. Since arrays are stored in a sequential column-major order in MATLAB, this means avoiding random memory-access patterns where possible and organising our data so that we mostly operate on columns rather than on rows. Finally, when evaluating the performance of our GPU code we should use the function gputimeit or, when timing manually, call wait(gpuDevice) to make sure that all GPU operations have actually finished before the timer is stopped.
3.2. Calling CUDA kernel functions from MATLAB
By using the techniques described in the previous part we can use MATLAB to perform many of our standard image processing routines on the GPU. However, it does not allow us to test our own CUDA-implemented algorithms or use existing ones in our MATLAB programs, nor does it provide a means to explicitly control GPU resources, such as global and shared memory. In this part we demonstrate how this can be remedied by creating a CUDAKernel object through the parallel.gpu.CUDAKernel constructor, which makes a CUDA kernel function, compiled into PTX code, callable directly from MATLAB. This assumes some previous experience of CUDA programming.
The second argument above, the path to the corresponding .cu source file, lets MATLAB determine the number and types of the kernel’s arguments; alternatively, the kernel prototype can be supplied directly as a string.
sets the block size to 32×8 threads, the grid size to 96×384 blocks and the shared memory size per block to 4×32×8=1024 bytes, enough to hold 256 single-precision numbers.
To execute our kernel function we call feval with the CUDAKernel object and the kernel’s arguments.
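A sketch of the whole procedure; the file names, launch configuration and input size are hypothetical examples, and the .ptx file is assumed to have been produced with nvcc --ptx:

```matlab
k = parallel.gpu.CUDAKernel('my_kernel.ptx', 'my_kernel.cu');
k.ThreadBlockSize  = [32 8 1];         % 32x8 threads per block
k.GridSize         = [96 384];         % 96x384 blocks
k.SharedMemorySize = 4*32*8;           % 1024 bytes of shared memory per block
in  = gpuArray.rand(3072, 'single');   % example input data on the GPU
out = feval(k, in);                    % argument list must match the kernel signature
```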
To function in a MATLAB context, a call to a CUDA kernel through a CUDAKernel object differs somewhat from a native CUDA launch: every kernel argument that is a non-const pointer is treated as an output of the call, so a kernel with, for example, two non-const pointer arguments produces two outputs. With this information we can now call the kernel through feval, supplying one value for each of the kernel’s arguments.
Similarly, we can call the kernel with gpuArray objects as inputs, in which case the data already reside on the GPU and no additional transfers between CPU and GPU memory are incurred.
The output from a kernel launched in this way is of type gpuArray, so the result can either remain on the GPU for further processing or be gathered back into CPU memory. Before relying on a new kernel, it is good practice to compare its output with its corresponding CPU implementation.
3.3. MEX functions and GPU programming
In the previous part we saw how, by running our own CUDA kernels directly from MATLAB, we can overcome some of the limitations present when working only with built-in MATLAB functionality. However, we are still (in release 2013b) limited to using kernel functions that take only standard type arguments, and we can access neither external libraries, such as the NVIDIA Performance Primitives, the NVIDIA CUDA Fast Fourier Transform or the NVIDIA CUDA Random Number Generation libraries, nor the GPU’s texture memory with its spatially optimised layout and hardware interpolation features. Further, we need to have an NVIDIA GPU, be writing our code in CUDA and have access to the PCT to use the GPU in our MATLAB code. In this section we look at how we can use MEX functions to partly or fully circumvent these limitations. This again assumes previous experience of GPU programming and some knowledge of how to write and use MEX functions. A good overview of the latter can be found in the MATLAB documentation.
The first option, which removes the technical restrictions but still requires access to the PCT (running on a 64-bit platform) and a GPU from NVIDIA, is to write MEX functions directly containing CUDA code. The CUDA code itself is written exactly as it would be in any other application, and can either be included in the file containing the MEX function entry point or in a separate file. Although this process is well documented in the PCT documentation and through the PCT examples, we briefly describe it here for consistency. The main advantage of this approach over the one described later is that it enables us to write MEX functions that use GPU arrays as input and output, through the underlying C/C++ type mxGPUArray.
For the above to work, we need to include the header file gpu/mxGPUArray.h, which defines the mxGPUArray type, and initialise the MathWorks GPU API by calling mxInitGPU before any GPU interaction takes place.
A bare-bone MEX function calling a kernel function could look as follows.
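A minimal sketch of such a MEX function, modelled on the PCT examples; the kernel itself, its doubling operation and the launch configuration are hypothetical, and the input is assumed to be a single-precision gpuArray:

```cuda
#include "mex.h"
#include "gpu/mxGPUArray.h"

// Hypothetical example kernel: double every element in place.
__global__ void myKernel(float *d, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0f;
}

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mxInitGPU();                                          // initialise the GPU API
    mxGPUArray *out = mxGPUCopyFromMxArray(prhs[0]);      // writable GPU copy of the input
    int n = (int)mxGPUGetNumberOfElements(out);
    float *d = (float *)mxGPUGetData(out);
    myKernel<<<(n + 255) / 256, 256>>>(d, n);             // example launch configuration
    plhs[0] = mxGPUCreateMxArrayOnGPU(out);               // return the result as a gpuArray
    mxGPUDestroyGPUArray(out);
}
```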
After compilation with the mex command (set up to invoke the nvcc compiler), assuming the above code is found in a single .cu file, the resulting MEX function can be called from MATLAB like any ordinary function.
Note that an explicit type conversion may be necessary to convert the second argument to the type expected by the kernel, since MATLAB numeric values are double precision by default.
The rest of this part will be dedicated to describing how we can call GPU code from MATLAB even if we do not have access to the PCT or do not write our code in CUDA. Since, without the PCT, the mxGPUArray interface is unavailable, input and output data have to be passed as standard MATLAB arrays, with any transfer between CPU and GPU memory handled inside the MEX function itself.
The first alternative is to compile our GPU code, regardless of in which language it is written, into a static library using our external compiler, and then to call this library from a MEX function that we compile using the mex command.
Once the static library, normally a .lib file on Windows or a .a file on Linux, has been built with the external compiler, a MEX function calling into it can be compiled and linked using the mex command.
Note that it is normally better, especially if using a pre-existing library, to use a separate header file declaring the functions exported by the library, which can then be included by both the library and the MEX function.
The second alternative is to build the MEX function directly in our external compiler, without going through the mex command.
A MEX function is a dynamically linked shared library which exports a single entry point called mexFunction. Provided the library is linked against the MATLAB libraries and given the correct file extension (for example .mexw64 on 64-bit Windows), it can be called from MATLAB like any other MEX function.
In the case of an existing code library, we can add a new component (such as a new project to an existing solution in Visual Studio) containing a MEX function calling our desired GPU functions so that it is built directly with the original library without any additional steps.
In conclusion, MATLAB is a useful tool for prototyping, developing and testing image processing algorithms and pipelines. It provides the user with the option of either using the functions of the IPT or leveraging the capabilities of a high-level programming language combined with many built-in standard functions to create their own algorithms. When high performance or processing of large amounts of data is required, the computational power of a GPU can be exploited. At this point, there are three main options for image processing on GPU in MATLAB: i) we can stick entirely to MATLAB code, making use of the built-in, GPU-enabled functions in the IPT and the PCT as well as our own GPU functions built from element-wise operations only; ii) we can use the framework provided by the PCT to call our own CUDA kernel functions directly from MATLAB; iii) we can write a MEX function in C/C++ that can be called from MATLAB and which in turn calls the GPU code of our choice.
The authors would like to acknowledge the European Commission FP7 ENTERVISION programme, grant agreement no.: 264552