Different parameters of our physics-inspired feature decomposition method PAGE.
When emulated by an algorithm, certain physical phenomena have useful properties for image transformation. For example, image denoising can be achieved by propagating the image through the heat diffusion equation. Different stages of the temporal evolution represent a multiscale embedding of the image. Inspired by the photonic time stretch, a real-time data acquisition technology, the Phase Stretch Transform (PST) emulates 2D propagation through a medium with group velocity dispersion, followed by coherent (phase) detection. The algorithm performs exceptionally well as an edge and texture extractor, in particular in visually impaired images. Here, we introduce a decomposition method that is metaphorically analogous to birefringent diffractive propagation. This decomposition method, which we term the Phase-stretch Adaptive Gradient-field Extractor (PAGE), embeds the original image into a set of feature maps that select semantic information at different scales, orientations, and spatial frequencies. We demonstrate applications of this algorithm in edge detection and in the extraction of semantic information from medical images, electron microscopy images of semiconductor circuits, optical characters, and fingerprint images. The code for this algorithm is available at https://github.com/JalaliLabUCLA.
- computational imaging
- physics-inspired algorithms
- phase stretch transform
- feature engineering
- Gabor filter
- digital image processing
1. Introduction
Physical phenomena described by partial differential equations (PDEs) have inspired a new field in computational imaging and computer vision. Such physics-inspired algorithms based on PDEs have been successful for image smoothening and restoration. Image restoration can be viewed as obtaining the solution to evolution equations by minimizing an energy function. The most popular PDE technique for image smoothening treats the original image as the initial state of a diffusion process and extracts filtered versions from its evolution at different times. This embeds the original image into a family of progressively simpler images arranged in a hierarchy of scales. Such a scale-space representation is useful for extracting semantically important information. Physics-based algorithms not only outperform their conventional counterparts but have also enabled new applications. Usage of these algorithms ranges from feature detection in digital images [3, 4, 5], to 3D modeling of objects from 2D images [6, 7], to optical character recognition and the restoration of audio quality.
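The diffusion-based scale space described above can be sketched in a few lines. This is a minimal illustration, assuming an explicit finite-difference scheme on a unit grid with replicated borders; the function names `diffuse` and `scale_space`, the time step, and the test image are hypothetical choices, not taken from the chapter.

```python
import numpy as np

def diffuse(image, steps, dt=0.2):
    """Run `steps` explicit heat-equation updates with replicated borders."""
    u = np.asarray(image, dtype=float).copy()
    for _ in range(steps):
        p = np.pad(u, 1, mode="edge")
        # 5-point Laplacian on a unit grid
        lap = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * u
        u = u + dt * lap  # explicit Euler step, stable for dt <= 0.25
    return u

def scale_space(image, step_counts=(0, 5, 20, 80)):
    """Family of progressively smoother images, one per diffusion time."""
    return [diffuse(image, n) for n in step_counts]

# Noisy step edge (assumed test pattern) embedded into four scales.
rng = np.random.default_rng(0)
clean = np.zeros((32, 32))
clean[:, 16:] = 1.0
noisy = clean + 0.2 * rng.standard_normal(clean.shape)
family = scale_space(noisy)
```

Each member of `family` keeps the mean intensity of the input while its variance shrinks, which is the "family of simpler images" the text refers to.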
The Phase Stretch Transform (PST) is a physics-inspired algorithm that emulates 2D propagation through a medium with group velocity dispersion, followed by coherent (phase) detection [10, 11]. The algorithm performs exceptionally well as an edge and texture extractor, in particular in visually impaired images. This transform has an inherent equalization ability that supports a wide dynamic range of operation for feature detection [12, 13, 14]. It also exhibits superior properties over conventional derivative operators, particularly in terms of feature enhancement in noisy, low-contrast images. These properties have been exploited to develop image processing tools for clinical needs, such as a decision support system for radiologists to diagnose pneumothorax [15, 16], as well as for resolution enhancement in brain MRI images, single-molecule imaging, and image segmentation.
PST emulates the physics of photonic time stretch, a real-time measurement technology that has enabled the observation and detection of ultrafast, non-repetitive events such as optical rogue waves, optical fiber soliton explosions, and the birth of mode locking in lasers. Further, by combining photonic time stretch technology with machine learning algorithms, a world-record accuracy has been achieved for the classification of cancer cells in the blood stream [24, 25].
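The photonic time stretch employs group-velocity dispersion (GVD) in an optical fiber to slow down an analog signal in time by propagating a modulated optical pulse through the time stretch system, which is governed by the following equation:

$$E_o(z,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \tilde{E}_i(0,\omega)\, e^{-j\beta_2 z \omega^2/2}\, e^{j\omega t}\, d\omega \tag{1}$$

where $\beta_2$ is the GVD parameter, $z$ is the propagation distance, $\tilde{E}_i(0,\omega)$ is the spectrum of the input pulse, and $E_o(z,t)$ is the reshaped output pulse at distance $z$ and time $t$.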
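As a rough numerical illustration of dispersive propagation, the sketch below applies a quadratic spectral phase to a sampled Gaussian pulse and measures how much it broadens. The sign convention of the phase term, the dimensionless units, and the names `disperse` and `rms_width` are assumptions made for this example only.

```python
import numpy as np

def disperse(pulse, dt, beta2_z):
    """Multiply the pulse spectrum by exp(-j * beta2 * z * w^2 / 2)."""
    omega = 2.0 * np.pi * np.fft.fftfreq(pulse.size, d=dt)
    spectrum = np.fft.fft(pulse)
    return np.fft.ifft(spectrum * np.exp(-0.5j * beta2_z * omega**2))

def rms_width(field, t):
    """Root-mean-square duration of the intensity profile |field|^2."""
    power = np.abs(field) ** 2
    centre = np.sum(t * power) / np.sum(power)
    return np.sqrt(np.sum((t - centre) ** 2 * power) / np.sum(power))

t = np.linspace(-50.0, 50.0, 2048, endpoint=False)
pulse = np.exp(-t**2 / 2.0)                       # unit-width Gaussian input
stretched = disperse(pulse, t[1] - t[0], beta2_z=5.0)
width_in = rms_width(pulse, t)
width_out = rms_width(stretched, t)
```

Because the operation is a pure phase filter, the pulse energy is unchanged while its duration grows, which is the stretching effect that the time stretch system relies on.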
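PST applies the analogous dispersive phase operation to a 2D image, followed by phase detection:

$$E_o(x,y) = \mathrm{IFFT2}\{\tilde{K}(u,v)\cdot \mathrm{FFT2}\{E_i(x,y)\}\} \tag{2}$$

$$\mathrm{PST}\{E_i(x,y)\} = \angle\langle E_o(x,y)\rangle \tag{3}$$

In the above equations, $E_i(x,y)$ is the input image, FFT2 is the 2D Fast Fourier Transform, IFFT2 is the 2D Inverse Fast Fourier Transform, $x$ and $y$ are the spatial variables, and $u$ and $v$ are the spatial frequency variables. The function $\tilde{K}(u,v)$ is called the warped phase kernel and is implemented in the frequency domain for image processing.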
PST utilizes GVD to convert a real image into a complex quantity such that the spatial phase after the operation is a function of frequency. Upon thresholding, the high-frequency edges survive. The phase kernel for the PST is designed by converting the 2D Cartesian frequencies $u$ and $v$ to polar coordinates, which results in a symmetric Cartesian phase kernel. Because such a kernel depends only on the radial frequency, it treats edges of all orientations alike. However, as digital images are fundamentally two-dimensional, there is an inherent loss of information in the features detected by PST. This motivates us to develop a more comprehensive approach that captures angular as well as spatial frequency information in a semantic fashion.
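The PST operation can be sketched as a frequency-domain phase filter followed by phase detection. The chapter does not spell out the warped kernel, so the sketch below assumes the arctangent-derived radial profile with a strength and a warp parameter that is commonly used in published PST descriptions; the names `pst_kernel` and `pst`, the default values, and the sign of the exponent are assumptions, and the localization (smoothing) kernel and thresholding are omitted.

```python
import numpy as np

def pst_kernel(shape, strength=0.5, warp=20.0):
    """Radially symmetric phase profile, normalised so its peak equals `strength`."""
    u = np.fft.fftfreq(shape[0])[:, None]
    v = np.fft.fftfreq(shape[1])[None, :]
    r = warp * np.hypot(u, v)                     # polar radius, scaled by warp
    profile = r * np.arctan(r) - 0.5 * np.log1p(r**2)
    return strength * profile / profile.max()

def pst(image, strength=0.5, warp=20.0):
    """Phase of the image after the spectral phase filter is applied."""
    img = np.asarray(image, dtype=float)
    spectrum = np.fft.fft2(img)
    kernel = np.exp(-1j * pst_kernel(img.shape, strength, warp))
    return np.angle(np.fft.ifft2(spectrum * kernel))

# Step edge on a non-zero background (assumed test pattern).
step = np.full((32, 32), 0.2)
step[:, 16:] = 1.0
phase = pst(step)
```

Since the profile is zero at zero frequency, a constant image yields zero output phase, while images with edges produce a non-zero phase; the radial symmetry of the kernel is what prevents it from separating edges by orientation.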
In this chapter, we introduce the Phase-stretch Adaptive Gradient-field Extractor (PAGE), a new physics-inspired feature engineering algorithm that computes a feature set comprising edges at different spatial frequencies, orientations, and scales. These filters metaphorically emulate the physics of birefringent (orientation-dependent) diffractive propagation through a physical medium with a specific diffractive property. In such a medium, the dielectric constant, and hence the refractive index, is a function of spatial frequency and of the polarization in the transverse plane. To understand this metaphoric analogy, we consider an optical pulse with two linearly orthogonal polarizations, and , propagating through a dispersive diffractive medium such that
As the propagation constant is a function of refractive index (spatially varying), the two orthogonal polarizations and will have different propagation constants and hence, a phase difference at the output given by the following equation:
By controlling the value of and , as well as the dependence of refractive index on frequency and , we are able to detect a semantic hyper-dimensional feature set from a 2D image. We demonstrate with several visual examples later in this chapter that the above filter banks can be applied to image processing and computer vision tasks such as the detection of fabrication artifacts in semiconductor chips, the development of clinical decision support systems, and the recognition of optical characters or fingerprints. In particular, we show that PAGE features outperform conventional derivative operators as well as directional Gabor filter banks.
Further, we address the dual problem of spatial resolution and dynamic range limitations in an imaging system. In an ideal imaging system, the numerical aperture and the wavelength of an optical setup are the only factors that determine the spatial resolution offered by the modality. Under non-ideal conditions, however, the number of photons collected from a specimen controls its dynamic range (the ratio between the largest and the smallest value of a variable quantity), which in turn also limits the spatial resolution. This leads to the fundamental dual problem of spatial resolution and dynamic range limitations in an imaging modality.
Certain approaches to improve the resolution of the imaging system include the use of wide-field fluorescence microscopy [27, 28], which offers better resolution than confocal fluorescence microscopy, and of multiple fluorophores [30, 31]. Various image processing techniques, such as multi-scale analysis using wavelets [32, 33], have also been proposed for improving the resolution while retaining important visual information after image acquisition. We show later in the chapter that we are able to alleviate this dual problem by incorporating in our algorithm a local adaptive contrast enhancement operator, also known as a Tone Mapping Operator (TMO), which leads to excellent dynamic range.
The organization of the chapter is as follows. In Section 2, we describe the details of the proposed decomposition method, including its remaining steps. Experimental results and conclusions are presented in Sections 3 and 4, respectively.
2. Mathematical framework
The steps of our proposed decomposition method for feature engineering, the Phase-stretch Adaptive Gradient-field Extractor (PAGE), are shown in Figure 1. The first step is to apply an adaptive tone mapping operator (TMO) to enhance the local contrast. Next, we reduce the noise by applying a smoothening kernel in the frequency domain (this operation can also be done in the spatial domain). We then apply a spectral phase kernel that emulates the birefringence and frequency-channelized diffractive propagation. The final step of PAGE is to apply thresholding and morphological operations to the generated feature vectors in the spatial domain to produce the final output. The PAGE output embeds the original image into a set of feature maps that select semantic information at different scales, orientations, and spatial frequencies. We show in Figure 2 how PAGE embeds semantic information at different orientations for an X-ray image of a flower.
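The sequence of steps of our physics-inspired feature extraction method, PAGE, can be represented by the following equations. We first define the birefringent stretch operator $\mathbb{S}\{\cdot\}$ as follows:

$$E_o(x,y) = \mathbb{S}\{E_i(x,y)\} = \mathrm{IFFT2}\{\tilde{K}(u,v,\theta)\cdot\tilde{L}(u,v)\cdot\mathrm{FFT2}\{\mathrm{TMO}\{E_i(x,y)\}\}\} \tag{6}$$

where $E_o(x,y)$ is a complex quantity defined as

$$E_o(x,y) = |E_o(x,y)|\, e^{j\angle E_o(x,y)} \tag{7}$$

In the above equations, $E_i(x,y)$ is the input image, $x$ and $y$ are the spatial variables, FFT2 is the two-dimensional Fast Fourier Transform, IFFT2 is the two-dimensional Inverse Fast Fourier Transform, TMO is a spatially adaptive Tone Mapping Operator, and $u$ and $v$ are the frequency variables. The function $\tilde{K}(u,v,\theta)$ is called the PAGE kernel and the function $\tilde{L}(u,v)$ is a smoothening kernel, both implemented in the frequency domain. For all our simulations here, we consider $\tilde{L}(u,v)$ to be a low-pass Gaussian filter whose cut-off frequency is determined by the sigma of the Gaussian filter ($\sigma_{LPF}$).
The PAGE operator can then be defined as the phase of the output of the stretch operation applied to the input image $E_i(x,y)$:

$$\mathrm{PAGE}\{E_i(x,y)\} = \angle\langle\mathbb{S}\{E_i(x,y)\}\rangle \tag{8}$$

where $\angle\langle\cdot\rangle$ is the angle operator.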
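The chain of operations (tone mapping, spectral smoothing, phase kernel, phase detection) can be sketched as below. This is a skeleton under stated assumptions: the phase profile is passed in as an array, the tone mapping step is an optional callable that defaults to the identity, and the names `gaussian_lowpass` and `page_operator` as well as the sign of the exponent are illustrative choices rather than the authors' implementation.

```python
import numpy as np

def gaussian_lowpass(shape, sigma_lpf):
    """Gaussian smoothing kernel sampled on the FFT frequency grid."""
    u = np.fft.fftfreq(shape[0])[:, None]
    v = np.fft.fftfreq(shape[1])[None, :]
    return np.exp(-(u**2 + v**2) / (2.0 * sigma_lpf**2))

def page_operator(image, phase_profile, sigma_lpf=0.1, tmo=None):
    """Phase of IFFT2{ K * L * FFT2{ TMO{image} } } with K = exp(j * phase_profile)."""
    img = np.asarray(image, dtype=float)
    enhanced = img if tmo is None else tmo(img)    # contrast enhancement first
    spectrum = np.fft.fft2(enhanced)
    smooth = gaussian_lowpass(enhanced.shape, sigma_lpf)
    kernel = np.exp(1j * np.asarray(phase_profile, dtype=float))
    return np.angle(np.fft.ifft2(spectrum * kernel * smooth))

# Usage with a placeholder radial phase profile (not the PAGE kernel itself).
rng = np.random.default_rng(1)
image = 0.2 + rng.random((32, 32))
u = np.fft.fftfreq(32)[:, None]
v = np.fft.fftfreq(32)[None, :]
response = page_operator(image, 2.0 * np.hypot(u, v))
```

With a zero phase profile the output phase vanishes for a positive image, so all detected structure comes from the phase kernel; the smoothing kernel only shapes which frequencies contribute.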
In the next subsections, we discuss each of the above mentioned kernels in detail and demonstrate the operation of each step using simulation results.
2.1 Tone mapping operator (TMO)
A tone mapping operator (TMO) is applied to enhance the local contrast in the input image. This technique is a standard method in image processing for addressing limited contrast in an imaging system while still preserving important details, and it thereby helps to improve the dynamic range of an imaging system via post-processing. While various TMOs have been developed for adaptive contrast enhancement, here we implement the TMO step by applying a Contrast Limited Adaptive Histogram Equalization (CLAHE) operator to the input image.
We operate on the input image using a TMO first, followed by a smoothening operator (low-pass filter), and not vice versa. The reason for this sequence of operations is as follows. Noise present in an image is mostly represented by the high-frequency components in the spectrum. These high-frequency components can be present at both low and high light levels in the spatial domain. Because of the tone mapping operator, the low-light-level features get over-emphasized [34, 35]. This also amplifies the image noise, particularly in low-light scenarios. By applying a smoothening filter after the TMO operation, we aim to remove these noise artifacts of the contrast enhancement step. Conversely, if the smoothening kernel were applied first, any noise left in the image would be amplified by the TMO operation in the next step. Therefore, one may need to alternate between the smoothening step and the TMO before obtaining a satisfactory final result.
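To make the contrast-limiting idea concrete, the sketch below equalizes a clipped histogram over the whole image. It is a simplified stand-in and not CLAHE itself: CLAHE applies the same clip-and-redistribute step per tile and interpolates between tiles. The function name `clipped_hist_equalize`, the clip limit, and the bin count are assumptions for this example.

```python
import numpy as np

def clipped_hist_equalize(image, clip_limit=0.02, bins=256):
    """Global histogram equalization with a clipped, redistributed histogram."""
    img = np.asarray(image, dtype=float)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img)                 # flat image: nothing to enhance
    norm = (img - lo) / (hi - lo)
    hist, _ = np.histogram(norm, bins=bins, range=(0.0, 1.0))
    hist = hist / hist.sum()
    # Clip tall bins and spread the excess evenly to limit noise amplification.
    excess = np.clip(hist - clip_limit, 0.0, None).sum()
    hist = np.minimum(hist, clip_limit) + excess / bins
    cdf = np.cumsum(hist)
    cdf = (cdf - cdf[0]) / (cdf[-1] - cdf[0])      # map to the range [0, 1]
    index = np.minimum((norm * bins).astype(int), bins - 1)
    return cdf[index]

rng = np.random.default_rng(0)
low_contrast = 0.4 + 0.2 * rng.random((64, 64))
enhanced = clipped_hist_equalize(low_contrast)
```

The mapping is monotonic, so pixel ordering is preserved while the intensity range is stretched; a lower `clip_limit` caps how strongly any single gray level, including noise in dark regions, can be amplified.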
2.2 Phase-stretch adaptive gradient-field extractor (PAGE) kernel
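Phase-stretch adaptive gradient-field extractor (PAGE) filter banks are defined by the PAGE kernel $\tilde{K}(u,v,\theta)$ and are designed to compute semantic information from an image at different orientations and frequencies. The PAGE kernel $\tilde{K}(u,v,\theta)$ consists of a phase filter that is a function of the frequency variables $u$ and $v$ and of a steerable angle variable $\theta$, which controls the directionality of the response. We first define the translated frequency variables $u'$ and $v'$,

$$u' = u\cos\theta + v\sin\theta \tag{9}$$

$$v' = -u\sin\theta + v\cos\theta \tag{10}$$

such that the frequency vector rotates about the origin with $\theta$.
We then define the PAGE kernel as a function of the frequency variables $u$ and $v$ and the steerable angle $\theta$ as follows:

$$\tilde{K}(u,v,\theta) = \exp\{j\,\phi_1(u')\,\phi_2(v')\} \tag{11}$$

$$\phi_1(u') = S_{u'}\,\frac{1}{\sigma_{u'}\sqrt{2\pi}}\exp\left(-\frac{(|u'|-\mu_{u'})^2}{2\sigma_{u'}^2}\right) \tag{12}$$

$$\phi_2(v') = S_{v'}\,\frac{1}{|v'|\,\sigma_{v'}\sqrt{2\pi}}\exp\left(-\frac{(\ln|v'|-\mu_{v'})^2}{2\sigma_{v'}^2}\right) \tag{13}$$

There are two important things that should be noted here. First, we consider the modulus of our translated frequency variables $u'$ and $v'$ so that our kernel is symmetric for proper phase operation. Second, for all our simulation examples here, when we consider a bank of PAGE filters, we first normalize $\phi_1$ and $\phi_2$ to the range (0,1) for all values of $\theta$ and then multiply the filter banks by $S_{u'}$ and $S_{v'}$, respectively, to ensure that the amplitude of each filter in the bank is the same.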
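A directional phase profile of this kind can be sketched as follows. The sketch assumes a normal profile in the modulus of one rotated frequency and a log-normal profile in the modulus of the other, each normalised to a unit peak and scaled by a strength, with the two multiplied together; the function name `page_phase`, the parameter names, and the default values are hypothetical.

```python
import numpy as np

def page_phase(shape, theta, mu_u=0.0, sigma_u=0.05,
               mu_v=-1.8, sigma_v=0.5, s_u=1.0, s_v=1.0):
    """Directional phase profile on the FFT frequency grid for steering angle theta."""
    u = np.fft.fftfreq(shape[0])[:, None]
    v = np.fft.fftfreq(shape[1])[None, :]
    # Rotate the frequency axes, then take the modulus so the profile is symmetric.
    u_rot = np.abs(u * np.cos(theta) + v * np.sin(theta))
    v_rot = np.abs(-u * np.sin(theta) + v * np.cos(theta))
    v_rot = np.maximum(v_rot, 1e-12)               # guard against log(0)
    normal = np.exp(-0.5 * ((u_rot - mu_u) / sigma_u) ** 2)
    lognormal = np.exp(-0.5 * ((np.log(v_rot) - mu_v) / sigma_v) ** 2) / v_rot
    # Normalise each profile to a unit peak, then apply its strength.
    phi_1 = s_u * normal / normal.max()
    phi_2 = s_v * lognormal / lognormal.max()
    return phi_1 * phi_2

profile = page_phase((33, 33), theta=np.pi / 6)
kernel = np.exp(1j * profile)                      # spectral phase filter
```

Taking the modulus of the rotated frequencies makes the profile identical at a frequency and its negative, and steering by 180 degrees reproduces the same filter, so a bank only needs angles in a half turn.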
These filter banks can detect features at a particular frequency and/or in a particular direction. Therefore, by selecting a desired direction and/or frequency, a hyper-dimensional feature map can be constructed. We list all parameters in Table 1 that control different functionalities of our proposed decomposition method PAGE.
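|Parameter|Description|
|---|---|
|$u'$ and $v'$|Translated spatial frequency|
|$\phi_2$|Log-normal filter|
|$S_{u'}$|Strength of the normal filter|
|$S_{v'}$|Strength of the log-normal filter|
|$\mu_{u'}$|Mean of normal distribution for the normal filter|
|$\mu_{v'}$|Mean of log-normal distribution for the log-normal filter|
|$\sigma_{u'}$|Sigma of normal distribution for the normal filter|
|$\sigma_{v'}$|Sigma of log-normal distribution for the log-normal filter|
|$\sigma_{LPF}$|Sigma of Gaussian distribution for smoothening kernel|
|Threshold (min, max)|Bi-level feature thresholding for morphological operations|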
Figure 3A–P shows the phase profiles generated with the PAGE kernels described in Eqs. (10)–(13), which select semantic information at different orientations and frequencies. These phase kernels are applied to the input image spectrum. Using the steerable angle, the directionality of the edge response can be controlled in the output phase of the transformed image. The detected output response for each directional filter is thresholded using a bi-level method. This is done to preserve both negative and positive high-amplitude values.
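Bi-level thresholding can be expressed compactly: a pixel is kept when its phase is below a negative threshold or above a positive one. The function name `bilevel_threshold` and the threshold values below are illustrative assumptions.

```python
import numpy as np

def bilevel_threshold(phase, t_min, t_max):
    """Keep strongly negative (< t_min) and strongly positive (> t_max) responses."""
    phase = np.asarray(phase, dtype=float)
    return (phase < t_min) | (phase > t_max)

sample = np.array([-0.9, -0.1, 0.0, 0.2, 0.8])
mask = bilevel_threshold(sample, t_min=-0.5, t_max=0.5)
```

A single threshold on the signed phase would discard one polarity of the response, whereas the two-sided test retains both.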
To detect features in a particular direction spread over all the frequency components in the spectrum, we construct the PAGE filter banks by using Eqs. (9)–(13) for , and respectively. By controlling the value of sigma of the normal distribution for the filter, we avoid any overlapping of directional filters, as seen in Figure 4.
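A bank of directional filters applied to one image yields a stack of responses, one per steering angle. The sketch below is a simplified illustration: it assumes a product of a normal and a log-normal profile in rotated frequency coordinates, omits tone mapping, smoothing, and thresholding, and uses hypothetical names (`page_phase`, `page_bank`) and parameter values.

```python
import numpy as np

def page_phase(shape, theta, sigma_u=0.05, mu_v=-1.8, sigma_v=0.5):
    """Directional phase profile with unit peak for steering angle theta."""
    u = np.fft.fftfreq(shape[0])[:, None]
    v = np.fft.fftfreq(shape[1])[None, :]
    u_rot = np.abs(u * np.cos(theta) + v * np.sin(theta))
    v_rot = np.maximum(np.abs(-u * np.sin(theta) + v * np.cos(theta)), 1e-12)
    phi_1 = np.exp(-0.5 * (u_rot / sigma_u) ** 2)
    phi_2 = np.exp(-0.5 * ((np.log(v_rot) - mu_v) / sigma_v) ** 2) / v_rot
    return (phi_1 / phi_1.max()) * (phi_2 / phi_2.max())

def page_bank(image, n_angles=180):
    """Return the steering angles (degrees) and the stack of output phases."""
    img = np.asarray(image, dtype=float)
    spectrum = np.fft.fft2(img)
    thetas = np.arange(n_angles) * np.pi / n_angles
    stack = [np.angle(np.fft.ifft2(spectrum * np.exp(1j * page_phase(img.shape, t))))
             for t in thetas]
    return np.degrees(thetas), np.stack(stack)

# Stripes that vary along the columns only (assumed test pattern).
cols = np.arange(64)
stripes = np.tile(1.0 + 0.5 * np.cos(2.0 * np.pi * 8 * cols / 64), (64, 1))
angles, responses = page_bank(stripes, n_angles=4)
energy = np.abs(responses).mean(axis=(1, 2))
```

For this pattern the response energy concentrates in a single steering angle and vanishes for the perpendicular one, which is the behavior that lets a narrow normal profile keep neighboring directional filters from overlapping.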
We first evaluate the performance of these kernels by qualitatively comparing the feature detection of PAGE with that of PST. The image under analysis is a gray-scale image of a rose. For a better visual understanding of our method, we first compute orthogonal directional responses, as shown in Figure 5. We then show the results of edge detection using PST and PAGE in Figure 6. The values for the parameters strength , , , , , , and . The number of filters considered for a 1° resolution is 180. The morphological operations used for the result shown in Figure 6C include edge thinning and isolated pixel removal for each directional response. As is evident in Figure 6, edges are accurately extracted with our technique. Different colors in the computed edge response indicate the edge directionality.
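One way to render edge directionality as color is to assign each pixel the hue of its dominant orientation and scale the brightness by the response strength. The mapping of angle to hue below is an assumption for illustration, as is the function name `orientation_to_rgb`; the chapter does not specify the color scheme used in its figures.

```python
import colorsys
import numpy as np

def orientation_to_rgb(responses, angles_deg):
    """Colour each pixel by the orientation with the largest |response|.

    responses: array of shape (n_angles, height, width).
    """
    magnitude = np.abs(np.asarray(responses, dtype=float))
    dominant = np.argmax(magnitude, axis=0)        # index of strongest orientation
    strength = magnitude.max(axis=0)
    # One fully saturated hue per orientation in [0, 180) degrees.
    palette = np.array([colorsys.hsv_to_rgb((a % 180.0) / 180.0, 1.0, 1.0)
                        for a in angles_deg])
    peak = strength.max()
    weight = strength / peak if peak > 0 else np.zeros_like(strength)
    return palette[dominant] * weight[..., None]

demo = np.zeros((3, 1, 3))
demo[0, 0, 0] = 1.0    # pixel 0 responds to 0 degrees
demo[1, 0, 1] = 1.0    # pixel 1 responds to 60 degrees
demo[2, 0, 2] = 0.5    # pixel 2 responds weakly to 120 degrees
rgb = orientation_to_rgb(demo, [0.0, 60.0, 120.0])
```

In this toy example the three pixels come out red, green, and half-brightness blue, so both orientation and edge strength remain visible in a single image.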
2.2.2 Frequency selectivity
The PAGE filter banks can also be designed to detect edges at a particular frequency by controlling the spread of the log-normal distribution. To demonstrate this functionality, we show the features detected at low and high frequency, using the rose image as an example, in Figure 7. As seen in the figure, the features detected at low frequency are smoother, whereas those detected at high frequency are sharper.
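The effect of the log-normal parameters on the selected frequency band can be checked numerically. A log-normal profile with parameters $\mu$ and $\sigma$ peaks at $e^{\mu-\sigma^2}$, so shifting $\mu$ moves the passband and changing $\sigma$ alters both its width and its peak. The function name `lognormal_profile` and the parameter values are assumptions for this sketch.

```python
import numpy as np

def lognormal_profile(freq, mu, sigma):
    """Log-normal profile over positive spatial frequencies."""
    freq = np.asarray(freq, dtype=float)
    return (np.exp(-0.5 * ((np.log(freq) - mu) / sigma) ** 2)
            / (freq * sigma * np.sqrt(2.0 * np.pi)))

freq = np.linspace(1e-3, 0.5, 5000)               # cycles per pixel, up to Nyquist
low_band = lognormal_profile(freq, mu=np.log(0.05), sigma=0.4)
high_band = lognormal_profile(freq, mu=np.log(0.3), sigma=0.4)
peak_low = freq[np.argmax(low_band)]
peak_high = freq[np.argmax(high_band)]
```

Here `peak_low` sits near 0.043 cycles per pixel and `peak_high` near 0.26, illustrating how one profile favors coarse structure and the other fine detail.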
3. Experimental results
3.1 Comparison to Gabor feature extractors
We demonstrate the effectiveness of our decomposition method by comparing its directional edge response with that obtained by applying Gabor filter banks to an optical character image. We design 24 directional Gabor filters and combine the responses of all the filters to generate the image in Figure 8B. As seen in Figure 8C, PAGE gives better spatial localization of the edge response. By spatial localization, we mean that PAGE inherently has a sharper edge response, as seen in the figure. This is because, unlike Gabor filters, whose bandwidth is determined by the sigma parameter of the filter, the bandwidth of the PAGE response is determined by the input image dimension. Therefore, there is better localization of edges with PAGE. The parameter values are strength , , , , , , and . The number of filters considered for a 1° resolution is 180.
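For reference, a bank of 24 directional Gabor filters can be sketched as follows. The kernel size, wavelength, sigma, and aspect ratio are assumed values, and `gabor_kernel`, `gabor_bank`, and `filter_fft` are hypothetical names; the chapter does not list the Gabor settings used for its comparison.

```python
import numpy as np

def gabor_kernel(size, theta, wavelength, sigma, gamma=0.5):
    """Complex Gabor kernel: Gaussian envelope times a plane-wave carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_rot**2 + (gamma * y_rot) ** 2) / (2.0 * sigma**2))
    carrier = np.exp(2j * np.pi * x_rot / wavelength)
    return envelope * carrier

def gabor_bank(n_orient=24, size=21, wavelength=6.0, sigma=3.0):
    """Orientations evenly spaced over a half turn, one kernel each."""
    thetas = np.arange(n_orient) * np.pi / n_orient
    return thetas, [gabor_kernel(size, t, wavelength, sigma) for t in thetas]

def filter_fft(image, kernel):
    """Circular convolution via the FFT with the kernel centred at the origin."""
    padded = np.zeros(image.shape, dtype=complex)
    k = kernel.shape[0]
    padded[:k, :k] = kernel
    padded = np.roll(padded, (-(k // 2), -(k // 2)), axis=(0, 1))
    return np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded))

thetas, bank = gabor_bank()
cols = np.arange(60)
stripes = np.tile(np.cos(2.0 * np.pi * cols / 6.0), (60, 1))
energy = [np.abs(filter_fft(stripes, k)).mean() for k in bank]
```

The spatial extent of each Gabor response is set by `sigma`, which is the bandwidth dependence that the comparison above contrasts with PAGE.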
3.2 Comparison to derivative feature extractors
To demonstrate the advantage of our decomposition method over derivative-based operators, we compare its edge response with theirs on the test image shown in Figure 9A. The response of a derivative-based operator is computed using the edge function of MATLAB (Canny method) and is shown in Figure 9B. As seen in Figure 9C, PAGE outperforms derivative-based operators by also recovering the orientation information and the low-contrast details. The parameter values are strength , , , , , , and . The number of filters considered for a 1° resolution is 180.
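The limitation of a fixed-threshold derivative operator on low-contrast edges can be illustrated with a Sobel gradient. This sketch uses Sobel as a simple representative of derivative-based operators; it is not the Canny detector used for Figure 9B, which adds smoothing, non-maximum suppression, and hysteresis. The function name `sobel_edges` and the threshold are assumptions.

```python
import numpy as np

def sobel_edges(image, threshold):
    """Sobel gradient magnitude with a single global threshold."""
    p = np.pad(np.asarray(image, dtype=float), 1, mode="edge")
    # Horizontal derivative: right column minus left column, weighted 1-2-1.
    gx = (p[:-2, 2:] + 2.0 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2.0 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical derivative: bottom row minus top row, weighted 1-2-1.
    gy = (p[2:, :-2] + 2.0 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2.0 * p[:-2, 1:-1] - p[:-2, 2:])
    magnitude = np.hypot(gx, gy)
    return magnitude > threshold, magnitude

strong = np.zeros((16, 16))
strong[:, 8:] = 1.0                     # high-contrast step
weak = np.full((16, 16), 0.50)
weak[:, 8:] = 0.52                      # low-contrast step at the same place
strong_edges, _ = sobel_edges(strong, threshold=1.0)
weak_edges, _ = sobel_edges(weak, threshold=1.0)
```

With the same threshold, the high-contrast step is detected and the low-contrast step is missed, because the gradient magnitude scales directly with the local intensity difference.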
3.3 Simulation results
We apply our decomposition method to different types of images to show that the directional edge response obtained by PAGE can be used for various computer vision applications. For example, in Figure 10, we show the application of PAGE to a Scanning Electron Microscope (SEM) image of an integrated circuit chip. As seen, the PAGE feature response is able to capture the edges corresponding to the chip layout, even the low-contrast details. Based on the viewing angle (camera position), the layout edges should be appropriately rendered in the image as well as in the edge map. This can be used to identify any chip artifacts arising during the fabrication process. The parameter values for generating the feature map shown in Figure 10 are strength , , , , , , and . The number of filters considered for a 1° resolution is 180.
We also apply PAGE to detect the directional edge response in an image of a fingerprint, as shown in Figure 11. Not only does PAGE detect a directional edge response, but it also has an inherent equalization property that allows it to detect low-contrast edges. The parameter values are strength , , , , , , and . The number of filters considered for a 1° resolution is 180.
Next, we show the application of our decomposition method PAGE to extract the edges of vessels from a retinal image in Figure 12. The distribution of vessels based on the orientation of their edges can be used as an important feature for detecting abnormalities in the eye structure. As seen, the PAGE feature response is able to capture both the low-contrast details and the directionality of the vessel edges, which is coded as the color value in RGB space. The parameter values are strength , , , , , , and . The number of filters considered for a 1° resolution is 180.
4. Conclusions
In this chapter, we presented a new feature engineering method that takes inspiration from the physical phenomenon of birefringence in an optical system. The introduced method, called the Phase-stretch Adaptive Gradient-field Extractor (PAGE), controls the diffractive properties of the simulated medium as a function of spatial location and channelized frequency. When applied to 2D digital images, this method extracts semantic information from the input image at different orientations, scales, and frequencies and embeds this information into a hyper-dimensional feature map. The computed response was compared with that of other directional filters, such as Gabor filters, to demonstrate the superior performance of PAGE. Applications of the algorithm to edge detection and to the extraction of semantic information from medical images, electron microscopy images of semiconductor circuits, optical characters, and fingerprint images were also shown.
Acknowledgements
The authors would like to thank Dr. Ata Mahjoubfar for his helpful comments on this work during his post-doctoral studies in Jalali Lab at UCLA. This work was partially supported by the National Institutes of Health (NIH) Grant No. 5R21 GM107924-03 and the Office of Naval Research (ONR) Multi-disciplinary University Research Initiatives (MURI) program on Optical Computing.
Conflict of interest
The authors declare no conflict of interest.