Second-order differential invariants.

## Abstract

The chapter introduces multiscale methods for image analysis and their applications to segmentation of microscopic images. Specifically, it presents mathematical morphology and linear scale-space theories as overarching signal processing frameworks without excessive mathematical formalization. The chapter introduces several differential invariants, which are computed from parametrized Gaussian kernels and their derivatives. The main application of this approach is to build a multidimensional multiscale feature space, which can be subsequently used to learn characteristic fingerprints of the objects of interests. More specialized applications, such as anisotropic diffusion and detection of blob-like and fiber-like structures, are introduced for two-dimensional images, and extensions to three-dimensional images are discussed. Presented approaches are generic and thus have broad applicability to time-varying signals and to two- and three-dimensional signals, such as microscopic images. The chapter is intended for biologists and computer scientists with a keen interest in the theoretical background of the employed techniques and is in part conceived as a tutorial.

### Keywords

- Laplacian of Gaussian
- scale spaces
- mathematical morphology
- Fourier domain

## 1. Introduction

Neurophysiological data are very much variable, and while certain patterns are prominent and reproducible (e.g., action potentials, tissue textures, and cells) they by no means can be easily defined precisely in a quantitative way. Data are enriched with unwanted patterns having complicated temporal and spatial structure, which are misleadingly called “noise.” Unlike the noise, natural objects have features on a limited number of spatial or temporal scales. This observation is the starting point of all available multiscale methods of analysis. The main focus of the chapter are digital images; however the presented approaches can be applied in the more simple setting of time-varying one-dimensional signals, such as voltage electrophysiological recordings. In the subsequent presentation, the images will always be considered as two-dimensional signals sampled on a rectangular spatial grid. The reason is that all common microscopic approaches acquire images on a plane of illumination; thus three-, four-, and five-dimensional images are essentially sets of correlated planar signals. The third dimension can represent depth, time, or an acquisition channel. Obviously, in the case of four and five dimensions, the number of combinations increases. Therefore, one cannot assume isotropic resolution of the transfer function for more than two dimensions. This situation introduces anisotropy in microscopic images, compared to other imaging modalities, such as magnetic resonance imaging or computer tomography. In microscopy, in such way, the default case is a planar image.

All physical signals are bounded and of finite duration. Such signals are acquired as discrete samples from an underlying physical process which, as an idealization, can be considered as continuous. The physical signals are naturally related to the properties of the measurement apparatus. As another idealization, these properties are described by a linear transfer function so that the measurement process becomes a convolution (denoted further by *central limit theorem* of probability theory. The theorem roughly states that the weighted sum of uncorrelated random variables having finite variance approaches a normally distributed random variable. Hence, if the noise is not spatially and temporally correlated, methods suitable for treating Gaussian or Poisson noise are applicable. Evidently, very fast sampling or sampling from processes having long memories, such as viscoelastic interactions, can violate these requirements. In such settings, other noise models can become more suitable. The readers are directed to [1] for a useful noise classification.

The chapter is organized as follows. Section 2 discusses the segmentation problem in general. Section 3 gives an overview of the mathematical morphology theory and provides examples. Section 4 gives an overview of the geometrical image features from the perspective of differential geometry. Section 5 introduces several types of scale spaces and their application in segmentation. Section 6 discussed implementation details. The chapter is intended for biologists and data scientists with keen interest in theoretical background of the employed techniques and is in part conceived as a tutorial. The references cited in the chapter are suitable for introductions on the mentioned topics.

## 2. Brief overview of image segmentation approaches

Extraction of an object’s boundaries from a digital image is called *segmentation*. Image segmentation is related also to object classification, which does not require precise delineation of the object boundary. Therefore, segmentation can be also viewed as classification on a pixel level.

The image segmentation is a nontrivial problem. For a successful image segmentation, it is important to have prior knowledge of the image composition, that is, the texture properties of the background and the objects of interest. Segmentation generates a mask consisting of a binary image delimiting the objects of interest present in the raw image. The challenge is to define an accurate segmentation methodology or at least an approach that enables segmentation of biologically relevant features. There are several classes of methods, which can be applied in different circumstances. These can be classified broadly into two classes: (i) intensity based, where the hypothesis is that only differences in the image intensity histogram can be sufficient for segmentation and (ii) geometry based, where the image is transformed so that the geometrical features of interest become enhanced.

Historically, the first and simplest segmentation methods are based on global thresholding of the histogram. Classical threshold-based methods consist of identifying a given pixel intensity level that allows for separating the object of interest from the background. There are about 40 different global thresholding algorithms [2]. Classically, an algorithm involving thresholding includes the following stages:

Preprocessing steps, which decrease the spatial variation of the image

Thresholding, which produces one or more binary masks

Masking or region of interest (ROI) selection

Post-processing steps, for example, including second thresholding or watershed

The watershed is based on a topographical interpretation of the grayscale image as terrain of mountains and valleys; the algorithm interpolates boundaries between objects based on the continuity in intensity peaks.

Various image filtering techniques can be introduced as preprocessing steps. These transformations are representations of mathematical operators. These operators have formal properties, which make them suitable for certain types of signals. For example, the convolution-based linear operators assume continuity of the sampled signal, while morphological operations do not. The notion of scale comes as the support of the sampled kernel, so multiscale analysis provides rules how the supports of different operators change with scale. It is also important to consider the sampling of the operators in the digital domain.

There are various geometry-based segmentation approaches, for example, using edges, distances, or texture statistics. In addition, there is a vast array of pre- and post-processing techniques, such as smoothing, mathematical morphology operations (i.e., watershed), partial differential equation methods, and shape methods. This manuscript will focus on the geometry-based methods with a particular emphasis on the edge detection techniques. Geometry-based approaches are invariant to changes of illumination, which is an issue in natural images and some microscopic techniques. In contrast, geometry-based approaches are susceptible to structural or texture “noise” so extra care must be taken to address such issues.

## 3. Mathematical morphology

The mathematical morphology (MM) theory is a way of analyzing objects’ shapes by way of interaction with shape primitives called structuring elements (SE) or *kernels*. A structuring element can be thought of as a small window that scans the image and alters the central pixel within its frame. The mathematical morphology theory was developed by Matheron and Serra [3].

MM operators are useful for the analysis of both binary and grayscale images. Their common usages include edge detection, noise removal, image enhancement, and image segmentation. MM approaches employ topological transformations and hence do not depend, on the particular noise model. Therefore, they can be used also in situations, where the noise is non-Gaussian.

The main building blocks of MM are the *erosion**dilation**E*, is expressed as

The closing with a SE is expressed as

The opening operation removes the objects, which are covered by *E*, while the closing, by duality, removes object’s complement (i.e., holes in objects), which are covered by *E*. The so-developed theory is topological in nature because it does not depend explicitly on the concept of size but only on covering and inclusion. Classically, the MM theory was developed for uniform homothetic scaling of the SEs, but it can be extended to nonhomogeneous groups of scaling transformations. The scaling can be interpreted as generating a system of neighborhoods of every given point, thus reinforcing the topological interpretation. This gives rise to partial differential equation interpretation of the MM theory [4].

The multiscale aspects of the theory are due to the scaling of the structure elements. For example, the seed SE can be rescaled homothetically and then applied to the image. Such series of successive openings provides a measure of the prevalence of objects of a given size and is called granulometry (Figure 2). Granulometry can be used also to segment compact bright objects by means of a top-hat transform, where from the primitive image its opened version is subtracted:

The second equation represents the granulometric filtering operation, which can extract bright objects of a specific size range from an image [5, 6].

Homogeneous scaling, that is, homothety, can be varied with the metric, which is induced on the SE. This can be box-like, circular, diamond, etc.

Another useful realization is the morphological gradient operation, which is the difference between an opening and a closure

## 4. Geometrical image features

Mathematically, images can be represented as surfaces in the three-dimensional Euclidean space, where the elevation represents the signal intensity. In this sense, the intensity at a certain point in the direction *x* + *r* can be interpolated from its local neighborhood as

Components of the gradient are given by

where for smooth signals the partial derivatives commute

The fact that digital images are sampled on a discrete grid may represent some difficulty as differentiation in the literal sense does not work for discrete signals. Notably, naive computations are numerically unstable and amplify the high-frequency noise. This difficulty can be overcome by applying the distribution theory, starting from the Leibniz identity for smooth signals [7]:

where

From this point on, differentiation of a digital image will be interpreted only in the generalized sense as a convolution with some smooth kernel. In such way, various local differential geometric invariants can be also incorporated into the processing. There are several filter families, which possess desirable properties, which can be exploited for systematic image noise suppression and computation of differential invariants. These families are formalized by the framework of the scale-space theory. Notable examples are the spatial derivatives of the Gaussian, which are used in the linear scale-space theory 5.1.

### 4.1 Differential invariants

There are several types of geometric features, which are useful for segmentation applications. Typical interesting image features are blobs, filaments, and corners. Notably, object boundaries can be represented in terms of edges, which can be approximated by steps in image intensity. All these features can be computed from the local differential structure of the image. The theory will be exemplified with the Gaussian derivatives, which, in view of the duality property of Eq. (7), can be used to compute the image derivatives.

The first four differential invariants are given in Table 1. The gradient vector field of the test image is represented in Figure 3.

Gradient amplitude | |

Gradient orientation | |

Laplacian | |

Determinant of the Hessian |

The eigenvalues of the Hessian tensor are solutions of the characteristic equation

If the eigenvalues have opposite signs, this is an indication for a saddle point at the point of reference. Therefore, the zero-crossing of the Laplacian operator can be used to delimit regions, encompassing blobs. The zero-crossings form the so-called zero space, which can be used to identify objects. The regions where the Laplacian changes sign can be extracted by connected component analysis, which are defined as regions of adjacent pixels that have the same input label. In this regard, different neighborhoods can be considered for the blobs (4-connected, N4) and for the contours (8-connected, N8). To compute the connected components of an image, we first (conceptually) split the image into horizontal runs of adjacent pixels and then color the runs with unique labels, reusing the labels of vertically adjacent runs whenever possible. In a second phase, adjacent runs of different colors are then merged [9].

The zero space is demonstrated in Figure 4, where the connected components where the Laplacian changes sign are labeled. From the figure, it is apparent that the cell nuclei can be enclosed well by the blobs.

The number of differential invariants increases with the increase of the image dimensions. However, the theory can be extended along similar lines. A very useful development in this direction is geometric algebra and calculus, which provide a dimension invariant representation of the geometrical structures.

The so-introduced geometric image features can be used as building blocks for advanced machine learning strategies for interactive segmentation and classification. This strategy was implemented in two segmentation platforms based on ImageJ/Fiji. The Trainable Weka Segmentation (TWS) [10] and the Active Segmentation [11] have recently presented new opportunities for analyzing complex datasets. Specifically, the active segmentation uses the scale-space-based filters presented here.

## 5. Scale-space theory

In the digital domain, smoothing leads to loss of resolution and, therefore, of some information. However, the information loss can be limited if one uses multiple smoothing scales (see Figure 5).

Scale-space theory is a framework for multiscale image representation, which has been developed by the computer vision scientists with intuitive motivations from physics and biological vision, as introduced by [12]. The underlying idea is to account for the multiscale nature of real-world objects, which implies that objects may be perceived in different ways depending on the scale of observation. Taken to the limit, a scale-space representation furthermore considers representations at all scales simultaneously. Scale spaces have been introduced independently in Japan and Europe by [12, 13]. The axiomatic linear scale-space theory was formalized in series of works by Witkin [14] and Koenderink [15].

Scale-space approaches are ubiquitous in feature detection/description, as well as dense correspondence mapping (e.g., large-offset optical flow is typically done in coarse-to-fine fashion) [9].

### 5.1 The Gaussian scale space

The liner scale-space theory provides a systematic way of dealing with spatially uncorrelated Gaussian noise. A fundamental result of scale-space theory states that if some general conditions are imposed on the types of computations that are to be performed in the earliest stages of visual processing, then convolution by the Gaussian kernel and its derivatives provides a canonical class of image operators with unique properties. The Gaussian kernel in 1D is given by

and

in two dimensions. A very useful property of the kernel is its separability, which allows for efficient computation of convolutions for multiple spatial dimensions. That is, for example, in two dimensions

Therefore, the computational cost scales linearly with the support of the kernel rather than quadratically.

The Gaussian scale space depends on a free scalar parameter *s* representing the scale of possible structures in the image [12, 13, 14, 15]. In the typical implementation of the theory, the scale parameter enumerates a space of smooth Gaussian test kernels of rapid decay, which are convolved with the digital image. In one dimension, Gaussian smoothing implies that new local extrema or new zero-crossings cannot be created with increasing scales. Gaussian kernels provide several advantages: (i) they are rotationally invariant, (ii) they do not produce artificial extrema in the resulting image, and (iii) successive convolutions with different kernels can be combined. Mathematically, this imposes a very useful semigroup structure, equivalent to the heat/diffusion equation. In this sense, the image structures diffuse or “melt down,” so that the rate of this diffusion indicates the “robustness” of the structure.

In its typical presentation, the scale-space theory applies only smoothing steps. Later, the theory was extended to include also differentiation and thus account for the differential structure of the images [16]. In the spatial domain, the Gaussian derivatives for the one-dimensional case can be computed in closed form as

where *Hen*(*x*) is the statistician’s Hermite polynomial of order *n*. The sequence of statistician’s Hermite polynomials satisfies the recursion

starting from *n* in order to populate the n-jet space. The n-jet components can be used to build the differential invariants up to order *n*. An example is presented in Figure 6, where the five unique components of the Gaussian jet-2 space are computed. The original dataset is present in the ImageJ public image database.

In spite of several properties that make linear diffusion filtering useful, it also reveals some drawbacks [17]:

An obvious disadvantage of Gaussian smoothing is the fact that it does not only smooth noise but also blurs important features such as edges. Moreover, it is uncommitted to any prior information about the image structure.

Linear diffusion filtering propagates edges when moving from finer to coarser scales, which can lead to difficulties in edge identification and instabilities.

### 5.2 *α*-Scale spaces

The *α*-scale spaces introduce nonlinearity on the level of differentiation. Notably, the Gaussian differentiation is replaced by another convolution operation, involving a power law. Pauwels et al. [18] and later Duits et al. [19] investigated the use of fractional powers of the Laplacian in connection with scale invariant smoothing and scale-space theory, respectively. This approach tries to overcome some of the limitations of the Gaussian scale spaces identified above. The evolution is governed by two parameters—the scale *s* and the order of differentiation *α*. The approach leads to formulation and solving of a fractional heat problem:

The Riesz fractional Laplacian operator is defined in the Fourier domain by

where the

### 5.3 Nonlinear scale spaces

Linear diffusion scale spaces are well-posed and have a solid axiomatic foundation. On the other hand, for some applications, they have the undesirable property that they do not permit contrast enhancement and that they may blur and delocalize structures. Nonlinear scale spaces try to overcome some of these limitations. Such scale spaces arise in nonlinear partial differential equation framework, which will be sketched below. The formal properties of some types of scale spaces have been established by Alvarez et al. [4]. In particular, they established a strong link with the related field of mathematical morphology (see Section 3). The following second-order partial differential equation was demonstrated in particular

where *Hu* are the components of the Hessian tensor, *f*(*x*) is the original image. It is interesting that MM operations can also be represented in this framework as

In this line of development, the Laplacian of Gaussian (LoG) operator can be decomposed into orthogonal and tangential components ([17], Ch. 1). The representation is provided below:

The parentheses denote scalar multiplication with the component of the gradient. The orthogonal decomposition is equivalent to an effective vectorization of the filter. The normal component is antiparallel to the gradient (i.e., in normal direction to the isophote curve), while the tangential component is parallel to the isophote curve passing through the point. These components can be used to segment blob-like or tubular structures. Segmentation based on the orthogonal decomposition is illustrated in Figure 7.

The orthogonal decomposition leads naturally to anisotropic diffusion (Figure 8). For example, if the tangential component is selected, this will lead to preservation of globular structures, while if the normal component is selected, this will lead to enhancement of the tubular structures. The equation *mean curvature motion* equation [17]. An example is presented in Figure 8.

## 6. Implementation

The filters described in the present manuscript are implemented in ImageJ as a set of plug-ins (Table 2). Two implementation strategies have been used: the integer order filters are implemented in the spatial domain, while the fractional order filters are implemented in the Fourier domain [21].^{1} The plug-ins are distributed under the GNU Lesser General Public License v3.0 and are available as code repository from the GitHub website [22].^{2} The choice of implementation platform was due to the widespread use of ImageJ in the biomedical and life science communities.

Plug-in | Function |
---|---|

LoG filter | Laplacian of Gaussian (LoG) |

ALoG filter | Anisotropic decomposition of LoG |

ADiff filter | Anisotropic diffusion |

LoG2 filter | Bi-Laplacian of Gaussian |

LoGN2D filter | N-order power of the Laplacian of Gaussian |

Gaussian jet | Gaussian jet of order n |

Zero-crosser | Connected components |

ImageJ is a public domain image processing program written in Java. Since its inception in 1997, ImageJ has evolved to become a standard analytical tool in a number of scientific communities. In particular, for life science communities, it is available as the Fiji plug-in platform, which allows for easy plug-in deployment and dependency management. ImageJ has an open architecture providing extensibility via third-party Java modules (called plug-ins) and scripting macros. It is developed by Wayne Rasband since 1997 and expanded via contributed software code by an international group of contributors. Plug-ins are distributed together with their source code under various licenses determined by the plug-in authors. The user guide of the platform [23] is maintained at

## 7. Discussion

The morphological complexity of the nervous tissue is a challenge for conventional segmentation techniques developed for computer vision applications or cultured cells. The challenges lie in the morphological complexity of neurons and glial cells overlaid on the heterogeneity of the extracellular matrix. This complexity translates into variations of the tracer signal and touching of relevant structures.

Segmentation of fluorescent images poses particular issues due to low signal-to-noise ratio, unequal staining, as well as the complexity of structures that need to be identified. This irreducible variation must inform choices about segmentation methods. In particular, methods employing multiple spatial scales are favorable. Structure identification is inherently a multiscale problem because object structure is recursive, that is, objects may contain substructures, which themselves contain substructures, etc.

A large number of algorithms for image segmentation have been proposed in literature (overview in [9]). However, many of them completely ignore the issue of scale. As a result, they are capable of identifying only limited types of structures. In contrast, multiscale approaches eventually rely on the topological properties of the segmented objects, either by means of scale spaces or by nonlinear vector field transforms [25, 26]. As a result, such methods are able to combine detected features into robust segmentation tools. The present chapter introduced two classes of multiscale methods for image segmentation: the mathematical morphology operations and scale spaces. The main applications of the theory are classification and segmentation of signals. Presented methods are generic and thus have broad applicability to both one-dimensional signals, such as electrophysiological recordings, and to two and three-dimensional signals, such as microscopic images.

## 8. Conclusions and outlook

The main utility of the presented approaches is to build a multidimensional multiscale feature space, which is subsequently used to learn characteristic “fingerprints” of the objects of interests. The large variation of structures present in microscopic images precludes the design of an “ideal” tool. Instead, multiple approaches should be combined and features computed that would inform machine learning approaches, which are able to adapt to the morphology of the cells and tissues at hand. Development in this direction has been undertaken with the advent of deep learning techniques. ImageJ-based implementations, such as the Trainable Weka Segmentation [10] and the Active Segmentation platforms [11], have been made available to end-users.

## Acknowledgments

The author declares no conflict of interest.

## A.1 Ranking operations

This section starts with a brief introduction to the set notation. In many sources it is called also the “set builder notation.” The empty set is denoted as

From a formal perspective, the mathematical morphology is the application of lattice theory to spatial structures [3]. Formally, the erosion is expressed as

for binary images, while for grayscale discrete images, it is

Formally, the dilation for binary images is

while for grayscale discrete images

In the continuous approximation case, the minimum and maximum should be replaced by infimum and supremum, respectively. Consider the set *X*. The *infimum* inf, or the *greatest lower bound*, is the greatest number, which is not necessarily in *X* but is smaller (or equal) than all members of *X*. For a finite set, the *infimum* coincides with the minimum. The *supremum* sup, or the *least upper bound*, is the smallest number, which is not necessarily in *X* but is greater (or equal) than all members of *X*. For a finite set, the *supremum* coincides with the maximum.

## A.2 Some useful Fourier transforms

The concept of frequency and the decomposition of waveforms into elementary “harmonic” wave motions first arose in the context of music and sound. The Fourier transform and its inverse in the continuous domain are defined as

The reader is directed to the book of [27] for an introduction on the topic.

The Fourier transform of the Gaussian is given by

and of its derivatives by

In the Fourier domain, the fractional heat kernel is expressed as

**Integer-order powers.** Integer powers of the Laplacian operators are successive compositions of the Laplacian operator [28]:

By substitution with the radial wavenumber *k* and gets sharper with increasing the order *n*:

**Fractional-order powers.** In the fractional domain the operator can be expressed as a direct generalization:

Therefore, the kernel bandwidth can be controlled by the fractional power *α*.

## A.3 Convolutions and Fourier domain processing

So far the described theory can work both in the spatial and the Fourier domain. A schematic treatment of the Fourier transform is given in Section A.2. Interested readers are referred to Burgers for a more complete introduction. The Fourier domain processing implemented via Fast Fourier Transform has a certain advantage. On the first place, for large convolution kernels, it can lead to speedup. This is so because convolution in spatial (respectively temporal) domain corresponds to multiplication in the Fourier domain. This incurs fixed computation costs; therefore, the convolution operation scales as *N* log(*N*), where *N* is the size of memory occupied by the digital image. Therefore, the following processing scheme becomes useful:

In the diagram above, the arrows indicate transformation, while FFT and IFFT denote forward and Inverse Fast Fourier Transforms, respectively. In the example of differentiation in the previous section, the kernel is the wave vector

## List of acronyms

mathematical morphology

structuring element

fast Fourier transform

inverse fast Fourier transform

glial fibrillary acidic protein

Laplacian of Gaussian

region of interest

## Notes

- The installation procedure of the spatial-domain filters is straightforward, and this is the reason why only spatial-domain filters are included in the public repository.
- https://github.com/dprodanov/scalespace.