Graph Construction for Hyperspectral Data Unmixing Graph Construction for Hyperspectral Data Unmixing

This chapter presents graph construction for hyperspectral data and associated unmixing methods based on graph regularization. Graph is a ubiquitous mathematical tool for modeling relations between objects under study. In the context of hyperspectral image analysis, constructing graphs can be useful to relate pixels in order to perform corporative analysis instead of analyzing each pixel individually. In this chapter, we review funda- mental elements of graphs and present different ways to construct graphs in both spatial and spectral senses for hyperspectral images. By incorporating a graph regularization, we then formulate a general hyperspectral unmixing problem that can be important for applications such as remote sensing and environment monitoring. Alternating direction method of multipliers (ADMM) is also presented as a generic tool for solving the formulated unmixing problems. Experiments validate the proposed scheme with both synthetic data and real remote sensing data.


Introduction
Hyperspectral imaging analysis has found a wide range of applications including agricultural monitoring, environment detection, meteorological information forecast, medical examination, and camouflage tests [1]. In a hyperspectral image, pixels are typically mixtures of several pure material components due to the limitation of spatial resolution and intimate interactions among materials. Spectral unmixing is thus one of the most important tasks in hyperspectral data analysis, aiming to separate the observed pixel spectra into a collection of constituent spectra, or spectral signatures, called endmembers and to estimate fractions associated with each component called abundances. Spectral unmixing provides a comprehensive and quantitative mapping of the elementary materials that are present in the acquired data, and it is widely used for many applications, such as determining the constitutions of geological mixtures and performing a classification of crops and vegetation.
Most spectral unmixing approaches are designed based on pre-assumed mixture models that describe in an analytical way how the endmembers are combined to mixed spectra measured by the sensor [2]. The linear mixing model (LMM) is the most widely used one, and it assumes that the mixing occurs at a macroscopic scale [3]. A measured spectrum is the linear combinations of the endmembers, weighted by the fractional abundances. To be physically interpretable, LMM is usually performed with two physical constraints, abundance nonnegative constraint (ANC) and abundance sum-to-one constraint (ASC). Multiple scattering effects and intimate interactions in real environment require using nonlinear mixture models. Such models include intimate mixture model [4], bilinear model [5], linear-quadratic mixing model (LQM) [6], polynomial post-nonlinear mixing model (PPNM) [7], to cite a few. However, due to the simplicity and interpretability of the analysis results, LMM-based unmixing strategies are mostly used in practice [2]. A number of unmixing algorithms are proposed, including long-standing geometrical and statistical approaches and the recently introduced sparse regression-based unmixing algorithms [8][9][10][11].
Considering inherent spatial-spectral duality exists in hyperspectral data, regularized unmixing algorithms have been proposed in recent years to make use of spectral information and spatial contextual information to enhance the unmixing performance. For instance, in [8], authors introduce a total variation (TV) regularizer to promote spatial consistency of estimated abundances. In [12], the quadratic Laplacian regularization is introduced based on graph representation. In [13], authors present a spatial spectral coherence regularization that relates abundance estimation of a pixel to that of its neighboring pixels with spectral similarities. In [14], authors perform the unmixing with low-rank spatial regularization within fixed-size square windows.
However, it is necessary to establish a frame for these various ways of regularization via a proper mathematical tool. A graph is a ubiquitous structure that describes the connection relationship of a set of vertices. Graph theory is actively used in fields such as biochemistry (genomics), electrical engineering (communication networks and coding theory), computer science (algorithms and computation), and operations research (scheduling) [15]. In addition, several works apply graph theoretical techniques to hyperspectral images, including methods for dimensionality reduction [16], anomaly detection [17], and classification [18]. In the context of hyperspectral data unmixing, a graph can be used to model relations of spatial and spectral information of hyperspectral pixels. In this chapter, we will present a variety of ways to construct a graph for hyperspectral unmixing and formulate the associated unmixing problem with solutions given by the alternating direction method of multipliers (ADMM) strategy.
The remainder of the chapter is organized as follows: Section 2 introduces graph theory and graph construction methods in the context of hyperspectral unmixing. Section 3 formulates the sparse linear unmixing problem based on graph regularization. In Section 4, the solution to the formulated problem is derived via the ADMM algorithm. Section 5 reports the experiment's results. Finally, Section 6 concludes this chapter.

Graph construction 2.1. Introduction to graphs
We firstly review some fundamental elements of a graph. A graph is a general data structure described by G ¼ V; E ð Þ, where a finite set of vertices, also called nodes, is denoted by V and a finite set of pairs of the form v i v j À Á is referred to as edges. Edges indicate the relation between vertices, and they can be directed or undirected. Directed edges utilize ordered pairs of points that indicate the source and sink of each connection, that is, v i v j À Á represents an edge from v i to v j . Undirected edges only indicate the relationship between vertices and do not consider the ordering, that is, v i v j À Á is the same as v j v i À Á : We may associate each edge with a weight to describe the importance or the cost of this connection (Figure 1).
In a simple setting, if two vertices are connected by an edge, the weight is set to 1, otherwise the weight is 0. The following part introduces some other ways to measure the similarity among vertices, in other words, to define the weights. We can use either adjacency matrices or incidence matrices to describe graphs depending on the type of operations to be performed. Elements of the matrix A indicate whether pairs of vertices are connected or not in a graph. Element A ij is 1 when there is an edge from vertex i to vertex j and zero when there is no edge. If the graph is undirected, the adjacency matrix is symmetric. Incidence matrices show the relationship between vertices and edges. An undirected graph can have two kinds of incidence matrices: unoriented and oriented matrices. An oriented incidence matrix in the undirected graph can be denoted by B ∈ ℝ nÂm , where n is the number of vertices and m is the number of edges. That is, in the column of edge e k , the positive undirected graph can be denoted by B ∈ ℝ nÂm , where n is the number of vertices and m is the number of edges. That is, in the column of edge e k , there is positive weight A ij in the row corresponding to one vertex v i of e k and negative weight -A ij in the row corresponding to the other vertex v j of e k , and all other rows are set to 0.
In addition, a degree matrix for a graph is a diagonal matrix D ¼ diag d 1 ; ⋯d i ; ⋯d n ð Þ , where n is the number of vertices and d i is the degree of the vertex v i in G. The degree matrix is indicating every vertex's degree which is the number of edges connecting to one vertex. It is  normally used together with the adjacency matrix to construct the Laplacian matrix L of a graph, which is L ¼ D À A.

Graph construction for hyperspectral images
In this part, we elaborate the ways to construct graphs in the context of hyperspectral image analysis. The performance of spectral unmixing is closely tied to the graph construction of images, and in the hyperspectral remote sensing literature, there are a number of techniques.
In [19], authors summarize a survey of spectral graph construction techniques and discuss advantages and disadvantages of these techniques. Generally, each pixel can be viewed as a vertex (or node), and each vertex is associated with a continuous spectrum. A set of edges can be set and assigned with weights in different senses as presented here below.

Four-neighbor graph
A common and straightforward construction is to consider the four-neighbor graph, where every vertex (i.e., every pixel) is connected to four nearest spatially adjacent neighboring pixels, as illustrated in Figure 2.

Threshold-compared graph
Another alternative to construct a graph is to calculate all pairwise distances and an edge is placed if the distance between two vertices is less than a user-predefined threshold. The distance in the hyperspectral image can be measured using the spectral distance or spatial distance. For instance, v i and v j are two vertices that are associated with spectral vectors with L bands, then their Euclidean distance is

K-nearest neighbor graph
Constructing a graph with k-nearest neighbors (k-NN) is a popular method. In this case, an edge is set between two vertices if vertex v j is in k-NN of vertex v i . Each vertex has its own k-nearest neighbors. Consequently, the graph is a directed graph. It is worth noting that constructing such a graph requires calculating all pairwise distances and ordering these values on each vertex, and these operations lead to high computational costs.

Spatial-spectral graph
As pixels in a hyperspectral image possess spatial locations and spectral signatures, it can be beneficial to construct a graph by incorporating both spatial and spectral information. For instance, a graph can be constructed with local four neighborhood pixels and by considering spectral similarity among pixels, as described in Figure 3.

Weighted graph
Above methods construct unweighted graphs with only connection indications among pixels. Several other methods further impose weights on each edge. For instance, spectral similarity measured by the Gaussian kernel can be used to define weights: where σ is the kernel bandwidth and defined by users. As a generalization, a radial basis function (also called a diffusion kernel) in spectral distance with two parameters σ i and σ j is introduced in [20], given by: Weights can also be calculated by considering both spatial and spectral information. For instance, [21] proposes to define weights by: where x i is the spatial coordinates of pixel v i , c ij is an integer indicating the number of common neighbors between v i and v j , σ i and σ j are defined in [20], and the parameter σ d is defined by users which limits the size of regions spatially. In [22], authors consider the similarity of the spectral angle instead of the spectral Euclidean distance.
where θ ij denotes the spectral angle between v i and v j , x i is the spatial coordinates of pixel v i and σ is the parameter defined by users. Note that some schemes of calculating weights can make edges to be severed so as to change the structure of the graph [19].
There are also some other methods to construct graphs adapted to hyperspectral images, such as adaptive nearest neighbor graphs, density-weighted k-NN graphs, and shared nearestneighbor graphs [19].

Graph-based regularization in unmixing
With a constructed graph at hand to model the relation of pixels, in this section, we present the way to perform a sparse unmixing with the graph regularization.

Sparse unmixing
Consider the linear mixing model: y ¼ Sx þ n , where y ∈ ℝ L is one observed pixel with L spectral bands, S ¼ s 1 ; s 2 ; ⋯; s R ½ ∈ ℝ LÂR is the library of spectral signatures including R pure spectral signatures, and x ∈ ℝ R is an abundance vector, n is the additive white noise vector.
Since it is often that an observed pixel is only composed of a small number of materials in the library, the majority of entries of the abundance vector x are zero-valued, namely, x is sparse. Assuming the library is available beforehand and the spare unmixing problem can be defined as [23]: where λ is the regularized parameter.
In this chapter, we formulate the problem without ASC constraint because of using the l 1 norm regularization. Moreover, the validity of ASC is often questioned in the literature for practical scenarios. In what follows, we introduce the graph regularization to the above formulated problem.

Graph regularization for sparse unmixing
Since a graph relates the pixels of image via spatial and spectral relations, we can regularize the unmixing problem with pixel relations defined by the graph. Let Y ¼ y 1 ; y 2 ; ⋯y n Â Ã ∈ ℝ LÂn be a spectral matrix, where each column is one observed pixel including L spectral bands and n is the number of pixels, and in a graph, n is also the number of vertices. Let X ¼ x 1 ; x 2 ; ⋯x n ½ ∈ ℝ RÂn be an abundance matrix in which each column is an abundance vector associated with one observed pixel. With the graph representation of hyperspectral data, we achieve the sparse unmixing by solving the following optimization problem: where This graph regularization term Eq. (7) is based on the assumption that if two vertices are connected by an edge, then the abundances of the two vertices are similar. This term measures the differences between all pairs of abundances weighted by their degrees of similarity in the graph. The graph regularization then promotes piecewise constant transitions of estimates among the related pixels. Parameter λ g controls the regularization strength. Note that we can rewrite Eq. (7) with the incidence matrix B as: Problem Eq. (6) is equivalently expressed as: subject to : X ≥ 0 (9) If we use a spatial four-neighborhood graph in this unmixing problem with the weights being simply set to 1 and 0, it can generally be identical with the SUnSAL-TV algorithm [8]. The right term can promote piecewise constant transitions in the fractional abundance among neighborhood pixels and achieve spatial consistency of estimated abundances.
Instead of considering promoting the similarities among estimated abundances, an alternative way is to promote the similarities of reconstructed spectra among the connected pixels. In [24], authors propose a nonlocal TV regularization, with the regularization term given as: This can also be written with incidence matrix B as:

Solution to the formulated problem
We propose to solve the formulated unmixing problem Eq. (9) via the ADMM algorithm. In this section, we first briefly review the ADMM algorithm and then apply it to our unmixing problem.

Introduction of ADMM
ADMM is an algorithm that is intended to blend the decomposability of dual ascent with the superior convergence properties of the method of multipliers. The algorithm solves problems in the form [25]: with variables x ∈ R n and z ∈ R m , where A ∈ R pÂn , B ∈ R pÂm , and c ∈ R p . The first step is to write the augmented Lagrangian of problem Eq. (12): ADMM suggests achieving the optimum via the following iterations: where ρ > 0. The algorithm is very similar to dual ascent and the method of multipliers: it consists of an x-minimization step Eq. (14), a z-minimization step Eq. (15), and a dual variable update Eq. (16). As in the method of multipliers, the dual variable update uses a step size equal to the augmented Lagrangian parameter ρ.

Solutions via ADMM
In order to apply the canonical ADMM algorithm to the problem (9), we introduce the auxiliary variables and transform the problem as follows: where l S is the indicator function of the set S , such as l S x ð Þ ¼ 0 if x ∈ S and l S x ð Þ ¼ þ∞ if x∉S. Thus the augmented Lagrangian for Eq. (17) is as follows: where D 1 , D 2 , D 3 , D 4 are Lagrange multipliers and ρ is the penalty parameter.
The algorithm steps are as follows: Step 1. Input the observed pixels Y matrix, the library S , and the regularization parameters μ, λ g ; Step 2. Initialization: Step 9. Update the Lagrangian multipliers: Step 10. Until stopping criterion is satisfied.
In step 4 of minimizing the augmented Lagrangian with respect to X, the solution is: Similarly, the solution of V 1 minimization step 5 is: To compute V 2 in step 6, the solution is the well-known soft threshold [17]: where v 2 , x, d 2 is the row of V 2 , X, D 2 , respectively.
The solution of V 3 minimization step 7 is: The solution of V 4 minimization step 8 is: where v 4 , f, d 4 is the row of V 4 , F ¼ V 3 Â B, D 4 , respectively.

Experiments
In this section, we illustrate the experimental results via a synthetic hyperspectral data set (denoted by Data 1) and a real hyperspectral data set (denoted by Data 2) with various ways of graph construction.

Experiments with simulated data sets
In this part, the synthetic data consists of 75 Â 75 pixels and is generated by 9 endmembers. The endmembers are randomly selected from the spectral library advanced spaceborne thermal   The abundance maps of first, fifth, sixth, and eighth . From left to right: Real abundance maps, estimated abundance maps with four-neighbor graph, spectral-spatial graph and threshold-compared graph, respectively.
where x i and b x i are the actual and estimated abundance vectors of the ith pixel, n is the number of pixels, and R is the number of endmembers.
We define the graph based on the simulated data using three methods: the four-neighbor graph, the threshold-compared graph and the spectral-spatial graph respectively.
In the experiment, the threshold-compared undirected graph is constructed as follows: where all pairs of spectral distance are compared with a user-defined threshold. Meanwhile, the spectral-spatial graph is constructed by considering four neighbors of spatial location and k-nearest neighbors of spectral distance.
From this table, we can see that the performance of the proposed algorithm with a thresholdcompared graph is better than the others. Although the second graph in Table 1 combines the spectral and spatial information, using spatial relation is not always a good way to connect pixels because it is possible that two adjacent pixels may have significantly different spectral features. Figure 4 shows the true abundance maps and the abundances estimated by the proposed algorithm associated with the three constructed graphs. We observe that the second row of the square regions is better conserved with the proposed algorithm using the threshold-compared graph.

Experiments with AVIRIS data
We also tested algorithms with a real hyperspectral image. The image is captured on the Cuprite mining district by AVIRIS. A sub-image of 250 Â 191 pixels was chosen, and it contains 188 spectral bands. The number of endmembers was estimated and set to 12 [26]. VCA algorithm was then used to extract the endmembers. Here, we compare the FCLS [9], SUnSAL-TV, and the proposed algorithm with a threshold-compared graph. Figure 5 shows the first and fifth abundance maps of three algorithms respectively. We can see that the proposed algorithm highlights localized targets without oversmoothing the image like in SUnSAL-TV and with less impurity than in FCLS.

Conclusion
In this chapter, we propose to use graph as a mathematical tool to relate pixels in hyperspectral data. We the present a variety of methods of constructing a graph according to spatial information and spectral information embedded in an image. A sparse unmixing problem is then formulated with the graph regularization to enhance the estimation performance. An ADMMbased algorithm is then presented to solve the formulated problem. In the experiments, we compare the unmixing performance of the presented unmixing algorithm with different graphs, using a synthetic hyperspectral data and a real data. Future works include evaluating the unmixing performance with weighted graphs.