Open access

An Exploration of Color Fusion with Multispectral Images for Night Vision Enhancement

Written By

Yufeng Zheng

Submitted: 19 October 2010 Published: 24 June 2011

DOI: 10.5772/17121


1. Introduction

Multispectral images usually present complementary information, such as visual-band imagery and infrared imagery (near infrared or long wave infrared). There is strong evidence that fused multispectral imagery (in gray scale) increases the reliability of interpretation (Rogers & Wood, 1990; Essock et al., 2001) and is thus well suited to machine analysis (computer vision), whereas colorized multispectral imagery improves observer performance and reaction times (Toet et al. 1997; Varga, 1999; Waxman et al., 1996) and is thus well suited to visual analysis (human vision).

Imagine a nighttime navigation task that may be executed by an aircraft equipped with a multispectral imaging system. Analyzing the synthesized (fused or colorized) multisensory image will be more informative and more efficient than simultaneously monitoring multispectral images such as visual-band imagery (e.g., image intensified, II), near infrared (NIR) imagery, and infrared (IR) imagery, which may be displayed either on several split panels on a big screen or on several small screens. The focus of this chapter is how to synthesize a color presentation of multispectral images in order to enhance night vision. It is anticipated that successful applications of night vision colorization techniques will lead to improved performance in remote sensing, nighttime navigation, target detection, and situational awareness. The colorization approaches discussed here involve two main techniques, image fusion and colorization, which are briefly reviewed in turn below.

Image fusion combines multiple-source imagery by integrating complementary data in order to enhance the information apparent in the respective source images, as well as to increase the reliability of interpretation. This results in more accurate data (Keys et al., 1990) and increased utility (Rogers & Wood, 1990; Essock et al., 1999). In addition, it has been reported that fused data provides far more robust aspects of operational performance such as increased confidence, reduced ambiguity, improved reliability and improved classification (Rogers & Wood, 1990; Essock et al., 2001). A general framework of image fusion can be found in Reference (Pohl & Genderen, 1998). The discussions of image fusion here are limited to pixel-level fusion.

Two commonly used fusion methods are the discrete wavelet transform (DWT) (Pu & Ni, 2000; Nunez et al., 1999) and various pyramids (such as Laplacian, contrast, gradient, and morphological pyramids) (Jahard et al., 1997; Ajazzi et al., 1998), both of which are multiscale fusion methods. Recently, an advanced wavelet transform (aDWT) method (Zheng et al., 2004) has been proposed, which incorporates principal component analysis (PCA) and morphological processing into a regular DWT fusion algorithm. The aDWT method can produce a better fused image in comparison with pyramid methods and regular DWT methods. Image fusion is a necessary step for the color fusion and colorization methods that follow.

On the other hand, a night vision colorization technique can produce colorized imagery with a naturalistic and stable color appearance by processing multispectral night-vision imagery. Although appropriately false-colored imagery is often helpful for human observers in improving their performance on scene classification, and reaction time tasks (Essock et al., 1999; Waxman et al., 1996), inappropriate color mappings can also be detrimental to human performance (Toet & IJspeert, 2001; Varga, 1999). A possible reason is lack of physical color constancy (Varga, 1999). Another drawback with false coloring is that observers need specific training with each of the unnatural false color schemes so that they can correctly and quickly recognize objects; whereas with colorized nighttime imagery rendered with natural colors, users should be able to readily recognize and identify objects.

Toet (2003) proposed a night vision (NV) colorization method that transfers the natural color characteristics of daylight imagery into multispectral NV images. Essentially, Toet’s natural color-mapping method matches the statistical properties (i.e., mean and standard deviation) of the NV imagery to those of a natural daylight color image (manually selected as the “target” color distribution). However, this color-mapping method colorizes the image regardless of scene content, and thus the accuracy of the coloring depends heavily on how well the target and source images are matched. Specifically, Toet’s method weights the local regions of the source image by the “global” color statistics of the target image, and thus will yield less naturalistic results (e.g., biased colors) for images containing regions that differ significantly in their colored content. Another concern with Toet’s “global-coloring” method is that the scene matching between the source and target is performed manually. To address the aforementioned bias problem in global coloring, Zheng et al. (2005; 2008) presented a “local coloring” method that can colorize the NV images more like daylight imagery. The local-coloring method renders the multispectral images with natural colors segment by segment (i.e., “segmentation-based”), and also provides automatic association between the source and target images (i.e., avoiding the manual scene-matching in global coloring). This local coloring method is also referred to as “segmentation-based” colorization, in contrast with the “channel-based” color fusion introduced later.

In this chapter, we discuss and explore how to enhance human night vision by presenting a set of multispectral images as a single color image. A color presentation of multispectral night vision images can provide a better visual result for human users. We prefer color images that resemble the natural daylight pictures we are used to; at the same time, the coloring process should ideally be efficient enough for real time applications. A segmentation-based colorization procedure is first reviewed, and a channel-based color fusion is then introduced. The remainder of this chapter is organized as follows. The multispectral image preprocessing, registration and fusion are described in Section 2. Next, the segmentation-based colorization method is discussed in full in Section 3. Then, a new channel-based color fusion method is introduced in Section 4. The experiments and discussions are given in Section 5. Conclusions are finally drawn in Section 6.


2. Multispectral image preprocessing

The multispectral images that we acquired include visible (RGB color) images, image intensified (II, enhanced visible) images, near infrared (NIR; spectral range: 0.9~1.7 μm) images, and long-wave infrared (LWIR; spectral range: 7.5~13 μm) images. Before performing multispectral colorization, image preprocessing, image registration, and image fusion are required.

2.1. Standard preprocessing

Standard image preprocessing such as denoising, normalization and enhancement can benefit the subsequent processes, i.e., image registration, fusion, and colorization. The noise in digital images may be caused by imperfections of the imaging sensors, scene contents in the field of view (FOV; e.g., extremely cold or hot objects in infrared imaging), environmental (atmospheric) disturbance, or poor illumination (for visible-band imaging). Noise can be reduced according to the nature of the noise source, which depends on the particular application. For example, salt-and-pepper noise can be removed by a median filter; periodic noise may be reduced by a suitably designed frequency filter in the Fourier domain; and random noise can be suppressed by a Gaussian filter or a nonlinear diffusion filter.

Night-vision images (NIR and LWIR) were acquired under different background and conditions, which may cause images to have different background (brightness) and contrast (dynamic range). We employed a general image normalization (also called contrast stretching) to standardize all multispectral images.


$$I_N = (I_0 - I_{Min}) \, \frac{L_{Max} - L_{Min}}{I_{Max} - I_{Min}} + L_{Min} \tag{1}$$

where $I_N$ is the normalized image and $I_0$ is the original image; $I_{Min}$ and $I_{Max}$ are the minimum and maximum pixel values in $I_0$, respectively; $L_{Min}$ and $L_{Max}$ are the expected minimum and maximum pixel values in $I_N$, which normally equal 0 and 1, respectively. After image normalization, $I_N \in [0, 1]$.

The contrast of near infrared (NIR) images is significantly affected by illumination conditions. Nonlinear enhancement such as histogram equalization or histogram matching usually amplifies noise while enhancing a NIR image, so a linear enhancement such as piecewise contrast stretching is preferred. (Eq. 1) is still applicable but is applied only within each intensity interval. For example, given [IMin, IMax] = [0, 0.8] and [LMin, LMax] = [0, 1.0], after piecewise contrast stretching the pixels within [0, 0.8] will be linearly scaled to [0, 1.0], while those pixels originally within (0.8, 1.0] are unchanged. For brevity, this transform is denoted S[0,0.8][0,1.0] hereafter.
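The normalization and its piecewise variant can be sketched as follows. This is a minimal illustration that assumes the standard linear contrast-stretching form for (Eq. 1); the function names `normalize` and `piecewise_stretch` are hypothetical and not taken from the chapter.

```python
import numpy as np

def normalize(img, l_min=0.0, l_max=1.0):
    """Linearly stretch the full range of img onto [l_min, l_max]."""
    img = np.asarray(img, dtype=float)
    i_min, i_max = img.min(), img.max()
    if i_max == i_min:
        # Constant image: no contrast to stretch.
        return np.full_like(img, l_min)
    return (img - i_min) * (l_max - l_min) / (i_max - i_min) + l_min

def piecewise_stretch(img, src=(0.0, 0.8), dst=(0.0, 1.0)):
    """Stretch only the pixels inside the interval src onto dst.

    Pixels outside src are left unchanged, as in the example in the text.
    """
    img = np.asarray(img, dtype=float)
    out = img.copy()
    mask = (img >= src[0]) & (img <= src[1])
    scale = (dst[1] - dst[0]) / (src[1] - src[0])
    out[mask] = (img[mask] - src[0]) * scale + dst[0]
    return out
```

With the intervals from the text, a pixel at 0.4 maps to 0.5, a pixel at 0.8 maps to 1.0, and a pixel at 0.9 is left as it is.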

2.2. Image registration

Image registration is a required preprocessing step for image fusion and image colorization. In general, image registration aligns multiple images by performing affine transformations that allow translation, rotation, and scaling. Similarity metrics are used to decide the optimal transformation parameters. Normalized mutual information (NMI) has turned out to be a robust metric for noisy and multi-modality image registration (Hill & Batchelor, 2001). The computational complexity increases with the number of degrees of freedom. For 2D image registration, the Fourier-Mellin transform (FMT, Chen et al., 1994) is much faster than NMI-based registration, but FMT is sensitive to noise. For multispectral night-vision image registration, we utilize both algorithms: the FMT method for translation registration, followed by the NMI-based method for scaling and rotation registration.

We used the FMT method only for translation alignment, although it can be adapted for scaling and rotation (but not reliably). The image alignment by scaling and rotation is accomplished with affine transforms using the NMI metric. The transformation parameters can be estimated by maximizing the NMI value. Calculating the NMI and interpolating during the transformation (e.g., fractional scaling) are quite time consuming. However, the search spaces of the parameters (for scaling and rotation) are small because the two cameras are mounted in turn on the same fixture and aimed at the same target, which expedites the registration process.

The different FOVs of multispectral images pose another challenge for image registration. For example, the FLIR SC620 camera (used in our experiments) is a two-band imaging device with a LWIR camera (640×480 pixels; FOV: 24˚) and a built-in visible camera (2048×1536 pixels; FOV: 32˚). Before registration with the LWIR image, cropping the visible image is desirable. To find the matched block (region) of the LWIR image on the visible image, (i) scan the visible image block by block with a step of 5~10 pixels (left to right, top to bottom), where the block is of size 960×720 (estimated according to the view angles); (ii) compute the NMI between the scanning block (on the visible image) and the LWIR image; (iii) select the scanning block (region) with maximal NMI as the matched block for the subsequent registration. The framework of general image registration was documented elsewhere (Brown, 1992), and the details of our proposed coarse-to-fine registration method will be discussed in a separate paper.
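The block-scanning steps (i) to (iii) can be sketched as below. This is an illustrative sketch only: it assumes the common definition NMI = (H(A) + H(B)) / H(A, B) estimated from a joint histogram, and it assumes the visible image has already been resampled so that a scanning block has the same pixel size as the LWIR image. The names `nmi` and `find_matched_block` and the bin count are hypothetical choices.

```python
import numpy as np

def _entropy(p):
    """Shannon entropy of a probability table, ignoring empty cells."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def nmi(a, b, bins=32):
    """Normalized mutual information (H(A) + H(B)) / H(A, B)."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    h_ab = _entropy(p_ab)
    if h_ab == 0:
        return 0.0  # both images constant: NMI is undefined, report 0
    return (_entropy(p_ab.sum(axis=1)) + _entropy(p_ab.sum(axis=0))) / h_ab

def find_matched_block(visible, ref, step=8):
    """Scan `visible` block by block and return (best NMI, top, left)."""
    bh, bw = ref.shape
    best = (-np.inf, 0, 0)
    for top in range(0, visible.shape[0] - bh + 1, step):
        for left in range(0, visible.shape[1] - bw + 1, step):
            score = nmi(visible[top:top + bh, left:left + bw], ref)
            if score > best[0]:
                best = (score, top, left)
    return best
```

Under this definition the NMI of an image with itself is 2, so an exact match stands out clearly against unrelated blocks.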

2.3. Image fusion

Image fusion is a necessary step for the color fusion discussed in this chapter. Image fusion serves to combine multiple-source imagery using advanced image processing techniques. Laplacian pyramid and DWT-based fusion methods are briefly reviewed, while the details of image fusion were documented elsewhere (Zheng et al., 2005).

The Laplacian pyramid was first introduced as a model for binocular fusion in human stereo vision (Burt & Adelson, 1985), where the implementation used a Laplacian pyramid and a maximum selection rule at each point of the pyramid transform. Essentially, the procedure involves a set of band-pass copies of an image, which is referred to as the Laplacian pyramid due to its similarity to a Laplacian operator. Each level of the Laplacian pyramid is recursively constructed from its lower level by applying the following four basic steps: blurring (low-pass filtering); sub-sampling (size reduction); interpolation (expansion); and differencing (subtracting two images pixel by pixel) (Burt & Adelson, 1983). In the Laplacian pyramid, the lowest level of the pyramid is constructed from the original image.

The regular DWT method is a multi-scale analysis method. In a regular DWT fusion process, DWT coefficients from two input images are fused pixel by pixel by choosing the average of the approximation coefficients (i.e., the low-pass filtered image) at the highest transform scale, and the larger absolute value of the detail coefficients (i.e., the high-pass filtered images) at each transform scale. Then, an inverse DWT is performed to obtain a fused image. At each DWT scale of a particular image, the DWT coefficients of a 2D image consist of four parts: approximation, horizontal detail, vertical detail, and diagonal detail. In the advanced DWT (aDWT) method (Zheng et al., 2004), we apply PCA (principal component analysis) to the two input images’ approximation coefficients at the highest transform scale. That is, we fuse them using the principal eigenvector (corresponding to the larger eigenvalue) derived from the two original images, as described in (Eq. 2) below:


$$C_F = \frac{a_1 \cdot C_A + a_2 \cdot C_B}{a_1 + a_2} \tag{2}$$

where $C_A$ and $C_B$ are the approximation coefficients (image matrices) transformed from input images A and B; $C_F$ represents the fused coefficients; $a_1$ and $a_2$ are the elements (scalars) of the principal eigenvector, which are computed by analyzing the original input images. Note that the denominator in (Eq. 2) is used for normalization so that the fused image has the same energy distribution as the original input images.

For the detail coefficients (the other three quarters of the coefficients) at each transform scale, the larger absolute values are selected, followed by neighborhood morphological processing, which serves to verify the selected pixels using a “filling” and “cleaning” operation (i.e., the operation fills or removes isolated pixels locally). Such an operation (similar to smoothing) can increase the consistency of coefficient selection thereby reducing the distortion in the fused image.
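The two coefficient-selection rules can be sketched as follows, assuming the wavelet decomposition has already been computed by some DWT routine (not shown). The sketch assumes (Eq. 2) is a weighted average normalized by a1 + a2, takes the absolute value of the eigenvector elements to remove the sign ambiguity of an eigen-decomposition (an assumption), and omits the morphological "filling" and "cleaning" step. All function names are hypothetical.

```python
import numpy as np

def pca_weights(img_a, img_b):
    """Elements (a1, a2) of the principal eigenvector of the 2x2
    covariance matrix of the two input images."""
    data = np.stack([np.ravel(img_a), np.ravel(img_b)]).astype(float)
    eigvals, eigvecs = np.linalg.eigh(np.cov(data))
    # Column belonging to the larger eigenvalue; abs() removes the
    # arbitrary sign returned by the solver (assumption of this sketch).
    principal = np.abs(eigvecs[:, np.argmax(eigvals)])
    return principal[0], principal[1]

def fuse_approximation(c_a, c_b, a1, a2):
    """Weighted average of approximation coefficients, normalized by a1 + a2."""
    return (a1 * c_a + a2 * c_b) / (a1 + a2)

def fuse_detail(d_a, d_b):
    """Keep, per coefficient, the detail value with larger magnitude."""
    return np.where(np.abs(d_a) >= np.abs(d_b), d_a, d_b)
```

When the two inputs are identical the weights are equal and the fused approximation reduces to the input itself, which is a quick sanity check for the normalization.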


3. Segmentation-based colorization

In the segmentation-based colorization (i.e., local coloring) method, multispectral night vision imagery is rendered segment by segment with the statistical color properties of natural scenes by using the color mapping technique. Eventually, the colorized images resemble daylight pictures. The main steps of segmentation-based colorization are given below:
(1) A false-color image (source image) is first formed by assigning multispectral (two or three band) images to the three RGB channels. The false-colored images usually have an unnatural color appearance.
(2) Then, the false-colored image is segmented using color features and the techniques of nonlinear diffusion, clustering, and region merging. A set of “clusters” is formed by analyzing the histograms of the three components of the diffused image in lαβ color space. Those clusters are merged into “segments” if their similarity values in lαβ space are greater than a preset threshold.
(3) The averaged mean, standard deviation, and histogram of a large sample of natural color images are used as the target color properties for each color scheme. The target color schemes are grouped by their contents and colors such as plants, mountain, roads, sky, water, buildings, people, etc.
(4) The association between the source region segments and target color schemes is carried out automatically utilizing a classification algorithm such as the nearest neighbor paradigm.
(5) The color mapping procedures (statistic-matching and histogram-matching) are carried out to render natural colors onto the false-colored image segment by segment.
(6) The mapped image is then transformed back to the RGB space.
(7) Finally, the mapped image is transformed into HSV (Hue-Saturation-Value) space and the “value” component of the mapped image is replaced with the “fused NV image” (a grayscale image). Note that this fused image replacement is necessary to give the colorized image a proper and consistent contrast.

3.1. Color space transform

In this subsection, the RGB to LMS (long-wave, medium-wave and short-wave) transform is discussed first. Then, the lαβ space is introduced, in which the resulting data representation is compact and symmetrical, and is decorrelated to higher than second order. The reason for the color space transform is to decorrelate the three color components (i.e., l, α and β) so that the manipulation (such as statistic matching and histogram matching) of each color component can be performed independently. Inverse transforms (lαβ space to LMS, and LMS to RGB) are needed to complete the proposed segmentation-based colorization; they are given elsewhere (Zheng & Essock, 2008).

The actual conversion (matrix) from RGB tristimulus to device-independent XYZ tristimulus values depends on the characteristics of the display being used. Fairchild (1998) suggested a “general” device-independent conversion (for use when there is no prior knowledge about the display device) that maps white in the chromaticity diagram to white in the RGB space and vice versa.


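$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} 0.5141 & 0.3239 & 0.1604 \\ 0.2651 & 0.6702 & 0.0641 \\ 0.0241 & 0.1228 & 0.8444 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} \tag{3}$$

The XYZ values can be converted to the LMS space using the following equation:

$$\begin{bmatrix} L \\ M \\ S \end{bmatrix} = \begin{bmatrix} 0.3897 & 0.6890 & -0.0787 \\ -0.2298 & 1.1834 & 0.0464 \\ 0.0000 & 0.0000 & 1.0000 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \tag{4}$$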


A logarithmic transform is employed here to reduce the data skew that existed in the above color space:

$$\mathbf{L} = \log L, \quad \mathbf{M} = \log M, \quad \mathbf{S} = \log S \tag{5}$$

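Ruderman et al. (1998) presented a color space, named lαβ (Luminance-Alpha-Beta), which can decorrelate the three axes in the LMS space:

$$\begin{bmatrix} l \\ \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} 1/\sqrt{3} & 0 & 0 \\ 0 & 1/\sqrt{6} & 0 \\ 0 & 0 & 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & -2 \\ 1 & -1 & 0 \end{bmatrix} \begin{bmatrix} \mathbf{L} \\ \mathbf{M} \\ \mathbf{S} \end{bmatrix} \tag{6}$$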


The three axes can be considered as an achromatic direction (l), a yellow-blue opponent direction (α), and a red-green opponent direction (β). The lαβ space is compact, symmetrical, and decorrelated, which greatly facilitates the subsequent color-mapping process (see Section 3.4).
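The forward chain RGB to XYZ to LMS to logarithmic LMS to lαβ can be sketched in a few lines. The matrix values below are the widely published ones for this conversion chain and should be read as an assumption of this sketch; the base-10 logarithm, the small `eps` guard against log(0), and the name `rgb_to_lalphabeta` are likewise assumptions.

```python
import numpy as np

# Assumed, commonly published conversion matrices.
RGB2XYZ = np.array([[0.5141, 0.3239, 0.1604],
                    [0.2651, 0.6702, 0.0641],
                    [0.0241, 0.1228, 0.8444]])
XYZ2LMS = np.array([[0.3897, 0.6890, -0.0787],
                    [-0.2298, 1.1834, 0.0464],
                    [0.0000, 0.0000, 1.0000]])
LMS2LAB = np.diag([1 / np.sqrt(3), 1 / np.sqrt(6), 1 / np.sqrt(2)]) @ np.array(
    [[1.0, 1.0, 1.0],
     [1.0, 1.0, -2.0],
     [1.0, -1.0, 0.0]])

def rgb_to_lalphabeta(rgb, eps=1e-6):
    """Convert an (..., 3) RGB array in [0, 1] to l-alpha-beta."""
    rgb = np.asarray(rgb, dtype=float)
    lms = rgb @ (XYZ2LMS @ RGB2XYZ).T          # linear LMS
    log_lms = np.log10(np.maximum(lms, eps))   # logarithmic LMS
    return log_lms @ LMS2LAB.T                 # rotate to l, alpha, beta
```

A useful check is that neutral gray pixels land very close to zero on both opponent axes, since only the achromatic component l should carry their intensity.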

3.2. Image segmentation

The nonlinear diffusion procedure has proven to be equivalent to an adaptive smoothing process (Barash & Comaniciu, 2004). The diffusion is applied to the false-colored NV image here to obtain a smooth image, which significantly facilitates the subsequent segmentation process. The clustering process is performed separately on each color component in the lαβ color space to form a set of “clusters”. The region merging process is used to merge the fragmental clusters into meaningful “segments” (based on a similarity metric defined in 3D lαβ color space) that will be used for the color-mapping process.

3.2.1. Adaptive smoothing with nonlinear diffusion

Nonlinear diffusion methods have been proven as powerful methods in the denoising and smoothing of image intensities while retaining and enhancing edges. Barash and Comaniciu (2004) have proven that nonlinear diffusion is equivalent to adaptive smoothing and bilateral filtering is obtained from an extended nonlinear diffusion. Nonlinear diffusion filtering was first introduced by Perona and Malik (1990). Basically, diffusion is a PDE (partial differential equation) method that involves two operators, smoothing and gradient, in 2D image space. The diffusion process smoothes the regions with lower gradients and stops the smoothing at region boundaries with higher gradients. Nonlinear diffusion means the smoothing operation depends on the region gradient distribution. For color image diffusion, three RGB components of a false-colored NV image are filtered separately (one by one). The number of colors in the diffused image will be significantly reduced and will benefit the subsequent image segmentation procedures – clustering and merging.

3.2.2. Image segmentation with clustering and region merging

The diffused false-colored image is transformed into the lαβ color space. Each component (l, α or β) of the diffused image is clustered in the lαβ space by individually analyzing its histogram. Specifically, for each intensity component (image) l, α or β, (i) normalize the intensity onto [0,1]; (ii) bin the normalized intensity to a certain number of levels NBin and perform the histogram analysis; (iii) with the histogram, locate local extreme values (i.e., peaks and valleys) and form a stepwise mapping function using the peaks and valleys; (iv) complete the clustering utilizing the stepwise mapping function.

The local extremes (peaks or valleys) are easily located by examining the zero-crossing points of the first derivative of the histogram. Furthermore, “peaks” and “valleys” are expected to be interleaved (e.g., valley-peak-valley-…-peak-valley); otherwise, a new valley value can be calculated as the midpoint of two neighboring peaks. In addition, the two end boundaries of the histogram are considered two special valleys (rather than peaks). In summary, all intensities between two valleys in a histogram are mapped to the peak intensity lying between them. If there are n peaks in a histogram, then an n-step mapping function is formed. If there are two or more valley values (including the special valley at the left end) on the left side of the leftmost peak, then the special (extreme) valley intensity is used.
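A simplified sketch of the histogram-based clustering of one component is shown below. It is not the exact procedure of the text: it locates peaks as local maxima of the binned histogram and, as a simplification, always places the valley at the midpoint between neighboring peaks. The function name `cluster_component` is hypothetical; the default of 24 bins matches the NBin value reported in Section 5.

```python
import numpy as np

def cluster_component(channel, n_bins=24):
    """Map every pixel of one color component to the intensity of the
    histogram peak whose interval (between valleys) contains it."""
    c = np.asarray(channel, dtype=float)
    span = c.max() - c.min()
    c = (c - c.min()) / span if span > 0 else np.zeros_like(c)

    counts, edges = np.histogram(c, bins=n_bins, range=(0.0, 1.0))
    centers = 0.5 * (edges[:-1] + edges[1:])

    # A bin is a peak if it is non-empty, higher than its left neighbor
    # and not lower than its right neighbor (one peak per plateau).
    padded = np.concatenate(([-1], counts, [-1]))
    is_peak = (padded[1:-1] > padded[:-2]) & (padded[1:-1] >= padded[2:]) & (counts > 0)
    peaks = centers[is_peak]

    # Simplification: valleys are midpoints between neighboring peaks.
    valleys = 0.5 * (peaks[:-1] + peaks[1:])

    # Stepwise mapping: interval index -> peak intensity.
    return peaks[np.searchsorted(valleys, c)]
```

For an image with two well-separated intensity levels the result contains exactly two distinct values, i.e., a two-step mapping function.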

Clustering is done by separately analyzing three components (l, α & β) of the false-colored image, which may result in inconsistent clusters in the sense of colors. Region merging is necessary to incorporate the fragmental “clusters” into meaningful “segments” in the sense of colors, which will improve the color consistency in a colorized image. If two clusters are similar (i.e., Qw(x,y) > TQ (a predefined threshold)), these two clusters will be merged. Qw(x,y) is a similarity metric between two clusters, x and y, which is defined in the lαβ color space as follows:


$$Q_w(x,y) = \sum_{k \in \{l,\alpha,\beta\}} w_k \, Q_k(x,y) \tag{7a}$$

where $w_k$ is a given weight for each color component. $Q_k(x,y)$ is formulated as (Eq. 7b):

where $\bar{x}$ and $\sigma_x$ are the mean and the standard deviation of cluster x in a particular component, respectively. Similar definitions are applied to cluster y. The sizes (i.e., areas) of two clusters (x and y) are usually unequal. Notice that $Q_k(x,y)$ is computed with regard to the diffused false-color image.
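The merging test can be sketched as below. The weighted sum over the three components follows the description of Qw, and the weights and threshold are the values reported in Section 5. The per-component term `q_component` is a hypothetical stand-in built only from the cluster means and standard deviations, because the exact form of (Eq. 7b) is not reproduced here; it should not be read as the chapter's formula. The components are assumed to be normalized to [0, 1] so that the means are non-negative.

```python
import numpy as np

def q_component(x, y, eps=1e-12):
    """Hypothetical per-component similarity in (0, 1], equal to 1 when
    both clusters share the same mean and standard deviation."""
    mx, my = np.mean(x), np.mean(y)
    sx, sy = np.std(x), np.std(y)
    q_mean = (2 * mx * my + eps) / (mx ** 2 + my ** 2 + eps)
    q_std = (2 * sx * sy + eps) / (sx ** 2 + sy ** 2 + eps)
    return q_mean * q_std

def q_weighted(cluster_x, cluster_y, weights=(0.25, 0.35, 0.40)):
    """Weighted similarity over the (l, alpha, beta) columns of two
    clusters given as (n_pixels, 3) arrays of possibly unequal length."""
    return sum(w * q_component(cluster_x[:, k], cluster_y[:, k])
               for k, w in enumerate(weights))

def should_merge(cluster_x, cluster_y, threshold=0.90):
    """Merge two clusters when their weighted similarity exceeds TQ."""
    return q_weighted(cluster_x, cluster_y) > threshold
```

Because only means and standard deviations enter the measure, the two clusters do not need to contain the same number of pixels.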

3.3. Automatic segment recognition

A nearest neighbor (NN) paradigm (Keysers et al., 2002) is used to classify the segments obtained from the preceding procedure (described in Section 3.2). To use the NN algorithm, a distance measure between two segments is needed. The similarity metric Qw(x,y) (as defined in (Eq. 7)) between two segments, x and y, is used as the distance measure. Thus, the closer two segments are in lαβ space, the larger their similarity.

Similar to a training process, a look up table (LUT) has to be built under supervision to classify a given segment (sj) into a known color group (Ci), i.e., Ci = T(sj), (i ≤ j), where sj is a feature vector that distinguishingly describes each segment; Ci stands for a known color scheme (e.g., sky, clouds, plants, water, ground, roads, etc.); and T is a classification function (i.e., a trained classifier). We use segment color statistics (e.g., mean and deviation of each channel) as features (of six statistical variables). The statistical features (sj) are computed using the diffused false-color images and the color mapping process is carried out between a false-color segment and a daylight color scheme. The reason for using the diffused false-color images here is because the diffused images are less sensitive to noise. In a training stage, a set of multispectral NV images are analyzed and segmented such that a sequence of feature vectors, {sj} can be computed and the LUT (mapping) between {sj} and {Ci} can be manually set up upon the experimental results. In a classifying (testing) stage, all Qw(xk, sj) values (for j = 1, 2, 3, …) are calculated, where xk means the current classified segment and sj represents one of the existing segments from the training stage. Certainly, xk is automatically classified into the color group of the largest Qw (similarity). For example, if Qw(x1, s5) is the maximum, then the segment of x1 will be colorized using the color scheme T(s5) that is the color used to render the segment of s5 in the training stage.

3.4. Color mapping

3.4.1. Statistic matching

A “statistic matching” is used to transfer the color characteristics from natural daylight imagery to false color night-vision imagery, which is formulated as:


$$I_C^k = (I_S^k - \mu_S^k) \cdot \frac{\sigma_T^k}{\sigma_S^k} + \mu_T^k, \quad k \in \{l, \alpha, \beta\} \tag{8}$$

where $I_C$ is the colored image and $I_S$ is the source (false-color) image in lαβ space; μ denotes the mean and σ denotes the standard deviation; the subscripts ‘S’ and ‘T’ refer to the source and target images, respectively; and the superscript ‘k’ is one of the color components: {l, α, β}.

After this transformation, the pixels comprising the multispectral source image have means and standard deviations that conform to the target daylight color image in lαβ space. The color-mapped image is transformed back to the RGB space through the inverse transforms (lαβ space to logarithmic LMS, exponential transform from logarithmic LMS to linear LMS, and LMS to RGB; refer to (Eq. 3) through (Eq. 6)) (Zheng & Essock, 2008).
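Statistic matching amounts to a per-component shift and scale, as in the sketch below. It assumes the source is an (H, W, 3) array already in lαβ space (or the pixels of one segment) and that the target mean and standard deviation per component are supplied by the chosen color scheme; the name `statistic_match` and the `eps` guard are hypothetical.

```python
import numpy as np

def statistic_match(source, target_mean, target_std, eps=1e-12):
    """Give each component of `source` the target mean and standard deviation."""
    source = np.asarray(source, dtype=float)
    mu_s = source.mean(axis=(0, 1))
    sigma_s = source.std(axis=(0, 1))
    scale = np.asarray(target_std, dtype=float) / (sigma_s + eps)
    return (source - mu_s) * scale + np.asarray(target_mean, dtype=float)
```

In the segmentation-based method this would be applied once per segment, using the statistics of the color scheme associated with that segment.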

3.4.2. Histogram matching

Histogram matching (also referred to as histogram specification) is usually used to enhance an image when histogram equalization fails (Gonzalez & Woods, 2002). Given the shape of the histogram that we want the enhanced image to have, histogram matching can generate a processed image that has the specified histogram. In particular, by specifying the histogram of a target image (with daylight natural colors), a source image (with false colors) resembles the target image in terms of histogram distribution after histogram matching. Similar to statistic matching, histogram matching also serves for color mapping and is performed component-by-component in lαβ space. Histogram matching and statistic matching can be applied separately or jointly.
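Histogram matching of a single component can be sketched with the usual cumulative-distribution mapping. This is a generic textbook formulation, not necessarily the exact implementation used in the chapter; the name `match_histogram` is hypothetical, and the function would be called once for each of l, α and β.

```python
import numpy as np

def match_histogram(source, target):
    """Remap `source` values so that their distribution follows `target`."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)

    s_vals, s_idx, s_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True)
    t_vals, t_counts = np.unique(target.ravel(), return_counts=True)

    # Empirical cumulative distribution functions of both images.
    s_cdf = np.cumsum(s_counts) / source.size
    t_cdf = np.cumsum(t_counts) / target.size

    # For each source level, pick the target level at the same quantile.
    mapped = np.interp(s_cdf, t_cdf, t_vals)
    return mapped[s_idx].reshape(source.shape)
```

The mapping is monotonic, so the ordering of pixel intensities in the source is preserved while the output range follows that of the target.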


4. Channel-based color fusion

The segmentation-based colorization described in Section 3 can usually produce colorized night-vision images that closely resemble natural daylight pictures. However, this segmentation-based coloring procedure involves many processes and heavy computations such as histogram analysis, color space transform, image segmentation, and pattern classification, which makes it very challenging for real time applications. Therefore, we propose a fast color fusion method, termed channel-based color fusion, which is ideally efficient enough for real time applications. Notice that the term “color fusion” means combining multispectral images into a color image with the purpose of resembling natural scenes. Relative to “night vision colorization”, color fusion trades color realism for speed. False coloring techniques, on the other hand, have no intention of resembling natural color scenery.

The general framework of channel-based color fusion is as follows, (i) prepare for color fusion, preprocessing (denoising, normalization and enhancement) and image registration; (ii) form a color fusion image by properly assigning multispectral images to red, green, and blue channels; (iii) then fuse multispectral images (gray fusion) using aDWT algorithm (see Section 2.3); and (iv) replace the value component of color fusion in HSV color space with the gray-fusion image, and finally transform back to RGB space.
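Steps (ii) and (iv) of the framework can be sketched as below. The sketch relies on the fact that, in HSV, value is max(R, G, B) while hue and saturation are unchanged when all three channels are scaled by the same positive factor, so replacing the value component is equivalent to a per-pixel rescaling of RGB. The channel assignment passed to `color_fuse` is a placeholder chosen by the caller, not the specific expressions of Section 4.1, and both function names are hypothetical. The gray-fusion image is assumed to come from a separate fusion routine.

```python
import numpy as np

def replace_value(rgb, new_value, eps=1e-12):
    """Replace the HSV value component of an (H, W, 3) RGB image in [0, 1]
    with `new_value` (H, W), keeping hue and saturation."""
    rgb = np.asarray(rgb, dtype=float)
    new_v = np.asarray(new_value, dtype=float)[..., None]
    v = rgb.max(axis=-1, keepdims=True)          # current value component
    scaled = rgb * (new_v / np.maximum(v, eps))  # same hue and saturation
    # Black pixels carry no hue: render them as gray at the new value.
    out = np.where(v > eps, scaled, new_v)
    return np.clip(out, 0.0, 1.0)

def color_fuse(red, green, blue, gray_fused):
    """Stack three preprocessed band images into RGB, then impose the
    gray-fusion image as the value component."""
    false_color = np.stack([red, green, blue], axis=-1)
    return replace_value(false_color, gray_fused)
```

Working directly on RGB avoids an explicit round trip through HSV, which fits the stated aim of keeping the channel-based method fast.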

In night vision imaging, there may be two or several bands of images available, for example, visible (RGB), image intensified (II), near infrared (NIR), medium wave infrared (MWIR), long wave infrared (LWIR, also called thermal). The discussions of following subsections focus on how to form a channel-wise color fusion with the available multispectral images.

4.1. Color fusion with two-band images

Upon the available images and common applications, we will discuss two-band color fusion of (II LWIR), (NIR LWIR), (RGB LWIR), and (RGB NIR), although other combinations of two bands may be possible in some applications. The symbol ‘’ denotes the fusion of multiband images.

4.1.1. Color fusion of (II LWIR)

Suppose a color fusion image (FC) consists of three color planes, FR, FG, FB. The color fusion of II and LWIR images is formed by using the following expressions,


where S[0.1,I_Gmax][0.2,1] denotes piecewise contrast stretching defined in (Eq. 1) and I_Gmax = min([μII+3σII],0.8), µ and σ are the mean and standard deviation of an II image; [1.0- ILWIR] is to invert LWIR image; symbol ‘•’ means element-by-element multiplication; VF is the value component of FC in HSV space, Fus() means image fusion operation using aDWT algorithm. Although the limits given in contrast stretching are obtained empirically according to the night vision images that we had, it is viable to formulate the expressions and automate the fusion based upon a set of conditions (imaging devices, imaging time, and application location). Notice the transform parameters in (Eq. 9) were applied to all color fusions in our experiments.

4.1.2. Color fusion of (NIR LWIR)

A color fusion of NIR and LWIR is formulated by,


where I_Gmax = min([μNIR+2σNIR],0.8), min() is an operation to get the minimal number. Other notes are the same as that in (Eq. 9).

4.1.3. Color fusion of (RGB LWIR)

Two-band color fusion of RGB and LWIR is described as follows,


where IRed, IGreen and IBlue are the three channel images of a RGB image; I_Rmax = min([μRed+8σRed],0.6), min() is an operation to get the minimal number; max[ILWIR, IRed] is to take the maximal pixel values between ILWIR and IRed. In fact, this color fusion only modifies the red channel in a RGB image, where the piecewise contrast stretching in (Eq. 11a) is to keep a good color balance and contrast. No image fusion is used because the RGB images captured at night time are usually very noisy. Of course, no HSV transform is performed.

4.1.4. Color fusion of (RGB NIR)

The color fusion of RGB and NIR is defined as,


where I_Gmax = min([μGreen+6σGreen],0.6). Other notes are the same as that in (Eq. 11). This color fusion actually modifies the green channel in a RGB image. No image fusion and no HSV transform are performed. The color fusion of (RGB NIR) is not used as often as the fusion of (RGB LWIR).

4.2. Color fusion with three-band images

Due to the available image databases, we only discuss one application of three-band color fusions, (RGB NIR LWIR). A color fusion of RGB, NIR and LWIR can be described as,


where I_Gmax = min([μNIR+2.5σNIR],0.8), I_Bmax = min([μBlue+3σBlue],0.85), IBlue is the blue channel image of a RGB image. The other two channels (red and green) are not used for the color fusion.


5. Experimental results and discussions

Two sets of multispectral images, taken at night time and referred to as “NV-set 1” and “NV-set 2”, were used in our experiments. In NV-set 1, three pairs of multispectral images (as shown in Figs. 1-3), image intensified (II) and long wave infrared (LWIR), were analyzed by using the aDWT fusion algorithm and the segmentation-based colorization (also referred to as “local coloring”) algorithm described in Section 3. The results of segmentation-based colorization are illustrated in Figs. 1-3. Note that no post-processing was imposed on the resulting fusion and colorization images.

Figure 1.

Segmentation-based colorization with Sample #1 (531×401 pixels) in NV-set 1: (a) and (b) are the II and LWIR images; (c) is the image fused by aDWT; (d) is the segmented image from a false-colored image (not shown), where 16 segments were merged from 36 clusters; (e) is the colored image, where six auto-classified color schemes (sky, clouds, plants, water, ground and others) were mapped by jointly using histogram-matching and statistic-matching; (f) is the channel-based color fusion of (II LWIR).

Figure 2.

Segmentation-based colorization with Sample #2 (360×270 pixels) in NV-set 1: (a) and (b) are the II and LWIR images; (c) is the image fused by aDWT; (d) is the segmented image of 12 segments merged from 21 clusters; (e) is the colored image with five auto-classified color schemes (plants, roads, ground, building and others); (f) is the channel-based color fusion of (II LWIR).

Figure 3.

Segmentation-based colorization with Sample #3 (360×270 pixels) in NV-set 1: (a) and (b) are the II and LWIR images; (c) is the image fused by aDWT; (d) is the segmented image of 14 segments merged from 28 clusters; (e) is the colored image with three auto-classified color schemes (plants, smoke and others); (f) is the channel-based color fusion of (II LWIR).

The two input images and the fused images used in the coloring process are shown in Figs. 1-3a, Figs. 1-3b and Figs. 1-3c, respectively. The image resolutions are given in the figure captions. The two input images in NV-set 1 were preregistered. The false-colored images (not shown in Figs. 1-3) were obtained by assigning the image intensified (II) images to the blue channels, the infrared (IR) images to the red channels, and the average of the II and IR images to the green channels. The rationale for forming a false-color image in this way is to assign the long-wavelength NV image to the red channel and the short-wavelength NV image to the blue channel. The number of false colors was reduced with the nonlinear diffusion algorithm, implemented with AOS (additive operator splitting, for fast computation), which facilitated the subsequent segmentation. The segmentation was done in lαβ space through clustering and merging operations (the clustered images are not shown in Figs. 1-3). The parameter values used in clustering and merging are NBin = [24 24 24], wk = [0.25 0.35 0.40] and TQ = 0.90. To emphasize the two chromatic channels in lαβ space (which are more distinguishable among segments), relatively larger weights were assigned to them in wk. With the segment maps shown in Figs. 1-3d, histogram-matching and statistic-matching were performed segment by segment in lαβ space. The source region segments were automatically recognized and associated with proper target color schemes (after the training process was done). The final images colored by segmentation-based colorization are shown in Figs. 1-3e. On visual examination, the colored images appear natural, realistic, and colorful. Comparable colorization results obtained with the global coloring algorithm are presented in (Zheng & Essock, 2008). The segmentation-based coloring process is fully automatic and adapts well to different types of multispectral images.
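The false-color channel assignment described above (IR to red, the II/IR average to green, II to blue) can be sketched as follows. This is only an illustration under the assumption that both images are preregistered, single-channel arrays scaled to [0, 1]; the function name `false_color` and the tiny test arrays are hypothetical, and the later diffusion, segmentation and color-mapping steps are not shown.

```python
import numpy as np

def false_color(ii, ir):
    """Stack II and IR images into an RGB false-color image.

    red   <- IR (long wavelength)
    green <- average of II and IR
    blue  <- II (short wavelength)
    """
    ii = np.asarray(ii, dtype=np.float64)
    ir = np.asarray(ir, dtype=np.float64)
    if ii.shape != ir.shape:
        raise ValueError("II and IR images must be registered to the same size")
    return np.stack([ir, 0.5 * (ii + ir), ii], axis=-1)

# Tiny synthetic example (assumption: intensities already scaled to [0, 1]).
ii = np.array([[0.2, 0.4], [0.6, 0.8]])
ir = np.array([[1.0, 0.0], [0.5, 0.5]])
rgb = false_color(ii, ir)  # shape (2, 2, 3), channels ordered R, G, B
```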

Two-band channel-based color fusion (Eq. 9) was applied to the II and LWIR images (shown in Figs. 1-3a, b), and the results are illustrated in Figs. 1-3f. The color fusion results are good, especially in representing vegetation. Compared to the segmentation-based colorization results, the channel-based color fusion appears less realistic in regions such as the sky and roads shown in Figs. 1-2f. However, channel-based color fusion eliminates the need for segmentation and classification, and also reduces the number of color transforms. Its processing speed is much faster than that of segmentation-based colorization.

In NV-set 2, four sets of multispectral images (shown in Figs. 4-7), each comprising color RGB, near infrared (NIR) and long wave infrared (LWIR) images, were analyzed by using the channel-based color fusion algorithm described in Section 4. The results of channel-based color fusion are presented in Figs. 4-8.

The three-band input images used in the color fusion process are shown in Figs. 4-7a, b and c, respectively. The image resolutions are given in the figure captions. The RGB and LWIR images were taken by a FLIR SC620 two-in-one camera, which has an LWIR camera (640×480 pixel original resolution, 7.5~13 μm spectral range) and an integrated visible-band digital camera (2048×1536 pixel original resolution). The NIR images were taken by a FLIR SC6000 camera (640×512 pixel original resolution, 0.9~1.7 μm spectral range). The two cameras (SC620 and SC6000) were mounted on the same fixture in turn and aimed in the same direction. The images were captured at sunset and at dusk in the fall season. Image registration as described in Section 2.2 was applied to the three-band images shown in Figs. 4-7; the RGB images shown in Figs. 6-7a were aligned manually because those visible images are very dark and noisy. To better present the RGB images, contrast and brightness adjustments (as described in the figure captions) were applied. Note that piecewise contrast stretching (Eq. 1) was used for NIR enhancement.

The images fused with the aDWT algorithm are shown in Figs. 4-7d. Two-band channel-based color fusion (Eq. 10) was applied to the NIR and LWIR images (shown in Figs. 4-7b, c), and the results are illustrated in Figs. 4-7e, while the three-band color fusion (Eq. 13) of (RGB NIR LWIR) is shown in Figs. 4-7f. Relative to gray fusion (Figs. 4-7d), the two-band color-fusion images (Figs. 4-7e) resemble natural colors, which makes scene classification much easier. In the color-fusion images, the trees and grass can be easily distinguished from the ground (parking lots) and the sky. The car and the person are easily identified in Figs. 6-7e. In Fig. 6e, the water area (between the ground and the trees, shown in cyan) is clearly noticeable, whereas it is hard to discern in the gray-fusion image (Fig. 6d).

The three-band color fusion of (RGB NIR LWIR) shows some improvement in Figs. 4-5f, where the lighting condition is good. For example, the trees, sky and ground shown in Figs. 4-5f are represented in more realistic colors than those in Figs. 4-5e. However, there is no significant difference between the two-band and three-band color fusions shown in Figs. 6-7, because the RGB images were taken under poor lighting conditions.

The two-band channel-based color fusion of (RGB LWIR), as defined in Eq. 11, is demonstrated in Fig. 8a-c, while the color fusion of (RGB NIR), as defined in Eq. 12, is illustrated in Fig. 8d-f. No additional brightness or contrast adjustments were applied to these color-fusion images. In Fig. 8, the top-row images appear reddish, while the bottom-row images appear greenish. These color-fusion images (under poor illumination) are not very realistic, but they offer better representation and visibility than the original RGB images (Figs. 4-6a). No color fusions of (RGB LWIR) or (RGB NIR) are presented for the images shown in Fig. 7 because of the poor quality of the RGB image (Fig. 7a).

The segmentation-based colorization demonstrated here took two-band multispectral images (II and LWIR) as inputs. In fact, the procedure can accept two or three input images (e.g., II, NIR, LWIR). If more than three bands are available (e.g., II, NIR, MWIR, LWIR), we may choose the low-light intensified (visual-band) image and two bands of IR images. As for how to choose the two IR bands, the image fusion algorithm can be used as a screening process: the two IR images selected for colorization should be the pair that produces the most informative fused image among all possible pairwise fusions. For example, given three IR images, IR1, IR2 and IR3, the two images chosen for colorization, IC1 and IC2, should satisfy Fus(IC1, IC2) = max{Fus(IR1, IR2), Fus(IR1, IR3), Fus(IR2, IR3)}, where Fus stands for the fusion process and max means selecting the fusion with maximum information.
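The screening rule above can be illustrated with a short sketch. Two stand-ins are used and both are assumptions rather than the chapter's method: pixel-wise averaging replaces the aDWT fusion, and the Shannon entropy of the fused image serves as the measure of "information" (the text does not specify a metric). The names `entropy`, `average_fusion` and `select_pair`, and the synthetic bands, are hypothetical.

```python
from itertools import combinations

import numpy as np

def entropy(image, bins=256):
    """Shannon entropy (in bits) of an image with values in [0, 1]."""
    hist, _ = np.histogram(image, bins=bins, range=(0.0, 1.0))
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def average_fusion(a, b):
    """Placeholder fusion: pixel-wise mean (NOT the aDWT algorithm)."""
    return 0.5 * (a + b)

def select_pair(images, fuse=average_fusion, score=entropy):
    """Return the names of the two images whose fusion scores highest."""
    return max(
        combinations(sorted(images), 2),
        key=lambda pair: score(fuse(images[pair[0]], images[pair[1]])),
    )

# Synthetic bands (assumption: registered, same size, scaled to [0, 1]).
rng = np.random.default_rng(1)
bands = {
    "IR1": rng.uniform(0.0, 1.0, size=(64, 64)),  # textured band
    "IR2": rng.uniform(0.0, 1.0, size=(64, 64)),  # textured band
    "IR3": np.full((64, 64), 0.5),                # flat band, no detail
}
best = select_pair(bands)
```

Because the flat band IR3 contributes no detail, the pair (IR1, IR2) is expected to be selected here. Passing the fusion and scoring functions as arguments keeps the selection logic independent of the particular fusion algorithm and quality metric.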

Figure 4.

Channel-based color fusion with Sample #1 (Case# AT008 – sunset time; 640×480 pixels) in NV-set 2: (a) Color RGB image (contrast increased by 10%); (b) NIR image; (c) LWIR image; (d) Fused image of (b) & (c) by aDWT algorithm; (e) Channel-based color fusion of (NIR LWIR); (f) Channel-based color fusion of (RGB NIR LWIR).

Figure 5.

Channel-based color fusion with Sample #2 (Case# AT010 – after sunset; 640×480 pixels) in NV-set 2: (a) Color RGB image (brightness and contrast both increased by 10%); (b) NIR image; (c) LWIR image; (d) Fused image of (b) & (c) by aDWT algorithm; (e) Channel-based color fusion of (NIR LWIR); (f) Channel-based color fusion of (RGB NIR LWIR).

Figure 6.

Channel-based color fusion with Sample #3 (Case# AT012 – dusk time; 640×480 pixels) in NV-set 2: (a) Color RGB image (brightness and contrast both increased by 10%); (b) NIR image; (c) LWIR image; (d) Fused image of (b) & (c) by aDWT algorithm; (e) Channel-based color fusion of (NIR LWIR); (f) Channel-based color fusion of (RGB NIR LWIR).

Figure 7.

Channel-based color fusion with Sample #4 (Case# AT013 – dusk time; 640×480 pixels) in NV-set 2: (a) Color RGB image (brightness and contrast both increased by 10%); (b) NIR image; (c) LWIR image; (d) Fused image of (b) & (c) by aDWT algorithm; (e) Channel-based color fusion of (NIR LWIR); (f) Channel-based color fusion of (RGB NIR LWIR).

Figure 8.

Two-band channel-based color fusion: (a-c) Color fusion of (RGB LWIR); (d-f) Color fusion of (RGB NIR). Original images are shown in Figs. 4-6a, b, and c.

We demonstrated channel-based color fusion with the possible combinations of two-band and three-band multispectral images. Channel-based fusion is much faster than segmentation-based colorization, but its colors are less natural. The parameter settings in channel-based color fusion (Eqs. 9-13) may vary with the image bands and with the capture time and season; they can be determined and stored before a field application.


6. Conclusions

In this chapter, a set of color fusion and colorization approaches is presented to enhance night vision for human users; the approaches can be performed automatically and adaptively regardless of the image contents. Experimental results with multispectral imagery showed that the colored images contain clear information and realistic colors. Specifically, the segmentation-based colorization (local-coloring) procedure is based on image segmentation, pattern recognition, and color mapping, and produces more colorful and more realistic colorized night-vision images. On the other hand, the channel-based color fusion procedure generates visually impressive color-fusion images using linear transforms and channel assignments, and can be implemented very efficiently for real-time applications. The synthesized multispectral imagery produced by the proposed colorizing approaches is expected to lead to improved performance in remote sensing, nighttime navigation, and situational awareness.



This research is supported by the U.S. Army Research Office under grant number W911NF-08-1-0404.


  1. Ajazzi, B., Alparone, L., Baronti, S., Carla, R. (1998). Assessment of pyramid-based multisensor image data fusion, in Proc. SPIE 3500, 237-248.
  2. Barash, D., Comaniciu, D. (2004). A common framework for nonlinear diffusion, adaptive smoothing, bilateral filtering and mean shift, Image Vision Computing 22(1), 73-81.
  3. Brown, L. (1992). A survey of image registration techniques, ACM Comput. Surv. 24, 325-376.
  4. Burt, P. J., Adelson, E. H. (1983). The Laplacian pyramid as a compact image code, IEEE Trans. Commun. Com-31(4), 532-540.
  5. Burt, P. J., Adelson, E. H. (1985). Merging images through pattern decomposition, Proc. SPIE 575, 173-182.
  6. Chen, Q., Defrise, M., Deconinck, F. (1994). Symmetric Phase-Only Matched Filtering of Fourier-Mellin Transforms for Image Registration and Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 16(12), 1156-1168.
  7. Essock, E. A., McCarley, J. S., Sinai, M. J., DeFord, J. K. (2001). Human perception of sensor-fused imagery, in Interpreting Remote Sensing Imagery: Human Factors, R. R. Hoffman and A. B. Markman, Eds., Lewis Publishers, Boca Raton, Florida.
  8. Essock, E. A., Sinai, M. J., et al. (1999). Perceptual ability with real-world nighttime scenes: image-intensified, infrared, and fused-color imagery, Hum. Factors 41(3), 438-452.
  9. Fairchild, M. D. (1998). Color Appearance Models, Addison Wesley Longman Inc., ISBN 0-20163-464-3, MA.
  10. Gonzalez, R. C., Woods, R. E. (2002). Digital Image Processing (Second Edition), Prentice Hall, ISBN 0-20118-075-8, Saddle River, NJ.
  11. Hill, D. L. G., Batchelor, P. (2001). Registration methodology: concepts and algorithms, in Medical Image Registration, Hajnal, J. V.; Hill, D. L. G.; & Hawkes, D. J. Eds, Boca Raton, FL.
  12. Jahard, F., Fish, D. A., Rio, A. A., Thompson, C. P. (1997). Far/near infrared adapted pyramid-based fusion for automotive night vision, in IEEE Proc. 6th Int. Conf. on Image Processing and its Applications (IPA97), 886-890.
  13. Keys, L. D., Schmidt, N. J., Phillips, B. E. (1990). A prototype example of sensor fusion used for a siting analysis, in Technical Papers 1990, ACSM-ASPRS Annual Conf. Image Processing and Remote Sensing 4, 238-249.
  14. Keysers, D., Paredes, R., Ney, H., Vidal, E. (2002). Combination of tangent vectors and local representations for handwritten digit recognition, Int. Workshop on Statistical Pattern Recognition, Lecture Notes in Computer Science 2396, 538-547, Windsor, Ontario, Canada.
  15. et al. (1999). Image fusion with additive multiresolution wavelet decomposition; applications to spot+landsat images, J. Opt. Soc. Am. A 16, 467-474.
  16. Perona, P., Malik, J. (1990). Scale space and edge detection using anisotropic diffusion, IEEE Transactions on Pattern Analysis and Machine Intelligence 12, 629-639.
  17. Pohl, C., Genderen, J. L. V. (1998). Review article: multisensor image fusion in remote sensing: concepts, methods and applications, Int. J. Remote Sens. 19(5), 823-854.
  18. Pu, T., Ni, G. (2000). Contrast-based image fusion using the discrete wavelet transform, Opt. Eng. 39(8), 2075-2082.
  19. Rogers, R. H., Wood, L. (1990). The history and status of merging multiple sensor data: an overview, in Technical Papers 1990, ACSM-ASPRS Annual Conf. Image Processing and Remote Sensing 4, 352-360.
  20. Toet, A. (2003). Natural colour mapping for multiband nightvision imagery, Information Fusion 4, 155-166.
  21. Toet, A., IJspeert, J. K. (2001). Perceptual evaluation of different image fusion schemes, in: I. Kadar (Ed.), Signal Processing, Sensor Fusion, and Target Recognition X, The International Society for Optical Engineering, Bellingham, WA, 436-441.
  22. Toet, A., IJspeert, J. K., Waxman, A. M., Aguilar, M. (1997). Fusion of visible and thermal imagery improves situational awareness, in: J. G. Verly (Ed.), Enhanced and Synthetic Vision 1997, International Society for Optical Engineering, Bellingham, WA, 177-188.
  23. Varga, J. T. (1999). Evaluation of operator performance using true color and artificial color in natural scene perception (Report ADA363036), Naval Postgraduate School, Monterey, CA.
  24. Waxman, A. M., Gove, A., et al. (1996). Progress on color night vision: visible/IR fusion, perception and search, and low-light CCD imaging, Proc. SPIE 2736, 96-107, Enhanced and Synthetic Vision 1996, Jacques G. Verly, Ed.
  25. Zheng, Y., Essock, E. A. (2008). A local-coloring method for night-vision colorization utilizing image analysis and image fusion, Information Fusion 9, 186-199.
  26. Zheng, Y., Essock, E. A., Hansen, B. C. (2005). An advanced DWT fusion algorithm and its optimization by using the metric of image quality index, Optical Engineering 44(3), 037003-1-12.
  27. Zheng, Y., Essock, E. A., Hansen, B. C. (2004). An advanced image fusion algorithm based on wavelet transform-incorporation with PCA and morphological processing, Proc. SPIE 5298, 177-187.
  28. Zheng, Y., Hansen, B. C., Haun, A. M., Essock, E. A. (2005). Coloring Night-vision Imagery with Statistical Properties of Natural Colors by Using Image Segmentation and Histogram Matching, Proceedings of the SPIE 5667, 107-117.
