Representation of Radar Micro-Dopplers Using Customized Dictionaries

Human motions give rise to frequency modulations, known as micro-Dopplers, in continuous wave radar signals. Micro-Doppler signals have been extensively researched for classifying different types of human motions as well as for distinguishing humans from other moving targets. However, there are two main scenarios where the performance of existing algorithms deteriorates significantly: first, when the channel contains multiple moving targets, resulting in distorted signatures; and second, when the system conditions during the training stage deviate significantly from the conditions during the test stage. In this chapter, it is demonstrated that both of these limitations can be overcome by representing the radar data through customized dictionaries, fine-tuned to provide sparser representations of the data than traditional data-independent dictionaries such as Fourier or wavelets. The performance of the algorithms is evaluated with both simulated and measured radar data gathered from moving humans in indoor line-of-sight conditions. In the classification experiments, the carrier frequency, the fundamental parameter that controls the resolution, extent, and quality of micro-Doppler data, was varied between the training and test data. The algorithm learned dictionaries from diverse training data and was capable of correctly classifying completely different test data. While the dictionary learning techniques were examined in the context of micro-Doppler data, the encouraging results suggest that these techniques may be successfully extended to other radar scenarios, especially the range-Doppler images of ISAR.


Introduction
Radar detection of humans has emerged as a topic of considerable research interest in the last two decades for varied applications such as security and law enforcement, through-wall surveillance, search and rescue operations, biomedical applications, and automobile radars. Two types of radars have been studied for these applications. The first type is the broadband impulse radar with high range resolution. These radars map the surroundings, and moving targets (such as humans) are detected with the assistance of moving target detection algorithms [1,2]. The second type is the phase coherent continuous wave radar. These could be either broadband frequency-modulated continuous wave radars [3], such as those used in automobile radar applications, or narrowband (single tone) radars, which have been used for through-wall surveillance applications [4]. The latter are cheap to build with off-the-shelf components. In this chapter, human detection and classification by a monostatic continuous wave Doppler radar that transmits a sinusoidal signal of carrier frequency $f_c$ is discussed. The human body can be considered as an extended target with multiple point scatterers on different body parts. When the radar signal impinges on a non-rigid moving human body, the micro-motions of the arms, legs, and torso introduce micro-Doppler shifts ($f_D$) on the scattered radar signal at the radar receiver. Each of these shifts is proportional to the carrier frequency and the radial relative velocity ($v$) between the body part and the radar: $f_D = 2 v f_c / c$, where $c$ is the speed of light. Each of these body parts follows a unique trajectory, giving rise to multiple micro-Doppler components that superpose. The scattered radar signal at the receiver is amplified and demodulated.
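The proportionality between the Doppler shift, the carrier frequency, and the radial velocity can be illustrated numerically with the standard monostatic relation $f_D = 2 v f_c / c$. The following is a minimal sketch; the function name and the radial velocities assigned to each body part are hypothetical values chosen only for illustration.

```python
C = 3.0e8  # speed of light in m/s


def doppler_shift(radial_velocity, carrier_hz):
    """Monostatic Doppler shift f_D = 2 v f_c / c, in Hz."""
    return 2.0 * radial_velocity * carrier_hz / C


# Hypothetical radial velocities (m/s) of three scatterers on a walking human.
velocities = {"torso": 1.5, "arm": 2.5, "foot": 4.0}

# Doppler shift of each scatterer for a 7.5 GHz carrier.
shifts_hz = {part: doppler_shift(v, 7.5e9) for part, v in velocities.items()}
print(shifts_hz)
```

Because the shift scales linearly with $f_c$, the same motion observed at 4.5 GHz produces Dopplers 1.8 times larger than at 2.5 GHz.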
There are several methods for representing the time-domain micro-Doppler returns, $x(t)$, of which the most popular is to use joint time-frequency transforms such as the short-time Fourier transform (STFT), as shown below:

$$\mathrm{STFT}(t, f) = \int x(\tau)\, h(\tau - t)\, e^{-j 2 \pi f \tau}\, d\tau \qquad (1)$$

In the above equation, the short time window is given by $h(t)$. The STFT showcases the time-varying nature of the individual Doppler tracks from multiple body parts [5,6]. For example, Figure 1 shows the micro-Doppler signature of a human walking toward a monostatic continuous wave radar at 7.5 GHz. The micro-Dopplers are mostly positive since the human is moving towards the radar. Dopplers from the right and left arms and legs alternate with each other due to the swinging motion of the limbs.
The highest Dopplers arise from the feet followed by the arms and then the torso. The torso, though, gives rise to the strongest signal due to the high radar cross-section.
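A spectrogram of this kind can be sketched with a few lines of NumPy. The example below is only an illustration: the Hann window, the window length, the hop size, and the toy signal (a strong constant-Doppler "torso" line plus a weaker sinusoidally modulated "limb" component) are all assumptions, not the parameters used to produce Figure 1.

```python
import numpy as np


def stft(x, fs, win_len, hop):
    """Short-time Fourier transform with a Hann window.

    Returns (times, freqs, S), where S has shape (win_len, num_frames)
    and the frequency axis is centred on zero Doppler.
    """
    window = np.hanning(win_len)
    starts = np.arange(0, len(x) - win_len + 1, hop)
    # Each column is one windowed segment of the signal.
    frames = np.stack([x[s:s + win_len] * window for s in starts], axis=1)
    S = np.fft.fftshift(np.fft.fft(frames, axis=0), axes=0)
    freqs = np.fft.fftshift(np.fft.fftfreq(win_len, d=1.0 / fs))
    times = (starts + win_len / 2) / fs
    return times, freqs, S


# Toy complex return sampled at 1 kHz for 1 s (hypothetical values).
fs = 1000.0
t = np.arange(0, 1.0, 1.0 / fs)
torso = np.exp(2j * np.pi * 60.0 * t)
limb = 0.3 * np.exp(1j * (2 * np.pi * 60.0 * t + 40.0 * np.sin(2 * np.pi * 2.0 * t)))
x = torso + limb

times, freqs, S = stft(x, fs, win_len=128, hop=16)
spectrogram_db = 20 * np.log10(np.abs(S) + 1e-12)

# Frequency bin with the largest average magnitude (the torso line).
strongest = freqs[np.argmax(np.abs(S).mean(axis=1))]
print(strongest)
```

A longer window sharpens the frequency axis but blurs the time axis, and vice versa.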
Different periodic motions, such as running, crawling, and boxing, each give rise to unique micro-Doppler spectrograms. Hence, these STFT-based signatures have been extensively studied for classifying different human activities as well as for differentiating humans from other movers [7-23]. The main limitation of the STFT, however, is that the resolution along both the time and frequency axes is controlled by the choice of the dwell time, or short time window, used in the transform. Depending on the type of target, the dwell time parameters are often chosen heuristically to realize the most informative signatures. Other representations rely on signal processing techniques such as independent component analysis [24] and Hilbert-Huang transforms [25,26]. In all of these cases, the dictionaries or bases used to represent the radar data are independent of the data. While data-independent dictionaries are usually computationally simple to derive, they are not fine-tuned to the underlying features of the data and depend on heuristic parameter selection. Hence, while they have had success in simple scenarios (such as target classification when the channel contains only a single target or when the test and training conditions are identical), they are not suited for more complex, realistic scenarios.
Over the last decade, dictionary learning has emerged under the aegis of machine learning. Here, customized dictionaries are derived directly from the data and hence are capable of abstracting the fundamental characteristics of the data. Dictionary learning has been used successfully in a variety of domains such as image processing and face recognition [27,28] and energy disaggregation [29]. Because the learned bases abstract the underlying data, the hypothesis is that they will be useful for addressing detection and classification challenges under more complex scenarios. This chapter specifically addresses two such scenarios, discussed in [30,31].
In the first scenario [30], the case where the propagation channel contains multiple simultaneously moving targets is considered. Whether the radar is deployed in indoor or outdoor environments, the presence of multiple simultaneous movers is highly likely. For instance, in indoor environments, multiple human movers are encountered along with dynamic clutter from moving fans, loudspeakers, etc. Similarly, in outdoor environments, moving vehicles and animals are encountered along with pedestrians. When multiple targets move simultaneously in the propagation channel, their scattered radar returns superpose, giving rise to distorted spectrograms. Therefore, the radar returns from these targets must be disaggregated before they can be fed into classifiers. In other words, the research problem focuses on single channel source separation, or multiple target detection, rather than single target classification. Specifically, unique dictionaries are learned directly from the raw time-domain radar data of each single-target category, so no parameter selection operation is carried out. These unique dictionaries are then used to detect the presence of multiple targets in test data.
Next in [31], the scenario when the training and test conditions of radar data measurement deviate considerably is considered. Usually training data are gathered in tightly controlled laboratory conditions. But the test data are gathered in real world scenarios where the radar system deployment may encounter some challenges. For instance, presence of wireless interference sources or dispersion in the propagation channel (say through-wall scenario) may render some change in the carrier frequency inevitable. In such instances, a degree of reconfigurability in the radar hardware parameters and flexibility in the radar software is desirable. While the hardware reconfigurability can be realized by implementing the radar on software defined radio platforms, the flexibility in the software can only be realized if the processing algorithms can handle diversity in the training and test data. When data-driven dictionaries are derived from diverse radar data, collected across multiple carrier frequencies, the algorithms derive fundamental characteristics of the data that are common to specific motions across different carrier frequencies. In this manner, they are capable of recognizing that specific motion category even when the radar data are gathered at a new carrier frequency that is distinct from those used previously while training.
The chapter is organized as follows. Section 2 details the analytical framework for learning dictionaries from raw time domain data along with description of how these dictionaries can be utilized for both single channel source separation (or disaggregation) and for classification (when training and test data are gathered in different conditions). Sections 3 and 4 provide the experimental validation and results for both of these scenarios. Section 5 concludes with a discussion on the advantages and limitations of the dictionary learning algorithm in the context of radar.

Theory
Throughout the chapter, vectors are indicated with lowercase bold letters, matrices with uppercase bold letters, and constants or variables with lowercase letters. In this section, the concept of representing time-domain micro-Doppler data with uniquely customized dictionaries is introduced. These dictionaries are learned from training data. This chapter discusses the synthesis dictionary learning framework, which is based on the idea that radar data, $\mathbf{x}$, can be synthesized from unique data-dependent basis functions, $\mathbf{D}$, and their corresponding coefficients, $\mathbf{z}$. In other words, $\mathbf{x} = \mathbf{D}\mathbf{z}$. An alternate framework that has also been investigated is the analysis dictionary learning framework. Here, the data are analyzed by dictionaries to create sparse coefficient representations, that is, $\mathbf{D}\mathbf{x} = \mathbf{z}$. The analysis framework is quicker to execute than the synthesis framework in the test stage, since it involves a multiplicative operation rather than a matrix inversion. Interested readers may refer to [32] for further details on the analysis dictionary learning framework and its experimental validation.

Dictionary learning
Consider the case of $I$ targets. Each target is assumed to be moving in isolation in the propagation environment. The radar data from target $i$ are represented as a vector $\mathbf{x}_i$ consisting of $N$ samples. The training data corresponding to target $i$ consist of $M$ such measurements, arranged as the columns of a matrix $\mathbf{X}_i$ of size $N \times M$. The objective is to represent $\mathbf{X}_i$ as shown in Eq. (2):

$$\mathbf{X}_i = \mathbf{D}_i \mathbf{Z}_i \qquad (2)$$

where $\mathbf{D}_i$ (of size $N \times P$) and $\mathbf{Z}_i$ (of size $P \times M$) are the dictionary and coefficient matrices corresponding to target $i$. The distinction of this dictionary learning process from data-independent transforms such as Fourier or DCT is that $\mathbf{Z}_i$ must be a row-wise sparse coefficient matrix with a sparsity value of $\tau$. Therefore, the objective of the dictionary learning algorithm is given as Eq. (3):

$$\min_{\mathbf{D}_i, \mathbf{Z}_i} \|\mathbf{X}_i - \mathbf{D}_i \mathbf{Z}_i\|_F^2 \quad \text{such that} \quad \|\mathbf{Z}_i\|_0 \le \tau \qquad (3)$$

It is well known that the $l_0$-minimization problem stated above is NP-hard [33]. There are, therefore, two approaches for solving Eq. (3). One is to implement the $l_0$-minimization using computationally expensive greedy techniques. Alternately, the $l_0$-minimization can be relaxed to an $l_1$-minimization, as shown in Eq. (4):

$$\min_{\mathbf{D}_i, \mathbf{Z}_i} \|\mathbf{X}_i - \mathbf{D}_i \mathbf{Z}_i\|_F^2 + \lambda \|\mathbf{Z}_i\|_1 \qquad (4)$$

Here, $\lambda$ is the regularization parameter that trades off between the representational accuracy and the sparsity in the $l_1$-minimization operation. An iterative alternating minimization approach is used to solve Eq. (4). First, $\mathbf{D}_i$ is initialized using random columns selected from the training data $\mathbf{X}_i$. Then $\mathbf{Z}_i$ is determined using Eq. (5) by implementing the iterative soft thresholding algorithm (ISTA) discussed in [34]:

$$\mathbf{Z}_i \leftarrow \arg\min_{\mathbf{Z}_i} \|\mathbf{X}_i - \mathbf{D}_i \mathbf{Z}_i\|_F^2 + \lambda \|\mathbf{Z}_i\|_1 \qquad (5)$$

Next, $\mathbf{D}_i$ is estimated using the simple least squares minimization shown in Eq. (6):

$$\mathbf{D}_i \leftarrow \arg\min_{\mathbf{D}_i} \|\mathbf{X}_i - \mathbf{D}_i \mathbf{Z}_i\|_F^2 \qquad (6)$$

Each of the $P$ columns of the dictionary is normalized to have a norm of at most unity in order to prevent scale ambiguities. Equations (5) and (6) are iterated until the representation error falls below a pre-defined threshold or converges. The process is repeated to learn a unique dictionary for every target.
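The alternating procedure described above (ISTA for the coefficients, least squares for the dictionary, followed by column normalization) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in [30,31]: the function names, the fixed iteration counts in place of a convergence threshold, the value of $\lambda$, and the synthetic real-valued training data are all hypothetical.

```python
import numpy as np


def soft_threshold(A, thresh):
    """Element-wise soft thresholding (valid for real or complex entries)."""
    mag = np.abs(A)
    return A * np.maximum(1.0 - thresh / np.maximum(mag, 1e-12), 0.0)


def ista(X, D, lam, n_iter=100):
    """Sparse coding: minimise ||X - D Z||_F^2 + lam * ||Z||_1 over Z."""
    # Step size 1/L, where L = 2 * ||D||_2^2 is the gradient Lipschitz constant.
    step = 1.0 / (2.0 * max(np.linalg.norm(D, 2) ** 2, 1e-12))
    Z = np.zeros((D.shape[1], X.shape[1]), dtype=X.dtype)
    for _ in range(n_iter):
        grad = 2.0 * D.conj().T @ (D @ Z - X)
        Z = soft_threshold(Z - step * grad, step * lam)
    return Z


def learn_dictionary(X, n_atoms, lam, n_outer=20, seed=0):
    """Alternate ISTA (coefficients) and least squares (dictionary)."""
    rng = np.random.default_rng(seed)
    # Initialise the dictionary with randomly selected training columns.
    cols = rng.choice(X.shape[1], size=n_atoms, replace=False)
    D = X[:, cols] / np.maximum(np.linalg.norm(X[:, cols], axis=0), 1e-12)
    for _ in range(n_outer):
        Z = ista(X, D, lam)
        D = X @ np.linalg.pinv(Z)            # least-squares dictionary update
        norms = np.linalg.norm(D, axis=0)
        D = D / np.maximum(norms, 1.0)       # keep every column norm <= 1
    return D, ista(X, D, lam)


# Synthetic training matrix (N samples x M measurements) with a sparse model.
rng = np.random.default_rng(1)
N, M, P = 32, 200, 16
D_true = rng.standard_normal((N, P))
D_true /= np.linalg.norm(D_true, axis=0)
Z_true = rng.standard_normal((P, M)) * (rng.random((P, M)) < 0.3)
X = D_true @ Z_true

D, Z = learn_dictionary(X, n_atoms=P, lam=0.05)
rel_err = np.linalg.norm(X - D @ Z) / np.linalg.norm(X)
print(D.shape, Z.shape, rel_err)
```

In practice, the outer loop would stop once the representation error falls below a chosen threshold, and the routine would be run once per target category on its own training matrix.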
Then, all the dictionaries are concatenated to form an aggregate over-complete dictionary, $\mathbf{D} = [\mathbf{D}_1 \; \mathbf{D}_2 \; \cdots \; \mathbf{D}_I]$. The aggregate dictionary, $\mathbf{D}$, of size $N \times IP$, is used for both single channel source separation and for classification of test micro-Dopplers. Both of these are discussed in the succeeding sections.
The dictionaries of multiple targets are learned individually in Eq. (4). An alternate mechanism would be to learn the multiple dictionaries together as suggested by [35]. Here, besides the sparsity penalty in Eq. (3), an additional penalty is introduced to increase the discrimination across multiple target categories. This step increases the computational complexity during the training stages. But since there was no discernible improvement in the performance of the disaggregation and classification algorithms, this scheme is not discussed in this chapter.

Single channel source separation of multiple radar micro-Dopplers
Now each single test measurement, a vector $\tilde{\mathbf{x}}$ of size $N \times 1$, is considered. In the test scenario, there may be single or multiple targets moving simultaneously in the propagation channel. Therefore, the test signal may be the aggregate of time-domain micro-Dopplers from multiple targets. Using the aggregate dictionary, the test coefficient vector, $\tilde{\mathbf{z}}_{1:I}$, of size $IP \times 1$, is estimated as shown in Eq. (7):

$$\tilde{\mathbf{z}}_{1:I} = \arg\min_{\tilde{\mathbf{z}}_{1:I}} \|\tilde{\mathbf{x}} - \mathbf{D}\, \tilde{\mathbf{z}}_{1:I}\|_2^2 + \lambda \|\tilde{\mathbf{z}}_{1:I}\|_1 \qquad (7)$$

Note that the test coefficient vector, $\tilde{\mathbf{z}}_{1:I}$, is distinct from the columns of the coefficient matrix, $\mathbf{Z}_i$, derived earlier from Eq. (5). In $\tilde{\mathbf{z}}_{1:I}$, the coefficients ($\tilde{\mathbf{z}}_i$) corresponding to each of the $I$ categories of targets are obtained at once, as opposed to Eq. (5), where only the coefficients corresponding to the training target category are obtained. If one or more targets are present, then the hypothesis is that their corresponding coefficients ($\tilde{\mathbf{z}}_i$, of size $P \times 1$), extracted from the composite test coefficient vector ($\tilde{\mathbf{z}}_{1:I}$), will be non-sparse, while the coefficients belonging to the absent targets will be close to zero. The intuition here is that if the dictionaries are sufficiently discriminative, then they will be able to extract the corresponding target coefficients even when the radar data consist of a superposition of returns from multiple targets. Therefore, a target $i$ is determined to be present if the corresponding reconstruction, $\mathbf{D}_i \tilde{\mathbf{z}}_i$, is above a predefined threshold.
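The detection rule can be sketched as below. Several choices here are assumptions made only for illustration: the dictionaries are hypothetical orthonormal blocks standing in for learned dictionaries, the detection statistic is taken to be the norm of $\mathbf{D}_i \tilde{\mathbf{z}}_i$ relative to the norm of the test signal, and the values of $\lambda$ and the threshold are arbitrary.

```python
import numpy as np


def ista_vec(x, D, lam, n_iter=200):
    """Minimise ||x - D z||_2^2 + lam * ||z||_1 over z with ISTA."""
    step = 1.0 / (2.0 * max(np.linalg.norm(D, 2) ** 2, 1e-12))
    z = np.zeros(D.shape[1], dtype=x.dtype)
    for _ in range(n_iter):
        z = z - step * 2.0 * (D.conj().T @ (D @ z - x))   # gradient step
        mag = np.abs(z)
        z = z * np.maximum(1.0 - step * lam / np.maximum(mag, 1e-12), 0.0)
    return z


def detect_targets(x, dictionaries, lam, threshold):
    """Flag target i as present when ||D_i z_i|| / ||x|| exceeds threshold."""
    D = np.hstack(dictionaries)                  # aggregate dictionary, N x IP
    z = ista_vec(x, D, lam)
    P = dictionaries[0].shape[1]
    scores = []
    for i, D_i in enumerate(dictionaries):
        z_i = z[i * P:(i + 1) * P]               # coefficients of category i
        scores.append(np.linalg.norm(D_i @ z_i) / np.linalg.norm(x))
    scores = np.array(scores)
    return scores > threshold, scores


# Hypothetical stand-ins for learned dictionaries: three orthonormal blocks.
rng = np.random.default_rng(0)
N, P = 48, 8
Q, _ = np.linalg.qr(rng.standard_normal((N, 3 * P)))
dictionaries = [Q[:, i * P:(i + 1) * P] for i in range(3)]

# Test signal: superposition of returns from categories 0 and 2 only.
x = dictionaries[0] @ np.full(P, 1.0) + dictionaries[2] @ np.full(P, 0.5)
present, scores = detect_targets(x, dictionaries, lam=0.1, threshold=0.1)
print(present, scores)
```

With real learned dictionaries the blocks are not orthogonal, so the coefficients of absent targets are small rather than exactly zero, which is why the threshold matters.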

Classification of radar micro-Dopplers
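In the previous section, single channel source separation of test data that comprise radar returns from multiple targets was discussed. Now, the focus is on classification of radar data from an unknown test target category. It is important to note that while the previous problem focused on multiple target detection, the problem in this section focuses on single target classification. In both cases, the aggregate dictionary matrix, $\mathbf{D}$, obtained from concatenation of the individual dictionaries learned from each of the target categories, is utilized. For classification purposes, the training features for each target $i$ are derived from the coefficient matrix, $\ddot{\mathbf{Z}}^{i}_{1:I}$, obtained using the training data, $\mathbf{X}_i$, and the aggregate dictionary using Eq. (8):

$$\ddot{\mathbf{Z}}^{i}_{1:I} = \arg\min_{\ddot{\mathbf{Z}}^{i}_{1:I}} \|\mathbf{X}_i - \mathbf{D}\, \ddot{\mathbf{Z}}^{i}_{1:I}\|_F^2 + \lambda \|\ddot{\mathbf{Z}}^{i}_{1:I}\|_1 \qquad (8)$$

Though the training data matrix is identical, the training feature, $\ddot{\mathbf{Z}}^{i}_{1:I}$, is distinct from the coefficient matrix, $\mathbf{Z}_i$, of Eq. (5), because it is computed with the aggregate dictionary rather than with $\mathbf{D}_i$ alone. During the test stage, test features, $\ddot{\mathbf{z}}_{1:I}$, are extracted from each test measurement, $\ddot{\mathbf{x}}_i$ (from a single target category), using Eq. (9):

$$\ddot{\mathbf{z}}_{1:I} = \arg\min_{\ddot{\mathbf{z}}_{1:I}} \|\ddot{\mathbf{x}}_i - \mathbf{D}\, \ddot{\mathbf{z}}_{1:I}\|_2^2 + \lambda \|\ddot{\mathbf{z}}_{1:I}\|_1 \qquad (9)$$

The test data are subsequently classified based on the similarity between $\ddot{\mathbf{z}}_{1:I}$ and the columns of $\ddot{\mathbf{Z}}^{i}_{1:I}$ across all the $I$ target categories. In this chapter, the support vector machine classifier is used. Other popular classifiers, such as KNN, were also tested but did not produce any significant difference in the results.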
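The feature extraction and classification steps can be sketched as follows. To keep the example dependency-free, a Euclidean nearest-neighbour rule is substituted for the support vector machine used in this chapter. The orthonormal dictionary blocks, the synthetic training data, and the value of $\lambda$ are hypothetical stand-ins for learned dictionaries and radar measurements.

```python
import numpy as np


def ista(X, D, lam, n_iter=200):
    """Each column of the result minimises ||x - D z||_2^2 + lam * ||z||_1."""
    step = 1.0 / (2.0 * max(np.linalg.norm(D, 2) ** 2, 1e-12))
    Z = np.zeros((D.shape[1], X.shape[1]), dtype=X.dtype)
    for _ in range(n_iter):
        Z = Z - step * 2.0 * (D.conj().T @ (D @ Z - X))
        Z = Z * np.maximum(1.0 - step * lam / np.maximum(np.abs(Z), 1e-12), 0.0)
    return Z


def classify_1nn(train_feats, train_labels, test_feat):
    """Label of the training feature closest (Euclidean) to the test feature."""
    dists = np.linalg.norm(train_feats - test_feat[:, None], axis=0)
    return int(train_labels[np.argmin(dists)])


# Hypothetical aggregate dictionary made of I orthonormal blocks of P atoms.
rng = np.random.default_rng(0)
N, P, I, M = 48, 8, 3, 20
D, _ = np.linalg.qr(rng.standard_normal((N, I * P)))

# Training features: sparse codes of each category against the full dictionary.
train_feats, train_labels = [], []
for i in range(I):
    D_i = D[:, i * P:(i + 1) * P]
    X_i = D_i @ rng.uniform(0.8, 1.2, size=(P, M))   # synthetic training data
    train_feats.append(ista(X_i, D, lam=0.1))
    train_labels += [i] * M
train_feats = np.hstack(train_feats)                 # IP x (I * M)
train_labels = np.array(train_labels)

# Test measurement drawn from category 1, coded against the same dictionary.
x_test = D[:, P:2 * P] @ rng.uniform(0.8, 1.2, size=P)
z_test = ista(x_test[:, None], D, lam=0.1)[:, 0]
predicted = classify_1nn(train_feats, train_labels, z_test)
print(predicted)
```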

Experimental data collection
The proposed dictionary learning methods are validated on experimental radar data gathered from moving humans. Both simulated radar data and measured data are considered. Both types of data offer some advantages and some limitations. The simulated radar data offer an opportunity to test the performance of the algorithms quickly and under a variety of radar operating conditions. In other words, the simulations offer a flexible mechanism to change radar parameters. The data are also highly sanitized due to the absence of noise and limitations of radar hardware. The measurement data collection is limited by radar system parameters such as the dynamic range of the radar receiver, the carrier frequency, sampling frequency, and antenna characteristics. On the other hand, current state-of-the-art simulation methodologies for human radar data do not capture the entire physics of the human scattering phenomena. The measurement data, therefore, are crucial for validating the proposed algorithms in real world scenarios. The second important limitation of the simulation data (unlike the measurement data) is that with the current techniques, the channel can consist of only a single target. Therefore, the simulation data are only used for the single target classification problem and not for the multiple target detection or disaggregation problem.
In the following two subsections, both the simulation and measurement methodologies are detailed.

Simulation data
Simulation of radar scattering from still humans has been investigated with full wave electromagnetic techniques as well as the computationally cheaper ray tracing technique at frequencies below X band [36]. The results from the simulations of a uniformly dielectric human body showed that the ray tracing results were comparable to the results from full wave methods. However, both of these methods are computationally infeasible for simulating radar returns from dynamic humans, since this requires modeling of multiple human poses. Alternately, the simple primitive-based modeling has proven to be reasonably accurate for modeling human motions [37,38]. Here, the different body parts of the human are modeled as simple primitives, such as the head as a sphere, the torso as a cylinder or ellipsoid, and so forth. The radar cross-sections of these simple shapes ($\sigma_b$) are well characterized for different carrier frequencies ($f_c$) and aspect angles. One or more point scatterers ($\vec{r}_b$) are identified for each body part. When the human moves, the time-varying positions of the point scatterers give rise to micro-Dopplers. Then, the radar returns of the human are obtained by the complex sum of the returns from each of the body parts, as shown in Eq. (10):

$$x(t) = \sum_{b} \frac{\sqrt{\sigma_b(t)}}{r_b^2(t)} \exp\!\left(-j \frac{4 \pi f_c}{c} r_b(t)\right) \qquad (10)$$

where $r_b(t) = |\vec{r}_b(t)|$ is the distance of scatterer $b$ from the radar and $c$ is the speed of light. This simplistic model is easy to execute in real time. However, it does not capture the entire physics of human scattering. For instance, it does not capture the multiple scattering of waves between the different body parts or their shadowing effects.
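The complex sum over point scatterers can be sketched numerically. The example below assumes a per-scatterer amplitude of $\sqrt{\sigma_b}/r_b^2$ and a two-way phase of $-4\pi f_c r_b / c$; the two scatterer trajectories (a torso approaching at constant speed and a foot swinging about it), the radar cross-section values, and the function name are hypothetical and far simpler than a full walking model.

```python
import numpy as np

C = 3.0e8  # speed of light in m/s


def radar_return(ranges, rcs, carrier_hz):
    """Complex sum of the returns from B point scatterers.

    ranges: array (B, T) of scatterer-to-radar distances in metres.
    rcs:    array (B,) of radar cross-sections in square metres.
    """
    amplitude = np.sqrt(rcs)[:, None] / ranges ** 2
    phase = -4.0 * np.pi * carrier_hz * ranges / C     # two-way path phase
    return np.sum(amplitude * np.exp(1j * phase), axis=0)


# 1 s of data sampled at 1 kHz with a 2.5 GHz carrier.
fs, fc = 1000.0, 2.5e9
t = np.arange(0, 1.0, 1.0 / fs)
torso = 8.0 - 1.5 * t                                  # approaches at 1.5 m/s
foot = torso + 0.3 * np.sin(2 * np.pi * 1.0 * t)       # swings about the torso
x = radar_return(np.vstack([torso, foot]), np.array([1.0, 0.1]), fc)

# The strongest spectral line should sit at the torso Doppler, 2 v f_c / c.
spectrum = np.abs(np.fft.fft(x))
peak_hz = np.fft.fftfreq(len(x), d=1.0 / fs)[np.argmax(spectrum)]
print(peak_hz)
```

With this sign convention, a scatterer approaching the radar produces a positive Doppler, consistent with the mostly positive micro-Dopplers of a human walking towards the radar.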
There are three methods that are currently used to describe human motions. The simplest method is to model the swinging motion of the two legs as a double pendulum. A more complete analytical model, known as the Boulic-Thalmann model, was proposed in [37]. This model provides analytical equations to describe different human body parts (arms, legs, hands, and feet) as a function of the human height and relative velocity with respect to height. However, the model is restricted only to a simple human walking motion. More complex and realistic motions, such as crawling, hopping, and running, can be obtained using computer animation data. The radar scattered returns from complex human motions can be therefore obtained by combining animation data with the primitive-based electromagnetic modeling [38]. However, the animation data are obtained through motion capture technologies of a live actor. Therefore, the model cannot be parameterized to obtain varied data for different humans (of different heights or gait patterns) through a single measurement.
In this study, the Boulic-Thalmann model was used to model human walking motions for multiple human heights (1.5-1.8 m) and velocities (1.5-3.6 m/s). Due to the limitations of the model, a variety of human motions could not be simulated. Instead, just two types of motions were considered: a human walking towards the radar and a human walking away from the radar. The human moves in the line-of-sight of a monostatic continuous wave radar at distances between 2 and 8 m. The duration of the human motion is 1 s and the sampling frequency of the simulation is 1 kHz. Frequency diversity was imparted to the simulation data by varying the carrier frequency across five values: {2.5, 3, 3.5, 4, and 4.5} GHz. Three hundred and sixty distinct walking motions are simulated both toward and away from the radar at each of the carriers. Of the 1800 total simulations, 80% (corresponding to four of the five carrier frequencies) were used for training the classification algorithm, and the remaining 20% (from the fifth carrier frequency) were used for testing. The objective here is to test the capability of the dictionary learning algorithm to learn fundamental features from the human micro-Doppler data pertaining to a specific motion despite the variations of the carrier frequency. The dictionaries are learned directly from the raw data without any additional post processing. Therefore, at no stage were any parameters heuristically chosen.
A third category of indoor mover, a fan, may give rise to significant dynamic clutter. Hence, a fan was modeled with three blades and multiple point scatterers on each blade. Considerable variety in the simulated fan data was generated by changing its speed of rotation (200-400 rpm), the length of the blades (0.2-0.4 m), the width of the blades (0.14-0.17 m), and the orientation of the fan with respect to the monostatic radar. The fan micro-Dopplers were simulated at five carrier frequencies (1800 distinct simulations), of which data from four carriers were used for training and the data from the fifth carrier for testing. The results and analyses for dictionary-based classification of simulated human radar data are presented in the following section.

Measurement data
Next, experimental data were collected in indoor line-of-sight conditions. A monostatic continuous wave radar was set up using an N9926A FieldFox vector network analyzer (VNA) and two linearly polarized broadband horn antennas (HF907). The VNA was configured to make narrowband measurements around a center carrier frequency (Figure 2).
The transmitted power of the VNA was set at +3 dBm. The instrument is highly sensitive, with a dynamic range of over 100 dB. The received signal is amplified, in-phase quadrature (IQ) demodulated, and digitized inside the instrument and then directly processed on a 2.4 GHz Intel Pentium processor. Two sets of experiments were carried out.
The first experiment was carried out to validate the multiple target detection algorithm based on dictionary learning. The center frequency of the VNA was set to 7.5 GHz. Then, human radar data were collected for two types of motions: a human walking towards the radar (FH) and a human walking away from the radar (BH). Each measurement trace is 2.7 s long with 1000 samples, which corresponds to a sampling frequency of about 370 Hz. The low sampling frequency is due to the system constraints of the VNA when it is configured in the narrowband mode. Measurement data were gathered for 40 humans of both genders and of varying heights and gait patterns. The humans move between 1 and 9 m away from the radar in line-of-sight conditions. The third motion class considered was a table fan (TF) with three blades. The table fan was operated at three different rotation rates and was placed at different distances and orientations from the radar. Then, measurement data were gathered with multiple targets moving simultaneously. The cases are: FH + BH, FH + TF, BH + TF, and FH + BH + TF. The objective is to learn dictionaries using training data from the single-target scenarios. These dictionaries are then combined and used to detect targets in multiple target scenarios.
The next experiment was based on the same setup. However, this time only single-target scenarios were considered, and the carrier frequency of the measurement data was varied across five values: {2.5, 3, 3.5, 4, and 4.5} GHz. A variety of human motions were considered: two humans walking simultaneously before the radar (TH), a human standing still but boxing with the arms (HB), and a human walking while holding a stick (HHS). The last case considered was the rotating table fan (TF). The challenge in this experiment is to learn dictionaries and training features from measurement data corresponding to four carrier frequencies, while testing the classifier with data from a fifth, distinct carrier frequency. This experiment was specifically chosen because the human micro-Doppler shows considerable variation with the carrier frequency, the Dopplers being directly proportional to the frequency. The table fan was selected for both types of experiments, since it is one of the key contributors of dynamic clutter in indoor environments.

Results and analysis
In this section, the results of using the customized micro-Doppler dictionaries of the different types of motions for both multiple target detection as well as single target classification are discussed.

Results from multiple target detection using disaggregation of radar data
First, the measured data gathered for multiple target detection are considered. As mentioned in Section 3.2, three target classes are considered: a human walking toward the radar (FH), a human walking away from the radar (BH), and a table fan (TF). The dictionaries are learned from single target data and are then used to detect the presence of multiple simultaneous movers in three scenarios: the single-target scenario, the two-target scenario, and the three-target scenario. The true detection and false alarm percentages for each of these cases are summarized in Table 1.
The results show that for a single-target scenario, the true detection is very high (above 93%) in all the three cases. The true detection of the fan is slightly poorer than that of the humans because of greater probability of aliasing arising with the fan micro-Dopplers due to the limited sampling frequency of the radar measurements. This also gives rise to the slightly higher false alarm rate of the fan when compared to the humans. Next, three two-target scenarios are considered. In each of the scenarios, the algorithm correctly detects the presence of two targets in more than 80% of the cases by disaggregating their micro-Dopplers. The false alarm rate though is high especially from the table fan due to the aliasing. In the three-target scenario, the algorithm correctly detects the presence of all three targets in more than 90% of the cases. This sort of multiple target detection cannot be carried out using basic data independent transforms and this result demonstrates the usefulness of representing micro-Dopplers with unique data-dependent dictionaries.
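The aliasing argument can be checked with a short calculation based on the measurement parameters reported earlier (1000 samples per 2.7 s trace at 7.5 GHz). The fan blade length and rotation rate below are hypothetical, borrowed from the range used for the simulated fan, since the dimensions of the measured table fan are not reported.

```python
import math

C = 3.0e8             # speed of light in m/s
fc = 7.5e9            # carrier frequency of the first experiment in Hz
fs = 1000 / 2.7       # sampling frequency implied by the trace length, in Hz

# With complex (IQ) samples, Dopplers within +/- fs/2 are unambiguous.
max_doppler = fs / 2.0
max_velocity = max_doppler * C / (2.0 * fc)

# Hypothetical fan: 0.2 m blades rotating at 400 rpm.
blade_length_m, rotation_rpm = 0.2, 400.0
tip_speed = 2.0 * math.pi * rotation_rpm / 60.0 * blade_length_m
tip_doppler = 2.0 * tip_speed * fc / C
aliased = tip_doppler > max_doppler

print(f"fs = {fs:.1f} Hz, unambiguous radial speed = {max_velocity:.2f} m/s")
print(f"blade tip Doppler = {tip_doppler:.1f} Hz, aliased = {aliased}")
```

Under these assumptions the unambiguous radial speed is about 3.7 m/s, which covers most walking motions but not the tips of the fan blades.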

Results of single target classification using frequency diverse radar micro-Dopplers
Now the objective is to correctly classify the target class when the propagation channel consists of only one type of target. The challenge here is to classify test data where the system conditions (carrier frequency) during test deviate significantly from the training conditions. Therefore, during the training stage, the dictionaries for each target class are learned from frequency-diverse data (from four carrier frequencies), and the test data consist of micro-Dopplers from a fifth carrier frequency that was not previously used for training.
The following five carrier frequencies are considered: {2.5, 3, 3.5, 4, and 4.5} GHz. In fold 1 through fold 5, the test frequencies are 2.5, 3, …, 4.5 GHz, respectively, and the training data for each fold are obtained from the corresponding complementary set of carrier frequencies. The performance of the algorithm is studied for both simulation and measurement data. In the simulation setup, three target classes are considered: a human walking towards the radar (FH), a human walking away from the radar (BH), and a fan (TF).
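The fold construction described above amounts to a leave-one-carrier-out split, which can be sketched as follows. The function name is hypothetical; the carrier values are those listed in the text.

```python
carriers_ghz = [2.5, 3.0, 3.5, 4.0, 4.5]


def leave_one_carrier_out(carriers):
    """Yield (fold_number, training_carriers, test_carrier) for each fold."""
    for k, test in enumerate(carriers, start=1):
        # The training set is the complement of the held-out test carrier.
        train = [c for c in carriers if c != test]
        yield k, train, test


folds = list(leave_one_carrier_out(carriers_ghz))
for k, train, test in folds:
    print(f"fold {k}: train on {train} GHz, test on {test} GHz")
```

Each fold therefore never sees the test carrier during dictionary learning or classifier training.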
The classification results obtained from the fivefold classification are presented in Table 2.
The results show very high classification accuracy (close to 100%) for all the cases. This shows that the algorithm is capable of learning unique dictionaries corresponding to each human motion from frequency diverse training data. Again, this type of classification would not be possible with other data-independent transforms which rely on heuristic parameter selection.
The simulation data are highly sanitized, since they are not affected by real world conditions such as noise, interference, and radar system limitations. Therefore, in the next study, the performance of the algorithm on real world data is studied. Four types of motions are considered: two humans walking simultaneously in the propagation channel (forming a single class of motion, TH), a single human standing still and boxing with the arms (HB), a human walking with a stick (HHS), and a table fan (TF). Five-fold classification is performed on the target data and the results are presented in the following two tables. Data from four carriers are used for training while data from the fifth carrier are used for testing. Two sources of confusion between the classes can be identified. First, when two humans walk simultaneously, there are times when the scattered signal from one human may be weaker than that from the other (due to shadowing or due to the different distances of the two humans from the radar). As a result, there are similarities at times between TH and HHS. Second, when the human is boxing, the micro-Dopplers occur at both positive and negative frequencies due to motions of the arms toward and away from the radar. This can be confused with HHS, especially with the back swing of the legs, stick, and arms. The micro-Dopplers of these two motions can resemble each other, especially across different carrier frequencies. Table 4 shows the classification results across all five folds. The results show a fairly good classification performance (above 85%) for all of the cases. The confusion of the table fan with the human motions can again be attributed to the limited sampling frequency of the measurement data.

Conclusion
In conclusion, the usefulness of representing human micro-Dopplers with unique data-dependent dictionaries was investigated. These dictionaries were subsequently applied to two applications. The first application is the detection of multiple simultaneously moving targets in the propagation channel. Results demonstrated that weak targets are detected in the presence of stronger targets and that strong targets are correctly identified even when their signatures are distorted by returns from the weaker targets. The second application is classification when there is significant variation between the training and test conditions. Specifically, the carrier frequency, the fundamental parameter that controls the resolution, extent, and quality of micro-Doppler data, was varied: in each fold, data from four of the five carriers {2.5, 3, 3.5, 4, and 4.5} GHz were used for training while data from the fifth carrier were used for testing. The algorithm learned dictionaries from diverse training data and was capable of correctly classifying completely different test data. While the dictionary learning techniques were examined in the context of micro-Doppler data, the encouraging results suggest that these techniques may be successfully extended to other radar scenarios, especially the range-Doppler images of ISAR.