Open access peer-reviewed chapter

Sense Smart, Not Hard: A Layered Cognitive Radar Architecture

By Stefan Brüggenwirth, Marcel Warnke, Christian Bräu, Simon Wagner, Tobias Müller, Pascal Marquardt and Fernando Rial

Submitted: June 12th 2017. Reviewed: September 29th 2017. Published: May 16th 2018.

DOI: 10.5772/intechopen.71365


In this chapter, we present a cognitive radar architecture based on the three-layer model by Rasmussen. The skill-based-layer is characterized by adaptive signal-processing approaches and target matched waveforms. The rule-based-layer comprises reactive execution of optimal illumination policies and resource-management. The knowledge-based layer allows for long term, goal-oriented mission- and trajectory planning. Each layer is illustrated by example algorithms and applications for implementation.


  • adaptive filters
  • cognitive systems
  • closed-loop controllers
  • robotics
  • signal processing
  • system architectures

1. Introduction

Modern multifunctional radars with electronic beam steering (active electronically scanned arrays, AESA) provide many degrees of freedom in pointing the antenna beam, using the electromagnetic spectrum and selecting the waveform (Figure 1). Complex surveillance and reconnaissance scenarios require increased automation and suitable man-machine interfaces, both of which are enabled by the cognitive radar approach [1, 2, 3].

Figure 1.

Airborne multifunctional radar.

In this article we explain a cognitive radar architecture developed at the Fraunhofer FHR based on the three-layer-model of Rasmussen [4]. In the following, we will first introduce the concept of cognitive automation and derive our cognitive radar architecture. For each cognitive subfunction several technologies for realization are discussed and illustrated by example applications.


2. Cognitive automation for radar

The concept of Dual-Mode Cognitive Automation [5] is well suited to deal with the challenges of highly automated radar systems. As shown in Figure 2, intelligent software-agents (depicted as robot-heads) can be introduced into the work equipment to increase the level of automation under the supervisory control paradigm [6].

Figure 2.

Concept of dual-mode cognitive automation [5].

Alternatively the software-agent can cooperate with the human operator in the sense of an intelligent assistant system [7]. Even though the cognitive radar architecture can be used for both approaches, we will focus on the more traditional supervisory control role in the following.


3. Three layer model of a cognitive radar architecture

The three-layer model of human cognitive performance published by Jens Rasmussen in 1983 is widely used in human factors [8], cognitive psychology and robotics [9, 10]. As shown in Figure 3 the complex process of human cognition is simplified and broken down into cognitive subfunctions (shown as gray boxes) with the indicated flow of information. The Rasmussen-model distinguishes three layers of cognitive performance with increasing level of abstraction.

Figure 3.

Three-layer-model of a cognitive radar architecture with supporting technologies. Modified from Ref. [4].

The skill-based-layer comprises subconscious and very efficient perception and control tasks (such as steering along a curvy road). Above it, the rule-based-layer describes reactive behavior. Learned procedures are triggered by certain cues in familiar situations (such as stopping the car at a red traffic light). The knowledge-based-layer enables deliberate, goal-based behavior. By inferring novel solutions from a-priori knowledge, a flexible reaction in unknown situations is achieved (e.g. bypassing a traffic jam based on a road-map).

For the development of a cognitive radar architecture in analogy to the Rasmussen-model, each cognitive subfunction had to be mapped into five different radar-technologies as shown in Figure 3.

Modern radar systems can generate arbitrary waveforms in real-time. This allows for transmit signals to be matched to the target transfer function or the electromagnetic spectrum as explained in Sections 4.1 and 4.2. Perception tasks of a radar comprise signal-processing and classification aspects. We use a machine learning approach that is illustrated in Section 4.3. Rule-based behavior in a radar is emulated by using optimal control policies or resource management approaches as shown in Section 5.1 or Section 5.2. Knowledge-based behavior can be implemented using Bayesian networks or automated planning algorithms. We show an example for robot-trajectory planning in Section 6.1.


4. Skill-based-layer

The skill-based layer represents the basic signal-generation and processing capabilities of the radar system. It operates on the smallest timescale in the architecture in a continuous processing loop. Below, we give an example for adapting the transmit waveform to the target-transfer function using arbitrary waveform generation capabilities. As an extension, the waveform can further be adapted to the electromagnetic spectrum that has to be continuously sensed.

4.1. Matched illumination

If a priori information about a target is available, it is possible to optimize the transmission waveform for this target. Advantages arise for example by discriminating two classes of targets or by reducing resources of the sensor. One example is the reduction of the required bandwidth, if the available a priori information about the target is comprehensive.

In order to resolve the size of the object, two transmission frequencies are sufficient to estimate the extension of two scattering points with a spacing of Δz [11] (see Figure 4). The maximal energy at the receiver can be achieved when two frequencies are superposed to a beat frequency where the envelope covers the dimension of the target. If there are more than two scattering points the frequency spacing must be higher to achieve a higher period of the beat. In practice, the assumption of a known target impulse response is often difficult to realize. In a cognitive radar system, the a priori knowledge of the target can be obtained from previous measurements and is assumed to be predicted for the next time step. An adapted waveform can then be used to update the target track with respect to its extension while allocating less bandwidth.

Figure 4.

Transmitting two frequencies with a spacing of ΔF = c/2Δz, the size of the object can be obtained. The shape of the object requires an even larger frequency separation ΔF = c/2δz (modified from Ref. [11]).

Assuming a linear, time-invariant channel with additive white Gaussian noise $w$, the complex received signal $y$ corresponds to a convolution of the transmission signal $s$ and the target transfer function $h_i$

$$y = s * h_i + w = H_i s + w = y_{s,i} + w \tag{1}$$

where $y_{s,i}$ represents the undisturbed signal component. The linear convolution can be expressed by a matrix-vector multiplication, where the Toeplitz-structured convolution matrix $H_i$ for the target index $i$ is created from elements of the impulse response $h_i$. The detection performance is directly related to the signal-to-noise ratio (SNR) and depends on the receiver bandwidth and the power of the received signal $y_s$

$$\mathrm{SNR} = \frac{\|y_{s,i}\|_2^2}{\sigma_w^2} = \frac{1}{\sigma_w^2}\, s^H H_i^H H_i\, s \tag{2}$$

If the target characteristic $H_i$ is known, the signal-to-noise ratio can be increased by optimizing the waveform $s$ [12]. The optimisation problem can be formulated as

$$\max_{s}\; \mathrm{SNR} = s^H A s \quad \text{subject to} \quad E_s = s^H s = \|s\|_2^2 = 1 \tag{3}$$

with the constraint of an energy-limited transmission signal and the Hermitian correlation matrix $A = \frac{1}{\sigma_w^2} H_i^H H_i$. One possibility to solve this optimisation problem is the Lagrangian multiplier method, which leads to

$$A s = \lambda s \tag{4}$$

Eq. (4) is an eigenvalue equation where the Lagrange multiplier $\lambda$ represents the real eigenvalue and the waveform $s$ is the corresponding eigenvector. By choosing the maximal eigenvalue, the signal-to-noise ratio is maximized. The eigenvector which corresponds to the maximal eigenvalue is directed towards the highest energy (variance).
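The eigenvalue formulation can be tried out numerically. The following is a minimal sketch, assuming NumPy is available; the function names, the impulse-response length, the scatterer sample positions and the waveform length are hypothetical choices for illustration and are not taken from the chapter.

```python
import numpy as np

def convolution_matrix(h, n_s):
    """Toeplitz matrix H such that H @ s equals np.convolve(h, s)."""
    h = np.asarray(h, dtype=complex)
    H = np.zeros((len(h) + n_s - 1, n_s), dtype=complex)
    for col in range(n_s):
        H[col:col + len(h), col] = h   # shifted copy of h in every column
    return H

def matched_waveform(h, n_s, noise_var=1.0):
    """Unit-energy waveform maximising s^H A s with A = H^H H / noise_var."""
    H = convolution_matrix(h, n_s)
    A = H.conj().T @ H / noise_var           # Hermitian correlation matrix
    eigvals, eigvecs = np.linalg.eigh(A)     # eigenvalues in ascending order
    return eigvecs[:, -1], eigvals[-1], H    # largest eigenpair

# Hypothetical impulse response with two point scatterers (assumed positions).
h = np.zeros(16, dtype=complex)
h[3] = 1.0
h[11] = np.exp(1j * np.deg2rad(20.0))

s_opt, snr_opt, H = matched_waveform(h, n_s=32)
print("SNR of the eigen-waveform:", snr_opt)
```

`np.linalg.eigh` is used because $A$ is Hermitian, so the eigenvalues are real and the returned eigenvectors already have unit energy, which satisfies the constraint in the optimisation problem.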

To gain a better understanding of the solution, the basic example of Figure 4 is presented for the two dominant scattering points

$$h(t) = a_0\, \delta(t - t_0) + a_1\, \delta(t - t_1) \tag{5}$$

The corresponding frequency spectrum to Eq. (5) is

$$H(f) = a_0\, e^{-j 2\pi f t_0} + a_1\, e^{-j 2\pi f t_1} \tag{6}$$

The target impulse response fluctuates due to the interference of the scattering centers with a period of $t_1 - t_0 = \frac{2}{c_0}(r_1 - r_0)$ and therefore depends on the target dimension, as already visualized in Figure 4. The phase shift of the second point target causes a shift of all frequency maxima (see also Figure 5(a)). An eigenvalue decomposition according to Eq. (4) yields the optimal waveform for this example (see Figure 5(b)).

Figure 5.

Target impulse response and optimal transmission waveform in time and frequency domain. (a) Target impulse response in time/range (upper) and frequency domain (lower) for two point targets at r0 = 37.32 (a0 = 1) and r1 = 44.82 (a1 = 1 ∠ 20°). (b) Optimal transmission waveform (eigenvector corresponding to the maximal eigenvalue) in time (upper) and frequency domain (lower).

Comparing these basic results with the solution of the eigenvalue decomposition, it is apparent that both frequency spectra are related to each other. If all frequency components of the target impulse response have comparable magnitudes, the frequency characteristic of the eigenvector belonging to the largest eigenvalue is similar to the target frequency spectrum and thus reflects the target extension. According to Eq. (6), the period of the target extension corresponds to a frequency of $\Delta F = \frac{c_0}{2(r_1 - r_0)} \approx 20.0$ MHz. The envelope of the transmission wave is related to the phase difference between $r_0$ and $r_1$ and causes a frequency shift in the frequency domain of $f_e = \frac{\phi_1 - \phi_0}{2\pi}\,\Delta F \approx 1.125$ MHz. In summary, the optimal transmission signal (the eigenvector) corresponds to the main direction of the target variance in the frequency domain and is linked to the physical behavior of the target. If there are more than two scattering points, additional modulation products will occur. In the case where all frequency components of the target spectrum have similar magnitudes, the eigenvector corresponding to the largest eigenvalue will represent all constructive interferences in the resolution bandwidth. However, even for small deviations of the spectral magnitudes, the main component (optimal eigenvector) will contain only the dominant frequency, while the minor amplitudes are represented by the remaining eigenvectors, which together span the complete signal space.
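As a quick plausibility check of the 20 MHz figure, the numbers of Figure 5 can be inserted directly. The sketch below assumes that the ranges $r_0$ and $r_1$ are given in metres (the caption states no unit) and uses $c_0 \approx 3 \cdot 10^{8}$ m/s.

```latex
\begin{align*}
r_1 - r_0 &= 44.82\,\mathrm{m} - 37.32\,\mathrm{m} = 7.5\,\mathrm{m},\\
t_1 - t_0 &= \frac{2}{c_0}\,(r_1 - r_0)
           = \frac{2 \cdot 7.5\,\mathrm{m}}{3 \cdot 10^{8}\,\mathrm{m/s}}
           = 50\,\mathrm{ns},\\
\Delta F  &= \frac{c_0}{2\,(r_1 - r_0)} = \frac{1}{t_1 - t_0} = 20\,\mathrm{MHz}.
\end{align*}
```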

An adapted waveform can also be used to improve the discrimination between two target classes [2]. A binary hypothesis test is one method to discriminate between target classes by evaluating the received signal

$$\mathcal{H}_0:\; y = H_0 s + w, \qquad \mathcal{H}_1:\; y = H_1 s + w \tag{7}$$

The distance $d = \|y_{s,0} - y_{s,1}\|^2 = \|(H_0 - H_1)s\|^2$ denotes the difference of the received amplitudes without taking noise into account. The robustness against incorrect classification increases for larger distances, especially in a noisy environment. Similar to Eqs. (2)–(4), the optimal waveform can be calculated by solving

$$\max_{s}\; d = s^H (H_0 - H_1)^H (H_0 - H_1)\, s \quad \text{subject to} \quad \|s\|_2^2 = 1 \tag{8}$$

The energy is focused in the spectral region where the deviations between the two targets are predominant.
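The discrimination waveform can be sketched in the same way as the detection waveform, with the difference matrix $H_0 - H_1$ taking the place of $H_i$. This is an illustrative sketch assuming NumPy; the two impulse responses and the reference chirp are hypothetical and both responses are assumed to have the same length.

```python
import numpy as np

def conv_matrix(h, n_s):
    """Toeplitz matrix H such that H @ s equals np.convolve(h, s)."""
    h = np.asarray(h, dtype=complex)
    H = np.zeros((len(h) + n_s - 1, n_s), dtype=complex)
    for col in range(n_s):
        H[col:col + len(h), col] = h
    return H

def discrimination_waveform(h0, h1, n_s):
    """Unit-energy waveform maximising d = ||(H0 - H1) s||^2."""
    D = conv_matrix(h0, n_s) - conv_matrix(h1, n_s)
    eigvals, eigvecs = np.linalg.eigh(D.conj().T @ D)   # ascending order
    return eigvecs[:, -1], eigvals[-1]

# Two hypothetical target classes that differ in the second scatterer position.
h0 = np.zeros(16, dtype=complex)
h0[3], h0[11] = 1.0, 1.0
h1 = np.zeros(16, dtype=complex)
h1[3], h1[9] = 1.0, 1.0

s_disc, d_opt = discrimination_waveform(h0, h1, n_s=32)

# Unit-energy linear chirp as a non-adapted reference waveform.
n = np.arange(32)
chirp = np.exp(1j * np.pi * 0.5 * n ** 2 / 32) / np.sqrt(32)
D = conv_matrix(h0, 32) - conv_matrix(h1, 32)
print("d (optimised):", d_opt, " d (chirp):", np.linalg.norm(D @ chirp) ** 2)
```

Since the common scatterer cancels in $H_0 - H_1$, the waveform only spends energy on the part of the response in which the two classes differ.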

To compare the performance of a binary hypothesis test for a linear chirp and for the optimized waveform, the test statistic of the likelihood ratio for Eq. (7) is calculated [13]


with the variance $\sigma_w^2$ of the complex noise. The deflection for the likelihood ratio test defines the effective difference of the likelihood centers and represents the output signal-to-noise ratio


The same receiver operating characteristic (ROC) curve can be obtained with different test statistics of the likelihood ratio that have different deflections [14]. For this reason, a higher deflection is related to a better discrimination and a lower sensitivity with respect to a suboptimal threshold. Figure 6 shows the results of the binary test for a linear frequency modulation (LFM) and the optimized waveform for two Gaussian targets with the same extension and distance. The deflection between both classes increases for the optimised waveform, leading to a smaller intersection area of the test statistics for the hypothesis and the alternative. This facilitates a better separability as well as a lower false alarm rate for the same detection probability.

Figure 6.

Test statistic of the likelihood ratios with the mean distance of the centers for LFM and optimized waveform. (a) Distribution of the likelihood ratio for noisy samples of the hypothesis and alternative using linear frequency modulation. (b) Distribution of the likelihood ratio for noisy samples of the hypothesis and alternative using the optimized waveform.

Adapting the waveform to the environment can thus support the classification and save resources such as bandwidth. Applications like interference mitigation can also be executed in the skill-based layer by combining spectrum sensing algorithms with matched illumination.

4.2. Spectrum sensing

Because wireless communication technologies have become so important, the available radio frequency spectrum is now a valuable resource for radar. For example, the U.S. Department of Commerce [15] has decided to allocate parts of the S-band (1695–1710 MHz and 3550–3650 MHz) to wireless communication. Another example is the parts of the C-band (5150–5350 MHz and 5470–5725 MHz) which are used by weather radars but are now also used by 5 GHz WiFi [16]. On the other hand, these bands, although allocated, are underutilized, providing opportunities for secondary (unlicensed) users to share the bands without harming the primary users. Conversely, a similar problem arises when the radar suffers interference from other users or even active jamming. The former is a particular problem for ultra-wideband radars such as ground penetrating radars, which naturally operate in partially occupied frequency bands. In the future these problems will become even worse, and hence future cognitive radar systems must be able to operate in spectrally dense environments. Spectrum sensing techniques from cognitive radio provide algorithms to identify spectrum opportunities, i.e. to decide if a frequency band is occupied or not. With this information a cognitive radar can dynamically adapt its bandwidth, frequency and other transmit parameters to the radio frequency environment.

A significant number of studies dealing with spectrum sensing algorithms exists, and hence we only give a brief overview here. For a comprehensive overview the reader is referred, for example, to the surveys [17, 18]. Spectrum sensing algorithms can be split into wideband and narrowband algorithms. Almost all narrowband spectrum sensing methods are statistical hypothesis tests, usually written as

$$\mathcal{H}_0:\; x(t) = w(t), \qquad \mathcal{H}_1:\; x(t) = s(t) + w(t) \tag{11}$$

where x(t) represents the received complex signal, s(t) the signal of another user and w(t) the noise, which is usually assumed white and Gaussian with variance $\sigma_w^2$. The simplest spectrum sensing method is the energy detector


where $p_{fa}$ is the desired probability of false alarm and $F_{\chi^2_{2n}}$ is the $\chi^2$ distribution function with $2n$ degrees of freedom. Although this method is easy and fast, it suffers from poor detection probabilities in low SNR regions and poor robustness, see [19]. More advanced methods exploit certain features, for example cyclostationary properties [20], where a time series $x_1, x_2, \ldots$ is said to exhibit cyclic frequency $\alpha$ with delay $m$ if


Most modern modulations like OFDM or QAM have cyclostationary properties. For details on a test statistic see [21]. These methods offer high detection probabilities even in low SNR regions and are blind in the sense that they do not need information about $\sigma_w^2$. The price is a very high computational complexity and the need for prior information about the modulation used. Completely blind methods, for which absolutely no prior information is necessary, are usually based on a multi-antenna system. The data from the different channels is used to estimate a covariance matrix, and a test statistic is built from its characteristics, e.g. its eigenvalues, see [22].
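The energy detector mentioned above can be sketched with the Python standard library alone. The sketch assumes circular complex Gaussian noise with total variance $\sigma_w^2$, so that $\frac{2}{\sigma_w^2}\sum_k |x_k|^2$ follows a $\chi^2$ distribution with $2n$ degrees of freedom under the noise-only hypothesis; the threshold normalisation and all function names are assumptions of this example and may differ from the chapter's formulation.

```python
import math

def chi2_cdf_even(x, n):
    """CDF of a chi-square variable with 2n degrees of freedom (closed form)."""
    if x <= 0.0:
        return 0.0
    half, term, total = x / 2.0, 1.0, 1.0
    for k in range(1, n):
        term *= half / k          # (x/2)^k / k!
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even(p, n):
    """Invert chi2_cdf_even by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, n) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def energy_threshold(n, noise_var, p_fa):
    """Threshold on sum |x_k|^2 that yields the desired false alarm rate."""
    return 0.5 * noise_var * chi2_quantile_even(1.0 - p_fa, n)

def energy_detector(samples, noise_var, p_fa):
    """Return True if the band is declared occupied."""
    energy = sum(abs(x) ** 2 for x in samples)
    return energy > energy_threshold(len(samples), noise_var, p_fa)
```

The detector needs the noise variance as an input, which illustrates why it is not blind and why its robustness suffers when $\sigma_w^2$ is only known approximately.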

Because the channel state may change between sensing and transmitting, a prediction step after the sensing is helpful or even necessary. For this purpose, hidden Markov models are used in Ref. [23], and multilayer perceptrons and recurrent neural networks are additionally considered in Ref. [24]. The neural networks in particular perform well in simulations, with a prediction accuracy of about 0.8 to 0.9.

In contrast to narrowband spectrum sensing, wideband spectrum sensing methods divide a band into occupied and unoccupied subbands. The most obvious method for classifying a wide band is to split it into fixed subbands (using an FFT or sweep-and-tune) and perform narrowband sensing in each one. There are also native wideband spectrum sensing methods, such as a wavelet-based approach, see [25].

If the radar is the primary user and avoiding or reducing interference is the only goal of the spectrum sensing, it is not necessary to decide if a channel is occupied or not. It is sufficient to use the channel with the least interference. But if a lot of interference is present, a compromise between bandwidth (resolution) and interference must be made which leads to an optimisation problem, see Refs. [26, 27].

After each sensing period, a suitable and adaptable waveform must be generated taking the information from the sensing step into account, essentially bandwidth and center frequency. For example, this can be multiple or notched chirps filling the unoccupied bands or a stepped FM waveform which avoids the occupied frequencies, see [27]. A combination with the matched illumination approach presented in Section 4.1 can be considered, too.
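One simple way to obtain a notched chirp is to zero the occupied bins of a chirp spectrum and transform back. The sketch below assumes NumPy; it is an illustration of the idea, not the waveform design of Ref. [27], and the sampling rate, bandwidth and occupied band in the usage line are hypothetical.

```python
import numpy as np

def notched_chirp(n_samples, fs, bandwidth, occupied):
    """Baseband LFM chirp whose spectrum is zeroed inside occupied bands.

    occupied is a list of (f_lo, f_hi) tuples in Hz, relative to the carrier.
    """
    t = np.arange(n_samples) / fs
    duration = n_samples / fs
    rate = bandwidth / duration
    # Instantaneous frequency sweeps from -bandwidth/2 to +bandwidth/2.
    chirp = np.exp(1j * np.pi * rate * (t - duration / 2) ** 2)
    spectrum = np.fft.fft(chirp)
    freqs = np.fft.fftfreq(n_samples, d=1.0 / fs)
    for f_lo, f_hi in occupied:
        spectrum[(freqs >= f_lo) & (freqs <= f_hi)] = 0.0
    s = np.fft.ifft(spectrum)
    return s / np.linalg.norm(s), freqs      # unit-energy waveform

s, freqs = notched_chirp(1024, fs=100e6, bandwidth=80e6,
                         occupied=[(10e6, 15e6)])
```

Hard spectral notching of this kind gives up the constant modulus of the chirp, which is one reason why dedicated notched or stepped waveforms are designed in practice.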

Building an experimental radar system with spectrum sensing capabilities is a challenging task. The computational complexity of some algorithms can be a burden, and the additional sensing time (gathering the samples plus computation time) must be taken into account, since it reduces the duty cycle or pulse repetition frequency. In Ref. [27], a radar system employing spectrum sensing and matched illumination was implemented using an Ettus USRP X310 software defined radio. In a test environment, a noise floor reduction of about 10 dB was achieved using spectrum sensing and a notched chirp.

4.3. Classification with deep learning techniques

The transition from the continuous stream of incoming raw data towards a symbolic representation of objects, which forms the basis for higher-level cognitive processing, is typically achieved using pattern recognition or classification techniques. As shown in Figure 3, machine learning approaches comprise subsymbolic feature formation processes that separate characteristic signal features in a higher-dimensional space. In this feature space, it is easier to recognize certain target classes to create an abstracted situational picture within the cognitive radar system.

4.3.1. Convolutional neural networks

Convolutional Neural Networks (CNNs) are inspired by the visual system of the brain and are part of the deep learning research field. For many years, CNNs were the only type of deep neural network that could efficiently be trained due to their structure using the technique of weight sharing [28]. The basic structure of the network used in the presented architecture is shown in Figure 7.

Figure 7.

Structure of the used convolutional neural network.

CNNs are a special form of multi-layer perceptrons, which are designed specifically to recognize two-dimensional shapes with a high degree of invariance to translation, scaling, skewing, and other forms of distortion [29]. This invariance is achieved by an alternation of convolutional and subsampling layers, in which the neurons are organized in so-called feature maps. All neurons in each of these feature maps use the same weights and are connected to a local receptive field in the previous layer. With this weight sharing technique, the number of free parameters is dramatically reduced compared to a fully connected network, which should lead to a better generalization of the network.

In the first convolutional layer, each neuron takes its inputs from a local receptive field in the input image and the output values of each feature map, which are visible in Figure 7, represent the intensity of one specific local spatial feature. The features, i.e. the weights of the neurons, are learned during the training process and since the receptive fields of neighboring neurons in the feature maps are shifted only by one pixel in the corresponding direction in the input image, the output values of each feature map correspond to the result of a two-dimensional correlation of the input image with the learned weights of each particular feature map.
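The equivalence between a feature map and a two-dimensional correlation can be made concrete with a small sketch, assuming NumPy. The function name is hypothetical, and the bias and nonlinearity of the neurons are deliberately left out because the text does not specify them.

```python
import numpy as np

def feature_map(image, kernel):
    """Valid 2-D correlation of an image with one learned kernel.

    Every output neuron sees a receptive field shifted by one pixel and
    uses the same weights (weight sharing).
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((2, 2))
print(feature_map(image, kernel))
```

The explicit double loop is chosen for readability; it makes visible that the kernel, not the image position, carries the only free parameters.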

In the input image of Figure 7 one target is visible in the center of the image. The correlation with the different kernels is visualized for three examples. The learned kernels are depicted inside the black squares on the input image and the result of the correlation can be seen in the feature maps of the first layer.

The second layer of the network is a subsampling layer and performs a reduction of the dimension by a factor of four. With this reduction the exact position of the feature becomes less important and it reduces the sensitivity to other forms of distortion [29]. The subsampling is done by averaging an area of 4 × 4 pixels, multiplying it with a weight $w_j$ and adding a trainable bias $b_j$.
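The subsampling step can be sketched as follows, assuming NumPy. The function name is hypothetical, and rows or columns that do not fill a complete 4 × 4 block are simply discarded, which is an assumption of this example.

```python
import numpy as np

def subsample(feature_map, weight, bias, size=4):
    """Average non-overlapping size x size blocks, then scale and shift."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size              # drop incomplete blocks
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return weight * blocks.mean(axis=(1, 2 + 1)) + bias

fmap = np.arange(64, dtype=float).reshape(8, 8)
print(subsample(fmap, weight=1.0, bias=0.0))
```

Each output value depends only on the block average, so a feature that moves by a pixel or two inside a block changes the output very little.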

The third layer is a convolutional layer again and relates the features found in the image to each other. This layer is trained to find patterns of features, which can be separated by the subsequent layers and discriminate the different classes. The output of this layer is the internal representation and can be considered as the feature vector found by the network for the given input image.

The last two layers of the network form the decision part of the system and are fully connected layers, which use the output values of the third layer as features for classification. The last layer consists of as many neurons as there are classes to be separated, in our case ten. The classification is done by assigning the class of the neuron with the highest output value.

One cost function for neural networks trained with the back propagation algorithm is the mean square error (MSE) of the training set. The MSE is the mean value of the quadratic loss function E(α), which is given by


In (13), $\alpha$ is the set of classifier parameters, $d_i$ is the desired output for the $i$th element of the training set and $f(x_i, \alpha)$ is the classifier response to input $x_i$. The MSE of the complete training set with size $N$ is thus


The MSE is also called the empirical risk with respect to quadratic loss and classifiers using this error as a performance measure are said to implement the empirical risk minimization (ERM) [30].

The training of our network is performed by the stochastic diagonal Levenberg-Marquardt algorithm that is presented in [31, 32]. The core of this algorithm is the stochastic update rule


where $\alpha_l(k)$ is the $l$-th element of the parameter set $\alpha$ at iteration $k$, $E_i$ is the instantaneous loss function of (13) for image $i$ and $\gamma_l(k)$ is the step size for the particular weight $\alpha_l$ at iteration $k$. The dependency of the step size on the iteration indicates that the step size is not fixed during the training, but is dynamically updated. The calculation of the step size is done by


with the constant $\mu$ and a parameter $\eta(k)$ that prevents the step size from becoming too large when the estimate of the second derivative $g_l(k)$ of the loss function $E_i(\alpha)$ with respect to $\alpha_l$ is small. For the calculation of $g_l(k)$ the Gauss-Newton approximation is used, which guarantees a nonnegative estimate [32]. The parameter $\eta$ is marked here as dependent on the iteration, but is fixed over several epochs of the training. The Hessian matrix $g(k)$ is not calculated explicitly in each iteration; instead a running estimate is kept that is updated with


where $\beta$ is between zero and one. Because of the weight sharing, the first and the second partial derivative of the loss function are sums of partial derivatives with respect to the connections that actually share the specific parameter $\alpha_l$


In (18) and (19), $w_{mn}$ is the connection weight from neuron $n$ to $m$ and $V_l$ is the set of unit index pairs $(m, n)$ such that the connection between neuron $m$ and $n$ shares the parameter $\alpha_l$, i.e.,


Further details of the algorithm and the approximations that are done to compute the derivatives can be found in Ref. [32].

4.3.2. Regularizations and adaptive learning rates

One feature of the presented network is the use of momentum, which adds a feedback loop and with this some kind of memory to the algorithm. With this technique a certain amount of the weight change of the last iteration is added to the weight change of the current iteration. This amount is determined by the momentum constant $\rho$ and leads to the expression


which can also be written as


The use of momentum should have a positive effect on the behavior of the training algorithm and may prevent the algorithm from converging to a local minimum of the error function [29]. Another important regularization method used in this network is the max-norm regularization of the weights of the network. For this regularization the Frobenius norm of each kernel in layer one and three is calculated after the weight change at every iteration and if the norm is larger than a certain value c, the kernel is rescaled to a norm of c. With this regularization an improvement of the convergence properties of the training algorithm has been observed.
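Both mechanisms can be sketched in a few lines, assuming NumPy. The momentum update is written in the common form "new change = ρ times previous change minus step size times gradient"; this sign convention and the function names are assumptions of the example, while the max-norm rescaling follows the description above.

```python
import numpy as np

def momentum_step(weights, grad, prev_delta, step_size, rho):
    """One gradient step with momentum; returns new weights and the change."""
    delta = rho * prev_delta - step_size * grad
    return weights + delta, delta

def max_norm(kernel, c):
    """Rescale a kernel to Frobenius norm c if its norm exceeds c."""
    norm = np.linalg.norm(kernel)        # Frobenius norm for a 2-D array
    return kernel * (c / norm) if norm > c else kernel

# Two steps with a constant gradient show the accumulated change.
w, delta = np.zeros(1), np.zeros(1)
for _ in range(2):
    w, delta = momentum_step(w, grad=np.ones(1), prev_delta=delta,
                             step_size=0.1, rho=0.9)
print(delta, max_norm(np.array([[3.0, 4.0]]), c=1.0))
```

In a training loop, `max_norm` would be applied to each kernel of layers one and three directly after the weight change of every iteration.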

So far the learning rate in (16) is only determined by the characteristics of the data itself and the error it produces at the output of the network. Another important factor could be meta-information available about the training set. We give here an example of a priority class, which means that we have one target in our database that should always be classified correctly with the additional cost that we might produce more errors in other classes. To incorporate these priority classes into our network, the representation in (22) is used. The general idea is to increase the learning rate $\gamma(k)$ if an image of a priority class is presented at the current iteration. This is done by multiplying a priority weighting $p$ with the learning rate $\gamma(k)$, which is then marked as $\gamma'(k)$


If this term is included in the formula for the weight change Δα(k), the sum in (22) can be split into two parts. One part that contains all samples of the priority classes and one part with the examples of the remaining classes


The need for a different weighting of classes is also discussed in Ref. [33], where it is mentioned that the different costs of misclassification should be part of the classifier design. The way we used here to include this prior knowledge into our target recognition system was also mentioned in Ref. [34] for Support Vector Machines, where the idea was to penalize the samples of less represented classes higher than others.

To show the benefit of this adaptive learning strategy we show an example of the ten class moving and stationary target acquisition and recognition (MSTAR) data [35] in Figure 8. In this example the learning rate of class four is multiplied with different weightings between one, which means no priority, and ten.

Figure 8.

Performance of CNN with priority class.

Without any weighting, this class has, compared to the other classes, a rather low correct classification rate with respect to the number of input images, $P_{cc}^{in}$ (curve with round markers). This value gives the fraction of input images that belong to class four and are actually classified as class four. The curve with the square markers gives the probability of correct classification with respect to the number of output images, $P_{cc}^{out}$, i.e. the fraction of images classified as class four that really belong to class four; it is thus an indicator of the reliability of the classification. Summarized over all classes, both indicators lead to the same result, the correct classification rate $P_{cc}$ (curve with the triangular markers). It can be seen from the plot that $P_{cc}^{in}$ increases steeply at small values of $p$, and up to $p = 4$ the overall correct classification rate also increases, which is not the purpose here but shows the positive effect of the additional correct classifications. While $P_{cc}^{in}$ is increasing, $P_{cc}^{out}$ decreases steadily. In the extreme case of $p \to \infty$, $P_{cc}^{in}$ should reach one and both $P_{cc}^{out}$ and $P_{cc}$ should reach a value of $N_{class4}/N$, which means that all images in the dataset are classified as class four. This example and more details about the use of different weightings of different classes can be found in Ref. [36].
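The three rates discussed above can be read off a confusion matrix. The following sketch uses plain Python; the function name and the small two-class matrices are hypothetical and only serve to illustrate the definitions, including the limiting case in which every image is assigned to the priority class.

```python
def classification_rates(confusion, cls):
    """Return (Pcc_in, Pcc_out, Pcc) for one class.

    confusion[i][j] is the number of images of true class i that were
    classified as class j. The class is assumed to occur at least once
    both as input and as output.
    """
    n_in = sum(confusion[cls])                        # images of this class
    n_out = sum(row[cls] for row in confusion)        # images assigned to it
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    pcc_in = confusion[cls][cls] / n_in
    pcc_out = confusion[cls][cls] / n_out
    return pcc_in, pcc_out, correct / total

# Ordinary case and the extreme case where everything is assigned to class 0.
print(classification_rates([[8, 2], [1, 9]], cls=0))
print(classification_rates([[10, 0], [30, 0]], cls=0))
```

In the second call, the input-related rate is one while the output-related rate and the overall rate both equal the share of class 0 in the dataset, which mirrors the limit described for $p \to \infty$.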

4.3.3. Combination of convolutional neural networks with support vector machines

An often mentioned benefit of Support Vector Machines (SVMs) is the high generalization capability in comparison to neural networks. The high generalization of SVMs is achieved by a training strategy called structural risk minimization, which in comparison to the empirical risk minimization of neural networks takes the complexity of the classifier into account. For this reason, the Vapnik-Chervonenkis (VC)-dimension $h$ was introduced to measure the complexity of a classifier. The VC-dimension is defined as the largest training set size $N$ which can be separated with binary labels in an arbitrary way by the SVM. With a high number of free parameters, the capacity of the classifier increases and thus the VC-dimension increases as well. Due to this relation, single patterns have a higher influence on the classification result for classifiers with a high VC-dimension, which increases the likelihood of overfitting to the training data [37]. To incorporate the VC-dimension into the minimization problem that has to be solved during the training, an additional term $\Phi(N/h)$ is added to the empirical risk to define the structural risk


where $R_{emp}$ corresponds to the empirical risk. In this problem $R_{emp}$ does not refer to the MSE of (14), which was used for neural networks, but to the specific number of misclassifications in the training set. The VC-dimension has an influence on both terms because a high VC-dimension will increase the complexity of the classifier and thus reduce the empirical risk, but the confidence interval $\Phi(N/h)$ would increase at the same time, since it only depends on the ratio between the size of the training set and the VC-dimension. SVMs are designed to find the best trade-off between these two terms, i.e. to decrease the empirical error while keeping the VC-dimension as low as possible. Because of this, SVMs are classifiers with a very high generalization capability.

To use the high generalization of SVMs in our classification framework, we replace the last two layers of the CNN in Figure 7 with SVMs. In this way we can use the convolutional feature extraction with the invariance to different forms of distortion and a classifier with high generalization. As input for the SVMs, the output values of the third layer are used. The final structure of the classifier is shown in Figure 9.

Figure 9.

Structure of the used combination of CNN and SVMs.

An SVM can only separate two classes. For this reason the training set must be split into two parts for each SVM: one part containing the class that should give a positive result at the output of the SVM, and one part containing the remaining training set, which should give a negative result at the output. SVMs trained in this way work in the one-vs.-all classification scheme, which means that as many SVMs are necessary as there are classes to be separated. For the actual classification of an SVM, a kernel is used to transform the data to a high-dimensional space in which it is more likely that the problem can be linearly separated. Two common kernels are polynomial kernels (including linear and quadratic kernels) and radial basis functions (RBFs). Table 1 shows a small example from the MSTAR database; it can be seen that the already very high correct classification rate of the CNN can be further increased by using SVMs as classifiers.

| Classifier | Correct classification | Error |
| --- | --- | --- |
| Original CNN | 96.00% | 4.00% |
| CNN feature extraction and polynomial SVM | 98.19% | 1.81% |
| CNN feature extraction and RBF SVM | 98.28% | 1.72% |

Table 1.

Forced decision results of MSTAR dataset.

The results shown here are so-called forced decision results, meaning that every image is assigned to the class with the highest output value. No rejection criterion, such as a minimum confidence that has to be exceeded, is used. These and further results with the proposed classifier can be found in Ref. [38].
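The difference between a forced decision and a decision with a rejection criterion can be sketched in a few lines. This is only an illustration: the function names, the score values and the threshold are hypothetical and are not taken from Ref. [38].

```python
def forced_decision(scores):
    """Return the index of the class whose one-vs-all SVM gave the highest output."""
    return max(range(len(scores)), key=lambda k: scores[k])


def decision_with_rejection(scores, threshold):
    """Return the winning class, or None if its score stays below `threshold`."""
    best = forced_decision(scores)
    return best if scores[best] >= threshold else None


# Hypothetical outputs of three one-vs-all SVMs for a single image.
svm_outputs = [-0.8, 0.1, -0.3]
forced = forced_decision(svm_outputs)                            # always picks a class
rejected = decision_with_rejection(svm_outputs, threshold=0.5)   # may refuse to decide
```

With a forced decision the weakly supported class 1 is still reported, whereas the thresholded variant would reject this image.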


5. Rule-based-layer

Based on the abstracted situational picture derived by signal-processing and machine learning techniques, the cognitive radar system has to react to the perceived scene. Below, we illustrate an MDP-based scheme to execute a priori known, optimal illumination policies. In multifunctional radars, a radar resource manager additionally has to schedule the individual illuminations into a serial radar timeline.

5.1. Optimal illumination policy

Markov-decision-processes (MDPs) are widely used in robotics to derive optimal control policies in stochastic environments. An agent in state s_i can execute different actions a_i, which with probability p_ij lead to a follow-up state s_j and a reward of r_ij. Different approaches, such as value iteration or reinforcement learning, are used to determine an optimal policy π = (s_i | a_i, s_j | a_j, …). The policy assigns to each state s_i an optimal action a_i which maximizes the expected reward. MDPs are well suited to model the perception-action-cycle of a radar, e.g. for tracking applications [39]. In the following, we illustrate an example for multi-stage classification from Ref. [40].
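As a minimal sketch of how value iteration derives such a policy, consider the toy problem below. The states, actions, probabilities and rewards are hypothetical and only chosen for illustration; the discount factor `gamma` is likewise an assumption and not part of the example in Ref. [40].

```python
def q_value(transitions, values, s, a, gamma):
    """Expected return of action a in state s under the current value estimate."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in transitions[(s, a)])


def value_iteration(states, actions, transitions, gamma=0.9, tol=1e-9):
    """transitions[(s, a)] is a list of (probability, next_state, reward) tuples."""
    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            allowed = [a for a in actions if (s, a) in transitions]
            if not allowed:          # terminal state keeps value 0
                continue
            new_v = max(q_value(transitions, values, s, a, gamma) for a in allowed)
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v
        if delta < tol:
            break
    policy = {}
    for s in states:
        allowed = [a for a in actions if (s, a) in transitions]
        if allowed:
            policy[s] = max(
                allowed, key=lambda a: q_value(transitions, values, s, a, gamma))
    return values, policy


# Hypothetical toy problem: measure again (small cost) or declare immediately.
states = ["uncertain", "confident", "done"]
actions = ["measure", "declare"]
transitions = {
    ("uncertain", "measure"): [(0.8, "confident", -0.1), (0.2, "uncertain", -0.1)],
    ("uncertain", "declare"): [(1.0, "done", -1.0)],
    ("confident", "declare"): [(1.0, "done", -0.1)],
}
values, policy = value_iteration(states, actions, transitions)
```

In this toy setting the resulting policy measures again while the state is uncertain and declares once it is confident, because the expected cost of an early declaration outweighs the cost of a further illumination.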

Three classes of targets K = {1, 2, 3} can appear in a scenario with a priori probabilities π1 = 0.1, π2 = 0.2 and π3 = 0.7 (Figure 10). A low- and a high-resolution radar mode (mode = 1 ∣ 2) are available for up to five consecutive illuminations t = {1, 2, 3, 4, 5}, which are fused to a final declaration V using the Bayes rule (Figure 11). The policy describes the optimal illumination strategy with respect to the highest expectation of correctly classifying targets of class 1 (V = 1 ⇔ Class 1, V = 2 ⇔ ¬ Class 1). A negative reward (cost) of 1 unit is assigned for a false alarm and 2 units for a missed detection.
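The Bayes fusion and the cost-based final declaration can be sketched as follows. The priors and the two cost values are taken from the text above. The likelihoods in `P_Y1` are hypothetical placeholders, since the confusion matrices of Figure 10 are not reproduced here, and the function names are assumptions as well.

```python
PRIOR = [0.1, 0.2, 0.7]                 # a priori probabilities of classes 1, 2, 3
COST_FALSE_ALARM, COST_MISS = 1.0, 2.0  # cost units stated in the text

# Hypothetical likelihoods P(Y = 1 | class k) per radar mode (NOT from Ref. [40]).
P_Y1 = {1: [0.6, 0.4, 0.3], 2: [0.9, 0.3, 0.1]}


def bayes_update(belief, mode, y):
    """Fuse one classification result y in {1, 2} into the class belief."""
    likelihood = [p if y == 1 else 1.0 - p for p in P_Y1[mode]]
    unnormalised = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def best_declaration(belief):
    """Return (V, expected cost) of the cheaper final declaration."""
    cost_v1 = (1.0 - belief[0]) * COST_FALSE_ALARM   # declare "class 1"
    cost_v2 = belief[0] * COST_MISS                  # declare "not class 1"
    return (1, cost_v1) if cost_v1 < cost_v2 else (2, cost_v2)


declaration_before, cost_before = best_declaration(PRIOR)
posterior = bayes_update(PRIOR, mode=2, y=1)
declaration_after, cost_after = best_declaration(posterior)
```

With these placeholder likelihoods, the prior alone favours the declaration V = 2, while a single high-resolution result Y = 1 shifts the belief enough to make V = 1 the cheaper declaration.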

Figure 10.

Scenario, confusion- and cost matrix for the classification problem according to Ref. [40].

Figure 11.

State-space, fusion, and selection of action (further measurement or final declaration V) to minimize the expected costs.

The resulting multi-stage illumination policy is shown in Figure 12. Initially the target is illuminated with mode 2 and classified. Depending on the result Y = 1, 2, the strategy branches and finishes with a final declaration V = 1, 2. In a simulation of 100,000 Monte-Carlo runs, the static application of mode 1 resulted in accumulated costs of 20,000 (class 1 never detected, i.e. all missed detections). When randomly switching between mode 1 and 2, costs of 9063 occurred as opposed to the lowest cost of 4797 when using the optimal strategy.
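The cost of 20,000 for the static mode 1 strategy follows directly from the prior and the cost of a missed detection, and the relative savings can be computed from the quoted simulation results. The short check below only reuses numbers stated in the text; the variable names are illustrative.

```python
RUNS = 100_000
PRIOR_CLASS_1 = 0.1           # a priori probability of class 1
COST_MISSED_DETECTION = 2     # cost units per missed class-1 target

# Mode 1 alone never yields a class-1 declaration, so every class-1 target is missed.
expected_class_1_targets = round(RUNS * PRIOR_CLASS_1)
static_mode_1_cost = expected_class_1_targets * COST_MISSED_DETECTION

# Relative savings of the optimal policy, using the simulation results quoted above.
saving_vs_static = 1 - 4797 / static_mode_1_cost
saving_vs_random = 1 - 4797 / 9063
```

Under these numbers the optimal policy reduces the accumulated cost by roughly three quarters compared with the static strategy and by a little under one half compared with random mode switching.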

Figure 12.

Optimal policy to the MDP.

5.2. Radar resource management

The illumination strategy in Figure 12 requires up to five consecutive illuminations of a target. As indicated in Figure 1, a multifunctional radar must simultaneously carry out additional tasks, in particular searching for new targets and tracking known targets. Since a shared aperture is used, the radar resource manager schedules the radar timeline in time-multiplexing mode.

In the following, we simulate an airspace-surveillance radar rotating at 180°/s with electronic beam-steering.

5.2.1. Surveillance

The airspace is discretised depending on the beam width. Let B_φ, B_θ ∈ (0, 2π) be the azimuth and elevation opening angles, respectively. The dwell time τ = 2r/c of an airspace section (with c the speed of light) is chosen depending on the range r of the target to guarantee that the whole range can be scanned within one transmit-receive process. Therefore the discretisation is only made in the directions of azimuth and elevation. Since the transmit power decreases with increasing distance to the main lobe, the borders are defined to overlap, i.e. a constant d ∈ (0, 1) is selected for the discretisation (typical values are d = 0.5 or d = 0.75). If the maximum observable range and altitude are limited by R and H, respectively, the airspace to be observed can be written as


where the sensor is located in the center of the coordinate system and h(x) denotes the height of the target perpendicular to the earth's surface. Then, after proper transformation, the discretisation of ℒ is given by


Here the factor cos(j d B_θ)^(−1) compensates for the fact that the same area (in steradians) covers a wider azimuth range at higher elevation than it does at lower elevation. When a surveillance task (see Section 5.2.3) is completed, it is immediately regenerated with the desired revisit time to guarantee regular observation of the entire airspace.
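A rough sketch of how such a grid could be generated is given below. It assumes that elevation row j lies at the angle j·d·B_θ and that the azimuth step of this row is d·B_φ widened by the factor cos(j d B_θ)^(−1). The function names, the beam widths, the elevation limit (which must stay below 90°) and the 150 km range are hypothetical example values, not those of the simulation.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def dwell_time(max_range_m):
    """Two-way travel time tau = 2 r / c for the given range."""
    return 2.0 * max_range_m / SPEED_OF_LIGHT


def discretise(b_phi, b_theta, d, max_elevation):
    """Return (elevation, azimuth_step, n_azimuth_cells) for each elevation row j."""
    rows = []
    j = 0
    while j * d * b_theta <= max_elevation:
        theta = j * d * b_theta
        azimuth_step = d * b_phi / math.cos(theta)   # wider step at higher elevation
        n_cells = math.ceil(2.0 * math.pi / azimuth_step)
        rows.append((theta, azimuth_step, n_cells))
        j += 1
    return rows


# Hypothetical example: 3 degree beams, 25 % overlap, elevation up to 50 degrees.
rows = discretise(math.radians(3.0), math.radians(3.0), 0.75, math.radians(50.0))
total_cells = sum(n for _, _, n in rows)
scan_time = total_cells * dwell_time(150e3)   # time for one pass over all cells
```

The number of azimuth cells per row shrinks with elevation, which is the effect the cosine factor is meant to capture. Multiplying the cell count by the dwell time gives a lower bound on the time needed to visit every cell once.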

5.2.2. Tracking

To estimate the position of a target continuously in time, all radar detections of a target T_i are combined into a track T̃_i. This is done by relating them physically using predefined dynamic models. A simple dynamic model assumes, for example, a constant velocity that can vary through (statistically zero-mean) process noise in the acceleration. To determine which measurement belongs to which track, data association is done using scoring and the global nearest neighbor (GNN) approach described in Ref. [41]. All unassociated detections generate a new track, which counts as verified when its score exceeds a given threshold. In general a track is an estimate of the movement of the target: it contains information about the dynamic model, the covariance matrix P_i and an estimate x̂_i(t) of the real state x_i(t) = (p_i(t), v_i(t), …)^T consisting of position p_i(t), velocity v_i(t) and, for example, acceleration a_i(t) at time t. All tracks generated by the radar yield an estimate of the airspace situation (see Figure 13).

Figure 13.

Airspace situation.

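A more complex dynamic model was introduced by Singer [42]. The state x(t) at time t can then be written compactly as x(t) = (p(t), v(t), a(t))^T = (p(t), ṗ(t), p̈(t))^T. The acceleration in this model is given by the ordinary differential equation

$$\dot{a}(t) = -\alpha\, a(t) + w(t), \qquad (28)$$

where α is the reciprocal of the maneuver time constant and w(t) is Gaussian white noise. From Eq. (28) a discrete form of the Singer model at the k-th time step can be derived,

$$x_k = F_k\, x_{k-1} + w_k,$$

with discrete white noise w_k and process matrix F_k of the following form

$$F_k = \begin{pmatrix} 1 & \Delta t & (\alpha \Delta t - 1 + e^{-\alpha \Delta t})/\alpha^2 \\ 0 & 1 & (1 - e^{-\alpha \Delta t})/\alpha \\ 0 & 0 & e^{-\alpha \Delta t} \end{pmatrix},$$

where Δt denotes the time elapsed between time steps k − 1 and k.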
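For illustration, the process matrix of the discrete Singer model in its standard textbook form can be computed as follows. The function name and the example parameters are assumptions made for this sketch.

```python
import math


def singer_transition(alpha, dt):
    """Process matrix of the discrete Singer model for the state (p, v, a)."""
    e = math.exp(-alpha * dt)
    return [
        [1.0, dt, (alpha * dt - 1.0 + e) / alpha ** 2],
        [0.0, 1.0, (1.0 - e) / alpha],
        [0.0, 0.0, e],
    ]


# Long maneuver time constant (small alpha): close to a constant-acceleration model.
F_slow = singer_transition(alpha=1e-3, dt=1.0)
# Short maneuver time constant (large alpha): the acceleration decorrelates quickly.
F_fast = singer_transition(alpha=100.0, dt=1.0)
```

The two limiting cases show the role of α: for small α the matrix approaches that of a constant-acceleration model with entries Δt²/2 and Δt in the last column, while for large α the acceleration state is forgotten between updates.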

Recursive Bayesian estimators can be used to calculate the state x and the covariance matrix P (for better readability the index i is dropped from now on). One commonly used estimator is, for example, the (Extended) Kalman Filter (EKF) [41, 43]. In general the Kalman filter assumes a state transition model and an observation model

$$x_k = f(x_{k-1}) + w_k, \qquad z_k = h(x_k) + v_k,$$

where z_k denotes the measurement, f and h are (not necessarily linear) functions, and w_k and v_k are additive, zero-mean, white noises with process noise covariance Q_k and measurement noise covariance R_k, respectively. In our case h is, for example,

$$h(x_k) = (\phi_k, \theta_k, r_k)^T,$$

the mapping between the state space ℒ and the measurement in azimuth ϕ, elevation θ and range r. For the Singer model the state transition is a linear function with

$$f(x_{k-1}) = F_k\, x_{k-1}.$$
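The EKF consists of two steps. First, the state x_{k∣k−1} and the covariance P_{k∣k−1} are predicted using the previous estimates x_{k−1∣k−1} and P_{k−1∣k−1} (the index k ∣ k − 1 denotes an estimate for time step k that is based on the information available at time step k − 1):

$$x_{k|k-1} = f(x_{k-1|k-1}), \qquad P_{k|k-1} = F_{k-1}\, P_{k-1|k-1}\, F_{k-1}^T + Q_k.$$

In the general case the matrix F_{k−1} is defined by

$$F_{k-1} = \left.\frac{\partial f}{\partial x}\right|_{x_{k-1|k-1}}.$$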
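Second, the prediction is corrected using the (noisy) measurement z_k:

$$x_{k|k} = x_{k|k-1} + K_k \left(z_k - h(x_{k|k-1})\right), \qquad P_{k|k} = (I - K_k H_k)\, P_{k|k-1},$$

with observation matrix

$$H_k = \left.\frac{\partial h}{\partial x}\right|_{x_{k|k-1}}$$

and Kalman gain

$$K_k = P_{k|k-1} H_k^T \left(H_k P_{k|k-1} H_k^T + R_k\right)^{-1}.$$

Process noise and measurement error accumulate over time until a new measurement is executed. This leads to a probability density of the track T̃_i with state x_i.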
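The predict and correct cycle can be illustrated with a deliberately reduced example. The sketch below is a linear Kalman filter, not the EKF with Singer model used in the simulation: it assumes a one-dimensional constant-velocity state (p, v), a scalar position measurement and a white-noise-acceleration process noise, and all numerical values are hypothetical.

```python
def predict(x, P, dt, q):
    """Prediction for state x = [p, v] with F = [[1, dt], [0, 1]]."""
    x_pred = [x[0] + dt * x[1], x[1]]
    # F P
    fp = [[P[0][0] + dt * P[1][0], P[0][1] + dt * P[1][1]],
          [P[1][0], P[1][1]]]
    # (F P) F^T + Q, with Q from a white-noise acceleration of variance q
    P_pred = [[fp[0][0] + dt * fp[0][1] + q * dt ** 4 / 4, fp[0][1] + q * dt ** 3 / 2],
              [fp[1][0] + dt * fp[1][1] + q * dt ** 3 / 2, fp[1][1] + q * dt ** 2]]
    return x_pred, P_pred


def correct(x_pred, P_pred, z, r):
    """Correction with a scalar position measurement z, i.e. H = [1, 0]."""
    s = P_pred[0][0] + r                       # innovation covariance H P H^T + R
    k = [P_pred[0][0] / s, P_pred[1][0] / s]   # Kalman gain P H^T / s
    innovation = z - x_pred[0]
    x = [x_pred[0] + k[0] * innovation, x_pred[1] + k[1] * innovation]
    # (I - K H) P
    P = [[(1 - k[0]) * P_pred[0][0], (1 - k[0]) * P_pred[0][1]],
         [P_pred[1][0] - k[1] * P_pred[0][0], P_pred[1][1] - k[1] * P_pred[0][1]]]
    return x, P


# Target moving at 2 units per step, observed without noise for 20 steps.
x, P = [0.0, 0.0], [[100.0, 0.0], [0.0, 100.0]]
for step in range(1, 21):
    x, P = predict(x, P, dt=1.0, q=0.01)
    x, P = correct(x, P, z=2.0 * step, r=1.0)

# Without a new measurement the position uncertainty grows again.
_, P_coast = predict(x, P, dt=1.0, q=0.01)
```

The last line shows the effect described above: each prediction without a subsequent measurement increases the position variance, which is what limits the time a track may go without an update.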

The probability density is used to calculate the maximum time difference Δt that allows the track to stay within a predefined range-relative accuracy:

subject to

where P̄ denotes a projection into the plane orthogonal to the beam direction and ν ∈ (0, 1) is the track sharpness. The time difference Δt is added to the tracking task of the track T_i and is updated at every measurement.

5.2.3. Scheduler

In the simulation presented here a task is generated for each L_ij and placed into a sorted waiting queue (see Figure 14). The scheduler executes those tasks whose time stamps do not lie in the past, in the given order. If tasks are delayed, they are prioritized following the hierarchy of the waiting queue. The tasks inside the waiting queue are sorted according to their time stamps (Figure 15).
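One possible reading of this queue-based scheduling is sketched below with a binary heap ordered by time stamp. The `Task` fields follow the description of Figure 15, while `revisit_time`, `run_scheduler` and all numerical values are hypothetical additions for illustration. In particular, executing a delayed task as soon as the aperture is free is an assumption about how delays are handled.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    time_stamp: float                          # sort key of the waiting queue
    name: str = field(compare=False)
    duration: float = field(compare=False)
    azimuth: float = field(compare=False)
    elevation: float = field(compare=False)
    revisit_time: float = field(compare=False, default=0.0)  # 0: do not regenerate


def run_scheduler(tasks, t_end):
    """Execute tasks in time-stamp order on a single time-multiplexed timeline."""
    queue = list(tasks)
    heapq.heapify(queue)
    now, timeline = 0.0, []
    while queue:
        task = heapq.heappop(queue)
        start = max(now, task.time_stamp)   # wait if early, run at once if delayed
        if start >= t_end:
            break
        timeline.append((start, task.name))
        now = start + task.duration
        if task.revisit_time > 0.0:         # surveillance tasks are regenerated
            heapq.heappush(queue, Task(now + task.revisit_time, task.name,
                                       task.duration, task.azimuth,
                                       task.elevation, task.revisit_time))
    return timeline


tasks = [
    Task(0.0, "S1", 0.002, 0.0, 0.0, revisit_time=1.0),      # surveillance cell
    Task(0.0005, "S2", 0.002, 0.04, 0.0, revisit_time=1.0),  # surveillance cell
    Task(0.001, "T1", 0.001, 0.3, 0.1),                      # track update
]
timeline = run_scheduler(tasks, t_end=0.01)
```

In this example the track update T1 is due at 1 ms but only starts at 4 ms, after the two earlier surveillance tasks have released the aperture. The surveillance tasks re-enter the queue with their revisit time once they are completed.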

Figure 14.

Schematic illustration of the scheduler.

Figure 15.

Detailed illustration of the task A1, containing a time stamp, the duration of the task, azimuth and elevation.

5.2.4. Performance metrics

In this section three metrics are introduced to validate the performance of the resource manager.

One key element is the tracking accuracy. For the validation, the distance between the estimated position p̃_i(t) and the real position p_i(t) of a target is calculated. The track does not contain any information about which target it belongs to, since the radar system does not know the ground truth. Therefore the track with the closest approach to the target is chosen as reference. The track sharpness is given as a percentage of the beam width:


The metric d_TS does not take into account whether the number of tracks matches the number of targets. Therefore the number of tracked targets #T̃ is compared to the number of actually existing targets #T by the following metric:


To evaluate the surveillance performance, the revisit time is considered. Let t_{L_ij} be the time of the last update of direction L_ij and let L̄(t) be the direction the radar is facing at time t. Then the metric is given by


5.2.5. Results

In this section the validation results of the simulation are presented. The actual airspace situation is depicted in Figure 13. Figures 16 and 17 show the evaluations of the metrics defined in Section 5.2.4.

Figure 16.

Results during initialization.

Figure 17.

Results during routine operation.

The simulation starts with an occupied airspace. This can be a difficult situation for the radar since pop-up targets significantly decrease the reaction time as the distance to the radar is shortened.

Figure 16(a) shows that the revisit time for the surveillance settles around a constant value after 4 seconds. The stepped line in Figure 16(b) shows that all targets are tracked in less than 3 seconds. The second line in (b) shows that the tracking accuracy is poor at the beginning of the simulation since the filters need several measurements to initialize correctly. Figure 17 shows that the revisit time oscillates around 4 seconds and that the tracks are stable during routine operation.


6. Knowledge-based-layer

In this section, we discuss knowledge-based behavior of a cognitive radar. As discussed in Section 3, the knowledge-based layer works on structured a-priori knowledge about the application domain and its goals and constraints. Automated planning or optimisation tools can be applied to generate mission-level commands that, for example, control the trajectory of the sensor-carrying platform.

Below, we discuss an illustrative trajectory planning problem for a 6-DOF robotic manipulator arm that carries a UWB sensor able to work in synthetic aperture radar (SAR) mode. Results from a real measurement setup using an ST Robotics R17 robot arm are also shown. The sensor has one transmitter and one receiver in a typical common-offset arrangement (Figure 18).

Figure 18.

Trajectory planning for IED inspection with R17HS robot arm.

6.1. Robot trajectory planning

The spatial resolution and processing gain that the system can achieve ultimately depend on the trajectory and velocity profile of the sensor head. The constraints can be modeled as an optimisation problem to obtain a feasible, collision-free trajectory of the end-effector of the manipulator arm in Cartesian coordinates that minimizes observation time.

6.1.1. Sensor characteristics and trajectory constraints

The radar sensor under consideration uses a selectable center-frequency from 3 to 8 GHz and 4 GHz of bandwidth, resulting in 3.75 cm of range resolution. The center frequency can be tuned according to a particular target or propagation environment (ground penetration, through-the-wall imaging, IED inspection…). The horn-type antennas can be rotated to exploit polarization diversity. The sensor is able to operate in stripmap or spotlight SAR modes using linear trajectories. Several parallel trajectories can be combined for 3D imaging. The mobility of the arm could be further exploited to generate non-linear trajectories around a target to obtain a more accurate 3D reconstruction.
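The quoted range resolution follows from the standard relation δ_r = c / (2B) between bandwidth B and range resolution. A short check, with an illustrative function name:

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def range_resolution(bandwidth_hz):
    """Range resolution c / (2 B) of a waveform with bandwidth B."""
    return SPEED_OF_LIGHT / (2.0 * bandwidth_hz)


resolution_cm = 100.0 * range_resolution(4e9)   # about 3.75 cm for 4 GHz
```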

To obtain a cross-range resolution similar to the range resolution, the trajectory planning must aim to create an aperture of at least about 1.3 to 0.5 times the distance to the target in both dimensions (azimuth and elevation), depending on the center frequency used by the system (3 to 8 GHz, respectively).
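These aperture factors can be reproduced with the common approximation δ_cr ≈ λR / (2L) for the cross-range resolution of a synthetic aperture of length L at distance R. Using this approximation is an assumption of the sketch: it is only a rough guide for apertures comparable to the target distance, and the function name is illustrative.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def aperture_to_range_ratio(center_frequency_hz, bandwidth_hz):
    """Ratio L / R needed so that cross-range resolution equals range resolution."""
    wavelength = SPEED_OF_LIGHT / center_frequency_hz
    target_resolution = SPEED_OF_LIGHT / (2.0 * bandwidth_hz)   # c / (2 B)
    return wavelength / (2.0 * target_resolution)               # from lambda R / (2 L)


ratio_3ghz = aperture_to_range_ratio(3e9, 4e9)   # about 1.3
ratio_8ghz = aperture_to_range_ratio(8e9, 4e9)   # 0.5
```

Under this approximation the ratio reduces to B / f_c, so the lower center frequency requires the longer aperture.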

High-resolution imaging can only be achieved if the sensor position is known with even higher precision. The 3D trajectory of the sensor needs to be measured and synchronized with the sensor data. For that purpose, accelerometers and gyroscopes from an attached inertial measurement unit (IMU) are used. The IMU drift is additionally stabilized using the hardware readout of the optical encoders of the robot arm joints, which are driven by stepper motors.

Two other important parameters to be considered for the trajectory planning are the optimal size of the scanning area and the sampling requirements. For planar acquisition geometries working in stripmap mode, the scanning area must be extended by an additional half beam aperture in both dimensions to obtain full-resolution imaging of the total area of interest.

The second parameter concerns the sampling requirements of a particular acquisition. The measurement positions in the synthetic radar aperture require a minimum spacing in order to adequately sample the phase history associated with all the scatterers. If the distance between measurements is too large, the Nyquist criterion is not fulfilled and artifacts may appear in the reconstructed image.

It must also be considered that signal propagation in dielectric materials (ground, wall) shortens the wavelength, so that the sampling requirements become even more stringent [44]. A prior estimate of the dielectric permittivity of the propagation medium may further optimize the acquisition geometry and the imaging process. Figure 19 shows an example of an image obtained with the robot arm using some reference objects inside a plastic suitcase. The trajectory followed by the sensor was planned considering the constraints mentioned above to obtain unaliased high-resolution images of the total area of interest.
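The effect of the permittivity on the sampling requirement can be estimated with a simple rule of thumb. The sketch assumes the conservative quarter-wavelength spacing often used for monostatic synthetic apertures and a non-magnetic, lossless medium. Neither the criterion nor the example values (10 GHz upper band edge, relative permittivity of 4) are taken from the chapter.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def max_sample_spacing(f_max_hz, rel_permittivity=1.0):
    """Quarter-wavelength spacing at the highest frequency inside the medium."""
    wavelength = SPEED_OF_LIGHT / (f_max_hz * math.sqrt(rel_permittivity))
    return wavelength / 4.0


spacing_air = max_sample_spacing(10e9)                           # about 7.5 mm
spacing_medium = max_sample_spacing(10e9, rel_permittivity=4.0)  # half of that
```

A relative permittivity of 4 halves the wavelength and therefore also halves the admissible distance between measurement positions.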

Figure 19.

Image of objects inside a suitcase using the robot arm.


7. Conclusions

In this article, a three-layered cognitive radar-architecture based on the Rasmussen model was presented. Several examples illustrated technologies to implement the cognitive subfunctions in a radar system.

For the skill-based layer, an approach for matching a waveform to the target transfer function was shown. In addition, spectrum sensing methods can be used to adapt the transmit signal to the electromagnetic environment. Rule-based behavior can be implemented using Markov-decision processes (MDPs) to compute optimal illumination policies. For a shared-aperture multifunctional radar, radar resource management approaches are required to schedule the radar timeline. For knowledge-based behavior, an example of sensor-controlled trajectory generation for a robotic arm was presented.

The different layers of the architecture encompass a broad range of time-scales and levels of abstraction. The full potential is achieved if all layers interact consistently. This interaction and further experimental validation of the approach are currently being investigated at FHR.


  • The training of neural networks is separated into epochs; in each epoch the complete dataset is presented once to the classifier [29].

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Stefan Brüggenwirth, Marcel Warnke, Christian Bräu, Simon Wagner, Tobias Müller, Pascal Marquardt and Fernando Rial (May 16th 2018). Sense Smart, Not Hard: A Layered Cognitive Radar Architecture, Topics in Radar Signal Processing, Graham Weinberg, IntechOpen, DOI: 10.5772/intechopen.71365. Available from:
