1. Introduction
Cognitive radio (CR) is an emerging technology (Mitola & Maguire, 1999, Mitola, 2001) that has been proposed to give radios the intelligence to automatically sense, recognize, and make wise use of any available radio frequency spectrum. With the increasing demand for wireless applications, access to available spectrum is becoming increasingly difficult. On the other hand, most licensed spectrum goes unused most of the time according to the FCC's Spectrum Policy Task Force Report (FCC, 2003). To resolve this mismatch, cognitive radio is proposed for sharing the licensed spectrum with unlicensed users without harmful interference to licensed systems.
Spectrum sensing is a key element of a cognitive radio system: it enables the cognitive radio to share the spectrum in licensed bands by detecting temporarily unused spectral resources (Haykin, 2005, Ghasemi & Sousa, 2005). Recently, several spectrum sensing techniques have been explored for cognitive radios, such as matched filter detection, energy detection and cyclostationary feature detection (Akyildiz et al., 2006). Many signals used in communication systems exhibit periodicities in their second-order statistical parameters due to operations such as sampling, modulating, multiplexing and coding. These cyclostationary properties, which are known as spectral correlation features, can be used for spectrum sensing. Moreover, spectrum sensing cannot be restricted to simply monitoring the power in some frequency bands of interest but must include detection and identification in order to avoid interference (Fehske et al., 2005). Cyclostationary feature detection is therefore a good solution for primary user signal detection and recognition.
In this chapter, several well-known spectrum sensing techniques are reviewed first. A survey of signal detection and classification for cognitive radios combining spectral correlation analysis and the support vector machine (SVM) is given in Section 3. Several spectral coherence characteristic parameters that are sensitive to the modulation type and insensitive to SNR variation are chosen via spectral correlation analysis. To improve the performance of the SVM, an alignment-based kernel selection method is proposed in Section 4, which is used to choose the best kernel function for the SVM with spectral coherence characteristic training samples. A simple cross-validation method is also introduced to choose the most appropriate kernel parameters and penalty parameter for the SVM. The performance of the proposed approach is analyzed over both a Gaussian channel and the IEEE 802.22 WRAN channel in Section 5. Compared to existing methods, including classifiers based on a binary decision tree (BDT) and a multilayer linear perceptron network (MLPN), the proposed approach is more effective at low SNR and with a limited number of training samples.
2. Overview of Spectrum Sensing in Cognitive Radio System
Interference due to a cognitive radio network is deemed harmful if it causes the signal-to-interference ratio (SIR) at any primary receiver to fall below a certain threshold, which is supplied by the regulatory bodies. This threshold depends on the receiver's robustness toward interference and varies from one primary band or service to another. In order to achieve spectrum sharing, the sensitivity of the cognitive radio should be higher than that of the primary receiver. For example, in order to share the radio spectrum between cognitive users and TV users in the IEEE 802.22 standard, the sensitivity of the cognitive receiver should exceed that of the TV receiver so that the hidden terminal problem is avoided. To improve the sensing accuracy, an additional margin of 30 dB to 40 dB should be added to the detection threshold. Moreover, due to the dynamic character of the radio environment, the differences between primary users and the unknown influence of interference, spectrum sensing has become a challenging problem in cognitive radio.
Generally, spectrum sensing techniques can be classified as transmitter detection, cooperative detection, and interference-based detection, as shown in Fig. 1. In this chapter, we focus on transmitter detection, which is commonly used in practical systems. The transmitter detection approach is based on the detection of the weak signal from a primary transmitter. To achieve dynamic spectrum sharing, the cognitive radio should be able to determine whether a signal from a primary user is locally present in a certain spectrum band. The basic hypothesis model for transmitter detection can be defined as follows

$$x(t) = \begin{cases} n(t), & H_0 \\ h\,s(t) + n(t), & H_1 \end{cases} \qquad (1)$$

where x(t) is the signal received by the cognitive user, s(t) is the transmitted signal of the primary user, n(t) is the additive white Gaussian noise and h is the amplitude gain of the channel. H0 is the null hypothesis, which states that there is no licensed user signal in a certain spectrum band. H1 is the alternative hypothesis, which indicates that some licensed user signal is present.
Three approaches exist for transmitter detection, based on the sensing users' knowledge on the transmitted signals, which are illustrated in Fig. 1.
2.1. Matched Filter Detection
When the information of the primary user signal is known to the cognitive radio, the matched filter is the optimal linear filter for maximizing the signal-to-noise ratio (SNR) in the presence of additive Gaussian noise (Sahai et al., 2004, Akyildiz et al., 2006). In a cognitive radio system, the matched filter output is obtained by correlating a known signal, or primary user signal template, with an unknown signal to detect the presence of the primary user signal in the unknown signal. This is equivalent to convolving the unknown signal with a time-reversed version of the template. According to the hypothesis model for transmitter detection, the received signal x(n) can be expressed as
where s(n) and
Equation (3) can also be expressed as
where P is the power of the pilot signal and σ² is the noise variance. Then the decision function can be given by
where the threshold
where
Combining Equation (6) and Equation (7), we can obtain N by eliminating
According to Equation (9), O(1/SNR) samples are required to meet a probability of error constraint, so the main advantage of the matched filter is that it requires less time to achieve high processing gain. However, as a coherent detection method, matched filter detection requires a priori knowledge of the primary user signal such as the modulation type and order, the pulse shape, and the packet format. Although most of this a priori knowledge can be obtained from the pilots, preambles, synchronization words or spreading codes of the primary user network, an obvious shortcoming is that the cognitive radio user requires a dedicated receiver for each type of primary user signal.
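The correlation step described in this subsection can be sketched in a few lines. This is a minimal illustration rather than the chapter's detector: the function names, the ±1 pilot, the unit noise variance and the threshold at half the pilot energy are all assumptions chosen for the example.

```python
import random

def matched_filter_statistic(received, template):
    """Correlate the received samples with the known primary-user template."""
    return sum(x * s for x, s in zip(received, template))

def primary_user_present(received, template, threshold):
    """Decide H1 (signal present) when the correlation exceeds the threshold."""
    return matched_filter_statistic(received, template) > threshold

# Illustrative setup (all values are assumptions): a +/-1 pilot of 256 samples
# in unit-variance Gaussian noise, threshold at half the pilot energy.
rng = random.Random(0)
pilot = [rng.choice((-1.0, 1.0)) for _ in range(256)]
noise_only = [rng.gauss(0.0, 1.0) for _ in pilot]
pilot_plus_noise = [s + rng.gauss(0.0, 1.0) for s in pilot]
threshold = 0.5 * sum(s * s for s in pilot)

print(primary_user_present(noise_only, pilot, threshold))
print(primary_user_present(pilot_plus_noise, pilot, threshold))
```

Because the statistic grows with the pilot energy while its noise-induced spread grows only with the square root of the number of samples, the two hypotheses separate quickly, which is the processing-gain argument made above. The sketch also makes the drawback visible: the detector is useless without the exact template.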
2.2. Energy Detection
If the receiver cannot gather sufficient information about the primary user signal, for example, if only the power of the random Gaussian noise is known to the receiver, the optimal detector is an energy detector (Akyildiz et al., 2006, Digham et al., 2003). In order to measure the energy of the received signal, the output signal of a bandpass filter with bandwidth W is squared and integrated over the observation interval T. Finally, the output of the integrator, x(t), is compared with a threshold, y, to decide whether a licensed user is present or not.
Fig. 2. The principle of energy detection.
According to the basic hypothesis model for transmitter detection, the probability of false alarm PFA and the probability of detection PD are given as follows
Since it is easy to implement, recent work on detection of the primary user has generally adopted energy detection. Although this method can be implemented without any prior knowledge of the primary user signal, it also has some drawbacks. Since energy detection is non-coherent, O(1/SNR^2) samples are required to meet a probability of error constraint. Moreover, the threshold selection for energy detection is highly susceptible to uncertainty in background noise and interference, and the detector can only determine the presence of a signal without differentiating signal types. Its main advantages are simplicity and low complexity.
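The square-and-integrate structure of the energy detector can be illustrated with a discrete-time sketch. The bandpass filter is omitted, and the sample count, noise variance, signal model and threshold below are assumptions chosen for the example, not values from the chapter.

```python
import random

def energy_statistic(samples):
    """Sum of squared samples: the discrete counterpart of square-and-integrate."""
    return sum(x * x for x in samples)

def energy_detect(samples, threshold):
    """Decide H1 (licensed user present) when the measured energy exceeds the threshold."""
    return energy_statistic(samples) > threshold

# Illustrative setup (assumed values): 1000 samples, unit-variance noise,
# a unit-power +/-1 signal (0 dB SNR), threshold midway between the two means.
rng = random.Random(1)
n_samples = 1000
noise_only = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
signal_plus_noise = [rng.choice((-1.0, 1.0)) + rng.gauss(0.0, 1.0)
                     for _ in range(n_samples)]
threshold = 1.5 * n_samples  # depends directly on the assumed noise variance

print(energy_detect(noise_only, threshold))
print(energy_detect(signal_plus_noise, threshold))
```

The threshold is expressed in units of the noise variance, so any error in the assumed noise power shifts it directly. That is the noise-uncertainty weakness noted above, and the statistic carries no information about which kind of signal produced the energy.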
2.3. Cyclostationary Feature Detection
The cyclostationary feature is intentionally embedded in the physical properties of a communication signal, and it may be easily generated, manipulated, detected and analyzed using low-complexity transceiver architectures. This feature is present in all transmitted signals, requires little signalling overhead and may be detected using short signal observation times, and thus it can be used for primary user signal detection and recognition. Recent research efforts exploit the cyclostationary features of signals via spectral correlation analysis as a method for spectrum sensing (Digham et al., 2003, Sahai et al., 2004, Ghasemi & Sousa, 2005), which has been found to be superior to simple energy detection and matched filtering. Energy detection can only detect whether or not a signal is present, and a matched filter requires extensive knowledge about the channel and the signals that are to be identified. Cyclostationary feature detection, which is not susceptible to in-band interference, can be used to detect and classify different types of signal.
Conventional signal classification approaches are mainly based on decision theory (Polydoros & Kim, 1990, Sapiano & Martin, 1996, Sills, 1999, Wei & Mendel, 2000, Hang et al., 2001) and statistical pattern recognition (Nandi & Azzouz, 1995, Azzouz & Nandi, 1995, Azzouz & Nandi, 1996, Nandi & Azzouz, 1998). In the work by Hang (Hang et al., 2001), a binary decision tree is designed for signal classification. However, it is difficult to obtain the decision thresholds and rules, which requires a large amount of calculation. For more efficient and reliable performance, an approach based on a multilayer linear perceptron network for signal classification in cognitive radio was studied by Fehske (Fehske et al., 2005). Support vector machine (SVM) is a statistical pattern recognition approach based on the structural risk minimization principle (Vapnik, 1995). Compared with conventional methods based on empirical risk minimization, such as the artificial neural network (ANN), it has been found to give better generalization and better performance with few training examples. In the next section, a novel approach to signal classification for cognitive radios combining the cyclostationary features and SVM is proposed.
3. Spectrum Sensing Based on Spectral Correlation Analysis and SVM
3.1. System Framework
Our signal classification scheme combining spectral correlation analysis and SVM is based on statistical pattern recognition, which mainly consists of three modules (Han, 2003): feature extraction, classifier design and classification decision. Feature extraction is typically the first stage in any classification system, and in our spectrum sensing system in particular. Given the signal set to be classified, the feature parameters of the different classes of signal and the rules for the classifier should be determined first. In order to achieve better classification performance, the selected feature parameters should be insensitive to SNR variation; a proper classifier is then designed for the specific classification problem using training data with known signal types. When the error probability of the classifier reaches a specified threshold, the classifier can be used for signal classification and recognition. In our scheme, three procedures are adopted for primary user signal recognition:
1) Pre-processing procedure:
Several feature parameters are first extracted via spectral correlation analysis. The feature parameters insensitive to SNR variation are selected as the feature vector.
2) Training and learning procedure:
The SVM classifier is trained using the selected feature parameters in the training set. Because the training of the nonlinear SVM, which accounts for most of the calculation, is performed offline, the online computational complexity is reduced. The optimal classification plane of the SVM is obtained in this procedure via training and learning.
3) Test procedure:
The selected feature parameters extracted from the received signals are fed into the well-trained SVM classifier for primary user signal detection and recognition.
The framework of our scheme combining spectral correlation analysis and SVM is shown in Fig. 3.
Although SVM is a better choice for the classifier, the selection of feature parameters has a direct impact on the performance of the classification algorithm. In the next section, we discuss the first step: how to choose the spectral coherence characteristic parameters for our scheme.
3.2. Spectral Correlation Analysis
Many signals used in communication systems exhibit periodicities of their second order statistical parameters due to the operations such as sampling, modulating, multiplexing and coding. These cyclostationary properties, which are named as spectral correlation features, can be used for signal detection and recognition (Gardner, 1987).
In order to analyze the cyclostationary features of the signal x(t), two key functions are typically utilized. The cyclic autocorrelation function (CAF) is used for time domain analysis, which can be expressed as
The spectral correlation function (SCF), which exhibits the spectral correlation of the signal x(t), is obtained from the Fourier transform of the cyclic autocorrelation in Equation (12) (Gardner & Franks, 1975).
Where
where the Fourier transform of the function x(u) on the bounded time interval [t - T/2, t + T/2] is defined as
The correlation coefficient for the SCF between frequency components
The magnitude of the SCC ranges from 0 to 1, and it equals 1 at α = 0 for all f. Different signal classes (e.g. AM, ASK, FSK, PSK, MSK, QPSK) can be distinguished based on several characteristic parameters of the SCF and SCC.
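For reference, the quantities named in this subsection are usually written as follows in Gardner's cyclostationary framework. The displayed equations did not survive in this copy, so the symbols below ($R_x^{\alpha}$, $S_x^{\alpha}$, $X_T$, $C_x^{\alpha}$) are assumed standard notation rather than the chapter's exact typography, and the equation numbers are omitted for the same reason.

```latex
% Cyclic autocorrelation function (CAF) at cycle frequency \alpha
R_x^{\alpha}(\tau) = \lim_{T \to \infty} \frac{1}{T}
  \int_{-T/2}^{T/2} x\!\left(t + \tfrac{\tau}{2}\right)
  x^{*}\!\left(t - \tfrac{\tau}{2}\right) e^{-j 2 \pi \alpha t} \, dt

% Spectral correlation function (SCF): Fourier transform of the CAF
S_x^{\alpha}(f) = \int_{-\infty}^{\infty} R_x^{\alpha}(\tau) \, e^{-j 2 \pi f \tau} \, d\tau

% Equivalent form as a limit of time-averaged products of short-time spectra
S_x^{\alpha}(f) = \lim_{T \to \infty} \lim_{\Delta t \to \infty}
  \frac{1}{\Delta t} \int_{-\Delta t/2}^{\Delta t/2} \frac{1}{T}
  X_T\!\left(t, f + \tfrac{\alpha}{2}\right)
  X_T^{*}\!\left(t, f - \tfrac{\alpha}{2}\right) dt

% Short-time Fourier transform on [t - T/2, t + T/2]
X_T(t, f) = \int_{t - T/2}^{t + T/2} x(u) \, e^{-j 2 \pi f u} \, du

% Spectral coherence coefficient (SCC)
C_x^{\alpha}(f) = \frac{S_x^{\alpha}(f)}
  {\sqrt{S_x^{0}\!\left(f + \tfrac{\alpha}{2}\right) S_x^{0}\!\left(f - \tfrac{\alpha}{2}\right)}}
```

With this normalization the SCC at α = 0 reduces to the ratio of the power spectral density to itself, which is why its magnitude is bounded by 1.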
In practical situations, however, the number of observation samples at the sensor is limited. Therefore, the spectral correlation function needs to be estimated from a finite set of samples. In general, two methods are used for spectral correlation estimation including time-domain averaging and frequency-domain smoothing (Gardner & Spooner, 1988). In this section, the frequency-smoothing method is used for spectral correlation estimation, which can be expressed as follows.
where
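To make the idea of estimating cyclic statistics from a finite record concrete, the sketch below computes a simple time-domain estimate of the cyclic autocorrelation. It is not the frequency-smoothing SCF estimator used in this section: it is a plainer, assumed stand-in that uses an asymmetric lag product, with the cycle frequency expressed in cycles per sample, and the function name is hypothetical.

```python
import cmath
import math

def cyclic_autocorrelation(x, alpha, lag):
    """Finite-sample estimate of the cyclic autocorrelation at cycle frequency
    alpha (cycles per sample) and a non-negative integer lag.
    Uses the asymmetric product x[n + lag] * conj(x[n])."""
    n_terms = len(x) - lag
    acc = 0j
    for n in range(n_terms):
        product = x[n + lag] * complex(x[n]).conjugate()
        acc += product * cmath.exp(-2j * math.pi * alpha * n)
    return acc / n_terms

# A real tone at 0.1 cycles/sample has a cyclic feature at alpha = 2 * 0.1.
tone = [math.cos(2.0 * math.pi * 0.1 * n) for n in range(1000)]

print(abs(cyclic_autocorrelation(tone, 0.2, 0)))   # feature at twice the carrier
print(abs(cyclic_autocorrelation(tone, 0.13, 0)))  # no feature at this alpha
```

Stationary noise contributes only at α = 0 in expectation, which is why features located at non-zero cycle frequencies are attractive at low SNR.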
Even visually in the above figures, the spectral correlation functions of the different modulation types possess distinct characteristics. It is this fact that allows the successful application of pattern recognition techniques to primary user signal detection and recognition. To make the proposed algorithm more robust, features that are less sensitive to SNR should be chosen for the classifier. Assume that the received signal is ŝ = s + n, where s and n are the transmitted signal and the additive white Gaussian noise, respectively. A feature parameter of the received signal gives better classification performance, insensitive to SNR variation, if it satisfies
where X, x, n are the feature vectors of ŝ, s, n, respectively.
Based on the calculation of the spectral correlation function, we can obtain the spectral correlation magnitude surface of different types of signal. According to the above analysis, several spectral correlation features can be extracted to distinguish different modulation types. Typically, four key features are used (Table 1).

Table 1. Typical values of spectral correlation features.
The four key features x1, x2, x3, x4 are described as follows.
• x1: Number of δ pulses on the f-domain of the SCF
Let α = 0 in Equation (17); the SCF then reduces to $S_x^0(f)$, and x1 can be obtained from the plan view of $S_x^0(f)$.
• x2: Number of cyclic spectral lines on the α-domain of the SCF
Let f = 0 in Equation (17); the SCF then reduces to $S_x^{\alpha}(0)$, and x2 can be obtained from the plan view of $S_x^{\alpha}(0)$.
• x3: Average energy of the cyclic spectral lines on the α-domain of the SCF
The average energy of the cyclic spectral lines on the α-domain of the SCF can be computed as follows:
• x4: Maximum value of the SCC
The spectral coherence coefficient can be obtained via Equation (16) and Equation (17). The maximum value of the SCC is then taken as the key feature x4.
In the simulation experiments, 4000 features are extracted from the signal for every trial. In order to prevent numerical errors, the features are normalized by subtracting the mean of each feature from the original feature and dividing the result by the standard deviation of the same feature.
After normalization, the feature vector
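The normalization described above is the usual zero-mean, unit-variance scaling applied to each feature separately. A minimal sketch follows; the function name is hypothetical, the use of the population standard deviation is an assumption (the text does not say which estimator was used), and constant features are mapped to zero to avoid division by zero.

```python
import statistics

def normalize_features(feature_rows):
    """Scale each feature (column) to zero mean and unit standard deviation.
    feature_rows is a list of feature vectors, one per observation."""
    columns = list(zip(*feature_rows))
    means = [statistics.fmean(col) for col in columns]
    # Assumption: population standard deviation; a constant feature maps to 0.
    stds = [statistics.pstdev(col) for col in columns]
    return [
        [(value - mean) / std if std > 0 else 0.0
         for value, mean, std in zip(row, means, stds)]
        for row in feature_rows
    ]

# Two observations of two features on very different scales.
print(normalize_features([[1, 10], [3, 30]]))  # [[-1.0, -1.0], [1.0, 1.0]]
```

Scaling matters for the RBF kernel in particular, because a feature with a large numerical range would otherwise dominate the Euclidean distance inside the kernel.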
3.3. Support Vector Machine
Traditional statistical theory is primarily based on asymptotic principles, which provide conclusions only for the situation where the sample size tends to infinity. However, in most practical applications the samples are limited, so it is difficult to achieve the desired results with existing methods. Statistical learning theory, developed by Vapnik (Vapnik, 1995), is a statistical theory for small-sample problems. Compared to conventional statistical theory, it mainly concerns the statistical principles that hold when samples are limited, especially the properties of the learning procedure in such cases. Statistical learning theory provides a new framework for the general learning problem, which not only considers the asymptotic performance but also seeks the optimal result under limited information.
In order to study the generalization performance and the speed of uniform convergence, a series of indicators used to evaluate the learning performance of function sets are defined in statistical learning theory. One of the most important concepts is the Vapnik-Chervonenkis (VC) dimension, which was originally defined by Vladimir Vapnik and Alexey Chervonenkis in 1971. The VC dimension is a measure of the complexity of a learning machine, or the capacity of a statistical learning algorithm, defined as the cardinality of the largest set of points that the algorithm can shatter. The greater the VC dimension, the more complex the learning machine. Statistical learning theory provides a strategy that balances the empirical risk and the confidence interval. A nested sequence of subsets is chosen from the given set of functions according to the size of the VC dimension. Within each subset the empirical risk is minimized, and the subset for which the sum of the minimal empirical risk and the confidence interval is smallest is selected, as illustrated in Fig. 5. This method is named structural risk minimization (SRM), a term also coined by Vapnik and Chervonenkis in 1974.
The principle of SRM is to provide a method to reach the trade-off between hypothesis space complexity (the VC dimension of approximating functions) and the quality of fitting the training data (Wang, 2007). The procedure is described in detail as follows.
Assumed a function set
The elements of the above structure have two properties as follows.
(1) The VC dimension of each subset hk is limited and satisfies
(2) Any element in the structure Sk contains a set of totally bounded function
or contains a function set which satisfies the following inequality for some (p, t k)
For a given set of observation set
Statistical learning theory also gives the conditions required for a reasonable structure of function subsets and the convergence property of the actual risk under the SRM principle. The actual risk is bounded by the sum of the empirical risk and the confidence interval. As the index of the elements in the structure increases, the empirical risk decreases while the confidence interval grows. The smallest upper bound of the actual risk is attained at a certain element of the structure.
Support vector machine is a universal learning machine, which is widely used in the fields of pattern recognition, regression estimation and probability density estimation. It is based on VC-dimension theory and the SRM principle, and achieves better generalization performance by reaching a trade-off between the model complexity given limited sample data and the capacity of the learning algorithm. The support vector machine was introduced by Vapnik in the late 1960s on the foundation of statistical learning theory. It was originally developed for the binary classification problem. The optimal solution of SVM for the linearly separable case was introduced by Vapnik, and it was later extended to non-separable cases. In previous research, a common solution to the classification problem of communication signals is the artificial neural network (ANN), such as the multilayer linear perceptron network. Since the first preliminary studies, SVMs have shown remarkable efficiency, especially when compared with traditional artificial neural networks. The main advantage of SVM over ANN lies in the structure of the learning algorithm, which is characterized by the solution of a constrained quadratic programming (CQP) problem, so that the drawback of local minima is completely avoided (Boser et al., 1992, Cortes & Vapnik, 1995, Scholkopf, 1995). Since the classification of communication signals is clearly a linearly non-separable problem, we only discuss the computation for this case in this chapter. Given a linearly non-separable classification problem, suppose the training set is {(xi, yi)}, where
where the vector
In order to solve this non-separable problem, the non-negative slack variables
The hyperplane that minimizes Φ(w) = ||w||²/2 is called the optimal hyperplane. It classifies all of the training vectors correctly and separates the vectors of each class with maximum margin (Burges, 1997).
Usually, constructing the hyperplane is posed as a quadratic optimization problem, which can be formulated as
where C is the penalty parameter, which is used to control the training error rate by different values.
Using a Lagrange multiplier technique, the optimization problem can be converted into
where αi, βi ≥ 0 are the Lagrange multipliers.
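For orientation, the soft-margin problem and its Lagrangian are commonly written as below. This is the textbook form (Cortes & Vapnik, 1995) and is offered as a sketch: the symbols $\xi_i$ for the slack variables and $\alpha_i$, $\beta_i$ for the multipliers are assumed to match the chapter's lost displays, and no equation numbers are claimed.

```latex
% Soft-margin primal problem with penalty parameter C
\min_{w,\, b,\, \xi} \quad \frac{1}{2} \lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\qquad \text{s.t.} \quad y_i \left( w \cdot x_i + b \right) \ge 1 - \xi_i,
\quad \xi_i \ge 0, \quad i = 1, \dots, n

% Lagrangian with multipliers \alpha_i \ge 0 and \beta_i \ge 0
L(w, b, \xi, \alpha, \beta) = \frac{1}{2} \lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
  - \sum_{i=1}^{n} \alpha_i \left[ y_i \left( w \cdot x_i + b \right) - 1 + \xi_i \right]
  - \sum_{i=1}^{n} \beta_i \xi_i
```

A larger C penalizes margin violations more heavily and drives the training error down at the cost of a narrower margin, which is the trade-off that C is said to control above.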
Given a linearly non-separable classification problem, we can map the input data into a high-dimensional feature space through some non-linear transformation Φ that makes the data linearly separable (Devroye, 1996). Note that in the above solution to the linearly non-separable classification problem, only inner products of the training samples are involved in the decision function. When constructing the high-dimensional feature space, the algorithm therefore uses only the inner product Φ(xi)·Φ(x) in that space, never Φ(x) or Φ(xi) separately. It is then sufficient to find a function K satisfying the Mercer condition (Daniel & James, 2000), which can be denoted as

$$K(x_i, x) = \Phi(x_i) \cdot \Phi(x)$$

where K(xi, x) is the kernel function, which evaluates the inner product in the higher-dimensional space directly from the input data and thereby reduces the computational load. Different kernel functions, such as the polynomial, sigmoid and radial basis function (RBF) kernels, are used in SVM; they are defined as follows.
1. Polynomial Kernel
A k-order polynomial classifier can be defined by Equation (30).
2. Radial Basis Function (RBF) Kernel
The width parameter σ of the RBF kernel can in general be determined by an iterative process that selects an optimum value based on the full feature set. The main difference between the RBF classifier and the traditional RBF method is that each basis function in the RBF classifier corresponds to a support vector, which is automatically identified by the algorithm, so the drawback of local minima is completely avoided.
3. Sigmoid Kernel
This kernel uses a sigmoid function as the inner product, which is equivalent to a multilayer perceptron with only one hidden layer. The number of nodes in the hidden layer is automatically determined by the algorithm.
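The three kernels can be written compactly as below. The displayed definitions are missing from this copy, so the exact parameterizations are assumptions: the forms used here are the common textbook ones, with polynomial order k, RBF width sigma, and sigmoid scale and offset named `scale` and `offset` for illustration.

```python
import math

def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, order):
    """Assumed form: K(x, y) = (x . y + 1) ** order."""
    return (dot(x, y) + 1.0) ** order

def rbf_kernel(x, y, sigma):
    """Assumed form: K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def sigmoid_kernel(x, y, scale, offset):
    """Assumed form: K(x, y) = tanh(scale * (x . y) + offset)."""
    return math.tanh(scale * dot(x, y) + offset)

print(polynomial_kernel([1.0, 2.0], [3.0, 4.0], 2))  # (11 + 1) ** 2 = 144.0
print(rbf_kernel([0.0, 0.0], [0.0, 0.0], 1.0))       # identical points give 1.0
```

Each function returns the feature-space inner product without ever forming the mapped vectors, which is the computational saving attributed to the kernel function above. Note that the sigmoid form satisfies the Mercer condition only for some choices of its two parameters.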
Until now, the kernel function has usually been chosen empirically, which is a theoretical drawback of SVM. The proper kernel function for a specific problem depends on the specific training data. In practical applications, how to choose a model with good generalization ability from the training sample set is currently a research direction in the field of SVM. For the signal classification problem using cyclostationary features, we use an improved method of model selection based on kernel alignment, which is described in detail in Section 4.1. The choice of the kernel function is studied via computer simulations, and the best results are achieved using the radial basis function (RBF) kernel. A typical classification experiment using an RBF kernel based SVM is illustrated as follows
After choosing the best kernel function, the dual representation of the optimization problem can be obtained by computing the derivatives with respect to w, b,
The resulting decision function is obtained as follows:
The architecture of the SVM classifier combining spectral correlation analysis is shown in Fig. 7.
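Once training has produced the multipliers and the bias, classifying a new feature vector only requires kernel evaluations against the support vectors. The sketch below assumes the usual decision rule, the sign of the kernel-weighted sum of support-vector labels plus the bias; the function names and the toy support vectors, multipliers and bias are illustrative and are not results from the chapter.

```python
import math

def rbf_kernel(x, y, sigma):
    """Assumed RBF form: exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def svm_decision(x, support_vectors, labels, alphas, bias, kernel):
    """Return +1 or -1 from sign(sum_i alpha_i * y_i * K(x_i, x) + bias)."""
    score = sum(alpha * y * kernel(sv, x)
                for sv, y, alpha in zip(support_vectors, labels, alphas))
    return 1 if score + bias >= 0 else -1

# Toy model (assumed values): one support vector per class on a line.
support_vectors = [(-1.0,), (1.0,)]
labels = [-1, 1]
alphas = [1.0, 1.0]

def kernel(u, v):
    return rbf_kernel(u, v, 1.0)

print(svm_decision((0.8,), support_vectors, labels, alphas, 0.0, kernel))
print(svm_decision((-0.5,), support_vectors, labels, alphas, 0.0, kernel))
```

This rule is binary, so recognizing the six modulation classes requires combining several such classifiers, for example one per class or one per pair of classes; the sketch makes no claim about which combination the scheme uses.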
4. Performance Evaluation and Analysis
4.1. Kernel Function and Parameters Selection
According to the definition of the kernel function in the previous section, the kernel matrix can be defined as follows (Lanckriet et al., 2002)

$$K = \left[ K(x_i, x_j) \right]_{i,j=1}^{n}$$

where n is the number of the samples. It is a symmetric positive definite matrix, and since it specifies the inner products between all pairs of input elements, it completely determines the relative positions between those points in the embedding space.
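In this section, we use an improved method of kernel selection based on kernel alignment to choose a proper kernel function for our scheme (Cristianini et al., 2002). Assume that K1 and K2 are the kernel matrices of the kernel functions k1 and k2, respectively. The (empirical) alignment of a kernel k1 with a kernel k2 with respect to the sample S is the quantity defined by

$$A(S, k_1, k_2) = \frac{\langle K_1, K_2 \rangle_F}{\sqrt{\langle K_1, K_1 \rangle_F \, \langle K_2, K_2 \rangle_F}}$$

where $\langle K_1, K_2 \rangle_F = \sum_{i,j=1}^{n} K_1(x_i, x_j) K_2(x_i, x_j)$ is the Frobenius inner product of the two kernel matrices.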
Given a sample set
This kernel selection method is based on the important assumption that a kernel function performs better if the alignment between its kernel matrix and the alignment matrix is higher. Thus, if we consider K = K1 and Kad = YY^T = K2, then
According to the above derivation, the optimal kernel function problem can be transformed into a kernel alignment maximization problem. In this section, the kernel alignment values of different kernel functions are compared via computer simulation in MATLAB 7.0. For the simulations, we define the signal set as {AM, ASK, FSK, PSK, MSK, QPSK}. To obtain the kernel alignment at different SNR, simulations are carried out with 1024 samples at SNR ranging from 0 dB to 20 dB. Simulation results show that the kernel alignment of the RBF kernel is greater than that of the other kernels, as shown in Fig. 8. According to the simulation results, we choose the RBF kernel as the kernel function of the SVM in our scheme. After the kernel function is selected, two key parameters of the SVM should be considered. The first parameter, the penalty parameter C of the SVM, adjusts the range of the confidence interval and thereby controls the training error rate. The second one, the width σ of the RBF kernel, controls the classification error by changing the largest VC dimension of the linear classification plane. Therefore, these two parameters have a great impact on the classification performance (Chapelle et al., 2002). In this section, we use a simple cross-validation method to search for the best parameters (C, σ).
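The kernel alignment criterion used above for kernel selection can be sketched as follows, using the empirical alignment of Cristianini et al. between a kernel matrix and the label matrix built from a label vector. The sketch assumes binary labels in {+1, -1} and an RBF kernel; the function names and the four-point toy sample are illustrative, and the chapter's six-class simulation setup is not reproduced.

```python
import math

def rbf_kernel(x, y, sigma):
    """Assumed RBF form: exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def gram_matrix(samples, kernel):
    """Kernel matrix with entries K(x_i, x_j)."""
    return [[kernel(a, b) for b in samples] for a in samples]

def label_matrix(labels):
    """Outer product of the label vector with itself (assumes +/-1 labels)."""
    return [[yi * yj for yj in labels] for yi in labels]

def frobenius_inner(m1, m2):
    """Sum of element-wise products of two equally sized matrices."""
    return sum(a * b for row1, row2 in zip(m1, m2) for a, b in zip(row1, row2))

def kernel_alignment(k1, k2):
    """Empirical alignment <K1, K2>_F / sqrt(<K1, K1>_F * <K2, K2>_F)."""
    return frobenius_inner(k1, k2) / math.sqrt(
        frobenius_inner(k1, k1) * frobenius_inner(k2, k2))

# Toy sample: two tight, well-separated clusters, one per class.
samples = [(0.0,), (0.1,), (5.0,), (5.1,)]
labels = [1, 1, -1, -1]
target = label_matrix(labels)
k_rbf = gram_matrix(samples, lambda u, v: rbf_kernel(u, v, 1.0))

print(kernel_alignment(k_rbf, target))
```

Comparing this score across candidate kernels and parameters on the same training sample, and keeping the largest, is the selection rule described in the text. It needs no classifier training, which makes it cheap relative to cross-validation.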
In n-fold cross-validation, we first divide the training set into n subsets of equal size. Sequentially one subset is tested using the classifier trained on the remaining n-1 subsets. Thus, each instance of the whole training set is predicted once so the cross-validation accuracy is the percentage of data that are correctly classified. The process of cross-validation is described as follows.
1. The training set is divided into n subsets
the subset are
to 10.
After training each
Repeat step 3 until the number of
After performing cross-validation method for our scheme by MATLAB, we can obtain the best kernel parameters
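The n-fold procedure and the search over (C, σ) can be sketched as below. Several parts are assumptions made to keep the example self-contained: the folds are contiguous blocks, the candidate grids are arbitrary, and the classifier is a hypothetical stand-in (an RBF-weighted vote of the training labels) rather than a trained SVM, so the parameter `c` is accepted but unused.

```python
import math

def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, n_folds)
    folds, start = [], 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_accuracy(samples, labels, n_folds, train_fn):
    """Fraction of samples classified correctly when each fold is held out once."""
    correct = 0
    for held_out in k_fold_indices(len(samples), n_folds):
        held = set(held_out)
        train_x = [s for i, s in enumerate(samples) if i not in held]
        train_y = [y for i, y in enumerate(labels) if i not in held]
        predict = train_fn(train_x, train_y)
        correct += sum(predict(samples[i]) == labels[i] for i in held_out)
    return correct / len(samples)

def make_kernel_vote_trainer(c, sigma):
    """Hypothetical stand-in for SVM training; c is ignored by this stand-in."""
    def train(train_x, train_y):
        def predict(x):
            score = sum(
                y * math.exp(-sum((a - b) ** 2 for a, b in zip(xi, x))
                             / (2.0 * sigma ** 2))
                for xi, y in zip(train_x, train_y))
            return 1 if score >= 0 else -1
        return predict
    return train

def grid_search(samples, labels, n_folds, c_values, sigma_values):
    """Return (accuracy, c, sigma) with the highest cross-validation accuracy."""
    best = None
    for c in c_values:
        for sigma in sigma_values:
            trainer = make_kernel_vote_trainer(c, sigma)
            accuracy = cross_val_accuracy(samples, labels, n_folds, trainer)
            if best is None or accuracy > best[0]:
                best = (accuracy, c, sigma)
    return best

# Interleaved toy data so that every contiguous fold contains both classes.
samples = [(0.0,), (3.0,), (0.2,), (3.2,), (0.4,),
           (2.8,), (-0.2,), (3.4,), (0.1,), (2.9,)]
labels = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
print(grid_search(samples, labels, 5, [1.0, 10.0], [0.5, 2.0]))
```

Replacing `make_kernel_vote_trainer` with a routine that actually trains an SVM for each (C, σ) pair turns this skeleton into the search described in the text. In practice the data should be shuffled before splitting unless, as here, the ordering already mixes the classes.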
4.2. Classifier Design
In order to compare the performance of different classifiers, two approaches based on existing methods, such as decision theory and artificial neural network, are introduced with spectral correlation features as training data.
4.2.1. Binary Decision Tree
Decision theory, which has been studied for decades in mathematics, statistics and communications, is concerned with identifying the values, uncertainties and other issues relevant to a given decision and the resulting optimal decision. In conventional decision theory, the binary decision tree (BDT) is a decision support tool that uses a graph or model of decisions mapping observations to a target value. Since it is simple and easy to understand, the binary decision tree is widely used in signal detection and recognition. From the value ranges of the different features in Table 1, it is easy to see that feature x1 can be used to classify the signals into three groups, which are {PSK, MSK, QPSK}, {FSK} and {AM, ASK}. Furthermore, features x2 and x4 can be used to distinguish {ASK, AM} and {MSK, QPSK}, respectively. Thus, a binary decision tree is designed based on spectral correlation features for primary user signal recognition. In the decision algorithm given in Fig. 9, we use feature x1 to recognize the FSK signal in the first layer, and then features x2 and x3 are used for the classification of the ASK and AM signals and the recognition of the PSK signal in the second layer. In the third layer, feature x4 is used to distinguish the MSK and QPSK signals.
4.2.2. Multilayer Linear Perceptron Network
Artificial neural networks have long been considered for pattern recognition and modulation classification and have proven to be robust to a variety of conditions such as interfering signals and noise. In order to compare the classifier performance of the artificial neural network and the support vector machine, a signal classification approach using spectral correlation and neural networks, which was proposed by A. Fehske (Fehske et al., 2005), is introduced below. Due to its simplicity, a multilayer linear perceptron network (MLPN) with 4 neurons in the hidden layer is used for each signal class, and each input layer takes the normalized spectral correlation feature vector x' = (x'1, x'2, x'3, x'4) as input. Each MLPN is trained with a back propagation algorithm (Gupta, 2003) with an initial learning rate η = 0.05 decreasing with each epoch, a momentum constant α = 0.7, and the activation function tanh(x). A typical gradient descent algorithm can be used to solve the linearly non-separable signal classification problem by minimizing the mean square error between the expected output and the actual output. The output of each MLPN is a continuous value in the range (-1, 1). The outputs of all the MLPNs are fed into a simple MAXNET for the final decision, which simply chooses the signal class whose MLPN outputs the largest value. The decision function of the MAXNET is defined by
The signal classification approach using spectral correlation and MLPN is shown in Fig. 10.
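The forward pass of one MLPN and the MAXNET decision can be sketched as follows. The network shape (four inputs, four tanh hidden neurons, one tanh output) follows the description above, while the function names, the weight layout and the example outputs are assumptions; training by back propagation is not shown.

```python
import math

SIGNAL_CLASSES = ("AM", "ASK", "FSK", "PSK", "MSK", "QPSK")

def mlpn_output(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of one per-class network; the result lies in (-1, 1)."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + bias)
        for row, bias in zip(hidden_weights, hidden_biases)
    ]
    return math.tanh(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)

def maxnet_decision(outputs, classes=SIGNAL_CLASSES):
    """Pick the class whose network produced the largest output."""
    winner = max(range(len(outputs)), key=lambda i: outputs[i])
    return classes[winner]

# Hypothetical outputs of the six per-class networks for one received frame.
example_outputs = [-0.9, -0.2, 0.7, 0.1, -0.5, 0.3]
print(maxnet_decision(example_outputs))  # FSK
```

Because each network is trained separately to respond to its own class, the MAXNET stage needs no parameters of its own.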
5. Simulation Results
In this section, a variety of Monte Carlo simulations are presented to illustrate the performance of the algorithm. In the simulations, we define the signal set as {AM, ASK, FSK, PSK, MSK, QPSK}. For each type of signal, N signal samples constitute one frame, which is used as an observation window to compare the performance of the algorithm with different data lengths. To distinguish the 6 modulation classes, simulations are carried out with 1100 frames at SNR ranging from 0 dB to 20 dB using the three classifiers developed. 100 frames are used as training samples, and the remaining 1000 frames are used to calculate the probability of correct classification of the different classifiers. The probability of correct classification (Pcc) can be defined by
where N is the number of simulations,
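The displayed definition of Pcc is missing from this copy. The sketch below assumes the natural reading, the number of correctly classified test frames divided by the total number of test frames, and adds a per-class breakdown of the kind reported in the result tables; both function names are hypothetical.

```python
from collections import Counter

def probability_of_correct_classification(true_labels, predicted_labels):
    """Assumed definition: correctly classified frames / total frames."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

def per_class_pcc(true_labels, predicted_labels):
    """Pcc computed separately for each true modulation type."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

truth = ["AM", "FSK", "PSK", "QPSK"]
decided = ["AM", "FSK", "MSK", "QPSK"]
print(probability_of_correct_classification(truth, decided))  # 0.75
```

With 1000 test frames, each frame contributes 0.1 percentage points to the overall rate.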
The radio channel models considered in the simulations include the Gaussian channel and a cognitive radio channel. In order to simulate the wireless environment of cognitive radios, WRAN channel model B recommended in the IEEE 802.22 standard is used as the cognitive radio channel (Sofer & Chouinard, 2005). The WRAN channel reference model B determined by the IEEE 802.22 standard group has a multi-path (6-path) delay profile, which is summarized in Table 2.
In the IEEE 802.22 WRAN system, cognitive radio technology is considered for sharing the licensed spectrum of digital TV, and the typical service coverage is from 33 kilometres to 100 kilometres. The reference channel model B for the IEEE 802.22 WRAN is derived from a scenario in which signals are transmitted between the fixed BS and the CPEs in a wireless broadband environment. In such a dynamic channel environment, the delay spread is large while the Doppler frequency is low. An example channel response for the nominal WRAN channel B is illustrated in Fig. 11.
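A multi-path channel of this kind, followed by additive Gaussian noise at a chosen SNR, can be approximated with a tapped delay line. The sketch below is an illustration under stated assumptions: the function name `multipath_awgn` is hypothetical, the taps are static (Doppler is ignored, which is only a rough match for a low-Doppler channel), and the six-path profile in `PLACEHOLDER_DELAYS_US` and `PLACEHOLDER_GAINS_DB` is made up for demonstration. It is not the profile of Table 2, which should be substituted when reproducing the simulations.

```python
import numpy as np

# Hypothetical six-path profile for illustration only (NOT the IEEE 802.22 values).
PLACEHOLDER_DELAYS_US = [0.0, 1.0, 2.0, 4.0, 7.0, 11.0]
PLACEHOLDER_GAINS_DB = [0.0, -3.0, -6.0, -9.0, -12.0, -15.0]


def multipath_awgn(signal, fs, delays_us, gains_db, snr_db, rng=None):
    """Pass a complex baseband signal through static multipath, then add noise.

    signal    : 1-D complex array of baseband samples
    fs        : sampling rate in Hz
    delays_us : path delays in microseconds
    gains_db  : relative path gains in dB
    snr_db    : signal-to-noise ratio after the multipath stage, in dB
    """
    rng = np.random.default_rng() if rng is None else rng
    delays = np.asarray(delays_us, dtype=float)
    delays = delays - delays.min()  # earliest path defines zero delay
    tap_index = np.round(delays * 1e-6 * fs).astype(int)

    # Build the impulse response; paths falling on the same sample add up.
    h = np.zeros(tap_index.max() + 1, dtype=complex)
    amplitudes = 10.0 ** (np.asarray(gains_db, dtype=float) / 20.0)
    np.add.at(h, tap_index, amplitudes)
    h /= np.linalg.norm(h)  # normalize to unit energy

    faded = np.convolve(signal, h)[:len(signal)]

    # Complex white Gaussian noise scaled to the requested SNR.
    signal_power = np.mean(np.abs(faded) ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = np.sqrt(noise_power / 2.0) * (
        rng.normal(size=len(signal)) + 1j * rng.normal(size=len(signal))
    )
    return faded + noise
```

Passing a single path with zero delay and 0 dB gain reduces the function to a plain Gaussian channel, so the same routine can serve both channel conditions considered in this section.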
Table 3 and Table 4 indicate the probability of correct classification (Pcc) for each modulation type with a training data length of 1000 over the Gaussian channel and the cognitive radio channel, respectively. The results show that the overall correct rate is above 92.83% for an SNR of 4 dB, and 97.32% for an SNR of 8 dB. These good results for signals with low SNR in the cognitive radio environment show that the proposed approach is insensitive to SNR variation, which results from the robustness of the SVM classifier. According to Table 3, the proposed approach has better performance in both channel conditions.
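As a rough illustration of how per-modulation and overall Pcc values of this kind can be obtained from test-frame decisions, the sketch below counts the fraction of frames classified correctly. Treating Pcc as a simple relative frequency is an assumption made here for illustration, and the function names `overall_pcc` and `per_class_pcc` are hypothetical.

```python
def overall_pcc(true_labels, predicted_labels):
    """Fraction of test frames whose decided class equals the true class."""
    if len(true_labels) != len(predicted_labels) or len(true_labels) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def per_class_pcc(true_labels, predicted_labels):
    """Pcc for each modulation type, computed over the frames of that type only."""
    totals, hits = {}, {}
    for t, p in zip(true_labels, predicted_labels):
        totals[t] = totals.get(t, 0) + 1
        hits[t] = hits.get(t, 0) + (1 if t == p else 0)
    return {label: hits[label] / totals[label] for label in totals}


# Toy example with four test frames (illustrative labels, not simulation data).
truth = ["AM", "AM", "FSK", "FSK"]
decided = ["AM", "ASK", "FSK", "FSK"]
print(overall_pcc(truth, decided))    # 0.75
print(per_class_pcc(truth, decided))  # {'AM': 0.5, 'FSK': 1.0}
```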
Fig. 12 and Fig. 13 show the performance of the SVM classifier with data length as a parameter over the different channel models. When the data length is 100 and the SNR is 4 dB, the Pcc is up to 80.62%; with a data length of 200 and an SNR of 6 dB, the Pcc increases to 90%. When the data length is 1000 and the SNR is 10 dB, the Pcc is close to 100%. These results show that the SVM classifier performs well with small training data in both channel models. Fig. 14 and Fig. 15 compare the BDT classifier, the MLPN classifier and the SVM classifier with the spectral correlation features over the Gaussian channel and the WRAN channel, calculated for different signals using data lengths of 200 and 1000, respectively. When the SNR is low, the MLPN classifier shows poor performance; as the SNR increases, its probability of correct classification increases. At low SNR, the spectral correlation features (SCFs) vary drastically due to the effect of the noise. Thus, the neural network cannot be constructed completely with small training data, which results in the performance degradation. The decision tree based classifier only uses partial information of the SCFs; therefore, its probability of correct classification is lower than that of the SVM classifier over the whole SNR range. All the results show the high performance of the SVM classifier based on SCFs.
6. Conclusion and Future Work
In this chapter, we proposed a novel approach combining spectral correlation features and the SVM for signal classification in a cognitive radio environment. Four spectral correlation characteristic parameters were chosen as the feature vector of the SVM classifier. Simulation results show that the overall success rate is above 92.83% with a data length of 1000 when the SNR is equal to 4 dB. Compared to existing methods, the proposed approach is more effective in the case of low SNR and a limited number of training samples. Future work in the area of signal classification for cognitive radio systems will involve the analysis of higher order spectral correlation features of more communication signals. Based on these features, a multi-class SVM classifier can be used to improve the accuracy of classification and reduce the computational complexity. In addition, the classifier performance will be tested via simulations using several different channel models.