Open access peer-reviewed chapter

Anomaly Detection in Intrusion Detection Systems

Written By

Siamak Parhizkari

Submitted: 28 May 2023 Reviewed: 01 August 2023 Published: 09 October 2023

DOI: 10.5772/intechopen.112733

From the Edited Volume

Anomaly Detection - Recent Advances, AI and ML Perspectives and Applications

Edited by Venkata Krishna Parimala


Abstract

Intrusion detection systems (IDS) play a critical role in network security by monitoring systems and network traffic to detect anomalies and attacks. This study explores the different types of IDS, including host-based and network-based, along with their deployment scenarios. A key focus is on incorporating anomaly detection techniques within IDS to identify novel and unknown threats that evade signature-based methods. Statistical approaches like outlier detection and machine learning techniques like neural networks are discussed for building effective anomaly detection models. Data collection and preprocessing techniques, including feature engineering, are examined. Both unsupervised techniques like clustering and density estimation and supervised methods like classification are covered. Evaluation datasets and performance metrics for assessing anomaly detection models are highlighted. Challenges such as the curse of dimensionality and concept drift are outlined. Emerging trends include integrating deep learning and explainable AI into anomaly detection. Overall, this comprehensive study examines the role of anomaly detection within IDS, delves into various techniques and algorithms, surveys evaluation practices, discusses limitations and challenges, and provides insights into future research directions to advance network security through improved anomaly detection capabilities.

Keywords

  • anomaly detection
  • intrusion detection systems (IDS)
  • fraud detection
  • cybersecurity
  • abnormal patterns

1. Introduction

An intrusion detection system (IDS) is a security tool designed to monitor network or system activities to detect and respond to unauthorized or malicious activities. It serves as an additional layer of defense in a comprehensive cybersecurity strategy.

The primary goal of an IDS is to identify and alert security administrators about potential security incidents, such as unauthorized access attempts, malware infections, or suspicious network traffic patterns. By analyzing network packets, log files, system activities, and other relevant data, IDS can help detect and respond to security threats in real-time.

There are two main types of IDS:

Network-based intrusion detection systems (NIDS) [1, 2, 3, 4, 5, 6, 7]: NIDS monitors network traffic in real-time, analyzing packets to identify suspicious or malicious activity. It operates at the network layer and can detect threats such as port scanning, denial-of-service (DoS) attacks, and network intrusions. NIDS can be deployed as a standalone device or as part of a network security infrastructure.

Host-based intrusion detection systems (HIDS) [1, 2, 3, 4, 5, 6, 7]: HIDS monitors the activities occurring on individual hosts or endpoints, such as servers or workstations. It analyzes system logs, file integrity, and user activities to identify unauthorized access attempts, privilege escalations, or suspicious behavior at the host level. HIDS is particularly useful for detecting insider threats or malware infections that may bypass network-based defenses.

IDS employs different detection techniques to identify potential threats:

Signature-based detection [4]: This technique relies on a database of known attack signatures or patterns. IDS compares the incoming network traffic or system activities against these signatures to identify known attacks. While effective against known threats, signature-based detection may struggle with detecting new or zero-day attacks.

Anomaly-based detection [4, 5, 6, 8]: Anomaly detection involves establishing a baseline of normal behavior for a network or system and then identifying deviations from this baseline. It analyzes traffic patterns, system performance, user behavior, and other metrics to detect anomalies that could indicate a potential security breach.

When an IDS detects an intrusion or suspicious activity, it generates an alert or notification for security administrators. These alerts provide information about the nature of the incident, the affected system or network, and any additional details to aid in the response and mitigation process.

It is important to note that IDS is not a standalone solution but works in conjunction with other security measures like firewalls, antivirus software, and security policies. Additionally, intrusion prevention systems (IPS) are often used in conjunction with IDS to not only detect but also actively block or prevent detected threats.

In summary, Intrusion Detection Systems play a crucial role in identifying and responding to potential security incidents in real-time. By monitoring network and system activities, IDS helps organizations strengthen their overall security posture and minimize the potential impact of cyber threats.


2. Anomaly detection techniques in IDS

Table 1 summarizes anomaly detection techniques, along with their pros and cons.

Anomaly detection techniques in IDS

Signature-based
  • Models: pattern matching, protocol analysis, content inspection, log analysis
  • Pros: high accuracy for known attacks; low false alarm rate; easy deployment; low computational overhead
  • Cons: inability to detect new or unknown attacks; dependency on signature updates; lack of flexibility; limited coverage

Statistical-based
  • Models: outlier detection, time series analysis, statistical modeling
  • Pros: well-established methods with a solid theoretical foundation; suitable for detecting simple anomalies; interpretable results
  • Cons: limited ability to detect complex or sophisticated anomalies; sensitivity to data distribution and assumptions; difficulty handling high-dimensional data

Machine learning
  • Models: clustering, classification, neural networks
  • Pros: ability to handle complex and non-linear patterns; effective for identifying subtle anomalies; adaptability to changing environments; can learn from unlabeled or partially labeled data
  • Cons: requirement for large labeled training datasets; overfitting if not properly tuned or validated; computationally intensive for complex algorithms; black-box nature may lack interpretability

Hybrid approaches
  • Models: statistical + machine learning methods
  • Pros: leverages the strengths of both statistical and machine learning techniques; improved detection accuracy and robustness; enhanced ability to handle diverse anomalies
  • Cons: increased complexity and potential integration challenges; higher computational requirements; potential trade-off between interpretability and performance

Table 1.

Anomaly detection techniques with pros and cons.

2.1 Signature-based detection vs. anomaly detection

Signature-based detection, also known as rule-based detection, relies on pre-defined signatures or patterns of known attacks to identify intrusions. However, signature-based detection has limitations as it can only detect known attacks for which signatures have been defined. New or unknown attacks can easily evade signature-based detection. Anomaly detection techniques, on the other hand, focus on identifying deviations from normal behavior, without relying on predefined signatures. This makes anomaly detection more effective in detecting unknown or novel attacks that do not have specific signatures [4, 6, 9]. Figure 1 shows the concept of signature-based IDS.

Figure 1.

Concept of signature-based IDS [4].

2.2 Statistical approaches for anomaly detection

Statistical approaches are commonly employed for anomaly detection in IDS. These techniques involve the use of statistical methods to establish normal behavior baselines and detect deviations from these baselines. Outlier detection algorithms, such as the statistical outlier detection method or the Z-score method, are used to identify data points that significantly deviate from the expected behavior. Time series analysis techniques, such as autoregressive integrated moving average (ARIMA) models [10, 11], are used to detect anomalies in temporal data. Statistical modeling approaches, such as Gaussian mixture models or hidden Markov models, are utilized to capture the statistical characteristics of normal behavior and detect anomalies based on deviations from the learned models [4, 5].
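
To make the statistical idea concrete, the following is a minimal sketch of Z-score-based outlier detection in Python; the feature (connections per minute), the synthetic values, and the threshold of three standard deviations are illustrative assumptions rather than recommended settings.

```python
# Minimal sketch: flagging anomalous connection counts with a Z-score
# threshold. Feature name, values, and threshold are illustrative assumptions.
import numpy as np

def zscore_anomalies(values, threshold=3.0):
    """Return indices of points more than `threshold` std devs from the mean."""
    values = np.asarray(values, dtype=float)
    mean, std = values.mean(), values.std()
    if std == 0:
        return np.array([], dtype=int)
    z = np.abs(values - mean) / std
    return np.where(z > threshold)[0]

# Example: connections per minute observed on a host (synthetic data).
conn_per_min = [12, 15, 11, 14, 13, 12, 480, 13, 15, 14]
print(zscore_anomalies(conn_per_min))  # -> index of the 480-connection burst
```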

2.3 Machine learning approaches for anomaly detection

Machine learning algorithms play a crucial role in anomaly detection for IDS. These algorithms can learn patterns and behaviors from historical data and apply that knowledge to detect anomalies in real-time. Clustering algorithms, such as k-means or DBSCAN, group similar instances together and flag instances that do not fit into any cluster as anomalies. Classification algorithms, such as support vector machines (SVM) or random forests, learn from labeled data to classify instances as normal or anomalous. Neural networks, including deep learning models like convolutional neural networks (CNN) [9] or recurrent neural networks (RNN) [9], can capture complex patterns and relationships to identify anomalies [12]. Figure 2 shows machine learning approaches in IDS.

Figure 2.

Machine learning approaches in IDS [12].
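
As a concrete illustration of the clustering approach described above, the sketch below fits k-means with scikit-learn and flags points that lie unusually far from their nearest centroid; the synthetic data, the number of clusters, and the top-1% distance threshold are illustrative assumptions.

```python
# Minimal sketch: k-means clustering with anomaly flagging by distance to the
# nearest centroid. Dataset and threshold choice are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))    # normal traffic features
outliers = rng.uniform(low=8.0, high=10.0, size=(5, 2))   # a few far-away points
X = np.vstack([normal, outliers])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
# Distance of every point to its assigned centroid.
dist = np.linalg.norm(X - kmeans.cluster_centers_[kmeans.labels_], axis=1)
# Flag points whose distance is in the top 1% as anomalies.
threshold = np.quantile(dist, 0.99)
anomaly_idx = np.where(dist > threshold)[0]
print(anomaly_idx)
```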

2.4 Hybrid approaches

Hybrid approaches combine both statistical and machine learning techniques to improve the accuracy and effectiveness of anomaly detection in IDS [4]. By leveraging the strengths of different approaches, hybrid models can provide enhanced detection capabilities. For example, a hybrid approach may use statistical techniques to establish baseline behavior and machine learning algorithms to classify instances as normal or anomalous. This combination allows for a more comprehensive and robust anomaly detection system.
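
A minimal sketch of such a hybrid pipeline is shown below, assuming labeled flow records are available: a statistical stage converts raw features into Z-score deviations from a learned baseline, and a machine learning classifier is then trained on those deviations. The synthetic data and the choice of a random forest are illustrative assumptions.

```python
# Minimal hybrid sketch: statistical baseline (Z-score deviations) feeding a
# machine learning classifier. All data here is synthetic and illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X_train = rng.normal(size=(1000, 4))                  # mostly normal flows
y_train = (rng.random(1000) < 0.05).astype(int)       # ~5% labeled anomalies
X_train[y_train == 1] += 4.0                          # anomalies deviate from baseline

# Statistical stage: baseline mean/std estimated from training data only.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0) + 1e-9
Z_train = (X_train - mu) / sigma

# Machine learning stage: classifier trained on the deviation features.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(Z_train, y_train)

X_new = rng.normal(size=(3, 4)); X_new[0] += 4.0      # first record is anomalous
print(clf.predict((X_new - mu) / sigma))              # e.g. [1 0 0]
```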


3. Data collection and preprocessing in IDS

3.1 Data sources for IDS [4, 12]

Intrusion detection systems (IDS) rely on various sources of data to detect anomalies and potential security breaches. Some common data sources used in IDS include:

  1. Network traffic logs: IDS can analyze network traffic logs to monitor incoming and outgoing network packets, protocols used, source and destination IP addresses, ports, and other relevant information. Network traffic logs provide valuable insights into communication patterns and can help to detect anomalies such as unusual traffic volumes, suspicious connections, or protocol violations.

  2. System logs: System logs record events and activities within the operating system or specific applications. IDS can analyze system logs to identify abnormal system behavior, such as unauthorized access attempts, changes to system configurations, or unexpected system errors. System logs may include information about login attempts, file access, process execution, or resource utilization.

  3. Audit trails: Audit trails capture detailed information about user activities and actions within a system. They record events such as file access, privilege changes, user authentication, or administrative actions. Analyzing audit trails can help to identify unauthorized actions, unusual user behavior, or privilege escalation attempts.
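
As a small illustration of how such sources are consumed, the sketch below counts failed-login events per user and source IP from syslog-style lines; the log format and the regular expression are assumptions for illustration, since real system and audit logs vary by platform.

```python
# Minimal sketch: turning raw system-log lines into per-user counts of failed
# logins. The log format and field positions are illustrative assumptions.
import re
from collections import Counter

log_lines = [
    "Jun 12 10:01:22 host sshd[811]: Failed password for alice from 10.0.0.5",
    "Jun 12 10:01:25 host sshd[811]: Failed password for alice from 10.0.0.5",
    "Jun 12 10:02:01 host sshd[812]: Accepted password for bob from 10.0.0.9",
]

pattern = re.compile(r"Failed password for (\w+) from ([\d.]+)")
failed = Counter()
for line in log_lines:
    m = pattern.search(line)
    if m:
        user, src_ip = m.groups()
        failed[(user, src_ip)] += 1

print(failed)  # e.g. Counter({('alice', '10.0.0.5'): 2})
```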

3.2 Data preprocessing techniques

Data preprocessing [13, 14] plays a crucial role in preparing the data for effective anomaly detection in IDS. Several techniques are commonly used in the preprocessing stage, including:

  1. Data cleaning [14]: Data cleaning involves removing or correcting inconsistent, irrelevant, or noisy data. This process may include handling missing values, dealing with outliers, and resolving inconsistencies in the data. Cleaning the data helps to ensure the quality and reliability of the input data for anomaly detection.

  2. Feature selection [4, 15]: Feature selection aims to identify the most relevant and informative features for anomaly detection. In IDS, this involves selecting the attributes or variables that provide the most discriminative information about normal and anomalous behavior. Feature selection can help to reduce computational complexity, improve detection accuracy, and eliminate redundant or irrelevant features.

  3. Normalization [14]: Normalization is the process of scaling data to a common range or distribution. It ensures that different features are on a comparable scale, which is essential for certain anomaly detection algorithms that rely on distance or similarity measures. Normalization techniques include min-max scaling, z-score normalization, or logarithmic transformations.

  4. Dimensionality reduction [16]: Dimensionality reduction techniques aim to reduce the number of features while preserving the most important information. High-dimensional data can be computationally expensive and prone to overfitting. Techniques such as principal component analysis (PCA), linear discriminant analysis (LDA), or t-distributed stochastic neighbor embedding (t-SNE) can help to reduce the dimensionality of the data while retaining its essential characteristics.
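
A minimal preprocessing sketch combining normalization and dimensionality reduction with scikit-learn is shown below; the number of raw features and retained components are illustrative assumptions.

```python
# Minimal preprocessing sketch: min-max normalization followed by PCA.
# Feature count and component count are illustrative assumptions.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 40))          # 40 raw traffic features

preprocess = Pipeline([
    ("scale", MinMaxScaler()),           # min-max normalization to [0, 1]
    ("reduce", PCA(n_components=10)),    # keep 10 principal components
])
X_reduced = preprocess.fit_transform(X)
print(X_reduced.shape)                   # (1000, 10)
```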


4. Unsupervised anomaly detection in IDS

Unsupervised anomaly detection techniques in intrusion detection systems (IDS) aim to identify anomalies in data without relying on pre-labeled instances of normal and anomalous behavior [4, 9]. These techniques are particularly useful in scenarios where labeled training data is scarce or unavailable, making it challenging to train supervised models. Unsupervised anomaly detection methods utilize statistical, clustering, or density-based approaches to identify patterns that deviate from normal behavior. Figure 3 summarizes these techniques, and some commonly used ones are described below:

  1. Statistical-based techniques: Statistical-based techniques are commonly used for unsupervised anomaly detection in intrusion detection systems (IDS). As noted earlier, these techniques analyze the statistical properties of the data to identify instances that deviate significantly from the expected behavior. The underlying assumption is that normal behavior follows a certain statistical distribution, and any deviation from this distribution is considered anomalous. Here are some commonly used statistical-based techniques:

    • Gaussian distribution: The Gaussian distribution, also known as the normal distribution, is frequently used in statistical-based anomaly detection. It assumes that the normal behavior of the data follows a bell-shaped curve. Anomalies are identified as instances that fall outside a specified range or threshold based on the estimated mean and standard deviation of the data. Instances that lie in the tails of the distribution, beyond a certain number of standard deviations from the mean, are considered anomalies.

    • Mahalanobis distance: The Mahalanobis distance measures the distance between a data point and the center of a distribution, taking into account the correlation between variables. It accounts for the covariance structure of the data and is particularly useful when the variables are correlated. The Mahalanobis distance can be used to detect anomalies by comparing the distance of each data point to a threshold value. Points with a large Mahalanobis distance are considered anomalies.

    • Z-score method: The Z-score method is a simple statistical technique for anomaly detection. It calculates the standard deviation from the mean for each data point and expresses it as a Z-score. The Z-score represents the number of standard deviations a data point is away from the mean. Anomalies are identified as data points with a Z-score exceeding a specified threshold. This method is particularly useful when the data is normally distributed.

    • Hypothesis testing: Hypothesis testing is a statistical technique used to determine the likelihood that an observed deviation from the expected behavior is due to chance or represents an anomaly. Commonly used hypothesis tests include the t-test, chi-square test, or Kolmogorov-Smirnov test. These tests compare the observed data to a reference distribution or expected behavior and calculate a p-value. If the p-value is below a predefined significance level, the deviation is considered significant, and the instance is flagged as an anomaly.

    Statistical-based techniques provide a solid foundation for detecting anomalies based on deviations from expected statistical behavior. However, it is important to note that these methods assume the data follows specific statistical distributions and may not be suitable for data with complex or non-parametric distributions. Additionally, choosing appropriate thresholds or significance levels is crucial and requires careful consideration and domain knowledge.

  2. Clustering-based techniques: Clustering-based techniques, shown in Figure 4, are commonly used for unsupervised anomaly detection in intrusion detection systems (IDS). These techniques aim to partition the data into clusters based on the similarity or density of instances [14, 17]. Anomalies are identified as instances that do not belong to any cluster or are located far from the clusters. Here are some commonly used clustering-based techniques:

    • K-means clustering: K-means clustering is a popular technique that aims to partition the data into K clusters. The algorithm iteratively assigns data points to the nearest cluster centroid based on distance measures such as Euclidean distance. Anomalies are typically identified as instances that do not fit well into any cluster or are located far from the cluster centroids. However, K-means alone may not be sufficient for anomaly detection as it assumes that all clusters have similar sizes and shapes, which may not hold true for anomalous instances.

    • Density-based spatial clustering of applications with noise (DBSCAN): DBSCAN is a density-based clustering algorithm that identifies clusters based on the density of instances [18]. It groups together instances that are close to each other and have a sufficient number of nearby neighbors. Anomalies are typically instances that do not have enough nearby neighbors to form a cluster and are considered noise points. DBSCAN can effectively identify clusters of different shapes and sizes, making it suitable for detecting anomalies that do not conform to regular cluster patterns.

    • Ordering points to identify the clustering structure (OPTICS): OPTICS is an extension of DBSCAN that provides a hierarchical view of the clustering structure. It orders instances based on their density and identifies core points, reachability distances, and clusters. Anomalies are typically instances that have low density and are located in regions with sparse or no clusters. OPTICS allows for flexible parameterization, making it more adaptive to different datasets and providing a richer characterization of the data structure.

    • Hierarchical clustering: Hierarchical clustering methods create a hierarchy of clusters by successively merging or splitting clusters based on their similarity. Agglomerative hierarchical clustering starts with each instance as a separate cluster and iteratively merges similar clusters until a single cluster is formed. Divisive hierarchical clustering starts with all instances in one cluster and iteratively splits the cluster into smaller clusters. Anomalies can be identified as instances that do not fit well into any cluster or do not conform to the hierarchical structure.

    Clustering-based techniques offer flexibility in detecting anomalies by identifying instances that do not conform to regular cluster patterns. However, these techniques require careful consideration of parameters such as the number of clusters or density thresholds, and the interpretation of anomalies may depend on the dataset and the clustering algorithm used.

  3. Density-based techniques: Density-based techniques are commonly used for unsupervised anomaly detection in intrusion detection systems (IDS). These techniques estimate the density distribution of the data and identify anomalies as instances that lie in regions of low density [19]. Here are some commonly used density-based techniques:

    • Kernel density estimation (KDE): Kernel density estimation is a non-parametric technique used to estimate the underlying density distribution of the data. It places a kernel function on each data point and sums them to estimate the density at any given point. Anomalies are typically identified as instances with significantly lower density values compared to the majority of the data. The choice of kernel function and bandwidth parameter affects the smoothness and accuracy of density estimation.

    • Local outlier factor (LOF): The local outlier factor measures the deviation of an instance’s density compared to its neighboring instances. It calculates a local density for each data point based on the distances to its k nearest neighbors. Anomalies are identified as instances with significantly lower local densities compared to their neighbors. LOF takes into account the local density variations in the data, making it robust to varying densities and useful for detecting anomalies in clusters or regions of different densities.

    • Distance-based techniques: Distance-based density estimation techniques measure the distances between instances and identify anomalies based on deviations from the expected distance distribution. For example, the nearest neighbor distance (NND) approach calculates the average distance to the k nearest neighbors for each instance. Anomalies are identified as instances with significantly larger or smaller distances compared to the majority of the data. Distance-based techniques are effective in identifying anomalies that exhibit unusual distance patterns.

    • Density-based clustering [18]: Density-based clustering algorithms, such as DBSCAN, can also be used for anomaly detection. These algorithms identify clusters based on the density of instances and label as anomalies the instances that do not belong to any cluster. Anomalies are typically located in regions with low density or as individual points far from the clusters.

    Density-based techniques provide flexibility in detecting anomalies by focusing on regions of low density or deviations from expected distance patterns. These techniques are effective in identifying anomalies that do not conform to regular density distributions or exhibit unusual distance patterns. However, careful parameter selection, such as the neighborhood size or density thresholds, is important to ensure accurate anomaly detection.

  4. Reconstruction-based techniques [20]: Reconstruction-based techniques are a class of anomaly detection techniques that aim to reconstruct the normal behavior of the data and identify anomalies based on the errors or deviations from this reconstruction. These techniques typically employ autoencoders (Figure 5) or similar models to learn the underlying patterns in the data and use them to reconstruct or generate the data. Here are some commonly used reconstruction-based techniques:

    • Autoencoder-based anomaly detection: Autoencoders are neural network models that are trained to reconstruct their input data. They consist of an encoder that compresses the input data into a lower-dimensional representation and a decoder that reconstructs the data from this representation. During training, autoencoders learn to minimize the reconstruction error by capturing the patterns and regularities in the data. Anomalies are identified as instances that result in high reconstruction errors, indicating deviations from the learned normal behavior.

    • Variational autoencoders (VAEs): Variational autoencoders are a type of generative model that learns a low-dimensional representation of the data and generates new samples by sampling from this learned representation. VAEs consist of an encoder that learns the parameters of a probability distribution in the latent space and a decoder that generates samples from this distribution. Anomalies can be identified based on the reconstruction error or by measuring the dissimilarity between the original data and the generated samples. VAEs can capture the underlying distribution of the data and detect anomalies that deviate significantly from this distribution.

    • Generative adversarial networks (GANs) [21]: Generative adversarial networks are another type of generative model that consists of a generator network and a discriminator network. The generator network learns to generate realistic samples that resemble the normal behavior of the data, while the discriminator network learns to distinguish between real and generated samples. Anomalies can be identified as instances that are not well captured by the generator network or are classified as fake by the discriminator network. GANs can learn complex data distributions and detect anomalies that differ significantly from the learned distribution.

Figure 3.

Summary of unsupervised techniques.

Figure 4.

Clustering [14].

Figure 5.

Structure of autoencoders [12].

Reconstruction-based techniques offer the advantage of learning the normal behavior of the data and identifying anomalies based on deviations from this learned behavior. They can capture complex patterns and variations in the data, making them effective for detecting anomalies that do not conform to specific statistical or density distributions. However, these techniques require a representative dataset of normal behavior for training the models and may be sensitive to the choice of model architecture and training parameters.
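
As a minimal illustration of the reconstruction-error idea, the sketch below trains a small multilayer perceptron to reproduce its own input (a simple stand-in for a full autoencoder) and flags instances whose reconstruction error exceeds a threshold; the architecture, threshold, and synthetic data are illustrative assumptions, not a production design.

```python
# Minimal reconstruction-based sketch: an MLP trained to reproduce its input
# (a stand-in for a full autoencoder); anomalies have high reconstruction error.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
X_train = rng.normal(size=(2000, 20))                 # assumed-normal traffic only
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)

# A bottleneck hidden layer forces a compressed representation.
ae = MLPRegressor(hidden_layer_sizes=(8,), max_iter=500, random_state=0)
ae.fit(X_train_s, X_train_s)

def reconstruction_error(X):
    Xs = scaler.transform(X)
    return np.mean((ae.predict(Xs) - Xs) ** 2, axis=1)

# Threshold chosen from the training data (illustrative choice).
threshold = np.quantile(reconstruction_error(X_train), 0.99)
X_new = np.vstack([rng.normal(size=(1, 20)), rng.normal(size=(1, 20)) + 5.0])
print(reconstruction_error(X_new) > threshold)        # e.g. [False  True]
```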


5. Supervised anomaly detection in IDS

Supervised anomaly detection in intrusion detection systems (IDS) involves training a model on labeled data, where both normal and anomalous instances are explicitly identified [4, 9]. The model learns the patterns and characteristics of normal behavior during the training phase and can subsequently classify new instances as either normal or anomalous based on the learned knowledge. Here are some commonly used techniques for supervised anomaly detection in IDS:

  1. Supervised machine learning algorithms [4]: Supervised machine learning algorithms, such as decision trees [4], random forests, support vector machines (SVM), and neural networks, can be applied to train models for supervised anomaly detection in IDS. These algorithms learn from labeled data, where the labels indicate whether an instance is normal or anomalous. The trained models can then classify new instances as either normal or anomalous based on the learned patterns and decision boundaries.

  2. Ensemble methods [4]: Ensemble methods combine multiple models to improve the accuracy and robustness of supervised anomaly detection. Techniques such as bagging, boosting, and stacking can be employed to create an ensemble of models that collectively make predictions. Each individual model may use a different algorithm or have different parameter settings, and the final prediction is often determined through voting or averaging. Ensemble methods can effectively handle complex data distributions and improve the overall detection performance.

  3. Deep learning [9]: Deep learning techniques, such as convolutional neural networks (CNNs) shown in Figure 6, recurrent neural networks (RNNs) shown in Figure 7, and deep autoencoders, have shown promising results in supervised anomaly detection. These models can learn complex representations of the input data, capture intricate patterns, and generalize well to unseen instances. Deep learning approaches require large amounts of labeled data and can be computationally intensive but can achieve high accuracy in detecting anomalies in IDS.

  4. Feature engineering: Feature engineering plays a crucial role in supervised anomaly detection. It involves selecting relevant features from the data or designing new features that can effectively discriminate between normal and anomalous instances. Domain knowledge and expertise are often employed to identify informative features that can capture the distinguishing characteristics of anomalies. Feature engineering techniques, such as dimensionality reduction, feature selection, and feature transformation, can improve the detection performance of supervised anomaly detection models.

  5. One-class support vector machines (SVM): The One-class SVM is a popular anomaly detection technique that falls under the category of supervised learning. Unlike traditional SVMs, which are designed for binary classification tasks, the One-class SVM is specifically designed to learn a model of normal data and identify anomalies based on deviations from this model [22]. Here’s how One-class SVM works:

    • Training phase: In the training phase, One-class SVM learns a decision boundary that encloses the majority of the training data points, representing the normal class. The goal is to find a hyperplane that maximally separates the normal data instances from the origin or the center of the feature space.

    • Decision function: Once the model is trained, the decision function of the One-class SVM is used to classify new instances as either normal or anomalous. The decision function assigns a score or distance to each instance, indicating how well it fits within the learned model. Instances with positive scores are considered normal, while instances with negative scores are classified as anomalies.

    • Kernel trick: One-class SVM can make use of the kernel trick to handle nonlinear data distributions. By mapping the input data into a high-dimensional feature space, the One-class SVM can find a nonlinear decision boundary that better separates normal instances from anomalies. Commonly used kernel functions include the radial basis function (RBF) kernel and the polynomial kernel.

Figure 6.

Structure of convolutional neural network [12].

Figure 7.

Structure of recurrent neural network [12].

One-class SVM offers several advantages for anomaly detection:

  • It can handle data with high-dimensional feature spaces effectively.

  • It is robust to outliers in the training data.

  • It can capture complex decision boundaries, including nonlinear ones, through the use of the kernel trick.

  • It does not require labeled anomalies for training, as it focuses solely on learning the normal class.

However, One-class SVM also has certain limitations:

  • It may struggle when the normal class exhibits significant variations or when the normal data distribution is not well-represented in the training set.

  • It may be sensitive to the choice of kernel function and its hyperparameters, requiring careful tuning for optimal performance.

  • It does not provide direct probabilistic outputs, making it challenging to interpret the anomaly scores as probability estimates.
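
With these trade-offs in mind, the following is a minimal One-class SVM sketch using scikit-learn's OneClassSVM, assuming only normal traffic is available at training time; the RBF kernel and the nu value are illustrative hyperparameter choices.

```python
# Minimal One-class SVM sketch with scikit-learn. Kernel and nu are
# illustrative hyperparameter choices; data is synthetic.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(4)
X_train = rng.normal(size=(1000, 5))                  # normal behavior only

ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.01).fit(X_train)

X_new = np.vstack([rng.normal(size=(1, 5)), rng.normal(size=(1, 5)) + 6.0])
print(ocsvm.predict(X_new))            # +1 = normal, -1 = anomaly, e.g. [ 1 -1]
print(ocsvm.decision_function(X_new))  # signed distance to the learned boundary
```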

Supervised anomaly detection in IDS offers the advantage of explicitly labeled data for training and can achieve high detection accuracy. However, it relies on the availability of accurately labeled training data and may face challenges when dealing with evolving or previously unseen anomalies. Moreover, supervised approaches may not capture novel or unknown anomalies that were not present in the training data.


6. Evaluation and performance metrics in IDS

Evaluation datasets play a crucial role in assessing the performance of anomaly detection techniques in intrusion detection systems (IDS). These datasets are used to evaluate how well a detection technique can accurately classify instances as normal or anomalous. Several datasets have been widely used in the field of IDS for evaluation purposes. Here are some commonly used datasets:

  • KDD Cup 99 [4, 9]: The KDD Cup 99 dataset is one of the most widely used datasets for evaluating IDS techniques. It was created for the Third International Knowledge Discovery and Data Mining Tools Competition held in 1999. The dataset contains a large number of network connection records generated in a simulated network environment, with various types of attacks and normal traffic.

  • NSL-KDD [4, 9]: The NSL-KDD dataset is an updated version of the original KDD Cup 99 dataset. It addresses some of the limitations and drawbacks of the original dataset, such as redundant and irrelevant features. NSL-KDD provides a more balanced and realistic representation of network traffic data, making it suitable for evaluating IDS techniques.

  • CICIDS2017 [4, 23, 24]: The CICIDS2017 dataset is a recent dataset that was developed for evaluating IDS techniques in the context of real-world network traffic. It consists of network traffic data collected from a real network environment, containing various types of attacks and normal traffic instances. The dataset offers a comprehensive and diverse set of scenarios for evaluating IDS performance.

Performance metrics are used to quantitatively measure the effectiveness of anomaly detection techniques in IDS. These metrics provide insights into the model’s accuracy, precision, recall, and overall performance. Here are some commonly used performance metrics:

  1. Accuracy [4, 9, 25, 26]: Accuracy measures the overall correctness of the model’s predictions. It calculates the ratio of correctly classified instances to the total number of instances. However, accuracy can be misleading in imbalanced datasets where anomalies are rare.

    \(\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}\)  (E1)

  2. Precision [4, 9, 25, 26]: Precision measures the proportion of correctly identified anomalies among all instances classified as anomalies. It focuses on the correctness of positive predictions, indicating the model’s ability to avoid false positives.

    \(\text{Precision} = \frac{TP}{TP + FP}\)  (E2)

  3. Recall [4, 9, 25, 26]: Recall, also known as sensitivity or true positive rate, measures the proportion of actual anomalies that are correctly identified by the model. It represents the model’s ability to detect anomalies and avoid false negatives.

    \(\text{Recall} = \frac{TP}{TP + FN}\)  (E3)

  4. F1-score [4, 9, 25, 26]: The F1-score is a harmonic mean of precision and recall, providing a balanced measure of the model’s performance. It considers both false positives and false negatives and is especially useful when dealing with imbalanced datasets.

    \(\text{F1-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}\)  (E4)

  5. ROC curve and AUC [4, 9]: The receiver operating characteristic (ROC) curve illustrates the trade-off between the true positive rate (TPR) and the false positive rate (FPR) at different classification thresholds. The area under the curve (AUC) summarizes the performance of the model across all possible thresholds. A higher AUC indicates better discrimination between normal and anomalous instances.

These performance metrics help in evaluating the accuracy, effectiveness, and reliability of anomaly detection techniques in IDS.
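
As a small worked example, the sketch below computes these metrics from predicted labels and anomaly scores with scikit-learn; the labels and scores are illustrative.

```python
# Minimal sketch computing the metrics above from predicted labels and scores.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

y_true  = [0, 0, 0, 1, 1, 0, 1, 0]                    # 1 = anomaly, 0 = normal
y_pred  = [0, 0, 1, 1, 0, 0, 1, 0]                    # classifier decisions
y_score = [0.1, 0.2, 0.6, 0.9, 0.4, 0.1, 0.8, 0.3]    # anomaly scores

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1-score :", f1_score(y_true, y_pred))
print("roc auc  :", roc_auc_score(y_true, y_score))
```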


7. Challenges in anomaly detection for IDS

  1. Curse of dimensionality [27]: The curse of dimensionality refers to the phenomenon where the effectiveness of certain algorithms and techniques deteriorates as the dimensionality of the data increases. In the context of intrusion detection systems (IDS), the curse of dimensionality poses a significant challenge for anomaly detection.

    Anomaly detection in IDS often involves analyzing high-dimensional data, such as network traffic logs, system logs, or audit trails. Each data instance is typically represented by a large number of features or attributes that describe various aspects of the network or system behavior. However, as the number of features increases, the available data becomes increasingly sparse in the high-dimensional space.

    The curse of dimensionality has several implications for anomaly detection in IDS:

    • Insufficient data: As the number of dimensions (features) increases, the amount of data needed to cover the feature space grows exponentially. In practice, the data becomes sparse, and the number of instances representing normal and anomalous behavior becomes limited. This scarcity of data can lead to poor generalization and inaccurate anomaly detection.

    • Increased complexity: With a high number of dimensions, the complexity of the anomaly detection problem also increases. Anomaly detection algorithms may struggle to effectively capture patterns and relationships among the features, leading to decreased detection accuracy.

    • Increased computational cost [28]: The computational cost of processing high-dimensional data is significantly higher than processing data with a lower dimensionality. Anomaly detection algorithms may require more computational resources and time to analyze and classify instances accurately, affecting the real-time performance of IDS.

    To mitigate the curse of dimensionality in IDS, various techniques can be employed:

    • Dimensionality reduction: Dimensionality reduction methods aim to reduce the number of features while preserving the most relevant information. Techniques such as principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) can be used to reduce the dimensionality of the data, making it more manageable for anomaly detection algorithms.

    • Feature selection: Feature selection techniques help to identify the most informative and discriminative features that contribute to anomaly detection. By selecting a subset of relevant features, the curse of dimensionality can be alleviated, reducing computational complexity and improving detection accuracy.

    • Feature engineering: In addition to dimensionality reduction and feature selection, feature engineering involves transforming or creating new features that better represent the underlying characteristics of normal and anomalous behavior. This process can help to extract more meaningful information from the high-dimensional data and enhance the performance of anomaly detection algorithms.

    Overall, addressing the curse of dimensionality in IDS requires careful consideration of data representation, feature selection, and dimensionality reduction techniques. By reducing the dimensionality of the data and focusing on relevant features, anomaly detection algorithms can be more effective in accurately identifying anomalies in high-dimensional data.

  2. Concept drift [29, 30]: Concept drift refers to the phenomenon where the underlying data distribution, which defines what is considered normal or anomalous, changes over time. In the context of intrusion detection systems (IDS), concept drift poses a significant challenge for anomaly detection.

    In IDS, anomaly detection models are trained on historical data to learn patterns of normal behavior and identify deviations from those patterns as anomalies. However, the characteristics of network traffic and system behavior can evolve over time due to various factors such as changes in network infrastructure, software updates, and emerging attack techniques. As a result, the learned model may become outdated and less effective in detecting new types of anomalies.

    Concept drift in IDS can occur in different forms:

    • Gradual concept drift: In gradual concept drift, the change in the underlying data distribution is relatively slow and progressive. The statistical properties of the data gradually shift over time, leading to a gradual degradation in the performance of the anomaly detection model. This type of concept drift requires continuous monitoring and adaptation of the model to maintain its effectiveness.

    • Sudden concept drift: In sudden concept drift, the change in the underlying data distribution occurs abruptly and unpredictably. This can happen due to sudden changes in network conditions, system configurations, or the introduction of new attack techniques. Sudden concept drift poses a significant challenge as the model needs to quickly adapt to the new data distribution to accurately detect anomalies.

    Addressing concept drift in IDS is essential to maintain the effectiveness of anomaly detection over time. Several techniques can be employed:

    • Online learning: Online learning approaches allow the anomaly detection model to continuously adapt to new data as it arrives. By updating the model with each new data point or in small batches, online learning can capture and respond to concept drift in real-time. Techniques such as incremental learning, ensemble methods, and adaptive models can be used to achieve online learning.

    • Change detection: Change detection techniques monitor the statistical properties of the data and detect significant changes that indicate concept drift. By periodically comparing the current data distribution with the historical distribution, these methods can trigger model retraining or adaptation when a significant change is detected. Statistical methods like control charts, cumulative sum (CUSUM), and change point detection algorithms can be used for change detection; a minimal CUSUM-style sketch is given at the end of this section.

    • Ensemble methods: Ensemble methods combine multiple anomaly detection models or algorithms to improve detection performance and resilience to concept drift. By aggregating the decisions of multiple models, ensemble methods can adapt to changing data distributions and make more robust anomaly predictions. Techniques like ensemble averaging, boosting, and stacking can be applied to create ensemble models.

    It is important to note that concept drift detection and adaptation in IDS is an ongoing research area, and the development of effective techniques to handle concept drift remains an active research topic.

  3. Adversarial attacks [31, 32, 33]: Adversarial attacks in IDS refer to deliberate attempts by adversaries to exploit vulnerabilities in the system and manipulate its behavior in order to evade detection or cause misclassification of normal or malicious activities. These attacks are specifically designed to target the anomaly detection capabilities of IDS and can have serious consequences for the security of the network.

There are different types of adversarial attacks that can be launched against IDS:

  • Evasion attacks: Evasion attacks aim to manipulate the input data in a way that the IDS fails to detect or correctly classify the malicious activities. Attackers carefully craft the input features to deceive the IDS into treating them as normal behavior, thus evading detection. Evasion attacks often involve carefully modifying or adding features to manipulate the decision boundary of the IDS.

  • Poisoning attacks: Poisoning attacks occur during the training phase of the IDS and involve injecting malicious or manipulated data into the training set. By poisoning the training data, attackers aim to manipulate the learning process of the IDS, compromising its detection capabilities. The poisoned data can introduce biases or alter the statistical properties of the training set, leading to degraded performance or increased false positives/negatives.

  • Stealth attacks: Stealth attacks aim to exploit the specific weaknesses or blind spots of the IDS to remain undetected. These attacks often involve carefully crafted sequences of activities that exploit temporal or contextual vulnerabilities, making it difficult for the IDS to identify them as anomalies. Stealth attacks can leverage timing patterns, bursty activities, or sophisticated evasion techniques to bypass detection.

  • Data injection attacks: Data injection attacks involve injecting malicious or unauthorized data into the network or system monitored by the IDS. These attacks can disrupt the normal operation of the IDS by overwhelming it with excessive or irrelevant data, triggering false alarms, or causing system failures. Data injection attacks can exploit vulnerabilities in data handling mechanisms or target specific weaknesses in the IDS architecture.

Addressing adversarial attacks in IDS is a challenging task. Some strategies and techniques that can help to mitigate the impact of these attacks include:

  • Adversarial training: Adversarial training involves training the IDS on both normal and adversarial examples to make it more robust against adversarial attacks. By exposing the IDS to various adversarial scenarios during training, it can learn to recognize and classify adversarial behavior more effectively.

  • Defense mechanisms: Implementing defense mechanisms such as input sanitization, feature engineering, and anomaly detection ensembles can enhance the resilience of the IDS against adversarial attacks. These techniques focus on improving the robustness of the IDS to handle manipulated or malicious inputs.

  • Monitoring and response: Continuous monitoring of the network and system activities can help to detect and respond to adversarial attacks in a timely manner. Real-time analysis, incident response, and adaptive countermeasures can aid in mitigating the impact of attacks and preventing further exploitation.

  • Collaboration and information sharing: Sharing information and collaborating with other IDS systems, security researchers, and organizations can help to create a collective defense against adversarial attacks. Sharing knowledge about attack techniques, patterns, and countermeasures can lead to more effective defense strategies.

It is worth noting that adversarial attacks and defense mechanisms in IDS are evolving research areas, and new attack techniques and defense strategies are continuously being developed.
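
Returning to the concept drift discussion above, the following is a minimal change-detection sketch in the spirit of CUSUM: it accumulates deviations of a monitored traffic statistic from a baseline mean and raises a drift alarm when the cumulative sum exceeds a threshold. The baseline, slack, and threshold values are illustrative assumptions.

```python
# Minimal CUSUM-style drift detection sketch. Baseline, slack, and threshold
# are illustrative assumptions; the stream is synthetic.
import numpy as np

def cusum_drift(stream, baseline_mean, slack=0.5, threshold=5.0):
    """Return the index at which upward drift is detected, or None."""
    s = 0.0
    for i, x in enumerate(stream):
        s = max(0.0, s + (x - baseline_mean - slack))
        if s > threshold:
            return i
    return None

rng = np.random.default_rng(5)
# Average packets/second: stable for 200 steps, then the mean shifts upward.
stream = np.concatenate([rng.normal(10, 1, 200), rng.normal(14, 1, 100)])
print(cusum_drift(stream, baseline_mean=10.0))   # index shortly after step 200
```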


8. Emerging trends and future directions

  1. Integration of deep learning techniques [34]: Deep learning techniques, such as deep neural networks and recurrent neural networks, have shown promising results in various domains. In anomaly detection for IDS, the integration of deep learning techniques can help to capture complex patterns and dependencies in network traffic data, improving the accuracy of detection [9].

  2. Explainable AI for anomaly detection [35]: Explainability is a crucial aspect of anomaly detection in IDS. As complex machine learning models are being used, understanding and interpreting their decisions become essential. Future research focuses on developing explainable AI techniques that provide transparency and insights into the reasoning behind anomaly detections.

  3. Real-time and streaming anomaly detection: Traditional batch processing approaches are not sufficient to handle the high-speed and large-scale nature of network traffic data. Future directions involve developing real-time and streaming anomaly detection methods that can process and analyze data on the fly, allowing for timely detection and response to anomalies.

  4. Integration of multiple data sources: IDS can benefit from the integration of multiple data sources, such as network traffic data, system logs, and user behavior data. Incorporating diverse data sources and applying advanced fusion techniques can enhance the accuracy and robustness of anomaly detection.


9. Conclusion

In conclusion, this chapter has provided an overview of anomaly detection techniques in intrusion detection systems (IDS). We discussed the two main types of IDS, network-based and host-based, as well as the two main detection approaches, signature-based detection and anomaly detection, and highlighted the advantages of using anomaly detection techniques over signature-based approaches. We explored various anomaly detection techniques, including statistical-based techniques, clustering-based techniques, density-based techniques, reconstruction-based techniques, and One-class support vector machines (SVM).

We also discussed the importance of data collection and preprocessing in IDS, emphasizing the relevance of different data sources and the need for effective preprocessing techniques to enhance anomaly detection accuracy. Furthermore, we covered the evaluation and performance metrics used to assess the effectiveness of anomaly detection techniques, including commonly used evaluation datasets and performance metrics such as accuracy, precision, recall, F1-score, ROC curve, and AUC.

We highlighted the challenges faced in anomaly detection for IDS, such as the curse of dimensionality, concept drift, and adversarial attacks. These challenges require ongoing research and development efforts to improve the accuracy and resilience of anomaly detection techniques. Additionally, we discussed emerging trends and future directions in the field, including the integration of deep learning techniques, the use of explainable AI, and the exploration of real-time and streaming anomaly detection methods.

In conclusion, anomaly detection techniques play a crucial role in IDS for enhancing network security by identifying potential threats and attacks. However, there are ongoing challenges and opportunities for further research and development. By addressing these challenges and embracing emerging trends, we can advance the field of anomaly detection in IDS and improve the detection and prevention of sophisticated and unknown attacks, ultimately enhancing the overall security of network systems.

References

  1. Kumar KN, Sukumaran S. A survey on network intrusion detection system techniques. International Journal of Advanced Technology and Engineering Exploration. 2018;5(47):385-393
  2. Modi C, Patel D, Borisaniya B, Patel H, Patel A, Rajarajan M. A survey of intrusion detection techniques in cloud. Journal of Network and Computer Applications. 2013;36(1):42-57
  3. Liu M, Xue Z, Xu X, Zhong C, Chen J. Host-based intrusion detection system with system calls: Review and future trends. ACM Computing Surveys (CSUR). 2018;51(5):1-36
  4. Khraisat A, Gondal I, Vamplew P, Kamruzzaman J. Survey of intrusion detection systems: Techniques, datasets and challenges. Cybersecurity. 2019;2(1):1-22
  5. Jyothsna V, Prasad R, Prasad KM. A review of anomaly based intrusion detection systems. International Journal of Computer Applications. 2011;28(7):26-35
  6. Gangwar A, Sahu S. A survey on anomaly and signature based intrusion detection system (IDS). International Journal of Engineering Research and Applications. 2014;4(4):67-72
  7. Jmila H, Khedher MI. Adversarial machine learning for network intrusion detection: A comparative study. Computer Networks. 2022;214:109073
  8. Zamani M, Movahedi M. Machine Learning Techniques for Intrusion Detection. 2013. 11 p. Available from: arxiv.org [Revised in 2015]
  9. Kocher G, Kumar G. Machine learning and deep learning methods for intrusion detection systems: Recent developments and challenges. Soft Computing. 2021;25(15):9731-9763
  10. Yaacob AH, Tan IK, Chien SF, Tan HK. ARIMA based network anomaly detection. In: 2nd International Conference on Communication Software and Networks, 2010, Singapore. Singapore: IEEE; 2010. pp. 205-209
  11. Shirani P, Azgomi MA, Alrabaee S. A method for intrusion detection in web services based on time series. In: 28th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE). Halifax, Canada: IEEE; 2015. pp. 836-841
  12. Liu H, Lang B. Machine learning and deep learning methods for intrusion detection systems: A survey. Applied Sciences. 2019;9(20):4396
  13. Davis JJ, Clark AJ. Data preprocessing for anomaly based network intrusion detection: A review. Computers & Security. 2011;30(6–7):353-375
  14. Alasadi SA, Bhaya WS. Review of data preprocessing techniques in data mining. Journal of Engineering and Applied Sciences. 2017;12(16):4102-4107
  15. Haq NF, Onik AR, Hridoy MAK, Rafni M, Shah FM, Farid DM. Application of machine learning approaches in intrusion detection system: A survey. IJARAI-International Journal of Advanced Research in Artificial Intelligence. 2015;4(3):9-18
  16. Salih AA, Abdulazeez AM. Evaluation of classification algorithms for intrusion detection system: A review. Journal of Soft Computing and Data Mining. 2021;2(1):31-40
  17. Aburomman AA, Reaz MBI. Survey of learning methods in intrusion detection systems. In: 2016 International Conference on Advances in Electrical, Electronic and Systems Engineering (ICAEES). Putrajaya, Malaysia: IEEE; 2016
  18. Bohara B, Bhuyan J, Wu F, Ding J. A survey on the use of data clustering for intrusion detection system in cybersecurity. International Journal of Network Security & Its Applications. 2020;12(1):1
  19. Wicaksana AK, Cahyani DE. Modification of a density-based spatial clustering algorithm for applications with noise for data reduction in intrusion detection systems. International Journal of Fuzzy Logic and Intelligent Systems. 2021;21(2):189-203
  20. Xu Y-X, Pang M, Feng J, Ting KM, Jiang Y, Zhou Z-H. Reconstruction-based anomaly detection with completely random forest. In: SIAM International Conference on Data Mining (SDM21), April 29 - May 1, 2021, Virtual Conference. Philadelphia, PA, USA: SIAM; 2021
  21. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial networks. Communications of the ACM. 2020;63(11):139-144
  22. Mahfouz AM, Abuhussein A, Venugopal D, Shiva SG. Network intrusion detection model using one-class support vector machine. In: Advances in Machine Learning and Computational Intelligence: Proceedings of ICMLCI 2019. Singapore: Springer Nature; 2021
  23. Panigrahi R, Borah S. A detailed analysis of CICIDS2017 dataset for designing intrusion detection systems. International Journal of Engineering & Technology. 2018;7(3.24):479-482
  24. Stiawan D, Idris MYB, Bamhdi AM, Budiarto R. CICIDS-2017 dataset feature analysis with information gain for anomaly detection. IEEE Access. 2020;8:132911-132921
  25. Wang G, Hao J, Ma J, Huang L. A new approach to intrusion detection using artificial neural networks and fuzzy clustering. Expert Systems with Applications. 2010;37(9):6225-6232
  26. Parhizkari S, Menhaj MB, Sajedin A. A Cognitive Based Intrusion Detection System. 2020. 19 p. Available from: arxiv.org [Revised in 2022]
  27. Verleysen M, François D. The curse of dimensionality in data mining and time series prediction. In: Computational Intelligence and Bioinspired Systems: 8th International Work-Conference on Artificial Neural Networks, IWANN 2005, Vilanova i la Geltrú, Barcelona, Spain, June 8-10, 2005, Proceedings 8. Barcelona, Spain: Springer; 2005
  28. Aljanabi M, Ismail MA, Ali AH. Intrusion detection systems, issues, challenges, and needs. International Journal of Computational Intelligence Systems. 2021;14(1):560-571
  29. Brownlee J. Concept Drift. 2023. Available from: https://machinelearningmastery.com/gentle-introduction-concept-drift-machine-learning/
  30. Castillo D. What is Concept Drift. 2023. Available from: https://www.seldon.io/machine-learning-concept-drift
  31. Mbow M, Sakurai K, Koide H. Advances in adversarial attacks and defenses in intrusion detection system: A survey. In: Science of Cyber Security - SciSec 2022 Workshops: AI-CryptoSec, TA-BC-NFT, and MathSci-Qsafe 2022, Matsue, Japan, August 10-12, 2022, Revised Selected Papers. Matsue, Japan: Springer; 2023
  32. Zizzo G, Hankin C, Maffeis S, Jones K. Adversarial attacks on time-series intrusion detection for industrial control systems. In: 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), 29 Dec 2020 - 01 Jan 2021. Guangzhou, China: IEEE; 2020. ISBN: 978-0-7381-4380-4
  33. Alotaibi A, Rassam MA. Adversarial machine learning attacks against intrusion detection systems: A survey on strategies and defense. Future Internet. 2023;15(2):62
  34. Yehuda Y. New Trends in AI and Machine Learning for Anomaly Detection. 2023. Available from: https://www.rad.com/blog/new-trends-ai-and-machine-learning-anomaly-detection
  35. Zehra S, Faseeha U, Syed HJ, Samad F, Ibrahim AO, Abulfaraj AW, et al. Machine learning-based anomaly detection in NFV: A comprehensive survey. Sensors. 2023;23(11):5340
