Open access peer-reviewed chapter

Particle Swarm Optimization of Convolutional Neural Networks for Human Activity Prediction

Written By

Preethi Gunishetty Devarakonda and Bojan Bozic

Reviewed: 15 March 2021 Published: 02 November 2022

DOI: 10.5772/intechopen.97259

From the Edited Volume

Optimisation Algorithms and Swarm Intelligence

Edited by Nodari Vakhania and Mehmet Emin Aydin

Abstract

The increased use of smartphones for daily activities has created strong demand and new opportunities in ubiquitous computing for personalized services and user support. Sensor-based Human Activity Recognition (HAR) has grown immensely over the last decade and plays a major role in pervasive computing by detecting the activity a user performs. Accurate prediction of user activity is a valuable input to applications such as health monitoring systems, wellness and fitness tracking, and emergency communication systems. The current research performs Human Activity Recognition using a Particle Swarm Optimization (PSO) based Convolutional Neural Network (CNN), which converges faster and searches for the best CNN architecture. Using PSO in the training process is intended to optimize the solution vectors of the CNN, which in turn improves classification accuracy to a level comparable with state-of-the-art designs. The study investigates the performance of the PSO-CNN algorithm and compares it with classical machine learning algorithms and deep learning algorithms. The experimental results show that the PSO-CNN algorithm achieved performance close to the state-of-the-art designs, with an accuracy of 93.64%. Among the machine learning algorithms, Support Vector Machine was the best classifier with an accuracy of 95.05%, and a deep CNN model achieved an accuracy of 92.64%.

Keywords

  • Human Activity Recognition
  • Particle Swarm Optimisation
  • Convolutional Neural Network
  • Time Series Classification
  • Deep Learning
  • Sensors

1. Introduction

1.1 Background

Activity Recognition aims to identify the activity of users from a series of observations collected during the activity in a defined context. Applications enabled with activity recognition are gaining considerable attention, as users get personalized services and support based on their contextual behavior. The proliferation of wearable devices and smartphones has enabled real-time monitoring of human activities through sensors embedded in smart devices, such as proximity sensors, cameras, microphones, magnetometers, accelerometers, gyroscopes and GPS. Understanding human activities by inferring gesture or position has therefore become a competitive challenge in building personal health care systems, examining wellness and fitness characteristics, and, most prominently, in elderly care, abnormal activity detection, and the monitoring of conditions such as diabetes or epilepsy.

Human Activity Recognition (HAR) therefore plays a significant part in enhancing people’s lifestyles, provided it can learn high-level information from raw sensor data. Effective HAR applications include contextual behavior analysis [1], video surveillance analysis [1], gait investigation (to detect abnormalities in walking or running), and gesture and position recognition.

1.2 Research problem

Human Activity Recognition (HAR) is a challenging time series classification task: it predicts human activity from sensor data whose data points are recorded at regular intervals. Although HAR may seem straightforward, numerous challenges arise in selecting an appropriate feature processing technique, and choosing the correct modeling algorithm for the time series data is crucial. Conventional approaches have made extraordinary progress on HAR by incorporating machine learning algorithms such as Naïve Bayes, Decision Tree, Support Vector Machine and Logistic Regression, as only a few labeled data are available. These approaches, however, require domain knowledge to perform feature extraction manually. Deep learning algorithms, on the other hand, have shown high performance in areas like Natural Language Processing and Object Recognition. Alongside these advancements, another line of research has emerged that applies nature-inspired metaheuristic optimization techniques, such as Particle Swarm Optimization and Genetic Algorithms, to neural networks. The research question addressed in the current study can be stated concisely as follows.

“To what extent can the Particle Swarm Optimized Convolutional Neural Network significantly enhance the recognition of human activity from raw inertial sensor data when compared with supervised machine learning algorithms and Deep Learning Algorithms”.

Algorithms: Naive Bayes, Support Vector Machine (SVM), Random Forest (RF), Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), PSO Optimized Convolutional Neural Network.

1.3 Research objectives

The most feasible way to overcome these challenges is to examine existing works and analyze their experimental setups. Picking the right sensor and the right gestures, with demonstrated capabilities, can significantly reduce the chance of inaccurate sensor data. Traditional machine learning algorithms require a large amount of labeled static data and manual feature selection, but in real applications most activity data are unlabelled and the entire data needs to be analyzed. Since deep learning methods can train on the entire data and analyze complex features, this study focuses on investigating the performance of various deep learning models in classifying the time series data.

A Convolutional Neural Network has a large number of parameters to tune, and tuning them is time consuming. The study therefore explores optimizing a Convolutional Neural Network with the metaheuristic algorithm Particle Swarm Optimization. It also investigates in depth the two major approaches followed in HAR, namely machine learning using hand-crafted features and deep learning using raw inertial signals. The process by which the research was carried out is listed below.

  1. Exploring previous work on Human Activity Recognition and identifying gaps in the research through a detailed analysis.

  2. Preparing and pre-processing the data, considering two versions of the dataset.

  3. Designing a solution that performs Human Activity Recognition using a Particle Swarm Optimized Convolutional Neural Network.

  4. Implementing the solution specified in the proposed design and tuning the models to obtain the expected accuracy.

  5. Evaluating the performance of the various models.

  6. Comparing the results obtained for the different models and presenting the findings in the study.

2. Review of existing literature

This section provides an overview of Human Activity Recognition and its applications. Various approaches to the HAR task are discussed; in particular, sensor-based HAR is detailed with its different sensor modalities. The section also gives an overview of the modeling approaches for HAR.

Sensor Based Human Activity Recognition: Owing to the immense growth of sensor technology and ubiquitous computing, sensor-based Human Activity Recognition, which is widely used with enhanced protection and privacy, is gaining attention. According to [2], the HAR task can be achieved by placing sensors at different locations to recognize human activity for a specific context. Table 1 lists the sensor modalities used for HAR, based on sensor placement at different locations.

| Modality | Description | Sensor Types |
|---|---|---|
| Wearable | Usually worn by the user to capture body movements | Smartphone, watches, gyroscope, accelerometer |
| Object Sensors | Mounted on objects to capture object movements | RFID, accelerometer on objects |
| Ambient Sensors | Mounted in the surrounding environment to record user interaction | Bluetooth, sound, WiFi |
| Hybrid Sensors | Combination of multiple sensors | Multiple types, often deployed in smart environments |

Table 1.

Sensor modalities for human activity recognition tasks [3].

Body-worn sensors/wearable sensors: Wearable sensors are one of the most widely used sensor modalities in HAR. These sensors, namely accelerometers, gyroscopes and magnetometers, are worn by or attached to the user. As the human body moves, the acceleration and angular velocity vary, and this data is analyzed to predict the activity. The sensors can be embedded in smartphones, smart watches, fitness bands, headbands and similar devices. Figure 1 shows the different wearable sensors that can be used by humans. The studies in [4, 5] examine the significance of the sensor and its appropriate position on the user's body. Many studies have investigated how accuracy varies when sensors are placed on different parts of the human body. One such study [6] placed sensors on the chest and wrist for a duration of two hours and obtained 83% classification accuracy.

Figure 1.

Body worn sensors [4].

Wearable sensors have thus been widely used for HAR [7] in various health monitoring systems. More recently, inertial sensing, which uses movement-based sensors attached to the user’s body, has been studied widely [8]. Among those works, the accelerometer is the most commonly used sensor for collecting position details. Gyroscopes and magnetometers are also used in combination with the accelerometer.

2.1 Modeling approaches for human activity recognition

In any data mining project, the choice of an appropriate modeling algorithm depends not only on the type of problem to solve, but also on the type of input data. Because of the natural ordering of the temporal feature data, Human Activity Recognition is treated as a typical pattern recognition system that classifies human activity from a series of data. The main difference between machine learning algorithms and deep learning algorithms in recognizing human activity is the way the input features are extracted. The sections below explain the methodology chosen for the task of Human Activity Recognition.

Figure 2.

Process of human activity recognition using hand-crafted features modeled with machine learning algorithms.

Machine Learning Algorithms for Human Activity Recognition: Treating HAR as a pattern recognition problem, conventional pattern recognition methods have achieved extraordinary results using machine learning algorithms such as hidden Markov models, decision trees, support vector machines and naive Bayes [9]. Figure 2 illustrates the process of Human Activity Recognition using hand-crafted features modeled with machine learning algorithms. The raw inertial activity signals received from the sensors are subjected to a feature extraction process by domain experts [10]. The extracted features usually fall into two main domains: the time domain and the frequency domain [11]. Time domain features are computed with mathematical functions that extract statistical details from the signals. Frequency domain features are computed with mathematical functions that capture recurring patterns in the signals. Thus, in the machine learning approach to HAR, the input data always come from human-engineered, hand-crafted features. These features may be further pre-processed using dimensionality reduction techniques to select the significant ones.
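To make the distinction between the two feature domains concrete, the following minimal sketch computes a few time domain and frequency domain descriptors for one signal window. The particular features (mean, standard deviation, spectral energy, dominant frequency), the function names and the synthetic 5 Hz test signal are illustrative assumptions; they are not the feature set used by the cited works or by the dataset in this chapter.

```python
import cmath
import math
import statistics


def time_domain_features(window):
    """Simple statistical descriptors of one signal window."""
    return {
        "mean": statistics.fmean(window),
        "std": statistics.pstdev(window),
        "min": min(window),
        "max": max(window),
    }


def frequency_domain_features(window, sampling_rate_hz):
    """Spectral energy and dominant frequency via a naive DFT (O(n^2))."""
    n = len(window)
    magnitudes = []
    # Only the first n // 2 + 1 bins are unique for a real-valued signal.
    for k in range(n // 2 + 1):
        coeff = sum(
            window[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(coeff))
    # Skip bin 0 (the constant offset) when searching for the dominant frequency.
    dominant_bin = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
    return {
        "energy": sum(m * m for m in magnitudes) / n,
        "dominant_freq_hz": dominant_bin * sampling_rate_hz / n,
    }


# Synthetic example (assumption): a 5 Hz sine sampled at 50 Hz, 100 samples.
signal = [math.sin(2 * math.pi * 5 * t / 50) for t in range(100)]
features = {
    **time_domain_features(signal),
    **frequency_domain_features(signal, 50),
}
```

A real pipeline would typically use an FFT routine instead of the naive transform; the slow version is kept here only so the sketch has no external dependencies.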

Selecting important features matters more than choosing a classification algorithm [12], because poor-quality features can hinder the performance of the classifier. Hassan et al. [13] employed Kernel Principal Component Analysis (KPCA), which is based on statistical analysis, before modeling. Furthermore, [14] employed Stepwise Linear Discriminant Analysis (SWLDA), a non-linear method that selects a subset of features using regression combined with an F-test. The model showed enhanced performance after the dimensionality reduction technique was applied.

Different modeling algorithms have been employed for human activity recognition. Ravi et al. [15] used a Naïve Bayes classifier with few parameter settings to classify 8 different activities, and it outperformed other classification algorithms. Several studies employed Naive Bayes as the primary classifier for human activity recognition [16, 17].

In recent times, learning algorithms based on error computation, namely Artificial Neural Networks [18, 19] and Support Vector Machines [20, 21], have been used for HAR without any data pre-processing technique applied.

According to the studies reviewed, the most used modeling algorithms that showed efficient results are Naive Bayes, Multinomial Logistic Regression, K-Nearest Neighbor, Hidden Markov Models, Support Vector Machine and Artificial Neural Network.

Deep Learning Approaches for Human Activity Recognition: Although conventional Pattern Recognition (PR) approaches achieved satisfactory results in HAR, these methods rely heavily on hand-crafted feature generation, usually done by domain experts [22]. This sometimes leads to errors in the collected data and to missing significant data points. Deep Neural Networks, on the other hand, are capable of automatic feature extraction without human intervention. In fact, the model becomes more robust when the data is large [23].

Figure 3 illustrates the process of HAR followed by deep learning algorithms. The raw sensor signals collected from inertial sensors (accelerometer, gyroscope, etc.) are passed directly to modeling, with no feature extraction step. Additionally, deep learning follows an unsupervised, incremental learning approach, which makes it more feasible for implementing HAR tasks [24].

Figure 3.

Process of human activity recognition using raw inertial signals modeled with deep learning algorithms.

Several deep learning models have been employed to perform activity recognition in various contexts. Liu et al. [25] investigated the performance of Restricted Boltzmann Machines for activity recognition from data collected through smart watches. The method outperformed other models and achieved high accuracy with less computation time.

Additionally, Long Short-Term Memory (LSTM) models have been used to predict the activity performed in unbalanced real-world data, where model performance was evaluated using the F1 score because of the imbalanced nature of the data [26]. In Vepakomma et al. [27], hand-crafted features are obtained from inertial sensors and fed into a DNN algorithm. Similarly, [28] used PCA as a dimensionality reduction technique before modeling with a Deep Neural Network (DNN). However, since domain knowledge is used for feature extraction, the model cannot be generalized.

Some works used Recurrent Neural Networks (RNN) for HAR [26, 29], where the learning rate and computational power are the main constraints. More time is invested in finding the optimal set of hyperparameters that provides the best results. Inoue et al. [30] identified various model parameters and recommended a model that achieves high HAR accuracy by tuning the hyperparameters. The main challenge for RNN-based Human Activity Recognition models is to operate in time- and power-constrained environments while still striving for good performance.

Furthermore, CNNs are used extensively for HAR tasks with varied experimental settings. In general, CNNs are mostly used for image classification with 2-Dimensional Convolution, which accepts data of shape n * n. Several works reshaped the one-dimensional input data into a 2D image in order to use 2D convolution. Ha et al. [31] used such an approach in reshaping the input data to a 2D image, while [7] designed a complex CNN by transforming time series data into an image. Other works [32, 33] also performed data transformation to obtain model-driven CNN results.

3. Why particle swarm optimization?

As seen in the section above, deep learning networks have achieved better results with less effort in parameter settings. In particular, deep Convolutional Neural Networks are used extensively because of their flexibility in both the data-driven approach (using 1D convolution on signal data) and the model-driven approach (transforming signal data into a 2D image). To obtain higher model performance, several layers have to be used and parameter initialization has to be done carefully. This requires detailed knowledge of both the CNN architecture and the dataset.

Thus, to find the optimal CNN architecture automatically without human intervention, a meta heuristic algorithm Particle Swarm Optimisation is utilized which is easy to implement with lower computational cost.

Theory:

Particle Swarm Optimization (PSO) is a nature-inspired metaheuristic algorithm often used for discrete, continuous and sometimes combinatorial optimization problems. PSO was first introduced by Kennedy and Eberhart in 2001 [34] and is inspired by the pattern followed by a flock of birds during flight. PSO makes few or no assumptions about the problem being optimized and can search large spaces of candidate solutions efficiently.

In PSO, a single solution is called a particle and the set of all such solutions is termed the swarm. The main idea behind PSO is that each particle knows its velocity, the best configuration it has achieved in the past (pBest), and the particle that is the current global best configuration in the swarm (gBest). Hence, at every iteration, each particle updates its velocity so that its new position moves closer to the global gBest and its own pBest at the same time. The velocity and position vectors are updated according to Eqs. (1) and (2), respectively:

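$$v_{id}^{t+1} = w\,v_{id}^{t} + c_1 r_1 \left(P_{id} - x_{id}^{t}\right) + c_2 r_2 \left(P_{gd} - x_{id}^{t}\right) \tag{1}$$

$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1} \tag{2}$$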

where v_id is the velocity of the ith particle in the dth dimension, x_id is the position of the ith particle in the dth dimension, P_id and P_gd are the local best and the global best in the dth dimension, r1 and r2 are random numbers in the range 0 to 1, and c1, c2 and w are the acceleration coefficient for exploitation, the acceleration coefficient for exploration and the inertia weight, respectively. Since the encoded vector in the proposed method is fixed-length and consists of decimal values, and PSO is effective at searching for the optimal solution in a fixed-length search space of decimal values, the proposed method uses PSO as the search algorithm. One advantage of PSO is that it converges faster than Genetic Algorithms (GA).
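The two update rules above can be sketched as a small, generic PSO loop. This is a minimal illustration under stated assumptions: the function name `pso_minimize`, the coefficient values (w = 0.7, c1 = c2 = 1.5), the search bounds and the sphere test function are chosen for demonstration only and are not the settings used in this study.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iterations=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` over a `dim`-dimensional real-valued space."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in x]                   # personal best positions
    p_best_val = [objective(p) for p in x]
    g_idx = min(range(n_particles), key=lambda i: p_best_val[i])
    g_best, g_best_val = p_best[g_idx][:], p_best_val[g_idx]

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward pBest + pull toward gBest.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (p_best[i][d] - x[i][d])
                           + c2 * r2 * (g_best[d] - x[i][d]))
                # Position update: move the particle by its new velocity.
                x[i][d] += v[i][d]
            val = objective(x[i])
            if val < p_best_val[i]:
                p_best[i], p_best_val[i] = x[i][:], val
                if val < g_best_val:
                    g_best, g_best_val = x[i][:], val
    return g_best, g_best_val


def sphere(p):
    """Toy objective with its minimum (0) at the origin."""
    return sum(c * c for c in p)


best_position, best_value = pso_minimize(sphere, dim=3)
```

In the PSO-CNN setting, `objective` would be replaced by a function that evaluates a candidate encoding with the CNN loss, which is far more expensive than the toy function used here.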

Gaps in the Research:

The literature review outlines the existing works on Human Activity Recognition in terms of the modeling approaches chosen. However, certain gaps are found in both the approaches.

Some research works that employed a machine learning approach to HAR with hand-crafted features suffered from low performance, as only shallow features are explored and learned by the classifiers. Before deep learning was used extensively, shallow neural network classifiers such as the Multi-Layer Perceptron (MLP) were considered promising algorithms for HAR. In this respect, [35] performed HAR with logistic regression, decision tree and MLP, and MLP outperformed the other two models.

Deep convolutional neural networks (CNNs) have obtained excellent results on most image classification benchmark datasets and have reduced the need for human experts in classification. It nevertheless remains challenging to find a meaningful CNN architecture that applies to all types of domains. As a result, successful CNN architectures such as ResNet [36], VGG16 [37] and DenseNet [38] were introduced with domain knowledge in mind, and they outperformed the baseline CNN models. However, these architectures were designed through a great deal of trial and error and are suitable only for problems in a specific context.

A PSO algorithm was employed to train an Artificial Neural Network by [39]. The results show that the ANN’s training time was greatly reduced with PSO. Similarly, [40] designed PSO algorithms for two tasks: training an ANN and finding a better architecture. This achieved competitive results compared with other models.

Most works have used PSO to find optimal architectures in fully connected networks [41], but these cannot be used for tasks such as image classification and activity recognition, which use complex deep layers. Recently, [42] proposed PSO-based training for a CNN architecture that is suitable only for image classification with 2-Dimensional Convolutions. The experiment was performed on 10 benchmark datasets, and the results were outstanding.

However, little work has been done on using 1-Dimensional Convolution to find optimal CNN architectures. Considering the gaps in the literature, the current research aims to address a few of these issues and find solutions that generalize the models for Human Activity Recognition tasks.

Research Question: Thus, considering the gaps in the above mentioned literature review, this research aims to address the below research question.

“To what extent can the Particle Swarm Optimized Convolutional Neural Network significantly enhance the recognition of human activity from raw inertial sensor data when compared with supervised machine learning algorithms and Deep Learning Algorithms”.

4. Experiment design and methodology

This chapter gives the plan and the research methodology used for performing the research. The Cross Industry Standard Process for Data Mining (CRISP–DM), a well proven methodology with a structured approach (Piatetsky, 2014) is employed to conduct the current study.

Figure 4 shows the design flow followed in the current research. The experiment begins with the Business Understanding phase, which states what is to be accomplished from a business perspective. The expected outputs of this phase form the main objectives of the project, and the insights and goals of the project are defined here. To answer the research question, the experiment is conducted with two versions of the dataset, which are explained in the Data Understanding phase. Additionally, a data description report is prepared to understand the description of each field. This is done separately for both datasets.

Figure 4.

Design flow.

The third phase is the Data preparation stage. Here the data is checked for duplicates, null records and appropriate action is taken to address them. Further, new derived fields can be formed based on the domain knowledge. Data from multiple databases are integrated to form the final dataset for modeling. The fourth step is the modeling stage.

Based on the initial analysis from the literature review, suitable modeling techniques are chosen and applied to the two versions of the dataset. The next phase is the Evaluation phase, in which the models are evaluated against the evaluation criteria to see whether they meet the business objective.

Business Understanding: The primary aim of the project is to identify the best classification algorithm for accurately identifying different human activities in motion. The predicted activity can then be applied in multiple applications such as health monitoring and control systems, wellness and fitness tracking, and alerting in emergency situations.

  • Business Success Criteria: The solution to the research problem must not only find the models that perform well in classifying the target data, but also confirm that the results obtained are significant and consistent when the solution is repeated. Considering the above business objectives, evaluation criteria and constraints, the hypotheses below are formed to answer the research question.

Hypothesis 1

  • Null Hypothesis (H0): If Particle Swarm Optimized Convolutional Neural Network is used instead of supervised machine learning then there is no significant improvement in classification of Human Activity in terms of accuracy and the F1 score.

  • Alternate Hypothesis (HA): If Particle Swarm Optimized Convolutional Neural Network is used instead of supervised machine learning, then there is significant improvement in classification of Human Activity in terms of accuracy and the F1 score.

Hypothesis 2

  • Null Hypothesis(H0): If Particle Swarm Optimized Convolutional Neural Network is used instead of deep learning algorithms, then there is no significant improvement in classification of Human Activity in terms of accuracy and the F1 score.

  • Alternate Hypothesis (HA): If Particle Swarm Optimized Convolutional Neural Network is used instead of deep learning algorithms, then there is significant improvement in classification of Human Activity in terms of accuracy and the F1 score.
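Both hypotheses are stated in terms of accuracy and the F1 score, so a short sketch of how these two metrics can be computed for a multi-class problem may help. The macro-averaged form of F1 and the three example labels are assumptions made for illustration; the chapter does not state which averaging scheme was used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (macro averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for six activity windows.
y_true = ["walk", "walk", "sit", "sit", "stand", "stand"]
y_pred = ["walk", "walk", "sit", "stand", "stand", "stand"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

Here one "sit" window is misclassified as "stand", so accuracy is 5/6 while the per-class F1 scores are 1.0, 2/3 and 0.8, which shows how F1 exposes per-class errors that accuracy averages away.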

Data Understanding: The dataset used in this study was downloaded from the UCI Machine Learning Repository and was created at SmartLab, one of the research laboratories of DIBRIS at the University of Genova. (Anguita, 2006) experimented on a group of 30 volunteers aged between 19 and 48 years who performed daily activities. Each volunteer was monitored while carrying a waist-mounted smartphone (Samsung Galaxy S II) with embedded inertial sensors.

Using the embedded accelerometer and gyroscope, 3-axial linear acceleration and 3-axial angular velocity were captured at a constant rate of 50 Hz. The experiments were video-recorded so that the data could be labeled manually. The accelerometer and gyroscope signals were pre-processed with noise filters and sampled in fixed-width sliding windows of 2.56 seconds with 50% overlap, which gives 128 readings per window (2.56 s × 50 Hz). The resulting acceleration signal combines gravitational and body acceleration components; a low-pass filter was applied to separate them, with the gravitational component confined to the low-frequency end of the filter. From each window, a vector of time and frequency domain variables was then obtained.
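The windowing arithmetic described above can be checked with a short sketch. The sampling rate, window length and overlap come from the text; the function name `sliding_windows` and the integer stand-in stream are assumptions for illustration and do not reproduce the dataset authors' code.

```python
SAMPLING_RATE_HZ = 50      # from the text
WINDOW_SECONDS = 2.56      # from the text
OVERLAP = 0.5              # 50% overlap, from the text


def sliding_windows(samples, window_size, step):
    """Split a 1D sequence into fixed-width windows, dropping any partial tail."""
    return [
        samples[start:start + window_size]
        for start in range(0, len(samples) - window_size + 1, step)
    ]


window_size = round(SAMPLING_RATE_HZ * WINDOW_SECONDS)   # 128 readings
step = int(window_size * (1 - OVERLAP))                  # 64 readings

# Stand-in for 6.4 s of one sensor channel (320 samples at 50 Hz).
stream = list(range(320))
windows = sliding_windows(stream, window_size, step)
```

With these settings every reading except those at the very start and end of the recording appears in exactly two windows, which is what the 50% overlap is meant to achieve.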

Thus, each record in the dataset includes below features:

  • total acceleration and approximate body acceleration which is obtained from Triaxial Accelerometer.

  • Triaxial Angular velocity is extracted from gyroscope.

  • A 561-feature vector with time and frequency domain variables.

  • activity as target.

  • An identifier which indicates the participant who performed the experiment.

Data Gathering: Two versions of the data were made available for modeling purposes. These are as follows:

  • Hand-crafted features of activity windows - Version 1: Each recorded window has a 561-column vector with time and frequency domain variables rectified separately, an activity label ID indicating the activity performed by the subject, and the ID of the subject who carried out the experiment.

  • Raw inertial sensor data - Version 2: Tri-axial raw signals from the gyroscope and accelerometer sensors, collected by placing wearable sensors on the volunteers (called subjects).

4.1 Modeling

Figure 5 shows the distribution of the target activities. Modeling algorithms are used to predict the target class.

Figure 5.

Distribution of target class.

4.1.1 Machine learning algorithms – with hand-crafted features – Version 1

As discussed, the Version 1 dataset is modeled with classical machine learning algorithms. Each modeling algorithm is discussed below along with its parameter settings.

Naïve Bayes: The Naïve Bayes classifier applies Bayes' theorem to provide probabilistic classification [30]. It is suitable for fast computation, especially on large data.

Random Forest: The Random Forest classifier is an ensemble learning method for classification, regression and other machine learning tasks that combines multiple decision trees. It trains by building multiple decision trees and, for classification, outputs the class that is the mode of the classes predicted by the individual trees.

Support Vector Machine: Support Vector Machine is one of the baseline models that achieved the highest accuracy in human activity recognition when compared with other classical machine learning algorithms. SVM is used for both classification and regression tasks and works by building a maximum-margin hyperplane between classes.
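The aggregation step of a Random Forest, taking the mode of the individual tree predictions, can be sketched in a few lines. The function name `majority_vote` and the example votes are hypothetical; tree training itself is omitted.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the most frequent class label among the trees' predictions."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical predictions from five decision trees for one activity window.
votes = ["WALKING", "SITTING", "WALKING", "WALKING", "STANDING"]
predicted_class = majority_vote(votes)
```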

4.1.2 Deep learning algorithms – With raw inertial signals – Version 2

Sensor-based activity recognition requires domain-level knowledge about human activities to analyze even the minute details of sensor data. Although traditional machine learning algorithms have shown extraordinary performance in classifying human activities, they require domain knowledge and rely on few labeled data. In contrast, deep learning is capable of training on real-time activity data that arrive as a stream or sequence. Treating Human Activity Recognition as a time series classification problem, which aims at classifying sequences of sensor data, two well-known algorithms, LSTM and CNN, are modeled on the raw inertial signal data (Version 2).

Long Short-Term Memory (LSTM): Long Short-Term Memory networks are a special kind of Recurrent Neural Network (RNN) capable of learning long-term dependencies. LSTMs are designed to avoid the long-term dependency problem. The LSTM learns to map each sliding window of sensor data to an activity, where the samples in the input sequence are read one at a time and each time step may consist of one or more variables.

Convolutional Neural Network (CNN): CNNs have achieved good results in image classification, sentiment analysis and speech recognition tasks by extracting features from signals. CNNs have been used for time series classification problems, especially for classifying real-time activity data, because of scale invariance and local dependencies. Local dependency means that nearby signals in Human Activity Recognition (HAR) are likely to be correlated with each other, while scale invariance means that the scale remains the same for different time and frequency domains. Thus, a CNN is well suited to learning features that are present in recurring patterns.
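The idea of local dependency can be illustrated with a minimal one-dimensional convolution in which each output value depends only on a few neighbouring samples. The hand-picked kernel and the toy signal are assumptions for illustration; in a CNN the kernel weights are learned during training, and deep learning libraries provide optimized versions of this operation.

```python
def conv1d_valid(signal, kernel):
    """1D sliding dot product without padding, as used in CNN layers."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# A short pulse and a kernel that responds to rising and falling edges.
toy_signal = [0, 0, 1, 1, 1, 0, 0]
edge_kernel = [-1, 0, 1]
response = conv1d_valid(toy_signal, edge_kernel)
```

The output has length 7 - 3 + 1 = 5; it is positive where the signal rises, zero on the flat top of the pulse and negative where the signal falls.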

Particle Swarm Optimization Based CNN: Although CNNs have shown good results in HAR, many parameters must be handled to find the optimal CNN architecture. The main goal of any neural network is to minimize the error between training targets and predicted outputs. For CNNs this error is the cross-entropy, which is minimized by backpropagation and gradient descent. Even a simple CNN has many parameters to tune. It is therefore important to find algorithms that search for and evaluate CNN architectures in less time. Motivated by this, a PSO-CNN is used for Human Activity Recognition. Figure 6 shows the working of the model.

Figure 6.

Particle swarm optimization training for convolutional neural network.

The working of PSO-CNN can be divided into five stages:

  • CNN Training - The CNN is trained starting from pre-defined initial weights. A CNN with 1D convolutional layers is used, since the HAR dataset consists of signals of shape [samples, time_steps, number of features]. The output is a one-hot encoded vector of length 6 (the target activities to be predicted).

  • Pre-PSO Training - The weights are captured from the CNN training and converted into a particle.

  • Particle Swarm Optimization Training - After initializing the convergence value, cognitive value, social value, number of particles, stopping condition and number of epochs, the PSO algorithm searches the hyperspace for the optimized weight vector using the CNN loss function.

  • Update CNN Architecture - Using the weight values from the previous phase, the final results are computed. A new CNN architecture is created based on these weights rather than on the basis of the output.

  • Computation of Prediction Accuracy and Results - The output of the CNN is formed and the final accuracy and loss values are evaluated.
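The PSO training stage can be illustrated with a generic particle swarm update over a weight vector. This is a sketch under stated assumptions, not the implementation used in this study: the inertia, cognitive and social values are illustrative, and `toy_loss` is a simple quadratic standing in for the CNN cross-entropy loss evaluated on a flattened weight vector.

```python
import random


def pso_minimize(loss, dim, n_particles=20, n_iters=50,
                 inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Standard PSO: each particle is a candidate weight vector."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # best position seen by each particle
    pbest_loss = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_loss[i])
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]  # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (pbest[i][d] - pos[i][d])
                             + social * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            current = loss(pos[i])
            if current < pbest_loss[i]:
                pbest[i], pbest_loss[i] = pos[i][:], current
                if current < gbest_loss:
                    gbest, gbest_loss = pos[i][:], current
    return gbest, gbest_loss


def toy_loss(weights):
    """Hypothetical stand-in for the CNN loss: squared distance to a target vector."""
    target = [0.5, -0.25, 0.1]
    return sum((w - t) ** 2 for w, t in zip(weights, target))


best_weights, best_loss = pso_minimize(toy_loss, dim=3)
print(best_weights, best_loss)
```

In the setting described above, the returned best vector would be reshaped back into the CNN layer weights before the final accuracy and loss are computed.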

5. Implementation

Parameter Settings for Particle Swarm Optimization: The parameters in this category control the behavior of the Particle Swarm Optimisation algorithm. There are three parameters: the number of iterations, the swarm size, and Cg, which represents the probability of selecting a layer from the global best while computing each particle's velocity. The number of iterations specifies how many iterations the search algorithm runs before the optimization is complete. The best CNN architecture, that is, the one with the best accuracy, is saved after the optimization of the last particle. The swarm size indicates the number of particles in the PSO algorithm. Here, each particle is one complete CNN architecture whose performance is tested by the algorithm (Table 2).

Description | Value
Number of iterations | 10
Swarm size | 20
Cg | 0.5

Table 2.

Parameter initialization for particle swarm optimization.
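The role of Cg can be sketched as a per-layer random choice. The snippet below is a deliberately simplified assumption about how such a selection might look: for each layer position, a random draw below Cg takes the layer from the global best, and otherwise the layer from the particle's personal best. The function name, the layer labels and the equal-depth architectures are hypothetical, and differences in network depth are not handled.

```python
import random


def select_layers(personal_best, global_best, cg=0.5, rng=None):
    """Pick each layer from the global best with probability cg, else from the personal best."""
    rng = rng or random.Random()
    return [g if rng.random() < cg else p
            for p, g in zip(personal_best, global_best)]


# Hypothetical layer descriptions for two architectures of equal depth.
pbest = ["conv3x16", "pool", "conv5x32", "fc64"]
gbest = ["conv7x64", "conv3x32", "pool", "fc128"]

mixed = select_layers(pbest, gbest, cg=0.5, rng=random.Random(42))
print(mixed)
```

With Cg = 0.5, as in Table 2, both sources are equally likely; a larger value would pull particles more strongly toward the global best architecture.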

Parameter Settings for initializing CNN architecture: The parameters in the second category control the initial movement of the particles. This category involves eight parameters, listed in Table 3. In this step, an initial swarm population of CNN architectures is created. The individuals in this initial population are CNN architectures picked randomly within the limits defined by these parameters. To limit the number of feature maps output by a convolution layer, the minimum and maximum number of outputs must be defined. The kernel size is always chosen between the minimum and maximum size of a convolutional kernel. These parameters control only the initial architecture; after the first initialization, the architecture is updated based on the specified design.

Description | Value
Minimum number of outputs from a conv layer | 3
Maximum number of outputs from a conv layer | 256
Minimum number of neurons in a FC layer | 1
Minimum number of layers | 3
Maximum number of layers | 20
Minimum size of a conv kernel | 3
Maximum size of a conv kernel | 7

Table 3.

Parameter settings for initializing CNN architecture.
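A random initial swarm constrained by the bounds in Table 3 might be generated along the following lines. This is a sketch, not the study's code: the dictionary layout, the layer descriptions and the rule that only fully connected layers may follow a fully connected layer are assumptions, and `max_fc_neurons` is a hypothetical value because no upper bound for FC neurons appears in the table.

```python
import random

# Bounds from Table 3; "max_fc_neurons" is NOT in the table and is assumed here.
BOUNDS = {
    "min_conv_outputs": 3, "max_conv_outputs": 256,
    "min_fc_neurons": 1, "max_fc_neurons": 300,
    "min_layers": 3, "max_layers": 20,
    "min_kernel": 3, "max_kernel": 7,
}


def random_architecture(bounds, n_classes=6, rng=None):
    """Draw one random CNN description: conv layer first, FC output layer last."""
    rng = rng or random.Random()
    depth = rng.randint(bounds["min_layers"], bounds["max_layers"])

    def conv():
        return {"type": "conv",
                "filters": rng.randint(bounds["min_conv_outputs"], bounds["max_conv_outputs"]),
                "kernel": rng.randint(bounds["min_kernel"], bounds["max_kernel"])}

    def fc():
        return {"type": "fc",
                "neurons": rng.randint(bounds["min_fc_neurons"], bounds["max_fc_neurons"])}

    layers = [conv()]
    for _ in range(depth - 2):
        # Once a fully connected layer appears, only FC layers may follow.
        kind = "fc" if layers[-1]["type"] == "fc" else rng.choice(["conv", "pool", "fc"])
        if kind == "conv":
            layers.append(conv())
        elif kind == "fc":
            layers.append(fc())
        else:
            layers.append({"type": "pool"})
    layers.append({"type": "fc", "neurons": n_classes})  # one output per activity
    return layers


# A swarm of 20 particles, matching the swarm size in Table 2.
swarm = [random_architecture(BOUNDS, rng=random.Random(seed)) for seed in range(20)]
print(len(swarm), [len(arch) for arch in swarm])
```

Each list of layer descriptions plays the role of one particle; after this initialization the bounds are no longer consulted.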

Parameter Settings for training Convolutional Neural Network: The parameters here specify the training process of each particle. There are four parameters, listed in Table 4. These parameters control the weight updating process during the training of each particle. The number of epochs specifies the total number of times the particle is trained on the entire dataset before its accuracy is evaluated. Dropout is used in the particle to avoid overfitting. Furthermore, the model includes batch normalization between the layers to avoid overfitting during the training process.

Description | Value
Epochs for particle evaluation | 100
Epochs for global best | 256
Dropout rate | 0.5
Batch normalizer layer outputs | Yes

Table 4.

Parameter settings for training convolutional neural network.

6. Evaluation metrics

To evaluate the performance of the modeling algorithms, an appropriate metric must be chosen. The performance criteria and their measures should be chosen based on the research question that the study addresses, while also considering the feasibility of the work and the study.

Confusion Matrix - The confusion matrix, also known as a contingency table, provides an overall view of the performance of the classification model. Figure 7 shows the format of a confusion matrix.

Figure 7.

Confusion matrix.

The confusion matrix provides the following important elements of the modeling results:

  • True Positives (TP): The number of observations in the positive target class which were correctly classified by the model.

  • False Negatives (FN): The number of observations in the positive target class which were incorrectly classified as in the negative target class by the model.

  • False Positives (FP): The number of observations in the negative target class which were incorrectly classified as in the positive target class by the model.

  • True Negatives (TN): The number of observations in the negative target class which were correctly classified by the model.
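For a problem with more than two activities, these four counts can be derived per class by treating one activity as the positive class and all others as negative. The sketch below shows this on a small made-up example; the labels and predictions are hypothetical and are not results from this study.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


def one_vs_rest_counts(matrix, k):
    """Return (TP, FN, FP, TN) with class k as the positive class."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # actual k, predicted as another class
    fp = sum(row[k] for row in matrix) - tp  # another class, predicted as k
    tn = total - tp - fn - fp
    return tp, fn, fp, tn


labels = ["WALKING", "SITTING", "LAYING"]
y_true = ["WALKING", "WALKING", "SITTING", "SITTING", "LAYING", "LAYING"]
y_pred = ["WALKING", "SITTING", "SITTING", "LAYING", "LAYING", "LAYING"]

matrix = confusion_matrix(y_true, y_pred, labels)
sitting_counts = one_vs_rest_counts(matrix, labels.index("SITTING"))
print(matrix)
print(sitting_counts)
```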

HAR is a multi-class classification problem. The main challenge in a classification task is to correctly classify the target variable. The accuracy score alone cannot give the overall performance of the model. Hence, the confusion matrix, which gives the actual number of correct and incorrect predictions made for each target class, is considered. Additionally, precision, recall and F1 score are computed. For comparisons, however, accuracy and F1 score are used.
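The metrics named above can be computed directly from a confusion matrix, as in the following sketch. The matrix is a made-up three-class example, and the unweighted (macro) average of the per-class F1 scores is an assumption; the averaging scheme used in this study is not stated.

```python
def classification_scores(matrix):
    """Accuracy, per-class (precision, recall, F1) and macro F1 from a confusion matrix.

    Rows are actual classes and columns are predicted classes.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[k][k] for k in range(n)) / total

    per_class = []
    for k in range(n):
        tp = matrix[k][k]
        predicted_k = sum(row[k] for row in matrix)  # TP + FP
        actual_k = sum(matrix[k])                    # TP + FN
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append((precision, recall, f1))

    macro_f1 = sum(f1 for _, _, f1 in per_class) / n
    return accuracy, per_class, macro_f1


# Hypothetical confusion matrix for three activities.
example = [[1, 1, 0],
           [0, 1, 1],
           [0, 0, 2]]
accuracy, per_class, macro_f1 = classification_scores(example)
print(round(accuracy, 4), round(macro_f1, 4))
```

Reporting F1 alongside accuracy exposes classes that are frequently confused with one another even when the overall accuracy looks high.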

7. Results and discussion

The performance of PSO-CNN is evaluated against Machine Learning and Deep Learning Algorithms.

Comparing PSO-CNN with machine learning algorithms: The results are analysed by comparing PSO-CNN with Machine Learning Algorithms. Figure 8 shows the results.

Figure 8.

Comparison of PSO-CNN with machine learning algorithms.

From Figure 8, it is evident that the Support Vector Machine achieved an accuracy of 95.04% and an F1 score of 95.1%. The machine learning models were built using the hand-crafted features (Version 1) dataset. The model achieved satisfactory results without any data dimensionality reduction techniques. On the other hand, PSO-CNN also achieved considerable results on raw sensor data, with an accuracy of 93.64%. The hand-crafted feature extraction process, however, requires human effort to design the features manually.

Comparing PSO-CNN with deep learning algorithms: The results are analysed by comparing PSO-CNN with Deep Learning Algorithms. Figure 9 shows the results.

Figure 9.

Comparison of PSO-CNN with deep learning algorithms.

From Figure 9, it is clear that PSO-CNN achieved higher accuracy than the LSTM and CNN models. LSTM performance was low, with an accuracy of 84.71% and an F1 score of 84.42%. Thus, PSO-CNN obtained better results than the state-of-the-art CNN model. For a classification problem, the capability of the modeling algorithm to classify each target class correctly also plays a major role, so the algorithm's ability to classify each activity, such as walking, sitting and laying, is also considered. From the classification report of PSO-CNN (Figure 9), it is evident that PSO-CNN was able to classify most of the activities correctly.

7.1 Hypothesis evaluation

Two hypotheses were formed for the evaluation of the experiment.

Hypothesis 1

  • Null Hypothesis (H0): If Particle Swarm Optimized Convolutional Neural Network is used instead of supervised machine learning, then there is no significant improvement in classification of Human Activity in terms of accuracy and the F1 score.

  • Alternate Hypothesis (HA): If Particle Swarm Optimized Convolutional Neural Network is used instead of supervised machine learning, then there is significant improvement in classification of Human Activity in terms of accuracy and the F1 score.

Figure 8, which compares the results of PSO-CNN with machine learning algorithms, shows that PSO-CNN did not achieve better results than SVM. Hence, there is insufficient evidence to reject the null hypothesis.

Hypothesis 2

  • Null Hypothesis (H0): If Particle Swarm Optimized Convolutional Neural Network is used instead of deep learning algorithms, then there is no significant improvement in classification of Human Activity in terms of accuracy and the F1 score.

  • Alternate Hypothesis (HA): If Particle Swarm Optimized Convolutional Neural Network is used instead of deep learning algorithms, then there is significant improvement in classification of Human Activity in terms of accuracy and the F1 score.

From Figure 9, it is evident that PSO-CNN achieved higher accuracy than the deep learning algorithms CNN and LSTM. Hence, there is evidence to reject the null hypothesis.

8. Conclusion

This section provides an overall review of the current research. It gives a summary of the research overview and problem definition, along with the key findings of the experiment design. Suggestions for future work are highlighted.

Research Overview: The main aim of the research was to predict human activity using data collected from inertial sensors. A detailed analysis of the existing research was made, covering the types of data and the approaches used for the Human Activity Recognition task. Broadly, the HAR task has been addressed using two approaches, machine learning and deep learning, which use hand-crafted feature data and raw inertial sensor data, respectively.

Meta-heuristic optimisation algorithms such as Particle Swarm Optimization had not been explored much. With this motivation, and considering the gaps in research, the current study aimed to perform experiments with two versions of the dataset using various algorithms. As a result, data gathering and preparation were performed separately for both datasets. To compare the performance of the PSO-based CNN, various modeling algorithms were chosen and evaluated using the performance metrics.

The research was conducted with the objective of finding the classifier with the highest predictive accuracy among two different families of modeling algorithms.

Design/Experimentation, Evaluation & Results: The CRISP-DM approach was followed throughout the project to get the best outcomes at each step. Accordingly, the implementation began with Data Gathering, Data Understanding and Data Preparation for both datasets separately.

The design involved performing experiments with two versions of the dataset. For the hand-crafted features, Machine Learning Algorithms were used for modeling. Since deep neural networks can take raw input without any domain knowledge applied, raw inertial signal data was used for them. Furthermore, the performance of PSO-CNN was evaluated with suitable metrics.

The results were tabulated and analysed in detail. They show that PSO-CNN achieved better results than the Deep Learning algorithms, but did not surpass the best machine learning algorithm, SVM.

Contributions and impact:

A detailed literature review was performed, emphasizing the applications of Human Activity Recognition in various fields. In particular, sensor-based HAR is highlighted for the readers. The review also details the current state-of-the-art techniques in HAR.

A systematic investigation was done for importing two versions of the sensor datasets. This can be used as a reference for future work.

The work illustrated that the PSO-based CNN was the best classifier for data where human-engineered feature knowledge is not needed.

Additionally, the work tries to enhance the performance of the state-of-the-art CNN design by using optimisation. This supports the generalization of the PSO-CNN model to other Activity Recognition tasks.

In the current research, the PSO algorithm is used to find optimal architectures for deep convolutional neural networks. It combines the global and local exploration capabilities of particle swarm optimization with gradient descent back-propagation to form an efficient search algorithm. This matters because the performance of a deep convolutional network depends heavily on its network structure and hyper-parameter selection.

Finding the best hyper-parameters requires a lot of training time, as well as a deep understanding of CNN architecture and domain knowledge. Hence, PSO-CNN is employed to optimize these parameter configurations, evolving efficient parameters that increase performance with less training time.

Future Work and recommendations: The current research can be explored and improved in many ways so as to improve human activity recognition. The proposed approach provides a flexible methodology in which the initial parameter settings of both PSO and CNN can be changed. In this way, a trade-off between the generalization capability and the complexity of the model can be made. The experimental results illustrate that PSO converges faster and finds the best configuration with less training time. This exceeds the performance of the state-of-the-art CNN design evaluated in this study.

To some extent, the algorithm failed to distinguish similar activities such as WALKING_UPSTAIRS and WALKING, or LAYING and SITTING. This may be due to insufficient data. The solution can be explored further with larger time series data.

The experiment was conducted to explore the capability of deep learning algorithms in HAR tasks. To generalize the model's capability, it can be applied to other Activity Recognition tasks that involve time series data.

Acknowledgments

I would first like to express my sincere thanks to my supervisor Prof. Bojan Bozic for his continuous support, guidance and advice throughout the project. It’s been a good experience and honor to work under his supervision.

Abbreviations

HAR | Human Activity Recognition
ADL | Activities of Daily Living
CRISP-DM | Cross Industry Standard Process for Data Mining
DT | Decision Tree
ML | Machine Learning
LR | Multinomial Logistic Regression
SVM | Support Vector Machine
HMM | Hidden Markov Model
LSTM | Long Short-Term Memory
CNN | Convolutional Neural Network
PSO | Particle Swarm Optimisation
PCA | Principal Component Analysis

References

  1. 1. Aurangzeb K, Haider I, Khan MA, Saba T, Javed K, Iqbal T, et al. Human behavior analysis based on multi-types features fusion and Von Nauman entropy based features reduction. Journal of Medical Imaging and Health Informatics. 2019;9(4):662–669
  2. 2. Chavarriaga R, Sagha H, Calatroni A, Digumarti ST, Tröster G, Millán JdR, et al. The Opportunity challenge: A benchmark database for on-body sensor-based activity recognition. Pattern Recognition Letters. 2013;34(15):2033–2042
  3. 3. Wang J, Chen Y, Hao S, Peng X, Hu L. Deep learning for sensor-based activity recognition: A survey. Pattern Recognition Letters. 2019;119:3–11
  4. 4. Piwek L, Ellis DA, Andrews S, Joinson A. The rise of consumer health wearables: promises and barriers. PLoS medicine. 2016;13(2):e1001953
  5. 5. Cleland I, Kikhia B, Nugent C, Boytsov A, Hallberg J, Synnes K, et al. Optimal placement of accelerometers for the detection of everyday activities. Sensors. 2013;13(7):9183–9200
  6. 6. Parkka J, Ermes M, Korpipaa P, Mantyjarvi J, Peltola J, Korhonen I. Activity classification using realistic data from wearable sensors. IEEE Transactions on information technology in biomedicine. 2006;10(1):119–128
  7. 7. Jiang W, Yin Z. Human activity recognition using wearable sensors by deep convolutional neural networks. In: Proceedings of the 23rd ACM international conference on Multimedia; 2015. p. 1307–1310
  8. 8. Yu H, Cang S, Wang Y. A review of sensor selection, sensor devices and sensor deployment for wearable sensor-based human activity recognition systems. In: 2016 10th International Conference on Software, Knowledge, Information Management & Applications (SKIMA). IEEE; 2016. p. 250–257
  9. 9. Lara OD, Labrador MA. A survey on human activity recognition using wearable sensors. IEEE communications surveys & tutorials. 2012;15(3):1192–1209
  10. 10. Bengio Y. International Conference on Statistical Language and Speech Processing. 2013
  11. 11. Figo D, Diniz PC, Ferreira DR, Cardoso JM. Preprocessing techniques for context recognition from accelerometer data. Personal and Ubiquitous Computing. 2010;14(7):645–662
  12. 12. Khusainov R, Azzi D, Achumba IE, Bersch SD. Real-time human ambulation, activity, and physiological monitoring: Taxonomy of issues, techniques, applications, challenges and limitations. Sensors. 2013;13(10):12852–12902
  13. 13. Hassan MM, Uddin MZ, Mohamed A, Almogren A. A robust human activity recognition system using smartphone sensors and deep learning. Future Generation Computer Systems. 2018;81:307–313
  14. 14. Khan AM, Siddiqi MH, Lee SW. Exploratory data analysis of acceleration signals to select light-weight and accurate features for real-time activity recognition on smartphones. Sensors. 2013;13(10):13099–13122
  15. 15. Ravi N, Dandekar N, Mysore P, Littman ML. Activity recognition from accelerometer data. In: Aaai. vol. 5; 2005. p. 1541–1546
  16. 16. Yang J. Toward physical activity diary: motion recognition using simple acceleration features with mobile phones. In: Proceedings of the 1st international workshop on Interactive multimedia for consumer electronics; 2009. p. 1–10
  17. 17. Kose M, Incel OD, Ersoy C. Online human activity recognition on smart phones. In: Workshop on Mobile Sensing: From Smartphones and Wearables to Big Data. vol. 16; 2012. p. 11–15
  18. 18. Kwapisz JR, Weiss GM, Moore SA. Activity recognition using cell phone accelerometers. ACM SigKDD Explorations Newsletter. 2011;12(2):74–82
  19. 19. Tang Y, Teng Q, Zhang L, Min F, He J. Efficient convolutional neural networks with smaller filters for human activity recognition using wearable sensors. arXiv preprint arXiv:200503948. 2020
  20. 20. Shaafi A, Salem O, Mehaoua A. Improving Human Activity Recognition Algorithms using Wireless Body Sensors and SVM. In: 2020 International Wireless Communications and Mobile Computing (IWCMC). IEEE; 2020. p. 607–612
  21. 21. Aslan MF, Durdu A, Sabanci K. Human action recognition with bag of visual words using different machine learning methods and hyperparameter optimization. Neural Computing and Applications. 2020;32(12):8585–8597
  22. 22. Arel I, Rose DC, Karnowski TP. Deep machine learning-a new frontier in artificial intelligence research [research frontier]. IEEE computational intelligence magazine. 2010;5(4):13–18
  23. 23. Najafabadi M, Villanustre F, Khoshgoftaar T, Seliya N, Wald R, Muharemagic E. In: Deep Learning Techniques in Big Data Analytics; 2016. p. 133–156
  24. 24. Plötz T, Hammerla N, Olivier P. Feature learning for activity recognition in ubiquitous computing; 2011
  25. 25. Liu Z, Wu M, Zhu K, Zhang L. SenSafe: A smartphone-based traffic safety framework by sensing vehicle and pedestrian behaviors. Mobile Information Systems. 2016;2016
  26. 26. Guan Y, Plötz T. Ensembles of deep lstm learners for activity recognition using wearables. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies. 2017;1(2):1–28
  27. 27. Vepakomma P, De D, Das SK, Bhansali S. A-Wristocracy: Deep learning on wrist-worn sensing for recognition of user complex activities. In: 2015 IEEE 12th International conference on wearable and implantable body sensor networks (BSN). IEEE; 2015. p. 1–6
  28. 28. Walse KH, Dharaskar RV, Thakare VM. Pca based optimal ann classifiers for human activity recognition using mobile sensors data. In: Proceedings of First International Conference on Information and Communication Technology for Intelligent Systems: Volume 1. Springer; 2016. p. 429–436
  29. 29. Edel M, Köppe E. Binarized-blstm-rnn based human activity recognition. In: 2016 International conference on indoor positioning and indoor navigation (IPIN). IEEE; 2016. p. 1–7
  30. 30. Inoue M, Inoue S, Nishida T. Deep recurrent neural network for mobile human activity recognition with high throughput. Artificial Life and Robotics. 2018;23(2):173–185
  31. 31. Ha S, Yun JM, Choi S. Multi-modal convolutional neural networks for activity recognition. In: 2015 IEEE International conference on systems, man, and cybernetics. IEEE; 2015. p. 3017–3022
  32. 32. Singh MS, Pondenkandath V, Zhou B, Lukowicz P, Liwickit M. Transforming sensor data to the image domain for deep learning—An application to footstep detection. In: 2017 International Joint Conference on Neural Networks (IJCNN). IEEE; 2017. p. 2665–2672
  33. 33. Li X, Zhang Y, Marsic I, Sarcevic A, Burd RS. Deep learning for rfid-based activity recognition. In: Proceedings of the 14th ACM Conference on Embedded Network Sensor Systems CD-ROM; 2016. p. 164–175
  34. 34. Kennedy J. Swarm intelligence. In: Handbook of nature-inspired and innovative computing. Springer; 2006. p. 187–219
  35. 35. Godino-Llorente JI, Gomez-Vilda P. Automatic detection of voice impairments by means of short-term cepstral parameters and neural network based detectors. IEEE Transactions on Biomedical Engineering. 2004;51(2):380–384
  36. 36. Mihanpour A, Rashti MJ, Alavi SE. Human Action Recognition in Video Using DB-LSTM and ResNet. In: 2020 6th International Conference on Web Research (ICWR). IEEE; 2020. p. 133–138
  37. 37. Qassim H, Verma A, Feinzimer D. Compressed residual-VGG16 CNN model for big data places image recognition. In: 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC). IEEE; 2018. p. 169–175
  38. 38. Iandola F, Moskewicz M, Karayev S, Girshick R, Darrell T, Keutzer K. Densenet: Implementing efficient convnet descriptor pyramids. arXiv preprint arXiv:14041869. 2014
  39. 39. Gudise VG, Venayagamoorthy GK. Comparison of particle swarm optimization and backpropagation as training algorithms for neural networks. In: Proceedings of the 2003 IEEE Swarm Intelligence Symposium. SIS’03 (Cat. No.03EX706); 2003. p. 110–117
  40. 40. Carvalho M, Ludermir TB. Particle swarm optimization of feed-forward neural networks with weight decay. In: 2006 Sixth International Conference on Hybrid Intelligent Systems (HIS’06). IEEE; 2006. p. 5–5
  41. 41. Dehuri S, Roy R, Cho SB, Ghosh A. An improved swarm optimized functional link artificial neural network (ISO-FLANN) for classification. Journal of Systems and Software. 2012;85(6):1333–1345
  42. 42. Junior FEF, Yen GG. Particle swarm optimization of deep neural networks architectures for image classification. Swarm and Evolutionary Computation. 2019;49:62–74
