Open access peer-reviewed chapter

Autonomous Update of a Dataset for Anomaly Detection Services in Elderly Care Smart House

Written By

Linos Nchena and Martin Tomášek

Submitted: 04 January 2022 Reviewed: 25 February 2022 Published: 13 July 2022

DOI: 10.5772/intechopen.103953

From the Edited Volume

Internet of Things - New Trends, Challenges and Hurdles

Edited by Manuel Domínguez-Morales, Ángel Varela-Vaca and Lourdes Miró-Amarante

Abstract

This work proposes a smart system that could be useful in the delivery of elderly care services. Elderly care is a set of services provided to senior citizens to help them live a more comfortable and independent life than would be possible without these services. The proposed system is unique in that it combines a detection algorithm with the automatic update of the dataset. It also uses a heuristic mechanism to reduce false detections. The premise is that the AI approach is good but can be improved by including heuristics. A fall is first detected by one classifier; a second classifier then evaluates the result with inferences, including the location of the subject, before an alarm is raised. Hence the smart house design consists of two machine learning systems: one performs human activity classification while the other performs fall detection. Of the eight classification methods utilized, XGBoost was the most accurate, with an average of 97.65% during training. A customized dataset is then generated with newly labeled data, improving system performance.

Keywords

  • artificial intelligence
  • machine learning
  • human activity recognition
  • activity of daily living
  • fall detection
  • fall prevention

1. Introduction

According to the United Nations [1], by 2025 there will be about 1.2 billion people over the age of 60. By 2050 this number is expected to increase to 2 billion, with 80% of them living in developing countries. The population of older persons has grown faster than the rest of the population. In 1950, people over 60 years old made up 8 percent of the population. By 2007, this share had grown to 11 percent. By 2050, it is projected to reach 22 percent. This kind of population growth comes with various challenges of its own [2].

This increase in the elderly population means that more people will require assisted-living care than before. Smart houses could therefore play a key role in attending to this large elderly population. A smart house is a house that delivers automated services to its occupants. Smart houses are of diverse types based on their purpose. Some of the common types include (a) healthcare-oriented, (b) entertainment-oriented, (c) security-oriented, and (d) energy-efficiency-oriented smart houses. In this research, we present a smart house model that is based on specific requirements of elderly citizens. With an emphasis on the need for assistive technologies (AT), we recommend a smart house design. The design shall satisfy three major requirements that are vital for senior citizens: (a) AT services, (b) privacy requirements, and (c) security services. We aim to develop a smart house system that consists of several monitoring services. The system should also be modular and allow easy replacement of components.

With the vast improvements in medicine and quality of living, many people now live longer than was previously possible. This is a result of substantial investment in research to improve quality of life. Several researchers have attempted to provide solutions for the care of senior citizens [3]. These solutions need enhancement, up to completely novel solutions, to deal with the growing demand for senior citizen care in the near future. The purpose of this work is to explore how this problem can be addressed using assistive technologies. To this end, we perform three experiments. We need to create a system that can tell when an anomaly has occurred in the senior’s smart house. This requires knowing what is and what is not an anomaly. We have data recorded from activity in the smart house using conventional sensors such as those in mobile phones or smartwatches. We require labeled data to determine whether a data trend is normal or abnormal. This can be achieved by reusing the labeling of previously published experimental datasets. Several publicly available datasets exist. Common ones include the Sisfall, MobiAct, Ucihar, Unifall, and Unimab datasets [4, 5, 6]. These datasets provide acceptable benchmarks for the classification of ADLs and falls. They can be used to classify data or to assess a system’s accuracy.

In a smart house, assistive technologies are installed to detect abnormalities in human activity or environmental parameters. This is achieved using several methods. Three of the common methods are thresholds, heuristics, and machine learning [7]. Threshold systems use specified rules against which a dataset is evaluated. Based on these rules, a sensor dataset can be labeled as a fall or not a fall. Machine learning takes a different approach: a network is created whose node relationships determine whether or not a fall has occurred. ML works like a black box, as the rules cannot easily be deduced logically from the network. Several machine learning classification algorithms exist for labeling subject data. Eight of the common classical ML algorithms are k-means, linear discriminant analysis (LDA), naïve Bayes, k-nearest neighbor (KNN), support vector machine (SVM), artificial neural network (ANN), random forest, and decision trees [8].

Moreover, when collecting personal data, security and privacy should be considered. For example, cameras might capture more private information than smartwatches, and some of this information can violate privacy. Toilet use is not appropriate to record on camera, while a smartwatch record of the same activity might be more acceptable. Hence the choice of sensor is especially important in developing this system. However, it may be easier to detect activity with a camera than with a smartwatch. Therefore, a compromise needs to be considered in such cases.

The rest of this article is organized as follows. Section 2 describes previous works by other researchers that are related to ours. Section 3 describes the methodology: what is performed and how, including the flow of the algorithm and the datasets used. Section 4 discusses the results of the experimental work. Section 5 discusses issues related to our results and the future directions of this research. Section 6 presents our findings and conclusions and outlines what is next.

2. Related research

Several researchers have attempted to solve this problem. The following works are the most relevant to the aims of this article.

In article [3], a solution is proposed in which a group of agents work together to sense, communicate, and interpret sensor data. There are seven types of agents: communication, sensor, refining, reconstruction, interpretation, prevention, and cognitive agents. The agents are separated into two groups. Each group of agents processes the sensor data and then aggregates the results to form a concrete decision. The first group of agents predicted activities in the smart house. The reported performance was 72.00% for machine learning agents, 88.00% for expert-knowledge agents, and 91.33% for meta-prediction agents. The second group consisted of prevention agents, which achieved 100% accuracy. However, this second result came from a simulation and not from a real-life experiment.

The researchers in article [9] present a monitoring system for senior citizens. If an anomaly is identified, the system sends an alarm to a caregiver. The monitored activities include waking up in the morning, preparing food, having breakfast, reading, working on a computer, having lunch, and napping. An example of an anomaly is when the system detects that the resident woke up and started walking around the house at 2 pm. This is an anomaly because it is an unusual time for walking around the house. An alarm is raised as this is not part of the normal schedule. A mock apartment was designed for use in this experiment.

The study in article [10] uses graph-based comparison and a heuristic technique to detect falls. A publicly available test suite, GBAD, is used. It selects the best subgraph or pattern and then compares it with the sensor data. Each subgraph is compared with the full graph using a formula. An abnormality can then be identified from the graph.

2.1 Datasets with classification labels

In article [11], a dataset (Sisfall) and a fall detection algorithm are presented. The algorithm has five stages, in this order: sensor data acquisition, data preprocessing, feature extraction, fall detection, and finally a call for help when a fall is detected. Four classifiers were used: decision tree (DT), logistic regression (LR), k-nearest neighbor (KNN), and support vector machine (SVM). Sisfall contains 15 types of falls and 19 types of ADL. These were performed by 23 young people aged 19 to 30 years and 15 elderly people aged 60 to 75 years. The sampling frequency was 200 Hz. The sensors used were two accelerometers and one gyroscope. The accuracy was derived from the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Accuracy is defined in Eq. (1) below.

$$\text{Accuracy} = \frac{TP + TN}{TP + FN + TN + FP} \times 100 \tag{1}$$

The accuracies recorded were 99.02% for DT, 99.38% for LR, 99.91% for KNN, and 99.98% for SVM. The SVM classifier was the most accurate. SVM not only performed best in this experiment but also outperformed selected previous benchmark works.
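As a minimal sketch, Eq. (1) can be computed directly from the four confusion-matrix counts. The function name `accuracy_percent` and the sample counts are illustrative assumptions and are not taken from article [11].

```python
def accuracy_percent(tp: int, tn: int, fp: int, fn: int) -> float:
    """Accuracy as in Eq. (1): correct predictions over all predictions, in percent."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion-matrix counts must not all be zero")
    return 100.0 * (tp + tn) / total


# Illustrative counts only; these are not results from any cited study.
print(accuracy_percent(tp=45, tn=50, fp=3, fn=2))  # 95.0
```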

In article [12], a dataset (MobiFall) is presented. An experiment was conducted to standardize a dataset that can be used to determine whether or not a fall is present in sensor data. Datasets are used in machine learning to benchmark and identify specified activities. The authors compared two fall detection systems: one threshold-based and one machine learning-based. The machine learning system had the higher accuracy. In this dataset, four kinds of falls were studied: forward-lying, front-knees-lying, sideward-lying, and back-sitting-chair. Apart from falls, nine ADLs were also studied: standing, walking, jogging, jumping, stairs up, stairs down, sitting in a chair, car step in, and car step out. The sensor data came from three types of signals: accelerometer, gyroscope, and orientation. The size of displacement, defined by the slope (SL), was measured using the formula given in Eq. (2) below.

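$$SL = \sqrt{(\max_x - \min_x)^2 + (\max_y - \min_y)^2 + (\max_z - \min_z)^2} \tag{2}$$

where $\max_x$ and $\min_x$ are the maximum and minimum readings along the X-axis, so that their difference is the X-axis displacement; the Y-axis and Z-axis terms are defined in the same way.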
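The displacement measure of Eq. (2) can be sketched as below, assuming it is evaluated over one window of samples per axis. The function name `displacement_size` and the sample window are hypothetical and only illustrate the arithmetic.

```python
import math


def displacement_size(xs, ys, zs):
    """SL as in Eq. (2): Euclidean norm of the per-axis ranges within a window."""
    return math.sqrt(
        (max(xs) - min(xs)) ** 2
        + (max(ys) - min(ys)) ** 2
        + (max(zs) - min(zs)) ** 2
    )


# Made-up window of readings: the ranges are 3, 4 and 0.
window_x = [0.0, 1.0, 3.0]
window_y = [2.0, 2.0, 6.0]
window_z = [1.0, 1.0, 1.0]
print(displacement_size(window_x, window_y, window_z))  # 5.0
```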

Accuracy was 98% for fall detection and 68% for fall classification using 10-fold cross-validation. With an alternative split, in which two-thirds of the data are used for training and one-third for testing, the accuracy was 98.74% for fall detection and 68% for fall classification. The MobiFall dataset is publicly available from the Hellenic Mediterranean University (HMU) in Crete, Greece.

In article [13], a dataset (Ucihar) and six classifiers are used to detect falls, which involves distinguishing falls from ADLs. The six classifiers are k-nearest neighbor (k-NN), least squares method (LSM), support vector machines (SVM), Bayesian decision making (BDM), artificial neural networks (ANNs), and dynamic time warping (DTW). Fourteen people performed the experiment for data acquisition. The trial had 20 falls and 16 ADLs. The formula used to determine the total acceleration is given in Eq. (3) below.

$$A_T = \sqrt{A_x^2 + A_y^2 + A_z^2} \tag{3}$$

where $A_x$ is the acceleration along the x-axis, $A_y$ along the y-axis, and $A_z$ along the z-axis.
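A minimal sketch of Eq. (3) for a single accelerometer sample follows. The function name `total_acceleration` and the sample values are assumptions chosen so that the result is easy to check by hand.

```python
import math


def total_acceleration(ax: float, ay: float, az: float) -> float:
    """Total acceleration as in Eq. (3): magnitude of the three-axis vector."""
    return math.sqrt(ax * ax + ay * ay + az * az)


# Made-up sample: 3^2 + 4^2 + 12^2 = 169, whose square root is 13.
print(total_acceleration(3.0, 4.0, 12.0))  # 13.0
```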

A database was created containing fall activities and ADLs. All six algorithms performed at around 95%, with k-NN and LSM being the most accurate. The researchers suggest using these two algorithms for detection on live data streams. This dataset is accessible at the University of California, Irvine (UCI) Machine Learning Repository.

2.2 Types and placement of sensors

For the recording and collection of data, specific sensors are required. Several types of sensors exist. Wearable sensors are placed on the body of the subject. Environmental sensors are not attached to the subject. Data collection should be done at a regular frequency. We assume that the sensors, wherever they are placed, collect the same type of data at the same frequency. This simplifying assumption is needed because, in practice, changing the location of a sensor might also change the frequency and quality of the data.

In article [14], two types of sensing are defined: vision-based and sensor-based. Vision-based sensing uses cameras of diverse types. These are, however, not well accepted by the intended beneficiaries of the system. Sensor-based sensing includes wearables, ambient sensors, and sensors on objects. This is the most used type as it is considered less intrusive and is more acceptable to potential beneficiaries. An accelerometer and a gyroscope are two examples of wearable sensors. In article [15], three sensory systems are defined: wearable, vision, and ambient sensors, the last being a combination of visual sensors with sound and location sensors. Visual sensors capture more information with the addition of location and sound sensors. The human voice can also be used as input. The three types of sensors are shown below in Figure 1.

Figure 1.

Types of sensors in the system design.

In article [16], a system of sensors is proposed. Environmental and wearable sensors are combined to identify the location and the motion of the subject. Each of these has achieved a significant level of activity classification accuracy. It is reported that using only an accelerometer gives an accuracy of 54.19%, while combining an accelerometer with environmental sensors gives 97.42%.

In research [17], the authors argue that the wearable approach is intrusive, as it requires the subject to actively put the sensors on, and that it may monitor only specific rather than extensive categories of activities. It might also require wearing the sensors continuously throughout the day to enable data sensing. Therefore, the researchers argue that environmental sensors are much better than wearable sensors. They provide experiments using a vision-based sensor. The activities involved include sitting, standing, walking, sleeping, getting assistance, using the bedside commode, and background. The camera produces two kinds of data frames: a thermal frame and a depth frame. They adjust each frame to its best resolution for better results. Although the authors acknowledge the intrusive and costly nature of vision sensors, they insist that more can be done with them in detecting activities in the houses of seniors.

In article [18], voice is used as input for a smart house. The smart house must interpret the voice according to its training and invoke devices to perform a particular action. The purpose of the article was to provide voice input with a secure connection to a smart house with an IoT network. In article [19], the author mentions that training is a difficult part of a detection system. This is because the training dataset can become obsolete when the individual it was trained on changes their usual pattern, that is, under concept drift. The author mentions the case of a senior developing diabetes. Advanced age also leads to body deterioration, which results in changes in the gait characteristics of seniors [20]. A summary of sensors and classifiers from related works is shown below in Table 1.

| No | Cited | Type of sensors used in experiment | Classification models utilized | Dataset | Accuracy |
| --- | --- | --- | --- | --- | --- |
| 1 | [3] | Accelerometers | Random Forest, Hidden Markov Model, Support vector machine, Decision Tree (C45) | Lab simulated | 72.00% |
| 2 | [9] | Wearable sensors, Vicon system (PIR), camera | Dynamic Bayesian Network | Lab simulated | 99.00% |
| 3 | [10] | Infrared sensors, thermometers, object identifiers, burning sensors, door sensors | Graph-Based Anomaly Detection (GBAD tool) | Kyoto dataset (CASAS-400) | n/a |
| 4 | [11] | Accelerometers and a gyroscope | Support vector machine, Decision Tree, Logistic Regression, K-Nearest Neighbor | Sisfall | 99.98% |
| 5 | [12] | Accelerometer, gyroscope, and orientation signal | K-Nearest Neighbor, Naïve Bayes, Decision tree (J48), Random Forest, Support vector machine | MobiFall | 98.41% |
| 6 | [13] | Accelerometer, gyroscope, and magnetometer/compass | K-Nearest Neighbor, Support vector machine, BDM, Decision Tree, ANN | University of Irvine Machine | 95.00% |
| 7 | [15] | Accelerometers and a gyroscope | CNN-Long short-term memory (LSTM) | UMA Dataset | 86.63% |
| 8 | [16] | Thermometer, humidity sensor, light sensors, orientation and motion sensors | Random Forest, Naive Bayes, Bayes Net, Logistic Regression, MLP, Radial basis function (RBF), Decision Table, Decision Trees (J48), Random Tree | CREST testbed | 99.13% |
| 9 | [17] | Depth sensor camera, thermal sensor camera | Convolutional Neural Network (ResNet-34 architecture) | Experimental testbed | 95.80% |
| 10 | [19] | 3D accelerometer and gyroscope | Multi-layer perceptron (MLP) | MobiAct | 98.75% |

Table 1.

Listing of sensors and classifier methods in selected previous research.

3. Research methodology

In this section, we describe the tools and methods used: the sensors, the datasets, the algorithms, and the floor plan of the apartment that serves as the senior’s smart house.

3.1 Implementation procedure, software, and hardware

Since the research is about seniors, the datasets had to be reduced to only the data that applies to seniors. The age of participants was the main factor used to extract records. Some activities were removed as we considered them unnecessary for seniors. On this basis, of the 56,786 records in the Sisfall dataset, only 14,000 are from seniors. Of the 62,259 records in the MobiFall dataset, only 16,598 are from seniors. Of the 7352 records in the Ucihar dataset, only 2018 are from seniors. After extracting these records, eight classifier algorithms were trained on all three datasets and the resulting models were then validated.
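The record extraction described above can be sketched as a simple filter. This is only an illustration: the field names `age` and `activity`, the 60-year threshold constant, and the sample records are assumptions, since the chapter does not give the exact extraction code.

```python
SENIOR_MIN_AGE = 60  # assumption: based on the "60 years and above" target group


def select_senior_records(records, min_age=SENIOR_MIN_AGE, excluded_activities=frozenset()):
    """Keep records of subjects aged min_age or older, dropping excluded activities."""
    return [
        r for r in records
        if r["age"] >= min_age and r["activity"] not in excluded_activities
    ]


# Made-up records for illustration only.
sample = [
    {"subject": "S01", "age": 25, "activity": "walking"},
    {"subject": "S02", "age": 67, "activity": "walking"},
    {"subject": "S03", "age": 71, "activity": "jogging"},
]
seniors = select_senior_records(sample, excluded_activities={"jogging"})
print([r["subject"] for r in seniors])  # ['S02']
```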

The classification covered two separate tasks. The first task was fall detection, for which the MobiAct and Sisfall datasets were used. A subsection of each dataset was used in the experiment. The performance of the algorithms was compared on these subsets, and the best algorithm was identified. The second task was to identify whether or not the ADL was Laying. In this task, only the Ucihar dataset was used. The eight classifiers were again used to detect whether or not Laying was the activity performed. When Laying was identified as the current ADL, this task was linked to an external task that reports the location where Laying was occurring.

The experiment was conducted using Python 3. The computer system had the following specification: processor: Intel(R) Core(TM) i7-4510U CPU at 2.00 GHz with 2 cores; RAM: 4 GB DDR3 at 1600 MHz; OS: 64-bit Windows.

3.2 Sensors

The sensors we use to collect data are a gyroscope and an accelerometer, each measuring along the X, Y, and Z axes. These are now readily available in mobile phones and smartwatches, which makes them usable for human activity recognition. Cameras and environmental sensors have the advantage that they do not require excessive preparation and arrangement by senior citizens. However, the sense of being under surveillance might be discomforting for most seniors. Because of this discomfort, we decided to use an accelerometer and a gyroscope rather than cameras. In this work, the gyroscope and accelerometer embedded in smart devices are utilized. The sensor data can be analyzed and processed in separate locations. The data we collected from the sensors is not labeled. Therefore, to enable labeling, we have mapped our data to data labeled by previous researchers. The labeled data is obtained from publicly available datasets.

3.3 Datasets

Several datasets exist for human activity recognition (HAR) and fall classification. These can be used to classify test data as either a fall or an ADL. In our research, we use the publicly available Sisfall, MobiAct, and Ucihar datasets.

The Sisfall dataset was generated in research [11, 21]; most of its subjects are between the ages of 20 and 47 years. The primary aim of that research was to create a prototype dataset for fall detection. The second dataset is MobiAct, from research [12, 22]. In the MobiAct dataset, the subjects are aged from 40 to 47 and also perform both falls and ADLs. The Sisfall and MobiAct datasets are used primarily for the exploration of fall detection. The third dataset is Ucihar [13, 23]. This dataset explores the identification of ADLs; there is no fall detection in Ucihar. We use it for ADL identification where appropriate, as we require a labeled dataset that defines ADLs. Our research focuses on senior citizens, hence we prefer data from people aged about 60 years and above. However, such data is not readily available, so we infer it from the various datasets. Table 2 below shows the composition of the datasets used in the experiment.

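| No. | Dataset | Human activity recognition samples | Fall detection samples | Total number of data samples |
| --- | --- | --- | --- | --- |
| 1 | Sisfall | 44,795 (79%) | 11,991 (21%) | 56,786 (100%) |
| 2 | MobiAct | 50,188 (81%) | 12,071 (19%) | 62,259 (100%) |
| 3 | UCIHAR | 7352 (100%) | 0 (0%) | 7352 (100%) |
| 4 | Total | 102,335 (81%) | 24,062 (19%) | 126,397 (100%) |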

Table 2.

Composition of the selected datasets used in the experiment.

The experimental devices must be kept compatible with the mobile devices that sensed the data. For fall detection, we classify falls as dangerous activities. Therefore, we need to send a warning message when a fall occurs, unlike when an ADL occurs. The data has been labeled by the above public datasets (Sisfall, MobiAct, and Ucihar). We use the accelerometer, which records the acceleration of the body, to tell the presence or absence of a fall. The gyroscope is used for rotational movements, which are another parameter in detecting a fall by the senior citizen. We use accuracy as the measure of the system’s effectiveness. We use selected sections of the three sample datasets, namely those most appropriate to our requirements.

After preprocessing the raw data, training and test datasets are derived, each with a smaller number of records. Both Sisfall and MobiAct have nine data fields in total, as shown in Figures 2 and 3, respectively.

Figure 2.

Sample contents of the Sisfall dataset.

Figure 3.

Sample contents of the MobiAct dataset.

Ucihar is used for the activity classifier. Unlike the datasets above, the original Ucihar dataset has a total of 548 columns; Figure 4 below shows only the 14 columns with the highest importance for classification.

Figure 4.

Sample contents of the Ucihar dataset.

The sample datasets contain various recording scenarios. In this research, only the scenarios best suited to our purpose were used. The human activities studied include Standing, Sitting, Laying, walking fast, walking slow, walking downstairs, and walking upstairs [24]. We divided these activities into two groups per dataset. For fall detection, group 0 contains ADLs and group 1 contains falls. There are four types of fall activities and six types of ADLs. During data preprocessing, an operation divides the records into these two groups. In the fall detection procedure, the MobiAct and Sisfall datasets group all ADLs as zeros and all types of falls as ones. In the activity recognition procedure, the Ucihar records for the Laying activity (sleeping) are labeled 1, and all other activities are grouped under label 0. By creating these two groups we can use a simpler binary classifier instead of multiclass classification.
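The two binary relabeling steps can be sketched as follows. The fall names in `FALL_ACTIVITIES` are borrowed from the four MobiFall fall types listed in Section 2.1 and serve only as placeholders; the actual activity codes in each dataset are not given in this chapter, and the function names are hypothetical.

```python
# Placeholder fall names; real datasets use their own activity codes.
FALL_ACTIVITIES = {
    "forward-lying",
    "front-knees-lying",
    "sideward-lying",
    "back-sitting-chair",
}


def fall_label(activity: str) -> int:
    """Fall detection task: 1 for any type of fall, 0 for any ADL."""
    return 1 if activity in FALL_ACTIVITIES else 0


def laying_label(activity: str) -> int:
    """Activity recognition task: 1 for Laying, 0 for every other activity."""
    return 1 if activity == "Laying" else 0


print([fall_label(a) for a in ["walking", "sideward-lying"]])  # [0, 1]
print([laying_label(a) for a in ["Laying", "Sitting"]])  # [1, 0]
```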

3.4 Algorithms selection from common algorithms

Several algorithms exist for anomaly detection systems. For fall detection, we utilize eight different machine learning algorithms [25]. These algorithms are compared, and the best is used in the smart house application. The eight algorithms selected for this experiment are logistic regression, linear discriminant analysis (LDA), k-nearest neighbors (k-NN), decision tree classifier, Gaussian naive Bayes, support vector machine (SVM), random forest, and XGBoost. The algorithms’ performance is weighed by the following parameters.

  1. Optimization: eliminate the worst-performing algorithms

  2. Completeness: eliminate solutions for which no result is returned

  3. Accuracy and precision: the degree of accuracy attained and required

  4. Execution time: the time taken to perform the classification

  5. Resource consumption: memory and processor usage

The combination of the human activity detection and fall detection algorithms is selected based on the five factors above.
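One possible way to apply these factors is sketched below, assuming each candidate has already been measured. The `Candidate` fields, the tie-breaking order (accuracy first, then execution time, then memory), and all numbers are hypothetical; the chapter does not specify how the factors are weighted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    accuracy: Optional[float]  # None means the run returned no result
    seconds: float             # execution time
    memory_mb: float           # resource consumption


def rank_candidates(candidates: List[Candidate], keep: int = 2) -> List[Candidate]:
    """Drop incomplete runs, then keep the best few by accuracy, time and memory."""
    completed = [c for c in candidates if c.accuracy is not None]
    ranked = sorted(completed, key=lambda c: (-c.accuracy, c.seconds, c.memory_mb))
    return ranked[:keep]


# Placeholder measurements, not results from the experiment.
measured = [
    Candidate("algorithm_a", 0.97, 1.2, 150.0),
    Candidate("algorithm_b", 0.98, 3.4, 300.0),
    Candidate("algorithm_c", None, 9.9, 500.0),
    Candidate("algorithm_d", 0.97, 0.4, 80.0),
]
print([c.name for c in rank_candidates(measured)])  # ['algorithm_b', 'algorithm_d']
```

Sorting on a tuple keeps the rule explicit: a faster or lighter algorithm only wins when accuracies are equal.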

3.5 Design of the floor plan of the smart house

Figure 5 below shows the simulated floor plan of the smart house. A1 and A2 are bedrooms, B1 and B2 are corridors, C is the toilet, D is the bathroom, and E is the kitchen.

Figure 5.

An illustration of the sample smart house floor plan design.

3.6 Reminder alarm for medicine-taking routine and camera/pressure mat

There would be a reminder device, which becomes the third component of the system. The first component is fall detection, the second is activity classification, and the third is the reminder alarm with the camera. Given the apartment above, we use the fall data and the human activity recognition under the following assumptions.

  1. A non-serious fall is still considered a fall. No alarm is sent, but a warning is recorded.

  2. Sleeping occurs in room A1 or A2. Send an alarm if sleeping occurs anywhere else.

  3. Laying outside of rooms A1 and A2 should trigger a warning.

  4. Activity should not switch abruptly, for example from sleeping directly to walking.

  5. Being in the toilet for a lengthy period indicates a problem, hence an alarm is raised.

  6. A medicine dispenser is kept in room E (the kitchen). A pressure sensor sends a photo when a pill is removed from the dispenser. If the pill has not been removed 5 minutes after the scheduled time, an alert is sent to the caregiver.

These six points are used as heuristics. When one of them is identified, an alarm or a warning is raised. We conduct two tests before sending an alarm, as a way of avoiding false alarms. The first identifies the activity and the room in which it occurs, and establishes that Laying is not occurring in an inappropriate room; a delay in the bathroom should likewise trigger a warning. The second verifies that a fall is not a Laying and that a Laying is not a fall. Only once this is established can an alarm be sent. Compared with using a single algorithm, this could reduce false alarms and increase confidence levels.
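Two of the location heuristics can be sketched as a small rule function, using the room codes of the floor plan. This is an illustration under stated assumptions: the 30-minute toilet limit is hypothetical (the chapter only says "a lengthy period"), the response to Laying outside a bedroom follows heuristic 3 (a warning), and the function name is invented for this example.

```python
BEDROOMS = {"A1", "A2"}    # sleeping quarters in the floor plan
TOILET = "C"
TOILET_LIMIT_MINUTES = 30  # assumption: the chapter does not state a limit


def apply_location_heuristics(activity: str, room: str, minutes_in_room: float) -> str:
    """Return 'alarm', 'warning' or 'none' for one observation."""
    # Heuristic 5: a lengthy stay in the toilet raises an alarm.
    if room == TOILET and minutes_in_room > TOILET_LIMIT_MINUTES:
        return "alarm"
    # Heuristic 3: Laying outside rooms A1 and A2 triggers a warning.
    if activity == "Laying" and room not in BEDROOMS:
        return "warning"
    return "none"


print(apply_location_heuristics("Laying", "A1", 120))   # none
print(apply_location_heuristics("Laying", "E", 2))      # warning
print(apply_location_heuristics("Sitting", "C", 45))    # alarm
```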

3.7 Description of the three algorithms

Three algorithms are derived to execute the above procedures. Their pseudocode is given below.

Algorithm 1: Update datasets, test, and use them in future training.

01: retrieve sample records from datasets #1, #2, and #3.

02: train the model using standard algorithms and records of the subject person.

03: if the time elapses, send the collected data for analysis by the selected algorithms.

04: end if.

05: if an algorithm’s accuracy is in the top two, use the best and discard the rest.

06: end if.

07: process and identify the chances of an ADL.

08: process and identify the chances of a fall.

09: if the ADL is sleeping but the room is not a sleeping quarter, send an alert signal.

10: end if.

11: if a fall is detected, send an alert signal.

12: else save the data and then move into the waiting stage.

13: end if.

14: repeat the strategy starting from point 01.

Algorithm 2: Sleeping area locator for logical heuristic missed fall prediction.

01: retrieve sample records from datasets #1, #2, and #3.

02: train the model using standard algorithms and records of the subject person.

03: if the HAR result is laying or napping, find out in which room the activity is occurring.

04: if the room is not an appropriate sleeping quarter, send a warning alarm.

05: if the sleep is in a sleeping quarter, move to the waiting stage.

06: end if.

07: repeat the strategy starting from point 01.

Algorithm 3: Medicine reminder algorithm, to predict skipping of the medicine routine.

01: retrieve from the schedule the times at which medication must be taken.

02: at each required time, check the weight reading of the medicine pressure sensor.

03: if the weight has changed, it is confirmed that the medicine has been taken.

04: if the weight is still the same, send a warning signal to inform the caretaker.

05: else save the data and get a new data sample.

06: end if.

07: repeat the strategy starting from point 01.
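The medicine reminder logic can be sketched as below, combining Algorithm 3 with the 5-minute grace period of heuristic 6. The function name `medicine_check`, the `tolerance` parameter, and the weight values are assumptions; real pressure-sensor readings would need calibration.

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(minutes=5)  # from heuristic 6 in Section 3.6


def medicine_check(scheduled: datetime, now: datetime,
                   weight_before: float, weight_now: float,
                   tolerance: float = 0.01) -> str:
    """Return 'taken', 'waiting' or 'alert' for one scheduled dose."""
    # A changed weight on the pressure sensor means a pill was removed.
    if abs(weight_now - weight_before) > tolerance:
        return "taken"
    # Unchanged weight after the grace period: inform the caregiver.
    if now - scheduled >= GRACE_PERIOD:
        return "alert"
    return "waiting"


# Made-up schedule and weights (in grams) for illustration.
due = datetime(2022, 1, 4, 8, 0)
print(medicine_check(due, due + timedelta(minutes=2), 50.0, 50.0))  # waiting
print(medicine_check(due, due + timedelta(minutes=6), 50.0, 50.0))  # alert
print(medicine_check(due, due + timedelta(minutes=6), 50.0, 49.5))  # taken
```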

Based on the above algorithms, an alarm is triggered as a response. The responses differentiate between similar activities (such as laying and falling) before the alarm is raised. As shown in Table 3, when the results of the algorithms match Answer (I), the alarm is raised. If they match Answer (II), a warning is logged in a database. Three warnings in a sequence also trigger an alarm.

| Algorithm | Tests before triggering an alarm | Answer (I) | Answer (II) |
| --- | --- | --- | --- |
| 1 | Falling has been detected or not? | Yes | No |
| 2 | Laying has been detected or not? | No | Yes |
| 3 | Laying occurring in an appropriate room? | No | Yes |
| Result | Send alarm to caregiver | Yes | No |

Table 3.

Questions to answer before triggering an alarm.
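The decision rule of Table 3, together with the rule that three warnings in a sequence trigger an alarm, can be sketched as a small stateful class. Table 3 only defines two answer patterns, so returning "unspecified" for any other combination, and resetting the warning counter in that case, are assumptions of this sketch; the class and constant names are hypothetical.

```python
# (fall detected, laying detected, laying in an appropriate room)
ANSWER_I = (True, False, False)   # Table 3, Answer (I): send alarm
ANSWER_II = (False, True, True)   # Table 3, Answer (II): log a warning


class AlarmPolicy:
    """Tracks consecutive warnings and escalates the third one to an alarm."""

    def __init__(self, warnings_for_alarm: int = 3):
        self.warnings_for_alarm = warnings_for_alarm
        self.consecutive_warnings = 0

    def update(self, fall: bool, laying: bool, room_ok: bool) -> str:
        answers = (fall, laying, room_ok)
        if answers == ANSWER_I:
            self.consecutive_warnings = 0
            return "alarm"
        if answers == ANSWER_II:
            self.consecutive_warnings += 1
            if self.consecutive_warnings >= self.warnings_for_alarm:
                self.consecutive_warnings = 0
                return "alarm"
            return "warning"
        # Combination not covered by Table 3 (assumption: reset the counter).
        self.consecutive_warnings = 0
        return "unspecified"


policy = AlarmPolicy()
print([policy.update(False, True, True) for _ in range(3)])  # ['warning', 'warning', 'alarm']
print(policy.update(True, False, False))  # alarm
```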

3.8 Updating training dataset

The data collected from the senior citizen’s sensor is originally not labeled. When a detection process is completed, the sensor data is assigned a label, and the system saves the record with that label. After the label is authenticated, the record is moved to the custom dataset, which is the original dataset plus the new records. The record can then be used in training sessions. Because the new dataset contains extra records that more closely represent the person involved, the training increasingly reflects the subject senior citizen. Figure 6 below shows the system’s flow chart.

Figure 6.

Flow chart of creation of custom datasets.
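The dataset update step can be sketched as follows, assuming records are stored as dictionaries. The function name `extend_dataset`, the field names, and the `authenticated` flag are illustrative assumptions; how a label is authenticated is not specified here.

```python
def extend_dataset(original, sensor_record, label, authenticated):
    """Return the original dataset plus the newly labeled record, if authenticated."""
    if not authenticated:
        # Unverified labels are not added to the training data.
        return list(original)
    labeled = dict(sensor_record, label=label)
    return list(original) + [labeled]


# Made-up records for illustration only.
public_data = [{"acc_x": 0.1, "acc_y": 0.0, "acc_z": 9.8, "label": 0}]
new_reading = {"acc_x": 2.3, "acc_y": 1.1, "acc_z": 4.2}

custom_data = extend_dataset(public_data, new_reading, label=1, authenticated=True)
print(len(public_data), len(custom_data))  # 1 2
```

Returning a new list leaves the original public dataset untouched, so it can still serve as a fixed benchmark.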

Similar research is presented in article [3]. The authors compare three datasets and look at the performance of different algorithms. In this work, we have compared results from two datasets for the human activity detection algorithms. The results are fused to reduce the probability of false positives or false negatives.

4. Results

The performance of eight algorithms was studied, as indicated in Figure 7. During the experiment, a ranking of the algorithms was established so that the lower-ranked, less effective algorithms can be eliminated in favor of the best-performing one. Reducing the number of options increases efficiency, because unfavorable options are no longer computed; computing them would be a waste of resources. The less likely options are removed as soon as they are identified. Figure 7 is a snapshot of the accuracies of the investigated methods.

Figure 7.

Comparison of algorithm performance per dataset.

In Figure 7, KNN is the most accurate algorithm for the Sisfall dataset, and XGBoost is the most accurate for both the MobiAct and Ucihar datasets. Three of the best-performing accuracy graphs are presented below. The first two graphs are for fall classification and the third is for human activity classification.

The detection accuracy for the Sisfall dataset is indicated as 98.08% for the training session and 97.92% for the testing session as shown below in Figure 8. This is the best accuracy from the list of classifiers in the experiment.

Figure 8.

Results for accuracy simulation using the Sisfall dataset.

The detection accuracy for the MobiAct dataset is indicated in Figure 9 at 99.23% for the training session and 98.84% for the testing session. This is the best accuracy from the list of classifiers in the experiment.

Figure 9.

Results for accuracy simulation using MobiAct dataset.

The detection accuracy for the Ucihar dataset is 96.85% for the training session and 95.21% for the testing session, as shown in Figure 10. At 96.85%, the Ucihar result is the lowest accuracy obtained with the XGBoost classifier across the three datasets.

Figure 10.

Results for the accuracy simulation using the Ucihar dataset.

The Sisfall test dataset has 14,783 ADL cases and 3956 fall cases, of which 3775 falls were classified correctly (see Table 4); Figure 11 shows the confusion matrix for the Sisfall test dataset during the training session.

Figure 11.

Confusion matrix for XGBoost classifier on Sisfall dataset.

The MobiAct test dataset has 16,562 ADL cases and 3984 fall cases; Figure 12 shows the confusion matrix for the MobiAct test dataset.

Figure 12.

Confusion matrix for XGBoost classifiers on MobiAct dataset.

The Ucihar test dataset has 1963 ADL cases and 464 Laying cases; Figure 13 shows the confusion matrix for the Ucihar test dataset.

Figure 13.

Confusion matrix for XGBoost classifier for Ucihar dataset.

From the confusion matrix, we can extract the accuracy, sensitivity, and specificity of our classifier. The higher each of these parameters, the better the performance of the classifier. However, accuracy must be considered in conjunction with sensitivity and specificity: a classifier must have both high sensitivity and high specificity to be regarded as having superior performance. As seen in Table 4, both sensitivity and specificity are above 78% on every dataset, which indicates a high-performing classifier. This shows that the option selected from the eight models performs well and can be used to develop the proposed system. As indicated in Figure 9, the MobiAct dataset accuracy is recorded at 99.23% for training and 98.84% for testing, which is the best accuracy among our candidate classifiers. Table 4 indicates the performance of the most effective classifier (XGBoost) in the experiment.

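| Dataset Extracts | True / False | False / True | Class Total | Final Total | Accuracy | Sensitivity | Specificity |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ADL (Sisfall) | 14,719 | 64 | 14,783 | 18,739 | 98.69% | 95.42% | 99.57% |
| Fall (Sisfall) | 181 | 3775 | 3956 | | | | |
| ADL (MobiAct) | 16,547 | 15 | 16,562 | 20,546 | 99.73% | 98.97% | 99.91% |
| Fall (MobiAct) | 41 | 3943 | 3984 | | | | |
| ADL (Ucihar) | 1921 | 42 | 1963 | 2427 | 94.19% | 78.66% | 97.86% |
| Laying(ucih..) | 99 | 365 | 464 | 2427 | | | |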

Table 4.

Indicators of classifier efficiency.
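As a worked check, the three indicators can be recomputed from the counts in Table 4. The sketch below assumes that, in each pair of rows, the ADL row holds the true negatives and false positives and the Fall (or Laying) row holds the false negatives and true positives; under that reading the computed values reproduce the percentages in the table. The function name `classifier_metrics` is illustrative.

```python
def classifier_metrics(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # share of falls (or laying) detected
        "specificity": tn / (tn + fp),   # share of ADL correctly left alone
    }


# Counts copied from Table 4 as (tn, fp, fn, tp); the cell roles are an assumption.
table4 = {
    "Sisfall": (14719, 64, 181, 3775),
    "MobiAct": (16547, 15, 41, 3943),
    "Ucihar": (1921, 42, 99, 365),
}

for name, counts in table4.items():
    metrics = classifier_metrics(*counts)
    print(name, ", ".join(f"{k}={v:.2%}" for k, v in metrics.items()))
```

For Sisfall, for example, the sensitivity is 3775 / 3956, which rounds to 95.42%.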

The high accuracy indicates that the models are effective and can be used for detection. However, when executing these algorithms, both speed and accuracy are factors in optimization. Given more time, an algorithm can perform better, but this must be weighed against how quickly an anomaly is reported. If computing a high-accuracy decision takes too long, the decision may arrive too late to prevent harm. Figure 14 indicates the time each algorithm took to complete a single task.

Figure 14.

Recorded time consumed per tested algorithm.
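The trade-off between speed and accuracy can be sketched as choosing the most accurate candidate whose measured run time fits a response-time budget. This is an illustration only: `time_call`, `pick_classifier`, and the budget parameter are hypothetical, and the chapter does not state how the times in Figure 14 were measured.

```python
import time
from typing import Callable, Dict, Optional, Tuple


def time_call(fn: Callable, *args, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over several runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def pick_classifier(
    candidates: Dict[str, Tuple[float, float]], budget_s: float
) -> Optional[str]:
    """Pick the most accurate candidate that answers within the time budget.

    candidates maps a name to (accuracy, seconds per task).
    Returns None when no candidate is fast enough.
    """
    eligible = {n: v for n, v in candidates.items() if v[1] <= budget_s}
    if not eligible:
        return None
    return max(eligible, key=lambda n: eligible[n][0])
```

With placeholder values such as `{"model_a": (0.95, 0.2), "model_b": (0.98, 2.0)}` and a one-second budget, `pick_classifier` returns `"model_a"`: the slower model is more accurate but would report too late.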

Moreover, these eight machine learning methods were compared with deep learning. Deep learning chains multiple models together to enhance performance. However, its computation costs were higher than those of the machine learning methods.

Deep learning accuracy on the Sisfall dataset is recorded at 96.84% for the training session and at 93.55% for the validation session, as indicated in Figure 15.

Figure 15.

Deep learning accuracy on the Sisfall dataset.

Deep learning accuracy on the MobiAct dataset is recorded at 96.97% for the training session and at 100.0% for the validation session, as indicated in Figure 16.

Figure 16.

Deep learning accuracy on the MobiAct dataset.

Deep learning accuracy on the Ucihar dataset is recorded at 98.96% for the training session and at 100.0% for the validation session, as indicated in Figure 17.

Figure 17.

Deep learning accuracy on the Ucihar dataset.

5. Discussions and future works

The results show that the fall detection and activity recognition algorithms are competitive. Fall detection reached a minimum accuracy of 96%, which is above the average of 89% for the other researchers’ works cited in Table 1. We have used cross-verification with records from other datasets, in the expectation that the subject senior citizen’s own data will later be recorded and observed so that the correct adjustments can be made. Activity recognition was the most accurate, at 98.96% for training and 100.0% for the validation session. The general accuracy of the XGBoost algorithm was above 96% on each of the three datasets; however, the deep learning method was the most accurate. After a successful classification, we can add the new record and its classification label to the dataset, which extends and improves the dataset. We could also include a heuristic to improve accuracy and reduce computational costs. For instance, sleeping should have zero chance of occurring in the toilet, so it need not be considered as a possibility when computing. This allows resources to be concentrated on viable options when performing classification. Eliminating such computation can fine-tune the system, as only options with a high likelihood are evaluated and low-likelihood options receive no computation resources, which improves efficiency. When one class is larger than another, the dataset is unbalanced. This system usually works on unbalanced datasets, hence it was important for it to have good sensitivity and specificity.
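The location heuristic can be sketched as a filter over the activity scores produced by a classifier. This is a minimal illustration under stated assumptions: the rule table `IMPOSSIBLE`, the function name, and the activity and room names are hypothetical. The sketch only filters scores after classification; restricting the candidate classes inside the classifier itself would need classifier-specific support.

```python
from typing import Dict, Set

# Assumed example rule: activities that cannot occur in a given room.
IMPOSSIBLE: Dict[str, Set[str]] = {"toilet": {"sleeping"}}


def apply_location_heuristic(probs: Dict[str, float], room: str) -> Dict[str, float]:
    """Drop activities ruled out by the room and renormalise the remaining scores."""
    banned = IMPOSSIBLE.get(room, set())
    kept = {activity: p for activity, p in probs.items() if activity not in banned}
    total = sum(kept.values())
    if total == 0:
        # Nothing viable is left; fall back to the unfiltered scores.
        return dict(probs)
    return {activity: p / total for activity, p in kept.items()}
```

For example, scores of `{"sleeping": 0.5, "sitting": 0.3, "falling": 0.2}` observed in the toilet leave only sitting and falling, with sitting now the most likely activity.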

5.1 Edge processing and security

In this work, data sensing was performed using non-intrusive devices, namely a mobile phone and a smartwatch. However, the sensing method is replaceable within the structure of the system. The data can also be collected using a camera, in which case a labeled training and testing set is used to label the extracted images. In that approach, the mobile device collects the data but does not process it; because of the limited processing capability of the phone, the data is sent to a separate processing point [26]. In our project, we use an accelerometer and a gyroscope as input sensors; the rest of the system would be the same. An advantage of a camera sensor is that its data would be easier to label.

5.2 Replacement of training and testing dataset

Because a senior’s condition can deteriorate rapidly, the trends in the data collected from that senior change over time. Hence the dataset must be monitored and adjusted to match the rate of change in the senior’s trends. Otherwise, keeping the same training data for an extended period would result in an obsolete detection system [20]. Aging can change the gait pattern of an individual, hence the importance of updating the dataset constantly. Because of the limitations imposed by the current COVID-19 situation, we have had difficulty arranging our data collection activities. However, we believe the public datasets SisFall, MobiAct, and Ucihar have provided good insight into the records we could have collected and into the results we could have obtained. After sensor data is collected and labeled, the records from the SisFall, MobiAct, and Ucihar datasets would be gradually removed and replaced with these new records. This process continues until the dataset contains only the new records of the senior citizen, without the legacy data. Erroneously labeled data would be removed continuously to keep the training dataset robust and current.
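The gradual replacement of the legacy records can be sketched as a one-for-one swap. The swap ratio, the choice to retire the oldest legacy record first, and the name `replace_legacy` are assumptions made for illustration; the chapter does not specify the replacement rate.

```python
from typing import List, Tuple

Sample = Tuple[List[float], str]  # (features, label)


def replace_legacy(
    legacy: List[Sample], personal: List[Sample], new_samples: List[Sample]
) -> Tuple[List[Sample], List[Sample]]:
    """Add each verified personal sample and retire one legacy sample in exchange."""
    legacy = list(legacy)      # copy so the caller's lists are not modified
    personal = list(personal)
    for sample in new_samples:
        personal.append(sample)
        if legacy:
            legacy.pop(0)      # assumption: the oldest legacy record leaves first
    return legacy, personal
```

Repeating the call as new labeled data arrives shrinks the legacy part until the training set holds only the senior citizen’s own records.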

5.3 The medicine routine reminder service

The medicine reminder is a time-based scheduler. The schedule must be followed correctly; if it is not, an alarm is sent. The alarm is triggered when the senior fails to send a confirmation that the scheduled medicine was taken. The senior must send the confirmation once prompted to do so. However, there is a risk that the subject might falsely confirm taking the medicine when in fact they did not. The system currently cannot help when the senior deliberately misreports the state of the medicine routine; it is meant for cases where participants are willing to take the medicine. An alert is sent to caregivers, allowing them to prompt the senior about the state of the medicine routine.
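The confirmation check can be sketched as a scan over the scheduled doses. This is a minimal illustration, not the authors' scheduler: the `Dose` type, the `overdue_doses` function, and the 30-minute grace period are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Dose:
    due: datetime                            # scheduled time of the dose
    confirmed_at: Optional[datetime] = None  # set when the senior confirms


def overdue_doses(
    doses: List[Dose], now: datetime, grace: timedelta = timedelta(minutes=30)
) -> List[Dose]:
    """Return doses that are still unconfirmed after the grace period has passed."""
    return [
        dose for dose in doses
        if dose.confirmed_at is None and now > dose.due + grace
    ]
```

Each dose returned by `overdue_doses` would correspond to one alert to the caregiver. As the text notes, a false confirmation by the senior cannot be detected by this check.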

6. Conclusion

We conclude that this system can evaluate and monitor the situation and status of the senior citizen in their apartment. The system has been designed around three main algorithms. The first algorithm tests whether a fall has occurred. Eight common algorithms were evaluated to see which one is most effective. If a fall is detected, the caregiver is alerted. The second algorithm identifies what activity the senior citizen is doing and the location where the activity is occurring. If an activity such as sleeping or laying takes place in the wrong location, an alert is raised. The last algorithm detects whether the prescribed medicine routine is being followed. The system utilizes a dispenser that can record whether a pill was extracted from it. This is checked at the scheduled times. When there is no change in the medicine container after the expected dispensing time has elapsed, a warning message is sent to caregivers. The medicine reminder also enables caregivers to prompt the senior to follow the medicine routine when a dose has been skipped.

These three algorithms are the primary functions of this smart house design. The system uses them to perform detection. The system must use few resources and rely on the best-performing algorithm. The design would be updated with specific data when training for a specific client. The sample data is also to be replaced as the senior citizen’s patterns change. The final accuracy of the algorithms is 96–98% for the training session and 85–100% for the validation and testing sessions. Since the customized records are generated from the citizen, the training and validation results are expected to improve with each iteration. With a minimum sensitivity of 78.66% and a minimum specificity of 97.86%, the model performs well even though the datasets used are not balanced.

This prototype would allow the most effective classifier to be determined dynamically and then used. Dynamic evaluation of algorithm efficiency should be integrated, because the currently selected classifier will not always remain the most accurate as the training data is updated periodically. Using the location variable may also reduce the computing resources needed if it is integrated into the classifier algorithm; without such a logic heuristic, the computation would require more resources.

Acknowledgments

This work was supported by IGA/CebiaTech/2021/001, a research project of the Faculty of Applied Informatics, Tomas Bata University in Zlín.

References

  1. World Health Organization. Active Aging: A Policy Framework. No. WHO/NMH/NPH/02.8. Madrid Spain: World Health Organization; 2002
  2. Sander M, Oxlund B, Jespersen A, Krasnik A, Mortensen EL, Westendorp RGJ, et al. The challenges of human population ageing. Age and Ageing. 2015;44(2):185-187
  3. Kaluža B, Mirchevska V, Dovgan E, Luštrek M, Gams M. An agent-based approach to care in independent living. In: International Joint Conference on Ambient Intelligence. Berlin, Heidelberg: Springer; 2010. pp. 177-186
  4. Islam MM, Tayan O, Islam MR, Islam MS, Nooruddin S, Kabir MN, et al. Deep learning based systems developed for fall detection: A review. IEEE Access. 2020;8:166117-166137
  5. Reyes-Ortiz J-L, Oneto L, Sama A, Parra X, Anguita D. Transition-aware human activity recognition using smartphones. Neurocomputing. 2016;171:754-767
  6. He J, Zhang Z, Wang X, Yang S. A low power fall sensing technology based on FD-CNN. IEEE Sensors Journal. 2019;19(13):5110-5118
  7. Xu T, Se H, Liu J. A two-step fall detection algorithm combining threshold-based method and convolutional neural network. Metrology and Measurement Systems. 2021;28(1):23-40
  8. Usmani S, Saboor A, Haris M, Khan MA, Park H. Latest research trends in fall detection and prevention using machine learning: A systematic review. Sensors. 2021;21(15):5134
  9. Zhu C, Sheng W, Liu M. Wearable sensor-based behavioral anomaly detection in smart assisted living systems. IEEE Transactions on Automation Science and Engineering. 2015;12(4):1225-1234
  10. Paudel R, Eberle W, Holder LB. Anomaly detection of elderly patient activities in smart homes using a graph-based approach. In: Proceedings of the 2018 International Conference on Data Science. United States: CSREA Press; 2018. pp. 163-169. ISBN: 1-60132-481-2
  11. Hussain F, Umair M, Ehatisham-Ul-Haq M, Pires I, Valente T, Garcia N, et al, editors. An efficient machine learning-based elderly fall detection algorithm. In: SENSORDEVICES 2018, the Ninth International Conference on Sensor Device Technologies and Applications, Venice, Italy, 16–20 September 2018. United States of America: Xpert Publishing Services; 2018
  12. Vavoulas G, Pediaditis M, Chatzaki C, Spanakis EG, Tsiknakis M. The mobifall dataset: Fall detection and classification with a smartphone. International Journal of Monitoring and Surveillance Technologies Research (IJMSTR). 2014;2(1):44-56
  13. Özdemir AT, Barshan B. Detecting falls with wearable sensors using machine learning techniques. Sensors. 2014;14(6):10691-10708
  14. Bouchabou D, Nguyen SM, Lohr C, LeDuc B, Kanellos I. A survey of human activity recognition in smart homes based on IoT sensors algorithms: Taxonomies, challenges, and opportunities with deep learning. Sensors. 2021;21(18):6037
  15. Wisesa IWW, Genggam Mahardika. Fall detection algorithm based on accelerometer and gyroscope sensor data using recurrent neural networks. In: IOP Conference Series: Earth and Environmental Science. Vol. 258, No. 1. United Kingdom: IOP Publishing; 2019. p. 012035
  16. Jin M, Zou H, Weekly K, Jia R, Bayen AM, Spanos CJ. Environmental sensing by wearable device for indoor activity and location estimation. In: IECON 2014-40th Annual Conference of the IEEE Industrial Electronics Society. United States of America: IEEE; 2014. pp. 5369-5375
  17. Luo Z, Hsieh J-T, Balachandar N, Yeung S, Pusiol G, Luxenberg J, et al. Computer vision-based descriptive analytics of seniors’ daily activities for long-term health monitoring. Machine Learning for Healthcare (MLHC). 2018;2:1
  18. Venkatraman S, Overmars A, Thong M. Smart home automation—Use cases of a secure and integrated voice-control system. Systems. 2021;9(4):77
  19. Mahfuz S, Isah H, Zulkernine F, Nicholls P. Detecting irregular patterns in IoT streaming data for fall detection. In: 2018 IEEE 9th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON). United States of America: IEEE; 2018. pp. 588-594
  20. Delahoz YS, Labrador MA. Survey on fall detection and fall prevention using wearable and external sensors. Sensors. 2014;14(10):19806-19842
  21. Sucerquia A, López JD, Vargas-Bonilla JF. SisFall: A fall and movement dataset. Sensors. 2017;17(1):198
  22. Vavoulas G, Chatzaki C, Malliotakis T, Pediaditis M, Tsiknakis M. The MobiAct dataset: Recognition of activities of daily living using smartphones. In: International Conference on Information and Communication Technologies for Ageing Well and e-Health. Vol. 2. Portugal: SCITEPRESS; 2016. pp. 143-151
  23. Anguita D, Ghio A, Oneto L, Parra X, Reyes-Ortiz JL. Energy efficient smartphone-based activity recognition using fixed-point arithmetic. Journal of Universal Computer Science. 2013;19(9):1295-1314
  24. Liu L, Hou Y, He J, Lungu J, Dong R. An energy-efficient fall detection method based on FD-DNN for elderly people. Sensors. 2020;20(15):4192
  25. Thakur N, Han CY. A study of fall detection in assisted living: Identifying and improving the optimal machine learning method. Journal of Sensor and Actuator Networks. 2021;10(3):39
  26. Pan D, Liu H, Dongming Q, Zhang Z. CNN-based fall detection strategy with edge computing scheduling in smart cities. Electronics. 2020;9(11):1780
