Semantic Annotation of Mobile Phone Data Using Machine Learning Algorithms

Cell phone call location data has been utilized for the study of travel patterns, but the underlying activities that originate the movement are still at a less explored stage. Resulted from routine and automated features of decision-making processes, human activity and travel behaviour exhibit a high level of spatial-temporal periodicities as well as a certain order of the activities. In this chapter, a method has been developed based on these regularities, which predicts activities being conducted at call locations. The method includes four steps: a set of comprehensive variables is defined; feature selection techniques are applied; a group of state-of-the-art machine learning algorithms and an ensemble of the above algorithms are employed; an additional enhancement algorithm is designed. Using data gathered from natural communication of 80 users over a period of 1 year, the proposed method is evaluated. Based on the ensemble of the models, prediction accuracy of 69.7% was achieved. Using the enhancement algorithm, the performance obtained 7.6% improvement. The experimental results demonstrate the potential to annotate call locations based on the integration between machine learning algorithms and the characteristics of underlying activity and travel behaviour, contributing towards the semantic interpretation and application of the massive data.


Problem statement
Nowadays, cell phones are frequently used as an attractive means for sensing human behaviour on a large scale. They provide a source of real and reliable data, enabling automatic monitoring call and travel behaviour of users. Studies have been conducted to discover statistical laws that govern the key dimensions of human travel, e.g. travel distance and time spent at different locations [1]. These studies provide a modelling framework capable of describing general features of human mobility.
However, despite the discovery of these general features, previous studies do not provide further insights into the motivation or activities behind the identified mobility features. In general, most of the current research on cell phone data has focused on spatial-temporal dimensions. The behavioural aspects associated with the mobility features, e.g. travel mode and activities being conducted at the locations, are still at a less studied stage. Due to privacy concerns, cell phone data provided by phone operation companies usually does not have contextual information, leading to a wide gap between the raw data and the semantic interpretation of the traces. If a method can be found which helps to bridge this gap, the potential applications of the semantically enriched phone data are immense. They include inferring people's travel motivations in activity-based transportation modelling, mining individual life styles and activity preferences in urban planning, and providing activity tailored services in the cell phone environment [2].

Related state of the art
Methods have been developed to derive activities being conducted at a location from global positioning systems (GPS)-based data or from multi-modal data recorded by cell phones. The GPS-based methods first decompose continuous GPS points into a chain of stops, where the individual stays for a minimum period of time conducting activities, and moves that are the points between two consecutive stops. The stops are then compared with a geographic map by matching them in space, and interesting places that are relevant to the studies are subsequently found. The GPS-based methods have received much attention during the past years [3], but are still faced with a number of limitations. (1) The data collection process is expensive in terms of battery consumption of GPS devices. (2) Linking a GPS trajectory to detailed geographic information on all interesting places in a study area needs a lot of computational work. (3) The methods are location-specific, and the quality of the annotation process depends on the study area, making the process not transferable to other areas. (4) The matched location alone may not disclose a particular reason of why an individual travels there. A person could go to a place (e.g. a shopping mall) with different purposes (e.g. working, shopping or having a lunch). (5) The matching of exact GPS positions raises privacy concerns, as some of the places visited by an individual may be highly privacy-sensitive. Some of the above-described limitations have been addressed by the annotation process based on multi-modal data recorded from sensors equipped on cell phones [4]. This process is composed of two steps. In the first step, data from GPS and other sensors (e.g. Wi-Fi and accelerometer) is collected from each individual. The data is then clustered into a number of visit places, each of which is represented by an ID number rather than geographic positions of the cluster points. In the second step, the obtained places are annotated based on contextual information from the sensors and phone applications, as opposed to GPS data. In this process, various machine learning methods are proposed, and different sets of features are defined [5]. These studies achieved good prediction performance without the need of additional geographic information and GPS data. Nevertheless, while the machine learning methods eliminate the need for a map, this entire annotation process still partly relies on GPS data for the identification of visit places in the first step. Thus, this process as a whole does not fully address the privacy concern. On top of that, while these studies mainly focus on selecting efficient classification models and relevant features, none of them have conducted postprocessing analysis to examine how the predicted results are consistent with the sequential information that is embedded in daily activity and travel sequences. In-depth examination into the prediction errors is also lacking in these studies.

Research contributions
Extending the current research on annotating people's movement traces, our study proposes a new approach. The method utilizes data collected from simple cell phones, and it combines machine learning methods with the characteristics of underlying activity and travel behaviour that originates the traces. It has the following advantages over the existing studies. (1) The method is based on spatial-temporal regularities as well as sequential information intrinsic to human activity and travel behaviour. (2) It does not depend on additional sensor data and map information, reducing data collection costs and increasing transferability. (3) An enhancement algorithm has been developed to improve the prediction results by machine learning methods. (4) A set of extensive experiments and in-depth examination into the classification errors have been conducted. (5) Compared to GPS points, the wide coverage of a cell ID allows the process to reduce privacy concerns considerably.
The rest of this paper is organized as follows. Section 2 introduces the cell phone data and Section 3 elaborates on the annotation process. Experiments are conducted in Section 4 and examination into the experiment results is carried out in Section 5. Finally, Section 6 ends this chapter with major conclusions and discussions for future research.

Data
The cell phone data is composed of full mobile communication patterns of 80 users over a period of 1 year, collected by a European phone company for billing and operational purposes. The data records the location and time when each user performs a call activity, including initiating or receiving a voice call or message. The locations are represented with cell IDs, each of which has a coverage ranging from a few hundred square metres in cities to a few thousand in rural areas. The users along with their phone numbers and the corresponding cell IDs are all anonymized. Table 1 illustrates typical call records of an individual identified as '10027534' on a day.
Among all the users, 9132 distinct call locations were detected and 259 (2.8% of the total identified locations) were labelled with activities conducted at these places. These labelled locations are used as the ground-truth data for training and validating our models. Activities are divided into five types, including 'work/school', 'home', 'social visit', 'leisure' and 'nonwork obligatory', accounting for 30, 29, 15, 14 and 12% of the training data, respectively. The type of 'work/school' represents all work-or school-related activities outdoors; while 'home' accommodates all time spending at home. 'Social visit' refers to all visit activities, 'leisure' includes recreational activities outside home, e.g. sports and eating/drinking, and 'non-work obligatory' consists of activities like bringing/getting people, shopping and personalized services. If activities in multiple types are executed in the same location for a particular individual, the most frequent activity is selected, such that each location is uniquely linked to an activity type for the individual.

Overview of the approach
The approach incorporates basic knowledge about human activity and travel decision-making processes and their resultant activity and travel behaviour. As Liu et al. [6] underlined, human activity and travel decision-making processes demonstrate routine and automated features. People do not generally schedule their activities on a daily basis; but rather depend on fixed routines or scripts executed during the day without much alteration. This leads to a high level of spatial-temporal regularities in activity and travel behaviour as well as a certain sequential order of the activities [6]. The spatial-temporal recurrences of the locations can be adequately reflected in the movement traces of cell phone users through a long period of call records. In addition, the spatial-temporal constraints of locations, stemming from the characteristics of various activities, which are performed in their own daily, weekly or monthly rhythms, can thus suggest the possible activities carried out at the locations. This enables the annotation for the third dimension, i.e. travel motives (activities). Furthermore, evidence also suggests that activity and travel behaviour differs across various time periods of a day, between weekdays and weekends, and between normal days and holidays [7].  The columns, respectively, denote the user, cell ID, time and duration (in minutes) of the call, the call type including 'voice call' and 'message' and the direction including 'incoming', 'outgoing' and 'missed calls'. Table 1. Call records of a user. a models and an additional ensemble method to combine these prediction results are employed.
(4) An enhancement algorithm is developed to improve the annotation performance based on sequential constraints of the activities.

Variable definition
For each user, all distinct locations, where the person has performed at least a call activity during the entire data collection period, are extracted. Let N as the total number of these locations. At each location Loc i (i = 1…N), a set of variables is defined from two perspectives, including the call behaviour and the underlying travel behaviour. The call behaviour defines the variables that are directly related to call communication activities. Most of the variables are also used in the multi-modal data annotation process, as described in Section 1. The travel behaviour, however, approximates the spatial-temporal features of a location. The difference between these two perspectives can be illustrated by two groups of major variables. The first group includes the call frequency CFreqR and visit frequency VFreqR. CFreqR depicts how often calls are made at a location; by contrast, VFreqR reveals how often the location is reached, irrespective of the number of calls that are made at each visit. The second is the call duration CDur and visit duration VDur. CDur describes the duration of the call; while VDur is defined as the time interval between the first and last calls at the location. Apart from the different perspectives, all the variables are also divided based on spatial-temporal factors, including spatial repetition, temporal periodicity, day types and day segments. All the variables are listed in Table 2.
In terms of day segments, different definitions of time periods have been adopted, depending on the context of the study area [8]. Instead of making such an a priori assumption, a method that is proposed in this study estimates the splitting points of the day from empirical data. The resultant splitting points delimit the largest difference in the distribution of various activity types across these time intervals. Specifically, the segment process starts with a full day of 24 hours, and each hour is examined independently. An hour under investigation divides the day into two time intervals, e.g. 0-10 am and 10 am to 24 pm at 10 am. A contingency table is then constructed, in which these two time intervals and the five activity types are the row and column variables, respectively. The frequencies of the aggregated observations from the labelled call locations that fall into the corresponding time intervals and activity classes are the cell values. A chi-square statistics is subsequently calculated for this table. After chi-square statistics is obtained for each of the 24 hours, the hour with the largest statistics is chosen as the first splitting point, denoted as S 1 . This point divides the day into two intervals between 0 and S 1 as well as between S 1 and 24. This process is repeated for each of the latest formed intervals, until further splitting does not generate substantial difference or until a pre-specified number of intervals is reached.

Feature selection
Due to the small size of the training dataset, particularly relative to the large number of defined variables, over-fitting is a potential problem. To address this issue, feature selection techniques are employed in order to decrease the number of predictors actually utilized by the classification models. Two methods including wrapper [9] and filter [10], which have shown effectiveness in the multi-modal data annotation process, are chosen for feature selection. Wrapper searches for an optimal feature subset using the classification model itself. In contrast, filter examines each feature separately and selects the feature that has high correlation with the target variable, but low relation with the features that have already been chosen.

Machine learning
A group of state-of-the-art machine learning algorithms, including decision trees (DTs) [11], random forests (RF) [12], multinomial logistic regression (MNL) [13] and multiclass support vector machines (SVMs) [14], are employed. These algorithms have demonstrated comparative performance for multi-category classification problems. These methods mainly differ in terms of the way the classification question is formulated, the learning function and the solution to deciding the optimal function parameters. As each learning algorithm has its strength and weakness, it is often challengeable to identify a single algorithm that performs best for a particular classification problem [15]. Thus, in this study, a fusion process is   developed, which integrates the results of these algorithms, in order to utilize the strength of one while complementing the limitation of another. In this process, the four individual model prediction results (i.e. the probabilities of different possible activity types) for each call location are used as predictors, and the observed activity types are still as the dependent variable. The correlation between these predictors and the observed activity types can be built again by a classification model.

The enhancement algorithm
While machine learning methods provide an effective solution to annotating each single location, they disregard the activity orders and transitions embedded in daily activity and travel sequences. When the annotated locations on a day are linked according to the temporal order, they should follow a certain sequential constraint. The interdependencies of daily activities have been considered as a crucial factor in activity and travel decision making, as discussed in Section 3.1. By considering sequential information, the activity locations that are accessed by an individual on a day are viewed and tackled as a whole, rather than isolated participation in activities.
The enhancement algorithm takes the preliminary inference results as well as the sequential knowledge as inputs and aims to improve the prediction. The method is composed of two components: transition probability-based enhancement and prior probability-based enhancement. Figure 1 illustrates how the prediction is improved using a daily location sequence of a user.
According to the training data of the user, he/she has conducted the chain of activities of 'work-social visit-work' at the respective call time on a day. But the prediction from the classification models is 'work-non-work obligatory-work'. A prediction error occurs at the second location. In this case, if a location (e.g. the second location) has a prediction probability P (0.443) smaller than a threshold T 1 (0.72 in our case study), it is assumed that the location is likely to be wrongly annotated. The enhancement algorithm is then applied to the false location to improve its prediction in the following steps. (1) If there is an additional location adjacent to the false one (including backwards and forwards) in the predicted sequence for that day and if this location has P larger than a threshold T 2 (0.9), it is considered as possibly correct prediction. The additional location is thus used to fix the prediction of the false one, using the  transition probability-based enhancement.
(2) Otherwise, if no other locations in the neighbouring areas are predicted with a high probability, the prior probability-based enhancement method is employed to increase the prediction accuracy based on the call time at the false location. After recalculation, the activity type with the largest enhancement probability P' is chosen as the annotation result of the false location on that particular day. As a location may be repeatedly visited on multiple days, the multiple days' enhancement results are integrated by majority voting rules as the final annotation for the location. Under the appropriate parameters T 1 and T 2 , the false prediction is likely to be corrected while accurate inference results are maintained. Figure 2 demonstrates the details of the enhancement process.

Transition probability-based enhancement
The sequential information is represented in a 5 Â 5 transition probability matrix between different activities. Let a i and a j (a i , a j = 1,…5) as the activities performed at the previous location i and current location j, respectively; Tr(a j |a i ) as the transition probability from a i to a j , calculated from the training data as follows:  (k=1… N(d)).

Next location k=k+1
Prior probability-based enhancement is applied Obtain the final classification based on multiple days' enhancement results If exists a second adjacent location with P>T2?
A revised probability P' for the location k is calculated Tr a j ja i F(a j |a i ) is the frequency of a j followed by a i . The probability of the location j being annotated as a j conditioned by a i at the previous location i can be recalculated as P 0 (a j |X) according to Eq. (2).
P(a j |X) is the result of the classification model. It is noted that P 0 (a j |X) is biased towards frequently visited locations, e.g. home and work/school places, as transitions to these places are more likely than to other less visited locations. Consequently, most of the locations under Eq.
(2) will be redirected to these two activity types. To overcome this, Tr(a j |a i ) is divided by the frequency of a j , resulting in the probability Qr(a j |a i ).

Qr a j ja
P 0 (a j |X) can be revised as P' (a j |X).
In the user's case, as shown in Figure 1, since the transition probability Qr from work to nonwork obligatory activities is very small, after the enhancement, P' (non À work À obligatory) (0.008) drops behind P' (visit) (0.033), we get the visit activity as the revised annotation.

Prior probability-based enhancement
The above-described transition probability-based enhancement involves at least two locations, which are adjacent in time, and one of which has a prediction probability larger than T 2 . However, such daily trajectories derived from the classification models are not always available for each day. For example, one of the neighbouring locations has a probability smaller than T 2 . Or, in the case where people may stay at a location (e.g. home) during an entire day, engaging only in a single (home) activity. This is particularly true with cell phone data. People may not make calls when travelling to an activity location, resulting in the daily movement traces not being fully revealed by their call data. In these cases, we utilize the typical activity and travel behaviour at different time of a day through the prior probability distribution of the activity a j at different call time t, i.e. P(a j |t). By applying Bayesian methods, we compute the posterior probability of a j based on X and t, i.e. P' (a j |X, t). This probability can be computed as follows, with the assumption that X is independent of t.
P(a j |X) is the output of the classification model, i.e. the probability of a j performed at the location j conditioned on the previously defined variables X. When P(a j |X) is compared with the new probability P' (a j |X, t), since t is added in the conditional part of P', the new probability is more discriminative and informative than P.
P(a j |t) and P(a j ) can be derived from the training data as follows: Here, F(a j |t) refers as the occurrences of a j at t and F(a j ) refers as the occurrences of a j at all time.
It should be noted that from the theoretic perspective, the above enhancement process has two weak assumptions. One is the replacement of P(a j |X) with the result of the classification model and the other concerns the hypothesis of the independence between X and t. Nevertheless, based on Eq. (5), the preliminary prediction probability is complemented with the prior probability distribution.

Case study
In this section, adopting the proposed method and using the cell phone data described in Section 2, a set of experiments is presented. The results of these experiments are discussed and the performance of the annotation process is evaluated. Table 3 lists the optimal points for each of the intervals, based on the method described in Section 3.2. The first splitting point over an entire day was found at 9 am, generating two intervals of 0-9 am and 9 am to 24 pm. This process was iterated for each of the two newly obtained intervals. If the largest chi-square value over all potential points of an interval was lower than a predefined threshold, i.e. 200 in this experiment, this search stops. Figure 3 further shows the evolution of the chi-square statistics, in which the first 3 orders yield much higher values than the remaining ones. From the fourth order on, the statistics starts to decline sharply. Thus, the first 3 optimal points were extracted and 4 time periods were generated including 0-8:59 am, 9-13:59 am, 14-18:59 pm and 19-23.59 pm. After each day was segmented into the four periods, all the variables defined in Table 2 were obtained and used as candidates for subsequent feature selection and machine learning. Weka, an opensource Java application consisting of a collection of machine learning algorithms for data mining tasks [16], was used for the implementation.

Results of individual classification models
The original training dataset is randomly divided into 10 subsets. In each model run, one of these subsets is used as the validation data and the remaining subsets combined as the training data. The number of correctly annotated locations in the validation subset is denoted as C i (i = 1…10). Let Num as the total number of locations in the training dataset; the prediction accuracy can be defined as follows: The rows, respectively, denote the current interval (hour) under investigation, the optimal splitting point S, the chisquare value, the decision on whether or not the interval is split (if it is 'Yes' then two new intervals are formed and if it is 'No' then the symbol 'X' is used), and the order of the optimal points according to the chi-square values.  The individual classification models are built on the features of locations drawn from the perspectives of both travel and call behaviour as well as on the features profiling only call behaviour, respectively. In addition, the models are also run separately on all candidate variables as well as on the variable subsets that are chosen by filter or wrapper. The prediction results with the best parameter setting in each case are presented in Table 4.
From the prediction results, the following observations can be drawn. (1)

Results of fusion models
In this fusion process, the four individual classification models are, respectively, employed as the fusion models to predict the activity types, while the results from each of the classifiers with the best parameter performance shown in Table 4 are used as the predictors. The prediction with the two best performances for each fusion model is presented in

Transition matrix
Similar to the temporal variables, the transition matrix is also built for weekdays, weekend and holidays separately as well as for different periods of a day. The identification of optimal cutting points for the matrix is the same as the previously described method, except the time intervals. For each potential dividing point, two intervals but three scenarios are obtained depending on the time of the two concerned activities in the transition. The first and second scenarios occur when both activities take place in the first interval or in the second. The third scenario is when the first activity takes place in the first interval and second activity in the second interval. Given the small size of the training set, only the first significant cutting point was identified, which is 18 pm. Under this time division, the largest difference in the distribution of activity transitions is among the three scenarios: transitions within 0-17:59 pm or 18-23:59 pm, and transitions from 0-17:59 pm to 18-23:59 pm. Table 6 shows the transition matrix in the first scenario during weekdays. The rows represent the fusion models, and the columns include the individual classifiers and the prediction accuracy. X indicates the corresponding individual models being chosen as the predictors. As expected, for the probability Tr(a j |a i ), the highest values are dominated by the transitions to either home or work/school activities. With Qr(a j |a i ), however, the dominance of these two activities is reduced by their high frequencies, and transitions to other less represented activities are exposed. This can be manifested by the high transitions from home to non-work activities and from social visit to second social visit locations.

Activity distribution at different time
The activity distribution is also differentiated between weekdays, weekend and holidays. The weekday distribution at each hour P(a j |t) is shown in Figure 4(a) and the distribution of the The row and column represent the current and previous activities respectively; the maximum probability for each column is in bold.  ratio between P(a j |t) and the overall probability of the activity P(a j ) is depicted in Figure 4(b). These two distributions show remarkable deviation: in Figure 4(a) either home or work/school types dominate the activities, whereas in Figure 4(b) the most likely activity shifts across various types as the day unfolds.

Selection of T 1 and T 2
Based on the previous results, two fusion models, including MNL built on all the four individual classifiers and RF on the combination between this model and MNL, are selected for the enhancement algorithm. To decide the threshold T 1 , the correlation is examined between different values of T 1 and the prediction rates of the fusion models, as shown in Figure 5. It is observed that for both models, when T 1 is below the crossing point of 0.72 in Figure 5(a) and 0.8 in Figure 5(b), the number of false prediction is higher than that of the correct one. Thus, 0.72 and 0.8 are selected as T 1 for MNL and RF, respectively. T 2 is set as 0.9, above which the prediction rate is 69.7 and 66.4% for these two models. Table 7 presents the prediction results by the enhancement algorithm (in the column 'After'), along with the results before the enhancement (in the column 'Before') as well as the difference between these two prediction results (in the column 'Difference'). Overall improvement of 4.4 and 7.6% for MNL and RF is achieved. The examination into the results across various activities discloses that the enhancement algorithm particularly performs better on less representative activity types, e.g. non-work obligatory, social visit and leisure activities. This could be originated from the fact that the machine learning algorithms usually favour majority types if the prediction accuracy is used as the evaluation criterion, while the enhancement algorithm puts equal weights on all activity types of the dependent variable (call locations).

Enhancement results
The effectiveness of each of the two enhancement methods is also investigated, by running the RF fusion model using each of these methods independently to revise a weak prediction result. The prediction rates of 73.7 and 75.2% were obtained for the transition probability-based and prior probability-based enhancement methods, respectively. Due to the small size of the training set, many locations are labelled as one single known activity of a day, the sequential information is thus not available on these days. With a large dataset, the transition matrix would better represent typical activity and travel behaviour of users. This would lead to the transition probability-based method and the enhancement algorithm as a whole bringing greater improvement over the current experimental results. (1) Home. Homes are featured with high visit frequencies and spatial-temporal regularities. However, seven homes are misidentified, of which five have lower visit frequencies than 10% on weekdays, i.e. less than 1 in 10 trips on weekdays ending at home. The unusually less visited homes could be due to the fact that the corresponding users spend less time at home and/or they make fewer calls than expected at home. This results in the home visit frequencies less represented by their call records. Alternatively, some of the misclassified locations can be a second home for users who already have a home at different locations. Two in these five users have two labelled homes. While their second homes are occasionally accessed, their main homes are routinely visited and correctly annotated. (2) Work/school. Like homes, work/school locations are also characterized by a high level of routine visits, but these two types differ regarding the time of the visits. While most of the trips to homes are at night and weekends, trips to offices or schools occur during the daytime on weekdays. Of all the work/school  Table 7. Prediction result comparison between before the enhancement algorithm and after that (%).

Analysis on the prediction results
locations, 10.1% are wrongly predicted as non-work obligatory or social visit activities if they are accessed infrequently during weekdays. All the corresponding users work/study at multiple places, and the misidentified locations are their additional work/school places. Another 10.1% are mistaken as homes, if they have high visit frequencies at weekend. For instance, one of these users has two labelled work locations. They were visited at rates of 32% during weekdays and 42% on Sunday, respectively. While the first one was correctly identified, the second one was wrongly predicted as home. This suggests that the work regime plays an important role in distinguishing work locations from homes. While most people work during weekdays, certain minorities work on different shifts, especially to weekends or nights, generating distinct activity and travel patterns from the main stream of the population. (3) Nonwork obligatory. The activities have low visit frequencies and short duration. The misclassification of the activities can be partially attributed to a combination of heterogeneity within this category. The various detailed types of the activities are likely performed at spatially independent locations and temporally varied preferences. For instance, shopping is mostly done in later time of the day than service or bringing/picking up activities. (4) Social visit. The activities are profiled with a middle level of visit frequencies during weekdays. If the locations are accessed less, they tend to be annotated as leisure or non-work obligatory activities; if more, they are considered as home or work/school places. The limited predictability could be caused by the underlying structure of an individual's social network, in which various degrees of relationship exist, ranging from closed one they visit routinely to the one they just meet occasionally. This generates variations in spatial-temporal features of the locations. (5) Leisure. Leisure activities are conducted in various places and at different time for an individual; they exhibit the lowest level of regularities and thus are the most challengeable to annotate. Apart from the spatial-temporal irregularities, the examination into two falsely predicted leisure locations reveals additional causes for the misclassification. The first one has a visit frequency of 36.3% in both the afternoon and evening on weekdays. It was the second most visited place for the corresponding user who has accessed this place 170 times over 337 days, such that 1 in 2 days he/her was observed there. This location is originally labelled as a restaurant; however, the call records suggest a high probability that he/her may work there instead of eating as a customer. The second location was ranked as the most visited place for the concerned user. He/ she has in total conducted 383 visits over 442 days during both weekdays and weekends as well as at night. Nearly three in 4 days, he/she made calls there. Furthermore, the user has five locations collected in the training dataset, but none of which is labelled as home. This location is documented as sports; however, for this particular user, it is likely that this place is a home rather than a recreation site. While further investigation into the above two typical cases is needed before any definite conclusions are drawn, they nevertheless illustrate that our annotation method based on underlying activity and travel behaviour can effectively predict the activities, which are tailored to each individual. A location may have a single or multiple functions, but people visiting there could have different purposes. The match with geographic information alone is not able to identify this distinction. We shall call the location annotation at the individual level as micro-location-annotation.

Conclusions and future research
In this study, a cell phone location annotation method has been developed based on spatialtemporal regularities as well as sequential information intrinsic to activity and travel behaviour. The method does not depend on additional sensors and geographic details. The data requirement is simple and its collection cost is low. It is also generic to be transferable to other areas. On top of that, the method is independent of precisely geometric positions of individuals, thus considerably reducing privacy concerns.
Experiments on the annotation method using data collected from natural phone communication of users have achieved 76.6% prediction accuracy. With this probability, the activity conducted at a location for a user can be predicted by the spatial-temporal features of the visits disclosed by his/her call records. Furthermore, this study also shows the added value of the integration between machine learning methods and underlying activity and travel behaviour when annotating the location traces.
Nevertheless, despite the spatial-temporal regularities, activity locations still share commonalities in these two dimensions at a certain degree. Activity and travel behaviour is not solely decided by spatial-temporal elements, it is also affected by socio-economic conditions. The first improvement in future research should thus take this general background information into account. In particular, to address the potential causes for misclassifications of home and work/ school locations, the annotation should be combined with the information on the number of home and work/school places of users as well as their work sectors and regimes. A broad picture of users' social networks, obtained from direct surveys and/or social networking sites, would strengthen the prediction of social visit activities. For non-work obligatory and leisure activities, the detailed types in each of these two categories should be handled separately, if a sufficient size of training data for the detailed types is available. The second improvement lies in finding an effective way of annotating locations, which are visited for multiple purposes for a particular user. While this study links the most frequent activity to a location, it dismisses additional activity types, which are performed by the user at different parts but within a same cell. In the training dataset, 5% of all the locations are visited for multiple purposes.
Today when simple phones are still prevalent constituting nearly 85% of total global handsets in use, this research makes undoubtedly an important contribution to the semantic explanation of the movement data. With the development of smart phones, the data from additional sensors installed on the phones will provide a third possibility of improvement by integrating the contextual information into the annotation process.