Personal destination pattern analysis with applications to mobile advertising
Osama O. Barzaiq and Seng W. Loke
https://doi.org/10.1186/s1367301600732
© The Author(s) 2016
Received: 7 February 2016
Accepted: 5 July 2016
Published: 10 October 2016
Abstract
Many researchers expect mobile advertising to be the killer application in mobile business. In this paper, we introduce a trajectory prediction algorithm called personal destination pattern analysis (PDPA) to analyse the different destinations in various trajectories of an individual, and to predict a trajectory or a set of destinations that could be visited by that individual. The PDPA algorithm works on an individual level. Every destination-pattern analysis is based on the self-history and the personal profile of a targeted individual, not on what others do. In addition, we developed a prototype system called SmartShopper. SmartShopper is a personal destination-pattern-aware pervasive system for mobile advertising in (outdoor and indoor) retail environments. The predicted destinations from the PDPA algorithm are used by SmartShopper to generate a list of relevant advertisements adapted to the personal profile of previous destinations of a targeted individual. We tested the destination prediction accuracy of the PDPA algorithm with a synthetic dataset of a virtual mall and a real GPS dataset.
Introduction
The significant rapid growth of mobile advertising is making advertising companies consider mobile phones as a new platform for advertising revenue [1, 2]. One of the strategic places for advertising is retail environments, as most mobile phone users visit a mall to purchase products or services, and many researchers are exploiting different methods to generate a personalized list of advertisements that could capture the interest of a mobile phone user [3–5].

The main contributions of this paper are:

a novel approach to predict a set of destinations that could be visited by a targeted individual using only the personal history of that individual,

a new perspective on generating a list of advertisements for users, i.e. by using the predicted set of destinations and the personal profile of a targeted individual,

highlighting the use of self-histories for destination prediction, rather than group-histories (as in typical clustering-style prediction approaches), i.e., we predict based on what the person typically does, rather than what people typically do,

highlighting the notion that taking into account the day of the week is crucial in capturing human routines and regularity, i.e. to predict what a person will do on a given day, say a Tuesday, it can be more useful to look at what the person typically does on Tuesdays, rather than what the person does on all days of the week, and

highlighting two kinds of behavioural regularity, in terms of the number of destinations visited and the actual destinations visited.
In order to illustrate the application we have in mind, we developed a prototype system (SmartShopper) and a destination prediction algorithm called personal destination pattern analysis (or PDPA, for short). SmartShopper uses the PDPA algorithm to predict the set of destinations that could be visited by a shopping mall visitor. Then, SmartShopper uses the predicted destinations and the personal history of the targeted mall visitor to generate a list of advertisements that aims to capture the interest of that visitor with high probability. We tested the prediction accuracy of the PDPA algorithm on a synthetic dataset of an indoor mall and on a real GPS dataset of outdoor D-trajectories for nine persons (the GPS dataset was collected by the LifeMap system of [6]). In these experiments, the PDPA algorithm achieved an average F-measure of 60.77 % on the synthetic dataset and 52.97 % on the GPS dataset.
The rest of this paper is organised as follows. "Related work" section reviews related work. "The PDPA algorithm" section presents our D-trajectory prediction algorithm. "Datasets" section gives the details of the synthetic dataset of the virtual mall and of the GPS dataset of [6]. "Evaluations and results" section shows the experimental results. The details of the prototype system, SmartShopper, are discussed in "System prototype" section. "Conclusion and future work" section concludes the paper with future work.
Related work
Asahara et al. [7] developed a prediction method that uses a mixed Markov-chain model (MMM) to predict the next location of an individual. Their method categorizes individuals into groups, assuming that the individuals in each group behave similarly. Accordingly, they build a mixed Markov-chain model for every group, and every MMM uses one unobservable parameter. The unobservable parameter is used to determine which mixed model of which group of individuals should be used to generate the transition matrix. They tested the prediction accuracy of their MMM on a physical dataset that was collected during a big event held in a famous shopping mall in Japan. The trajectories of 691 participants were recorded for 90 min during the event. The recorded trajectories were then divided into 10 sets: 9 sets were used for training and one set was used to test the prediction accuracy of MMM. For the prediction accuracy test, they built 190 mixed Markov-chain models and the average accuracy was 64 % [7]. However, recording the participants’ trajectories during a particular event for a specific period of time could result in significant similarity between most of the recorded trajectories, because the participants mostly moved along the designated path of the event in the mall. Moreover, they used 190 mixed Markov-chain models to achieve the prediction accuracy of 64 %, whilst one Markov-chain model with one transition matrix for the 691 participants was able to achieve a prediction accuracy of 45.6 %. Dividing the participants into groups and building 190 mixed Markov-chain models both take time, in addition to the time required to predict the next location of an individual.
Kolodziej et al. [8] developed an algorithm using an activity-based continuous-time Markov jump process, which is an extension of the AMPuMM algorithm of [9]. Instead of the discrete time set used in AMPuMM [9], they used a continuous-time Markov process to build a model of the users' movements. Their algorithm was able to predict the next location of a mobile user within a range of 115–250 m from the correct location of that user. Nevertheless, when the prediction algorithm uses only a set of four locations and the predicted location is that far from the correct location, the algorithm needs improvement to handle a bigger set of locations and to provide more accurate predictions. In contrast, our PDPA algorithm is tested using a synthetic dataset of a virtual mall that has 72 stores and a real GPS dataset for nine persons with between 109 and 448 locations each.
Gambs et al. [10] developed an algorithm called nMMC. The nMMC algorithm predicts the next place of an individual by using the Mobility Markov-chain (MMC) model and the last two visited locations. They built a transition matrix that holds the probabilities of all the different transitions of an individual to place k after visiting place i and place j. The nMMC algorithm achieved a prediction accuracy ranging from 70 to 95 %. However, nMMC predicts the next place of an individual among only a small set of three possible locations: home, work, and other. Furthermore, if the next place predicted by nMMC for a targeted individual is Other, the researchers do not explain how such a prediction could be used to generate and send useful and relevant information to that individual. Moreover, in their algorithm, n has a constant value of 2, which represents the two previously visited locations, whereas our prediction algorithm, PDPA, uses a version of a transition matrix that has a memory size of n, which means it holds the probabilities of visiting different destinations after a targeted individual has visited n locations, where n is determined by analysing the dataset.
Another group of researchers designed a framework called LASA (Location Aware Shopping Advertisement) that uses an ontology-based formulation of client and product profiles to generate a list of ads related to the selection history of a targeted client [5]. LASA uses the shopping mall’s WiFi access points to detect the current location of a client, and once the client’s location is detected, it sends a list of titles of the different products available in all stores in the coverage area of the access point that discovered the client’s current location. Nevertheless, the coverage area of an access point may include a large number of stores, which could result in a long list of product titles from those stores. Furthermore, several access points could detect the signal of a client’s mobile phone, which could affect the accuracy of determining the current location of that client. Consequently, failing to discover the correct location of a targeted client will hinder the process of determining the list of product titles that should be sent to that client. In contrast, our PDPA algorithm predicts a set of stores that could be visited and then generates a list of ads for those stores, instead of generating a possibly long list of ads for every store in the discovered area of a targeted client as in LASA.
Kim et al. [3] developed a system called AdNext that uses Bayesian networks to build a transition matrix for the shopping mall clients in order to predict the business type of the next location that a targeted client could visit. AdNext uses the business type of the last two locations visited by a targeted client to predict the business type of the next location for that client. In addition, AdNext detects the current location of the targeted client by using the shopping mall’s access points. Based on the discovered location of the targeted client and the predicted business type, AdNext generates a list of ads for every store that falls under the predicted business type and lies in the area of the detected location. However, predicting the exact next store that could be visited by a client is significantly different from predicting the business type of the next location. Predicting the business type of the next location could result in generating many ads, because such a list will include ads from every store that falls under the predicted type of business. Furthermore, the knowledge of the business type of the last two visited locations is crucial for AdNext to be able to predict the business type of the next location. Consequently, if a targeted client plans to visit only one or two stores, AdNext will not be able to make any predictions. Instead of predicting the business type, our prediction algorithm, PDPA, predicts a set of stores that could be visited by a targeted client, and then uses the predicted stores to generate a list of advertisements for the targeted client. In addition, the PDPA algorithm can predict the first store to be visited by a targeted client, while AdNext needs to wait for the targeted client to visit at least two stores before it can make any predictions.
The PDPA algorithm
We developed a trajectory prediction algorithm called PDPA (personal destination pattern analysis). The PDPA algorithm predicts what we call a D-trajectory, which is a trajectory of destinations. The D-trajectory is a subsequence of the actual walking trajectory of a targeted individual (containing meaningful destinations, where meaningful is determined by the domain).

Targeted analysis period and targeted trajectories. The PDPA algorithm will analyse up to a history window of 2 years of the different trajectories made by a targeted individual on the same day of the week; e.g. to predict a D-trajectory for a Wednesday, we look at the D-trajectories for all Wednesdays in the last 2 years. We observe that the same day of the week is a useful selection filter for the history to be used (we also show this experimentally later).

Dynamic prediction length. It is necessary to determine the length of the predicted D-trajectory. The dynamic trajectory length is used by the PDPA algorithm to determine the number of destinations that are most likely to be visited by a targeted individual on a specific day. The dynamic length is determined by finding the number of destinations that were visited by the individual on the majority of days that fall on the same day of the week as his/her current activity. For example, if the current activity of individual(x) is occurring on a Sunday and the number of destinations most often visited by individual(x) on previous Sundays is 7, then the length of the D-trajectory predicted by PDPA is seven destinations.

The initial distribution matrix. D is an initial distribution matrix that contains the probabilities for every destination to be the first destination visited by individual(x). The matrix is computed from the self-history of individual(x).

The transition matrix. R is a transition matrix that contains the probabilities of the different transitions of individual(x) from one destination to another. R is used as a transition model for individual(x), and it is built from the set of targeted analysis trajectories. The PDPA algorithm uses two versions of the transition matrix. The first version has a memory size of one, which means that it holds the probabilities of visiting different destinations after visiting the current location of individual(x). The second version has a memory size of n, which means that it holds the probabilities of visiting different destinations after individual(x) has visited n locations, where n is obtained by analysing the dataset.

D-trajectory prediction. The following steps show how PDPA predicts the D-trajectory that is most likely to be used by individual(x) on the day of his/her current activity.
1. Once individual(x) leaves his/her house or parks his/her car at a shopping mall, PDPA will use the current date and location of individual(x) to fetch the records of his/her various trajectories for a period of up to 2 years preceding the current date.
2. Then, PDPA will filter and analyse only the trajectories used on the selected historical days, i.e. days that fall on the same day of the week as the day of the current activity (e.g. if the current day whose D-trajectory is to be predicted is a Monday, then only the trajectories on Mondays in the history are selected and used).
3. From the fetched trajectories, the dynamic prediction length (i.e. the length of the D-trajectory to be predicted) will be determined by finding the number of destinations that were visited by individual(x) on the majority of the selected historical days.
4. Then, PDPA will build the initial distribution matrix D that contains the probabilities for every destination to be visited first by individual(x), and it will also build the transition matrix R that contains the probabilities of the different transitions of individual(x) from one destination to another.
5. The destination that is most likely to be visited first by individual(x) is the destination that has the highest probability in D.
6. The next destination to be visited by individual(x) is the destination that has the highest probability in R of being visited after the current destination.
7. We use the dynamic prediction length to determine the number of times the previous step must be repeated to find all the remaining destinations of the predicted D-trajectory.
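The steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not our implementation: the function name `predict_d_trajectory`, the `history` format (a list of `(date, destinations)` pairs for one individual) and the 730-day window are hypothetical choices made for the example. It uses raw counts in place of the normalised matrices D and R, which selects the same most probable destination, and it uses the memory-size-one version of R.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta


def predict_d_trajectory(history, today, window_days=730):
    """Predict a D-trajectory for `today` from one individual's self-history.

    history: list of (date, [destination, ...]) pairs (hypothetical format).
    """
    # Steps 1-2: keep trajectories inside the history window that fall on
    # the same day of the week as the day being predicted.
    start = today - timedelta(days=window_days)
    selected = [
        traj for day, traj in history
        if start <= day < today and day.weekday() == today.weekday() and traj
    ]
    if not selected:
        return []

    # Step 3: dynamic prediction length = most frequent trajectory length.
    length = Counter(len(t) for t in selected).most_common(1)[0][0]

    # Step 4: counts standing in for D (first destinations) and R (a -> b).
    first_counts = Counter(t[0] for t in selected)
    trans_counts = defaultdict(Counter)
    for t in selected:
        for a, b in zip(t, t[1:]):
            trans_counts[a][b] += 1

    # Step 5: most likely first destination.
    current = first_counts.most_common(1)[0][0]
    predicted = [current]

    # Steps 6-7: follow the most likely transition until the dynamic
    # length is reached or no transition from the current destination exists.
    while len(predicted) < length and trans_counts[current]:
        current = trans_counts[current].most_common(1)[0][0]
        predicted.append(current)
    return predicted
```

Note that this sketch allows a destination to appear more than once in the predicted D-trajectory; whether repeats should be suppressed is a design choice that the steps above leave open.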
Datasets
In this section we discuss the details of the two datasets that we used to measure the prediction accuracy of the PDPA algorithm. Using seed information from real individuals, we generated a synthetic dataset that has 3 years of different indoor trajectories and various shopping activities inside a virtual mall. The second dataset is a real GPS dataset of outdoor trajectories for nine persons, which was collected by the LifeMap system of [6] (over roughly 8 months). Note that for our work, the important aspects of a dataset are the size of the history and the variability of behaviour across clients, rather than the number of clients [though there should be sufficient variability in the clients to demonstrate that our system can work with different types of client behaviours (i.e. different extents of behavioural (ir)regularity)]; hence, the nine clients in that dataset and the different synthetic behaviours we generated are adequate to demonstrate the applicability of our algorithms across a range of behaviours (from extremely regular to very irregular).
The virtual mall dataset
We built a simulation system that consists of a virtual mall and a number of virtual clients. The purpose of the simulation system is to generate a synthetic dataset that consists of an adequate number of records of various shopping activities, which could allow us to test the prediction accuracy of the PDPA algorithm.
The virtual mall
VMall components
Attribute  Description 

Year  3 years (2010, 2011 and 2012) 
Month  12 months (January–December) 
Day  7 weekdays (Monday–Sunday) 
Visiting hour  8 visiting hours (9 a.m., 10 a.m., 11 a.m., 12 p.m., 1 p.m., 2 p.m., 3 p.m. and 4 p.m.) 
Gate  4 gates (gate01, gate02, gate03 and gate04) 
Store  72 stores (using names existing in the real world) 
Store size  3 sizes (small, medium and big) 
Visiting duration  6 visiting durations inside a store (5, 10, 15, 20, 25 and >25 min) 
Category  17 categories (using names existing in the real world) 
Category type  2 types (limited purchase times, multiple purchase times) 
Item  The number of items/products depends on the two factors: the store size and the category type 
Price range  3 ranges (low, medium, high) 
Shopping behavior model
1. How certain are you about visiting VMall at least one time in any month of the 12 months?
2. How certain are you about the number of visits that you usually make to VMall in any month of the 12 months?
3. How certain are you about visiting VMall at least one time on any day of the week?
4. How certain are you about starting your visit to VMall by using any gate of the 4 gates of VMall as your favorite gate?
5. How certain are you about starting your visit to VMall at any of the following times (9 a.m., 10 a.m., 11 a.m., 12 p.m., 1 p.m., 2 p.m., 3 p.m., 4 p.m.) on any day of the 7 days in a week?
6. How certain are you about the number of stores that you usually visit on any day of the 7 days in a week?

1. How certain are you about visiting a store of the 72 stores?
2. How certain are you about visiting any of the 72 stores at least one time every month?
3. How certain are you about visiting any of the 72 stores at least one time on every day of a week?
4. How certain are you about the time that you usually spend in any of the 72 stores on different days of a week? (Use: 5, 10, 15, 20, 25, >25 min)
5. How certain are you about using any of the 72 stores as your "first store to visit" in your different visits to VMall?
6. How certain are you about buying at least one item from each of the 72 stores in every visit you make to VMall?
7. How certain are you about the number of different items that you usually buy from each of the 72 stores in every visit you make to VMall?
8. How certain are you about the chances of visiting a new store in every trip you make to VMall? (A new store is a store that has not been visited before)
We carefully selected three individuals to participate in the survey, after an initial observation that they would likely provide diverse (in terms of regularity of visiting stores) stereotypical behaviors. The participants have the following characteristics: a married female who has two children, and two single males. Note that the number of participants is not significant here, as we aim for distinguishable stereotypical behaviors, and the survey is only meant to provide seed data for data generation. The collected answers are used by our simulation system to generate a shopping behavior model for each participant. Such models help VMall to generate a synthetic dataset of various purchasing activities that could be performed by each stereotype participant in VMall. The system generated 3 years of different trajectories and shopping activities for each participant. The PDPA algorithm then analyses only the personal history of a specific individual, which differs from collaborative/clustering-style prediction approaches.
Clients creation and dataset generation
VMall dataset stats
Client ID  Visits  D-trajectories  Visited stores 

ClientS1  376  376  63 
ClientS2  627  627  8 
ClientM4  537  537  15 
ClientM5  859  859  19 
ClientF8  947  947  40 

Virtual clients creation. VMall will create 5 virtual clients. Three virtual clients will be created based on the seed data obtained from the shopping behavior model of each one of the survey’s participants (using the questions described earlier). Those virtual clients will have the following tags: clientM4, clientM5, and clientF8. In addition, VMall will create another two virtual clients using a shopping behavior model developed completely by VMall. The first virtual client will be created as a client with non-specific and irregular shopping behavior, and it will be tagged clientS1. Having an extremely irregular shopping behavior means that such a client can visit any number of stores on any day and at any time with absolutely no specific reason/pattern. The fifth virtual client will be created as a client with an extremely regular shopping behavior, and it will be tagged clientS2. This client has an extremely regular behavior because he always visits the same stores on the same days and at the same times. Figure 2 shows a sample of different trajectories conducted by clientF8 in VMall. The purpose of creating 5 virtual clients with three different types of shopping behaviors is to generate a synthetic dataset that consists of various shopping activities with 3 main stereotypical shopping behaviors (irregular, regular and extremely regular), which allows us to cover the range of possible trajectories better than even real data (which may not exhibit the range of stereotypical behaviours needed for this study). The different transitions of clientS1 and clientS2, as shown in Figs. 3 and 4, represent two extreme ends, and it can be intuitively observed that the shopping behavior of any individual will lie between these two ends (in terms of regularity rather than actual stores visited).
Therefore, having three or more respondents to our survey will not change the fact that the shopping behavior of any individual will be between the two extreme ends, irregular and extremely regular.

Generation period. VMall will generate various shopping activities and trajectories for a period of 3 years for every virtual client. 2 years of the generated activities will be assigned to a training set, and the activities of the 3rd year will be used to evaluate the prediction accuracy of the PDPA algorithm. After the completion of the generation step, we will have a dataset that consists of 3 years of various shopping activities for five virtual clients in VMall. Table 2 describes the sizes of the contents of the virtual mall dataset. For this dataset, each destination is a visited store and each D-trajectory is a sequence of visited stores in a visit to the shopping mall. Note that for five clients, there are five datasets of D-trajectories, one for each client.
GPS dataset
GPS dataset stats
Subject ID  Days  D-trajectories  Locations (>2 min) 

SUB1  133  133  414 
SUB2  197  197  220 
SUB3  270  270  312 
SUB4  191  191  208 
SUB7  130  130  345 
SUB8  249  249  346 
SUB9  191  191  201 
SUB10  47  47  109 
SUB12  361  361  448 
Evaluations and results
In this section, we measure the prediction accuracy of the PDPA algorithm by using a transition matrix with two different memory sizes. In addition, we developed a Markov-chain algorithm for predicting the D-trajectory of a targeted individual. The Markov-chain algorithm is used only for comparison with the PDPA algorithm.
PDPA with two memory sizes

PDPA_M1 The PDPA algorithm uses a transition matrix that has the probabilities of visiting different destinations after the current location of individual(x). Hence, suffix _M1 refers to a memory size of one destination.

PDPA_MN Suffix _MN with the PDPA algorithm indicates that the algorithm uses a transition matrix that has the probabilities of visiting different destinations after visiting n number of destinations. For example, if individual(x) visited location(i), location(j) and location(k), then, the PDPA algorithm will predict the next destination that has the highest probability to be visited after those three destinations.
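To make the difference between the two memory sizes concrete, the following sketch builds a transition table keyed by the last n visited destinations. It is a minimal illustration only: the names `build_memory_n_table` and `predict_next` are hypothetical, counts stand in for probabilities, and n is assumed to be at least 1 and supplied by the caller rather than obtained by analysing the dataset.

```python
from collections import Counter, defaultdict


def build_memory_n_table(trajectories, n):
    """Map each run of n consecutive destinations to counts of what followed."""
    table = defaultdict(Counter)
    for traj in trajectories:
        for i in range(len(traj) - n):
            context = tuple(traj[i:i + n])
            table[context][traj[i + n]] += 1
    return table


def predict_next(table, visited, n):
    """Return the most frequent successor of the last n destinations, or None."""
    context = tuple(visited[-n:])
    followers = table.get(context)
    if not followers:
        return None  # this context never occurred in the history
    return followers.most_common(1)[0][0]
```

With n = 1 the table corresponds to the PDPA_M1 setting; a larger n can separate cases in which the same current location is reached by different routes, at the cost of contexts that never occur in the history.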
Markov-chain algorithm with no memory

The training set. 2 years of different trajectories from the VMall dataset and 60 % of the GPS dataset of individual(x) will be used by the Markov-chain algorithm for training purposes.

Predefined prediction length. Different lengths will be used in the process of predicting the D-trajectory of individual(x). Suffixes _3, _5 and _7 with the Markov-chain algorithm indicate that the algorithm will be used to predict a trajectory with a length of 3, 5 and 7, respectively. For example, MC_5 means that the Markov-chain algorithm will predict a trajectory of five destinations.

The initial distribution matrix \(\pi\) that contains the probabilities of visiting every destination by individual(x).

The transition matrix T that has the probabilities of the different transitions of individual(x) from one destination to another. The training set of individual(x) is used to build \(\pi\) and T.

The D-trajectory prediction process includes the following steps:
i. The first destination to be visited by individual(x) is the destination that has the highest probability in \(\pi\).
ii. The Markov-chain algorithm does not keep a memory of the last visited destination(s) when making a prediction for the possible next destination. Hence, the remaining destinations on the predicted trajectory will be found by using the following formula:
$$\begin{aligned} \pi ^i=\pi ^{i-1} * T \end{aligned}$$ (1)
The next destination to be visited by individual(x) is the destination that has the highest probability in \(\pi ^i\).
iii. Based on the predefined prediction length, the previous step will be repeated to find the remaining destinations of the predicted D-trajectory.
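The memoryless prediction process can be sketched as follows. This is a minimal sketch under assumptions: destinations are represented by integer indices, `pi` is a list of initial probabilities, `T` is a row-stochastic matrix given as a list of lists, and the function name `markov_chain_predict` is hypothetical. Each iteration picks the most probable destination in the current distribution and then applies Eq. (1).

```python
def markov_chain_predict(pi, T, length):
    """Predict `length` destination indices with a memoryless Markov chain.

    pi: initial distribution over destinations (list of floats).
    T:  transition matrix, T[r][c] = P(next = c | current = r).
    """
    n = len(pi)
    dist = list(pi)
    predicted = []
    for _ in range(length):
        # Most probable destination under the current distribution.
        predicted.append(max(range(n), key=lambda s: dist[s]))
        # Eq. (1): pi^i = pi^(i-1) * T (row vector times matrix).
        dist = [sum(dist[r] * T[r][c] for r in range(n)) for c in range(n)]
    return predicted
```

Because the distribution is propagated without conditioning on the destination that was just predicted, the same destination can be selected at several steps, which is one way in which this baseline differs from PDPA.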
Measuring prediction success
Note that while we predict a sequence of stores (a Dtrajectory), the evaluation we provide in the rest of the paper focuses on precision and recall measures (analogous to information retrieval since relevance of ads relies on this more) rather than order (order is considered elsewhere for the shopping mall scenario in [12]).
As mentioned earlier, a synthetic dataset of indoor D-trajectories and a real dataset of outdoor D-trajectories (derived from a GPS dataset), altogether involving 14 subjects/clients (five in a virtual mall and nine GPS based), were used to evaluate the accuracy of a PDPA-predicted D-trajectory compared to the actual trajectory used by the subject/client.
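As an illustration of order-independent scoring, the sketch below compares a predicted D-trajectory with the actual one as sets of destinations. It assumes the standard information-retrieval definitions of precision and recall and the balanced F-measure (their harmonic mean); the function name `precision_recall_f` is hypothetical.

```python
def precision_recall_f(predicted, actual):
    """Set-based precision, recall and F-measure; visiting order is ignored."""
    pred, act = set(predicted), set(actual)
    hits = len(pred & act)  # destinations predicted and actually visited
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(act) if act else 0.0
    if hits == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

For example, if four stores are predicted and two of them are among the three stores actually visited, precision is 2/4, recall is 2/3 and the F-measure is 4/7.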
Dynamic prediction length vs. fixed length
Prediction accuracy with VMall dataset
Prediction accuracy with GPS dataset
Self-histories vs. group-histories
Figure 14 shows the F-measure values that represent the prediction accuracy of both algorithms, PDPA and Markov-chain, using group-histories instead of self-histories with the subjects in the GPS dataset. From Fig. 14, we can see a significant drop in accuracy when using group-histories with the PDPA algorithm. With the PDPA algorithm and group-histories, the drop in F-measure values is as follows: from 46.17 to 0 % with SUB1, from 72.51 to 0 % with SUB2, from 60.26 to 17.85 % with SUB3, from 55.39 to 4.71 % with SUB4, from 43.15 to 0 % with SUB7, from 38.18 to 7.08 % with SUB8, from 55.63 to 0 % with SUB9, from 59.95 to 0 % with SUB10, and from 45.52 to 35.15 % with SUB12. In addition, with the Markov-chain algorithm and group-histories, the drop in F-measure values was as follows: from 57.92 to 0 % with SUB1, from 65.21 to 0 % with SUB2, from 45.14 to 3.41 % with SUB3, from 40.69 to 0 % with SUB4, from 42.35 to 0 % with SUB7, from 61.24 to 46.77 % with SUB9, from 54.10 to 0 % with SUB10, and from 44.15 to 1.12 % with SUB12.
Monday for Monday vs. no preselected days
The PDPA algorithm focuses on activities and trajectories that occurred on previous days that fall on the same day of the week as the day being predicted for a targeted individual. For example, if the current weekday is Monday, then the PDPA algorithm will focus on the trajectories recorded on all previous Mondays for up to 2 years. Figures 9 and 12 show the prediction accuracy that PDPA achieved when using this approach. In comparison, Figs. 15 and 16 show the drop in F-measure value when using the trajectories recorded on all previous days for up to 2 years. With the VMall dataset, there was a drop in F-measure value of 2.19 % with ClientS1, 15.96 % with ClientM4, 6.56 % with ClientM5 and 3.35 % with ClientF8, as shown in Fig. 15. With the GPS dataset, the drop in F-measure value was between 3.30 and 11.85 %: a drop of 4.36 % with SUB1, 11.72 % with SUB2, 7.71 % with SUB3, 11.85 % with SUB4, 3.30 % with SUB7, 9.24 % with SUB8, 4.10 % with SUB9, 3.60 % with SUB10 and 11.25 % with SUB12, as shown in Fig. 16.
Discussion

Compared to the use of a predefined length, using a dynamic length to predict the number of destinations that could be visited by a targeted individual allowed the PDPA algorithm to achieve good prediction accuracy with most subjects and virtual clients, as shown in Figs. 9 and 12.

Figure 17 shows the average F-measure with all virtual clients in the VMall dataset. The PDPA algorithm achieved the highest average F-measure of 60.77 %, while the best average F-measure of the Markov-chain algorithm, 21.71 %, was achieved when using a fixed length of 7. Figure 18 shows the average F-measure with all subjects in the GPS dataset of [6]. The PDPA algorithm achieved the highest average F-measure of 52.97 %, and the Markov-chain algorithm with a fixed length of 3 achieved an average F-measure of 48 %. The problem with showing only the average F-measure is that it does not show exactly how accurate the predictions of an algorithm are for each individual. For example, as shown in Fig. 17, the average F-measure of 60.77 % achieved by the PDPA algorithm does not explicitly show that the algorithm achieved an F-measure of 100 % with clientS2 but a low F-measure of 10.29 % with clientS1. In addition, with the GPS dataset, the PDPA algorithm achieved a high F-measure of 72.51 % with SUB2 and a low F-measure of 38.18 % with SUB8, but both results are hidden inside the average F-measure of 52.97 %, as shown in Fig. 18.

In our work, we focus on the self-history of an individual in order to produce predictions related only to that individual. We also show the accuracy of our prediction algorithm for every single individual. Accordingly, to better understand why the results were significantly better with some individuals while our algorithm struggled to produce accurate predictions with others, we measured the behavior regularity of every individual and show how the level of regularity could affect the prediction accuracy of our PDPA algorithm, positively or negatively. Williams et al. [13] defined regularity as repeated activity over time. For example, in their approach, the behavior of a targeted individual can be described as highly regular when that individual visits a location at very similar times each week. They tested their approach on three datasets and found that the majority of individuals in those datasets are deemed to have highly irregular behavior. The three datasets in [13] are fine-grained and two of them are at a city-wide scale, and hence highly irregular behavior for the majority of individuals might have been an expected result.

\(W_k \in W = \{Mon,Tue,Wed,Thu,Fri,Sat,Sun\}\)

\(W_k(L_i) = \text{the set of locations visited on the previous } W_k\)

\(W_k(L_{i+1}) = \text{the set of locations visited on the next } W_k\)
These two types of regularity, destination and length, were chosen because of their direct impact on our PDPA algorithm. The process of predicting the number of destinations that a targeted individual could visit is affected by how regular the length of his/her trajectory is from one weekday to the next occurrence of the same weekday. In addition, the PDPA algorithm uses an initial distribution matrix and a transition matrix. Both matrices are affected by how regular the set of destinations visited by the targeted individual is from one weekday to the next occurrence of the same weekday.
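As an illustration only, the two kinds of regularity can be sketched by comparing each trajectory with the previous trajectory recorded on the same weekday. The measures used here are assumptions and not the formulas of this paper: destination regularity is approximated by the Jaccard overlap of the two destination sets, and length regularity by the ratio of the shorter to the longer trajectory length. All function names and the sample history are hypothetical.

```python
def jaccard(a, b):
    """Share of destinations common to both sets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def length_similarity(a, b):
    """Shorter trajectory length divided by the longer one (1.0 if both are empty)."""
    longer = max(len(a), len(b))
    return min(len(a), len(b)) / longer if longer else 1.0


def weekday_regularity(history):
    """Average both measures over consecutive visits on the same weekday.

    history: chronological list of (weekday, trajectory) pairs, where a
    trajectory is a list of destination names.
    Returns (destination_regularity, length_regularity), or (None, None)
    when no weekday occurs twice.
    """
    last_seen = {}  # weekday -> most recent trajectory on that weekday
    dest_scores, len_scores = [], []
    for weekday, trajectory in history:
        if weekday in last_seen:
            previous = last_seen[weekday]
            dest_scores.append(jaccard(previous, trajectory))
            len_scores.append(length_similarity(previous, trajectory))
        last_seen[weekday] = trajectory
    if not dest_scores:
        return None, None
    return (sum(dest_scores) / len(dest_scores),
            sum(len_scores) / len(len_scores))


# Hypothetical self-history covering two Mondays and two Tuesdays.
history = [
    ("Mon", ["cafe", "bookshop", "pharmacy"]),
    ("Tue", ["gym"]),
    ("Mon", ["cafe", "bookshop"]),
    ("Tue", ["cinema", "cafe"]),
]

dest_reg, len_reg = weekday_regularity(history)
print(f"destination regularity: {dest_reg:.3f}")
print(f"length regularity: {len_reg:.3f}")
```

Under these assumed measures, the Mondays in the sample overlap strongly while the Tuesdays share no destination, so an individual with many Tuesday-like pairs would give the matrices little stable evidence to learn from.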
System prototype
Conclusion and future work
In this paper, we have introduced a trajectory prediction algorithm, PDPA, that uses the self-history of a targeted individual to predict the set of destinations that he/she could visit. We have shown that self-histories can provide reasonable predictions of future destinations, which can be exploited for advertising. In future work, a more extensive analysis of individual behavior is required to identify which other behavior attributes affect the prediction accuracy of PDPA, positively or negatively. Other personal profile information, such as past histories of transactions or purchases and the current situation of the user, could be integrated to possibly improve the predictions. We think that our approach of using personal/self-histories, rather than the combined histories of many people, can provide more personalised ads and will address variations in behavior across individuals. Further work will experimentally compare and contrast predictions that use the self-history of what a person typically does with predictions that use the combined histories of what people typically do in other scenarios.
CRAWDAD dataset yonsei/lifemap(v.20120103), http://crawdad.cs.dartmouth.edu/yonsei/lifemap, Jan. 2012.
Notes
Declarations
Authors' contributions
Both coauthors contributed significantly to the research and this paper, and the lead author is the main contributor. Both authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Authors’ Affiliations
References
1. Dhar S, Varshney U (2011) Challenges and business models for mobile location-based services and advertising. Commun ACM 54(5):121–128
2. Krumm J (2011) Ubiquitous advertising: the killer application for the 21st century. IEEE Pervasive Comput 10(1):66–73
3. Kim B, Ha JY, Lee S, Kang S, Lee Y, Rhee Y, Nachman L, Song J (2011) Adnext: a visit-pattern-aware mobile advertising system for urban commercial complexes. In: Proceedings of the 12th workshop on mobile computing systems and applications. ACM, pp 7–12
4. Sánchez JM, Cano JC, Calafate CT, Manzoni P (2008) Bluemall: a Bluetooth-based advertisement system for commercial areas. In: Proceedings of the 3rd ACM workshop on performance monitoring and measurement of heterogeneous wireless and wired networks. ACM, pp 17–22
5. Liapis D, Vassilaras S, Yovanof GS (2008) Implementing a low-cost, personalized and location based service for delivering advertisements to mobile users. In: 3rd international symposium on wireless pervasive computing 2008. ISWPC 2008, IEEE, pp 133–137
6. Chon Y, Shin H, Talipov E, Cha H (2012) Evaluating mobility models for temporal prediction with high-granularity mobility data. In: IEEE International Conference on pervasive computing and communications (Percom) 2012. IEEE, pp 206–212
7. Asahara A, Maruyama K, Sato A, Seto K (2011) Pedestrian-movement prediction based on mixed Markov-chain model. In: Proceedings of the 19th ACM SIGSPATIAL international conference on advances in geographic information systems. ACM, pp 25–33
8. Zhang D, Xia F, Yang Z, Yao L, Zhao W (2010) Localization technologies for indoor human tracking. In: 5th international conference on future information technology (FutureTech), 2010. IEEE, pp 1–6
9. Kolodziej J, Khan SU, Wang L, Min-Allah N, Madani SA, Ghani N, Li H (2011) An application of Markov jump process model for activity-based indoor mobility prediction in wireless networks. In: Frontiers of information technology (FIT). IEEE, pp 51–56
10. Gambs S, Killijian MO, del Prado-Cortez MN (2012) Next place prediction using mobility Markov chains. In: Proceedings of the first workshop on measurement, privacy, and mobility. ACM, p 3
11. Mathivaruni R, Vaidehi V (2008) An activity based mobility prediction strategy using Markov modeling for wireless networks. In: Proc of the world congress on engineering and computer science 2008 WCECS. Citeseer, pp 379–384
12. Barzaiq O, Loke SW, Lu H (2015) On trajectory prediction in indoor retail environments for mobile advertising using selected self-histories. In: Proceedings of the 12th IEEE international conference on ubiquitous intelligence and computing (UIC 2015)
13. Williams MJ, Whitaker RM, Allen SM (2012) Measuring individual regularity in human visiting patterns. In: Privacy, security, risk and trust (PASSAT), 2012 international conference on social computing (SocialCom). IEEE, pp 117–122