In this section, we measure the prediction accuracy of the PDPA algorithm using a transition matrix with two different memory sizes. In addition, we developed a Markovchain algorithm for predicting the Dtrajectory of a targeted individual; this algorithm is used only for comparison with the PDPA algorithm.
PDPA with two memory sizes

PDPA_M1 The PDPA algorithm uses a transition matrix that holds the probabilities of visiting different destinations after the current location of individual(x). Hence, the suffix _M1 refers to a memory size of one destination.

PDPA_MN The suffix _MN indicates that the PDPA algorithm uses a transition matrix that holds the probabilities of visiting different destinations after visiting n destinations. For example, if individual(x) visited location(i), location(j) and location(k), then the PDPA algorithm will predict the next destination that has the highest probability of being visited after those three destinations.
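The paper does not give code for the two memory sizes, but the idea can be sketched as frequency counting over contexts of the last one or n visited destinations. This is a minimal illustration under the assumption that trajectories are lists of destination labels; the function and variable names are hypothetical, not the authors' implementation:

```python
from collections import defaultdict

def build_transition_counts(trajectories, memory):
    """Count how often each destination follows each context of the last
    `memory` visited destinations (memory=1 mirrors PDPA_M1, memory=n
    mirrors PDPA_MN)."""
    counts = defaultdict(lambda: defaultdict(int))
    for traj in trajectories:
        for i in range(memory, len(traj)):
            context = tuple(traj[i - memory:i])   # last `memory` stops
            counts[context][traj[i]] += 1         # destination that followed
    return counts

def predict_next(counts, recent):
    """Return the destination most often seen after the given context,
    or None if the context never occurred in the training set."""
    context = tuple(recent)
    if context not in counts:
        return None
    return max(counts[context], key=counts[context].get)
```

With memory=3, a visit history of location(i), location(j), location(k) is looked up as a single three-destination context, as in the PDPA_MN example above.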
Markovchain algorithm with no memory
The Markovchain algorithm is used in several human-trajectory approaches to find the next location of an individual [8–11]. Our Markovchain algorithm consists of the following steps:

The training set. Two years of different trajectories from the VMall dataset and 60 % of the GPS dataset of individual(x) are used by the Markovchain algorithm for training purposes.

Predefined prediction length. Different lengths are used when predicting the Dtrajectory of individual(x). The suffixes _3, _5 and _7 indicate that the Markovchain algorithm predicts a trajectory with a length of 3, 5 and 7, respectively. For example, MC_5 means that the Markovchain algorithm will predict a trajectory of five destinations.

The initial distribution matrix \(\pi\) that contains the probabilities of visiting every destination by individual(x).

The transition matrix T that holds the probabilities of the different transitions of individual(x) from one destination to another. The training set of individual(x) is used to build \(\pi\) and T.
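As a sketch, \(\pi\) and T can be estimated from the training trajectories by frequency counting. This is an assumed implementation (the paper does not provide one); the names and the row-normalization choice are illustrative:

```python
import numpy as np

def build_markov_model(trajectories, destinations):
    """Estimate the initial distribution pi (visit frequencies) and the
    transition matrix T (row-normalized transition counts) from a
    training set of destination trajectories."""
    idx = {d: k for k, d in enumerate(destinations)}
    n = len(destinations)
    pi = np.zeros(n)
    T = np.zeros((n, n))
    for traj in trajectories:
        for d in traj:
            pi[idx[d]] += 1               # visit counts per destination
        for a, b in zip(traj, traj[1:]):
            T[idx[a], idx[b]] += 1        # destination-to-destination moves
    pi /= pi.sum()
    rows = T.sum(axis=1, keepdims=True)
    # avoid division by zero for destinations that are never left
    T = np.divide(T, rows, out=np.zeros_like(T), where=rows > 0)
    return pi, T
```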

The Dtrajectory prediction process includes the following steps:

i.
The first destination to be visited by individual(x) is the destination that has the highest probability in \(\pi\).

ii.
The Markovchain algorithm does not keep a memory of the last visited destination(s) when predicting the possible next destination. Hence, the remaining destinations on the predicted trajectory are found using the following formula:
$$\begin{aligned} \pi ^i=\pi ^{i-1} * T \end{aligned}$$
(1)
The next destination to be visited by individual(x) is the destination that has the highest probability in \(\pi ^i\).

iii.
Based on the predefined prediction length, the previous step will be repeated to find the remaining destinations of the predicted Dtrajectory.
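The three steps above can be sketched as follows, reusing an initial distribution \(\pi\) and a transition matrix T. This is an illustrative implementation of the memoryless prediction loop, not the authors' code:

```python
import numpy as np

def predict_trajectory(pi, T, destinations, length):
    """Memoryless Markov-chain prediction: the first destination is the
    argmax of pi; each following distribution is pi_i = pi_{i-1} * T
    (Eq. 1), and its argmax gives the next predicted destination."""
    predicted = [destinations[int(np.argmax(pi))]]   # step i
    dist = np.asarray(pi, dtype=float)
    for _ in range(length - 1):                      # step iii: repeat
        dist = dist @ T                              # step ii, Eq. (1)
        predicted.append(destinations[int(np.argmax(dist))])
    return predicted
```

MC_5, for instance, would correspond to calling predict_trajectory(pi, T, destinations, 5).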
Measuring prediction success
Note that while we predict a sequence of stores (a Dtrajectory), the evaluation in the rest of the paper focuses on precision and recall measures (as in information retrieval, since the relevance of ads depends more on which destinations are visited than on their order); ordering is considered elsewhere for the shopping mall scenario in [12].
We measure the prediction accuracy of the PDPA algorithm as follows:
$$\begin{aligned} Precision= |D \cap A| / |D| \end{aligned}$$
(2)
$$\begin{aligned} Recall= |D \cap A| / |A| \end{aligned}$$
(3)
$$\begin{aligned} Fmeasure= 2*((Precision*Recall)/(Precision+Recall)) \end{aligned}$$
(4)
where A is the actual Dtrajectory (i.e., its corresponding set of destinations) used by individual(x), and D is the Dtrajectory (i.e., its corresponding set of destinations) predicted by the PDPA algorithm.
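Since D and A are treated as sets of destinations, Eqs. (2)–(4) amount to the following sketch (the function name is illustrative, not from the paper):

```python
def prediction_scores(predicted, actual):
    """Set-based Precision, Recall and Fmeasure between the predicted
    Dtrajectory D and the actual Dtrajectory A; order is ignored."""
    D, A = set(predicted), set(actual)
    overlap = len(D & A)
    precision = overlap / len(D) if D else 0.0     # Eq. (2)
    recall = overlap / len(A) if A else 0.0        # Eq. (3)
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)           # Eq. (4)
    return precision, recall, f
```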
As mentioned earlier, a synthetic dataset of indoor Dtrajectories and a real dataset of outdoor Dtrajectories (derived from a GPS dataset), altogether involving 14 subjects/clients (five in a virtual mall and nine GPS based), were used to evaluate the accuracy of a PDPA predicted Dtrajectory compared to the actual trajectory used by the subject/client.
Dynamic prediction length vs. fixed length
Figures 5 and 6 show how using a dynamic prediction length, rather than a fixed length, resulted in a significantly more accurate prediction of the number of destinations that could be visited by a targeted individual. Figure 5 shows the length prediction accuracy for every virtual client in the VMall dataset. By using a dynamic prediction length, the PDPA algorithm was able to predict the number of destinations with an accuracy of 75.94 % with clientS1, 92.99 % with clientM4, 82.30 % with clientM5 and 100 % with clientS2. On the other hand, using a fixed length with the Markovchain algorithm resulted in lower and less consistent accuracy values. The Markovchain algorithm with a fixed length of 3 achieved a length prediction accuracy of 49.49 % with clientS1, 13.73 % with clientS2 and 9.53 % with clientF8. Using a fixed length of 7 helped the Markovchain algorithm predict the number of destinations with an accuracy of 43.32 % with clientM4 and 81.04 % with clientM5.
Figure 6 shows the length prediction accuracy for every subject in the GPS dataset of [6]. In comparison to the Markovchain algorithm with different fixed-length values, the PDPA algorithm achieved the highest prediction accuracy regarding the number of destinations that could be visited, for every subject. The PDPA algorithm was able to predict the length of the Dtrajectory with an accuracy of 71.13 % with SUB1, 75.57 % with SUB2, 71.46 % with SUB3, 60.37 % with SUB4, 69.42 % with SUB7, 68.88 % with SUB8, 71.90 % with SUB9, 70.67 % with SUB10 and 67.23 % with SUB12. On the other hand, the Markovchain algorithm with a fixed length of 3 achieved a length prediction accuracy of 63.16 % with SUB2 and 61.76 % with SUB9, while using a fixed length of 7 resulted in an accuracy of 60.56 % with SUB8 and 66.08 % with SUB12. With a fixed length of 5, the Markovchain algorithm predicted the number of destinations with an accuracy of 67.23 % with SUB1, 67.08 % with SUB3, 66.16 % with SUB4, 65.00 % with SUB7 and 63.67 % with SUB10.
Prediction accuracy with VMall dataset
Figure 7 shows the Precision value for every virtual client in the VMall dataset. With clientS1, who has extremely irregular behavior, the Markovchain algorithm with a fixed length of 3 achieved the highest Precision value of 18.15 %, while PDPA_M1 achieved 12.90 % and PDPA_MN achieved 16.80 %. With the virtual clients clientM4, clientM5 and clientF8, who have regular behavior, the PDPA algorithm achieved higher Precision values than the Markovchain algorithm with its different fixed lengths. With clientM4, PDPA_M1 achieved a Precision value of 83.09 % and PDPA_MN achieved 82.38 %, while the Markovchain algorithm with a fixed length of 3 achieved 31.76 %. The highest Precision values with clientM5 were achieved by PDPA_MN and PDPA_M1, with Precision of 64.79 and 64.21 %, respectively; the Markovchain algorithm with a fixed length of 3 was only able to obtain a Precision of 39.01 % with that client. Both prediction algorithms struggled to obtain accurate predictions with clientF8, who is a mother of two children: PDPA_MN achieved a Precision of 50.03 %, PDPA_M1 achieved 48.92 % and Markovchain achieved 44.20 % using a fixed length of 3. The extreme regularity of clientS2 helped both the PDPA and Markovchain algorithms achieve a Precision value of 100 %.
Figure 8 shows the Recall value with the virtual clients in VMall. The extreme irregularity of clientS1 was a significant obstacle for PDPA; PDPA_M1 achieved a Recall value of 9.57 % and PDPA_MN achieved 9.69 %, while the Markovchain algorithm with a fixed length of 7 achieved the highest Recall of 62.31 %. With clientM4, clientM5 and clientF8, the PDPA algorithm achieved the highest Recall values. With clientM4, PDPA_M1 achieved a Recall value of 82.09 % and PDPA_MN achieved 81.24 %, whereas the Markovchain algorithm with a fixed length of 7 achieved only 25.89 %. PDPA_M1 and PDPA_MN achieved a Recall value of 66.94 % with clientM5, while Markovchain using a fixed length of 7 achieved 32.61 %. The main challenge for both algorithms was clientF8; with that virtual client, PDPA_MN achieved a Recall of 46.97 % and PDPA_M1 achieved 45.99 %, while Markovchain with a fixed length of 3 could not achieve more than 4.32 %. Even though clientS2 has extremely regular behavior, the Markovchain algorithm achieved only a Recall of 12.43 % using a fixed length of 3, while both PDPA_M1 and PDPA_MN achieved a Recall value of 100 %.
Figure 9 shows the prediction accuracy of both algorithms, PDPA and Markovchain, through the Fmeasure values with every client in the VMall dataset. The highest Fmeasure value of 22.26 % with clientS1 was achieved by the Markovchain algorithm with a fixed length of 5, while PDPA_M1 and PDPA_MN could not obtain more than 10.29 and 11.26 %, respectively. On the other hand, with clientM4, clientM5 and clientF8, the PDPA algorithm achieved higher Fmeasure values than the Markovchain algorithm with its different fixed lengths. PDPA_M1 achieved an Fmeasure value of 82.33 % and PDPA_MN achieved 81.51 % with clientM4, while the Markovchain algorithm with a fixed length of 7 could not achieve more than 25.25 % with the same client. With clientM5, PDPA_M1 and PDPA_MN achieved Fmeasure values of 64.49 and 64.76 %, respectively, whereas Markovchain with a fixed length of 5 achieved only 31.91 %. Even with the significant shopping history of clientF8, who is a mother of two children, PDPA_M1 was able to achieve an Fmeasure value of 46.75 % and PDPA_MN achieved 47.87 %, while Markovchain with a fixed length of 3 obtained only 7.69 %. The last virtual client is clientS2, who has extremely regular behavior; with that client, PDPA_M1 and PDPA_MN were easily able to achieve an Fmeasure of 100 %, while an Fmeasure value of 22.56 % was the highest that the Markovchain algorithm achieved, with a fixed length of 3.
Prediction accuracy with GPS dataset
Figure 10 shows the Precision values with every subject in the GPS dataset of [6]. With SUB1, Markovchain with a fixed length of 3 achieved a Precision value of 79.87 %, but the other fixed lengths caused a significant drop: with the same subject, MC_5 achieved a Precision of 57.74 % and MC_7 achieved 42.28 %, while PDPA_M1 and PDPA_MN, using a dynamic length, achieved Precision values of 51.06 and 54.21 %, respectively. With SUB2, PDPA_M1 achieved the highest Precision value of 77.78 % and Markovchain with a fixed length of 3 achieved 72.03 %, while PDPA_MN came in third place with a Precision of 70.26 %. The highest Precision values with SUB3, SUB4, SUB8 and SUB10 were achieved by PDPA_M1, with Precision values of 68.51, 70.85, 38 and 61.70 %, respectively. On the other hand, Markovchain with a fixed length of 3 achieved the highest Precision values with SUB7 and SUB12. MC_3 achieved a Precision of 58.33 % with SUB7, while PDPA_M1 obtained only 48.81 %. With SUB12, PDPA_MN achieved a Precision of 56.80 % and PDPA_M1 achieved 53.66 %, whereas MC_3 achieved the highest Precision value of 66.64 %. With SUB9, both algorithms managed to achieve a high Precision value, with PDPA_M1 achieving 62.27 % and MC_3 achieving 62.90 %.
Figure 11 shows the Recall values with every subject in the GPS dataset. With SUB1, PDPA_M1 achieved a Recall value of 47.04 % and PDPA_MN achieved 37.25 %, while Markovchain with a fixed length of 7 achieved the highest Recall value of 59.49 %. MC_7 also achieved the highest Recall value of 77.03 % with SUB2, whereas PDPA_M1 achieved a Recall of 74.08 % and PDPA_MN achieved 62.17 %. On the other hand, PDPA_M1 achieved the highest Recall with SUB3, SUB4 and SUB8, with Recall values of 60.17, 52.54 and 40.16 %, respectively; with those subjects, MC_7 achieved a Recall value of 48.06 % with SUB3, 48.35 % with SUB4 and 18.89 % with SUB8. With SUB9 and SUB12, the highest Recall values of 75.21 and 47.16 %, respectively, were achieved by MC_7, while PDPA_M1 achieved only a Recall value of 56.23 % with SUB9 and 43.59 % with SUB12. The Recall values with SUB7 were very close, with MC_7 achieving a Recall of 43.90 % and PDPA_M1 achieving 42.85 %. With SUB10, both algorithms achieved almost identical Recall values, with MC_7 achieving 65 % and PDPA_M1 achieving 64.60 %.
Figure 12 shows the prediction accuracy of both algorithms, PDPA and Markovchain, through the Fmeasure values with every subject in the GPS dataset. The highest Fmeasure value of 57.92 % with SUB1 was achieved by the Markovchain algorithm with a fixed length of 3, while PDPA_M1 and PDPA_MN could not obtain more than 46.17 and 41.45 %, respectively. On the other hand, with SUB2, SUB3, SUB4, SUB7, SUB8, SUB10 and SUB12, PDPA_M1 achieved the highest Fmeasure values of 72.51, 60.26, 55.39, 43.15, 38.18, 59.95 and 45.52 %, respectively. With those subjects and in the same order, MC_3 achieved Fmeasure values of 65.21, 45.14, 40.69, 42.35, 22.24, 54.10 and 43.10 %. With SUB9, MC_3 took first place with an Fmeasure value of 61.24 %, followed by PDPA_M1 with an Fmeasure of 55.63 % and PDPA_MN with 50.96 %.
Self-histories vs. group-histories
In comparison to the prediction accuracy results using self-histories in Figs. 9 and 12, Fig. 13 shows the Fmeasure values that represent the prediction accuracy of both algorithms, PDPA and Markovchain, when using group-histories instead of self-histories with the virtual clients in the VMall dataset. From Fig. 13, we can see a significant drop in accuracy when using group-histories with the PDPA algorithm. With clientS1, PDPA achieved an Fmeasure of 0 %, in comparison to 10.29 % when using self-histories. With the PDPA algorithm and group-histories, the drop in Fmeasure values was as follows: from 82.33 to 74.16 % with clientM4, from 64.49 to 50.85 % with clientM5, from 46.75 to 43.16 % with clientF8, and from 100 to 45.32 % with clientS2, who has an extremely regular behavior. The prediction accuracy of the Markovchain algorithm with different fixed lengths also suffered a serious drop when using group-histories: from 22.26 to 7.18 % with clientS1, from 25.25 to 12.89 % with clientM4, and from 31.91 to 28.88 % with clientM5. However, MC_7 with group-histories achieved a significant increase in Fmeasure value, from 7.38 to 66.87 %, with clientF8, who is a mother of two children. In addition, MC_7 achieved another increase in Fmeasure value, from 22.42 to 33.66 %, with clientS2.
Figure 14 shows the Fmeasure values that represent the prediction accuracy of both algorithms, PDPA and Markovchain, when using group-histories instead of self-histories with the subjects in the GPS dataset. From Fig. 14, we can see a significant drop in accuracy when using group-histories with the PDPA algorithm. With the PDPA algorithm and group-histories, the drop in Fmeasure values was as follows: from 46.17 to 0 % with SUB1, from 72.51 to 0 % with SUB2, from 60.26 to 17.85 % with SUB3, from 55.39 to 4.71 % with SUB4, from 43.15 to 0 % with SUB7, from 38.18 to 7.08 % with SUB8, from 55.63 to 0 % with SUB9, from 59.95 to 0 % with SUB10, and from 45.52 to 35.15 % with SUB12. In addition, with the Markovchain algorithm and group-histories, the significant drop in Fmeasure values was as follows: from 57.92 to 0 % with SUB1, from 65.21 to 0 % with SUB2, from 45.14 to 3.41 % with SUB3, from 40.69 to 0 % with SUB4, from 42.35 to 0 % with SUB7, from 61.24 to 46.77 % with SUB9, from 54.10 to 0 % with SUB10, and from 44.15 to 1.12 % with SUB12.
The results show that there is, in this case, greater value in predicting based on what the individual him/herself did in the past rather than what other people might typically do.
Monday-for-Monday vs. no preselected days
The PDPA algorithm focuses on activities and trajectories that occurred on previous weekdays of the same day of the week as the weekday being predicted for a targeted individual. For example, if the current weekday is Monday, then the PDPA algorithm will focus on recorded trajectories from all previous Mondays, going back up to 2 years. Figures 9 and 12 show the good prediction accuracy that PDPA achieved when using this approach. In comparison, Figs. 15 and 16 show the significant drop in Fmeasure value when using recorded trajectories from all previous days over the same 2-year span. With the VMall dataset, the Fmeasure value dropped by 2.19 % with clientS1, 15.96 % with clientM4, 6.56 % with clientM5 and 3.35 % with clientF8, as shown in Fig. 15. With the GPS dataset, the drop in Fmeasure value was between 3.30 and 11.85 %: 4.36 % with SUB1, 11.72 % with SUB2, 7.71 % with SUB3, 11.85 % with SUB4, 3.30 % with SUB7, 9.24 % with SUB8, 4.10 % with SUB9, 3.60 % with SUB10 and 11.25 % with SUB12, as shown in Fig. 16.
The results show that selecting histories of the same day of the week as the day whose Dtrajectory is to be predicted helps, rather than simply using the history of all days.
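The Monday-for-Monday selection can be sketched as a simple filter over dated trajectory records. This sketch assumes records are (date, trajectory) pairs and approximates a year as 365 days; both choices, and the function name, are illustrative rather than the paper's implementation:

```python
from datetime import date, timedelta

def same_weekday_history(records, target_day, years=2):
    """Keep only trajectories recorded on the same day of the week as
    target_day, within the last `years` years of history."""
    cutoff = target_day - timedelta(days=365 * years)
    return [traj for day, traj in records
            if day.weekday() == target_day.weekday()
            and cutoff <= day < target_day]
```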