Random forest and WiFi fingerprint-based indoor location recognition system using smart watch
Human-centric Computing and Information Sciences volume 9, Article number: 6 (2019)
Various technologies such as WiFi, Bluetooth, and RFID are being used to provide indoor location-based services (LBS). In particular, WiFi-based methods that use the WiFi APs already installed in an indoor space are widely applied, and the importance of indoor location recognition using deep learning has emerged. In this study, we propose a WiFi-based indoor location recognition system using a smart watch, extending existing smartphone-based systems. Unlike existing systems, we use both the Received Signal Strength Indication (RSSI) and the Basic Service Set Identifier (BSSID) to solve the problem of positions being misrecognized owing to similar signal strengths. By filtering the data twice, we aim to improve the execution time and accuracy of random forest-based location recognition. Experiments were conducted in an enclosed indoor space with five or more WiFi APs installed, comparing the proposed system with an existing WiFi fingerprint-based random forest system as the amount of data varied. The proposed system was confirmed to exhibit high performance in terms of execution time and accuracy. Notably, the system shows consistent performance regardless of the amount of location data.
As the penetration rate of smart devices has recently increased, location-based services (LBS) have been applied to real life in various fields, and research on improving the accuracy of location recognition has been actively carried out [1,2,3,4]. Location recognition can be divided into outdoor and indoor recognition. Because users spend a significant amount of time indoors, the need for indoor location recognition is increasing, and many studies in this field are underway [5,6,7,8]. Typical outdoor location recognition uses the Global Positioning System (GPS), but GPS is difficult to use indoors because its signals are blocked by buildings and walls, which results in a large position error. Therefore, methods using indoor WiFi, Bluetooth, RFID, and other approaches have been studied for indoor location recognition [10,11,12,13].
A WiFi-based method uses WiFi access points (APs) installed in the vicinity and calculates the location from the property that signal intensity attenuates with distance. A Bluetooth-based method calculates the location using Bluetooth low-energy (BLE) beacons instead of WiFi. The RFID method identifies a user by radio waves and requires an RFID tag and an RFID reader. Bluetooth- and RFID-based methods are restricted because they require additional installation. In particular, the use of RFID is limited because the tag cannot be recognized by a smart device. In contrast, WiFi-based indoor location recognition can use the WiFi APs already installed in a building, so no additional infrastructure is needed, and it has the further advantage that most smart devices support WiFi.
Accordingly, this study proposes a WiFi-based indoor location recognition system. For the proposed system, a smart watch based on the Arduino platform was fabricated. The system uses the fingerprint technique, which has demonstrated higher accuracy than other techniques, and random forest learning is applied to further enhance accuracy. In addition, going beyond existing fingerprint techniques that use only the Received Signal Strength Indication (RSSI), this study builds additional radio maps that include the location information of the APs and the Basic Service Set Identifier (BSSID), thereby improving the processing performance of location comparison in the fingerprint technique.
Method of WiFi-based location recognition
Because WiFi is installed in most indoor areas, it is highly utilized, and research on indoor location recognition technology based on WiFi is actively underway [14,15,16,17]. Indoor location recognition methods that can be used for WiFi include triangulation and fingerprinting.
The triangulation technique is usually used to trace, in real time, the location of an entity moving on a plane. At least three reference points are required, and in a WiFi environment the WiFi APs serve as these reference points. When the coordinates of AP1, AP2, and AP3 are defined as (x1, y1), (x2, y2), and (x3, y3) as in Fig. 1, the distance from each AP can be expressed using the Pythagorean theorem.
The distances in formula (1) are obtained from the RSSI value measured from each AP. Formula (2) gives L, the loss of the signal received by the moving entity, according to Friis’ formula, and formula (3) rearranges it for the distance d. Here, λ is the wavelength of the radio wave, expressed in the same unit as the distance d. The current position coordinates can then be found by substituting the values d1, d2, and d3 obtained in this way into the Pythagorean relations above.
However, large errors occur in the triangulation technique indoors because of multipath propagation, in which signals are reflected along different routes and interfere with one another. Therefore, a way to deal with multipath propagation is necessary before triangulation can be used for indoor positioning.
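As a minimal sketch of the calculation described above, the following Python code assumes the free-space form of Friis' formula, L = 20·log10(4πd/λ), which rearranges to d = (λ/4π)·10^(L/20). The function names, the AP coordinates, and the example wavelength are illustrative assumptions and are not taken from the paper. The position is found by subtracting the circle equation of AP1 from those of AP2 and AP3, which leaves two linear equations in x and y.

```python
import math


def path_loss_db(distance, wavelength):
    """Free-space path loss in dB (assumed form of Friis' formula)."""
    return 20 * math.log10(4 * math.pi * distance / wavelength)


def distance_from_loss(loss_db, wavelength):
    """Invert the path loss for the distance d (same unit as the wavelength)."""
    return wavelength / (4 * math.pi) * 10 ** (loss_db / 20)


def trilaterate(aps, dists):
    """Solve for (x, y) from three AP coordinates and three distances."""
    (x1, y1), (x2, y2), (x3, y3) = aps
    d1, d2, d3 = dists
    # Subtracting circle 1 from circles 2 and 3 gives a*x + b*y = c.
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("APs are collinear; position is not unique")
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det


# Hypothetical layout: three APs and a device at (1, 2), all in metres.
aps = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
wavelength = 0.125  # roughly 2.4 GHz, assumed for illustration
true_pos = (1.0, 2.0)
losses = [path_loss_db(math.dist(true_pos, ap), wavelength) for ap in aps]
dists = [distance_from_loss(loss, wavelength) for loss in losses]
position = trilaterate(aps, dists)
print(position)
```

In this idealized example the distances are exact, so the recovered position matches the true one up to rounding; with reflected indoor signals the three distances are inconsistent, which is the source of the errors discussed next.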
Fingerprinting uses the BSSID and RSSI, which are values obtained from a WiFi AP. An indoor space is divided into small grids, and a database called a radio map is created by collecting the RSSI values of the APs in each grid. Fingerprinting compares this radio map with the WiFi signal intensity received at an arbitrary point and presumes that the grid with the most similar pattern is the user’s position.
In the fingerprint technique, the accuracy and error of the estimated location vary with the size of the cells that divide the space. As shown in Fig. 2, the larger the grid, the higher the accuracy, because the probability of being located in the corresponding grid increases. However, because each cell then covers a wide area, the estimated location within a grid can be off by as much as the grid size, that is, the distance between cells. Accordingly, it is important to define an appropriate grid size for the fingerprint technique.
Because fingerprinting builds and compares a radio map database from real data that reflect the nature of the indoor space, its accuracy is relatively high compared with the triangulation technique. However, it has the shortcoming that the radio map must be built in advance and then rebuilt whenever the macro-environment changes. In this paper, indoor location recognition is applied to a limited indoor space that is not open, so the fingerprint method is suitable because there are few changes in the surrounding environment.
However, in the general fingerprint method, only the RSSI values are stored, sorted in ascending order. Because it is then not possible to know which WiFi AP a given RSSI came from, a place other than the learned place may be mistaken for it if its RSSI values are similar. In addition, because WiFi is a radio signal, it can be greatly affected even by small obstacles. In other words, because the signal strength varies continuously, there is a limit to recognizing a position from the RSSI alone. Therefore, we tried to overcome these problems by storing the RSSI and BSSID together.
As shown in Fig. 3, random forest is an ensemble learning method used for classification and regression analysis. Multiple decision trees are created from data randomly sampled from the full data set, and their results are combined to make a prediction model. The data are divided into learning data and evaluation data; decision trees are built from the learning data, the evaluation data are fed into the resulting trees, and the final result is determined. When dividing the data, random forest uses the bootstrap, that is, sampling with replacement. If n pieces of data are sampled with replacement, the data are re-extracted as shown in (4).
The extracted data become the learning data, and the data that were not extracted, called out-of-bag (OOB) data, are used for performance evaluation. Because the OOB data can be used for performance evaluation, there is no need to construct separate test data. In addition, because multiple decision trees are created to determine the result, the over-fitting problem, in which accuracy drops when data other than the learning data are input, can be avoided. Random forest thus generalizes well because each decision tree is built from random samples of the learning data, but it cannot explain the process by which a result is reached.
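The comparison step can be illustrated with a small Python sketch. It assumes a radio map stored as a dictionary from grid coordinates to per-AP RSSI values and uses Euclidean distance in signal space as the similarity measure; the paper does not state which measure is used, and the AP names, RSSI values, and the -100 dBm placeholder for an unseen AP are hypothetical.

```python
import math

MISSING_RSSI = -100.0  # assumed placeholder when an AP is not observed


def signal_distance(fp_a, fp_b):
    """Euclidean distance between two fingerprints keyed by AP identifier."""
    keys = set(fp_a) | set(fp_b)
    return math.sqrt(sum(
        (fp_a.get(k, MISSING_RSSI) - fp_b.get(k, MISSING_RSSI)) ** 2
        for k in keys
    ))


def match_grid(radio_map, measured):
    """Return the grid whose stored fingerprint is closest to the measurement."""
    return min(radio_map, key=lambda grid: signal_distance(radio_map[grid], measured))


# Hypothetical radio map: grid (row, column) -> {AP identifier: RSSI in dBm}.
radio_map = {
    (1, 1): {"ap1": -40, "ap2": -70, "ap3": -80},
    (1, 2): {"ap1": -55, "ap2": -55, "ap3": -75},
    (2, 2): {"ap1": -75, "ap2": -50, "ap3": -45},
}
measured = {"ap1": -57, "ap2": -58, "ap3": -72}
best = match_grid(radio_map, measured)
print(best)
```

Keying each fingerprint by AP identifier, rather than keeping a bare sorted list of RSSI values, is what makes it possible to tell apart two places with similar signal strengths from different APs.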
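The bootstrap and the resulting out-of-bag set can be sketched in a few lines of Python. The function name, the sample size, and the random seed are illustrative assumptions. The sketch also evaluates the well-known probability (1 - 1/n)^n that a given sample is never drawn in n draws with replacement, which approaches e^-1 (about 0.368) for large n.

```python
import random


def bootstrap_split(n, rng):
    """Draw n indices with replacement; return the draws and the OOB indices."""
    drawn = [rng.randrange(n) for _ in range(n)]
    in_bag = set(drawn)
    oob = [i for i in range(n) if i not in in_bag]
    return drawn, oob


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 10000
drawn, oob = bootstrap_split(n, rng)

oob_fraction = len(oob) / n
expected = (1 - 1 / n) ** n  # probability that one sample is never drawn
print(f"observed OOB fraction: {oob_fraction:.3f}")
print(f"expected OOB fraction: {expected:.3f}")
```

Roughly one third of the data is therefore left out of each tree's learning set and remains available for evaluating that tree.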
The indoor location recognition system proposed in this paper is shown in Fig. 4, and consists of collecting the information (RSSI, BSSID, location), constructing the radio map in the server, and recognizing the location.
In the first step, collecting the information, an Arduino-based smart watch with a WiFi sensor is used to collect indoor location information in an indoor space divided into a grid of a certain size. In this step, the WiFi signal is measured at the location of the user wearing the fabricated smart watch, and the RSSI and BSSID values are stored in the server. Unlike existing fingerprint methods, which store only the signal intensities arranged in ascending order, the proposed system additionally stores the BSSID of each AP. This addresses the difficulty of reflecting the signal intensity at each grid location, which varies with environmental influences. Because the BSSIDs of the APs collected at a given location stay the same, the proposed system can treat a measurement as belonging to that location if the BSSIDs match, even when slightly different signal intensities are measured for them.
Next, based on the collected location data, four radio maps are constructed from the RSSI, BSSID, and location obtained from the WiFi APs for each indoor space grid. Finally, in the location recognition step, the WiFi signal transmitted from the smart watch is compared with the location information in the constructed radio maps. The location information is then put into the random forest-based learning model, and the final location result is returned.
In this study, it is assumed that the indoor space is divided into a square grid with 2 m intervals in order to minimize the position error, and that at least five WiFi APs are installed in an open space for accurate location recognition.
Building radio map
A radio map is built by dividing an indoor space using a grid. Using a smart watch with a WiFi sensor, the signal is collected at each grid position, and the strongest RSSI values and their BSSIDs are collected. A total of four radio maps are constructed. As shown in Fig. 5, these are the RSSI-based Radio Map (RRM), the BSSID-based Radio Map (BRM), the Location-based Radio Map (LRM), and the BSSID list-based Radio Map (BLRM).
Four radio maps are used to address the location accuracy problem that arises when only the accumulated signal intensities of the WiFi APs are used. Moreover, because the number of comparisons grows with the number of locations that have a similar signal intensity, the processing speed can be improved by reducing the number of comparisons through filtering with the Location-based Radio Map (LRM) and the BSSID list-based Radio Map (BLRM).
The radio maps are built in the following order: the RRM, BRM, and LRM are built first, and the BLRM is built last. The RSSI, BSSID, and location values from each WiFi AP are recorded together at the same time as the learning for the radio maps is carried out. The BLRM is then derived from the completed BRM. The BLRM is a radio map for filtering the grid locations with similar BSSIDs prior to location recognition. In other words, it is consulted first when recognizing the location, which speeds up the learning.
The most important aspect in constructing a radio map is the number of times the information (RSSI, BSSID) for each grid is learned. As the learning progresses, the accuracy improves up to a certain level; however, a larger amount of data takes longer to learn, which is inefficient. Thus, it is important to find the appropriate number of learning times. Figure 6 shows a graph of the location accuracy and the execution time according to the number of learning times. The graph shows that the greater the number of learning times, the longer the process takes. It also shows that the appropriate number of learning times is 20.
The radio map is constructed so that the execution time can be shortened by comparing not only the RSSI but also the BSSID. When only the signal strength is used, as in the conventional method, it is difficult to reflect the differences in signal intensity at each grid location caused by the environment. Thus, a radio map is constructed based on the number of learning times. For example, if the BSSIDs collected at location A are generally b1, b2, b3, b4, and b5, the order of the BSSID list may change slightly between measurements, but this is regarded as only a small difference caused by the location or environment. Thus, by applying a refinement, the error is reduced.
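The build order described above can be sketched in Python. The layout of the four maps is an assumption made for illustration (the figure that defines them is not reproduced in the text): here the RRM, BRM, and LRM are parallel lists with one row per scan, and the BLRM maps each grid to the set of BSSIDs seen there. The function name and the sample scans are hypothetical.

```python
from collections import defaultdict


def build_radio_maps(scans):
    """Build RRM, BRM, LRM row by row, then derive BLRM from BRM.

    scans: list of (grid, [(bssid, rssi), ...]) collected at known grids.
    """
    rrm, brm, lrm = [], [], []
    for grid, readings in scans:
        # Order each scan from the strongest to the weakest signal.
        ordered = sorted(readings, key=lambda r: r[1], reverse=True)
        brm.append([bssid for bssid, _ in ordered])
        rrm.append([rssi for _, rssi in ordered])
        lrm.append(grid)

    # The BLRM is built last, from the completed BRM.
    blrm = defaultdict(set)
    for grid, bssids in zip(lrm, brm):
        blrm[grid].update(bssids)
    return rrm, brm, lrm, dict(blrm)


# Hypothetical scans: two at grid (1, 1) and one at grid (1, 2).
scans = [
    ((1, 1), [("b1", -40), ("b2", -55), ("b3", -70)]),
    ((1, 1), [("b2", -52), ("b1", -43), ("b4", -75)]),
    ((1, 2), [("b5", -45), ("b2", -60), ("b3", -66)]),
]
rrm, brm, lrm, blrm = build_radio_maps(scans)
print(blrm)
```

Because the BLRM holds only one BSSID set per grid, checking it first is much cheaper than comparing a measurement against every stored scan.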
Method for indoor location recognition
In the indoor location recognition method, an Arduino-based smart watch periodically transmits the WiFi RSSI and BSSID measured at the user’s location to the server. In the example shown in Fig. 7, the BSSIDs measured at the user’s current position and transmitted to the server are 15: 2C, C3: 24, 26: H3…. The data received by the server are then compared with the existing radio maps, and the data filtering process is applied.
First, the BLRM, which stores all BSSIDs measured at each location, is compared with the BSSIDs measured by the smart watch, and the first filtering excludes the grids whose BSSID lists do not include the measured BSSIDs. Next, the grid positions are retrieved from the LRM based on the filtered result. The secondary filtering then compares the order of the measured BSSIDs with the BSSID order in the BRM at the corresponding grid positions. Based on this, the RSSI values are extracted from the RRM and put into a random forest-based learning model to derive the final position. In the example of Fig. 7, X2,2 is the final position. In the existing fingerprint technique, only the pattern of the RSSI is analyzed, and the complexity increases as the number of grids to be compared increases. In this paper, the execution time for returning the location information is reduced by filtering twice using the four radio maps constructed above.
The above algorithm is a location recognition algorithm that scans WiFi AP signals from a smart watch. The RSSI and BSSID of the surrounding WiFi APs are stored in WiFi_List at every predetermined measurement cycle. The WiFi_List structure is shown in Table 1. Next, the WiFi_List is sorted in descending order of WiFi signal intensity and compared with the BLRM, and the result of the first filtering is stored in an array. Based on this array, the grids are located through the LRM, and the RSSI values for those grids are obtained from the RRM. These data are used to generate a random forest learning model, and the final location is determined by applying the model to the RSSI values measured by the smart watch. This location is stored in the server database, using the structure shown in Table 2, according to the location measurement time.
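The two filtering stages can be sketched as follows. This is an illustration under stated assumptions rather than the authors' algorithm: the first filter keeps a grid only if its BLRM set contains every measured BSSID, and the second filter keeps a stored scan only if at least min_matches positions of its BSSID order agree with the measured order. Both rules, the function names, and the sample data are hypothetical. The surviving (RSSI, location) rows would then be used to train the random forest model, which is not shown here.

```python
def first_filter(blrm, measured_bssids):
    """Keep grids whose BSSID list contains every measured BSSID (assumed rule)."""
    wanted = set(measured_bssids)
    return {grid for grid, known in blrm.items() if wanted <= known}


def second_filter(brm, lrm, candidate_grids, measured_bssids, min_matches=2):
    """Keep scan rows at candidate grids whose BSSID order is close enough."""
    rows = []
    for i, (grid, stored) in enumerate(zip(lrm, brm)):
        if grid not in candidate_grids:
            continue
        matches = sum(1 for a, b in zip(stored, measured_bssids) if a == b)
        if matches >= min_matches:  # assumed tolerance for small order changes
            rows.append(i)
    return rows


def candidate_training_rows(rrm, brm, lrm, blrm, wifi_list):
    """Return (RSSI row, grid) pairs that survive both filters."""
    ordered = sorted(wifi_list, key=lambda r: r[1], reverse=True)  # strongest first
    bssids = [bssid for bssid, _ in ordered]
    grids = first_filter(blrm, bssids)
    rows = second_filter(brm, lrm, grids, bssids)
    return [(rrm[i], lrm[i]) for i in rows]


# Hypothetical radio maps with one stored scan per row.
brm = [["a", "b", "c"], ["b", "a", "c"], ["d", "e", "f"]]
rrm = [[-40, -55, -70], [-42, -50, -71], [-45, -60, -65]]
lrm = [(1, 1), (1, 2), (3, 3)]
blrm = {(1, 1): {"a", "b", "c"}, (1, 2): {"a", "b", "c"}, (3, 3): {"d", "e", "f"}}

wifi_list = [("b", -52), ("a", -41), ("c", -69)]  # measured (BSSID, RSSI) pairs
training_rows = candidate_training_rows(rrm, brm, lrm, blrm, wifi_list)
print(training_rows)
```

In this example grid (3, 3) is removed by the first filter and the scan at grid (1, 2) by the second, so only one row is passed on to the learning step.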
The server program described in this paper runs on a 4.20 GHz Intel Core i7-7700K CPU with 16.0 GB of memory and the 64-bit version of Windows 10 Pro, and was built using JavaFX. A smart watch made using an Arduino board, as specified in Table 3, was used as the device for measuring WiFi signal information. When assembling the Arduino smart watch, we used small parts and soldered connections to reduce its volume. A small Arduino Pro Mini board was used to increase portability. Because this board has no module for communicating with a PC, the Arduino program was uploaded from the PC through a CP2102 USB uploader. In addition, to make the smart watch portable, charging and on/off functions were implemented using a Li-Po battery, a battery protection circuit, and a slide switch. Thus, once the program has been uploaded to memory and the battery is charged, the smart watch operates without being connected to any other power source.
The ESP8266 WiFi module plays the largest part in this study, because it is used to collect the signal strengths of the surrounding APs. However, the smart watch cannot synchronize its time by itself or have its settings changed directly. Therefore, we used Bluetooth. The Arduino smart watch exchanges data with a mobile phone over Bluetooth, synchronizes its time with the mobile phone, connects to WiFi, and collects the data of the surrounding APs through the WiFi module. Figure 8 shows the specifications of the Arduino smart watch.
The experimental environment was an indoor space with no walls that was not divided into several zones. The space was learned by dividing it into a total of 10 square cells with a 2 m interval. Five or more WiFi APs were installed so that more than five signals could be detected in each cell. A minimum of 90 s was allotted for collecting the BSSID, RSSI, and location information in each cell, considering the time needed to measure the location and to move to the next cell.
The experiments on the proposed system were conducted by comparing the execution time and location accuracy of the existing methods and the proposed method. The existing methods refer to the typical fingerprint method using the RSSI and the method applying random forest to all data. The test was performed 10 times, and the results were averaged for comparison. The experimental data included data collected not only at the experiment site but also at other places, in order to verify the influence of the amount of data on the execution speed and location accuracy of each method.
Based on the location information collected from the smart watch, the location was recognized according to the previously designed method. In this experiment, the experimental environment is divided into ten grids; however, only eight positions are compared. This is because grids whose BSSID lists in the BLRM do not include the BSSIDs received from the Arduino smart watch are excluded from the comparison.
In terms of execution time, because the existing methods run random forest on all the data for the measurement place, the more data there are, the longer it takes to create the learning objects. In contrast, the proposed method runs random forest, after filtering, only on the data that include the BSSIDs of the currently measured place. Its time to create the learning objects therefore grows only with the amount of data for the current place; because of the filtering, data for other places have no influence regardless of their quantity. In terms of location accuracy, because the existing methods learn from all data, places with similar signal intensities can be recognized as the same place, which decreases accuracy. The proposed method, by contrast, maintains its location accuracy even when a similar signal intensity is measured at another place, because it filters all data by the BSSIDs of the WiFi APs at the currently measured location.
Finally, Figs. 9 and 10 depict the experimental results for execution time and location accuracy, respectively. As the amount of data increased, the execution time of the existing method increased, whereas no significant change was observed for the proposed method owing to the filtering. Here, the data being added are not data for the learned place but data for other locations. The proposed method demonstrated excellent location recognition accuracy of about 97.5%, whereas the existing methods showed about 90.6%.
During the experiment, the mock test in the narrow indoor environment showed no significant difference in WiFi signal intensity, and the WiFi signal was also shown to be inconsistent. In addition, it was difficult to measure the exact position because the noise and signal attenuation of nearby WiFi were not considered when measuring the WiFi signal, and the measured values changed depending on the state of the router.
In this study, we proposed an indoor location recognition system based on WiFi fingerprinting and applied random forest-based learning for higher position recognition accuracy. The system consists of three steps: collecting the information through a smart watch, constructing four radio maps by dividing the indoor space into a rectangular grid of a certain size, and recognizing the location based on the radio maps. The RSSI and BSSID values of the neighboring WiFi APs are obtained through the smart watch and stored in the server, and the primary filtering extracts only the grids that have the collected BSSIDs in their lists, based on the radio map composed of BSSID lists (BLRM). Secondary filtering is conducted through a comparison with the BRM using the positions of the grids, and the final indoor position is obtained by putting the result into the random forest learning model. The distinguishing feature of this paper is that it improves the execution time and accuracy of random forest learning through this primary and secondary filtering.
The most important aspects of a fingerprint-based indoor location recognition system are the location measurement and the number of learning times required to construct a radio map. In this paper, the most accurate results were obtained when the number of learning times was 20. The experimental results show that the proposed method is superior in terms of accuracy and execution time. However, it did not achieve 100% accuracy because it did not consider the problems caused by WiFi noise or signal attenuation. Therefore, further research on indoor location recognition is needed to solve the WiFi noise problem.
This study has the limitation that the experiments were conducted in an indoor space with no walls that was not divided into several zones. In a typical indoor space, WiFi signals are commonly subject to interference from various factors such as interior obstacles or the division of the space into several zones. In future work, these factors will be addressed so that the experiments can be carried out in real environments. The method proposed in this paper could then be applied to various fields such as logistics and disaster response, and is expected to make indoor spaces more convenient for users, for example through IoT services that use the movement patterns of users within the space.
Khan MA et al (2017) Location awareness in 5G networks using RSS measurements for public safety applications. IEEE Access 5:21753–21762
Han YH, Lim HK, Gil JM (2017) Hierarchical location caching scheme for mobile object tracking in the internet of things. J Inf Process Syst 13:5
Zhou T et al (2018) Improved GNSS cooperation positioning algorithm for indoor localization. CMC Comput Mater Continua 56(2):225–245
Lee S, Moon N (2018) Design and implementation of indoor location recognition system based on fingerprint and random forest. J Broadcast Eng 23(1):154–161
Hwang CG, Yoon CP (2016) Ontology-based positioning systems for indoor LBS. J Korea Inst Inf Commun Eng 20(6):1123–1128
Kwak J, Sung Y (2017) Beacon-based indoor location measurement method to enhanced common chord-based trilateration. J Inf Process Syst 13:6
Shin EJ et al (2016) Message of interest: a framework of location-aware messaging for an indoor environment. In: Pervasive computing and communication workshops (PerCom Workshops), 2016 IEEE international conference on, IEEE
Kim MH, Kim BK, Ko YW, Bang KS (2016) Indoor location tracking system of low energy beacon using gaussian filter. J Korean Inst Inf Technol 14(6):67–74
Jedari E, Wu Z, Rashidzadeh R, Saif M (2015) Wi-Fi based indoor location positioning employing random forest classifier. In: Indoor positioning and indoor navigation (IPIN), 2015 international conference on IEEE. pp 1–5
Jian HX, Hao W (2017) WiFi indoor location optimization method based on position fingerprint algorithm. In: Smart grid and electrical automation (ICSGEA), 2017 international conference on IEEE. pp 585–588
Alletto S, Cucchiara R, Del Fiore G, Mainetti L, Mighali V, Patrono L, Serra G (2016) An indoor location-aware system for an IoT-based smart museum. IEEE Internet Things J 3(2):244–253
Ji X, Shuangshuang S, Mingchen W, Xiangtian L, Hui W (2018) Research on location intelligent detection based on RFID technology. In: 2018 Chinese control and decision conference (CCDC) IEEE. pp 6719–6723
Montaser A, Moselhi O (2014) RFID indoor location identification for construction projects. Autom Constr 39:167–179
Ding H, Zhengqi Z, Yu Z (2016) AP weighted multiple matching nearest neighbors approach for fingerprint-based indoor localization. In: Ubiquitous positioning, indoor navigation and location based services (UPINLBS), 2016 fourth international conference on, IEEE
An T, Ahn C, Nam M, Park J, Lee Y (2016) A study on improving accuracy of subway location tracking using WiFi Fingerprinting. J Korea Acad Ind Coop Soc 17(1):1–8
Niu J, Wang B, Shu L, Duong TQ, Chen Y (2015) ZIL: an energy-efficient indoor localization system using ZigBee radio to detect WiFi fingerprints. IEEE J Sel Areas Commun 33(7):1431–1442
Chen G, Meng X, Wang Y, Zhang Y, Tian P, Yang H (2015) Integrated WiFi/PDR/Smartphone using an unscented kalman filter algorithm for 3D indoor localization. Sensors 15(9):24595–24614
Kim TW, Lee DM (2016) The indoor localization algorithm using the difference means based on fingerprint in moving Wi-Fi environment. J Korean Inst Commun Inf Sci 41(11):1463–1471
Son S, Park Y, Kim B, Baek Y (2013) Wi-Fi fingerprint location estimation system based on reliability. J Korean Inst Commun Inf Sci 38(6):531–539
Kim BH (2015) Crowdsourcing based WiFi radio map management with magnetic landmark and PDR. Master’s Thesis, Seoul University
Tama BA, Rhee KH (2017) A detailed analysis of classifier ensembles for intrusion detection in wireless network. J Inf Process Syst 13(5):1203–1212
Lee S (2017) Indoor location recognition system using random forest based on machine learning and fingerprint, Master’s Thesis, Hoseo University
SL performed the design and implementation of random forest and WiFi fingerprint-based indoor location recognition system using smart watch. JK performed the experiment of this system and was a major contributor in writing the manuscript. NM performed total supervision of this study. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Availability of data and materials
The data that support the findings of this study are available from Jinah Kim. But restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Jinah Kim.
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. NRF-2017R1A2B4008886).
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.