Unsupervised Learning Technique to Obtain the Coordinates of Wi-Fi Access Points
Jeongsik Choi, Yang-Seok Choi, Shilpa Talwar

TL;DR
This paper introduces an unsupervised learning method that automatically identifies Wi-Fi access point locations and calibrates distance measurements, enhancing indoor positioning accuracy without ground truth data.
Contribution
It presents a novel unsupervised approach for estimating Wi-Fi access point coordinates and calibration curves, improving indoor positioning without requiring prior ground truth.
Findings
Accurately estimated access point locations in a practical indoor environment.
Effectively learned calibration curves to correct distance measurement distortions.
Enhanced positioning accuracy with more anchor nodes and calibration.
| Device | Estimation method | Coefficients |
|---|---|---|
| Pixel 1 | Benchmark | |
| Pixel 1 | Proposed | |
| Pixel 2 | Benchmark | |
| Pixel 2 | Proposed | |
| Pixel 3 | Benchmark | |
| Pixel 3 | Proposed | |
Jeongsik Choi
Intel Labs
Intel Corporation
Santa Clara, CA, USA
Yang-Seok Choi
Intel Labs
Intel Corporation
Hillsboro, OR, USA
Shilpa Talwar
Intel Labs
Intel Corporation
Santa Clara, CA, USA
Abstract
Given that the accuracy of range-based positioning techniques generally increases with the number of available anchor nodes, it is important to secure more of these nodes. To this end, this paper studies an unsupervised learning technique to obtain the coordinates of unknown nodes that coexist with anchor nodes. As users use the location services in an area of interests, the proposed method automatically discovers unknown nodes and estimates their coordinates. In addition, this method learns an appropriate calibration curve to correct the distortion of raw distance measurements. As such, the positioning accuracy can be greatly improved using more anchor nodes and well-calibrated distance measurements. The performance of the proposed method was verified using commercial Wi-Fi devices in a practical indoor environment. The experiment results show that the coordinates of unknown nodes and the calibration curve are simultaneously determined without any ground truth data.
Index Terms:
Fine timing measurement (FTM), IEEE 802.11, positioning, trilateration, unsupervised learning
I Introduction
Wi-Fi is one of the most widely deployed wireless communication technologies. In most indoor environments, a sufficient number of Wi-Fi access points (APs) are already installed to facilitate network connectivity. In addition to their original purpose, APs can be used as anchor nodes for estimating the location of mobile devices. One simple approach for positioning is to measure distances from adjacent APs using received signal strength (RSS) and to apply trilateration techniques [1, 2, 3, 4]. However, each environment has its own propagation characteristics, such as the path-loss curve, which need to be investigated to achieve accurate positioning results. Furthermore, RSS is also affected by many factors other than the distance from the transmitter [5, 6]. Therefore, it is difficult to measure the exact distance from RSS.
To avoid relying on unreliable RSS-based distance measurements, the Wi-Fi fingerprinting method has been widely studied as a range-free positioning technique [7, 8, 9, 10, 11]. Instead of measuring distance based on RSS, this method prepares a database called a radio map, which tabulates RSS measurements from all the neighboring APs at selected coordinates. During the location estimation phase, the coordinates of the device are obtained by identifying the entry in the pre-built radio map whose RSS measurements are closest to the current measurements. However, the fingerprinting method also requires time and effort to prepare a radio map for each environment.
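As a rough illustration of the radio-map lookup described above, the following minimal sketch returns the surveyed coordinates whose stored RSS vector is nearest to the current one. The function name, the Euclidean metric in dBm, and the three-entry radio map are assumptions made for this example; the cited fingerprinting works use a variety of matching rules.

```python
import math

def locate_by_fingerprint(radio_map, observed_rss):
    """Return the surveyed (x, y) whose stored RSS vector is closest
    (Euclidean distance in dBm) to the currently observed RSS vector."""
    def rss_distance(stored):
        return math.sqrt(sum((s - o) ** 2 for s, o in zip(stored, observed_rss)))
    best_xy, _ = min(radio_map.items(), key=lambda item: rss_distance(item[1]))
    return best_xy

# Hypothetical radio map: (x, y) in metres -> RSS in dBm from three APs.
radio_map = {
    (0.0, 0.0): [-40.0, -70.0, -65.0],
    (5.0, 0.0): [-55.0, -60.0, -62.0],
    (5.0, 5.0): [-68.0, -45.0, -58.0],
}
estimate = locate_by_fingerprint(radio_map, [-54.0, -61.0, -63.0])
```

The lookup can only return a surveyed point, which is why the density of the radio map drives both the accuracy and the survey effort.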
The aforementioned Wi-Fi-based positioning techniques are based on a prior standard that was not originally designed for positioning purposes. To enhance the positioning capability, the IEEE 802.11-2016 standard (also known as 802.11REVmc) introduced a new ranging protocol called fine timing measurement (FTM). This protocol measures the distance between two nodes based on the round trip time (RTT). Although the performance of the FTM protocol still depends on many factors such as the presence of a line-of-sight (LOS) path, obstacles in the LOS path, the transmission bandwidth, hardware calibration, and so on, it potentially provides more accurate ranging results compared to RSS [12, 13, 14].
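The RTT-based ranging idea can be sketched as follows, assuming the usual four-timestamp exchange: the FTM frame leaves the responder at t1 and reaches the initiator at t2, and the acknowledgement leaves the initiator at t3 and reaches the responder at t4. The function name and the nanosecond units are choices made for this example, not part of the standard's interface.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def ftm_distance_m(t1_ns, t2_ns, t3_ns, t4_ns):
    """Distance from one FTM exchange.

    t1 and t4 are read on the responder clock, t2 and t3 on the initiator
    clock, so the turnaround time (t3 - t2) is removed without requiring
    the two clocks to be synchronised.
    """
    rtt_ns = (t4_ns - t1_ns) - (t3_ns - t2_ns)
    return SPEED_OF_LIGHT_M_PER_S * rtt_ns * 1e-9 / 2.0

# Example: a 10 m link has a one-way flight time of roughly 33.356 ns;
# the initiator is assumed to reply after a 1000 ns turnaround.
distance_m = ftm_distance_m(t1_ns=0.0, t2_ns=33.356, t3_ns=1033.356, t4_ns=1066.712)
```

Because one nanosecond of timing error corresponds to about 0.15 m of range error, small hardware offsets translate directly into the biases that the calibration step is meant to remove.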
Since FTM uses the flight time of radio waves for which the propagation speed is almost the same for all environments, the effort required to investigate the characteristics of each environment might be reduced. The most time-consuming task is to obtain the coordinates of APs for use as anchor nodes. For instance, if APs are installed in ceilings or in a restricted area, it is not easy to obtain their coordinates. In addition, in the case that non-fixed APs are used as anchor nodes, their coordinates should be regularly updated. More importantly, there may be a significant number of public APs deployed in an unplanned manner. Once their coordinates are available, the positioning accuracy is improved as the number of anchor nodes increases.
One approach for estimating the coordinates of unknown APs is to apply the trilateration technique using a mobile device as an anchor node [15]. By measuring distances to unknown APs at multiple known positions, the coordinates of each unknown AP can be obtained. This is a supervised method given that the coordinates of the device should be recorded whenever a measurement occurs. Another approach is to apply multidimensional scaling (MDS), which has been widely studied in wireless sensor networks [16, 17]. This method estimates the coordinates of unknown nodes from the coordinates of a few anchor nodes using the measured distances between nodes. To apply this method to Wi-Fi, it is necessary to change the configuration of some APs, because the current FTM protocol measures the distance between two nodes running in different modes. Moreover, this method does not work well if some APs are isolated from others.
In this paper, we study an unsupervised learning approach to obtain the coordinates of unknown APs that coexist with anchor APs. As long as users simply use the location service in the area of interest, the proposed method automatically discovers unknown APs and estimates their coordinates. Moreover, given that users can go anywhere in the area and thus bridge multiple APs along the way, the proposed method can be applied even if some unknown APs are isolated from others. To this end, the cost functions introduced in [14] are used to evaluate the geometric validity of the relationship between the coordinates of anchor nodes and distance measurements.
In addition to estimating the coordinates of unknown APs, we also focus on the calibration of the FTM protocol. According to the experiment results obtained in practical environments [13, 14], raw distance measurements using the FTM protocol are biased or distorted relative to the true distances. To resolve this, a calibration procedure is introduced in [18]. By manually acquiring distance measurements at various distances from the AP, a calibration curve can be obtained. In contrast, the proposed method learns an appropriate calibration curve without any labeled data, even while estimating the coordinates of unknown APs.
The remainder of this paper is organized as follows. In the next section, we introduce the main assumptions and FTM measurement model. In Section III, we define cost functions for estimating unknown parameters in the system, including the coordinates of unknown APs. In Section IV, the performance of the proposed method is evaluated in a practical environment using off-the-shelf APs and devices. The conclusions of the paper are summarized in Section V.
II System Model
We consider a two-dimensional area with anchor APs and unknown APs installed. We denote $\mathcal{A}$ as the set of all APs including both anchor and unknown APs, and $|\mathcal{A}|$ as the number of elements in this set. To collect training data, mobile devices move arbitrarily around the area of interest while periodically measuring distances from all the adjacent APs using the FTM protocol. The biggest benefit of unsupervised learning is that the training data can be easily collected even while users are using the location services.
According to the experiment results in [13, 14] and this work, the uncalibrated FTM protocol provides distorted measurements of the true distances. This is due to the timing offset of the FTM packets, cable length, hardware offset, or even site-specific factors. (The performance of the FTM protocol depends on how accurately the arrival time of the direct path is detected. Therefore, it may be necessary to apply different calibration methods depending on the environment, e.g., free-space and rich-scattering environments.) Fig. 1(a) illustrates the relationship between true distances and raw distance measurements based on the FTM protocol. The details of the experiments are presented in Section IV. It is evident that there is a mismatch between the raw measurements and true distances.
Based on this observation, the distance measurement between the $i$-th AP and a device at time step $t$ can be simply expressed as a function of the true distance as follows:
$$d_{i,t} = f\left(\lVert \mathbf{x}_i - \mathbf{p}_t \rVert\right) + n_{i,t} \tag{1}$$
where $\mathbf{x}_i$ is the coordinates vector of the AP and $\mathbf{p}_t$ is the coordinates vector of the device at time step $t$. In addition, $\lVert \mathbf{v} \rVert$ is the 2-norm of a vector $\mathbf{v}$. The function $f$ represents the distortion between true distances and measured distances. The measurement noise $n_{i,t}$ is assumed to be a random variable with zero mean and a standard deviation of $\sigma_{i,t}$.
Using equation (1), we can estimate the distance between the $i$-th AP and the device at time step $t$ as follows:
$$\hat{d}_{i,t} = \max\left\{ g(d_{i,t}),\, 0 \right\} \tag{2}$$
where $g$ is the inverse function of $f$ and $\max\{\cdot\}$ produces the largest value among all inputs. In this paper, we call $g$ the calibration curve or equation. If the distance measurements using the FTM protocol are assumed to have only a constant offset from the true distances, the calibration equation is simply expressed by
$$g(d) = d - b \tag{3}$$
where $b$ is the distance measurement offset of the device.
However, if the distance measurements are non-linear in the true distances, a polynomial can be used to compensate for this as follows:
$$g(d) = \sum_{k=0}^{K} a_k d^k \tag{4}$$
where $K$ is the highest order of the polynomial and $a_k$ is the coefficient for the $k$-th order term. The calibration results using equations (3) and (4) are illustrated in Fig. 1(b) and (c), respectively. For simplicity, a second-order polynomial is used in this work. Note that the parameters in these equations were optimally selected for generating Fig. 1, but the proposed method will estimate the parameters using unlabeled training data.
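The calibration step in equations (2) to (4) can be sketched as a small helper. This is a minimal illustration: the function name, the coefficient values, and the assumption that the max operator clips at zero are all choices made for the example, not values from the experiments.

```python
def calibrate(raw_distance_m, coefficients):
    """Evaluate the polynomial calibration curve sum_k a_k * d**k and clip
    the result at zero so that a calibrated distance is never negative.

    coefficients[k] is the coefficient of the k-th order term.
    """
    value = sum(a_k * raw_distance_m ** k for k, a_k in enumerate(coefficients))
    return max(value, 0.0)

# Offset-only calibration as a first-order special case: a_0 = -b, a_1 = 1.
offset_only = calibrate(10.0, [-1.5, 1.0])
# Second-order calibration with made-up coefficients.
second_order = calibrate(10.0, [-1.0, 0.9, 0.01])
# A short raw reading whose calibrated value would be negative is clipped.
clipped = calibrate(0.5, [-1.5, 1.0])
```

Writing the offset model as a first-order polynomial lets the same code path serve both calibration equations.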
To improve the ranging accuracy, the FTM protocol also supports a burst mode that performs multiple distance measurement processes for a single ranging request and reports the average of the measured distances as a single distance estimate. In this process, the FTM protocol can empirically obtain the standard deviation of the multiple measured distances and also report this value. Therefore, this value can be used as an estimator of the standard deviation of the noise as $\hat{\sigma}_{i,t} = s_{i,t}$, where $s_{i,t}$ is the reported standard deviation of distance measurements between the $i$-th AP and the device at time step $t$.
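The burst-mode summary can be sketched as below. The function name is hypothetical, and whether a real chipset reports the sample or the population standard deviation is not stated in the text; the sample form is assumed here.

```python
import statistics

def summarize_burst(samples_m):
    """Collapse one FTM burst into (mean distance, standard deviation).

    The sample standard deviation is used (an assumption); a single-sample
    burst has no spread to estimate, so 0.0 is returned in that case.
    """
    mean_m = statistics.fmean(samples_m)
    std_m = statistics.stdev(samples_m) if len(samples_m) > 1 else 0.0
    return mean_m, std_m

burst_mean, burst_std = summarize_burst([10.0, 10.2, 9.8])
```

The reported spread then stands in for the unknown noise standard deviation of that measurement.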
III AP Coordinates Estimation Method
III-A Overview
The basic idea of the proposed method is illustrated in Fig. 2. Regardless of whether the coordinates of the unknown APs are correct or not (the initial coordinates of unknown APs are generally given arbitrarily), we can include all anchor and unknown APs in the location estimation phase and obtain the estimated coordinates of the device using trilateration algorithms. Subsequently, we can define cost functions that indirectly evaluate whether the current estimates of the parameters (i.e., the coordinates of unknown APs and the parameters in the calibration equations) are geometrically valid. Using these cost functions, we can analyze the impact of a slight variation of each parameter on the cost functions, and iteratively update the parameters in the direction that reduces the cost functions. For instance, the figure shows that if the estimated coordinates of AP 3 change slightly, the estimated coordinates of the device and the cost functions will also change. Therefore, we can adjust the coordinates of AP 3 appropriately.
III-B Cost Functions
Let $\hat{\mathbf{p}}_t$ denote the estimated coordinates of the device using a trilateration algorithm at time step $t$. The estimated coordinates are generally obtained from the following information: the coordinates of the APs, distance measurements from adjacent APs, and parameters in the calibration equation. Therefore, $\hat{\mathbf{p}}_t$ is expressed as a parameterized function as follows:
$$\hat{\mathbf{p}}_t = h\left(\mathbf{m}_t;\, \mathbf{X}, \boldsymbol{\theta}\right) \tag{5}$$
where $\mathbf{m}_t$ represents a tuple of FTM-related measurements between every AP and the device at time step $t$. Specifically, the $i$-th element of $\mathbf{m}_t$ is a pair of distance and standard deviation measurements from the $i$-th AP, i.e., $(d_{i,t}, s_{i,t})$. In addition, $\mathbf{X}$ represents a tuple of coordinates of all APs and $\boldsymbol{\theta}$ is the set of all parameters in the calibration equation. In equation (5), $\mathbf{m}_t$ is given as the measurement data, and $\mathbf{X}$ and $\boldsymbol{\theta}$ are the trainable variables that should be appropriately optimized.
Note that some trilateration algorithms, e.g., those based on the Kalman filter [12, 14], estimate the coordinates of the device based on previous results. In this case, we can slightly modify equation (5) as $\hat{\mathbf{p}}_t = h\left(\mathbf{m}_t, \hat{\mathbf{p}}_{t-1};\, \mathbf{X}, \boldsymbol{\theta}\right)$ to include the previous estimate. In addition, the FTM measurements from all APs are not always available. Therefore, we denote $\mathcal{A}_t$ as the set of selected APs that are involved in the location estimation phase at time step $t$.
A necessary condition for equation (5) is that $h$ should be differentiable with respect to all trainable variables. Therefore, we mainly focus on linear trilateration methods such as the linear least squares (LS) or the weighted linear least squares (WLS) methods, which calculate the coordinates of the device using matrix operations [19, 20, 21]. For the same reason, the extended Kalman filter (EKF) is also compatible with the proposed method, because it consists of matrix operations. We follow the EKF procedures introduced in [14].
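To make the "matrix operations only" point concrete, here is a minimal sketch of an unweighted linear LS position fix in two dimensions. It uses one common linearization (subtracting the range equation of the last AP from the others); the cited works may linearize or weight differently, and the function name and test geometry are assumptions for this example.

```python
import math

def trilaterate_ls(ap_xy, distances):
    """Linear least-squares 2-D position fix from at least three APs.

    Subtracting the last AP's range equation from each of the others gives
    the linear system 2 (x_i - x_n) . p = (|x_i|^2 - d_i^2) - (|x_n|^2 - d_n^2).
    """
    (xn, yn), dn = ap_xy[-1], distances[-1]
    rows, rhs = [], []
    for (xi, yi), di in zip(ap_xy[:-1], distances[:-1]):
        rows.append((2.0 * (xi - xn), 2.0 * (yi - yn)))
        rhs.append((xi ** 2 + yi ** 2 - di ** 2) - (xn ** 2 + yn ** 2 - dn ** 2))
    # Normal equations (A^T A) p = A^T b, solved in closed form for the 2x2 case.
    s_xx = sum(a * a for a, _ in rows)
    s_xy = sum(a * b for a, b in rows)
    s_yy = sum(b * b for _, b in rows)
    t_x = sum(a * r for (a, _), r in zip(rows, rhs))
    t_y = sum(b * r for (_, b), r in zip(rows, rhs))
    det = s_xx * s_yy - s_xy * s_xy
    if abs(det) < 1e-12:
        raise ValueError("APs are collinear; the position is not uniquely determined")
    return ((s_yy * t_x - s_xy * t_y) / det, (s_xx * t_y - s_xy * t_x) / det)

# Hypothetical layout: four APs at the corners of a 10 m square.
aps = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
true_xy = (3.0, 4.0)
ranges = [math.dist(true_xy, ap) for ap in aps]
estimate_xy = trilaterate_ls(aps, ranges)
```

Every step is a sum, product, or division of the inputs, so the estimate is a differentiable function of both the AP coordinates and the calibrated distances wherever the determinant is non-zero.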
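Irrespective of which trilateration algorithm is used, we can obtain the estimated coordinates of the device and define cost functions that indirectly evaluate the accuracy of the parameters [14]. The most important cost function is the geometric cost function that is defined by
$$J_G(\mathbf{X}, \boldsymbol{\theta}) = \frac{1}{T} \sum_{t=1}^{T} \sum_{i \in \mathcal{A}_t} \left( \lVert \hat{\mathbf{p}}_t - \mathbf{x}_i \rVert - \hat{d}_{i,t} \right)^2 \tag{6}$$
where $T$ is the total number of measurement time steps, $\mathcal{A}_t$ is the set of APs selected at time step $t$, and $\hat{d}_{i,t}$ is the calibrated distance from equation (2). This cost function measures how closely multiple circles intersect at a single point, and approaches 0 if both the coordinates of unknown APs and the distance measurements are perfect.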
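A minimal sketch of a geometric cost of this kind is given below. It assumes an unweighted sum of squared differences between the estimate-to-AP distance and the calibrated range, averaged over time steps; the exact weighting in [14] may differ, and the data layout and names are choices made for the example.

```python
import math

def geometric_cost(position_estimates, ap_xy, calibrated_distances):
    """Average squared mismatch between circle radii and estimate-to-AP distances.

    position_estimates[t] is the estimated (x, y) at time step t.
    calibrated_distances[t] maps AP id -> calibrated range for the APs used at t.
    ap_xy maps AP id -> current (x, y) estimate of that AP.
    """
    total = 0.0
    for p_hat, dists in zip(position_estimates, calibrated_distances):
        for ap_id, d_hat in dists.items():
            total += (math.dist(p_hat, ap_xy[ap_id]) - d_hat) ** 2
    return total / len(position_estimates)

# One time step, two APs, ranges consistent with a device at (3, 4).
positions = [(3.0, 4.0)]
ranges = [{1: 5.0, 2: math.sqrt(65.0)}]
cost_correct = geometric_cost(positions, {1: (0.0, 0.0), 2: (10.0, 0.0)}, ranges)
# Misplacing AP 2 by one metre makes its circle miss the estimate.
cost_wrong = geometric_cost(positions, {1: (0.0, 0.0), 2: (11.0, 0.0)}, ranges)
```

The cost needs no ground-truth device position: it only checks whether the current parameters are mutually consistent.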
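In addition to the geometric cost function, we can also consider other types of cost functions based on the assumption that the device cannot move far in a short time. Therefore, the position and velocity of the device do not significantly change between two consecutive time steps. The cost functions related to these restrictions are given by
$$J_P(\mathbf{X}, \boldsymbol{\theta}) = \frac{1}{T} \sum_{t=2}^{T} \lVert \hat{\mathbf{p}}_t - \hat{\mathbf{p}}_{t-1} \rVert^2, \qquad J_V(\mathbf{X}, \boldsymbol{\theta}) = \frac{1}{T} \sum_{t=3}^{T} \lVert \hat{\mathbf{v}}_t - \hat{\mathbf{v}}_{t-1} \rVert^2 \tag{7}$$
where $\hat{\mathbf{v}}_t = (\hat{\mathbf{p}}_t - \hat{\mathbf{p}}_{t-1}) / \Delta t$ is the estimated velocity between time step $t-1$ and $t$, and $\Delta t$ is the measurement interval.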
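The two motion-related costs can be sketched as follows, assuming squared changes in position and in velocity between consecutive time steps, each normalized by the number of time steps. The normalization and the function name are assumptions made for illustration.

```python
def motion_costs(position_estimates, dt_s):
    """Return (position cost, velocity cost) for a track of (x, y) estimates."""
    steps = [(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(position_estimates, position_estimates[1:])]
    velocities = [(dx / dt_s, dy / dt_s) for dx, dy in steps]
    num_steps = len(position_estimates)
    position_cost = sum(dx * dx + dy * dy for dx, dy in steps) / num_steps
    velocity_cost = sum((vx1 - vx0) ** 2 + (vy1 - vy0) ** 2
                        for (vx0, vy0), (vx1, vy1) in zip(velocities, velocities[1:])) / num_steps
    return position_cost, velocity_cost

# Straight walk at constant speed: velocity never changes.
straight = motion_costs([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)], dt_s=0.5)
# A right-angle turn changes the velocity between the two segments.
turn = motion_costs([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)], dt_s=1.0)
```

A jittery trajectory produced by badly placed APs raises both costs, while a smooth walk keeps the velocity cost near zero.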
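By combining all the cost functions, we can obtain a unified cost function that is used to optimize the parameters. This is represented by
$$J(\mathbf{X}, \boldsymbol{\theta}) = \lambda_G J_G(\mathbf{X}, \boldsymbol{\theta}) + \lambda_P J_P(\mathbf{X}, \boldsymbol{\theta}) + \lambda_V J_V(\mathbf{X}, \boldsymbol{\theta}) \tag{8}$$
where $\lambda_G$, $\lambda_P$, and $\lambda_V$ are non-negative real numbers that control the balance between cost functions.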
III-C Optimization using the Gradient Descent Method
Given that we define the unified cost as a function of the unknown parameters in the system, we can iteratively update the parameters using the gradient descent method. In the case that multiple mobile devices participate in collecting training data, we can further combine the cost function of each device. Let $\mathcal{U}$ denote the set of mobile devices involved in the training phase. The combined cost function is expressed by
$$J_{\mathrm{total}}(\mathbf{X}, \boldsymbol{\Theta}) = \sum_{u \in \mathcal{U}} J_u(\mathbf{X}, \boldsymbol{\theta}_u) \tag{9}$$
where $J_u$ is the cost for the $u$-th device, obtained using equation (8), and $\boldsymbol{\theta}_u$ is the set of calibration parameters for this user. We denote $\boldsymbol{\Theta} = \{\boldsymbol{\theta}_u : u \in \mathcal{U}\}$ as the set of all parameters for calibration.
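Using the combined cost function, each trainable variable is updated using the gradient descent method as follows:
$$\mathbf{x}_i \leftarrow \mathbf{x}_i - \mu \frac{\partial J_{\mathrm{total}}}{\partial \mathbf{x}_i} \quad \forall i \in \mathcal{A}_U, \qquad \boldsymbol{\theta}_u \leftarrow \boldsymbol{\theta}_u - \mu \frac{\partial J_{\mathrm{total}}}{\partial \boldsymbol{\theta}_u} \quad \forall u \in \mathcal{U} \tag{10}$$
where $\mu$ is the learning rate and $\mathcal{A}_U$ is the set of unknown APs. These partial derivatives are simultaneously evaluated with the current estimates of the parameters.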
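The simultaneous update rule can be illustrated with a generic gradient descent loop. This sketch swaps in central finite differences for the analytic or automatic differentiation a real implementation would likely use, and it is exercised on a toy quadratic cost rather than on the positioning cost; all names and values are assumptions.

```python
def numerical_gradient(cost, params, eps=1e-6):
    """Central-difference estimate of the gradient of cost at params."""
    grad = []
    for k in range(len(params)):
        up, down = list(params), list(params)
        up[k] += eps
        down[k] -= eps
        grad.append((cost(up) - cost(down)) / (2.0 * eps))
    return grad

def gradient_descent(cost, params, learning_rate, iterations):
    """Repeat params <- params - learning_rate * grad.

    All partial derivatives are evaluated at the current estimate before any
    parameter is changed, so the update is simultaneous.
    """
    params = list(params)
    for _ in range(iterations):
        grad = numerical_gradient(cost, params)
        params = [p - learning_rate * g for p, g in zip(params, grad)]
    return params

# Toy cost with its minimum at (3, -1), standing in for the combined cost.
def toy_cost(v):
    return (v[0] - 3.0) ** 2 + (v[1] + 1.0) ** 2

solution = gradient_descent(toy_cost, [0.0, 0.0], learning_rate=0.1, iterations=200)
```

In the positioning problem the parameter vector would hold the coordinates of the unknown APs together with each device's calibration coefficients, while anchor coordinates stay fixed.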
IV Experiment Results
The performance of the proposed method was evaluated with off-the-shelf APs and devices. Fig. 3 shows the floor plan of the experiment site, in which 10 APs that support the FTM protocol are installed in an office environment. These APs are equipped with an Intel Atom CPU and an Intel AC8260 Wi-Fi chipset that supports the FTM protocol. The center frequency of each AP is 5200 or 5240 MHz (i.e., Wi-Fi channel 40 or 48) and the bandwidth is 40 MHz. The index of each AP is also presented in the figure. We simply chose 4 APs around the corners of the testbed area as anchor APs (i.e., AP 1, 3, 7, and 9) and treated the remaining 6 as unknown.
The height of each AP is 1.5 m, which is similar to the height of the mobile device. For the mobile devices, we used Google Pixel series phones (Pixel 1, 2, and 3), which officially support the FTM protocol on Android version 9. For each device, the training data were collected for 5 minutes by randomly walking around the testbed area, and test data were obtained by following the test path shown in the figure. The length of the test path is 235 m and it took approximately 4 minutes given that the speed of the user was approximately 3 km/h. While moving, each device measured the distance from all the adjacent APs every 500 ms using the FTM protocol.
For benchmark purposes, we also verify the performance using perfect estimates of the parameters. To this end, we utilize a second-order polynomial to calibrate raw distance measurements and optimize its coefficients to minimize the mean squared error (MSE), which is defined as $\mathbb{E}\left[(\hat{d} - r)^2\right]$, where $\hat{d}$ and $r$ represent the calibrated distance using equation (4) and the true distance, respectively. Note that test data were used to optimize parameters for the benchmark scenarios because the training data do not have measured ground truth coordinates. For this reason, the benchmark scenarios produce the best results with respect to the test data and serve as an upper bound on the performance of the proposed method. The optimal coefficients for each device are summarized in Table 1.
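The benchmark fit can be sketched as an ordinary least-squares polynomial fit of true distance against raw distance. This is a minimal standard-library version that solves the normal equations by Gauss-Jordan elimination and ignores the clipping at zero; the function name and the synthetic data are assumptions, not the measurements from the experiment.

```python
def fit_polynomial(raw, true, order=2):
    """Coefficients a_0..a_order minimising the mean of (sum_k a_k raw^k - true)^2."""
    n = order + 1
    # Augmented normal equations [V^T V | V^T y] for the Vandermonde matrix V.
    aug = [[sum(d ** (i + j) for d in raw) for j in range(n)]
           + [sum((d ** i) * r for d, r in zip(raw, true))]
           for i in range(n)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

# Synthetic check: true distances generated from known, made-up coefficients.
known = [-1.0, 0.9, 0.01]
raw_m = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
true_m = [sum(a * d ** k for k, a in enumerate(known)) for d in raw_m]
fitted = fit_polynomial(raw_m, true_m)
```

This fit needs labeled pairs of raw and true distances, which is exactly the supervision the proposed method avoids.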
The proposed method estimates the coefficients of the calibration equation while estimating the coordinates of the unknown APs. For positioning, we exploit EKF techniques using distance measurements for up to the 5 closest APs. In addition, we assume . Fig. 4 visualizes the convergence behavior of the proposed method. At first, the coordinates of every unknown AP are initialized as the center of the anchor APs. As training iteration increases, the estimated coordinates approach their true coordinates. The estimated coordinates are plotted every 20 iterations. To accelerate the convergence speed, we randomly sample only 30 consecutive time steps from training data for each device for each training iteration. Therefore, 20 iterations are equivalent to 1 training epoch.
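The relationship between iterations and epochs follows from the figures quoted in the text: 5 minutes of training data at one measurement every 500 ms, sampled in windows of 30 consecutive time steps. The sketch below reproduces that arithmetic and shows one way to draw such a window; the helper name and the uniform choice of start index are assumptions.

```python
import random

MEASUREMENT_INTERVAL_S = 0.5   # one FTM measurement round every 500 ms
TRAINING_DURATION_S = 5 * 60   # 5 minutes of training data per device
WINDOW_LEN = 30                # consecutive time steps used per iteration

steps_per_device = int(TRAINING_DURATION_S / MEASUREMENT_INTERVAL_S)  # 600
iterations_per_epoch = steps_per_device // WINDOW_LEN                 # 20

def sample_window(num_steps, window_len, rng):
    """Pick a random run of consecutive time-step indices."""
    start = rng.randrange(num_steps - window_len + 1)
    return list(range(start, start + window_len))

window = sample_window(steps_per_device, WINDOW_LEN, random.Random(0))
```

Keeping the sampled steps consecutive matters because the motion costs and any Kalman-filter-based estimator depend on the previous time step.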
Fig. 5 shows further details. Fig. 5(a) shows that the cost of each device obtained using equation (8) decreases with the number of iterations. We ran 1000 training iterations for this experiment and selected the parameters at which the cost was minimized. Fig. 5(b), (c), and (d) represent the polynomial coefficients for each device. The initial coefficients were chosen as for each device. The accuracy of the estimated coordinates of unknown APs is shown in Fig. 5(e). The performance metrics used in this figure are the mean absolute error (MAE) and the root mean squared error (RMSE), which are defined as $\mathbb{E}\left[\lVert \hat{\mathbf{x}} - \mathbf{x} \rVert\right]$ and $\sqrt{\mathbb{E}\left[\lVert \hat{\mathbf{x}} - \mathbf{x} \rVert^2\right]}$, respectively. In this definition, $\hat{\mathbf{x}}$ and $\mathbf{x}$ represent the estimated and the true coordinates, respectively. The maximum estimation error is also presented. Finally, the positioning accuracy of the device using all the APs is presented in Fig. 5(f). For performance comparison purposes, we also evaluated the positioning performance using all APs with true coordinates. In addition, the positioning performance using only the 4 anchor APs is also presented. We assume benchmark calibration for these scenarios. The proposed method uses all APs with estimated parameters (i.e., coordinates and calibration curve). The positioning accuracy of the proposed method closely approaches the best performance.
Fig. 6 illustrates the cumulative distribution function (CDF) of the distance estimation accuracy. As already shown in Fig. 1(a), the raw distance measurement using the FTM protocol produces significant errors. The proposed method automatically optimizes the coefficients in the calibration equations and achieves similar results to the benchmark scenario. Fig. 7 shows the CDF of the positioning accuracy. Even when well-calibrated distance measurements are used, the 4 anchor nodes alone are not sufficient to provide meaningful results.
Finally, Fig. 8 represents the estimated trajectory of the Pixel 3 device for different scenarios. The area in green represents a 1 m error region, which means that every point in the region is at most 1 m from the true path. Similar to the previous results, using only the 4 anchor nodes cannot produce an accurate trajectory of the device. By securing more anchor nodes in an unsupervised manner, the proposed method is able to produce accurate positioning results.
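The three AP coordinate error metrics (MAE, RMSE, and maximum error) can be computed as in this minimal sketch, where the expectation is taken as an average over the unknown APs. The function name and the two-AP example values are hypothetical.

```python
import math

def coordinate_errors(estimated_xy, true_xy):
    """Return (MAE, RMSE, maximum error) of estimated AP coordinates in metres."""
    errors = [math.dist(e, t) for e, t in zip(estimated_xy, true_xy)]
    mae = sum(errors) / len(errors)
    rmse = math.sqrt(sum(err * err for err in errors) / len(errors))
    return mae, rmse, max(errors)

# Two hypothetical unknown APs with errors of 1 m and 5 m.
mae, rmse, worst = coordinate_errors(
    estimated_xy=[(0.0, 0.0), (3.0, 4.0)],
    true_xy=[(0.0, 1.0), (0.0, 0.0)],
)
```

RMSE is never smaller than MAE, and the gap between them widens when a few APs are estimated much worse than the rest.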
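An empirical CDF of this kind is simply the fraction of errors at or below each threshold. The sketch below evaluates it at a single threshold; the function name and the error values are made up for illustration.

```python
def empirical_cdf(errors_m, threshold_m):
    """Fraction of error samples that are less than or equal to threshold_m."""
    return sum(err <= threshold_m for err in errors_m) / len(errors_m)

# Hypothetical positioning errors in metres.
errors_m = [0.2, 0.5, 1.2, 3.0]
fraction_within_1m = empirical_cdf(errors_m, 1.0)
```

Sweeping the threshold over a grid of values and plotting the resulting fractions gives the full CDF curve.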
V Conclusion
In this paper, we proposed an unsupervised learning technique to estimate unknown parameters in a system, including the coordinates of unknown APs. By simply moving around an area where unknown APs coexist with anchor APs, the proposed method automatically determined the coordinates of unknown APs as well as an appropriate calibration curve for each device. Using the proposed algorithm, a greater number of anchor nodes can be secured for positioning purposes. Therefore, high accuracy positioning results can be obtained using dense anchor nodes. The proposed method can be applied in numerous ways. For instance, a service operator can temporarily deploy a few APs at the reference locations for which the coordinates are easily obtainable (e.g., corners of the building) to estimate the coordinates of unknown APs.
- [1] Y.-C. Wang, X. Jia, and H. K. Lee, “An indoor wireless positioning system based on wireless local area network infrastructure,” in Proc. 6th International Symposium on Satellite Navigation Technology Including Mobile Positioning and Location Services, 2003.
- [2] J. Yang and Y. Chen, “Indoor localization using improved RSS-based lateration methods,” in Proc. IEEE Global Telecommunications Conference (GLOBECOM), Nov. 2009, pp. 1–6.
- [3] B. Kim, W. Bong, and Y. C. Kim, “Indoor localization for Wi-Fi devices by cross-monitoring AP and weighted triangulation,” in Proc. IEEE Consumer Communications and Networking Conference, Jan. 2011.
- [4] Y. Wang, X. Yang, Y. Zhao, Y. Liu, and L. Cuthbert, “Bluetooth positioning using RSSI and triangulation methods,” in Proc. IEEE Consumer Communications and Networking Conference, Jan. 2013.
- [5] “Spatial channel model for multiple input multiple output (MIMO) simulations,” 3GPP TR 25.996 release 11, Sep. 2012.
- [6] “Study on 3D channel model for LTE,” 3GPP TR 36.873 release 12, Sep. 2014.
- [7] P. Bahl and V. N. Padmanabhan, “RADAR: An in-building RF-based user location and tracking system,” in Proc. IEEE Conference on Computer Communications (INFOCOM), Mar. 2000.
- [8] P. Prasithsangaree, P. Krishnamurthy, and P. Chrysanthis, “On indoor position location with wireless LANs,” in Proc. 13th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Sep. 2002.
