A Big Data Driven Framework for Duplicate Device Detection from Multi-sourced Mobile Device Location Data
Aliakbar Kabiri, Aref Darzi, Saeed Saleh Namadi, Yixuan Pan, Guangchen Zhao, Qianqian Sun, Mofeng Yang, Mohammad Ashoori

TL;DR
This paper presents a cost-effective, data-driven framework for identifying duplicate mobile devices in multi-sourced location datasets, enhancing data coverage and accuracy for large-scale mobility analysis.
Contribution
It introduces a novel methodology leveraging travel pattern uniqueness to detect duplicate devices across multiple data sources without bias.
Findings
Over 99.6% accuracy in matching devices sharing key location attributes
Effective integration of multi-sourced data improves spatial coverage
Successful application to national-level data for travel surveys
Abstract
Mobile Device Location Data (MDLD) has been widely used in various fields. Yet its large-scale applications are limited because the data from individual vendors has either biased or insufficient spatial coverage. One way to improve coverage is to integrate data from multiple vendors into a more representative dataset. Such integration requires further treatment of the multi-sourced dataset for several reasons. First, a person may carry more than one device, which can produce duplicated observations from the same data subject. Additionally, when multiple data sources are used, the same device might be captured by more than one data provider. Our paper proposes a data integration methodology for multi-sourced data to investigate the feasibility of integrating data from several sources without introducing…
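To make the duplicate-device problem concrete, here is a minimal sketch of one way to flag candidate duplicates: group device records by a shared key-location signature. It is an illustration only, not the paper's method. The record fields (device_id, source, home_cell, work_cell) and the function name candidate_duplicates are hypothetical, since this summary does not list the attributes the authors actually use.

```python
from collections import defaultdict


def candidate_duplicates(devices):
    """Group device records that share the same key-location signature.

    `devices` is an iterable of (device_id, source, home_cell, work_cell)
    tuples. All four fields are hypothetical placeholders, not the
    attributes used in the paper.
    """
    groups = defaultdict(list)
    for device_id, source, home_cell, work_cell in devices:
        # The signature is an assumed stand-in for "key location attributes".
        groups[(home_cell, work_cell)].append((device_id, source))
    # Keep only signatures shared by more than one device record.
    return {sig: members for sig, members in groups.items() if len(members) > 1}


# Made-up example records, for illustration only.
records = [
    ("a1", "vendor_A", "cell_001", "cell_104"),
    ("b7", "vendor_B", "cell_001", "cell_104"),
    ("a2", "vendor_A", "cell_002", "cell_230"),
]
dups = candidate_duplicates(records)
```

In this toy input, the two records sharing the same home and work cells are grouped as one candidate pair, while the third record is left alone. A real pipeline would need a stricter test than exact signature equality, because unrelated people can share a home and a workplace.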
Taxonomy
Topics: Human Mobility and Location-Based Analysis · Transportation and Mobility Innovations · Urban Transport and Accessibility
