Estimation of Individual Micro Data from Aggregated Open Data
Han-mook Yoo, Han-joon Kim, Jonghoon Chun

TL;DR
This paper introduces a semi-supervised learning approach that uses locality-sensitive hashing (LSH) and conditional probability to estimate individual micro data from aggregated open data, reporting 59.41% average accuracy in a fire-incident case study.
Contribution
The paper presents a method that combines LSH, semi-supervised learning, and conditional probability to estimate micro data from aggregated data.
Findings
Achieved 59.41% average accuracy in micro data estimation
Effectively identified individual data related to fire incidents
Demonstrated the method's applicability to real-world aggregated data
Abstract
In this paper, we propose a method for estimating individual micro data from aggregated open data, based on semi-supervised learning and conditional probability. First, the proposed method collects aggregated open data and support data that are related to the individual micro data to be estimated. Next, we apply the locality-sensitive hashing (LSH) algorithm to find the subset of the support data that is similar to the aggregated open data, and classify the records in that subset with an ensemble classification model trained by semi-supervised learning. Finally, we use conditional probability to estimate the individual micro data by selecting, from the classification results, the record that best fits the probability distribution of the individual micro data. To evaluate the performance of the proposed method, we estimated the individual building data where the fire occurred using the…
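The first and last steps of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the random-hyperplane LSH variant, the function names (`make_hyperplanes`, `signature`, `build_index`, `candidates`, `pick_record`), the attribute names, and every probability value are assumptions made for illustration. The ensemble classification step is omitted, and the scoring step assumes that attributes are conditionally independent, which the abstract does not state.

```python
import random
from collections import defaultdict

def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplane normals in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def signature(vec, planes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side."""
    return tuple(
        int(sum(p * v for p, v in zip(plane, vec)) >= 0.0) for plane in planes
    )

def build_index(support, planes):
    """Bucket support records (id -> feature vector) by LSH signature."""
    index = defaultdict(list)
    for rid, vec in support.items():
        index[signature(vec, planes)].append(rid)
    return index

def candidates(query, index, planes):
    """Return ids of support records that share the query's bucket."""
    return list(index.get(signature(query, planes), []))

def pick_record(cands, cond_prob):
    """Pick the candidate whose attributes are most probable given the event.

    cands:     {record id: {attribute: value}}
    cond_prob: {attribute: {value: P(value | event)}}
    Assumption: attributes are treated as conditionally independent.
    """
    def score(attrs):
        s = 1.0
        for attr, value in attrs.items():
            s *= cond_prob.get(attr, {}).get(value, 0.0)
        return s
    return max(cands, key=lambda rid: score(cands[rid]))

# Toy data, invented for illustration only.
planes = make_hyperplanes(n_bits=4, dim=2)
support = {"a": [1.0, 0.0], "b": [-1.0, 0.0]}
index = build_index(support, planes)
near = candidates([2.0, 0.0], index, planes)  # same direction as record "a"

attrs = {
    "x": {"use": "residential", "floors": "low"},
    "y": {"use": "factory", "floors": "low"},
}
cond_prob = {
    "use": {"residential": 0.6, "factory": 0.1},  # hypothetical P(use | fire)
    "floors": {"low": 0.7},                       # hypothetical P(floors | fire)
}
best = pick_record(attrs, cond_prob)
```

Random-hyperplane signatures depend only on a vector's direction, so the query `[2.0, 0.0]` lands in the same bucket as record `"a"`, while the opposite vector `"b"` falls on the other side of every hyperplane. In a fuller pipeline, the candidates returned by the LSH step would pass through the classifier before `pick_record` scores them.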
Taxonomy
Topics: Human Mobility and Location-Based Analysis · demographic modeling and climate adaptation · Recommender Systems and Techniques
