An Ensemble Scheme for Proactive Data Allocation in Distributed Datasets
T. Koukaras, K. Kolomvatsos

TL;DR
This paper introduces an ensemble-based proactive data allocation scheme for distributed IoT datasets in edge computing, aiming to improve data accuracy and reduce processing complexity by matching incoming data with existing dataset synopses.
Contribution
The paper proposes a novel ensemble scheme that proactively assigns data to storage locations based on similarity, using synopses to simplify processing in edge computing environments.
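The idea of matching incoming data against dataset synopses can be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: the summary above does not specify the synopsis format or the ensemble members, so the per-dimension mean/standard-deviation synopsis, the three scoring functions, and the plain averaging used to combine them are all hypothetical choices, as are the names `build_synopsis`, `allocate`, and the node identifiers.

```python
import math
from statistics import mean, pstdev


def build_synopsis(rows):
    """Summarize a dataset as per-dimension mean and standard deviation.

    Assumption: this compact form stands in for the paper's synopses.
    """
    columns = list(zip(*rows))
    return {
        "mean": [mean(c) for c in columns],
        "std": [pstdev(c) for c in columns],
    }


def euclidean_score(x, syn):
    # Closer to the synopsis mean gives a score nearer to 1.
    return 1.0 / (1.0 + math.dist(x, syn["mean"]))


def zscore_score(x, syn):
    # Mean absolute z-score; a zero std falls back to 1.0 to avoid division by zero.
    z = [
        abs(v - m) / (s if s > 0 else 1.0)
        for v, m, s in zip(x, syn["mean"], syn["std"])
    ]
    return 1.0 / (1.0 + sum(z) / len(z))


def cosine_score(x, syn):
    # Cosine similarity to the synopsis mean, rescaled from [-1, 1] to [0, 1].
    m = syn["mean"]
    norm = math.hypot(*x) * math.hypot(*m)
    if norm == 0:
        return 0.0
    dot = sum(a * b for a, b in zip(x, m))
    return (dot / norm + 1.0) / 2.0


# Hypothetical ensemble members; the paper's actual members may differ.
SCORERS = (euclidean_score, zscore_score, cosine_score)


def allocate(x, synopses):
    """Return the id of the node whose synopsis has the highest averaged score."""
    def ensemble(syn):
        return sum(score(x, syn) for score in SCORERS) / len(SCORERS)

    return max(synopses, key=lambda node: ensemble(synopses[node]))


synopses = {
    "node_a": build_synopsis([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]]),
    "node_b": build_synopsis([[10.0, -5.0], [9.5, -4.5], [10.5, -5.5]]),
}
print(allocate([1.1, 1.0], synopses))
```

Because only the synopses are consulted, the allocation decision never scans the stored records themselves, which is the source of the reduced processing overhead the summary describes.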
Findings
Improved dataset accuracy through proactive data placement.
Reduced processing overhead by using synopses for similarity matching.
Demonstrated effectiveness of the approach in experimental evaluations.
Abstract
The advent of the Internet of Things (IoT) gives the opportunity to numerous devices to interact with their environment, collect and process data. Data are transferred, in an upwards mode, to the Cloud through the Edge Computing (EC) infrastructure. A high number of EC nodes become the hosts of distributed datasets where various processing activities can be realized in close distance with end users. This approach can limit the latency in the provision of responses. In this paper, we focus on a model that proactively decides where the collected data should be stored in order to maximize the accuracy of datasets present at the EC infrastructure. We consider that the accuracy is defined by the solidity of datasets exposed as the statistical resemblance of data. We argue upon the similarity of the incoming data with the available datasets and select the most appropriate of them to store the…
Taxonomy
Topics: IoT and Edge/Fog Computing · Mobile Crowdsensing and Crowdsourcing · Data Stream Mining Techniques
