Clustering to Reduce Spatial Data Set Size

Geoff Boeing

arXiv:1803.08101 · cs.LG · March 23, 2018 · 1 citation

TL;DR

This paper presents a machine learning-based clustering method to compress large spatial datasets by reducing redundancy, enabling more efficient analysis and visualization of spatial features.

Contribution

It applies density-based clustering to reduce spatial data set size: points that are spatially redundant or approximately redundant are grouped together, and each group is replaced by a single representative feature.

Findings

01 Effective reduction of spatial data size

02 Preservation of key spatial features

03 Improved data processing efficiency

Abstract

Traditionally it had been a problem that researchers did not have access to enough spatial data to answer pressing research questions or build compelling visualizations. Today, however, the problem is often that we have too much data. Spatially redundant or approximately redundant points may refer to a single feature (plus noise) rather than many distinct spatial features. We use a machine learning approach with density-based clustering to compress such spatial data into a set of representative features.
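
The compression idea in the abstract can be sketched in a few lines. The block below is a minimal illustration, not the paper's implementation: it assumes DBSCAN as the density-based algorithm, great-circle (haversine) distance as the metric, and the cluster member nearest the mean latitude/longitude as the representative feature. The function names, the sample coordinates, and the values of `eps_km` and `min_samples` are all hypothetical, chosen only for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumed constant)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def dbscan(points, eps_km, min_samples):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    # Brute-force neighbor lists, O(n^2); each point counts as its own neighbor.
    neighbors = [
        [j for j in range(n) if haversine_km(points[i], points[j]) <= eps_km]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


def representatives(points, labels):
    """Per cluster, keep the original point nearest the mean lat/lon.

    The arithmetic mean is only a reasonable center for small clusters that
    do not straddle the antimeridian. Noise points (label -1) are dropped.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(p)
    reps = []
    for lab in sorted(clusters):
        members = clusters[lab]
        center = (sum(p[0] for p in members) / len(members),
                  sum(p[1] for p in members) / len(members))
        reps.append(min(members, key=lambda p: haversine_km(p, center)))
    return reps


# Hypothetical sample: three tight groups of (lat, lon) points.
points = [
    (37.7749, -122.4194), (37.7750, -122.4195), (37.7751, -122.4193),
    (34.0522, -118.2437), (34.0523, -118.2436),
    (40.7128, -74.0060),
]
labels = dbscan(points, eps_km=1.0, min_samples=1)
reps = representatives(points, labels)
print(len(points), "points reduced to", len(reps), "representatives")
```

With `min_samples=1` every point is a core point, so no point is discarded as noise and isolated points survive as single-member clusters. Picking an actual member rather than the computed center keeps each representative at a location that exists in the original data. The brute-force neighbor search is quadratic in the number of points, so a spatial index would be needed for large data sets.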

Code & Models

Repositories

gboeing/urban-data-science (official)

Taxonomy

Topics: Data Management and Algorithms · Data Mining Algorithms and Applications · Advanced Clustering Algorithms Research