# A scaled space-filling curve index applied to tropical rain forest tree distributions

**Authors:** Markus Wilhelm Jahn, Patrick Erik Bradley

arXiv: 1904.08053 · 2019-04-26

## TL;DR

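This paper applies the recently constructed scaled Gray-Hilbert space-filling curve index to a tropical rain forest tree data set of 2.5 million points taken from an 18-dimensional space. For a projection of the data onto 8 attributes, the scaled index saves a significant amount of space compared with its standard static version, and its efficiency relative to the best static version depends on the distribution of the point cloud.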

## Contribution

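The paper evaluates the scaled Gray-Hilbert curve index, implemented as a binary tree built in a data-driven process (in a similar way as, e.g., the R-tree), on high-dimensional spatial data. Its scalability allows it to handle different kinds of data distributions, which are reflected in the tree structure of the index.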

## Key findings

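- For a projection of the data set onto 8 attributes, the scaled Gray-Hilbert curve index saves a significant amount of space compared with its standard static version.
- The efficiency of the scaled index relative to the best static version depends on the distribution of the point cloud.
- A local sparsity measure derived from properties of the index trees distinguishes point clouds with different tail distributions.
- Visualisations of the resulting binary trees illustrate the influence of the tail distributions they were built on.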
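
To make the idea of a data-driven binary tree more concrete, here is a minimal sketch in Python. It is not the authors' implementation and does not use a Gray-Hilbert ordering. It only illustrates, under simplifying assumptions, how bisecting space solely where points exist yields trees whose shape reflects the data distribution. The names `Node`, `build`, `leaf_depths`, and the parameters `capacity` and `max_depth` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    depth: int
    points: List[Point] = field(default_factory=list)  # filled for leaves only
    low: Optional["Node"] = None   # half of the cell below the midpoint
    high: Optional["Node"] = None  # half of the cell at or above the midpoint


def build(points, lo, hi, depth=0, capacity=1, max_depth=64):
    """Bisect the box [lo, hi) along cycling axes, only where data exists.

    Hypothetical sketch: a cell is split while it holds more than
    `capacity` points; empty halves get no node at all.
    """
    if len(points) <= capacity or depth >= max_depth:
        return Node(depth, list(points))
    axis = depth % len(lo)
    mid = (lo[axis] + hi[axis]) / 2
    below = [p for p in points if p[axis] < mid]
    above = [p for p in points if p[axis] >= mid]
    node = Node(depth)
    if below:
        hi_below = hi[:axis] + (mid,) + hi[axis + 1:]
        node.low = build(below, lo, hi_below, depth + 1, capacity, max_depth)
    if above:
        lo_above = lo[:axis] + (mid,) + lo[axis + 1:]
        node.high = build(above, lo_above, hi, depth + 1, capacity, max_depth)
    return node


def leaf_depths(node):
    """Return the depth of every leaf, left to right."""
    if node is None:
        return []
    if node.low is None and node.high is None:
        return [node.depth]
    return leaf_depths(node.low) + leaf_depths(node.high)


# Two toy point clouds with 16 points each in the unit square.
uniform = [(i / 4 + 0.125, j / 4 + 0.125) for i in range(4) for j in range(4)]
clustered = [(i / 64, j / 64) for i in range(4) for j in range(4)]

uniform_tree = build(uniform, (0.0, 0.0), (1.0, 1.0))
clustered_tree = build(clustered, (0.0, 0.0), (1.0, 1.0))
print(max(leaf_depths(uniform_tree)), max(leaf_depths(clustered_tree)))
```

In this toy setup both trees have 16 leaves, but the evenly spread points sit at depth 4 while the tightly clustered points sit at depth 12, so the tree shape alone already separates the two distributions. The paper's local sparsity measure is defined on its own index trees and is not reproduced here.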

## Abstract

In order to be able to process the increasing amount of spatial data, efficient methods for their handling need to be developed. One major challenge for big spatial data is access. This can be achieved through space-filling curves, as they have the property that nearby points on the curve are also nearby in space. They are able to handle higher dimensional data, too. Higher dimensional data is widely used e.g. in CityGML and is becoming more and more important. In a laboratory experiment on a tropical rain forest tree data set of 2.5 million points taken from an 18-dimensional space, it is demonstrated that the recently constructed scaled Gray-Hilbert curve index performs better than its standard static version, saving a significant amount of space for a projection of the data set onto 8 attributes. The implementation is based on a binary tree in a data-driven process, in a similar way as e.g. the R-tree. Its scalability allows the handling of different kinds of data distributions which are reflected in the tree structure of the index. The relative efficiency of the scaled Gray-Hilbert curve in comparison with the best static version is seen to depend on the distribution of the point cloud. A local sparsity measure derived from properties of the corresponding trees can distinguish point clouds with different tail distributions. The different resulting binary trees are visualised to illustrate the influences of the different tail distributions they have been built on.
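
The locality property mentioned in the abstract can be illustrated with the classic two-dimensional Hilbert curve. The sketch below uses the well-known bit-manipulation mapping from grid cell to curve position. It is not the scaled Gray-Hilbert curve of the paper, which works in higher dimensions and adapts to the data. The function name `hilbert_index` and the parameter `order` are chosen here for illustration.

```python
def hilbert_index(order, x, y):
    """Position of cell (x, y) along a 2D Hilbert curve on a 2**order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


# Walk an 8 x 8 grid in curve order and check that each step moves
# to a directly adjacent cell.
cells = sorted((hilbert_index(3, x, y), x, y) for x in range(8) for y in range(8))
steps = [abs(a[1] - b[1]) + abs(a[2] - b[2]) for a, b in zip(cells, cells[1:])]
print(set(steps))
```

Every step along the curve has Manhattan distance 1, so points that are consecutive on the curve are neighbours in space. The converse does not hold in general: two spatially adjacent cells can lie far apart on the curve.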

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.08053/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1904.08053/full.md

## References

20 references — full list in the complete paper: https://tomesphere.com/paper/1904.08053/full.md

---
Source: https://tomesphere.com/paper/1904.08053