# Distributed Subtrajectory Join on Massive Datasets

**Authors:** Panagiotis Tampakis, Christos Doulkeridis, Nikos Pelekis, Yannis Theodoridis

arXiv: 1903.07748 · 2020-02-07

## TL;DR

This paper presents three distributed MapReduce algorithms (DTJb, DTJr, and DTJi) for subtrajectory join processing on massive trajectory datasets. In the authors' experiments, the indexed variant, DTJi, runs 3x faster than the closest related state-of-the-art algorithm.

## Contribution

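The paper introduces three distributed solutions for subtrajectory join processing on big datasets: a basic solution (DTJb), a solution that adds a preprocessing step to repartition the data (DTJr), and a solution that additionally employs an indexing scheme (DTJi). Together they address a key challenge in mobility data analytics.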

## Key findings

- DTJi is up to 16x faster than DTJb
- DTJi is 10x faster than DTJr
- DTJi is 3x faster than the closest related state-of-the-art algorithm

## Abstract

Joining trajectory datasets is a significant operation in mobility data analytics and the cornerstone of various methods that aim to extract knowledge out of them. In the era of Big Data, the production of mobility data has become massive and, consequently, performing such an operation in a centralized way is not feasible. In this paper, we address the problem of Distributed Subtrajectory Join processing by utilizing the MapReduce programming model. Compared to traditional trajectory join queries, this problem is even more challenging since the goal is to retrieve all the "maximal" portions of trajectories that are "similar". We propose three solutions: (i) a well-designed basic solution, coined DTJb, (ii) a solution that uses a preprocessing step that repartitions the data, labeled DTJr, and (iii) a solution that, additionally, employs an indexing scheme, named DTJi. In our experimental study, we utilize a 56GB dataset of real trajectories from the maritime domain, which, to the best of our knowledge, is the largest real dataset used for experimentation in the literature of trajectory data management. The results show that DTJi performs up to 16x faster compared with DTJb, 10x faster than DTJr and 3x faster than the closest related state of the art algorithm.
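To make the notion of "maximal" similar portions concrete, here is a minimal, centralized Python sketch. It is not the paper's DTJb, DTJr, or DTJi algorithm and does not use MapReduce. The function name `maximal_similar_runs`, the parameters `eps` and `min_len`, and the similarity rule (Euclidean distance at shared timestamps) are all assumptions made for illustration; the paper's own definition of similarity may differ.

```python
from math import hypot


def maximal_similar_runs(traj_a, traj_b, eps, min_len=2):
    """Return (start_t, end_t) pairs for maximal runs of shared timestamps
    during which the two trajectories stay within distance eps.

    traj_a, traj_b: dicts mapping timestamp -> (x, y).
    Assumption: only timestamps present in both trajectories are compared.
    """
    shared = sorted(set(traj_a) & set(traj_b))
    runs, current = [], []
    for t in shared:
        ax, ay = traj_a[t]
        bx, by = traj_b[t]
        if hypot(ax - bx, ay - by) <= eps:
            # Still close: extend the current run.
            current.append(t)
        else:
            # The run is broken here, so it cannot be extended further.
            if len(current) >= min_len:
                runs.append((current[0], current[-1]))
            current = []
    # Flush a run that reaches the last shared timestamp.
    if len(current) >= min_len:
        runs.append((current[0], current[-1]))
    return runs


# Hypothetical toy data: b moves alongside a, except at t=2.
a = {t: (float(t), 0.0) for t in range(6)}
b = {t: (float(t), 0.5) for t in range(6)}
b[2] = (2.0, 5.0)

print(maximal_similar_runs(a, b, eps=1.0))
```

Each reported run is maximal in the sense that extending it by one timestamp on either side would include a point pair farther apart than `eps`. Comparing every pair of trajectories this way on one machine is the kind of centralized processing the abstract describes as infeasible at scale, which is the motivation for the distributed solutions.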

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.07748/full.md

## Figures

16 figures with captions in the complete paper: https://tomesphere.com/paper/1903.07748/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/1903.07748/full.md

---
Source: https://tomesphere.com/paper/1903.07748