A Distributed Spatial Data Warehouse for AIS Data (DIPAAL)
Alex S. Klitgaard, Lau E. Josefsen, Mikael V. Mikkelsen, Kristian Torp

TL;DR
This paper introduces a distributed spatial data warehouse system for AIS ship data, featuring an efficient ETL process, a raster query approach, and scalable spatial partitioning, enabling fast analysis of large-scale ship trajectories.
Contribution
The paper presents a novel modular ETL pipeline and a distributed spatial data warehouse with raster-based querying for large AIS datasets, improving analysis speed and scalability.
Findings
Searching the cell representation is faster than searching the trajectory representation.
Spatially partitioned shards enable scalable analysis with up to 1164% performance improvement.
The system stores about 312 million kilometers of ship trajectories and more than 8 billion rows in its largest table.
Abstract
AIS data from ships is excellent for analyzing single-ship movements and monitoring all ships within a specific area. However, the AIS data needs to be cleaned, processed, and stored before being usable. This paper presents a system consisting of an efficient and modular ETL process for loading AIS data, as well as a distributed spatial data warehouse storing the trajectories of ships. To efficiently analyze a large set of ships, a raster approach to querying the AIS data is proposed. A spatially partitioned data warehouse with a granularized cell representation and heatmap presentation is designed, developed, and evaluated. Currently, the data warehouse stores ~312 million kilometers of ship trajectories and more than 8 billion rows in the largest table. It is found that searching the cell representation is faster than searching the trajectory representation. Further, we show that the…
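The raster idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cell size, the function names (`cell_of`, `build_heatmap`, `count_in_box`), and the use of projected coordinates in metres are all assumptions made for illustration. The sketch maps each position to an integer grid cell, counts positions per cell to obtain heatmap values, and answers an area query by summing the cells inside a bounding box rather than scanning individual trajectory points.

```python
from collections import Counter
from typing import Iterable, Tuple

Cell = Tuple[int, int]

# Assumed cell side length in metres (hypothetical, not taken from the paper).
CELL_SIZE_M = 1000.0


def cell_of(x: float, y: float, cell_size: float = CELL_SIZE_M) -> Cell:
    """Map a projected coordinate (metres) to integer grid cell indices."""
    # Floor division keeps negative coordinates in the correct cell.
    return (int(x // cell_size), int(y // cell_size))


def build_heatmap(points: Iterable[Tuple[float, float]],
                  cell_size: float = CELL_SIZE_M) -> Counter:
    """Count how many positions fall into each grid cell."""
    counts: Counter = Counter()
    for x, y in points:
        counts[cell_of(x, y, cell_size)] += 1
    return counts


def count_in_box(heatmap: Counter, min_cell: Cell, max_cell: Cell) -> int:
    """Sum the counts of all cells inside an inclusive cell-index box."""
    return sum(
        n for (cx, cy), n in heatmap.items()
        if min_cell[0] <= cx <= max_cell[0] and min_cell[1] <= cy <= max_cell[1]
    )


# Four hypothetical positions in projected metres.
positions = [(100.0, 200.0), (900.0, 950.0), (1500.0, 200.0), (-1.0, 0.0)]
heat = build_heatmap(positions)
total_in_area = count_in_box(heat, (0, 0), (1, 0))
```

In this sketch an area query touches one entry per occupied cell instead of one entry per recorded position, which is the intuition behind searching a cell representation rather than a trajectory representation.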
Taxonomy
Topics: Maritime Navigation and Safety · Maritime Transport Emissions and Efficiency · Data Management and Algorithms
