Adaptive Processing of Spatial-Keyword Data Over a Distributed Streaming Cluster
Ahmed R. Mahmood, Anas Daghistani, Ahmed M. Aly, Walid G. Aref, Mingjie Tang, Saleh Basalamah, Sunil Prabhakar

TL;DR
This paper introduces Tornado, a distributed system for real-time processing of geo-tagged textual data streams that adaptively balances workload and achieves up to 100x higher throughput than systems that are not spatio-textually aware.
Contribution
The paper presents Tornado, a novel distributed spatial-keyword stream processing system with adaptive load balancing and efficient data-query co-location.
Findings
Tornado achieves up to 100x higher throughput than systems that are not spatio-textually aware.
The system effectively reduces redundant communication by filtering irrelevant data updates.
Experimental results on Twitter data validate Tornado's efficiency and scalability.
Abstract
The widespread use of GPS-enabled smartphones, along with the popularity of micro-blogging and social networking applications, e.g., Twitter and Facebook, has resulted in the generation of huge streams of geo-tagged textual data. Many applications require real-time processing of these streams. For example, location-based e-coupon and ad-targeting systems enable advertisers to register millions of ads to millions of users. The number of users is typically very high and they are continuously moving, and the ads change frequently as well. Hence, sending the right ad to the matching users is very challenging. Existing streaming systems are either centralized or are not spatial-keyword aware, and cannot efficiently support the processing of rapidly arriving spatial-keyword data streams. This paper presents Tornado, a distributed spatial-keyword stream processing system. Tornado features…
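To make the ad-matching example concrete, here is a minimal single-machine sketch of spatial-keyword matching. It is not Tornado's implementation: the `Query`, `GeoText`, and `GridRouter` names, the uniform grid, and the rule that a query matches when all of its keywords appear in the object's text are assumptions chosen for illustration. The grid shows, in a simplified form, how registering queries by location lets an incoming object be checked only against nearby queries instead of all of them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Hypothetical continuous query: a rectangle plus required keywords."""
    qid: str
    min_x: float
    min_y: float
    max_x: float
    max_y: float
    keywords: frozenset


@dataclass(frozen=True)
class GeoText:
    """Hypothetical geo-tagged text object, e.g. a tweet with a location."""
    x: float
    y: float
    words: frozenset


class GridRouter:
    """Uniform grid that stores each query in every cell its rectangle overlaps."""

    def __init__(self, cell_size=1.0):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        # Map a coordinate to the integer index of its grid cell.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def register(self, q):
        # Store the query in all cells covered by its rectangle.
        cx0, cy0 = self._cell(q.min_x, q.min_y)
        cx1, cy1 = self._cell(q.max_x, q.max_y)
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                self.cells[(cx, cy)].append(q)

    def match(self, obj):
        # Only queries registered in the object's own cell are candidates;
        # each candidate is then checked exactly on location and keywords.
        candidates = self.cells.get(self._cell(obj.x, obj.y), [])
        return sorted(
            q.qid
            for q in candidates
            if q.min_x <= obj.x <= q.max_x
            and q.min_y <= obj.y <= q.max_y
            and q.keywords <= obj.words  # assumed semantics: all keywords present
        )


router = GridRouter(cell_size=1.0)
router.register(Query("coffee_ad", 0.0, 0.0, 2.5, 1.5, frozenset({"coffee"})))
router.register(Query("pizza_ad", 5.0, 5.0, 6.0, 6.0, frozenset({"pizza"})))

tweet = GeoText(1.2, 0.7, frozenset({"need", "coffee", "now"}))
matched = router.match(tweet)
```

In this sketch an object falling in a cell with no registered queries is discarded after a single dictionary lookup, which loosely mirrors the idea of filtering irrelevant data early. A distributed system would additionally need to assign cells to workers and rebalance them as load shifts, which this sketch does not attempt.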
