Continuous Top-k Queries over Real-Time Web Streams
Nelly Vouzoukidou, Bernd Amann, Vassilis Christophides

TL;DR
This paper introduces a novel approach for processing continuous top-k queries over real-time web streams, efficiently handling dynamic scores and high data arrival rates in social media environments.
Contribution
To the authors' knowledge, it presents the first method for handling non-predictable, dynamic scores in continuous top-k queries over real-time web streams.
Findings
Developed a publish/subscribe indexing framework for real-time query updates
Achieved efficient processing of high-velocity data streams with dynamic scoring
Demonstrated the approach's effectiveness on social media data streams
Abstract
The Web has become a large-scale real-time information system, forcing us to revise both how to effectively assess the relevance of information for a user and how to efficiently implement information retrieval and dissemination functionality. To increase information relevance, real-time Web applications such as Twitter and Facebook extend content and social-graph relevance scores with "real-time" user-generated events (e.g., re-tweets, replies, likes). To accommodate high arrival rates of information items and user events, we explore a publish/subscribe paradigm in which we index queries and update their results on the fly each time a new item or relevant event arrives. In this setting, we need to process continuous top-k text queries combining both static and dynamic scores. To the best of our knowledge, this is the first work addressing how non-predictable, dynamic scores can be handled in…
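The publish/subscribe setting described in the abstract can be illustrated with a deliberately naive baseline. This is not the paper's algorithm: it simply indexes queries by term and recomputes each affected query's top-k whenever an item is published or a user event arrives. The class name `NaiveTopKPubSub`, the weight `alpha`, and the scoring rule (term-overlap static score plus `alpha` times the event count) are assumptions made for illustration only.

```python
import heapq
from collections import defaultdict


class NaiveTopKPubSub:
    """Naive baseline (an assumption, not the authors' method):
    queries are indexed by term, and every new item or user event
    triggers a full re-ranking of the affected queries."""

    def __init__(self, k, alpha=0.5):
        self.k = k
        self.alpha = alpha                        # hypothetical weight of the dynamic score
        self.queries_by_term = defaultdict(set)   # term -> ids of queries containing it
        self.query_terms = {}                     # query id -> set of terms
        self.item_terms = {}                      # item id -> set of terms
        self.events = defaultdict(int)            # item id -> number of user events seen
        self.candidates = defaultdict(set)        # query id -> ids of matching items
        self.results = {}                         # query id -> current top-k item ids

    def subscribe(self, qid, terms):
        # Only items published after this call are considered for the query.
        self.query_terms[qid] = set(terms)
        for term in self.query_terms[qid]:
            self.queries_by_term[term].add(qid)
        self.results[qid] = []

    def score(self, qid, item_id):
        # Static part: fraction of query terms present in the item.
        # Dynamic part: grows with each event (re-tweet, reply, like, ...).
        query = self.query_terms[qid]
        static = len(query & self.item_terms[item_id]) / len(query)
        return static + self.alpha * self.events[item_id]

    def _matching_queries(self, item_id):
        qids = set()
        for term in self.item_terms.get(item_id, ()):
            qids |= self.queries_by_term.get(term, set())
        return qids

    def _refresh(self, qid):
        # Full re-ranking; ties are broken by item id to keep results deterministic.
        self.results[qid] = heapq.nlargest(
            self.k,
            self.candidates[qid],
            key=lambda item_id: (self.score(qid, item_id), item_id),
        )

    def publish(self, item_id, terms):
        self.item_terms[item_id] = set(terms)
        for qid in self._matching_queries(item_id):
            self.candidates[qid].add(item_id)
            self._refresh(qid)

    def on_event(self, item_id):
        self.events[item_id] += 1
        for qid in self._matching_queries(item_id):
            self._refresh(qid)
```

Because an event can raise the score of an item that was previously outside a query's top-k, this baseline keeps every matching item as a candidate and re-ranks on each update. That cost is what makes non-predictable dynamic scores hard at high arrival rates, and it is the kind of work an efficient index would aim to avoid.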
Taxonomy
Topics: Caching and Content Delivery · Data Management and Algorithms · Web Data Mining and Analysis
