A Distributed Real-Time Recommender System for Big Data Streams

Heidy Hazem; Ahmed Awad; Ahmed Hassan

arXiv:2204.04633·cs.DC·April 12, 2022

TL;DR

This paper introduces a distributed streaming recommender system architecture that improves scalability, latency, and accuracy for big data streams by leveraging a splitting and replication mechanism inspired by shared-nothing architecture, implemented on Apache Flink.
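To make the splitting and replication idea more concrete, here is a minimal sketch in Python. It is not the paper's implementation and does not use Apache Flink. It assumes one plausible reading of the mechanism: workers are arranged in a grid, each (user, item) event is hashed to exactly one worker, and each worker keeps only its own local state (shared-nothing). The names `stable_hash`, `route`, `Worker`, and `run`, and the grid parameters `user_splits` and `item_splits`, are hypothetical and chosen for illustration.

```python
import zlib
from collections import defaultdict


def stable_hash(key):
    """Deterministic hash; Python's built-in hash() is salted per process for strings."""
    return zlib.crc32(str(key).encode("utf-8"))


def route(user_id, item_id, user_splits, item_splits):
    """Map one (user, item) event to exactly one worker in a
    user_splits x item_splits grid (an assumed layout, not the paper's)."""
    row = stable_hash(user_id) % user_splits
    col = stable_hash(item_id) % item_splits
    return row * item_splits + col


class Worker:
    """Holds only local state; nothing is shared with other workers."""

    def __init__(self):
        # user -> set of items this worker has observed for that user
        self.seen = defaultdict(set)

    def update(self, user_id, item_id):
        self.seen[user_id].add(item_id)


def run(events, user_splits=2, item_splits=2):
    """Feed a stream of (user, item) events through the routing function."""
    workers = [Worker() for _ in range(user_splits * item_splits)]
    for user_id, item_id in events:
        index = route(user_id, item_id, user_splits, item_splits)
        workers[index].update(user_id, item_id)
    return workers
```

Under this assumed layout, a given user always lands in the same grid row, so that user's state is replicated across at most `item_splits` workers, and a given item's state across at most `user_splits` workers. A real deployment would replace the `seen` sets with an incrementally trained model per worker.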

Contribution

It proposes a novel distributed architecture for streaming recommender systems that addresses scalability, concept drift, and real-time processing, extending existing methods to handle big data volumes.

Findings

01. 40% improvement in online recall

02. Over 50% reduction in memory consumption

03. Improved processing latency and throughput

Abstract

In today's data-driven world, recommender systems (RS) play a crucial role in supporting decision-making. As users become continuously connected to the internet, they grow less patient and less tolerant of obsolete recommendations from an RS, e.g., movie recommendations on Netflix or books to read on Amazon. This, in turn, requires continuous training of the RS to cope with both the online nature of the data and the changing nature of user tastes and interests, known as concept drift. A streaming (online) RS has to address three requirements: continuous training and recommendation, handling concept drift, and the ability to scale. Streaming recommender systems proposed in the literature mostly address the first two requirements and do not consider scalability, because they run the training process on a single machine. Such a machine, no matter how powerful it is, will…

Taxonomy

Topics: Recommender Systems and Techniques · Caching and Content Delivery · Advanced Bandit Algorithms Research