Fast and Error-Adaptive Influence Maximization based on Count-Distinct Sketches

Gokhan Gokturk; Kamer Kaya

arXiv:2105.04023 · cs.SI · May 11, 2021

Open Access

TL;DR

This paper introduces a fast, error-adaptive influence maximization algorithm using Count-Distinct sketches and hash-based sampling, significantly improving speed and seed set quality over existing methods.

Contribution

It presents a novel influence maximization approach that combines error-adaptive sketch rebuilding with efficient diffusion simulation, achieving high speed and accuracy.

Findings

1. Up to 119x faster than state-of-the-art algorithms.
2. Produces seed sets with 3%-12% better influence scores.
3. Maintains high-quality influence maximization with reduced computational effort.

Abstract

Influence maximization (IM) is the problem of finding a seed vertex set that maximizes the expected number of vertices influenced under a given diffusion model. Because finding an optimal seed set is NP-hard, approximation algorithms are frequently used for IM. In this work, we describe a fast, error-adaptive approach that leverages Count-Distinct sketches and hash-based fused sampling. To estimate the number of influenced vertices throughout a diffusion, we use per-vertex Flajolet-Martin sketches where each sketch corresponds to a sampled subgraph. To simulate the diffusions efficiently, the reach-set cardinalities of a single vertex are stored consecutively in memory. This allows the proposed algorithm to estimate the number of influenced vertices in a single step for all simulations at once. For a faster IM kernel, we rebuild the sketches in parallel only after…
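
As a rough illustration of the count-distinct idea the abstract refers to, the following Python sketch implements a classic Flajolet-Martin estimator and uses it to approximate reach-set sizes on one fixed directed graph. It is not the paper's implementation: the names `FMSketch`, `hash64`, `rho`, and `reach_sketches`, the choice of blake2b as the hash, and the use of 64 bitmaps per vertex are all assumptions made for this example, and the paper's per-sample sketches, fused sampling, and error-adaptive rebuilding are not modeled here.

```python
import hashlib

PHI = 0.77351  # correction constant from the Flajolet-Martin analysis


def hash64(item, seed):
    """64-bit hash of (seed, item); blake2b is an illustrative choice."""
    data = f"{seed}:{item}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def rho(x):
    """0-based index of the lowest set bit of x (63 if x == 0)."""
    return (x & -x).bit_length() - 1 if x else 63


class FMSketch:
    """Hypothetical helper: one FM bitmap per hash function."""

    def __init__(self, num_registers=64):
        self.bitmaps = [0] * num_registers

    def add(self, item):
        # Each hash function sets one bit; duplicates set the same bits.
        for j in range(len(self.bitmaps)):
            self.bitmaps[j] |= 1 << rho(hash64(item, j))

    def merge(self, other):
        """In-place union via bitwise OR; returns True if anything changed."""
        changed = False
        for j, b in enumerate(other.bitmaps):
            merged = self.bitmaps[j] | b
            if merged != self.bitmaps[j]:
                self.bitmaps[j] = merged
                changed = True
        return changed

    def estimate(self):
        # R = index of the lowest unset bit; average R over all bitmaps.
        total = 0
        for b in self.bitmaps:
            r = 0
            while (b >> r) & 1:
                r += 1
            total += r
        return 2 ** (total / len(self.bitmaps)) / PHI


def reach_sketches(num_vertices, edges, num_registers=64):
    """Sketch of each vertex's reach set on a fixed directed graph.

    Every vertex starts with a sketch of itself, then repeatedly absorbs
    the sketches of its out-neighbors until nothing changes.
    """
    sketches = [FMSketch(num_registers) for _ in range(num_vertices)]
    for v, s in enumerate(sketches):
        s.add(v)
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            if sketches[u].merge(sketches[v]):
                changed = True
    return sketches
```

Because merging is a bitwise OR, the sketch of a union of sets equals the union of their sketches, which is what lets a vertex's reach-set size be estimated by propagating sketches along edges instead of enumerating reachable vertices. The estimate is approximate and is known to be biased for very small sets.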

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques