Efficient algorithms for collecting the statistics of large-scale IP address data

Hui Liu, Yi Cao, Zehan Cai, Hua Mao, and Jie Chen

arXiv:2108.04000 · cs.CC · February 24, 2025

TL;DR

This paper introduces two efficient algorithms for large-scale IP address statistics collection, optimizing time and memory use by leveraging the sparse nature of IP data and dynamic hash indexing, with a parallel scheme for faster processing.

Contribution

The paper presents novel algorithms that avoid hash collisions and adapt hash index length, significantly improving efficiency in large-scale IP data analysis.

Findings

1. Outperforms baseline methods in time efficiency
2. Uses less memory than existing solutions
3. Supports parallel processing for faster computation

Abstract

Compiling the statistics of large-scale IP address data is an essential task in network traffic measurement. The statistical results are used to evaluate the potential impact of user behaviors on network traffic. This requires algorithms that can store and retrieve a high volume of IP addresses within time and memory constraints. In this paper, we present two efficient algorithms for collecting the statistics of large-scale IP addresses that balance time efficiency and memory consumption. The proposed solutions take into account the sparse nature of the statistics of IP addresses while building the hash function and maintain a dynamic balance among layered memory blocks. There are two layers in the first proposed method, each of which contains a limited number of memory blocks. Each memory block contains 256 elements and occupies $256 \times 8$ bytes on a 64-bit system. In…
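The abstract is truncated here, so the details of the two-layer design are not available on this page. As a rough illustration only, the following Python sketch shows one generic way to count IPv4 addresses with 256-slot blocks that are allocated on demand, which keeps memory proportional to the addresses actually seen and avoids hash collisions by indexing directly on octets. It is an octet-indexed table with four levels, not the paper's two-layer algorithm, and the names `OctetCounter`, `parse_ipv4`, and `BLOCK_SLOTS` are hypothetical.

```python
from typing import Any, List, Tuple

BLOCK_SLOTS = 256  # one slot per possible octet value (0..255)


def parse_ipv4(ip: str) -> Tuple[int, int, int, int]:
    """Split a dotted-quad string into four validated octets."""
    parts = ip.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"not a dotted-quad IPv4 address: {ip!r}")
    octets = []
    for part in parts:
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"non-numeric octet in {ip!r}")
        value = int(part)
        if value > 255:
            raise ValueError(f"octet out of range in {ip!r}")
        octets.append(value)
    return octets[0], octets[1], octets[2], octets[3]


class OctetCounter:
    """Illustrative counter (an assumption, not the authors' structure).

    Each block is a 256-slot list. Inner blocks hold references to child
    blocks; the last level holds integer counts. Blocks are created only
    when an address first needs them, so sparse inputs stay small.
    """

    def __init__(self) -> None:
        self.root: List[Any] = [None] * BLOCK_SLOTS
        self.blocks_allocated = 1  # the root block

    def add(self, ip: str) -> None:
        a, b, c, d = parse_ipv4(ip)
        node = self.root
        for octet in (a, b, c):
            child = node[octet]
            if child is None:
                # Allocate lazily: untouched prefixes cost nothing.
                child = [None] * BLOCK_SLOTS
                node[octet] = child
                self.blocks_allocated += 1
            node = child
        node[d] = (node[d] or 0) + 1

    def count(self, ip: str) -> int:
        a, b, c, d = parse_ipv4(ip)
        node = self.root
        for octet in (a, b, c):
            node = node[octet]
            if node is None:
                return 0  # prefix never seen
        return node[d] or 0
```

Each distinct address maps to exactly one slot, so lookups never need collision handling. In a language with raw arrays, a block of 256 pointer-sized slots would take 256 × 8 = 2048 bytes on a 64-bit system; Python lists carry extra overhead, so the sketch illustrates only the indexing scheme, not the memory figures.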

Code & Models

Repositories

chenjie20/ipstatistics (Official)

Taxonomy

Topics: Network Packet Processing and Optimization · Network Security and Intrusion Detection · Algorithms and Data Compression