Cardinality is Not Enough: Super Host Detection via Segmented Cardinality Estimation

Yilin Zhao; Jiawei Huang; Xianshi Su; Weihe Li; Xin Li; Yan Liu; Jiacheng Xie; Qichen Su; Jin Ye; Wanchun Jiang; Jianxin Wang

arXiv:2604.02379·cs.NI·April 6, 2026

TL;DR

This paper introduces SegSketch, a memory-efficient method for super host detection that uses halved-segment hashing to infer common IP prefix lengths and estimate cardinality within subnets, reaching up to 8.04x higher F1-Score than existing methods.

Contribution

The paper presents SegSketch, a novel segmented cardinality estimation technique that reduces false positives and enhances detection accuracy with limited memory.

Findings

1. SegSketch achieves up to 8.04x higher F1-Score than existing methods.

2. It estimates subnet cardinality to improve super host detection.

3. Experiments on real-world data validate its superior performance under small memory budgets.

Abstract

Accurately detecting super hosts, which establish connections to a large number of distinct peers, is important for mitigating web attacks and ensuring high-quality web service. Existing sketch-based approaches estimate the number of distinct connections, called flow cardinality, from full IP addresses, ignoring the fact that a malicious or victim super host often communicates with hosts within the same subnet; this results in high false positive rates and low accuracy. Although hierarchical-structure-based approaches can capture flow cardinality within a subnet, they inherently suffer from high memory usage. To address these limitations, we propose SegSketch, a segmented cardinality estimation approach that employs a lightweight halved-segment hashing strategy to infer common prefix lengths of IP addresses and estimates cardinality within subnets to enhance detection accuracy…
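
To make the two quantities in the abstract concrete, the following is a minimal sketch that computes them exactly, with plain Python sets rather than a sketch data structure. It is not SegSketch and does not reproduce the halved-segment hashing strategy, whose details are not given here. The function names, the (host, peer) input format, and the /24 default prefix length are all assumptions made for illustration.

```python
import ipaddress
from collections import defaultdict


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits shared by two IPv4 addresses (0 to 32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    # bit_length() is the position of the highest differing bit (0 if equal).
    return 32 - diff.bit_length()


def flow_cardinality(flows):
    """Exact number of distinct peers per host, from (host, peer) pairs."""
    peers = defaultdict(set)
    for host, peer in flows:
        peers[host].add(peer)
    return {host: len(seen) for host, seen in peers.items()}


def largest_subnet_cardinality(flows, prefix_len=24):
    """Per host, the largest distinct-peer count inside a single subnet.

    prefix_len is a hypothetical fixed prefix length chosen for this
    illustration; the abstract describes inferring prefix lengths instead.
    """
    buckets = defaultdict(lambda: defaultdict(set))
    for host, peer in flows:
        net = ipaddress.ip_network(f"{peer}/{prefix_len}", strict=False)
        buckets[host][net].add(peer)
    return {
        host: max(len(seen) for seen in nets.values())
        for host, nets in buckets.items()
    }


# Hypothetical traffic: h1 contacts one subnet, h2 contacts scattered peers.
flows = [
    ("h1", "10.0.0.1"), ("h1", "10.0.0.2"), ("h1", "10.0.0.3"),
    ("h2", "10.0.0.1"), ("h2", "172.16.5.9"), ("h2", "192.168.1.1"),
]
full = flow_cardinality(flows)
subnet = largest_subnet_cardinality(flows)
```

In this toy input both hosts have the same full-address cardinality, but only h1 concentrates its peers in one /24 subnet, which is the distinction the abstract argues that full-IP estimation misses.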
