Linnaeus: A Hierarchical, Multi-Label Framework for Autonomous System Classification
Marcos Piotto, Ignacio Schuemer, Santiago T. Torres, Mariano G. Beiró, Esteban Carisimo, Fabián E. Bustamante

TL;DR
Linnaeus is a hierarchical, multi-label classification framework for autonomous systems that combines network data and language models to improve semantic understanding of Internet infrastructure.
Contribution
It introduces a novel two-level taxonomy and multi-label classification approach integrating domain-adapted language models with traditional machine learning.
Findings
Achieves 0.83 precision and 0.76 recall on a manually annotated dataset of nearly 2,000 ASes.
Supports multi-label assignments for hybrid roles of ASes.
Demonstrates practical utility through case studies revealing structural and semantic insights.
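To make the multi-label metrics concrete, the following is a minimal sketch of how precision and recall can be computed when each AS carries a set of (top-level, second-level) labels. It is an illustration under stated assumptions, not the authors' evaluation code: the summary does not say which averaging scheme the paper uses, so micro-averaging is assumed here; the class names are hypothetical placeholders rather than the paper's taxonomy; and the AS numbers come from the range reserved for documentation.

```python
from typing import Dict, Set, Tuple

# A label is a (top-level class, second-level class) pair.
# The class names used below are hypothetical, not the paper's taxonomy.
Label = Tuple[str, str]


def micro_precision_recall(
    gold: Dict[int, Set[Label]],
    pred: Dict[int, Set[Label]],
) -> Tuple[float, float]:
    """Micro-averaged precision and recall over per-AS label sets.

    Assumption: micro-averaging (pooling label decisions across all ASes).
    The paper may use a different averaging scheme.
    """
    tp = fp = fn = 0
    for asn in gold.keys() | pred.keys():
        g = gold.get(asn, set())
        p = pred.get(asn, set())
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in the annotation
        fn += len(g - p)   # annotated labels the prediction missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data: AS 64500 plays a hybrid role (two labels), AS 64501 a single one.
gold = {
    64500: {("Research", "Research Backbone"), ("Transit", "Transit Provider")},
    64501: {("Content", "Hyperscale Content Provider")},
}
pred = {
    64500: {("Research", "Research Backbone")},
    64501: {("Content", "Hyperscale Content Provider"),
            ("Transit", "Transit Provider")},
}

precision, recall = micro_precision_recall(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example one hybrid role is missed and one spurious label is predicted, so both metrics fall below 1.0. Representing labels as sets per AS is what lets a single AS count toward several classes at once.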
Abstract
Autonomous systems (ASes) play diverse roles in today's Internet, from community and research backbones to hyperscale content providers and submarine-cable operators. However, existing taxonomies based solely on network-level features fail to capture their semantic and operational heterogeneity. In this paper, we present Linnaeus, a hierarchical AS-classification framework that combines network-centric data (e.g., topology, BGP announcements) with rich non-network features and leverages domain-adapted large language models alongside traditional machine-learning techniques. Linnaeus provides a two-level taxonomy with 18 top-level and 38 second-level classes, supports multi-label assignments to reflect hybrid roles (e.g., research backbone and transit provider), and provides an end-to-end pipeline from data ingestion to label inference. On a manually annotated dataset of nearly 2,000…
Taxonomy
Topics: Advanced Graph Neural Networks · Complex Network Analysis Techniques · Text and Document Classification Technologies
