Choice of Parallelism: Multi-GPU Driven Pipeline for Huge Academic Backbone Network
Ruo Ando, Youki Kadobayashi, Hiroki Takakura

TL;DR
This paper introduces a multi-GPU pipeline for processing massive session data from SINET, the Japanese academic backbone network. GPUs accelerate the ingoing/outgoing discrimination of sessions, and a two-stage CPU/GPU map-reduce groups the results into bins.
Contribution
It presents a multi-GPU-driven pipeline that uses a tiling design pattern to build a two-stage CPU/GPU map-reduce for large-scale session data processing in an academic network.
Findings
Processed 1.2 to 1.6 billion session streams within 24 hours.
Accelerated ingress/egress discrimination of session data on the GPU.
Implemented a two-stage map-reduce across CPU and GPU, built with a tiling design pattern, to group session data into bins.
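The two-stage, tiled map-reduce can be illustrated with a minimal CPU-only sketch. It is not the authors' implementation: the function names, the tile size, and the string labels are hypothetical, and the source does not say which stage runs on the CPU and which on the GPU. The sketch only shows the shape of the technique: stage 1 builds a partial histogram per tile, and stage 2 merges the partial histograms.

```python
from collections import Counter
from typing import Iterable, Iterator, Sequence


def tiles(records: Sequence[str], tile_size: int) -> Iterator[Sequence[str]]:
    """Split the records into fixed-size tiles (the last tile may be shorter)."""
    for start in range(0, len(records), tile_size):
        yield records[start:start + tile_size]


def map_tile(tile: Iterable[str]) -> Counter:
    """Stage 1: count the bin labels inside a single tile (partial histogram)."""
    return Counter(tile)


def reduce_partials(partials: Iterable[Counter]) -> Counter:
    """Stage 2: merge the per-tile partial histograms into one histogram."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return total


def tiled_histogram(records: Sequence[str], tile_size: int) -> Counter:
    """Run both stages; tile_size is a hypothetical tuning parameter."""
    return reduce_partials(map_tile(t) for t in tiles(records, tile_size))


# Hypothetical input: one direction label per session record.
sessions = ["in", "out", "in", "in", "out"]
histogram = tiled_histogram(sessions, tile_size=2)
```

Because each tile is counted independently, stage 1 is the part that could be spread across GPU threads or devices, while stage 2 touches only the small partial histograms.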
Abstract
The Science Information Network (SINET) is a Japanese academic backbone network serving more than 800 research institutions and universities. In this paper, we present a multi-GPU-driven pipeline for handling the huge session data of SINET. Our pipeline consists of the ELK stack, a multi-GPU server, and Splunk. The multi-GPU server is responsible for two procedures: discrimination and histogramming. Discrimination divides session data into ingoing and outgoing traffic using subnet mask calculation and network address matching. Histogramming groups the ingoing/outgoing session data into bins with map-reduce. In our architecture, we use the GPU to accelerate the ingress/egress discrimination of session data. We also use a tiling design pattern to build a two-stage map-reduce across CPU and GPU. Our multi-GPU-driven pipeline has succeeded in processing huge workloads of about 1.2 to 1.6 billion session streams…
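The discrimination step described in the abstract (subnet mask calculation followed by network address matching) can be sketched on the CPU as follows. This is an illustration under stated assumptions, not the paper's GPU code: the function names are hypothetical, only IPv4 is handled, and the addresses come from the reserved documentation ranges rather than from SINET.

```python
import ipaddress


def in_subnet(addr: str, cidr: str) -> bool:
    """Return True if the IPv4 address lies inside the given CIDR block.

    The check mirrors the mask-and-compare arithmetic: apply the subnet
    mask to the address and compare the result with the network address.
    """
    net = ipaddress.ip_network(cidr)
    ip = int(ipaddress.ip_address(addr))
    mask = int(net.netmask)
    return (ip & mask) == int(net.network_address)


def discriminate(src: str, dst: str, cidr: str) -> str:
    """Label one session relative to the monitored network (hypothetical labels)."""
    src_inside = in_subnet(src, cidr)
    dst_inside = in_subnet(dst, cidr)
    if dst_inside and not src_inside:
        return "ingoing"
    if src_inside and not dst_inside:
        return "outgoing"
    return "other"  # both endpoints inside, or both outside


# Assumed monitored network: 192.0.2.0/24 (a documentation range).
label = discriminate("198.51.100.7", "192.0.2.10", "192.0.2.0/24")
```

Each session is labelled independently of every other session, which is what makes this step a natural fit for one-thread-per-record execution on a GPU.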
