Rudder: Steering Prefetching in Distributed GNN Training using LLM Agents

Aishwarya Sarkar; Sayan Ghosh; Nathan Tallent; Aman Chadha; Tanya Roosta; Ali Jannesari

arXiv:2602.23556 · cs.LG · March 2, 2026

Open Access

TL;DR

Rudder is an adaptive prefetching module for distributed GNN training that uses LLMs to decide which remote nodes to prefetch, improving training performance by up to 91% over the baseline and reducing communication by over 50%.

Contribution

This paper introduces Rudder, a novel LLM-based adaptive prefetching system integrated into AWS DistDGL for efficient distributed GNN training.

Findings

01. Achieves up to 91% performance improvement over baseline
02. Reduces communication by over 50%
03. Outperforms static prefetching strategies

Abstract

Large-scale Graph Neural Networks (GNNs) are typically trained by sampling a vertex's neighbors to a fixed distance. Because large input graphs are distributed, training requires frequent irregular communication that stalls forward progress. Moreover, fetched data changes with graph, graph distribution, sample and batch parameters, and caching policies. Consequently, any static prefetching method will miss crucial opportunities to adapt to different dynamic conditions. In this paper, we introduce Rudder, a software module embedded in the state-of-the-art AWS DistDGL framework, to autonomously prefetch remote nodes and minimize communication. Rudder's adaptation contrasts with both standard heuristics and traditional ML classifiers. We observe that the generative AI found in contemporary Large Language Models (LLMs) exhibits emergent properties like In-Context Learning (ICL) for zero-shot…
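As a rough illustration of why sampled training on a partitioned graph triggers remote fetches, the toy sketch below samples a vertex's neighbors out to a fixed number of hops and then lists the sampled vertices owned by another partition. It is not Rudder's method and does not use the DistDGL API; the function names `sample_khop` and `remote_vertices`, the ring graph, and the partition map are all hypothetical and chosen only for this example.

```python
import random


def sample_khop(adj, seeds, fanout, hops, rng):
    """Sample up to `fanout` neighbors per vertex, out to `hops` hops from the seeds."""
    visited = set(seeds)
    frontier = set(seeds)
    for _ in range(hops):
        next_frontier = set()
        for v in sorted(frontier):
            nbrs = adj.get(v, [])
            # Keep every neighbor if there are few; otherwise draw a random subset.
            picked = nbrs if len(nbrs) <= fanout else rng.sample(nbrs, fanout)
            next_frontier.update(picked)
        frontier = next_frontier - visited
        visited |= frontier
    return visited


def remote_vertices(sampled, owner, local_part):
    """Sampled vertices owned by another partition (these would require communication)."""
    return {v for v in sampled if owner[v] != local_part}


# Hypothetical toy graph: a ring of 8 vertices.
# Vertices 0-3 live on partition 0, vertices 4-7 on partition 1.
adj = {v: [(v - 1) % 8, (v + 1) % 8] for v in range(8)}
owner = {v: 0 if v < 4 else 1 for v in range(8)}

rng = random.Random(0)
sampled = sample_khop(adj, seeds=[0], fanout=2, hops=2, rng=rng)
remote = remote_vertices(sampled, owner, local_part=0)

print("sampled:", sorted(sampled))  # sampled: [0, 1, 2, 6, 7]
print("remote: ", sorted(remote))   # remote:  [6, 7]
```

Changing the seed vertex, the fanout, the hop count, or the partition map changes which vertices end up remote, which is the kind of variability the abstract says a static prefetching policy cannot track.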

Taxonomy

Topics: Advanced Graph Neural Networks · Multimodal Machine Learning Applications · Topic Modeling