HPC-LLM: Practical Domain Adaptation and Retrieval-Augmented Generation for HPC Support

Nourin Shahin; Izzat Alsmadi

arXiv:2605.16347·cs.LG·May 19, 2026

TL;DR

HPC-LLM is a domain-specific, retrieval-augmented language model fine-tuned for HPC workflows. It performs comparably to larger general-purpose models while needing fewer resources.

Contribution

The paper introduces HPC-LLM, a domain-adapted LLM for HPC support, utilizing retrieval augmentation and lightweight fine-tuning to improve operational assistance.

Findings

01

The adapted 8B model performs comparably to larger models like Qwen 2.5 14B.

02

HPC-LLM requires significantly less GPU memory and has lower inference latency than the larger general-purpose models.

03

The authors constructed a large HPC-focused dataset from documentation and synthetic data.

Abstract

Modern scientific research increasingly depends on High-Performance Computing (HPC) infrastructures, yet many researchers face significant operational barriers when interacting with cluster environments, job schedulers, GPU resources, and parallel computing frameworks. General-purpose large language models (LLMs) provide useful coding assistance but often lack the domain-specific operational knowledge required for reliable HPC support. This paper presents HPC-LLM, a retrieval-augmented and domain-adapted assistant designed to support common HPC workflows including Slurm scheduling, MPI execution, GPU utilization, filesystem management, and cluster troubleshooting. The proposed framework integrates automated documentation ingestion, dense retrieval, lightweight domain adaptation using QLoRA, and local inference within a modular orchestration pipeline. To support domain adaptation, we…
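
To make the retrieval step named in the abstract concrete, here is a minimal Python sketch of retrieve-then-prompt over a few documentation snippets. It is an illustration under stated assumptions, not the authors' implementation: the corpus `DOCS`, the functions `embed`, `cosine`, `retrieve`, and `build_prompt`, and the prompt layout are all hypothetical, and `embed` uses normalised term counts as a toy stand-in for a learned dense encoder.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for ingested HPC documentation.
DOCS = [
    "Submit a batch job to Slurm with sbatch and check the queue with squeue.",
    "Launch MPI programs across nodes with srun or mpirun inside an allocation.",
    "Request GPUs in a Slurm job script with the --gres=gpu:1 option.",
]


def embed(text):
    """Toy stand-in for a dense encoder: L2-normalised term counts.

    A real system would call a learned embedding model here (assumption).
    """
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}


def cosine(a, b):
    """Dot product of two already-normalised sparse vectors."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, docs, k=2):
    """Assemble a prompt that places retrieved context before the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("How do I request GPUs in a Slurm job script?", DOCS))
```

The resulting prompt would then be passed to a locally hosted model. Swapping `embed` for a real embedding model and `DOCS` for an indexed documentation store would leave the rest of the flow unchanged.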
