HPC-LLM: Practical Domain Adaptation and Retrieval-Augmented Generation for HPC Support
Nourin Shahin, Izzat Alsmadi

TL;DR
HPC-LLM is a domain-specific, retrieval-augmented language model fine-tuned for HPC workflows. It performs comparably to larger general-purpose models while requiring less GPU memory and lower inference latency.
Contribution
The paper introduces HPC-LLM, a domain-adapted LLM for HPC support that combines retrieval augmentation with lightweight fine-tuning (QLoRA) to improve operational assistance.
Findings
The adapted 8B model performs comparably to larger models such as Qwen 2.5 14B.
HPC-LLM runs with significantly less GPU memory and lower inference latency than those larger models.
The authors constructed a large HPC-focused dataset from documentation and synthetic data.
Abstract
Modern scientific research increasingly depends on High-Performance Computing (HPC) infrastructures, yet many researchers face significant operational barriers when interacting with cluster environments, job schedulers, GPU resources, and parallel computing frameworks. General-purpose large language models (LLMs) provide useful coding assistance but often lack the domain-specific operational knowledge required for reliable HPC support. This paper presents HPC-LLM, a retrieval-augmented and domain-adapted assistant designed to support common HPC workflows including Slurm scheduling, MPI execution, GPU utilization, filesystem management, and cluster troubleshooting. The proposed framework integrates automated documentation ingestion, dense retrieval, lightweight domain adaptation using QLoRA, and local inference within a modular orchestration pipeline. To support domain adaptation, we…
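
To make the dense retrieval step named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy hashed bag-of-words function (`embed`, hypothetical) in place of a real embedding model, a three-entry in-memory list (`DOCS`, hypothetical) in place of the ingested documentation, and a simple prompt template (`build_prompt`, hypothetical) whose output would be passed to the locally served model.

```python
import hashlib
import math
import re

DIM = 1024  # size of the toy embedding space (assumption, not from the paper)


def embed(text: str) -> list[float]:
    """Toy stand-in for a dense encoder: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9_]+", text.lower()):
        # Map each token to a fixed bucket so the same word always lands in the same slot.
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks with the highest cosine similarity to the query."""
    q = embed(query)
    # Vectors are unit length, so the dot product equals the cosine similarity.
    scored = [(sum(a * b for a, b in zip(q, embed(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Assemble a retrieval-augmented prompt (hypothetical template)."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical documentation snippets standing in for the ingested corpus.
DOCS = [
    "Submit a batch job to Slurm with sbatch and a job script.",
    "Launch MPI programs across nodes with srun or mpirun.",
    "Check GPU utilization on a node with nvidia-smi.",
]

if __name__ == "__main__":
    print(build_prompt("How do I submit a batch job with sbatch?", DOCS, k=1))
```

In a real deployment the `embed` function would be replaced by a trained sentence encoder and the list scan by a vector index; the retrieve-then-prompt structure stays the same.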
