Comparative Study of Large Language Model Architectures on Frontier
Junqi Yin, Avishek Bose, Guojing Cong, Isaac Lyngaas, Quentin Anthony

TL;DR
This study compares the GPT-NeoX and LLaMA architectures, both trained on the same materials science data on the Frontier supercomputer, and reports on their performance and efficiency to guide future LLM development on HPC systems.
Contribution
It provides a controlled comparison of two open-source GPT architectures on an HPC system, covering performance and efficiency, and proposes a new architecture design method.
Findings
Achieved state-of-the-art results on a materials science benchmark.
Compared the computational and energy efficiency of the two models.
Proposed a new efficient architecture design method.
Abstract
Large language models (LLMs) have garnered significant attention in both the AI community and beyond. Among these, the Generative Pre-trained Transformer (GPT) has emerged as the dominant architecture, spawning numerous variants. However, these variants have undergone pre-training under diverse conditions, including variations in input data, data preprocessing, and training methodologies, resulting in a lack of controlled comparative studies. Here we meticulously examine two prominent open-sourced GPT architectures, GPT-NeoX and LLaMA, leveraging the computational power of Frontier, the world's first Exascale supercomputer. Employing the same materials science text corpus and a comprehensive end-to-end pipeline, we conduct a comparative analysis of their training and downstream performance. Our efforts culminate in achieving state-of-the-art performance on a challenging materials…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
