TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
Jianling Li, Shangzhan Li, Zhenye Gao, Qi Shi, Yuxuan Li, Zefan Wang, Jiacheng Huang, Haojie Wang, Jianrong Wang, Xu Han, Zhiyuan Liu, Maosong Sun

TL;DR
TritonBench is a new benchmark designed to evaluate large language models' ability to generate efficient Triton GPU kernels, revealing current models' limitations in high-performance code generation.
Contribution
This work introduces TritonBench, the first comprehensive benchmark for Triton operator generation, combining correctness and efficiency evaluation on real-world and PyTorch-aligned operators.
Findings
Current LLMs struggle to generate efficient Triton code.
TritonBench provides a systematic evaluation framework.
Significant gap identified in high-performance Triton code generation.
Abstract
Triton, a high-level Python-like language designed for building efficient GPU kernels, is widely adopted in deep learning frameworks due to its portability, flexibility, and accessibility. However, programming and parallel optimization still require considerable trial and error from Triton developers. Despite advances in large language models (LLMs) for conventional code generation, these models struggle to generate accurate, performance-optimized Triton code, as they lack awareness of its specifications and the complexities of GPU programming. More critically, there is an urgent need for systematic evaluations tailored to Triton. In this work, we introduce TritonBench, the first comprehensive benchmark for Triton operator generation. TritonBench features two evaluation channels: a curated set of 184 real-world operators from GitHub and a collection of operators aligned with PyTorch…
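The abstract's pairing of correctness and efficiency evaluation can be illustrated with a minimal sketch. This is not TritonBench's actual harness: the function `evaluate_operator`, the toy softmax operators, and the tolerance and repeat settings are all hypothetical, and plain Python lists stand in for GPU tensors so the example runs without a GPU or Triton installed.

```python
import math
import time
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Operator = Callable[[Vector], Vector]


def evaluate_operator(
    candidate: Operator,
    reference: Operator,
    inputs: Sequence[Vector],
    repeats: int = 5,
    rel_tol: float = 1e-6,
) -> Tuple[bool, float]:
    """Hypothetical two-part check: (is_correct, speedup over reference)."""
    # Part 1: correctness. Every output must match the reference elementwise.
    for x in inputs:
        got, want = candidate(x), reference(x)
        if len(got) != len(want):
            return False, 0.0
        if not all(
            math.isclose(g, w, rel_tol=rel_tol, abs_tol=1e-9)
            for g, w in zip(got, want)
        ):
            return False, 0.0

    # Part 2: efficiency. Keep the best wall-clock time over several runs.
    def best_time(fn: Operator) -> float:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            for x in inputs:
                fn(x)
            best = min(best, time.perf_counter() - start)
        return best

    ref_t = best_time(reference)
    cand_t = best_time(candidate)
    speedup = ref_t / cand_t if cand_t > 0 else float("inf")
    return True, speedup


# Toy operators (assumptions for illustration only).
def reference_softmax(xs: Vector) -> Vector:
    exps = [math.exp(v) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]


def stable_softmax(xs: Vector) -> Vector:
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]


def buggy_softmax(xs: Vector) -> Vector:
    m = max(xs)
    return [math.exp(v - m) for v in xs]  # bug: missing normalization


sample_inputs = [[0.0, 1.0, 2.0], [3.0, 3.0, 3.0]]
ok, speedup = evaluate_operator(stable_softmax, reference_softmax, sample_inputs)
bad_ok, bad_speedup = evaluate_operator(buggy_softmax, reference_softmax, sample_inputs)
```

In this sketch a candidate that fails the correctness check is never timed, reflecting the idea that speed only matters for operators that produce the right result. A real GPU benchmark would additionally need device synchronization and warm-up runs before timing.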
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Handwritten Text Recognition Techniques · Natural Language Processing Techniques
Methods: Sparse Evolutionary Training
