AKG kernel Agent: A Multi-Agent Framework for Cross-Platform Kernel Synthesis
Jinye Du, Quan Yuan, Zuyao Zhang, Yanzhi Yi, Jiahui Hu, Wangyi Chen, Yiyang Zhu, Qishui Zheng, Wenxiang Zou, Xiangyu Chang, Zuohe Zheng, Zichun Ye, Chao Liu, Shanni Li, Renwei Zhang, Yiping Deng, Xinwei Hu, Xuefeng Jin, Jie Zhao

TL;DR
AKG kernel agent is an AI-driven multi-agent system that automates the creation, migration, and optimization of computational kernels across diverse hardware platforms, significantly speeding up AI workload development.
Contribution
The paper introduces a modular multi-agent framework that supports multiple DSLs and hardware targets for automated kernel generation, migration, and tuning, addressing the bottleneck of manual kernel optimization.
Findings
Achieves 1.46× speedup over PyTorch Eager baselines on KernelBench
Supports multiple DSLs including Triton, TileLang, CPP, CUDA-C
Demonstrates effective cross-platform kernel optimization
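For readers unfamiliar with how a figure such as the 1.46× speedup above is typically read, the sketch below shows one common way to compute a speedup ratio: baseline runtime divided by candidate runtime. It is an illustration only and not the paper's benchmarking harness. The helper names `median_runtime` and `speedup` are hypothetical, and the use of a median over repeated wall-clock runs is an assumption. Timing real GPU kernels would also require device synchronization, which is omitted here.

```python
import statistics
import time
from typing import Callable


def median_runtime(fn: Callable[[], object], warmup: int = 3, repeats: int = 10) -> float:
    """Return the median wall-clock time of fn() in seconds (hypothetical helper)."""
    for _ in range(warmup):
        fn()  # warm-up runs are discarded so one-time setup cost is not measured
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Speedup of the candidate over the baseline; values above 1.0 mean the candidate is faster."""
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("runtimes must be positive")
    return baseline_seconds / candidate_seconds
```

Under this definition, a generated kernel that runs in 1.0 ms against a 1.46 ms baseline has a speedup of 1.46. How the reported number is aggregated across KernelBench tasks is not stated in this listing.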
Abstract
Modern AI models demand high-performance computation kernels. The growing complexity of LLMs, multimodal architectures, and recommendation systems, combined with techniques like sparsity and quantization, creates significant computational challenges. Moreover, frequent hardware updates and diverse chip architectures further complicate this landscape, requiring tailored kernel implementations for each platform. However, manual optimization cannot keep pace with these demands, creating a critical bottleneck in AI system development. Recent advances in LLM code generation capabilities have opened new possibilities for automating kernel development. In this work, we propose AKG kernel agent (AI-driven Kernel Generator), a multi-agent system that automates kernel generation, migration, and performance tuning. AKG kernel agent is designed to support multiple domain-specific languages (DSLs),…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Neural Network Applications · Embedded Systems Design Techniques
