Understanding LLM-Centric Challenges for Deep Learning Frameworks: An Empirical Analysis
Yanzhou Mu, Rong Wang, Juan Zhai, Chunrong Fang, Xiang Chen, Jiacong Wu, An Guo, Jiawei Shen, Bingzhuo Li, Zhenyu Chen

TL;DR
This paper investigates the unique challenges faced by deep learning frameworks in supporting large language models, analyzing issue reports and user feedback to identify technical gaps and mismatches in requirements.
Contribution
It provides the first comprehensive taxonomy of LLM-centric framework bugs, user needs, and challenges through empirical analysis and interviews.
Findings
Identifies key technical challenges in DL frameworks for LLMs.
Highlights mismatches between user needs and developer priorities.
Provides insights to improve framework usability and support for LLMs.
Abstract
Large language models (LLMs) have driven significant progress across a wide range of real-world applications. Realizing such models requires substantial system-level support. Deep learning (DL) frameworks provide this foundation by enabling efficient model construction, distributed execution, and optimized deployment. The large parameter scale and extended execution cycles impose exacting demands on deep learning frameworks, particularly in terms of scalability, stability, and efficiency. Therefore, poor usability, limited functionality, and subtle bugs in DL frameworks may hinder development efficiency and cause severe failures or resource waste. However, a fundamental question has not been thoroughly investigated in previous studies, i.e., what challenges do DL frameworks face in supporting LLMs? To answer this question, we analyze issue reports from three major DL frameworks (i.e.,…
Taxonomy
Topics: Mathematics, Computing, and Information Processing
