GCoder: Improving Large Language Model for Generalized Graph Problem Solving
Qifan Zhang, Xiaobin Hong, Jianheng Tang, Nuo Chen, Yuhan Li, Wenzhong Li, Jing Tang, Jia Li

TL;DR
GCoder is a novel code-based large language model that significantly improves generalized graph problem-solving capabilities, handling diverse formats and large-scale graphs more effectively than previous reasoning-based approaches.
Contribution
Introduces GCoder, a code-based LLM trained on a diverse graph dataset with multi-stage fine-tuning, enhancing generalization and scalability in graph problem-solving.
Findings
Outperforms GPT-4o, with an average accuracy gain of 16.42% across graph tasks.
Handles large graphs with millions of nodes efficiently.
Overcomes the limitations of the reasoning-steps paradigm in graph computation.
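To illustrate what solving a graph problem with code rather than step-by-step textual reasoning can look like, here is a minimal sketch. It is not output from GCoder or code from the paper. The function name `shortest_path_length`, the edge-list input format, and the sample graph are all assumptions chosen for illustration.

```python
from collections import deque


def shortest_path_length(edges, source, target):
    """Return the number of edges on a shortest path between source and
    target in an undirected, unweighted graph, or -1 if unreachable."""
    # Build an adjacency list from the (hypothetical) edge-list format.
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)

    if source == target:
        return 0

    # Standard breadth-first search: the first time we reach the target,
    # we have found a shortest path.
    visited = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor == target:
                return dist + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    return -1


# Illustrative graph: two routes from 0 to 3, of lengths 3 and 2.
edges = [(0, 1), (1, 2), (2, 3), (0, 4), (4, 3)]
print(shortest_path_length(edges, 0, 3))  # prints 2
```

Because the answer comes from executing a program, the result can be checked by running it, and the running time depends on the size of the graph rather than on how many reasoning steps a model can sustain.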
Abstract
Large Language Models (LLMs) have demonstrated strong reasoning abilities, making them suitable for complex tasks such as graph computation. The traditional reasoning-steps paradigm for graph problems is hindered by unverifiable steps, limited long-term reasoning, and poor generalization to graph variations. To overcome these limitations, we introduce GCoder, a code-based LLM designed to enhance problem-solving in generalized graph computation problems. Our method involves constructing an extensive training dataset, GraphWild, featuring diverse graph formats and algorithms. We employ a multi-stage training process, including Supervised Fine-Tuning (SFT) and Reinforcement Learning from Compiler Feedback (RLCF), to refine model capabilities. For unseen tasks, a hybrid retrieval technique is used to augment performance. Experiments demonstrate that GCoder outperforms GPT-4o, with an average…
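The abstract does not spell out how compiler feedback is turned into a training signal. As a rough sketch of the general idea only, the snippet below runs a candidate program and scores it by whether it executes and whether its output matches a reference answer. The function name `execution_feedback`, the three score values, and the use of the Python interpreter's exit status as a stand-in for compiler feedback are all assumptions, not details from the paper.

```python
import subprocess
import sys


def execution_feedback(program: str, expected_output: str,
                       timeout: float = 5.0) -> float:
    """Score a candidate program (hypothetical scheme):
    -1.0 if it crashes or times out, 0.0 if it runs but prints the
    wrong answer, 1.0 if its output matches the expected answer."""
    try:
        # Run the candidate in a separate interpreter process so that
        # errors in it cannot crash the caller.
        result = subprocess.run(
            [sys.executable, "-c", program],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return -1.0

    if result.returncode != 0:
        return -1.0
    if result.stdout.strip() == expected_output.strip():
        return 1.0
    return 0.0


# Illustrative candidates for a task whose reference answer is "2".
print(execution_feedback("print(1 + 1)", "2"))
print(execution_feedback("print(3)", "2"))
print(execution_feedback("raise ValueError('bug')", "2"))
```

A signal of this kind is only available because the model's answer is a program: each candidate can be executed and checked automatically, which is not possible for free-form reasoning steps.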
Taxonomy
Topics: Graph Theory and Algorithms
