Towards Mitigating API Hallucination in Code Generated by LLMs with Hierarchical Dependency Aware
Yujia Chen, Mingyu Chen, Cuiyun Gao, Zhihan Jiang, Zhongqi Li, Yuchi Ma

TL;DR
This paper introduces MARIN, a hierarchical dependency-aware framework that reduces API hallucination in LLM-generated code by analyzing project dependencies and constraining generation. It is validated on a new benchmark and on proprietary projects.
Contribution
The paper presents MARIN, a framework that combines dependency analysis with constrained decoding to mitigate API hallucination in LLM-generated code, and that outperforms retrieval-based methods.
Findings
Across six LLMs, MARIN reduces API hallucination by over 67% on the MiHN metric and 73% on the MaHR metric.
On proprietary projects, MARIN reduces hallucinations by over 57%.
The new benchmark APIHulBench and metrics MiHN and MaHR effectively evaluate API hallucination.
Abstract
Application Programming Interfaces (APIs) are crucial in modern software development. Large Language Models (LLMs) assist in automated code generation but often struggle with API hallucination, including invoking non-existent APIs and misusing existing ones in practical development scenarios. Existing studies resort to Retrieval-Augmented Generation (RAG) methods to mitigate the hallucination issue, but these tend to fail because they generally ignore the structural dependencies in practical projects and do not actually validate whether the generated APIs are available. To address these limitations, we propose MARIN, a framework for mitigating API hallucination in code generated by LLMs with hierarchical dependency awareness. MARIN consists of two phases: Hierarchical Dependency Mining, which analyzes local and global dependencies of the current function, aiming to supplement comprehensive…
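To make the idea of constraining generation to available APIs concrete, here is a minimal sketch. It is not MARIN's actual algorithm, which the truncated abstract above does not fully describe. It assumes a hypothetical set of valid API names already mined from the project, and a toy stand-in for model output in which each decoding step is a dictionary of candidate tokens and scores. The names build_prefix_set and constrained_greedy are illustrative only.

```python
from typing import Dict, Iterable, Optional, Set


def build_prefix_set(valid_apis: Iterable[str]) -> Set[str]:
    """Collect every non-empty prefix of every valid API name."""
    prefixes: Set[str] = set()
    for name in valid_apis:
        for end in range(1, len(name) + 1):
            prefixes.add(name[:end])
    return prefixes


def constrained_greedy(
    step_scores: Iterable[Dict[str, float]],
    valid_apis: Set[str],
) -> Optional[str]:
    """Greedy decoding that keeps only tokens extending a valid API prefix.

    step_scores is a hypothetical stand-in for per-step model scores.
    Returns None if no candidate at some step leads to a known API.
    """
    prefixes = build_prefix_set(valid_apis)
    text = ""
    for scores in step_scores:
        # Discard any token that would leave the space of known API names.
        allowed = {tok: s for tok, s in scores.items() if text + tok in prefixes}
        if not allowed:
            return None
        text += max(allowed, key=lambda tok: allowed[tok])
    return text if text in valid_apis else None


# Assumed project APIs and assumed model scores, for illustration only.
valid = {"client.get_user", "client.get_order"}
steps = [
    {"client.": 0.9},
    {"fetch_": 0.6, "get_": 0.4},   # unconstrained greedy would pick "fetch_"
    {"user": 0.7, "usr": 0.3},
]
result = constrained_greedy(steps, valid)
```

In this toy run, the higher-scoring token "fetch_" is rejected because "client.fetch_" is not a prefix of any known API, so decoding falls back to "get_" and ends on an API that exists in the assumed project index. A real system would operate on tokenizer IDs and model logits rather than strings.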
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Advanced Software Engineering Methodologies
