SemanticForge: Repository-Level Code Generation through Semantic Knowledge Graphs and Constraint Satisfaction
Wuyang Zhang, Chenkai Zhang, Zhen Luo, Jianming Ma, Wangming Yuan, Chuqiao Gu, Chenwei Feng

TL;DR
SemanticForge enhances code generation by integrating repository-wide semantic knowledge graphs with advanced algorithms, significantly reducing logical and schematic errors in large language model outputs.
Contribution
The paper introduces four algorithms for semantically-aware code generation: reconciling static and dynamic knowledge graphs into unified program semantics, generating structured graph queries from natural language, checking constraints in real time during generation, and maintaining the knowledge graph incrementally.
Findings
Achieved 73% precision in graph query generation from natural language.
Developed a real-time constraint verification method during code generation.
Maintained semantic equivalence with $O(|\triangle R| \times \log n)$ update time.
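One way to read the update bound, as a sketch under stated assumptions: this summary does not define the symbols, so assume $\triangle R$ is the set of repository elements touched by a change, $n$ is the number of nodes in the knowledge graph, and each touched element costs one $O(\log n)$ index operation.

```latex
T_{\text{update}} = \sum_{e \in \triangle R} O(\log n) = O(|\triangle R| \times \log n)
```

Under these assumptions the update cost grows with the size of the change rather than with the size of the repository.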
Abstract
Large language models (LLMs) have transformed software development by enabling automated code generation, yet they frequently suffer from systematic errors that limit practical deployment. We identify two critical failure modes: logical hallucination (incorrect control/data-flow reasoning) and schematic hallucination (type mismatches, signature violations, and architectural inconsistencies). These errors stem from the absence of explicit, queryable representations of repository-wide semantics. This paper presents SemanticForge, which introduces four fundamental algorithmic advances for semantically-aware code generation: (1) a novel automatic reconciliation algorithm for dual static-dynamic knowledge graphs, unifying compile-time and runtime program semantics; (2) a neural approach that learns to generate structured graph queries from natural language,…
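To make "schematic hallucination" concrete, here is a minimal sketch of detecting signature violations by checking generated code against a table of known signatures. It is not the paper's algorithm: it is a post-hoc check rather than verification during generation, and the `SIGNATURES` table, the function names in it, and `check_calls` are all hypothetical stand-ins for facts a repository-wide knowledge graph would supply.

```python
import ast

# Hypothetical repository facts: function name -> ordered parameter names.
# A real system would extract these from the codebase; here they are hard-coded.
SIGNATURES = {
    "load_user": ["user_id"],
    "send_email": ["recipient", "subject", "body"],
}


def check_calls(source: str) -> list[str]:
    """Return signature violations found in calls inside `source`."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        # Only plain-name calls such as f(x) are checked in this sketch.
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            continue
        # Calls using *args or **kwargs cannot be checked statically here.
        if any(isinstance(a, ast.Starred) for a in node.args):
            continue
        if any(kw.arg is None for kw in node.keywords):
            continue
        name = node.func.id
        if name not in SIGNATURES:
            violations.append(f"unknown function: {name}")
            continue
        params = SIGNATURES[name]
        if len(node.args) > len(params):
            violations.append(f"{name}: too many positional arguments")
            continue
        # Parameters already bound by positional arguments.
        bound = set(params[: len(node.args)])
        for kw in node.keywords:
            if kw.arg not in params:
                violations.append(f"{name}: unexpected keyword '{kw.arg}'")
            elif kw.arg in bound:
                violations.append(f"{name}: duplicate argument '{kw.arg}'")
            else:
                bound.add(kw.arg)
        missing = [p for p in params if p not in bound]
        if missing:
            violations.append(f"{name}: missing arguments {missing}")
    return violations


if __name__ == "__main__":
    generated = "send_email('a@example.com', subject='hi')"
    print(check_calls(generated))
```

The sketch treats every parameter as required and ignores types, defaults, and method calls; a fuller checker would draw those from the graph as well.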
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Model-Driven Software Engineering Techniques
