HDLxGraph: Bridging Large Language Models and HDL Repositories via HDL Graph Databases

Pingqing Zheng; Jiayin Qin; Fuqi Zhang; Niraj Chitla; Zishen Wan; Shang Wu; Yu Cao; Caiwen Ding; Yang (Katie) Zhao

arXiv:2505.15701 · cs.AR · March 10, 2026

TL;DR

HDLxGraph enhances large language model tasks on HDL code repositories by integrating graph structures like ASTs and DFGs, significantly improving search and debugging accuracy over existing methods.

Contribution

This paper introduces HDLxGraph, the first framework combining HDL graph structures with RAGs, and presents HDLSearch, a new HDL code benchmark dataset.

Findings

01. HDLxGraph improves search accuracy by over 12%.

02. HDLxGraph enhances debugging and completion tasks.

03. HDLSearch provides a new benchmark for HDL code retrieval.

Abstract

Retrieval Augmented Generation (RAG) is an essential agent for Large Language Model (LLM)-aided Hardware Description Language (HDL) tasks, addressing the challenges of limited training data and prohibitively long prompts. However, its performance in handling ambiguous queries and real-world, repository-level HDL projects containing thousands or even tens of thousands of lines of code remains limited. Our analysis demonstrates two fundamental mismatches, structural and vocabulary, between conventional semantic similarity-based RAGs and HDL code. To this end, we propose HDLxGraph, the first framework that integrates the inherent graph characteristics of HDLs with RAGs for LLM-assisted tasks. Specifically, HDLxGraph incorporates Abstract Syntax Trees (ASTs) to capture HDLs' hierarchical structures and Data Flow Graphs (DFGs) to address the vocabulary mismatch. In addition, to overcome the lack of…

Code & Models

Repositories

nick-zheng-q/hdlxgraph
Official

Taxonomy

Topics: Natural Language Processing Techniques · Machine Learning and Algorithms · Topic Modeling

Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Softmax · Attention Dropout · WordPiece · Linear Layer · Residual Connection · Byte Pair Encoding · Weight Decay