HLStrans: Dataset for C-to-HLS Hardware Code Synthesis

Qingyun Zou; Nuo Chen; Yao Chen; Bingsheng He; WengFei Wong

arXiv:2507.04315·cs.AR·December 5, 2025


Open Access · 3 Reviews

TL;DR

HLStrans is a large-scale, structured dataset designed to advance LLM-driven C-to-HLS hardware code synthesis, enabling better automation and optimization of FPGA design transformations.

Contribution

The paper introduces HLStrans, the first extensive dataset with paired C and HLS code, testbenches, and annotations, facilitating research in LLM-based hardware synthesis.

Findings

01 · Retrieval and fine-tuning improve LLM success rates.

02 · Automated augmentation enhances dataset quality.

03 · Benchmark results show current LLMs benefit from the dataset.

Abstract

High-Level Synthesis (HLS) enables hardware design from C/C++ kernels but requires extensive transformations, such as restructuring code, inserting pragmas, adapting data types, and repairing non-synthesizable constructs, to achieve efficient FPGA implementations. While large language models (LLMs) show promise in automating these transformations, progress has been limited by the absence of large-scale, well-structured datasets. Existing HLS datasets focus primarily on resource estimation, lack paired C and HLS examples with testbenches, and cover only a narrow set of optimizations. We introduce HLStrans, the first benchmark-scale dataset for LLM-driven C-to-HLS synthesis. HLStrans contains over 124K paired C and HLS programs for real-world applications, with full testbenches and synthesis-based annotations of latency and resource usage. The dataset systematically captures five…
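To make the transformations named in the abstract concrete, the following is a minimal sketch of a plain C-style kernel next to a hand-written HLS-oriented counterpart, plus a tiny testbench that checks the two agree. It is not taken from HLStrans. The kernel, the function names (`dot_baseline`, `dot_hls`, `testbench_passes`), the sizes, and the pragma choices are all illustrative assumptions. The pragmas follow Vitis HLS-style spelling, which is also an assumption, since the excerpt does not name the tool and exact option syntax varies between tool versions. An ordinary C++ compiler ignores them, possibly with an unknown-pragma warning.

```cpp
#include <cassert>
#include <cstdint>

constexpr int N = 64;      // illustrative problem size (assumption)
constexpr int UNROLL = 4;  // illustrative unroll/partition factor (assumption)
static_assert(N % UNROLL == 0, "N must be a multiple of UNROLL");

// Baseline: a plain C-style dot product with a single accumulator.
int32_t dot_baseline(const int16_t a[N], const int16_t b[N]) {
    int32_t acc = 0;
    for (int i = 0; i < N; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}

// HLS-oriented variant: the loop is restructured into UNROLL independent
// partial sums, and pragmas (Vitis HLS-style spelling, assumed) request
// array partitioning, pipelining, and unrolling.
int32_t dot_hls(const int16_t a[N], const int16_t b[N]) {
#pragma HLS array_partition variable=a type=cyclic factor=4 dim=1
#pragma HLS array_partition variable=b type=cyclic factor=4 dim=1
    int32_t partial[UNROLL] = {0, 0, 0, 0};
#pragma HLS array_partition variable=partial type=complete dim=1

    for (int i = 0; i < N; i += UNROLL) {
#pragma HLS pipeline II=1
        for (int j = 0; j < UNROLL; ++j) {
#pragma HLS unroll
            partial[j] += a[i + j] * b[i + j];
        }
    }

    // Final reduction of the partial sums.
    int32_t acc = 0;
    for (int j = 0; j < UNROLL; ++j) {
        acc += partial[j];
    }
    return acc;
}

// Tiny testbench: feed both versions the same inputs and compare outputs.
bool testbench_passes() {
    int16_t a[N];
    int16_t b[N];
    for (int i = 0; i < N; ++i) {
        a[i] = static_cast<int16_t>(i - 32);
        b[i] = static_cast<int16_t>(3 * i + 1);
    }
    return dot_baseline(a, b) == dot_hls(a, b);
}
```

Calling `testbench_passes()` from a host-side `main` plays the role of the functional check that a paired testbench provides. The reordered reduction is safe here because integer addition is associative; with floating-point data the restructured version could differ in the last bits, and a testbench would need a tolerance. Whether the pragmas actually reduce latency can only be confirmed by running synthesis, which this sketch does not do.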

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 3

Strengths

1. It proposes the first large-scale benchmark for C-to-HLS transformation, filling a missing piece in the LLM-assisted EDA field.
2. The automated augmentation framework demonstrates good soundness, and the integration of MCTS and DSE may provide insights for other LLM-assisted EDA tasks.
3. It conducts comprehensive benchmarking of multiple models and prompting strategies using meaningful synthesis-level metrics.

Weaknesses

1. The major concern lies in the limited generalizable insights and scientific contributions this work provides to the community. Although it successfully demonstrates the application of LLMs, MCTS, and DSE tools in crafting an EDA dataset, it remains unclear how the findings can generalize to other problems or advance the broader fields of LLM and EDA research.
2. There already exist LLM4EDA datasets created through strategic prompting or design space search. The authors are expected to provid

Reviewer 02 · Rating 2 · Confidence 5

Strengths

* Datasets for HLS are important and urgently needed for the EDA community.
* Well-written background about HLS toolflows.

Weaknesses

* This work evaluates only a single HLS tool. How would it fare with other HLS tools, such as Bambu HLS, Cadence Stratus HLS, and Siemens Catapult HLS?
* The paper does not analyze the causes of functional failures and synthesis failures. Are they caused by the same problem? It would be helpful to classify the failures and break down their relative importance across the benchmarks.
* This work tries to cover the whole HLS flow but omits ablation studies for the individual HLS pipeline stages.

Reviewer 03 · Rating 4 · Confidence 4

Strengths

1. HLStrans provides a benchmark-scale, well-structured dataset for C-to-HLS transformation that includes paired pre/post-HLS code, testbenches, and synthesis-based resource/latency feedback. Compared to previous datasets (Table 1), HLStrans has greater diversity, size, and supports a broader range of transformation tasks.
2. Figure 1 concretely illustrates the transformation space covered (T1–T5), moving beyond mere pragma insertion (the focus of most prior works) to include code restructuring

Weaknesses

1. While empirical results are solid, there is a lack of explicit theoretical analysis linking the diversity/scope of the dataset to expected generalization properties of LLMs trained/fine-tuned on it. There is little quantitative discussion on statistical diversity or representativeness of the underlying C/HLS patterns, which is crucial if the dataset is to become a benchmark standard.
2. The paper overlooks Wan et al. (2024), which introduced the Chrysalis dataset — an LLM-aided framework for


Taxonomy

Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · VLSI and Analog Circuit Testing