LLM-FSM: Scaling Large Language Models for Finite-State Reasoning in RTL Code Generation

Yuheng Wu; Berk Gokmen; Zhouhua Xie; Peijing Li; Caroline Trippel; Priyanka Raina; Thierry Tambe

arXiv:2602.07032 · cs.AI · February 10, 2026

Open Access

TL;DR

This paper introduces LLM-FSM, a new benchmark for evaluating large language models' ability to understand and generate finite-state machine behavior from natural language, with a focus on hardware RTL code generation.

Contribution

The paper presents LLM-FSM, an automated, scalable benchmark for finite-state reasoning in RTL code generation, and analyzes how LLM performance changes with FSM complexity, supervised fine-tuning, and test-time compute scaling.

Findings

01. LLMs' accuracy declines with increased FSM complexity.

02. Supervised fine-tuning improves out-of-distribution generalization.

03. Test-time compute scaling enhances reasoning reliability.

Abstract

Finite-state reasoning, the ability to understand and implement state-dependent behavior, is central to hardware design. In this paper, we present LLM-FSM, a benchmark that evaluates how well large language models (LLMs) can recover finite-state machine (FSM) behavior from natural-language specifications and translate it into correct register-transfer-level (RTL) implementations. Unlike prior specification-to-RTL benchmarks that rely on manually constructed examples, LLM-FSM is built through a fully automated pipeline. LLM-FSM first constructs FSMs with configurable state counts and constrained transition structures. It then prompts LLMs to express each FSM in a structured YAML format with an application context, and to further convert that YAML into a natural-language (NL) specification. From the same YAML, our pipeline synthesizes the reference RTL and testbench in a…
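
To make the first pipeline step concrete, here is a minimal Python sketch of generating an FSM with a configurable state count under one transition constraint. It is not the authors' generator: the 1-bit input alphabet, the reachability constraint, and the names `make_random_fsm`, `reachable_states`, and `run_fsm` are assumptions made only for illustration.

```python
import random
from typing import Dict, List, Set, Tuple

# (current_state, input_bit) -> next_state; a 1-bit input is an assumption.
Transitions = Dict[Tuple[int, int], int]


def make_random_fsm(num_states: int, seed: int = 0) -> Transitions:
    """Build a random FSM in which every state is reachable from state 0."""
    rng = random.Random(seed)
    transitions: Transitions = {}
    for state in range(num_states):
        # Illustrative constraint: one input bit always advances along the
        # chain 0 -> 1 -> ... -> n-1 -> 0, which guarantees reachability.
        chain_bit = rng.randint(0, 1)
        transitions[(state, chain_bit)] = (state + 1) % num_states
        # The other input bit goes to an arbitrary state.
        transitions[(state, 1 - chain_bit)] = rng.randrange(num_states)
    return transitions


def reachable_states(transitions: Transitions, start: int = 0) -> Set[int]:
    """Return all states reachable from `start` by graph traversal."""
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for bit in (0, 1):
            nxt = transitions[(state, bit)]
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def run_fsm(transitions: Transitions, inputs: List[int], start: int = 0) -> List[int]:
    """Simulate the FSM and return the visited states, including the start."""
    state = start
    trace = [state]
    for bit in inputs:
        state = transitions[(state, bit)]
        trace.append(state)
    return trace


if __name__ == "__main__":
    fsm = make_random_fsm(num_states=5, seed=1)
    print(sorted(reachable_states(fsm)))
    print(run_fsm(fsm, [0, 1, 1, 0]))
```

A transition table like this could serve as the single source from which both a specification and a reference simulation are derived, which is the role the abstract assigns to the YAML description; the trace from `run_fsm` plays the part of a simple golden reference in this sketch.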

Taxonomy

Topics: Embedded Systems Design Techniques · Formal Methods in Verification · Parallel Computing and Optimization Techniques