ReacTOD: Bounded Neuro-Symbolic Agentic NLU for Zero-Shot Dialogue State Tracking
Yanjun Lin, Zimo Xiao, Kartik Natarajan, Mahesh Sankaranarayanan, Niraj Nawanit, Rakshit Parashar, Austin Zhang, Karthik Konaraddi, Rishita Mote, Wei Niu

TL;DR
ReacTOD introduces a neuro-symbolic, self-correcting approach for zero-shot dialogue state tracking, significantly improving accuracy and generalization in task-oriented dialogue systems.
Contribution
The paper presents ReacTOD, a bounded neuro-symbolic architecture with a self-correcting ReAct loop and symbolic validation, achieving state-of-the-art zero-shot performance without task-specific training.
Findings
Up to 9.3 percentage points of accuracy improvement over single-pass inference on MultiWOZ.
93.1% self-correction rate on intercepted errors.
Achieves new zero-shot SOTA on MultiWOZ 2.1 with 52.71% JGA.
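JGA (joint goal accuracy) is conventionally the fraction of dialogue turns whose predicted state matches the reference state on every slot. The following is a minimal sketch of that definition, assuming states are plain slot-to-value dictionaries. The function name and the example data are illustrative and not taken from the paper, and published MultiWOZ evaluation scripts typically apply value normalization that is omitted here.

```python
from typing import Dict, List

State = Dict[str, str]  # slot name -> value, e.g. {"hotel-area": "north"}


def joint_goal_accuracy(predicted: List[State], gold: List[State]) -> float:
    """Fraction of turns whose predicted state equals the gold state exactly."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same turns")
    if not gold:
        return 0.0
    # A turn counts only if every slot matches; partial credit is not given.
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


# Illustrative data only: the second turn has one wrong slot value.
gold_states = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
    {"hotel-area": "north", "hotel-stars": "4", "hotel-parking": "yes"},
]
predicted_states = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "3"},
    {"hotel-area": "north", "hotel-stars": "4", "hotel-parking": "yes"},
]
jga = joint_goal_accuracy(predicted_states, gold_states)
```

Because a single wrong slot fails the whole turn, JGA is a strict metric, which is why improvements are best read in percentage points.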
Abstract
Task-oriented dialogue systems -- handling transactions, reservations, and service requests -- require predictable behavior, yet the moderately-sized LLMs needed for practical latency are prone to hallucination and format errors that cascade into incorrect actions (e.g., a hotel booked for the wrong date). We propose ReacTOD, a bounded neuro-symbolic architecture that reformulates NLU as discrete tool calls within a self-correcting ReAct loop governed by deterministic validation. A bounded ReAct loop enables iterative self-correction, improving accuracy by up to 9.3 percentage points over single-pass inference on MultiWOZ. A symbolic validator enforces action compliance, schema conformance, and coreference consistency on every dialogue state update, achieving a 93.1% self-correction rate on intercepted errors and producing structured execution traces. Incremental state prediction and…
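The idea of a bounded, validator-governed correction loop can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' implementation: the schema, the `validate` and `bounded_update` functions, and the stub proposer standing in for the LLM are all hypothetical, and only schema conformance is checked (the abstract also names action compliance and coreference consistency, which are not modeled here).

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, str]

# Hypothetical schema: each slot maps to its set of allowed values.
SCHEMA = {
    "hotel-area": {"north", "south", "east", "west", "centre"},
    "hotel-stars": {"3", "4", "5"},
}


def validate(update: State) -> List[str]:
    """Deterministic check; returns a list of error messages (empty if valid)."""
    errors = []
    for slot, value in update.items():
        if slot not in SCHEMA:
            errors.append(f"unknown slot: {slot}")
        elif value not in SCHEMA[slot]:
            errors.append(f"invalid value {value!r} for {slot}")
    return errors


def bounded_update(
    propose: Callable[[List[str]], State],
    state: State,
    max_steps: int = 3,
) -> Tuple[State, List[dict]]:
    """Ask the proposer for a state update, retrying at most max_steps times.

    Validator errors are fed back to the proposer on each retry. If the
    budget runs out, the previous state is returned unchanged.
    """
    feedback: List[str] = []
    trace: List[dict] = []
    for step in range(max_steps):
        update = propose(feedback)
        errors = validate(update)
        trace.append({"step": step, "update": update, "errors": errors})
        if not errors:
            return {**state, **update}, trace
        feedback = errors
    return dict(state), trace


def make_stub_proposer() -> Callable[[List[str]], State]:
    """Stand-in for an LLM: first emits a malformed value, then a valid one."""
    attempts = iter([{"hotel-area": "nrth"}, {"hotel-area": "north"}])

    def propose(feedback: List[str]) -> State:
        return next(attempts)

    return propose


new_state, trace = bounded_update(make_stub_proposer(), {"hotel-stars": "4"})
```

In this sketch the first proposal is intercepted by the validator and the second is accepted, so `trace` records both attempts. The fixed `max_steps` budget is what keeps the loop bounded: an update that never validates is discarded rather than written into the dialogue state.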
