PyTOD: Programmable Task-Oriented Dialogue with Execution Feedback
Alexandru Coca, Bo-Hsiang Tseng, Pete Boothroyd, Jianpeng Cheng, Mark Gaynor, Zhenxing Zhang, Joe Stacey, Tristan Guigue, Héctor Martinez Alonso, Diarmuid Ó Séaghdha, Anders Johannsen

TL;DR
PyTOD introduces a programmable dialogue agent that generates executable code for state tracking, leveraging execution feedback and constrained decoding to improve accuracy and robustness in task-oriented dialogues.
Contribution
PyTOD uses a language model to generate executable code for dialogue state tracking, achieving state-of-the-art results on the SGD benchmark with a simple, flexible approach.
Findings
Achieves state-of-the-art state tracking performance on the SGD benchmark.
Outperforms strong baselines in both accuracy and robust user goal estimation as the dialogue progresses.
Demonstrates the effectiveness of execution-aware state tracking.
Abstract
Programmable task-oriented dialogue (TOD) agents enable language models to follow structured dialogue policies, but their effectiveness hinges on accurate state tracking. We present PyTOD, an agent that generates executable code to track dialogue state and uses policy and execution feedback for efficient error correction. To this end, PyTOD employs a simple constrained decoding approach, using a language model instead of grammar rules to follow API schemata. This leads to state-of-the-art state tracking performance on the challenging SGD benchmark. Our experiments show that PyTOD surpasses strong baselines in both accuracy and robust user goal estimation as the dialogue progresses, demonstrating the effectiveness of execution-aware state tracking.
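The idea of tracking state through executable code with execution feedback can be illustrated with a minimal sketch. This is not the authors' implementation: the `find_restaurants` API, the `execute_turn` helper, and the state-merging rule are all hypothetical, and the "generated" code strings are hard-coded where a language model would normally produce them.

```python
# Hypothetical API schema: a plain Python function stands in for a service API.
def find_restaurants(city=None, cuisine=None):
    """Toy API that simply echoes the slot values it was called with."""
    return {"city": city, "cuisine": cuisine}


def execute_turn(code, state):
    """Run one generated expression and return (new_state, feedback).

    feedback is None on success, or an error string that a model
    could condition on to correct its next attempt (an assumption here).
    """
    # Expose only the known API to the generated code; no builtins.
    env = {"find_restaurants": find_restaurants, "__builtins__": {}}
    try:
        result = eval(code, env)
        # Merge the non-empty slot values into the running dialogue state.
        updates = {k: v for k, v in result.items() if v is not None}
    except Exception as err:
        # Execution feedback: leave the state unchanged and report the error.
        return state, f"{type(err).__name__}: {err}"
    return {**state, **updates}, None


state = {}
# Turn 1: the user names a city.
state, feedback = execute_turn('find_restaurants(city="Cambridge")', state)
# Turn 2: the generated code uses a slot name that is not in the schema.
state, feedback = execute_turn('find_restaurants(location="Cambridge")', state)
bad_call_feedback = feedback  # a TypeError message naming the bad argument
# Retry with a schema-conforming call after seeing the feedback.
state, feedback = execute_turn('find_restaurants(cuisine="thai")', state)
```

In this toy setup, a call that violates the schema fails at execution time and leaves the tracked state untouched, which is the kind of signal an execution-aware tracker could use for error correction.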
Taxonomy
Topics: Speech and dialogue systems · Topic Modeling · Multimodal Machine Learning Applications
