DFEE: Interactive DataFlow Execution and Evaluation Kit

Han He; Song Feng; Daniele Bonadiman; Yi Zhang; Saab Mansour

arXiv:2212.08099 · cs.CL · December 19, 2022

TL;DR

DFEE is an interactive toolkit for executing, visualizing, and benchmarking DataFlow-based semantic parsers in dialogue systems, supporting complex tasks like event scheduling and providing diagnostic tools for developers.

Contribution

The paper introduces DFEE, a toolkit for DataFlow execution and evaluation, together with a new benchmark and success metric for complex dialogue tasks.

Findings

1. Supports execution and visualization of DataFlow in dialogue tasks
2. Provides diagnostic tools for semantic parser analysis
3. Introduces a new benchmark and success metric for event scheduling

Abstract

DataFlow has been emerging as a new paradigm for building task-oriented chatbots due to its expressive semantic representations of dialogue tasks. Despite the availability of a large dataset, SMCalFlow, and a simplified syntax, the development and evaluation of DataFlow-based chatbots remain challenging due to the system complexity and the lack of downstream toolchains. In this demonstration, we present DFEE, an interactive DataFlow Execution and Evaluation toolkit that supports execution, visualization, and benchmarking of semantic parsers given dialogue input and a backend database. We demonstrate the system via a complex dialogue task: event scheduling that involves temporal reasoning. DFEE also supports diagnosing the parsing results via a friendly interface that allows developers to examine the dynamic DataFlow and the corresponding execution results. To illustrate how to benchmark SoTA…
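To make the idea of "executing" a DataFlow program against a backend database more concrete, here is a minimal toy sketch in Python. It is not DFEE's API or the SMCalFlow syntax: the operation names (`DateTime`, `Minutes`, `Events`, `FirstFreeSlot`, `CreateEvent`), the `("ref", i)` convention, and the in-memory `EVENTS` list are all hypothetical, chosen only to illustrate a program as a sequence of function-call nodes whose arguments may refer to earlier results, with a small piece of temporal reasoning for event scheduling.

```python
from datetime import datetime, timedelta

# Hypothetical backend "database": existing events as (start, end) pairs.
EVENTS = [
    (datetime(2022, 12, 19, 9, 0), datetime(2022, 12, 19, 10, 0)),
    (datetime(2022, 12, 19, 10, 0), datetime(2022, 12, 19, 11, 30)),
]


def first_free_slot(events, earliest, duration):
    """Return the earliest start >= `earliest` at which `duration` fits."""
    start = earliest
    for ev_start, ev_end in sorted(events):
        if start + duration <= ev_start:
            break  # the gap before this event is large enough
        start = max(start, ev_end)  # otherwise skip past this event
    return start


# Hypothetical operation table (assumed names, not taken from DFEE).
OPS = {
    "DateTime": lambda *parts: datetime(*parts),
    "Minutes": lambda n: timedelta(minutes=n),
    "Events": lambda: EVENTS,
    "FirstFreeSlot": first_free_slot,
    "CreateEvent": lambda start, duration: (start, start + duration),
}


def is_ref(arg):
    """An argument of the form ("ref", i) points at the result of node i."""
    return isinstance(arg, tuple) and len(arg) == 2 and arg[0] == "ref"


def execute(program):
    """Evaluate nodes in order and keep every intermediate result.

    Keeping the full trace is what lets a developer inspect each node's
    value, which is the kind of diagnosis the abstract describes.
    """
    results = []
    for op, args in program:
        resolved = [results[a[1]] if is_ref(a) else a for a in args]
        results.append(OPS[op](*resolved))
    return results


# "Schedule a 30-minute meeting on Dec 19 at the first free time after 9am."
PROGRAM = [
    ("DateTime", [2022, 12, 19, 9, 0]),                        # node 0
    ("Minutes", [30]),                                         # node 1
    ("Events", []),                                            # node 2
    ("FirstFreeSlot", [("ref", 2), ("ref", 0), ("ref", 1)]),   # node 3
    ("CreateEvent", [("ref", 3), ("ref", 1)]),                 # node 4
]

trace = execute(PROGRAM)
for index, value in enumerate(trace):
    print(index, PROGRAM[index][0], value)
```

In this toy setup, both existing events block the morning, so the last node resolves to an 11:30 to 12:00 slot. A real DataFlow executor additionally handles features such as reference resolution across turns and revision of earlier programs, which this sketch does not attempt.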

Code & Models

Repositories

amazon-science/dataflow-evaluation-toolkit (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · AI in Service Interactions