Evaluating Agent Interactions Through Episodic Knowledge Graphs

Selene Báez Santamaría; Piek Vossen; Thomas Baier

arXiv:2209.11746·cs.AI·September 27, 2022

TL;DR

This paper introduces a novel evaluation method for conversational agents using episodic Knowledge Graphs, capturing knowledge accumulation over time to provide deeper qualitative insights into agent behavior.

Contribution

The paper presents a new graph-based evaluation framework that interprets conversational signals to assess agent performance beyond traditional metrics.

Findings

1. Knowledge-Graph-based evaluation offers richer qualitative insights.
2. The method correlates well with existing evaluation metrics.
3. It effectively captures the evolution of agent knowledge during interactions.

Abstract

We present a new method based on episodic Knowledge Graphs (eKGs) for evaluating (multimodal) conversational agents in open domains. This graph is generated by interpreting raw signals during conversation and is able to capture the accumulation of knowledge over time. We apply structural and semantic analysis of the resulting graphs and translate the properties into qualitative measures. We compare these measures with existing automatic and manual evaluation metrics commonly used for conversational agents. Our results show that our Knowledge-Graph-based evaluation provides more qualitative insights into interaction and the agent's behavior.
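
To make the idea of structural analysis over an accumulating graph concrete, the following is a minimal sketch, not the authors' implementation. It assumes the eKG can be reduced to (subject, predicate, object) triples added per conversational turn. The abstract does not name the specific graph properties used, so the node count, edge count, density, and average degree below are illustrative choices, and the function names `structural_metrics` and `metrics_over_turns` are hypothetical.

```python
from collections import defaultdict


def structural_metrics(triples):
    """Compute simple structural properties of a directed graph.

    `triples` is an iterable of (subject, predicate, object) tuples.
    The metric set is an assumption made for illustration only.
    """
    nodes = set()
    edges = set()
    degree = defaultdict(int)
    for s, p, o in triples:
        nodes.update((s, o))
        # Count each distinct triple once, so repeated claims add no structure.
        if (s, p, o) not in edges:
            edges.add((s, p, o))
            degree[s] += 1
            degree[o] += 1
    n, m = len(nodes), len(edges)
    # Simplification: density of a simple directed graph; with several
    # predicates between the same pair of nodes this value can exceed 1.
    density = m / (n * (n - 1)) if n > 1 else 0.0
    avg_degree = sum(degree.values()) / n if n else 0.0
    return {"nodes": n, "edges": m, "density": density, "avg_degree": avg_degree}


def metrics_over_turns(turns):
    """Return the cumulative metrics after each turn.

    `turns` is a list of triple lists, one list per conversational turn.
    """
    accumulated = []
    history = []
    for new_triples in turns:
        accumulated.extend(new_triples)
        history.append(structural_metrics(accumulated))
    return history


# Toy conversation (invented data, not taken from the paper).
turns = [
    [("alice", "likes", "jazz")],
    [("alice", "lives-in", "amsterdam"), ("jazz", "is-a", "music-genre")],
    [("alice", "likes", "jazz")],  # a repeated claim adds no new structure
]
history = metrics_over_turns(turns)
for turn_index, metrics in enumerate(history, start=1):
    print(turn_index, metrics)
```

Tracking the metrics per turn rather than once at the end is what lets this kind of analysis reflect knowledge accumulation over time: a turn that only repeats earlier claims leaves the graph, and therefore the metrics, unchanged.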

Code & Models

Repositories

selbaez/evaluating-conversations-as-ekg (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems