Enabling Dialogue Management with Dynamically Created Dialogue Actions

Juliana Miehle; Louisa Pragst; Wolfgang Minker; Stefan Ultes

arXiv:1907.00684·cs.HC·July 2, 2019


TL;DR

This paper introduces an extension to the OwlSpeak Dialogue Manager that allows for dynamic creation of dialogue actions, enhancing system flexibility and adaptability in spoken dialogue systems.

Contribution

The paper presents a novel extension enabling dynamic dialogue actions in OwlSpeak, improving flexibility for user-adaptive dialogue management.

Findings

01 Participants conducted meaningful dialogues
02 System performed satisfactorily
03 Implementation was successful

Abstract

In order to take up the challenge of realising user-adaptive system behaviour, we present an extension for the existing OwlSpeak Dialogue Manager which enables the handling of dynamically created dialogue actions. This leads to an increase in flexibility which can be used for adaptation tasks. After the implementation of the modifications and the integration of the Dialogue Manager into a full Spoken Dialogue System, an evaluation of the system has been carried out. The results indicate that the participants were able to conduct meaningful dialogues and that the system performs satisfactorily, showing that the implementation of the Dialogue Manager was successful.


Table 1: Evaluation of the implemented system. The statements had to be rated on a five-point Likert scale (1 = completely agree, 5 = completely disagree).

Statement	Mean	Median
The system returns sufficient information.	2.88	3
The system returns relevant information with the question.	2.85	3
The system returns reliable/trustworthy information.	2.68	3
The system returns meaningful responses.	2.49	2



Full text

Juliana Miehle, Louisa Pragst, Wolfgang Minker, Stefan Ultes
Institute of Communications Engineering, Ulm University, Germany


1 Introduction

One of the main challenges in Spoken Dialogue Systems is to realise effective dialogue strategies for coherent interactions and user-adaptive system behaviour. In general, the Dialogue Manager (DM) receives the user’s input in the form of a dialogue action, which is a semantic representation of the user’s utterance. It then decides on the system’s response based on the discourse context and outputs the semantic representation of the next system action. However, current DMs are often restricted to predefined dialogue actions, leading to a loss in flexibility and robustness. Our aim is to increase the flexibility of the DM through dynamically created dialogue actions in order to adapt the system’s behaviour to the user, as proposed for example in [honold2014; miehleYoshinoPragstUltesNakamuraMinker2016; pragstMinkerUltes2017; ultesKrausSchmittMinker2015]. Thus, the conversational agent may appear more familiar and trustworthy and the dialogue may be more effective.

In this work, we describe the implementation and evaluation of an extension of the existing OwlSpeak DM [heinrothDenichSchmitt2010; ultesMinker2014] that handles dynamically created user and system actions. The extension utilises general dialogue actions combined with ontology semantics to determine the system behaviour, based on [meditskosDasiopoulouPragstUltesVrochidisKompatsiarisWanner2016; pragstUltesKrausMinker2015]. The structure of the paper is as follows: In Section 2, the original OwlSpeak DM is introduced. Subsequently, we present our extension for handling dynamically created dialogue actions in Section 3 and an evaluation of the resulting system in Section 4. In Section 5, we discuss related work before concluding in Section 6.

2 The OwlSpeak Dialogue Manager

In the following, the original OwlSpeak DM is described based on the work by Ultes and Minker [ultesMinker2014]. OwlSpeak is an ontology-based DM which enables adaptive spoken dialogue within Intelligent Environments. The concept underlying OwlSpeak incorporates the Model-View-Presenter design pattern [potel1996], whereby data management, dialogue logic and dialogue interface are separated, as can be seen in Figure 1.

Spoken Dialogue Ontology (Model) The Model is implemented as a Spoken Dialogue Ontology. It consists of a static Speech part, which contains the concepts of the dialogue (e.g. pre-defined Grammar Moves for the user and Utterance Moves for the system), and a dynamic State part, which comprises concepts for the current state of the dialogue system. Examples of the latter are Agendas, each representing one system action containing zero or one Utterance Moves and one or more Grammar Moves, and the WorkSpace, which stores all Agendas that might be executed in following turns.
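
The concepts named above can be pictured as plain data classes. The following is only a minimal sketch: OwlSpeak stores these concepts in an ontology, not in Python objects, and all class names, field names and the example Agenda are illustrative assumptions. The one rule taken from the text is that an Agenda holds zero or one Utterance Moves and one or more Grammar Moves.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class GrammarMove:
    """Pre-defined user input the system can recognise (static Speech part)."""
    name: str
    grammar: str


@dataclass(frozen=True)
class UtteranceMove:
    """Pre-defined system utterance (static Speech part)."""
    name: str
    text: str


@dataclass
class Agenda:
    """One system action: zero or one Utterance Moves, one or more Grammar Moves."""
    name: str
    grammar_moves: List[GrammarMove]
    utterance_move: Optional[UtteranceMove] = None

    def __post_init__(self) -> None:
        if not self.grammar_moves:
            raise ValueError("an Agenda needs at least one Grammar Move")


@dataclass
class WorkSpace:
    """Dynamic State part: Agendas that might be executed in following turns."""
    agendas: List[Agenda] = field(default_factory=list)

    def add(self, agenda: Agenda) -> None:
        self.agendas.append(agenda)


# Hypothetical example content, not taken from the paper.
ask_city = Agenda(
    name="ask_city",
    grammar_moves=[GrammarMove("city_answer", "ulm | berlin | munich")],
    utterance_move=UtteranceMove("ask_city_prompt", "Which city do you mean?"),
)
workspace = WorkSpace()
workspace.add(ask_city)
```

Enforcing the cardinality rule in the constructor keeps invalid Agendas out of the WorkSpace; in the ontology itself such a rule would be expressed as a restriction on the concept instead.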

Dialogue Generation (Presenter) The Presenter constitutes the dialogue control logic and thus the computational part of OwlSpeak. It is implemented as a Java Servlet. It selects an Agenda from the WorkSpace, creates a View and then processes the user input that is eventually passed back by the View.

Dialogue Interface (View) The View is realised as a VoiceXML document which is created by the Presenter and passed to the speech recogniser. There, the VoiceXML document is interpreted and output to the user. Then, the user input is passed back to the Presenter.
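
One turn of the Presenter and View interplay described above might look as follows. This is a sketch under stated assumptions, not the OwlSpeak implementation (which is a Java Servlet): the dictionary layout of an Agenda, the first-in-list selection policy, the removal of a completed Agenda and the simplified VoiceXML-like markup are all illustrative, and the generated document is not validated against the VoiceXML schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical WorkSpace: each Agenda has a prompt and the user answers it accepts.
workspace: List[Dict] = [
    {"name": "ask_city", "prompt": "Which city do you mean?", "answers": ["ulm", "berlin"]},
    {"name": "say_bye", "prompt": "Goodbye.", "answers": ["bye"]},
]


def select_agenda(ws: List[Dict]) -> Dict:
    """Presenter step 1: pick an Agenda (here simply the first one, an assumption)."""
    return ws[0]


def create_view(agenda: Dict) -> str:
    """Presenter step 2: render the Agenda as a simplified VoiceXML-like document."""
    root = ET.Element("vxml", version="2.1")
    form = ET.SubElement(root, "form", id=agenda["name"])
    field = ET.SubElement(form, "field", name="answer")
    ET.SubElement(field, "prompt").text = agenda["prompt"]
    ET.SubElement(field, "grammar").text = " | ".join(agenda["answers"])
    return ET.tostring(root, encoding="unicode")


def process_input(ws: List[Dict], agenda: Dict, user_input: str) -> bool:
    """Presenter step 3: on a matching input the Agenda is treated as done (an assumption)."""
    if user_input in agenda["answers"]:
        ws.remove(agenda)
        return True
    return False


agenda = select_agenda(workspace)
view = create_view(agenda)            # would be handed to the speech side
accepted = process_input(workspace, agenda, "ulm")
```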

3 Handling Dynamically Created Dialogue Actions with OwlSpeak

As described by Meditskos et al. [meditskosDasiopoulouPragstUltesVrochidisKompatsiarisWanner2016], modules for advanced techniques in the fields of language analysis as well as knowledge interpretation and reasoning need to be integrated in order to support dialogue actions created dynamically by both the user and the system. To facilitate the interaction with such modules, the Model and the View of OwlSpeak need to be adapted. The Presenter, which includes the dialogue control logic, does not need to be modified, as the general concept of selecting an Agenda from the WorkSpace is still valid. The necessary modifications are depicted in Figure 2 and described in the following.

Spoken Dialogue Ontology (Model) The Model in the form of a Spoken Dialogue Ontology needs to be modified in order to allow communication with an external knowledge integration (KI) module which feeds the OwlSpeak DM with contextual information. This is implemented in such a way that the external KI module sends relevant information that might be output to the user in the current situation. These information snippets are marked as either informable or requestable. OwlSpeak then dynamically creates new Agendas and stores them in the WorkSpace. Each Agenda contains exactly one DialogueAction, either request or inform, which is a new concept that has been added to the Spoken Dialogue Ontology. In contrast to a Move, a DialogueAction does not contain any pre-defined Utterance. However, the ontology semantics provided by the external KI module are also added to the Agenda. Moreover, each Agenda is assigned an age indicating at which point of the dialogue it was added to the WorkSpace. This information might be used by the Presenter when selecting the next Agenda from the current WorkSpace. In addition to the dynamically created Agendas based on the input of the external KI module, the WorkSpace holds some Agendas containing general DialogueActions like greet, acknowledge and thank. These might be selected without asking the external KI module, as they are self-contained and can be used in any dialogue domain.
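
The dynamic creation of Agendas can be sketched as follows. The snippet format, the triple contents, the use of a turn counter as the age and the "newest first" selection policy are assumptions made for illustration; the text only states that snippets are marked informable or requestable, that each Agenda carries exactly one DialogueAction plus the ontology semantics, and that an age is recorded.

```python
from dataclasses import dataclass
from typing import List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Agenda:
    dialogue_action: str      # "request", "inform", or a general action such as "greet"
    semantics: List[Triple]   # ontology semantics from the KI module (empty for general actions)
    age: int                  # turn at which the Agenda entered the WorkSpace


class WorkSpace:
    GENERAL_ACTIONS = ("greet", "acknowledge", "thank")

    def __init__(self) -> None:
        self.turn = 0
        # General DialogueActions are always available and need no KI input.
        self.agendas: List[Agenda] = [Agenda(a, [], age=0) for a in self.GENERAL_ACTIONS]

    def add_from_ki(self, snippets: List[Tuple[str, List[Triple]]]) -> None:
        """Create one Agenda per snippet; each snippet is marked 'informable' or 'requestable'."""
        self.turn += 1
        for marker, triples in snippets:
            action = {"informable": "inform", "requestable": "request"}[marker]
            self.agendas.append(Agenda(action, triples, age=self.turn))

    def newest(self) -> Agenda:
        """Illustrative selection policy (an assumption): prefer the most recently added Agenda."""
        return max(self.agendas, key=lambda a: a.age)


# Hypothetical KI output for one turn.
ws = WorkSpace()
ws.add_from_ki([
    ("informable", [("ex:Ulm", "ex:hasWeather", "ex:Sunny")]),
    ("requestable", [("ex:User", "ex:prefersActivity", "?activity")]),
])
selected = ws.newest()
```

Because no utterance text is stored, the same WorkSpace logic works for any domain the KI module covers; wording is left to the language generation side.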

Dialogue Interface (View) In order to allow communication with an external language analysis module, a new interface has to be implemented which is based on a purely semantic representation of the user input and system output (rather than a sequence of fixed system utterances and corresponding user responses in the form of grammars). Therefore, we introduced the Agenda Document for the system output. It contains the selected Agenda which should be performed, including its DialogueAction and the corresponding ontology semantics in the form of RDF Triples provided by the external KI module. The Agenda Document is passed to the external language analysis module, which extracts the semantic information and generates the corresponding system utterances. In the other direction, the language analysis module creates RDF Triples from the user input and passes them back to the Presenter, which extracts the DialogueAction and decides how to proceed.
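
Both directions of this semantic interface can be illustrated with a short sketch. The concrete serialisation of the Agenda Document is not given in the text, so the JSON layout, the `ex:` prefix and the predicate `ex:hasDialogueAction` used here are hypothetical placeholders.

```python
import json
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]
ACTION_PREDICATE = "ex:hasDialogueAction"  # hypothetical predicate, not from the paper


def build_agenda_document(action: str, semantics: List[Triple]) -> str:
    """System output: the selected Agenda's DialogueAction plus its RDF triples."""
    return json.dumps({"dialogueAction": action, "triples": semantics})


def extract_dialogue_action(user_triples: List[Triple]) -> Optional[str]:
    """User input: find the triple that names the DialogueAction, if any."""
    for _subject, predicate, obj in user_triples:
        if predicate == ACTION_PREDICATE:
            return obj
    return None


# System side: document handed to the language generation module.
document = build_agenda_document(
    "request", [("ex:User", "ex:prefersActivity", "?activity")]
)

# User side: triples as they might come back from language analysis.
user_action = extract_dialogue_action([
    ("ex:turn1", ACTION_PREDICATE, "inform"),
    ("ex:User", "ex:prefersActivity", "ex:Hiking"),
])
```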

4 Evaluation

After implementing the presented extension for handling dynamically created dialogue actions with OwlSpeak and integrating the DM into the overall framework described in [kristina2017], several evaluation sessions with human users were carried out. In total, 41 participants tested the system. The assessment was performed in accordance with the Guideline for Good Clinical Practice [gcp2016]. The procedure was as follows: First, the participants received a short introduction to the system and its functionalities. They were also informed about the evaluation process, the collected data and the internal use of the data. Afterwards, each participant conducted a guided conversation with the system. At the end, each participant completed a questionnaire containing statements about the overall system. Each statement had to be rated on a five-point Likert scale (1 = completely agree, 5 = completely disagree). Some results of the evaluation are shown in Table 1.

It can be seen that the participants were able to conduct meaningful dialogues. There is a tendency towards agreement that our system returns sufficient information. The information seems to be quite relevant to the question as well as reliable and trustworthy. Moreover, the participants agree that the responses returned by the system are meaningful. We can conclude that, overall, the system performs satisfactorily, even if there is still room for improvement. However, we think that the relevance, the meaningfulness and the reliability of the information do not rely on the DM alone, but depend very much on the performance of the external knowledge integration module.
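
One way to read Table 1 is to compare each mean with the neutral midpoint of the scale. The sketch below does only that, using the four means from the table; the short item labels and the choice of 3 as the reference point are assumptions for illustration, and no statistical significance is implied.

```python
# Mean ratings from Table 1 (1 = completely agree, 5 = completely disagree).
means = {
    "sufficient": 2.88,
    "relevant": 2.85,
    "reliable": 2.68,
    "meaningful": 2.49,
}
NEUTRAL = 3.0  # midpoint of the five-point scale

# A positive margin means the mean lies on the "agree" side of the midpoint.
margins = {item: round(NEUTRAL - mean, 2) for item, mean in means.items()}
strongest = max(margins, key=margins.get)

for item, margin in margins.items():
    print(f"{item}: {margin:+.2f}")
```

All four margins are positive but small, which matches the cautious wording above: agreement is a tendency, and it is clearest for the meaningfulness of the responses.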

5 Related Work

There have been alternative approaches to separate the domain knowledge from the DM. The RavenClaw framework [bohusRudnicky2003] introduces a clear separation between task and discourse behaviour specification, allowing for rapid development of DM components for goal-oriented domains. It consists of a Dialogue Task Specification layer, which models the domain-specific dialogue logic, and a Dialogue Engine, which is domain-independent and executes the Dialogue Task Specification. Nothdurft et al. [nothdurftRichterMinker2014] present an architecture where a planner is used in order to provide explanations for the system’s proposed course of action. The task-oriented dialogues are thereby modelled as a finite-state machine, while the planner outputs a decision tree. The DM compares the resulting actions and inserts pre-defined explanations when potential points of distrust are identified. The LD-SDS dialogue system [papangelis2017] is envisioned to support user requests over multiple, complex, rich, and open-domain data sources that will leverage the wealth of the available Linked Data. It is connected with an exploratory search system that supports the previously defined hard and soft restriction actions that allow the user to order facets, values, and objects. However, all of these systems are implemented in order to support slot-filling dialogues. In contrast, the approach presented in this work covers both dialogues that consist of different slots that need to be filled and dialogues that are more chat-oriented and do not depend on any pre-defined slots. The scope and the topics of the dialogue depend on the external knowledge integration module, while the OwlSpeak DM can handle all kinds of informable or requestable information.

6 Conclusion

We presented the implementation of an extension for the existing OwlSpeak DM to enable the handling of dynamically created dialogue actions. This leads to an increase in flexibility which can be used for adaptation tasks. In order to support dialogue actions created dynamically by both the user and the system, modules for advanced techniques in the fields of language analysis as well as knowledge interpretation and reasoning need to be integrated. To facilitate the interaction with these modules, the Model and the View of OwlSpeak needed to be adapted. After the implementation of the modifications and the integration of the DM into the overall framework, an evaluation of the system has been presented. The results indicate that the participants were able to conduct meaningful dialogues and that the system performs satisfactorily, showing that the implementation of the DM was successful.

Acknowledgements.

This work is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 645012. We thank our colleagues from the University of Tübingen, the German Red Cross in Tübingen and semFYC in Barcelona for organizing and carrying out the evaluation.

Bibliography

(1) Bohus, D., Rudnicky, A.I.: RavenClaw: Dialog management using hierarchical task decomposition and an expectation agenda (2003)
(2) European Medicines Agency: Guideline for Good Clinical Practice E6(R2). EMA/CHMP/ICH/135/1995 (2016). Available from: http://www.ich.org/products/guidelines/efficacy/article/efficacy-guidelines.html. Accessed on: January 11, 2018
(3) Heinroth, T., Denich, D., Schmitt, A.: OwlSpeak - Adaptive Spoken Dialogue within Intelligent Environments. In: Pervasive Computing and Communications Workshops (PERCOM Workshops), 2010 8th IEEE International Conference on, pp. 666–671. IEEE (2010)
(4) Honold, F., Bercher, P., Richter, F., Nothdurft, F., Geier, T., Barth, R., Hoernle, T., Schüssel, F., Reuter, S., Rau, M., Bertrand, G., Seegebarth, B., Kurzok, P., Schattenberg, B., Minker, W., Weber, M., Biundo, S.: Companion-technology: Towards user- and situation-adaptive functionality of technical systems. In: 10th International Conference on Intelligent Environments (IE 2014), pp. 378–381. IEEE (2014)
(5) Meditskos, G., Dasiopoulou, S., Pragst, L., Ultes, S., Vrochidis, S., Kompatsiaris, I., Wanner, L.: Towards an ontology-driven adaptive dialogue framework. In: Proceedings of the 1st International Workshop on Multimedia Analysis and Retrieval for Multimodal Interaction, pp. 15–20. ACM (2016)
(6) Miehle, J., Yoshino, K., Pragst, L., Ultes, S., Nakamura, S., Minker, W.: Cultural communication idiosyncrasies in human-computer interaction. In: 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, p. 74 (2016)
(7) Nothdurft, F., Richter, F., Minker, W.: Probabilistic human-computer trust handling. In: SIGDIAL Conference, pp. 51–59 (2014)
(8) Papangelis, A., Papadakos, P., Kotti, M., Stylianou, Y., Tzitzikas, Y., Plexousakis, D.: LD-SDS: Towards an expressive spoken dialogue system based on linked-data. arXiv preprint arXiv:1710.02973 (2017)