# Animating an Autonomous 3D Talking Avatar

**Authors:** Dominik Borer, Dominik Lutz, Martin Guay

arXiv: 1903.05448 · 2019-03-14

## TL;DR

This paper introduces a simpler, faster way to annotate which motions of a 3D talking avatar can be played and composed together. The authors suggest it could lead to automated labeling and to conversational agents with more varied, less repetitive motion.

## Contribution

The paper presents a compact taxonomy of chit-chat behaviors and a simple labeling interface that together simplify and partially automate authoring of the motion graph. The authors see this as a possible path toward automated annotation of avatar animations.

## Key findings

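- Labeling the actions of an embodiment with the proposed interface was 7 times faster than with the standard state machine interface in Unreal Engine.
- Once a subset of motions is labeled, a predictor could be learned to assign labels to new clips; the authors propose this as a path to automated labeling.
- Manually authoring the state machine for a large number of motions causes visual overflow and more mistakes; the taxonomy is meant to let annotation scale to larger motion sets.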

## Abstract

One of the main challenges with embodying a conversational agent is annotating how and when motions can be played and composed together in real time, without any visual artifacts. The inherent problem is to do so for a large number of motions without introducing mistakes in the annotation. To our knowledge, there is no automatic method that can process animations and automatically label actions and the compatibility between them. In practice, a state machine, where clips are the actions, is created manually by setting connections between the states, along with the timing parameters for these connections. Authoring this state machine for a large number of motions leads to visual overflow and increases the number of possible mistakes. As a consequence, conversational agent embodiments are left with little variation and quickly become repetitive. In this paper, we address this problem with a compact taxonomy of chit-chat behaviors that we can use to simplify and partially automate the graph authoring process. We measured the time required to label the actions of an embodiment using our simple interface, compared to the standard state machine interface in Unreal Engine, and found that our approach is 7 times faster. We believe that our labeling approach could be a path to automated labeling: once a subset of motions is labeled (using our interface), we could learn a predictor that attributes a label to new clips, allowing virtual agent embodiments to truly scale up.
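To make the idea of label-driven graph authoring concrete, here is a minimal Python sketch of how per-clip labels could expand into state machine connections. It is an illustration under stated assumptions, not the authors' implementation: the label names (`idle`, `gesture`), the compatibility rule set `ALLOWED`, the clip names, and the single `blend_time` parameter are all hypothetical, since this summary does not list the paper's actual taxonomy or timing parameters.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Clip:
    """An animation clip tagged with one taxonomy label (labels are hypothetical)."""
    name: str
    label: str


# Hypothetical compatibility rules between labels, written once per label pair
# instead of once per clip pair. These are assumptions, not the paper's taxonomy.
ALLOWED = {
    ("idle", "gesture"),
    ("gesture", "idle"),
}


def build_transitions(clips, allowed, blend_time=0.25):
    """Return {(source_clip, target_clip): blend_time} for every ordered pair
    of distinct clips whose labels form an allowed (source, target) pair."""
    return {
        (a.name, b.name): blend_time
        for a, b in permutations(clips, 2)
        if (a.label, b.label) in allowed
    }


clips = [
    Clip("idle_01", "idle"),
    Clip("nod_01", "gesture"),
    Clip("wave_01", "gesture"),
]

transitions = build_transitions(clips, ALLOWED)
for (src, dst), blend in sorted(transitions.items()):
    print(f"{src} -> {dst} (blend {blend}s)")
```

In this sketch the annotator assigns one label per clip, and the connections follow from the label rules. Adding a new `gesture` clip then requires one label rather than a hand-drawn connection to every compatible state.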

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.05448/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1903.05448/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/1903.05448/full.md

---
Source: https://tomesphere.com/paper/1903.05448