# Vienna Talking Faces (ViTaFa): A multimodal person database with synchronized videos, images, and voices

**Authors:** Christina Krumpholz, Cliodhna Quigley, Leonida Fusani, Helmut Leder

PMC · DOI: 10.3758/s13428-023-02264-5 · Behavior Research Methods · 2023-11-10

## TL;DR

ViTaFa is a high-quality, multimodal database of synchronized videos, images, and voices for studying social perception through audiovisual signals.

## Contribution

The paper introduces ViTaFa, a standardized and freely accessible multimodal database of synchronized audiovisual stimuli, accompanied by human ratings, for social perception research.

## Key findings

- ViTaFa includes 40 individuals with diverse spoken content and emotional expressions recorded under standardized conditions.
- Over 200 human raters validated the emotion expressions in the database.
- The database is freely available for academic non-profit research after signing a confidentiality agreement.

## Abstract

Social perception relies on different sensory channels, including vision and audition, which are specifically important for judgements of appearance. Therefore, to understand multimodal integration in person perception, it is important to study both face and voice in a synchronized form. We introduce the Vienna Talking Faces (ViTaFa) database, a high-quality audiovisual database focused on multimodal research of social perception. ViTaFa includes different stimulus modalities: audiovisual dynamic, visual dynamic, visual static, and auditory dynamic. Stimuli were recorded and edited under highly standardized conditions and were collected from 40 real individuals, and the sample matches typical student samples in psychological research (young individuals aged 18 to 45). Stimuli include sequences of various types of spoken content from each person, including German sentences, words, reading passages, vowels, and language-unrelated pseudo-words. Recordings were made with different emotional expressions (neutral, happy, angry, sad, and flirtatious). ViTaFa is freely accessible for academic non-profit research after signing a confidentiality agreement form via https://osf.io/9jtzx/ and stands out from other databases due to its multimodal format, high quality, and comprehensive quantification of stimulus features and human judgements related to attractiveness. Additionally, over 200 human raters validated emotion expression of the stimuli. In summary, ViTaFa provides a valuable resource for investigating audiovisual signals of social perception.
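The stimulus dimensions named in the abstract can be laid out as a small condition grid. The sketch below is only illustrative: the modality, emotion, and content-type labels are taken from the abstract, but the variable names, the dictionary layout, and the function `condition_grid` are hypothetical and do not reflect the database's actual file structure. The abstract also does not state that every combination was recorded, so the grid is an upper bound on the design, not a file count.

```python
from itertools import product

# Labels taken from the abstract; the identifiers themselves are hypothetical.
MODALITIES = ["audiovisual dynamic", "visual dynamic", "visual static", "auditory dynamic"]
EMOTIONS = ["neutral", "happy", "angry", "sad", "flirtatious"]
CONTENT_TYPES = ["sentence", "word", "reading passage", "vowel", "pseudo-word"]


def condition_grid():
    """Return every modality x emotion x content-type combination.

    Assumption: a full crossing of the three dimensions. The abstract does
    not confirm that each combination exists in the released database.
    """
    return [
        {"modality": m, "emotion": e, "content_type": c}
        for m, e, c in product(MODALITIES, EMOTIONS, CONTENT_TYPES)
    ]


if __name__ == "__main__":
    grid = condition_grid()
    print(len(grid), "possible conditions per individual (upper bound)")
    print(grid[0])
```

A grid like this could help when planning which conditions to request or sample for an experiment; the real inventory should be checked against the materials at https://osf.io/9jtzx/.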

## Full-text entities

- **Diseases:** depressive (MESH:D003866), happy (MESH:D017204), aggressive (MESH:D010554), AT&T (MESH:D001260), facial deformities (MESH:D005153)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11133183/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11133183/full.md

## References

112 references — full list in the complete paper: https://tomesphere.com/paper/PMC11133183/full.md

---
Source: https://tomesphere.com/paper/PMC11133183