# Deep learning evaluation using deep linguistic processing

**Authors:** Alexander Kuhnle, Ann Copestake

arXiv: 1706.01322 · 2018-05-15

## TL;DR

This paper advocates for using linguistically processed artificial datasets to better evaluate multimodal deep learning models' language understanding, addressing limitations of traditional static datasets.

## Contribution

It introduces a method to create challenging abstract datasets using deep linguistic processing to evaluate multimodal models more thoroughly.

## Key findings

- Artificial datasets reveal detailed language understanding capabilities.
- Deep linguistic processing enables the creation of challenging evaluation datasets.
- Improved evaluation methods complement existing practices.

## Abstract

We discuss problems with the standard approaches to evaluation for tasks like visual question answering, and argue that artificial data can be used to address these as a complement to current practice. We demonstrate that with the help of existing 'deep' linguistic processing technology we are able to create challenging abstract datasets, which enable us to investigate the language understanding abilities of multimodal deep learning models in detail, as compared to a single performance value on a static and monolithic dataset.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.01322/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1706.01322/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/1706.01322/full.md

---
Source: https://tomesphere.com/paper/1706.01322