# On zero-shot recognition of generic objects

**Authors:** Tristan Hascoet, Yasuo Ariki, Tetsuya Takiguchi

arXiv: 1904.04957 · 2019-04-11

## TL;DR

This paper critiques the standard ImageNet benchmark for zero-shot learning (ZSL) in computer vision, exposing its structural flaws and biases, and proposes a new, more reliable benchmark for evaluating what ZSL models can actually do.

## Contribution

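The paper identifies structural flaws and biases in the existing ImageNet ZSL benchmark, introduces the notion of structural bias specific to ZSL datasets, and details the semi-automated construction of a new benchmark that addresses these flaws.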

## Key findings

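- The classification accuracy of existing ZSL models is significantly higher than previously thought once the benchmark's structural flaws are accounted for.
- The standard benchmark exhibits structural bias, a form of bias specific to ZSL datasets, which allows for a trivial solution.
- A new, semi-automatically constructed benchmark is proposed to assess ZSL models more accurately.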

## Abstract

Many recent advances in computer vision are the result of healthy competition among researchers on high-quality, task-specific benchmarks. After a decade of active research, the accuracy of zero-shot learning (ZSL) models on the ImageNet benchmark remains far too low to be considered for practical object recognition applications. In this paper, we argue that the main reason behind this apparent lack of progress is the poor quality of this benchmark. We highlight major structural flaws of the current benchmark and analyze different factors impacting the accuracy of ZSL models. We show that, once these flaws are accounted for, the actual classification accuracy of existing ZSL models is significantly higher than was previously thought. We then introduce the notion of structural bias specific to ZSL datasets. We discuss how the presence of this new form of bias allows for a trivial solution to the standard benchmark and conclude that a new benchmark is needed. We then detail the semi-automated construction of a new benchmark to address these flaws.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.04957/full.md

## Figures

20 figures with captions in the complete paper: https://tomesphere.com/paper/1904.04957/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/1904.04957/full.md

---
Source: https://tomesphere.com/paper/1904.04957