# Unified Embedding and Metric Learning for Zero-Exemplar Event Detection

**Authors:** Noureldien Hussein, Efstratios Gavves, Arnold W.M. Smeulders

arXiv: 1705.02148 · 2017-05-08

## TL;DR

This paper introduces a unified embedding and metric learning approach for zero-exemplar event detection in videos: given only a textual description of a novel event, and no example videos of it, the model ranks the videos related to that event.

## Contribution

It proposes a joint embedding space for visual and textual data, improving zero-exemplar event detection by learning to measure event-video similarity end-to-end.
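To make "learning to measure event-video similarity" concrete, here is a minimal sketch of one common metric-learning objective, the contrastive loss. This summary does not say which loss the paper uses, so the choice of loss, the function name `contrastive_loss`, and the `margin` parameter are assumptions made for illustration rather than the authors' formulation.

```python
import numpy as np

def contrastive_loss(event_emb, video_emb, is_match, margin=1.0):
    """Generic contrastive loss over (event, video) embedding pairs.

    Assumption: this is a standard metric-learning loss used for
    illustration, not necessarily the loss used in the paper.
    event_emb, video_emb: arrays of shape (n_pairs, dim)
    is_match: array of shape (n_pairs,), 1 for related pairs, 0 otherwise
    """
    is_match = np.asarray(is_match, dtype=float)
    # Euclidean distance between each event and its paired video
    d = np.linalg.norm(event_emb - video_emb, axis=1)
    # Related pairs are pulled together (penalise any distance)
    pos = is_match * d ** 2
    # Unrelated pairs are pushed apart until they are at least `margin` away
    neg = (1.0 - is_match) * np.maximum(0.0, margin - d) ** 2
    return float(np.mean(pos + neg))

# Toy pairs: the first pair coincides, the second is 5 units apart.
events = np.array([[0.0, 0.0], [0.0, 0.0]])
videos = np.array([[0.0, 0.0], [3.0, 4.0]])
print(contrastive_loss(events, videos, [1, 0]))  # labels agree with the geometry
print(contrastive_loss(events, videos, [0, 1]))  # labels contradict the geometry
```

The loss is lowest when related pairs sit close together and unrelated pairs sit at least `margin` apart, which is the behaviour a learned event-video distance needs for ranking.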

## Key findings

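- Outperforms the state of the art on the TRECVID Multimedia Event Detection dataset by a considerable margin, as reported in the abstract
- Works in the zero-exemplar setting, where no video examples are given for the novel event
- Learns a joint visual-textual embedding end-to-end on EventNet, instead of relying on a bank of concept detectors trained on external data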

## Abstract

Event detection in unconstrained videos is conceived as content-based video retrieval with two modalities: textual and visual. Given a text describing a novel event, the goal is to rank related videos accordingly. This task is zero-exemplar: no video examples are given for the novel event. Related works train a bank of concept detectors on external data sources. These detectors predict confidence scores for test videos, which are ranked and retrieved accordingly. In contrast, we learn a joint space in which the visual and textual representations are embedded. The space casts a novel event as a probability of pre-defined events. Also, it learns to measure the distance between an event and its related videos. Our model is trained end-to-end on the publicly available EventNet. When applied to the TRECVID Multimedia Event Detection dataset, it outperforms the state-of-the-art by a considerable margin.
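The retrieval step described in the abstract can be sketched roughly as follows: both modalities are projected into a shared space whose coordinates are probabilities over pre-defined events, and videos are ranked by their distance to the text query. Everything concrete here is an assumption made for illustration: the linear projections `w_text` and `w_video`, the random features, the Euclidean distance, and the helper names `embed` and `rank_videos` are hypothetical and not taken from the paper.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    z = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def embed(features, weights):
    """Project features into the joint space as probabilities over
    pre-defined events (assumed here to be a single linear layer)."""
    return softmax(features @ weights)

def rank_videos(text_feat, video_feats, w_text, w_video):
    """Return video indices sorted from closest to farthest from the query."""
    query = embed(text_feat, w_text)        # shape (n_events,)
    videos = embed(video_feats, w_video)    # shape (n_videos, n_events)
    dists = np.linalg.norm(videos - query, axis=1)
    return np.argsort(dists), dists

# Toy setup with random stand-ins for learned weights and extracted features.
rng = np.random.default_rng(0)
n_events, d_text, d_video, n_videos = 5, 8, 16, 4
w_text = rng.normal(size=(d_text, n_events))
w_video = rng.normal(size=(d_video, n_events))
text_feat = rng.normal(size=d_text)
video_feats = rng.normal(size=(n_videos, d_video))

order, dists = rank_videos(text_feat, video_feats, w_text, w_video)
print("ranking:", order)
print("distances:", dists[order])
```

In a trained model the two projections would be learned jointly so that a video lands near the description of its event; with the random weights used here the ranking is arbitrary and only the data flow is meaningful.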

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.02148/full.md

## Figures

20 figures with captions in the complete paper: https://tomesphere.com/paper/1705.02148/full.md

## References

53 references — full list in the complete paper: https://tomesphere.com/paper/1705.02148/full.md

---
Source: https://tomesphere.com/paper/1705.02148