# Transductive Zero-Shot Learning with Adaptive Structural Embedding

**Authors:** Yunlong Yu, Zhong Ji, Jichang Guo, and Yanwei Pang

arXiv: 1703.08897 · 2017-03-28

## TL;DR

This paper introduces a transductive zero-shot learning framework that combines adaptive structural embedding with self-paced learning to improve recognition of unseen classes, reporting superior results on the AwA, CUB, and aPY benchmarks.

## Contribution

It proposes ASTE to address visual-semantic embedding and SPASS to address domain shift, combines the two into a transductive method (TASTE), and introduces a fast training (FT) strategy for efficiency.

## Key findings

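- ASTE and TASTE show superior performance on the AwA, CUB, and aPY datasets.
- The fast training strategy speeds up the training of most existing methods by 4 to 300 times while holding their previous performance.
- Domain shift and unreliable instances are handled by selecting unseen instances from reliable to less reliable.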

## Abstract

Zero-shot learning (ZSL) endows a computer vision system with the inferential capability to recognize instances of a new category that it has never seen before. Its two fundamental challenges are visual-semantic embedding, in the cross-modality learning step, and domain adaptation, in the unseen-class prediction step. To address both challenges, this paper presents two corresponding methods named Adaptive STructural Embedding (ASTE) and Self-PAced Selective Strategy (SPASS), respectively. Specifically, ASTE formulates the visual-semantic interactions in a latent structural SVM framework and adaptively adjusts the slack variables to reflect the differing reliability of the training instances. In this way, reliable instances are imposed with small punishments, whereas less reliable instances are imposed with more severe punishments. This ensures a more discriminative embedding. SPASS, in turn, offers a framework to alleviate the domain shift problem in ZSL by exploiting the unseen data in an easy-to-hard fashion. In particular, SPASS borrows the idea of self-paced learning: it iteratively selects unseen instances, from reliable to less reliable, to gradually adapt the knowledge from the seen domain to the unseen domain. By combining SPASS and ASTE, we then present a self-paced Transductive ASTE (TASTE) method that progressively reinforces the classification capacity. Extensive experiments on three benchmark datasets (i.e., AwA, CUB, and aPY) demonstrate the superiority of ASTE and TASTE. Furthermore, we propose a fast training (FT) strategy to improve the efficiency of most existing ZSL methods. The FT strategy is surprisingly simple and general, and it can speed up the training of most existing methods by 4 to 300 times while holding the previous performance.
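
The easy-to-hard selection idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's SPASS or TASTE formulation: the function name `self_paced_labels`, the nearest-prototype scoring, the margin-based reliability proxy, and the parameters `n_rounds`, `start_frac`, and `blend` are all assumptions made for illustration. It presumes the unseen-class prototypes have already been mapped into the visual feature space by some embedding model.

```python
import numpy as np

def self_paced_labels(X, prototypes, n_rounds=4, start_frac=0.25, blend=0.5):
    """Hypothetical easy-to-hard pseudo-labelling loop (not the paper's exact SPASS).

    X          : (n, d) features of unlabeled unseen-class instances
    prototypes : (c, d) initial unseen-class prototypes in feature space, c >= 2
    """
    protos = prototypes.astype(float).copy()
    n = X.shape[0]
    # Fraction of instances trusted in each round grows from start_frac to 1.
    for frac in np.linspace(start_frac, 1.0, n_rounds):
        # Compatibility score: negative squared distance to each prototype.
        scores = -((X[:, None, :] - protos[None, :, :]) ** 2).sum(axis=2)
        labels = scores.argmax(axis=1)
        # Assumed reliability proxy: gap between best and second-best score.
        top2 = np.sort(scores, axis=1)[:, -2:]
        margin = top2[:, 1] - top2[:, 0]
        k = max(1, int(round(frac * n)))
        selected = np.argsort(-margin)[:k]  # most reliable instances first
        # Adapt each prototype toward the mean of its selected instances.
        for c in range(protos.shape[0]):
            members = selected[labels[selected] == c]
            if members.size:
                protos[c] = (1 - blend) * protos[c] + blend * X[members].mean(axis=0)
    # Final prediction with the adapted prototypes.
    final = ((X[:, None, :] - protos[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    return final, protos
```

In this sketch, only the most confidently assigned instances influence the prototypes in early rounds, so an initial offset between the prototypes and the true unseen-class clusters (a stand-in for domain shift) is corrected gradually rather than in one step.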

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.08897/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1703.08897/full.md

## References

50 references — full list in the complete paper: https://tomesphere.com/paper/1703.08897/full.md

---
Source: https://tomesphere.com/paper/1703.08897