Hybrid Generative/Discriminative Learning for Automatic Image Annotation
Shuang Hong Yang, Jiang Bian, Hongyuan Zha

TL;DR
This paper introduces a hybrid generative-discriminative model for automatic image annotation that effectively handles data ambiguity and large label vocabularies, improving annotation accuracy and scalability.
Contribution
It proposes an Exponential-Multinomial Mixture model combined with discriminative learning to address ambiguity and overfitting in image annotation tasks.
Findings
Achieves superior annotation performance
Handles large label vocabularies effectively
Improves scalability of image annotation
Abstract
Automatic image annotation (AIA) poses tremendous challenges for machine learning because it requires modeling data that are ambiguous in both input and output, e.g., images that contain multiple objects and are labeled with multiple semantic tags. Even more challenging, the number of candidate tags is usually huge (as large as the vocabulary size), yet each image is related to only a few of them. This paper presents a hybrid generative-discriminative classifier to simultaneously address the extreme data-ambiguity and overfitting-vulnerability issues in tasks such as AIA. In particular: (1) an Exponential-Multinomial Mixture (EMM) model is established to capture both the input and output ambiguity while encouraging prediction sparsity; and (2) the prediction ability of the EMM model is explicitly maximized through discriminative learning that integrates variational…
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Text and Document Classification Technologies
