Aesthetic Image Captioning From Weakly-Labelled Photographs

Koustav Ghosal; Aakanksha Rana; Aljosa Smolic

arXiv:1908.11310·cs.CV·August 30, 2019

TL;DR

This paper introduces a new large-scale dataset for aesthetic image captioning created through an automatic cleaning process, and proposes a weakly supervised method to learn aesthetic features without needing detailed annotations.

Contribution

It presents a novel dataset, AVA-Captions, and a weakly supervised training strategy for aesthetic feature extraction in image captioning.

Findings

01 The dataset contains 230,000 images with 5 captions each.

02 The weakly supervised method effectively learns aesthetic representations.

03 Automatic metrics and subjective evaluations validate the approach.

Abstract

Aesthetic image captioning (AIC) refers to the multi-modal task of generating critical textual feedback for photographs. While in natural image captioning (NIC), deep models are trained in an end-to-end manner using large curated datasets such as MS-COCO, no such large-scale, clean dataset exists for AIC. Towards this goal, we propose an automatic cleaning strategy to create a benchmarking AIC dataset, by exploiting the images and noisy comments easily available from photography websites. We propose a probabilistic caption-filtering method for cleaning the noisy web-data, and compile a large-scale, clean dataset, "AVA-Captions" (230,000 images with 5 captions per image). Additionally, by exploiting the latent associations between aesthetic attributes, we propose a strategy for training the convolutional neural network (CNN) based visual feature extractor, the first component of the…

Code & Models

Repositories

V-Sense/Aesthetic-Image-Captioning-ICCVW-2019
PyTorch
