Image Captioning using Deep Neural Architectures

Parth Shah; Vishvajit Bakarola; Supriya Pati

arXiv:1801.05568·cs.CV·October 3, 2018

TL;DR

This paper reviews various deep neural network models for image captioning, highlighting recent improvements due to advances in object recognition and machine translation, and discusses implementation and evaluation methods.

Contribution

It provides a comprehensive overview of current deep learning approaches for image captioning and discusses how recent technological advances have enhanced model performance.

Findings

01. Improved image captioning performance with recent deep learning models
02. Object recognition advancements have significantly contributed to captioning accuracy
03. Evaluation using standard metrics confirms progress in the field

Abstract

Automatically generating a natural-language description of an image, for example an English sentence, is a very challenging task. It requires expertise in both image processing and natural language processing. This paper discusses the different models available for the image captioning task. We also discuss how advances in object recognition and machine translation have greatly improved the performance of image captioning models in recent years. In addition, we discuss how such a model can be implemented. Finally, we evaluate the performance of the model using standard evaluation metrics.
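
The abstract does not say which evaluation metrics were used. As an illustration only, the sketch below assumes BLEU, a metric commonly reported for image captioning, and computes its unigram form (BLEU-1) for one generated caption against a set of reference captions. The function names and the example captions are hypothetical and are not taken from the paper or the linked repository.

```python
import math
from collections import Counter


def clipped_precision(candidate, references, n=1):
    """Modified n-gram precision: each candidate n-gram count is clipped
    to the largest count of that n-gram in any single reference."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand_counts = ngrams(candidate)
    if not cand_counts:
        return 0.0
    # Largest per-reference count for every n-gram seen in the references.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def brevity_penalty(candidate, references):
    """Penalize captions shorter than the closest reference length."""
    c = len(candidate)
    if c == 0:
        return 0.0
    # Closest reference length; ties go to the shorter reference.
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    return 1.0 if c > r else math.exp(1 - r / c)


def bleu1(candidate, references):
    """BLEU-1: brevity penalty times clipped unigram precision."""
    return brevity_penalty(candidate, references) * clipped_precision(candidate, references, 1)


# Hypothetical captions, tokenized by whitespace.
generated = "a dog runs on the grass".split()
ground_truth = [
    "a dog is running on the grass".split(),
    "the dog runs across grass".split(),
]
score = bleu1(generated, ground_truth)
```

Full BLEU combines the clipped precisions for n = 1 to 4 with a geometric mean; the unigram case is shown here to keep the clipping and brevity-penalty steps easy to follow.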

Code & Models

Repositories

shivani-raul/Image-Captioning-VGG16
tf
