# Punny Captions: Witty Wordplay in Image Descriptions

**Authors:** Arjun Chandrasekaran, Devi Parikh, Mohit Bansal

arXiv: 1704.08224 · 2018-06-01

## TL;DR

This paper explores computational methods for producing witty, pun-based image descriptions, either by retrieving sentences from a large corpus or by generating them with an encoder-decoder neural network. In human studies, the model's descriptions were voted slightly wittier than human-written ones when the human writers faced similar constraints on word usage and style.

## Contribution

It introduces two approaches for producing witty, pun-based image descriptions (retrieval from a large corpus of sentences, and generation with an encoder-decoder neural network) and reports human studies showing substantial improvements over baseline approaches.

## Key findings

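- In human studies, the proposed approaches show substantial improvements in wit over baseline approaches.
- Humans are almost always wittier than the model when they are free to choose vocabulary and style.
- When humans face constraints on word usage and style similar to the model's, people vote the model's descriptions slightly wittier than the human-written ones.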

## Abstract

Wit is a form of rich interaction that is often grounded in a specific situation (e.g., a comment in response to an event). In this work, we attempt to build computational models that can produce witty descriptions for a given image. Inspired by a cognitive account of humor appreciation, we employ linguistic wordplay, specifically puns, in image descriptions. We develop two approaches which involve retrieving witty descriptions for a given image from a large corpus of sentences, or generating them via an encoder-decoder neural network architecture. We compare our approach against meaningful baseline approaches via human studies and show substantial improvements. We find that when a human is subject to similar constraints as the model regarding word usage and style, people vote the image descriptions generated by our model to be slightly wittier than human-written witty descriptions. Unsurprisingly, humans are almost always wittier than the model when they are free to choose the vocabulary, style, etc.
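As a rough illustration of the retrieval idea mentioned in the abstract, the sketch below looks up sentences that contain a similar-sounding counterpart of a word associated with the image, then swaps the image word in. The abstract only states that witty descriptions are retrieved from a large corpus, so everything concrete here is an assumption made for illustration: the `PUN_PAIRS` table, the toy `CORPUS`, the `retrieve_punny` function, and the word-swap step are hypothetical and are not taken from the paper.

```python
# Hypothetical toy data for illustration only; none of it comes from the paper.
# Maps a word associated with the image to a similar-sounding counterpart.
PUN_PAIRS = {"sun": "son", "bear": "bare", "flour": "flower"}

CORPUS = [
    "the proud father hugged his son",
    "she walked on the bare floor",
    "he picked a flower for her",
    "the train left at noon",
]


def retrieve_punny(image_tags, corpus=CORPUS, pun_pairs=PUN_PAIRS):
    """Return (retrieved sentence, punned sentence) pairs.

    For each image tag that has a similar-sounding counterpart, find corpus
    sentences containing the counterpart and substitute the tag for it.
    The substitution step is an assumption of this sketch.
    """
    results = []
    for tag in image_tags:
        counterpart = pun_pairs.get(tag)
        if counterpart is None:
            continue  # this tag has no known pun counterpart
        for sentence in corpus:
            words = sentence.split()
            if counterpart in words:
                punned = " ".join(tag if w == counterpart else w for w in words)
                results.append((sentence, punned))
    return results


if __name__ == "__main__":
    for original, punned in retrieve_punny(["sun", "dog"]):
        print(f"{original!r} -> {punned!r}")
```

A real system would need a much larger pun vocabulary, a way to derive tags from the image, and some ranking of candidates; the sketch leaves all of that out.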

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1704.08224/full.md

## Figures

24 figures with captions in the complete paper: https://tomesphere.com/paper/1704.08224/full.md

---
Source: https://tomesphere.com/paper/1704.08224