OxfordTVG-HIC: Can Machine Make Humorous Captions from Images?
Runjia Li, Shuyang Sun, Mohamed Elhoseiny, Philip Torr

TL;DR
This paper introduces OxfordTVG-HIC, a large-scale dataset for training and evaluating models on humorous image captioning, a task that requires both generating and understanding humour.
Contribution
The paper provides a large-scale image captioning dataset annotated with humour scores, enabling deep-learning models to generate and interpret humorous content.
Findings
The dataset contains approximately 2.9 million image-text pairs.
The humour cues identified align with theories from cognitive psychology.
Models trained on this dataset can predict and generate humorous captions.
Abstract
This paper presents OxfordTVG-HIC (Humorous Image Captions), a large-scale dataset for humour generation and understanding. Humour is an abstract, subjective, and context-dependent cognitive construct involving several cognitive factors, making it a challenging task to generate and interpret. Hence, humour generation and understanding can serve as a new task for evaluating the ability of deep-learning methods to process abstract and subjective information. Due to the scarcity of data, humour-related generation tasks such as captioning remain under-explored. To address this gap, OxfordTVG-HIC offers approximately 2.9M image-text pairs with humour scores to train a generalizable humour captioning model. Contrary to existing captioning datasets, OxfordTVG-HIC features a wide range of emotional and semantic diversity resulting in out-of-context examples that are particularly conducive to…
Taxonomy
Topics: Humor Studies and Applications · Comics and Graphic Narratives
