A request for clarity over the End of Sequence token in the Self-Critical Sequence Training

Jia Cheng Hu; Roberto Cavicchioli; Alessandro Capotondi

arXiv:2305.12254·cs.CV·December 27, 2023·1 cites

TL;DR

This paper highlights the omission of the <Eos> token in Self-Critical Sequence Training for image captioning, which can artificially inflate performance metrics and hamper fair evaluation, and proposes the SacreEOS library as a way to make <Eos> handling explicit.

Contribution

It raises awareness about the <Eos> token omission problem and introduces SacreEOS to promote transparency and consistency in evaluation.

Findings

01. Omission of <Eos> can increase CIDEr-D scores by up to 4.1 points.

02. The lack of <Eos> awareness affects fair comparison of models.

03. SacreEOS helps standardize <Eos> handling in training and evaluation.

Abstract

The Image Captioning research field is currently compromised by the lack of transparency and awareness over the End-of-Sequence token (<Eos>) in Self-Critical Sequence Training. If the <Eos> token is omitted, a model can boost its performance by up to 4.1 CIDEr-D points using trivial sentence fragments. While this phenomenon poses an obstacle to a fair evaluation and comparison of established works, the competitive nature of the research forces people involved in new projects to make an arduous choice between lower scores and unsatisfactory descriptions. This work proposes to solve the problem by spreading awareness of the issue itself. In particular, we invite future works to share a simple and informative signature with the help of a library called SacreEOS. Code available at https://github.com/jchenghu/sacreeos.
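The effect described in the abstract can be illustrated with a toy example. The sketch below is not CIDEr-D and does not use the SacreEOS API; it assumes a simple bigram precision as a stand-in reward, and the function names, the `<eos>` string, and the captions are all hypothetical. It shows why a caption truncated after "in a" is indistinguishable from a complete caption when `<Eos>` is left out of the n-gram computation, and why it is penalized once `<Eos>` is appended to both candidate and reference.

```python
from collections import Counter

EOS = "<eos>"  # hypothetical end-of-sequence marker, chosen for this sketch


def bigrams(tokens):
    """Count the bigrams of a token list."""
    return Counter(zip(tokens, tokens[1:]))


def bigram_precision(candidate, reference, use_eos):
    """Toy reward: fraction of candidate bigrams that also occur in the reference.

    When use_eos is True, the EOS marker is appended to both sentences,
    so the way a caption ends becomes part of the score.
    """
    cand = candidate.split()
    ref = reference.split()
    if use_eos:
        cand = cand + [EOS]
        ref = ref + [EOS]
    cand_counts = bigrams(cand)
    ref_counts = bigrams(ref)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Clipped matches: a bigram counts at most as often as it appears in the reference.
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched / total


if __name__ == "__main__":
    reference = "a man riding a horse in a field"
    complete = "a man riding a horse in a field"
    fragment = "a man riding a horse in a"  # truncated, ends on a trivial fragment

    for use_eos in (False, True):
        print(
            f"use_eos={use_eos}: "
            f"complete={bigram_precision(complete, reference, use_eos):.3f}, "
            f"fragment={bigram_precision(fragment, reference, use_eos):.3f}"
        )
```

Without `<Eos>`, every bigram of the truncated caption appears in the reference, so the fragment ties with the complete caption. With `<Eos>`, the fragment contributes the bigram ("a", "<eos>"), which the reference does not contain, and its score drops to 6/7. The actual size of the effect under CIDEr-D during Self-Critical Sequence Training is what the paper reports; this toy metric only illustrates the mechanism.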

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: Lib