Joint Generation of Captions and Subtitles with Dual Decoding

Jitao Xu; François Buet; Josep Crego; Elise Bertin-Lemée; François Yvon

arXiv:2205.06522·cs.CL·May 16, 2022

TL;DR

This paper assesses a dual decoding approach that generates captions and subtitles jointly, improving their adequacy and mutual consistency with virtually no additional cost in model size or training complexity.

Contribution

It evaluates a dual decoding scheme that tightly couples the captioning and subtitling tasks, increasing adequacy and consistency with virtually no increase in model size or training complexity.

Findings

01 Improved adequacy of the generated captions and subtitles, and improved consistency between them

02 Virtually no increase in model size or training complexity

03 Effective joint generation of captions and subtitles

Abstract

As the amount of audio-visual content increases, the need to develop automatic captioning and subtitling solutions to match the expectations of a growing international audience appears as the only viable way to boost throughput and lower the related post-production costs. Automatic captioning and subtitling often need to be tightly intertwined to achieve an appropriate level of consistency and synchronization with each other and with the video signal. In this work, we assess a dual decoding scheme to achieve a strong coupling between these two tasks and show how adequacy and consistency are increased, with virtually no additional cost in terms of model size and training complexity.
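The abstract does not spell out how the dual decoding scheme works. As a rough illustration only, one way to read "strong coupling" is that two output streams are advanced in lockstep, each conditioned on the prefix the other has produced so far. The sketch below shows that control flow with toy stand-ins; the `StepFn` interface, `dual_greedy_decode`, the toy step functions, and the lexicon are all hypothetical names introduced here and are not taken from the paper or its repository.

```python
from typing import Callable, List, Tuple

EOS = "</s>"

# Hypothetical interface: (source, own_prefix, other_prefix) -> next token.
StepFn = Callable[[List[str], List[str], List[str]], str]


def dual_greedy_decode(
    source: List[str],
    caption_step: StepFn,
    subtitle_step: StepFn,
    max_len: int = 50,
) -> Tuple[List[str], List[str]]:
    """Advance two output streams in lockstep, each seeing the other's prefix."""
    caption: List[str] = []
    subtitle: List[str] = []
    cap_done = sub_done = False
    for _ in range(max_len):
        if cap_done and sub_done:
            break
        # Snapshot both prefixes so neither stream sees the token
        # the other one emits during this same step.
        cap_prev, sub_prev = list(caption), list(subtitle)
        if not cap_done:
            tok = caption_step(source, cap_prev, sub_prev)
            cap_done = tok == EOS
            if not cap_done:
                caption.append(tok)
        if not sub_done:
            tok = subtitle_step(source, sub_prev, cap_prev)
            sub_done = tok == EOS
            if not sub_done:
                subtitle.append(tok)
    return caption, subtitle


# Toy stand-ins for the two decoders (illustrative only, not a trained model).
# They ignore the other stream's prefix; a real model would attend over it.
TOY_LEXICON = {"hello": "bonjour", "world": "monde"}


def toy_caption_step(source: List[str], own: List[str], other: List[str]) -> str:
    # Copy the next source word in lowercase, then stop.
    return source[len(own)].lower() if len(own) < len(source) else EOS


def toy_subtitle_step(source: List[str], own: List[str], other: List[str]) -> str:
    # Translate the next source word; stop once the source is exhausted.
    if len(own) >= len(source):
        return EOS
    word = source[len(own)].lower()
    return TOY_LEXICON.get(word, word)


captions, subtitles = dual_greedy_decode(
    ["Hello", "world"], toy_caption_step, toy_subtitle_step
)
print(captions, subtitles)
```

In a neural system the two step functions would presumably share an encoder and exchange decoder states, which is where the claimed consistency gains would come from; that part is an assumption and is not modeled here.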

Code & Models

Repositories

jitao-xu/dual-decoding
PyTorch · Official
