Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning
Qing Sun, Stefan Lee, Dhruv Batra

TL;DR
This paper introduces Bidirectional Beam Search, an approximate inference algorithm for bidirectional neural sequence models, enabling better fill-in-the-blank image captioning by reasoning about both past and future context.
Contribution
It presents the first efficient approximate inference algorithm for bidirectional neural sequence models, extending Beam Search to incorporate forward and backward dependencies.
Findings
Outperforms baseline methods on fill-in-the-blank image captioning
Demonstrates effectiveness on Visual Madlibs dataset
Enables bidirectional reasoning in sequence decoding
Abstract
We develop the first approximate inference algorithm for 1-Best (and M-Best) decoding in bidirectional neural sequence models by extending Beam Search (BS) to reason about both forward and backward time dependencies. Beam Search (BS) is a widely used approximate inference algorithm for decoding sequences from unidirectional neural sequence models. Interestingly, approximate inference in bidirectional models remains an open problem, despite their significant advantage in modeling information from both the past and future. To enable the use of bidirectional models, we present Bidirectional Beam Search (BiBS), an efficient algorithm for approximate bidirectional inference. To evaluate our method and as an interesting problem in its own right, we introduce a novel Fill-in-the-Blank Image Captioning task which requires reasoning about both past and future sentence structure to reconstruct…
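For readers unfamiliar with the unidirectional baseline the abstract refers to, the following is a minimal sketch of standard left-to-right Beam Search. It is not the paper's BiBS algorithm. The function `beam_search`, the `toy_model` lookup table, its probabilities, and the `</s>` end token are all hypothetical names and values chosen for illustration only.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: maps a token prefix to log-probabilities of the next token.
LogProbFn = Callable[[Tuple[str, ...]], Dict[str, float]]
Hypothesis = Tuple[Tuple[str, ...], float]


def beam_search(next_log_probs: LogProbFn, beam_width: int,
                max_len: int, eos: str = "</s>") -> List[Hypothesis]:
    """Standard unidirectional beam search (illustrative sketch only)."""
    beams: List[Hypothesis] = [((), 0.0)]  # start from the empty prefix
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Hypothesis] = []
        for prefix, score in beams:
            for token, lp in next_log_probs(prefix).items():
                candidates.append((prefix + (token,), score + lp))
        # Keep only the top-scoring beam_width expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == eos:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[:beam_width]


def toy_model(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Made-up next-token distribution, used only to exercise the search."""
    table = {
        (): {"a": math.log(0.6), "the": math.log(0.4)},
        ("a",): {"dog": math.log(0.5), "cat": math.log(0.5)},
        ("the",): {"dog": math.log(0.9), "cat": math.log(0.1)},
    }
    return table.get(prefix, {"</s>": 0.0})


best_wide = beam_search(toy_model, beam_width=2, max_len=5)[0]
best_greedy = beam_search(toy_model, beam_width=1, max_len=5)[0]
```

With this toy table, a beam of width 1 commits to "a" at the first step, while a beam of width 2 keeps "the" alive and finds the higher-probability sequence "the dog". Each step conditions only on the tokens to its left, which is why this procedure does not carry over directly to bidirectional models.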
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
