Hierarchical Phrase-based Sequence-to-Sequence Learning

Bailin Wang; Ivan Titov; Jacob Andreas; Yoon Kim

arXiv:2211.07906 · cs.CL · November 17, 2022 · 1 citation

TL;DR

This paper introduces a hierarchical phrase-based neural transducer that pairs a discriminative parser with a seq2seq model. Hierarchical phrases act as an inductive bias during training and as explicit constraints during inference.

Contribution

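The paper presents a hierarchical phrase-based neural transducer that combines a parser based on a bracketing transduction grammar (BTG) with a seq2seq model. It supports two inference modes: the seq2seq model alone at the sequence level, or the parser combined with the seq2seq model.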
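To make the BTG idea concrete, the following is a minimal sketch of a derivation tree whose leaves are source phrases and whose internal nodes join two sub-translations either in the same order (straight) or swapped (inverted). Everything here is illustrative and not taken from the paper or its repository: the names `Leaf`, `Node`, `source_yield`, and `target_yield` are hypothetical, the romanized Japanese phrases are a toy example, and a plain lookup table stands in for the neural seq2seq model.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    source: str  # a source phrase that is translated as one unit


@dataclass
class Node:
    left: "Tree"
    right: "Tree"
    inverted: bool  # True: the two children swap order on the target side


Tree = Union[Leaf, Node]


def source_yield(tree: Tree) -> str:
    """Read the source sentence off the tree, always left to right."""
    if isinstance(tree, Leaf):
        return tree.source
    return f"{source_yield(tree.left)} {source_yield(tree.right)}"


def target_yield(tree: Tree, translate: Callable[[str], str]) -> str:
    """Translate each leaf phrase, then combine children straight or inverted."""
    if isinstance(tree, Leaf):
        return translate(tree.source)
    left = target_yield(tree.left, translate)
    right = target_yield(tree.right, translate)
    return f"{right} {left}" if tree.inverted else f"{left} {right}"


# Assumption: a toy lookup table in place of the seq2seq phrase translator.
TOY_TABLE = {"watashi wa": "I", "ringo o": "apples", "taberu": "eat"}


def toy_translate(phrase: str) -> str:
    return TOY_TABLE[phrase]


tree = Node(
    Leaf("watashi wa"),
    Node(Leaf("ringo o"), Leaf("taberu"), inverted=True),
    inverted=False,
)
print(source_yield(tree))                 # watashi wa ringo o taberu
print(target_yield(tree, toy_translate))  # I eat apples
```

The single inverted node is what moves the verb in front of its object on the target side; the same `translate` callable is applied to every leaf, loosely mirroring the idea of one model translating phrases at all scales.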

Findings

01. Both inference modes outperform baselines on small-scale machine translation benchmarks.

02. Hierarchical phrases serve as an inductive bias during training and as explicit constraints during inference.

03. In the mode that combines the parser with the seq2seq model, decoding uses the cube-pruned CKY algorithm, which is more involved but can make use of new translation rules during inference.

Abstract

We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference. Our approach trains two models: a discriminative parser based on a bracketing transduction grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one-by-one. We use the same seq2seq model to translate at all phrase scales, which results in two inference modes: one mode in which the parser is discarded and only the seq2seq component is used at the sequence-level, and another in which the parser is combined with the seq2seq model. Decoding in the latter mode is done with the cube-pruned CKY algorithm, which is more involved but can make use of new…
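As a rough illustration of chart-based decoding over source spans, the sketch below runs a plain CKY pass in which each span is either translated directly from a phrase table or built from two adjacent sub-spans joined straight or inverted. It is a simplification under stated assumptions, not the paper's decoder: it keeps only the single best hypothesis per span (cube pruning would keep and lazily combine a k-best list), and the phrase table `PHRASES`, the bonus set `GOOD_BIGRAMS`, and the functions `join` and `cky_decode` are hypothetical stand-ins for neural scoring.

```python
from typing import Dict, List, Optional, Tuple

Hyp = Tuple[str, float]  # (target string, score)

# Assumption: a toy phrase table mapping source phrases to (target, log-score).
PHRASES: Dict[Tuple[str, ...], Hyp] = {
    ("watashi", "wa"): ("I", -0.1),
    ("ringo", "o"): ("apples", -0.2),
    ("taberu",): ("eat", -0.1),
}
# Assumption: a toy fluency bonus for word pairs that meet at a join point.
GOOD_BIGRAMS = {("I", "eat"), ("eat", "apples")}


def join(first: Hyp, second: Hyp) -> Hyp:
    """Concatenate two hypotheses and score the boundary between them."""
    (t1, s1), (t2, s2) = first, second
    bonus = 1.0 if (t1.split()[-1], t2.split()[0]) in GOOD_BIGRAMS else 0.0
    return (f"{t1} {t2}", s1 + s2 + bonus)


def cky_decode(words: List[str]) -> Optional[Hyp]:
    """Return the best hypothesis for the whole sentence, or None if no parse."""
    n = len(words)
    best: Dict[Tuple[int, int], Hyp] = {}  # (i, j) covers words[i:j]
    for width in range(1, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            cands: List[Hyp] = []
            key = tuple(words[i:j])
            if key in PHRASES:  # translate the span as a single phrase
                cands.append(PHRASES[key])
            for k in range(i + 1, j):  # or split it into two sub-spans
                if (i, k) in best and (k, j) in best:
                    left, right = best[(i, k)], best[(k, j)]
                    cands.append(join(left, right))  # straight order
                    cands.append(join(right, left))  # inverted order
            if cands:
                best[(i, j)] = max(cands, key=lambda c: c[1])
    return best.get((0, n))


result = cky_decode("watashi wa ringo o taberu".split())
print(result[0] if result else None)  # I eat apples
```

Keeping one hypothesis per span is greedy: a sub-span's locally best translation may not be the one that combines best higher up, which is the gap that k-best lists with cube pruning are meant to narrow. Adding an entry to the phrase table at decode time changes the output without retraining anything, which loosely mirrors the abstract's point about making use of translation rules during inference.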

Code & Models

Repositories

berlino/btg-seq2seq (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Sequence to Sequence · Variational Inference