Enriching Transformers with Structured Tensor-Product Representations for Abstractive Summarization

Yichen Jiang; Asli Celikyilmaz; Paul Smolensky; Paul Soulos; Sudha Rao; Hamid Palangi; Roland Fernandez; Caitlin Smith; Mohit Bansal; Jianfeng Gao

arXiv:2106.01317·cs.CL·June 3, 2021

TL;DR

This paper introduces a Transformer enhanced with structured Tensor Product Representations for abstractive summarization. Encoding syntactic structure and semantic content separately improves content control and interpretability.

Contribution

It adapts TP-TRANSFORMER to abstractive summarization, adding an explicit structural bias (separate role and filler vectors for each token) to improve summarization performance and interpretability. The approach is new to this task.

Findings

01. Outperforms standard Transformer models on multiple datasets

02. Shows improved syntactic and semantic interpretability

03. Demonstrates emergent structural information in role vectors

Abstract

Abstractive summarization, the task of generating a concise summary of input documents, requires: (1) reasoning over the source document to determine the salient pieces of information scattered across the long document, and (2) composing a cohesive text by reconstructing these salient facts into a shorter summary that faithfully reflects the complex relations connecting these facts. In this paper, we adapt TP-TRANSFORMER (Schlag et al., 2019), an architecture that enriches the original Transformer (Vaswani et al., 2017) with the explicitly compositional Tensor Product Representation (TPR), for the task of abstractive summarization. The key feature of our model is a structural bias that we introduce by encoding two separate representations for each token to represent the syntactic structure (with role vectors) and semantic content (with filler vectors) separately. The model then binds…
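The role/filler split described in the abstract can be illustrated with the classical Tensor Product Representation, in which each filler vector is bound to its role vector by an outer product and the bindings are summed. The sketch below is only that general idea, not the paper's model: the function names `bind` and `unbind`, the toy vectors, and the choice of orthonormal roles are all assumptions made for illustration, and the binding operation actually used in TP-TRANSFORMER may differ.

```python
# Minimal sketch of classical TPR binding and unbinding (illustrative only;
# not the authors' implementation). All names and values are hypothetical.

def bind(fillers, roles):
    """Return the sum of outer products filler_i (x) role_i as a d_f x d_r matrix."""
    d_f, d_r = len(fillers[0]), len(roles[0])
    tpr = [[0.0] * d_r for _ in range(d_f)]
    for f, r in zip(fillers, roles):
        for i in range(d_f):
            for j in range(d_r):
                tpr[i][j] += f[i] * r[j]
    return tpr

def unbind(tpr, role):
    """Multiply the TPR by a role vector; exact recovery needs orthonormal roles."""
    return [sum(row[j] * role[j] for j in range(len(role))) for row in tpr]

# Three toy tokens: semantic content in fillers, structural position in roles.
fillers = [[1.0, 2.0], [3.0, 5.0], [-1.0, 0.5]]
roles = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # orthonormal by construction

tpr = bind(fillers, roles)
recovered = unbind(tpr, roles[1])
print(recovered)  # [3.0, 5.0], the filler bound to the second role
```

Because the roles here are orthonormal, unbinding with a role returns exactly the filler that was bound to it. With learned, non-orthogonal roles, recovery would only be approximate.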

Code & Models

Repositories

jiangycTarheel/TPT-Summ (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Label Smoothing · Layer Normalization · Residual Connection