FastTrees: Parallel Latent Tree-Induction for Faster Sequence Encoding

Bill Tuck Weng Pung; Alvin Chan

arXiv:2111.14031·cs.CL·November 30, 2021

TL;DR

FASTTREES is a parallel, non-autoregressive neural module that induces latent tree structures for faster sequence encoding. It matches or outperforms ON-LSTM on four sequence modeling tasks and improves Transformer performance on three sequence transduction tasks.

Contribution

The paper presents FASTTREES, a parallel tree-induction method that improves both the speed and the quality of sequence encoding and can be integrated into Transformer models.

Findings

01  Achieves competitive or superior performance to ON-LSTM on four sequence tasks.

02  Enhances Transformer models, improving performance on three sequence transduction tasks.

03  Outperforms state-of-the-art models on logical inference (+4%) and mathematical language understanding (+8%).

Abstract

Inducing latent tree structures from sequential data is an emerging trend in the NLP research landscape today, largely popularized by recent methods such as Gumbel LSTM and Ordered Neurons (ON-LSTM). This paper proposes FASTTREES, a new general purpose neural module for fast sequence encoding. Unlike most previous works that consider recurrence to be necessary for tree induction, our work explores the notion of parallel tree induction, i.e., imbuing our model with hierarchical inductive biases in a parallelizable, non-autoregressive fashion. To this end, our proposed FASTTREES achieves competitive or superior performance to ON-LSTM on four well-established sequence modeling tasks, i.e., language modeling, logical inference, sentiment analysis and natural language inference. Moreover, we show that the FASTTREES module can be applied to enhance Transformer models, achieving performance…
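
The abstract contrasts recurrent tree induction with a parallel, non-autoregressive alternative. As a rough illustration only, and not the authors' implementation, the sketch below uses the cumax (cumulative softmax) activation that ON-LSTM applies to its master gates. It assumes the gate logits for each token are already available and do not depend on a previous hidden state, so every position can be processed independently. The function names and the toy logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cumax(logits):
    """Cumulative softmax: running sum of the softmax probabilities.

    The result is non-decreasing and its last entry is 1, which is what
    lets it act as a soft "split point" over ordered neurons.
    """
    out, running = [], 0.0
    for p in softmax(logits):
        running += p
        out.append(running)
    return out


def parallel_master_gates(gate_logits):
    """Compute one cumax gate per token.

    Assumption: each row of gate_logits depends only on its own token,
    so no row needs the result of an earlier row. The rows could
    therefore be evaluated in any order, or all at once on parallel
    hardware; the list comprehension here is just the simplest form.
    """
    return [cumax(row) for row in gate_logits]


# Toy logits for a two-token sequence with three gate units per token.
logits = [[2.0, 0.0, -1.0], [0.0, 0.0, 0.0]]
gates = parallel_master_gates(logits)
for token_index, gate in enumerate(gates):
    print(token_index, [round(g, 3) for g in gate])
```

In a recurrent model such as ON-LSTM, the gate logits at each step are a function of the previous hidden state, which forces sequential evaluation. Removing that dependency is the assumption that makes the per-token computation above independent.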

Code & Models

Repositories

billptw/fasttrees (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Sigmoid Activation · Label Smoothing · Softmax · Residual Connection · Layer Normalization · Adam