Accelerating Natural Language Understanding in Task-Oriented Dialog

Ojas Ahuja; Shrey Desai

arXiv:2006.03701·cs.CL·June 9, 2020

TL;DR

This paper presents a lightweight convolutional model for task-oriented dialog understanding that maintains high accuracy while significantly reducing model size and increasing inference speed, enabling on-device deployment.

Contribution

The authors introduce a simple convolutional model, compressed with structured pruning, that achieves results largely comparable to BERT on ATIS and Snips and predicts intents and slots nearly 63x faster than DistilBERT on CPUs.

Findings

01

Compressed convolutional model achieves comparable results to BERT on ATIS and Snips.

02

With under 100K parameters, the model performs nearly as well as pre-trained Transformers that have tens of millions of parameters.

03

On CPUs, the multi-task model predicts intents and slots nearly 63x faster than DistilBERT.

Abstract

Task-oriented dialog models typically leverage complex neural architectures and large-scale, pre-trained Transformers to achieve state-of-the-art performance on popular natural language understanding benchmarks. However, these models frequently have in excess of tens of millions of parameters, making them impossible to deploy on-device where resource-efficiency is a major concern. In this work, we show that a simple convolutional model compressed with structured pruning achieves largely comparable results to BERT on ATIS and Snips, with under 100K parameters. Moreover, we perform acceleration experiments on CPUs, where we observe our multi-task model predicts intents and slots nearly 63x faster than even DistilBERT.
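To make "structured pruning" concrete, here is a minimal sketch in plain Python. It removes whole convolutional filters rather than individual weights, which is what shrinks both the parameter count and the computation. The ranking criterion (L1 norm of each filter), the `keep_ratio` parameter, and all function names are assumptions chosen for illustration; this page does not state which criterion or schedule the authors use.

```python
from typing import List

# One conv filter, stored as [in_channels][kernel_width] (illustrative layout).
Filter = List[List[float]]


def l1_norm(filt: Filter) -> float:
    """Sum of absolute weights in one filter."""
    return sum(abs(w) for row in filt for w in row)


def prune_filters(filters: List[Filter], keep_ratio: float) -> List[Filter]:
    """Keep the highest-L1-norm filters, preserving their original order.

    Assumption: L1 magnitude is used as the importance score.
    """
    n_keep = max(1, round(len(filters) * keep_ratio))
    ranked = sorted(
        range(len(filters)), key=lambda i: l1_norm(filters[i]), reverse=True
    )
    kept = sorted(ranked[:n_keep])
    return [filters[i] for i in kept]


def count_params(filters: List[Filter]) -> int:
    """Number of weights across all filters (biases ignored)."""
    return sum(len(row) for filt in filters for row in filt)


# Toy layer: 4 filters, 2 input channels, kernel width 3.
layer = [
    [[0.9, -0.8, 0.7], [0.6, -0.5, 0.4]],
    [[0.01, 0.0, -0.02], [0.0, 0.01, 0.0]],
    [[-0.3, 0.2, 0.1], [0.2, -0.1, 0.3]],
    [[0.05, 0.0, 0.0], [0.0, -0.05, 0.0]],
]
pruned = prune_filters(layer, keep_ratio=0.5)
print(count_params(layer), count_params(pruned))
```

Because entire filters are dropped, the pruned layer is simply a smaller dense layer and needs no sparse kernels to run faster. In a real network, the next layer's input channels would also have to be trimmed to match the removed filters.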

Code & Models

Repositories

oja/pruned-nlu
PyTorch · Official

Taxonomy

Methods: Pruning · Linear Layer · DistilBERT · Weight Decay · Softmax · Adam · Multi-Head Attention · Dropout · Attention Dropout