Generation with Dynamic Vocabulary

Yanting Liu; Tao Ji; Changzhi Sun; Yuanbin Wu; Xiaoling Wang

arXiv:2410.08481 · cs.CL · October 14, 2024

TL;DR

This paper presents a dynamic vocabulary for language models that lets them emit arbitrary multi-token text spans as single units during generation. The approach improves generation quality and efficiency, and it supports training-free domain adaptation and citation generation.

Contribution

Introduces a dynamic vocabulary mechanism that treats multi-token spans as basic generation units, improving language model performance and flexibility across applications.

Findings

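01. 25% increase in MAUVE score over the standard language model

02. 20% reduction in latency compared to the standard language model

03. Improved citation generation in QA tasks, without compromising answer accuracy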

Abstract

We introduce a new dynamic vocabulary for language models. It can include arbitrary text spans during generation. These text spans act as basic generation bricks, akin to tokens in traditional static vocabularies. We show that the ability to generate multi-token spans atomically improves both generation quality and efficiency (compared to the standard language model, the MAUVE metric increases by 25% and latency decreases by 20%). The dynamic vocabulary can be deployed in a plug-and-play way, and is thus attractive for various downstream applications. For example, we demonstrate that the dynamic vocabulary can be applied to different domains in a training-free manner. It also helps generate reliable citations in question answering tasks (substantially enhancing citation results without compromising answer accuracy).
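The idea of spans acting as "generation bricks" alongside tokens can be illustrated with a toy sketch. This is not the authors' implementation: the class `DynamicVocabulary`, its methods, and all embedding values below are hypothetical, and the span embedding is supplied by hand where a real system would presumably compute it with some phrase encoder. The sketch only shows that, once a span sits in the same scoring space as ordinary tokens, a single decoding step can select it and emit several tokens at once.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product used as the (toy) score of a vocabulary entry."""
    return sum(x * y for x, y in zip(a, b))


class DynamicVocabulary:
    """Hypothetical container: static tokens plus spans added at inference time."""

    def __init__(self, token_embeddings: Dict[str, Vector]) -> None:
        self.token_embeddings = dict(token_embeddings)
        self.span_embeddings: Dict[str, Vector] = {}

    def add_span(self, span: str, embedding: Vector) -> None:
        # Assumption: the embedding comes from some phrase encoder;
        # here the caller passes it in directly.
        self.span_embeddings[span] = embedding

    def next_entry(self, hidden: Vector) -> str:
        # Tokens and spans compete in one argmax over a shared score space.
        entries = {**self.token_embeddings, **self.span_embeddings}
        return max(entries, key=lambda entry: dot(hidden, entries[entry]))


# Made-up numbers, chosen only so the example has no ties.
vocab = DynamicVocabulary({
    "new": [1.0, 0.0],
    "york": [0.0, 1.0],
    "the": [0.2, 0.1],
})
hidden_state = [1.0, 0.8]  # stand-in for a language model's hidden state

static_choice = vocab.next_entry(hidden_state)   # only single tokens available
vocab.add_span("new york", [0.9, 0.9])           # plug in a span, no retraining
dynamic_choice = vocab.next_entry(hidden_state)  # the span can now win the step
```

In this toy setup the static vocabulary needs two decoding steps to produce the two words, while the dynamic one can emit them in a single step, which is the kind of effect the latency claim refers to. How spans are selected and encoded in the actual method is not specified in this abstract.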

Code & Models

Repositories

Maniyantingliu/generation_with_dynamic_vocabulary (Official)

Videos

Generation with Dynamic Vocabulary · Underline

Taxonomy

Topics: Speech and dialogue systems · Natural Language Processing Techniques