An Empirical Study on Context Length for Open-Domain Dialog Generation

Xinyi Shen; Zuoquan Lin

arXiv:2409.00315·cs.CL·September 4, 2024

TL;DR

This study investigates how varying context lengths influence Transformer-based open-domain dialog models, revealing that context length significantly impacts model training and performance, with implications for optimizing dialog systems.

Contribution

It provides empirical insights into the effects of context length on dialog model training and performance, addressing a previously overlooked aspect.

Findings

01. Longer context can improve model training.

02. Different dialogs have varying optimal context lengths.

03. Context length choice affects model effectiveness.

Abstract

Transformer-based open-domain dialog models have become increasingly popular in recent years. These models typically represent context as a concatenation of the dialog history. However, there is no established criterion for deciding how many utterances should be kept in the context. We investigate how the choice of context length affects the model. We experiment on three questions, from coarse to fine: (i) Does longer context help model training? (ii) Is it necessary to change the training context length when dealing with dialogs of different context lengths? (iii) Do different dialog samples have the same preference for context length? Our experimental results show that context length, an often overlooked setting, deserves attention when implementing Transformer-based dialog models.
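
To make "context length" concrete, here is a minimal sketch of building a model input by keeping only the most recent utterances of a dialog history and concatenating them. It is an illustration, not the authors' preprocessing: the function name `build_context`, the `" [SEP] "` separator, and the sample dialog are all assumptions.

```python
from typing import List


def build_context(history: List[str], context_length: int, sep: str = " [SEP] ") -> str:
    """Concatenate the last `context_length` utterances of `history`.

    `sep` is a hypothetical separator string; a real model would use
    whatever delimiter its tokenizer defines.
    """
    if context_length <= 0:
        raise ValueError("context_length must be positive")
    # Negative slicing keeps the most recent utterances; if the history
    # is shorter than context_length, the whole history is kept.
    kept = history[-context_length:]
    return sep.join(kept)


# Hypothetical dialog history, oldest utterance first.
dialog = [
    "Hi, how are you?",
    "Fine, thanks. Any plans for the weekend?",
    "I might go hiking.",
    "Sounds fun. Where to?",
]

# The same dialog yields different model inputs at different context lengths.
for k in (1, 2, 4):
    print(k, "->", build_context(dialog, k))
```

Varying `context_length` at training or inference time is the kind of setting the three questions above examine; this sketch counts whole utterances, not tokens.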

Code & Models

Repositories

PKUAI-LINGroup/context-study (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems

Methods: Softmax · Attention Is All You Need