LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation

Jian Guan; Zhuoer Feng; Yamei Chen; Ruilin He; Xiaoxi Mao; Changjie Fan; Minlie Huang

arXiv:2108.12960·cs.CL·January 19, 2022

TL;DR

This paper introduces LOT, a comprehensive Chinese long text benchmark with new datasets and a pretrained model LongLM, to evaluate and improve long text understanding and generation capabilities.

Contribution

The paper presents a new story-centric benchmark LOT and a large-scale Chinese long text pretrained model LongLM, addressing the lack of standardized evaluation for Chinese long text modeling.

Findings

1. LongLM outperforms similar-sized models on LOT tasks.

2. LOT effectively evaluates Chinese long text understanding and generation.

3. New datasets based on Chinese stories facilitate comprehensive assessment.

Abstract

Standard multi-task benchmarks are essential for developing pretraining models that can generalize to various downstream tasks. Existing benchmarks for natural language processing (NLP) usually focus only on understanding or generating short texts. However, long text modeling requires many distinct abilities in contrast to short texts, such as the modeling of long-range discourse and commonsense relations, and the coherence and controllability of generation. The lack of standardized benchmarks makes it difficult to assess these abilities of a model and fairly compare different models, especially Chinese models. Therefore, we propose a story-centric benchmark named LOT for evaluating Chinese long text modeling, which aggregates two understanding tasks and two generation tasks. We construct new datasets for these tasks based on human-written Chinese stories with hundreds of words.…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods