Thresh: A Unified, Customizable and Deployable Platform for Fine-Grained Text Evaluation

David Heineman; Yao Dou; Wei Xu

arXiv:2308.06953·cs.CL·October 17, 2023

TL;DR

Thresh is a versatile platform enabling customizable, unified, and deployable fine-grained text evaluation with easy configuration, community sharing, and support for multiple NLP tasks and deployment scales.

Contribution

It introduces Thresh, a flexible platform that simplifies building, deploying, and sharing fine-grained annotation tools for various NLP evaluation tasks.

Findings

01. Supports rapid setup with a single YAML file
02. Provides a community hub for sharing annotation frameworks
03. Offers multiple deployment options for different project scales

Abstract

Fine-grained, span-level human evaluation has emerged as a reliable and robust method for evaluating text generation tasks such as summarization, simplification, machine translation and news generation, and the derived annotations have been useful for training automatic metrics and improving language models. However, existing annotation tools implemented for these evaluation frameworks lack the adaptability to be extended to different domains or languages, or to have their annotation settings modified according to user needs, and the absence of a unified annotated data format inhibits research in multi-task learning. In this paper, we introduce Thresh, a unified, customizable and deployable platform for fine-grained evaluation. With a single YAML configuration file, users can build and test an annotation interface for any framework within minutes, all in one web browser window. To facilitate…
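To make "fine-grained, span-level" annotation under a configurable framework concrete, here is a minimal Python sketch. It is not Thresh's configuration schema or data format: the `FRAMEWORK` dictionary, the `SpanAnnotation` class, and the `validate` and `selected_text` helpers are hypothetical names chosen only for illustration. The idea it shows is that a framework declares the edit categories it allows, and each annotation marks a character span of the text with one of those categories.

```python
from dataclasses import dataclass

# Hypothetical framework definition (NOT Thresh's actual YAML schema):
# it only lists the edit categories an annotator may assign.
FRAMEWORK = {
    "name": "toy_simplification",
    "edit_types": {"deletion", "insertion", "substitution"},
}


@dataclass(frozen=True)
class SpanAnnotation:
    """One span-level judgment on a piece of text (illustrative only)."""
    edit_type: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def validate(ann: SpanAnnotation, text: str, framework: dict) -> list[str]:
    """Return a list of problems; an empty list means the annotation is valid."""
    errors = []
    if ann.edit_type not in framework["edit_types"]:
        errors.append(f"unknown edit type: {ann.edit_type}")
    if not (0 <= ann.start < ann.end <= len(text)):
        errors.append(f"span [{ann.start}, {ann.end}) is out of bounds")
    return errors


def selected_text(ann: SpanAnnotation, text: str) -> str:
    """Return the substring that the annotation covers."""
    return text[ann.start:ann.end]


if __name__ == "__main__":
    sentence = "The cat sat on the mat."
    ann = SpanAnnotation(edit_type="deletion", start=4, end=7)
    print(selected_text(ann, sentence), validate(ann, sentence, FRAMEWORK))
```

Keeping the framework definition separate from the annotations is what would let one tool serve several evaluation tasks: swapping the set of edit categories changes what annotators can mark without changing the record shape.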

Code & Models

Repositories

sebajoe/thresh

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research

Methods: Lib