Parachute: Evaluating Interactive Human-LM Co-writing Systems

Hua Shen; Tongshuang Wu

arXiv:2303.06333 · cs.HC · March 28, 2023 · 5 cites

Open Access

TL;DR

This paper introduces Parachute, a comprehensive evaluation framework for assessing interactive human-LM co-writing systems, addressing the lack of systematic evaluation methods in this emerging area.

Contribution

The paper presents Parachute, a novel human-centered evaluation framework with categorized metrics for assessing interactive co-writing systems involving language models.

Findings

01. Demonstrates how to evaluate co-writing systems using Parachute

02. Provides a structured approach to compare different systems

03. Highlights the importance of interaction evaluation in co-writing

Abstract

A surge of advances in language models (LMs) has led to significant interest in using LMs to build co-writing systems, in which humans and LMs interactively contribute to a shared writing artifact. However, there is a lack of studies assessing co-writing systems in interactive settings. We propose a human-centered evaluation framework, Parachute, for interactive co-writing systems. Parachute offers an integrative view of interaction evaluation, in which each evaluation aspect consists of categorized practical metrics. Furthermore, we present a use case that demonstrates how to evaluate and compare co-writing systems with Parachute.

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems