Validating a forced-choice method for eliciting quality-of-reasoning judgments

Alexandru Marcoci; Margaret E. Webb; Luke Rowe; Ashley Barnett; Tamar Primoratz; Ariel Kruger; Christopher W. Karvetski; Benjamin Stone; Michael L. Diamond; Morgan Saletta; Tim van Gelder; Philip E. Tetlock; Simon Dennis

PMC · DOI: 10.3758/s13428-023-02234-x · October 13, 2023

Open Access

TL;DR

This paper shows that a forced-choice method can effectively and efficiently assess the quality of reasoning in written arguments, even when the raters are novices.

Contribution

The study introduces a validated forced-choice method for evaluating reasoning quality with high reliability and efficiency.

Findings

01

Novices and experts can reliably choose higher-quality arguments using forced-choice comparisons.

02

An AVL tree method and regression model improve the efficiency of quality-of-reasoning assessments.

03

Forced-choice judgments achieve high inter-rater reliability and accuracy beyond chance.

Abstract

In this paper we investigate the criterion validity of forced-choice comparisons of the quality of written arguments with normative solutions. Across two studies, novices and experts assessing quality of reasoning through a forced-choice design were both able to choose arguments supporting more accurate solutions—62.2% (SE = 1%) of the time for novices and 74.4% (SE = 1%) for experts—and arguments produced by larger teams—up to 82% of the time for novices and 85% for experts—with high inter-rater reliability, namely 70.58% (95% CI = 1.18) agreement for novices and 80.98% (95% CI = 2.26) for experts. We also explored two methods for increasing efficiency. We found that the number of comparative judgments needed could be substantially reduced with little accuracy loss by leveraging transitivity and producing quality-of-reasoning assessments using an AVL tree method. Moreover, a regression…
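The efficiency idea in the abstract, using transitivity so that not every pair of arguments has to be compared, can be illustrated with a generic AVL insertion sort. This is a minimal sketch and not the authors' procedure: it assumes the forced-choice judgments are consistent (transitive), and the names `rank_with_avl` and `better` are hypothetical, introduced only for illustration.

```python
class Node:
    """One argument stored in the AVL tree."""
    def __init__(self, item):
        self.item = item
        self.left = None
        self.right = None
        self.height = 1

def height(node):
    return node.height if node else 0

def update(node):
    node.height = 1 + max(height(node.left), height(node.right))

def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    update(y)
    update(x)
    return x

def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y

def rebalance(node):
    """Restore the AVL invariant (subtree heights differ by at most 1)."""
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:
        if height(node.left.left) < height(node.left.right):
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance < -1:
        if height(node.right.right) < height(node.right.left):
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node

def insert(node, item, goes_left):
    """Insert item, asking one forced-choice question per tree level."""
    if node is None:
        return Node(item)
    if goes_left(item, node.item):
        node.left = insert(node.left, item, goes_left)
    else:
        node.right = insert(node.right, item, goes_left)
    return rebalance(node)

def in_order(node):
    if node is None:
        return []
    return in_order(node.left) + [node.item] + in_order(node.right)

def rank_with_avl(items, better):
    """Rank items from worst to best.

    better(a, b) is a hypothetical stand-in for one rater judgment:
    it returns True when a is judged to be the stronger argument.
    Returns (ranking, number_of_judgments_used).
    """
    judgments = 0

    def goes_left(new_item, existing):
        nonlocal judgments
        judgments += 1
        return better(existing, new_item)  # weaker items go left

    root = None
    for item in items:
        root = insert(root, item, goes_left)
    return in_order(root), judgments
```

Because each insertion asks only one question per level of a balanced tree, the number of judgments grows roughly as n log n rather than the n(n-1)/2 needed to compare every pair. Calling `rank_with_avl(arguments, better)` returns the ranking together with the count of judgments used, so the saving can be checked directly. With real raters the judgments are noisy, so the resulting order is only an approximation.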

Taxonomy

Topics: Decision-Making and Behavioral Economics