ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models

Jierui Li; Vipul Raheja; Dhruv Kumar

arXiv:2311.09182 · cs.CL · April 16, 2024 · 1 citation

TL;DR

This paper introduces ContraDoc, a new dataset for studying self-contradictions in long documents, and evaluates the capabilities of leading large language models, revealing that even the best models are still unreliable on nuanced contradictions.

Contribution

The paper presents the first human-annotated dataset for self-contradictions in long documents and analyzes the performance of top LLMs on this challenging task.

Findings

01. GPT4 performs best among the four evaluated models and can outperform humans on the dataset.

02. All models struggle with nuanced and context-dependent contradictions.

03. Even the best model remains unreliable at detecting complex self-contradictions.

Abstract

In recent times, large language models (LLMs) have shown impressive performance on various document-level tasks such as document classification, summarization, and question-answering. However, research on understanding their capabilities on the task of detecting self-contradictions in long documents has been very limited. In this work, we introduce ContraDoc, the first human-annotated dataset to study self-contradictions in long documents across multiple domains, varying document lengths, self-contradiction types, and scope. We then analyze the current capabilities of four state-of-the-art open-source and commercially available LLMs: GPT3.5, GPT4, PaLM2, and LLaMAv2 on this dataset. While GPT4 performs the best and can outperform humans on this task, we find that it is still unreliable and struggles with self-contradictions that require more nuance and context. We release the dataset and all the…

Code & Models

Repositories

ddhruvkr/contradoc · pytorch · Official

Videos

ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models · underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques