Temporally Consistent Factuality Probing for Large Language Models

Ashutosh Bajpai; Aaryan Goyal; Atif Anwer; Tanmoy Chakraborty

arXiv:2409.14065·cs.CL·December 11, 2024

TL;DR

This paper introduces TeCFaP, a new benchmark task for evaluating whether large language models answer time-dependent factual queries consistently, and proposes CoTSeLF, a training framework for improving their temporal factuality.

Contribution

The study presents TeCFaP, a new dataset and metric extension for temporal factuality, and introduces CoTSeLF, a training method to enhance temporal consistency in LLMs.

Findings

01. Most LLMs perform poorly on TeCFaP.

02. CoTSeLF improves temporal factuality in LLMs.

03. Extended metrics effectively measure temporal consistency.

Abstract

The prolific use of Large Language Models (LLMs) as an alternate knowledge base requires them to be factually consistent, necessitating both correctness and consistency traits for paraphrased queries. Recently, significant attempts have been made to benchmark datasets and metrics to evaluate LLMs for these traits. However, structural simplicity (subject-relation-object) and contemporary association in their query formulation limit the broader definition of factuality and consistency. In this study, we introduce TeCFaP, a novel Temporally Consistent Factuality Probe task to expand the consistent factuality probe in the temporal dimension. To this end, we propose TEMP-COFAC, a high-quality dataset of prefix-style English query paraphrases. Subsequently, we extend the definitions of existing metrics to represent consistent factuality across the temporal dimension. We experiment with a diverse…
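The abstract asks for both correctness and consistency over paraphrased queries. As a rough illustration only, the sketch below scores one group of paraphrases with three simple quantities: pairwise agreement between answers, per-paraphrase accuracy, and an all-correct indicator. These are assumptions made for illustration, not the paper's extended temporal metrics; the function names and the example query are hypothetical.

```python
from itertools import combinations


def pairwise_consistency(predictions):
    """Fraction of paraphrase pairs whose predicted answers are identical.

    Assumption: consistency is measured as exact-match agreement between
    every unordered pair of answers to paraphrases of the same query.
    """
    pairs = list(combinations(predictions, 2))
    if not pairs:
        # With fewer than two paraphrases there is nothing to disagree with.
        return 1.0
    return sum(a == b for a, b in pairs) / len(pairs)


def accuracy(predictions, gold):
    """Fraction of paraphrases answered with the gold object."""
    if not predictions:
        return 0.0
    return sum(p == gold for p in predictions) / len(predictions)


def consistent_accuracy(predictions, gold):
    """1.0 only when every paraphrase yields the gold answer, else 0.0."""
    if not predictions:
        return 0.0
    return 1.0 if all(p == gold for p in predictions) else 0.0


# Hypothetical answers from a model to four paraphrases of one
# time-dependent query whose gold answer is "1969".
answers = ["1969", "1969", "1972", "1969"]
print(pairwise_consistency(answers))         # 3 of 6 pairs agree
print(accuracy(answers, "1969"))             # 3 of 4 answers are correct
print(consistent_accuracy(answers, "1969"))  # one wrong answer gives 0.0
```

Under these assumptions a model can be mostly accurate yet only half consistent, which is the gap a consistency-aware probe is meant to expose.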

Code & Models

Repositories

ab-iitd/tecfap · PyTorch · Official

Videos

Temporally Consistent Factuality Probing for Large Language Models · Underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management

Methods: Sparse Evolutionary Training · Balanced Selection