Unveiling Divergent Inductive Biases of LLMs on Temporal Data
Sindhu Kishore, Hangfeng He

TL;DR
This paper investigates the differing biases of GPT-3.5 and GPT-4 in understanding temporal data, revealing significant disparities in their performance and preferences for certain temporal relationships and truth values.
Contribution
It provides a detailed analysis of how two major LLMs exhibit divergent inductive biases on temporal data, highlighting complexities in their temporal understanding.
Findings
GPT-3.5 prefers 'AFTER' in QA format
GPT-4 prefers 'BEFORE' in QA format
GPT-3.5 tends towards 'TRUE', GPT-4 towards 'FALSE' in TE format
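One way to quantify preferences like those listed above is to pose the same event pair in both prompt formats and tally how often each label comes back. The sketch below is illustrative only: the prompt wording, the function names (`qa_prompt`, `te_prompt`, `label_shares`), and the sample responses are assumptions, not the templates or data used in the paper.

```python
from collections import Counter

def qa_prompt(context, event_a, event_b):
    # Hypothetical QA-format template: the model picks a temporal relation.
    return (
        f"Context: {context}\n"
        f"Question: Does '{event_a}' happen BEFORE or AFTER '{event_b}'?\n"
        "Answer with BEFORE or AFTER."
    )

def te_prompt(context, event_a, event_b, relation):
    # Hypothetical TE-format template: the model judges a stated relation.
    return (
        f"Context: {context}\n"
        f"Statement: '{event_a}' happens {relation} '{event_b}'.\n"
        "Answer with TRUE or FALSE."
    )

def label_shares(responses, labels):
    # Normalize raw model outputs, then compute each label's share
    # among the responses that match one of the expected labels.
    counts = Counter(r.strip().upper() for r in responses)
    total = sum(counts[label] for label in labels)
    return {label: (counts[label] / total if total else 0.0) for label in labels}

# Made-up responses, purely to show the tally.
shares = label_shares(["AFTER", "after ", "BEFORE", "AFTER"], ("BEFORE", "AFTER"))
```

A share far from the true label distribution of the evaluation set, in either format, would suggest a bias of the kind the findings describe.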
Abstract
Unraveling the intricate details of events in natural language necessitates a subtle understanding of temporal dynamics. Despite the adeptness of Large Language Models (LLMs) in discerning patterns and relationships from data, their inherent comprehension of temporal dynamics remains a formidable challenge. This research meticulously explores these intrinsic challenges within LLMs, with a specific emphasis on evaluating the performance of GPT-3.5 and GPT-4 models in the analysis of temporal data. Employing two distinct prompt types, namely Question Answering (QA) format and Textual Entailment (TE) format, our analysis probes into both implicit and explicit events. The findings underscore noteworthy trends, revealing disparities in the performance of GPT-3.5 and GPT-4. Notably, biases toward specific temporal relationships come to light, with GPT-3.5 demonstrating a preference for…
Taxonomy
Topics: Neural Networks and Applications
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Multi-Head Attention · Weight Decay · Adam · Cosine Annealing
