Generic Temporal Reasoning with Differential Analysis and Explanation
Yu Feng, Ben Zhou, Haoyu Wang, Helen Jin, Dan Roth

TL;DR
This paper introduces TODAY, a new temporal reasoning task that tests models' understanding of how subtle contextual changes affect event temporal relations, revealing current models' reliance on superficial cues and proposing methods to improve reasoning capabilities.
Contribution
The paper presents TODAY, a novel benchmark for evaluating and training temporal reasoning models with differential analysis and explanation annotations, enhancing generalizability and reasoning accuracy.
Findings
Existing models, including GPT-3.5, perform at chance on TODAY.
TODAY's supervision and explanations improve model performance on benchmarks.
Models trained with TODAY's approach can leverage noisy supervision sources.
Abstract
Temporal reasoning is the task of predicting temporal relations of event pairs. While temporal reasoning models can perform reasonably well on in-domain benchmarks, we have little idea of these systems' generalizability due to existing datasets' limitations. In this work, we introduce a novel task named TODAY that bridges this gap with temporal differential analysis, which, as the name suggests, evaluates whether systems can correctly understand the effect of incremental changes. Specifically, TODAY introduces slight contextual changes for given event pairs, and systems are asked to tell how each subtle contextual change would affect the relevant temporal relation distributions. To facilitate learning, TODAY also annotates human explanations. We show that existing models, including GPT-3.5, drop to random guessing on TODAY, suggesting that they heavily rely on spurious information rather…
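The differential setup described in the abstract can be illustrated with a minimal sketch. Everything here is hypothetical and is not taken from the paper or its released data: the names `DifferentialInstance`, `predict_shift`, `shift_accuracy`, and `toy_scorer`, the two-way "before"/"after" label, and the toy examples are all assumptions made for illustration. The idea is to score an event pair once with the original context and once with the added sentence, then check whether the probability of "before" moved in the annotated direction.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DifferentialInstance:
    """Hypothetical container for one differential-analysis example."""
    context: str
    event_a: str
    event_b: str
    added_sentence: str   # the slight contextual change
    gold_shift: str       # "before" or "after": direction the change pushes the relation


# Assumed scorer interface: (context, event_a, event_b) -> P(event_a before event_b)
Scorer = Callable[[str, str, str], float]


def predict_shift(inst: DifferentialInstance, scorer: Scorer) -> str:
    """Compare the scorer's output with and without the added sentence."""
    p_orig = scorer(inst.context, inst.event_a, inst.event_b)
    p_mod = scorer(inst.context + " " + inst.added_sentence, inst.event_a, inst.event_b)
    # A strict increase counts as a shift toward "before"; anything else as "after".
    return "before" if p_mod > p_orig else "after"


def shift_accuracy(instances: List[DifferentialInstance], scorer: Scorer) -> float:
    """Fraction of instances whose predicted shift matches the annotated one."""
    if not instances:
        return 0.0
    correct = sum(predict_shift(inst, scorer) == inst.gold_shift for inst in instances)
    return correct / len(instances)


def toy_scorer(context: str, event_a: str, event_b: str) -> float:
    """Keyword stand-in for a real model; purely illustrative."""
    p = 0.5
    if "already" in context:
        p += 0.2
    if "not yet" in context:
        p -= 0.2
    return p


# Invented toy data, not drawn from the TODAY dataset.
toy_instances = [
    DifferentialInstance(
        context="Tim planned a trip.",
        event_a="Tim booked the hotel",
        event_b="Tim packed his bags",
        added_sentence="He had already paid for the room.",
        gold_shift="before",
    ),
    DifferentialInstance(
        context="Tim planned a trip.",
        event_a="Tim booked the hotel",
        event_b="Tim packed his bags",
        added_sentence="He had not yet chosen where to stay.",
        gold_shift="after",
    ),
]

toy_accuracy = shift_accuracy(toy_instances, toy_scorer)
constant_accuracy = shift_accuracy(toy_instances, lambda c, a, b: 0.5)
```

On this balanced toy set, a scorer that ignores the added sentence always predicts the same direction and lands at 50% accuracy, which is the kind of chance-level behavior the abstract reports for existing models.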
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Bayesian Modeling and Causal Inference
Methods: Multi-Head Attention · Attention Is All You Need · Weight Decay · Cosine Annealing · Dropout · Linear Layer · Layer Normalization · Adam
