XForecast: Evaluating Natural Language Explanations for Time Series Forecasting
Taha Aksu, Chenghao Liu, Amrita Saha, Sarah Tan, Caiming Xiong, Doyen Sahoo

TL;DR
This paper introduces new metrics to evaluate natural language explanations for time series forecasts, demonstrating their effectiveness and analyzing how large language models generate explanations based on numerical reasoning.
Contribution
It proposes simulatability-based metrics for assessing forecast explanations and evaluates LLMs' ability to generate high-quality natural language explanations for time series data.
Findings
Metrics effectively distinguish explanation quality
Human judgments align with the proposed metrics
Numerical reasoning influences explanation quality more than model size
Abstract
Time series forecasting aids decision-making, especially for stakeholders who rely on accurate predictions, making it vital to understand and explain these models to ensure informed decisions. Traditional explainable AI (XAI) methods, which highlight feature or temporal importance, often require expert knowledge. In contrast, natural language explanations (NLEs) are more accessible to laypeople. However, evaluating forecast NLEs is difficult due to the complex causal relationships in time series data. To address this, we introduce two new performance metrics based on simulatability, assessing how well a human surrogate can predict model forecasts using the explanations. Experiments show these metrics differentiate good from poor explanations and align with human judgments. Utilizing these metrics, we further evaluate the ability of state-of-the-art large language models (LLMs)…
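The idea of simulatability described above can be illustrated with a small sketch. This summary does not give the paper's two metric definitions, so the function below is a hypothetical stand-in rather than the authors' formulation: it assumes a surrogate produces one guess of the model's forecast without the explanation and one with it, and scores the explanation by how much it reduces the surrogate's mean absolute error against the model forecast. The names `mean_absolute_error` and `simulatability_gain` are illustrative only.

```python
from typing import Sequence


def mean_absolute_error(target: Sequence[float], guess: Sequence[float]) -> float:
    """Average absolute difference between two equal-length sequences."""
    if len(target) != len(guess) or len(target) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(t - g) for t, g in zip(target, guess)) / len(target)


def simulatability_gain(
    model_forecast: Sequence[float],
    surrogate_without_nle: Sequence[float],
    surrogate_with_nle: Sequence[float],
) -> float:
    """Hypothetical score: reduction in the surrogate's error at predicting
    the MODEL's forecast (not the ground truth) once it sees the explanation.
    Positive values mean the explanation helped the surrogate."""
    error_without = mean_absolute_error(model_forecast, surrogate_without_nle)
    error_with = mean_absolute_error(model_forecast, surrogate_with_nle)
    return error_without - error_with


# Toy numbers, invented purely for illustration.
model_forecast = [10.0, 12.0, 14.0]
guess_without = [8.0, 9.0, 10.0]    # surrogate sees only the history
guess_with = [10.0, 11.0, 13.0]     # surrogate also reads the explanation
gain = simulatability_gain(model_forecast, guess_without, guess_with)
```

Note that the target in this sketch is the model's own forecast: simulatability asks whether the explanation lets someone reproduce what the model predicted, which is separate from whether the forecast was accurate.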
Taxonomy
Topics: Stock Market Forecasting Methods · Time Series Analysis and Forecasting · Scientific Computing and Data Management
Methods: ALIGN
