Prediction Model For Wordle Game Results With High Robustness
Jiaqi Weng, Chunlin Feng

TL;DR
This paper develops a robust predictive framework for Wordle game results using data analysis and machine learning, including neural networks and clustering, to estimate difficulty and submission patterns.
Contribution
It introduces a novel combination of time series, neural networks, and clustering to predict Wordle outcomes and difficulty levels, addressing data biases and overfitting.
Findings
About 12,884 results are forecast to be submitted on March 1, 2023
The word "eerie" averages 4.8 attempts, placing it in the hardest difficulty cluster
Sensitivity analyses indicate that the models are robust
Abstract
In this study, we delve into the dynamics of Wordle using data analysis and machine learning. Our analysis initially focused on the correlation between the date and the number of submitted results. Because the early data are biased by the game's initial popularity, we modeled the stable portion of the data using an ARIMAX model of order (p, d, q) = (9, 0, 2), with a weekday/weekend indicator as the exogenous variable. We found no significant relationship between word attributes and hard mode results. To predict word difficulty, we employed a Backpropagation Neural Network, overcoming overfitting via feature engineering. We also used K-means clustering, optimized at five clusters, to categorize word difficulty numerically. Our findings indicate that on March 1st, 2023, around 12,884 results will be submitted and that the word "eerie" averages 4.8 attempts, falling into the hardest difficulty cluster. We further examined the percentage of loyal…
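The five-cluster difficulty grouping described in the abstract can be illustrated with a minimal sketch. It assumes clustering on a single feature, the mean number of attempts per word, since this summary does not say which features the authors actually clustered on. The helper name `kmeans_1d` is hypothetical, and every value in `mean_attempts` except the 4.8 for "eerie" is an invented placeholder, not data from the paper.

```python
from statistics import mean


def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's algorithm on scalar values; returns (centroids, labels)."""
    distinct = sorted(set(values))
    if k < 1 or k > len(distinct):
        raise ValueError("k must be between 1 and the number of distinct values")
    # Deterministic initialisation: spread the starting centroids
    # evenly across the sorted distinct values.
    step = (len(distinct) - 1) / (k - 1) if k > 1 else 0
    centroids = [distinct[int(i * step)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


# Placeholder mean-attempt values (ASSUMED for illustration only);
# the 4.8 for "eerie" is the one figure taken from the abstract.
mean_attempts = {
    "word_a": 3.3, "word_b": 3.4, "word_c": 3.7, "word_d": 3.8,
    "word_e": 4.0, "word_f": 4.1, "word_g": 4.4, "word_h": 4.5,
    "eerie": 4.8, "word_i": 4.9,
}
names = list(mean_attempts)
centroids, labels = kmeans_1d(list(mean_attempts.values()), k=5)

# The hardest cluster is the one whose centroid has the highest mean attempts.
hardest = centroids.index(max(centroids))
eerie_cluster = labels[names.index("eerie")]
print("eerie is in the hardest cluster:", eerie_cluster == hardest)
```

On this toy data "eerie" lands in the cluster with the largest centroid, which matches the abstract's statement only by construction of the placeholder values. A real analysis would more likely use a library implementation with several random restarts and a principled choice of the number of clusters.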
Taxonomy
Topics: Statistical Methods in Epidemiology · Energy Load and Power Forecasting
