Prediction Model For Wordle Game Results With High Robustness

Jiaqi Weng; Chunlin Feng

arXiv:2309.14250 · stat.AP · September 26, 2023

TL;DR

This paper develops a robust predictive framework for Wordle game results using data analysis and machine learning, including neural networks and clustering, to estimate difficulty and submission patterns.

Contribution

It introduces a novel combination of time series, neural networks, and clustering to predict Wordle outcomes and difficulty levels, addressing data biases and overfitting.

Findings

01 Around 12,884 results are predicted to be submitted on March 1, 2023

02 The word 'eerie' averages 4.8 attempts, which places it in the hardest difficulty cluster

03 Sensitivity analyses indicate that the models are robust

Abstract

In this study, we examine the dynamics of Wordle using data analysis and machine learning. Our analysis first focused on the relationship between the date and the number of submitted results. Because the early data are distorted by the game's initial surge in popularity, we modeled only the stable portion of the data, using an ARIMAX model with order parameters (9, 0, 2) and weekday/weekend as the exogenous variable. We found no significant relationship between word attributes and hard mode results. To predict word difficulty, we employed a Backpropagation Neural Network and overcame overfitting through feature engineering. We also used K-means clustering, optimized at five clusters, to categorize word difficulty numerically. Our findings indicate that on March 1st, 2023, around 12,884 results will be submitted and that the word "eerie" averages 4.8 attempts, falling into the hardest difficulty cluster. We further examined the percentage of loyal…
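The weekday/weekend exogenous variable mentioned in the abstract could be built roughly as follows. This is a minimal sketch, not the authors' code: the 0/1 encoding (1 for Saturday and Sunday), the helper names `is_weekend` and `weekend_regressor`, and the example date window are all assumptions made for illustration. Only the order (9, 0, 2) and the March 1, 2023 forecast date come from the abstract.

```python
from datetime import date, timedelta

# Order reported in the abstract, read here as ARIMA (p, d, q).
ARIMAX_ORDER = (9, 0, 2)


def is_weekend(day: date) -> int:
    """Return 1 for Saturday or Sunday, 0 for Monday to Friday (assumed encoding)."""
    return 1 if day.weekday() >= 5 else 0


def weekend_regressor(start: date, end: date) -> list[tuple[date, int]]:
    """Pair every day from start to end (inclusive) with its weekend flag."""
    n_days = (end - start).days + 1
    days = [start + timedelta(days=i) for i in range(n_days)]
    return [(day, is_weekend(day)) for day in days]


# Hypothetical window ending on the forecast date quoted in the abstract.
window = weekend_regressor(date(2023, 2, 25), date(2023, 3, 1))
for day, flag in window:
    print(day.isoformat(), flag)
```

A column like this, aligned with the daily submission counts, is the kind of input that a time series library accepts as an exogenous regressor, for example the `exog` argument of `SARIMAX` in statsmodels together with `order=(9, 0, 2)`. Whether the authors used that library is not stated in the abstract.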

Taxonomy

Topics: Statistical Methods in Epidemiology · Energy Load and Power Forecasting