Raising the WORDLE first-guess bar!

via boardgamegeek.com

I had seen the term WORDLE mentioned in Discord servers, but didn’t check it out until a few days ago. It’s a fun game that is essentially a Word Mastermind variant for 5 letters:

https://www.powerlanguage.co.uk/wordle/

My first question was obviously “what’s the best first word?” — surely there exists some optimal first guess (in a given word-list).

I grabbed the British English word-list in /usr/share/dict and trimmed it down to just 5-letter alpha-only words (5872 of them). I then wrote a few helper functions for the gameplay components: the ability to create a game object (with a defined or random word), generate feedback for a guess, automatically cull impossible words based on this feedback, print out still-possible words, etc. That is when I realized the scale of the problem and that it would not be easy to solve optimally. Each game state needs to include:

Case #2 vs #3 highlights a small but important distinction for situations where there are repeated letters. For example, if the answer is PUPPY but we guess PAPAL, we know that P_P__ is correct, but we still need to track that P may exist in the answer at positions [0, 1, 2, 4], i.e. the candidates are {PUPPY, POPPY}.
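To make the repeated-letter point concrete, here is a minimal sketch of feedback generation and culling. It is not the code used for this post: the names `feedback` and `cull`, the G/Y/B pattern string, and the tiny word list are all assumptions made for illustration, and the repeated-letter handling follows the usual Wordle rule (greens are matched first, then yellows are handed out only while unmatched copies of that letter remain in the answer).

```python
from collections import Counter


def feedback(guess: str, answer: str) -> str:
    """Return a pattern string: G = green, Y = yellow, B = black."""
    result = ["B"] * len(guess)
    # Count answer letters that were NOT matched exactly;
    # only these can produce yellows.
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
        else:
            unmatched[a] += 1
    # Second pass: a non-green guess letter is yellow only while
    # an unmatched copy of it is still available in the answer.
    for i, g in enumerate(guess):
        if result[i] != "G" and unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1
    return "".join(result)


def cull(words, guess, pattern):
    """Keep only the words that would have produced this pattern."""
    return [w for w in words if feedback(guess, w) == pattern]


# Hypothetical mini word list, just for illustration.
words = ["puppy", "poppy", "paper", "apple", "pupal"]
pattern = feedback("papal", "puppy")
print(pattern)
print(cull(words, "papal", pattern))
```

Culling by "would this candidate have produced the same pattern?" sidesteps explicit per-position bookkeeping: a candidate with extra copies of P survives automatically, because the guess never had enough P's to rule them out.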

I then took a break from coding and did some research, reading this article by Bertrand Fan. His analysis of the game was very similar, and he explained that the game uses a different dictionary, split into two sub-lists: ‘possible answers’ and ‘valid dictionary words that are guaranteed not answers’. He looked at frequency analysis (however, his code ignores the fact that words can have multiple copies of the same letter, which could skew certain letters much more than others). His final analysis is the key one: sorting possible first guesses by how much information they give in terms of green/yellow/black characters. He does admit that the definition of “best” is arbitrary, but I think it can be improved.

I decided to simply evaluate how many possible words are left (average case and worst case) after each possible first guess. After all, the optimal solution to this game is one which can always determine the answer in X guesses or fewer, but has the lowest expected number of guesses over all possible answers. Since we’re not doing an exhaustive search (so there’s no guarantee that an optimal solution exists even for X=6), using the remaining search-space seems more appropriate than just valuing the information by Bertrand’s heuristic (“number of greens * 2 + number of yellows”).
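The metric described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code behind the results: `remaining_stats` and the toy word list are hypothetical, every word is assumed equally likely to be the answer, and guesses and answers are assumed to come from the same list. Grouping the answers by feedback pattern gives buckets. If the true answer lies in a bucket of size s (probability s/n), then s candidates remain, so the expected remaining pool is the sum of s² over all buckets divided by n.

```python
from collections import Counter


def feedback(guess: str, answer: str) -> str:
    """G = green, Y = yellow, B = black (greens matched before yellows)."""
    result = ["B"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if result[i] != "G" and unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1
    return "".join(result)


def remaining_stats(guess, answers):
    """Return (average, min, max) size of the pool left after `guess`."""
    # Bucket every possible answer by the pattern the guess would receive.
    buckets = Counter(feedback(guess, a) for a in answers)
    n = len(answers)
    # A bucket of size s is hit with probability s / n and leaves s words.
    avg = sum(s * s for s in buckets.values()) / n
    return avg, min(buckets.values()), max(buckets.values())


# Hypothetical toy list; the real run used the full 5872-word list.
words = ["raise", "arise", "roate", "puppy", "poppy", "fuzzy"]
for w in words:
    print(w, remaining_stats(w, words))
```

Each evaluation compares one guess against every answer, so ranking all first guesses is quadratic in the size of the word list. The guesses are independent of each other, which is why the work spreads easily over several cores.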

Brute forcing all the possible first guesses took a little over 3 hours running on 7 cores of an i7-6700K.

Best possible words to use as your first guess.

Raw data output (in the form WORD: AVG (min: MIN, max: MAX)): https://gist.github.com/Noxville/d17d04e196b146a2c18eda67aa4d2e13

As said above, we want our initial guess to be one which, on average, reduces the remaining pool the most, but we also care about the worst case. With this in mind, there are two clear favourites for best first guess: ROATE has the lowest average search-space but a relatively high worst case, and RAISE has a slightly higher average than ROATE but a much lower worst case. Guessing ROATE decreases the remaining possible word pool to just ~60.42 words on average, just 1.03% of the original 5872 valid words.

It’s interesting to note that ARISE, AESIR and REAIS (all anagrams of RAISE) also appear in this shortlist. However, because of the letter ordering, they do a less efficient job of eliminating words than RAISE (at least, for the provided word-list).

The worst first guesses include the likely suspects with repeated letters and no E’s, like PUPPY, FUZZY, KIBBI, and (my favourite) SUSUS. Their average and worst-case remaining word counts are roughly 11.5x to 12.5x and 6.5x, respectively, those of the best first guesses.

Some final/random observations:

I might spend some more time to see if there are nice shortcuts to solve the game fully! Until then, I’m starting with RAISE!

- Noxville


Dota 2 statsman and occasional caster | runs @datdota
