Word Map – A Game About Hill Climbing and Stepping Stones

by Malte Skarupke

I really like Semantle because making progress in it resembles the process described in Why Greatness Cannot Be Planned: The Myth of the Objective by Kenneth Stanley. The objective function is very clear and you can hill-climb your way to the top, but hill-climbing is actually very difficult in high-dimensional spaces, so you need to explore to find stepping stones and then exploit based on those stepping stones. I have long looked for a game where I can practice that behavior, and Semantle almost was that game, so I decided to evolve it to make that behavior easier to practice.

The result is Word Map. Give it a play.

At a high level it’s a daily word game similar to Wordle. You start off by guessing randomly and looking for patterns that hint at what the target word could be. When you get close to the goal the titular word map pops up. This map addresses the biggest flaw of Semantle: the protracted grind to get to the end. With the help of the map, Word Map is closer in difficulty to Wordle. You won’t guess right in six tries obviously, but here I solve the puzzle on most days, whereas in Semantle I don’t.

Development – Trying out Vibe Coding

This was my first non-trivial project that’s mostly vibe-coded. Probably 98% of the code is written by Claude Code. It was overall very pleasant. I especially like how easy it is to polish things. E.g. while doing the final edits on this blog post I thought “man the text should really increase and decrease in size as you zoom in and out, but it also shouldn’t be too big when you keep on zooming out” so I just asked Claude to do it and it was done in seconds. I could have also done it myself in minutes, but there’s so much less friction when you can just tell an AI to do things. You end up doing more iterations.

The biggest win is that I don’t know CSS and don’t know React and don’t know TypeScript and don’t know what to use to draw a 3D map in a browser, but I am still able to ship a web app using those technologies. And I learned a bit about React and TypeScript while I was at it, which are things I had been meaning to do for a while. (I’ll never learn CSS though. I’ve tried often enough and have concluded that I’ll just leave that one to the AIs.)

It is still flawed and I like to think my programming skills are still relevant, at least for a few more months as this thing gets better. I intervened in a few places:

Claude wanted to use cosine similarity as the similarity metric (meaning normalize, then take the dot product). This is the most obvious thing to do, but the length of the vectors carries meaning, so you really want a metric that takes that into account. I previously had good experiences with the Tanimoto coefficient so I used that: $\frac{x\cdot y}{x\cdot x + y \cdot y - x \cdot y}$
It stumbled over the map selection logic a few times. First it wrote a raycast, which makes sense. You can do two ways of selecting with raycasts, “cast with a radius and select the first hit point” or “select the point closest to the ray”, both of which have annoying edge cases. So I asked it to implement “first snap to the ground then select the nearest,” which is better but also has a few edge cases. So I started writing a very detailed description of how the approaches should be combined until I realized that this is silly and it would be faster for me to make the last little changes myself. Another problem was that this evolved over several sessions so I’d point out a selection bug to Claude and it would have forgotten why the code was the way it was and would say “oh this is complicated let me just delete all this and make it a simple raycast”. Maybe it just needed better comments, a thing that Claude does not yet do on its own. But overall this actually went fairly well and I still got this done faster and to a higher quality than if I had written it all myself.
It couldn’t figure out a bug where the input field kept on losing keyboard focus. Turns out the code was disabling the input field while the guess was being submitted. This makes sense in case the server is laggy, but it also kills keyboard focus. I had to step in and debug this one because Claude couldn’t figure it out after I asked it three or four times.
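
To make the first point concrete, here is a minimal sketch in plain Python (not the game's actual code; the two vectors are made up for illustration). It compares cosine similarity with the Tanimoto coefficient from the formula above on two vectors that point in the same direction but differ in length:

```python
import math

def dot(x, y):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def cosine_similarity(x, y):
    """Normalize, then dot product: ignores vector length entirely."""
    return dot(x, y) / math.sqrt(dot(x, x) * dot(y, y))

def tanimoto(x, y):
    """Tanimoto coefficient: x.y / (x.x + y.y - x.y).
    Only undefined when both vectors are all zeros."""
    xy = dot(x, y)
    return xy / (dot(x, x) + dot(y, y) - xy)

# Hypothetical example vectors, not real word embeddings.
a = [1.0, 2.0, 0.5]
b = [2.0, 4.0, 1.0]  # same direction as a, twice the length

print(cosine_similarity(a, b))  # 1.0: cosine sees these as identical
print(tanimoto(a, b))           # about 0.667: the length difference is penalized
print(tanimoto(a, a))           # 1.0: only an identical vector gets a perfect score
```

Cosine similarity cannot tell `a` and `b` apart, while the Tanimoto coefficient only reaches 1 when the two vectors match in both direction and length.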

But those are details, and Claude figured out many more tricky details in this than I did. The mistakes it makes now still follow clear patterns but the error-area is much smaller than it was a year ago. A year ago it would get itself confused halfway through a long piece of code. Now it gets itself confused when there are complex interactions of things.

This project is big enough that you still have to come up with a plan and make Claude do the work step by step. It was able to create a website that looked and worked like Semantle in one go (with no map). But then I still needed backend functionality to batch-generate puzzles and to deploy this to a server and to enable caching and to test performance (I still expect the server to go down if this becomes popular, but it’s fast enough that I’m not worried about hundreds of users). It helped with all of this but you have to ask it. Also I had to ask it to clean up messes that it made by copy-pasting the same code five times. I noticed this because I asked it to fix a bug (the “I give up” button would still show even though you had already solved a puzzle), but I was still able to trigger the bug by doing something slightly different. Then I finally looked at the code and noticed that there were five places doing the same thing, three of which still had the bug. It can clean that up much quicker than I can, but you have to ask it to do so. I have seen people post prompts online that tell Claude to clean up duplication on its own, and I’ll have to figure out how to set that up.

This was not a quick project even though I only wrote like 2% of the code. It took three months before it was good. Since I started with an existing game, this was mostly an exercise in design and taste.

I had a sense early on that I wanted to somehow visualize which directions you need to go. I thought of drawing arrows next to guesses but then you immediately want the landscape in which the arrows make sense.
I wanted a rugged mountain landscape made out of words where you can see when nearby words climb up walls or lead down into valleys. The target word would be the peak of the mountain. But UMAP just wouldn’t give me layouts that made sense, and I had a hard time trying to nudge it to arrange everything around one central point. I would have had to design every single mountain by hand. (or come up with a way to get an AI to design every single mountain by hand)
In the end I had to settle for a 1D UMAP projection and arrange all the words in a circle, with the distance to the center given by the similarity metric. It means you lose most of the semantic meaning of the words, but it’s much easier to understand and play. You still get some neat effects out of it where as you explore around the mountain you discover new aspects of the target word.
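
The circular layout can be sketched roughly like this. This is a simplified illustration in plain Python, not the game's code: the function name, the toy words, and the numbers are all made up, and it assumes the 1D UMAP coordinate only determines the order of words around the circle, while the radius is one minus the similarity to the target.

```python
import math

def circular_layout(words, umap_1d, similarity):
    """Place words on a circular map.

    umap_1d:    word -> 1D projection value (assumed to come from UMAP)
    similarity: word -> similarity to the target, assumed to lie in [0, 1]
    Returns word -> (x, y), with the target word at the center.
    """
    # Assumption: neighbors in the 1D projection become neighbors in angle.
    ordered = sorted(words, key=lambda w: umap_1d[w])
    positions = {}
    for rank, word in enumerate(ordered):
        angle = 2 * math.pi * rank / len(ordered)
        radius = 1.0 - similarity[word]  # more similar means closer to the peak
        positions[word] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions

# Hypothetical toy data; "bank" plays the role of the target word.
words = ["river", "lake", "bank", "money"]
umap_1d = {"river": 0.1, "lake": 0.3, "bank": 1.2, "money": 2.0}
similarity = {"river": 0.8, "lake": 0.7, "bank": 1.0, "money": 0.4}

positions = circular_layout(words, umap_1d, similarity)
for word, (x, y) in positions.items():
    print(f"{word:>6}: ({x:+.2f}, {y:+.2f})")
```

Because the angle only encodes a single projected dimension, two words can sit next to each other on the circle while differing in many other ways, which is the loss of semantic meaning mentioned above.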

But once again overall this was a very pleasant experience. I can definitely tell that I’m running into some limits of vibe-coding already, but not in a hard way. There is always a workaround where you break down the problem a bit more. I think if I didn’t have Claude Code, I would have never gotten this project done. I often just have an hour in the evening or thirty minutes in the morning and there is no way that I’d attempt something as intimidating as “figure out how to draw a map of word-embeddings” in that little time. But Claude just goes for it and gets it surprisingly right on the first try, and then you can always iterate on the details in the following days.

Mission Accomplished?

Did I actually accomplish my goal of making a game about hill climbing and stepping stones? Not fully. But there are a few things you can learn from playing this game:

Hill climbing works really well in high-dimensional spaces.
But hill climbing is also really difficult in high-dimensional spaces. At least when they’re visualized in low dimensions. It’s really hard to figure out the gradient that would allow you to take the next step, and to come up with a word from that.
Stepping stones help enormously. I really like how exploring the map worked out. At some point you get a bunch of words that have an aspect of the target word that you haven’t incorporated at all, and that gets the mental gears turning.
Near the end of the game it’s still hard to find stepping stones, mostly because I projected the high-dimensional word vector down to 1D. The “Smart Hint” feature is a bit of a clumsy way out of that, because it tells you the missing parts of the high-dimensional vector. It usually makes it pretty easy to come up with the target word. I wish I had come up with a less-big-hammer to solve that. (my best idea for that is to somehow use higher-dimensional UMAP and to draw the other dimensions as other colors or other symbols or something)
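
One way to picture “the missing parts of the high-dimensional vector” is the toy sketch below. It is an illustration under assumptions, not a description of how the Smart Hint feature actually works: it subtracts the best guess from the target and looks for vocabulary words that point in the direction of that leftover. The function name, the three-dimensional vectors, and the words are all hypothetical.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cosine(x, y):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(x, x) * dot(y, y))
    return dot(x, y) / norm if norm > 0 else 0.0

def missing_direction_hints(target, best_guess, vocabulary, top_k=1):
    """Return the words that best match what the guess lacks (target - guess)."""
    residual = [t - g for t, g in zip(target, best_guess)]
    ranked = sorted(vocabulary.items(),
                    key=lambda item: cosine(residual, item[1]),
                    reverse=True)
    return [word for word, _ in ranked[:top_k]]

# Hypothetical 3D "embeddings"; real word vectors have hundreds of dimensions.
vocabulary = {
    "water": [1.0, 0.0, 0.0],
    "money": [0.0, 1.0, 0.0],
    "sky":   [0.0, 0.0, 1.0],
}
target = [1.0, 1.0, 0.0]      # has both a "water" and a "money" aspect
best_guess = [1.0, 0.0, 0.0]  # the guess only captures the "water" aspect

print(missing_direction_hints(target, best_guess, vocabulary))  # ['money']
```

In this toy setup the hint surfaces the aspect the guess has not covered yet, which is the kind of stepping stone that the 1D projection hides.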

I still think this is a good game for practicing the approach for solving hard problems. To work as a game it had to be difficult in some aspects, but not too difficult, and easy in other aspects, but not too easy. I wanted a game where a human would have to use Novelty Search (the algorithm by Kenneth Stanley, who wrote the “Why Greatness Cannot Be Planned” book I referred to at the beginning), and I think I got that. And maybe it can serve as a stepping stone for someone else to make an interesting game that further explores gameplay that requires this kind of thinking.

Probably Dance