Probably Dance

March 7, 2026

I’m Getting a Whiff of Iain Banks’ Culture

The US has been acting powerful recently and it reminded me of this question: What does it feel like to fight against a powerful AI? Not for normal people, for whom there’s no difference between competing against a strong human and a strong AI (you lose hard either way), but for the world’s best humans. We got a sense of the answer before LLMs were a thing, when the frontier research labs were working on game RL:

Fighting against a powerful AI feels like you’re weirdly underpowered somehow. Everything the AI does just works slightly better than it should.

If you’re not a strong human player, the closest feeling is when you play a game with lots of randomness against a really strong player. It will appear as if that strong player just keeps on getting lucky somehow.

I’m getting a similar sense for the recent US foreign interventions and wars. They all seem to work slightly better than they should. It finally clicked for me when Dario Amodei said “This technology can radically accelerate what our military can do. I’ve talked to admirals, I’ve talked to generals, I’ve talked to combatant commanders who say this has revolutionized what we can do.”

Read the rest of this entry »

March 1, 2026

Word Map – A Game About Hill Climbing and Stepping Stones

I really like Semantle because I noticed that progress is similar to the description in Why Greatness Cannot Be Planned: The Myth of the Objective by Kenneth Stanley. The objective function is very clear and you can hill-climb your way to the top, but hill-climbing is actually very difficult in high-dimensional spaces, so you need to explore to find stepping stones and then exploit based on those stepping stones. I have long looked for a game where I can practice that behavior and Semantle almost was that game, so I decided to evolve it to make it easier to practice doing the right behaviors.

The result is Word Map. Give it a play.
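The hill-climbing and stepping-stone idea described above can be sketched in a few lines. This is a toy illustration, not how Word Map or Semantle work: it assumes a made-up one-dimensional landscape instead of a high-dimensional word space, and `hill_climb` and `climb_from_stepping_stones` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Greedy hill climbing on a 1-D landscape: move to the better neighbor until
// neither neighbor improves the score. Returns the index where it stops.
inline std::size_t hill_climb(const std::vector<int>& score, std::size_t start) {
    assert(start < score.size());
    std::size_t pos = start;
    while (true) {
        std::size_t best = pos;
        if (pos > 0 && score[pos - 1] > score[best]) best = pos - 1;
        if (pos + 1 < score.size() && score[pos + 1] > score[best]) best = pos + 1;
        if (best == pos) return pos;  // local maximum: no neighbor is better
        pos = best;
    }
}

// Explore, then exploit: climb from several starting points ("stepping
// stones") and keep the best peak reached from any of them.
inline std::size_t climb_from_stepping_stones(const std::vector<int>& score,
                                              const std::vector<std::size_t>& stones) {
    assert(!stones.empty());
    std::size_t best = hill_climb(score, stones.front());
    for (std::size_t s : stones) {
        std::size_t peak = hill_climb(score, s);
        if (score[peak] > score[best]) best = peak;
    }
    return best;
}
```

On the landscape `{1, 2, 3, 2, 1, 2, 5, 9, 5}`, climbing from index 0 gets stuck on the small hill at index 2, while adding index 5 as a stepping stone reaches the real peak at index 7.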

Read the rest of this entry »

February 10, 2026

How Programmers Spend Their Time

I submitted a tiny patch to flash attention. The necessary typing for the change takes less than ten seconds, but the overall change took more than ten hours. So where does the time go?

It started when a coworker had a bug where cudnn attention would crash randomly. We looked at his unreleased changes and concluded that they couldn’t possibly cause this, so we suspected that we had a lingering bug that was exposed by making harmless changes to related code.

Step 1, a few hours: My coworker tried to figure this out just by running the code repeatedly, trying out various theories. The bug was hard to reproduce so this took hours without much progress.

Step 2, 1 hour: I thought this was a good reason to try out compute sanitizer. It would be easiest to just run it on our existing tests to see if it finds any issues without my coworker’s changes. But the tests run on another box because they require certain GPUs, which means you have to run the tests through some layers. Unfortunately compute sanitizer really wants to be in charge of the program, so we have to convince those layers to let compute sanitizer run the whole thing. It keeps on failing and we can’t figure out why, until eventually I suspect that the issue is that the tests run in a sandbox, and the sandbox is strict enough that it breaks compute sanitizer somehow. This turned out to be true and we probably wasted an hour together.

Read the rest of this entry »

January 31, 2026

How LLMs Keep on Getting Better

If you look at the source code of a modern open source LLM, it looks very similar to the transformer described in the “Attention is all you need” paper from 2017. It’s just a stack of exactly three components: attention blocks, matmuls, and norm layers. The big algorithmic changes, like Mamba 2 or linear attention variants, aren’t really used yet. But look closer and almost everything has changed in the details.

The story of how LLMs keep on getting better is one of pushing for big and little improvements in a hundred different directions. Turns out hill climbing can get you to a really good place if you just climb along enough dimensions. This makes it hard to notice changes as they’re happening because they’re so small, so let’s look at the last two years and see how many small changes there were to add up to the big improvements we saw.

Read the rest of this entry »

November 3, 2025

How to Adjust Your Kids Sleep Schedule for Daylight Savings in 15 Minute Increments

Daylight savings just ended so this post is coming exactly too late. But I was talking to a friend about how we just adjust 15 minutes per day, which is quite easy. She was saying “oh I wish I had that kind of foresight.” But we were talking Friday evening, and daylight savings didn’t end until Sunday, so there was plenty of time for adjustment. So since people don’t seem to realize that, here is the very simple plan for ending daylight savings:

Autumn

Let’s say your kids’ bedtime is at 8pm (I wish), then:

On Friday go to bed 15 minutes late, 8:15pm
On Saturday go to bed 30 minutes late, 8:30pm
On Sunday go to bed 15 minutes early, 7:45pm
On Monday go to bed at your usual time, 8pm

And with that you’re done. If you realize too late and it’s Saturday already, you can just do 20 minute increments:

Read the rest of this entry »

September 28, 2025

Avalanche Studios NYC Retrospective – An Ambitious Company Ruined by Bad Development Practices

I’ve wanted to write about this since the studio closed a year ago. Now that Contraband is also canceled I think it’s time, especially since Contraband was one of the big reasons why I left the company. The blog post turned out much bigger than expected though. There was a lot to get off my chest…

I worked at Avalanche Studios NYC from July 2012 to December 31st 2019, seven and a half years. I wasn’t there for most of Contraband’s development but it was obvious early on that it was going to be a very difficult project. If anything I’m surprised it lasted that long before being canceled.

The studio was born out of ambition. It failed because it could not deliver on that ambition. So this will necessarily be negative. But we had a good run and made two good games. I have so many memories and thoughts that I need to get written down somewhere, so let’s celebrate the good and talk about the troubles.

Read the rest of this entry »

August 4, 2025

Installing a Mini-Split AC in a Brooklyn Apartment

Last year, 2024, we replaced four PTACs with a mini-split AC. I’ve been asked about it often enough (by neighbors, coworkers, friends) that I decided to write up the experience. Hopefully it’s useful for you, too.

Overall this cost us about $40k, including the cost of closing up the PTAC holes. We’ll probably never make the money back on electricity savings, so the main benefits are that we have more quiet and more stable temperatures now. All in all I’m glad that we did it.

(I’ll use the terms “heat pump” and “AC” interchangeably. Every “AC” mentioned in here can do both heating and cooling.)

Read the rest of this entry »

June 19, 2025

Revisiting Knuth’s “Premature Optimization” Paper

The most famous quote from Knuth’s paper “Structured Programming with go to Statements” is this:

There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

People always use this quote wrong, and to get a feeling for that we just have to look at the original paper, and the context in which it was written. The paper isn’t actually about optimization. It’s about when we have to use goto statements because structured programming couldn’t express certain things at the time. Or at least it couldn’t do so efficiently, requiring extra checks, and that’s why Knuth has to talk about performance: The topic he is addressing is “Can we get rid of goto statements without sacrificing performance?”
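To make the “extra checks” point concrete, here is a hypothetical sketch of that kind of trade-off. It is not an example taken from the paper, and the function names `count_with_goto` and `count_structured` are made up. Both functions look up a key and insert it if it is missing, then bump its counter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// goto version: a hit jumps straight to `found`, so nothing is re-checked.
inline std::size_t count_with_goto(std::vector<int>& keys, std::vector<int>& counts, int x) {
    assert(keys.size() == counts.size());
    std::size_t i = 0;
    for (; i < keys.size(); ++i)
        if (keys[i] == x) goto found;
    // Not found: append x. i already equals the index of the new entry.
    keys.push_back(x);
    counts.push_back(0);
found:
    ++counts[i];
    return i;
}

// goto-free version: after the loop we have to test again why it stopped.
inline std::size_t count_structured(std::vector<int>& keys, std::vector<int>& counts, int x) {
    assert(keys.size() == counts.size());
    std::size_t i = 0;
    while (i < keys.size() && keys[i] != x) ++i;
    if (i == keys.size()) {  // the extra check the goto version avoids
        keys.push_back(x);
        counts.push_back(0);
    }
    ++counts[i];
    return i;
}
```

Both versions produce the same result. The only difference is the second comparison after the loop, which a modern compiler may well optimize away.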

Read the rest of this entry »

May 31, 2025

I’m Open-Sourcing my Custom Benchmark GUI

I think one of the reasons why I was able to do good performance work over the years is that at some point I started taking benchmarking seriously enough to write my own library. I used to use Google Benchmark, which is a fine library, but at some point you realize that you need a GUI to really scale up benchmarks¹. Here is a github link, and this video gives a quick intro:

The main problems it tries to address are:

Getting good numbers by running benchmarks repeatedly, visualizing them in context, picking a single opinionated good visualization, handling noise and even adding a bit of well-justified noise, and being careful about what statistics to do on the numbers.
Dealing with the inevitable combinatorial explosion of benchmarks when you want to try different data structures (min-max heap vs interval heap vs binary heap) with different operations (make_heap, push, pop) on different types (int vs string), different compilers, debug build vs release build, different variants of the code (e.g. trying loop unrolling), different input lengths etc. The full combinatorial explosion might be millions or billions of possible benchmarks. I want to be able to get a first impression for a subset in a few minutes. And then if I want less noisy results I can let it run overnight. And then I can try a new variation and visualize it together with the overnight results in under a minute.
Various ergonomic issues. Making it easy to select which numbers are together on the screen. Having the numbers as a graph first, CSV second. Being robust to the code crashing halfway through a long run: Record the partial results and be able to resume the same run. Making it easy to attach a profiler to one specific benchmark that I’m interested in.

This sounds complicated, and I have to admit that this is very much an app written by a programmer for a programmer, but the whole point of a GUI is that I can make this both more powerful and easier to use at the same time. In fact I think the patterns might be more widely useful for people who do slow-running experiments of other kinds (like training an ML model).
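To get a feel for the combinatorial explosion mentioned above, here is a minimal sketch that counts the full cross product of benchmark dimensions and decodes a flat index into one choice per dimension. It is not taken from the benchmark library. `total_combinations` and `decode` are hypothetical names, and apart from the three data structures, three operations and two types listed above, any dimension sizes you plug in are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Number of benchmarks in the full cross product of all dimensions.
inline std::uint64_t total_combinations(const std::vector<std::uint64_t>& sizes) {
    std::uint64_t total = 1;
    for (std::uint64_t n : sizes) total *= n;
    return total;
}

// Decode a flat index into one choice per dimension (mixed-radix digits),
// so a subset of the cross product can be picked without materializing it.
inline std::vector<std::uint64_t> decode(std::uint64_t index,
                                         const std::vector<std::uint64_t>& sizes) {
    assert(index < total_combinations(sizes));
    std::vector<std::uint64_t> choice;
    for (std::uint64_t n : sizes) {
        choice.push_back(index % n);
        index /= n;
    }
    return choice;
}
```

With assumed sizes `{3, 3, 2, 3, 2, 4, 20}` (data structures, operations, types, compilers, build modes, code variants, input lengths) this already gives 8640 benchmarks. Every extra dimension multiplies that number.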

Read the rest of this entry »

February 8, 2025

Why Does Integer Addition Approximate Float Multiplication?

Here is a rough approximation of float multiplication (source):

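#include <bit>      // std::bit_cast (C++20)
#include <cstdint>  // std::uint32_t

float rough_float_multiply(float a, float b) {
    // Removes the doubled exponent bias; slightly below 0x3f800000, the bits of 1.0f.
    constexpr std::uint32_t bias = 0x3f76d000;
    return std::bit_cast<float>(std::bit_cast<std::uint32_t>(a) + std::bit_cast<std::uint32_t>(b) - bias);
}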

We’re casting the floats to ints, adding them, adjusting the exponent, and returning as float. If you think about it for a second you will realize that since the float contains the exponent, this won’t be too wrong: You can multiply two numbers by adding their exponents. So just with the exponent-addition you will be within a factor of 2 of the right result. But this will actually be much better and get within 7.5% of the right answer. Why?

Read the rest of this entry »
