Probably Dance

I can program and like games

Category: Programming

Using TLA+ in the Real World to Understand a Glibc Bug

TLA+ is a formal specification language that you can use to verify programs. It’s different from other formal verification systems in that it’s very pragmatic. Instead of writing proofs, it works using the simple method of running all possible executions of a program. You can write assertions and if they’re not true for any possible execution, it tells you the shortest path through your program that breaks your assertion.

In fact it’s so pragmatic that it even allows you to write your code in a language that looks similar to C.

I recently heard of a bug in the glibc condition variable implementation and since I had used TLA+ before to verify my own mutexes and condition variables, I thought I would investigate. Can you use it to find this bug in real-world complex code? Yes you can, barely, and it wasn’t easy, but it gives me hope that program verification is getting really good and is already able to deal with big and messy code:

Read the rest of this entry »

On Modern Hardware the Min-Max Heap beats a Binary Heap

The heap is a data structure that I use all the time and that others somehow use rarely. (I once had a coworker tell me that he knew some code was mine because it used a heap) Recently I was writing code that could really benefit from using a heap (as most code can) but I needed to be able to pop items from both ends. So I read up on double-ended priority queues and how to implement them. These are rare, but the most common implementation is the “Interval Heap” that can be explained quickly, has clean code and is only slightly slower than a binary heap. But there is an alternative called the “Min-Max Heap” that doesn’t have pretty code, but it has shorter dependency chains, which is important on modern hardware. As a result it often ends up faster than a binary heap, even though it allows you to pop from both ends. Which means there might be no reason to ever use a binary heap again.

Read the rest of this entry »

Partial Scaling – How to do Half a Multiplication

Programmers have a habit of over-generalizing things, and so it happened that I found myself writing more generalized versions of rotation, translation and scaling, the three most common operations that you’d want to do to 3D objects. These were more generalized in that they had a parameter for how much you want to do a certain operation. Like “translate by (10, 0, 0), but only apply the operation 20%”. This is easy to do for translation: Just multiply by the percentage and only translate by (2, 0, 0). Rotations are also easy in many representations: If the angles are explicit, like in Euler angles, you can just multiply those by the percentage; if you’re using quaternions, you can slerp.

But scaling is more complicated. Internally scaling is just multiplication, but how do you do half a multiplication? What does it mean to say “multiply by 4, but only apply the operation 50%”? My first approach was to multiply by the power, so you’d get “multiply by 4^{0.5}=2” (or 4^{0.1} if you only want to apply 10% of the operation) and that seems to work when you’re close to 1, but the further away you are from 1, the more wrong it gets. The answer ends up being to take the median of multiplication, division and exponentiation, but let me further explain the problem first:

Read the rest of this entry »

Why Video Game AI does not Use Machine Learning

I used to be an AI programmer working on video games, and I’m currently trying to learn machine learning. As part of this I find myself having to repeatedly explain why video games don’t use machine learning. People seem to find it interesting enough because it’s not just the obvious reasons (machine learning is hard and far from solved for game playing) but it’s also about developer control and about making an understandable game for the player. Video game AI is designed to deliver a certain experience, which is more difficult to do with machine learning. So this blog post lists the main reasons why video game AI does not use machine learning.

Read the rest of this entry »

Measuring Mutexes, Spinlocks and how Bad the Linux Scheduler Really is

This blog post is one of those things that just blew up. From a tiny observation at work about odd behaviors of spinlocks I spent months trying to find good benchmarks, (still not entirely successful) writing my own spinlocks, mutexes and condition variables and even contributing a patch to the Linux kernel. The main thing I’ll try to answer is to give some more informed guidance on the endless discussion of mutex vs spinlock. Besides that I found that most mutex implementations are really good, that most spinlock implementations are pretty bad, and that the Linux scheduler is OK but far from ideal. The most popular replacement, the MuQSS scheduler has other problems instead. (the Windows scheduler is pretty good though)

So this all started like this: I overheard somebody at work complaining about mysterious stalls while porting Rage 2 to Stadia. (edit disclaimer: This blog post got more attention than anticipated, so I decided to clarify that I didn’t work on the Rage 2 port to Stadia. As far as I know that port was no more or less difficult than a port to any other platform. I am only aware of this problem because I was working in the same office as the people who were working on the port. And the issue was easily resolved by using mutexes instead of spinlocks, which will become clear further down in the blog. All I did was further investigation on my own afterwards. edit end) The only thing those mysterious stalls had in common was that they were all using spinlocks. I was curious about that because I happened to be the person who wrote the spinlock we were using. The problem was that there was a thread that spent several milliseconds trying to acquire a spinlock at a time when no other thread was holding the spinlock. Let me repeat that: The spinlock was free to take yet a thread took multiple milliseconds to acquire it. In a video game, where you have to get a picture on the screen every 16 ms or 33 ms (depending on if you’re running at 60hz or 30hz) a stall that takes more than a millisecond is terrible. Especially if you’re literally stalling all threads. (as was happening here) In our case we were able to make the problem go away by replacing spinlocks with mutexes, but that leads to the question: How do you even measure whether a spinlock is better than a mutex, and what makes a good spinlock?

Read the rest of this entry »

A New Algorithm for Controlled Randomness

I don’t know if this problem has a proper name, but in game development you usually don’t want truly random numbers. I’ve seen the solution named a “pseudo random distribution” but “pseudo random” already has a different meaning outside of game design, so I will just call it controlled randomness for this blog post.

The description of the problem is this: An enemy is supposed to drop a crucial item 50% of the time. Players fight this enemy over and over again, and after fighting the enemy ten times the item still hasn’t dropped. They think the game is broken. Or an attack is supposed to succeed 90% of the time but misses three times in a row. Players will think the game is broken.

In this blog post I want to expand that problem to the situation where you not only have two choices (success or fail) but many choices. For example you want to create traffic on a road and spawn a bunch of random cars without having the same car too many times. The problem was already partially solved for the success/fail case, and in this blog post I will improve on that solution and present the solution for the case where there are many choices.

I will also allow you to control exactly how random or non-random you want the result to be. If you’re fine with a 90% success chance to fail three times in a row in certain situations, but want it to be more reliable in other situations, you will be able to tweak that with a number.

Read the rest of this entry »

Treasure Hunting Systems Found in the History of Video Games

A treasure hunting system is a system that unexpectedly puts out really good stuff. Proper treasure that makes people an enormous amount of money. An example is the Warcraft III modding community which invented several new genres of games and sprouted DotA, whose clones and offspring made their creators rich. (I don’t know how much money exactly, but Riot Games got acquired for $400 million, and their only product is a DotA-clone)

This has happened several times in the history of video games, but I didn’t link these together until I recently saw a talk about the Czechoslovakian game developer community before the iron curtain fell. The presenter talked about how the small country of Czechoslovakia had a thriving video game community despite the fact that you couldn’t buy computers in Czechoslovakia. But when I saw the talk I couldn’t help but think that “this reminds me of the Warcraft 3 modding community,” so I figured I should write up what these and other historical examples have in common so that we can build more systems that generate treasures.

Read the rest of this entry »

On a Future of Screen-less Computers

The current problem with computers was well articulated in the piece The Machine Stops by the late Oliver Sacks:

I cannot get used to seeing myriads of people in the street peering into little boxes or holding them in front of their faces, walking blithely in the path of moving traffic, totally out of touch with their surroundings. I am most alarmed by such distraction and inattention when I see young parents staring at their cell phones and ignoring their own babies as they walk or wheel them along. Such children, unable to attract their parents’ attention, must feel neglected, and they will surely show the effects of this in the years to come.


These gadgets […] have now immersed us in a virtual reality far denser, more absorbing, and even more dehumanizing. I am confronted every day with the complete disappearance of the old civilities. Social life, street life, and attention to people and things around one have largely disappeared, at least in big cities, where a majority of the population is now glued almost without pause to phones or other devices—jabbering, texting, playing games, turning more and more to virtual reality of every sort.

It reminded me of this quote by Wilson Miner:

The car shaped our environment in the 20th century in this huge, tectonic way. I don’t think it’s a stretch to say that the screen will be as important to shaping our environment in the 21st century.

I’m not sure if he meant this as a warning, but considering how little we like being in neighborhoods that are built more for cars than for pedestrians, I think it should be interpreted as one.

Read the rest of this entry »

A Programmers Take on “Six Memos for the Next Millennium”

Six Memos for the Next Millennium is a collection of five lectures that Italo Calvino was going to give in 1985. Unfortunately he passed away before he was able to deliver the lectures. Because of that the book is just a collection of his notes. He also hadn’t started on the sixth one, so the book only contains five. I became aware of the book because Jonathan Blow gave a great talk about it, and about how Italo Calvino inspired him:

The reason why I’m writing about the book is that while I think that they are great memos about writing, the more I think about them, the more they apply to programming. Which is a weird coincidence, because they were supposed to be memos for writers in the next millennium, and programming is kind of a new form of writing that’s becoming more important in this millennium.

Read the rest of this entry »

Fibonacci Hashing: The Optimization that the World Forgot (or: a Better Alternative to Integer Modulo)

I recently posted a blog post about a new hash table, and whenever I do something like that, I learn at least one new thing from my comments. In my last comment section Rich Geldreich talks about his hash table which uses “Fibonacci Hashing”, which I hadn’t heard of before. I have worked a lot on hash tables, so I thought I have at least heard of all the big important tricks and techniques, but I also know that there are so many small tweaks and improvements that you can’t possibly know them all. I thought this might be another neat small trick to add to the collection.

Turns out I was wrong. This is a big one. And everyone should be using it. Hash tables should not be prime number sized and they should not use an integer modulo to map hashes into slots. Fibonacci hashing is just better. Yet somehow nobody is using it and lots of big hash tables (including all the big implementations of std::unordered_map) are much slower than they should be because they don’t use Fibonacci Hashing. So let’s figure this out.

Read the rest of this entry »