Probably Dance

I can program and like games

Looking for Voter Fraud (in old elections) with Data Visualization

The 2020 US election is finally over, and all the election excitement of the last week reminded me of something I had been meaning to look into: Sergey Shpilkin’s data visualizations that clearly show fraud in Russian elections.

I generated the same visualizations for all US presidential elections from 2000 to 2016. The result is that I can’t find any evidence of fraud in any of those elections. But the visualizations show clear evidence of voter suppression of democrat voters. On closer inspection that turns out to be the effects of the electoral college system, which leads to a very interesting conclusion: You might already know that if the president was elected by popular vote, the US would only have had four years of republican presidencies from 1992 to 2024, with the rest being democrat. But these visualizations suggest that just looking at the popular vote actually underestimates the distortion of the electoral college. It also acts akin to voter suppression of democrats, without which national politics would swing even stronger to the left than the popular vote suggests. But lets start by looking at Sergey Shpilkin’s work:

Read the rest of this entry »

Using TLA+ in the Real World to Understand a Glibc Bug

TLA+ is a formal specification language that you can use to verify programs. It’s different from other formal verification systems in that it’s very pragmatic. Instead of writing proofs, it works using the simple method of running all possible executions of a program. You can write assertions and if they’re not true for any possible execution, it tells you the shortest path through your program that breaks your assertion.

In fact it’s so pragmatic that it even allows you to write your code in a language that looks similar to C.

I recently heard of a bug in the glibc condition variable implementation and since I had used TLA+ before to verify my own mutexes and condition variables, I thought I would investigate. Can you use it to find this bug in real-world complex code? Yes you can, barely, and it wasn’t easy, but it gives me hope that program verification is getting really good and is already able to deal with big and messy code:

Read the rest of this entry »

On Modern Hardware the Min-Max Heap beats a Binary Heap

The heap is a data structure that I use all the time and that others somehow use rarely. (I once had a coworker tell me that he knew some code was mine because it used a heap) Recently I was writing code that could really benefit from using a heap (as most code can) but I needed to be able to pop items from both ends. So I read up on double-ended priority queues and how to implement them. These are rare, but the most common implementation is the “Interval Heap” that can be explained quickly, has clean code and is only slightly slower than a binary heap. But there is an alternative called the “Min-Max Heap” that doesn’t have pretty code, but it has shorter dependency chains, which is important on modern hardware. As a result it often ends up faster than a binary heap, even though it allows you to pop from both ends. Which means there might be no reason to ever use a binary heap again.

Read the rest of this entry »

Partial Scaling – How to do Half a Multiplication

Programmers have a habit of over-generalizing things, and so it happened that I found myself writing more generalized versions of rotation, translation and scaling, the three most common operations that you’d want to do to 3D objects. These were more generalized in that they had a parameter for how much you want to do a certain operation. Like “translate by (10, 0, 0), but only apply the operation 20%”. This is easy to do for translation: Just multiply by the percentage and only translate by (2, 0, 0). Rotations are also easy in many representations: If the angles are explicit, like in Euler angles, you can just multiply those by the percentage; if you’re using quaternions, you can slerp.

But scaling is more complicated. Internally scaling is just multiplication, but how do you do half a multiplication? What does it mean to say “multiply by 4, but only apply the operation 50%”? My first approach was to multiply by the power, so you’d get “multiply by 4^{0.5}=2” (or 4^{0.1} if you only want to apply 10% of the operation) and that seems to work when you’re close to 1, but the further away you are from 1, the more wrong it gets. The answer ends up being to take the median of multiplication, division and exponentiation, but let me further explain the problem first:

Read the rest of this entry »

Concepts for the Current US Mess

An unforeseen disaster is never the consequence of a single factor, but rather is like a whirlwind, a point of cyclonic depression in the consciousness of the world, towards which a whole multiplicity of converging causalities have conspired

– Carlo Emilio Gadda (in That Awful Mess on the Via Merulana)

It’s hard for me to write a focused blog post at the moment because there just seem to be too many active problems. I could have written a focused blog post about a programming topic, but that feels tone-deaf. So instead this will be a scatter-shot blog post about ways of thinking that could help us out of this mess. Also, since I usually write about programming, I will try to feed the lessons back to programming.

For context (if you’re reading this in the future or from another country) the US has had a really bad year. We nearly started a war with Iran, we impeached our president but couldn’t get him out of office, and then we completely failed our response to the global pandemic. After initially doing nothing and hoping it would just go away, the US decided to react in the most costly way possible, causing mass unemployment while still proving mostly impotent in fighting the virus. Now, after that huge sunk cost, we have mostly given up on fighting the coronavirus, just in time for a new problem to arise: Massive amounts of protests all over the country, some of which even turned into riots. The immediate cause is that the police killed another unarmed black man because he was briefly resisting them. But of course it’s pent-up anger from years of police brutality. And of course it couldn’t have come at a worse time with mass-unemployment and a pandemic still raging through the country.

All of this didn’t have to be, so here are some helpful tools of thought:

Read the rest of this entry »

The Covid-Shutdowns are Actually a Great Civics Lesson

Currently much of the country is shut down to stop the spread of the coronavirus and there is very active debate about how soon we should open up again. Some people say as soon as possible, others are saying immediately. Those might sound like similar viewpoints, but “as soon as possible” might be anything from two weeks to two months, depending on who you ask. There’s also a lot of debate about how deadly a second wave would actually be if we opened up the country with few or no restrictions. What percentage would get the virus? How many of those would die?

Uncertainty about all of those numbers is slowly decreasing and it seems like the reopening will happen sooner rather than later.

But I want to frame the debate about how it’s actually a great civics lesson. It shows how the government is really of the people, by the people and for the people, and how it can only do things that the people allow it to do. It also neatly shows how we need the government to do things that everyone wants to happen, but that they can’t make happen on their own.

Read the rest of this entry »

A New York History of Covid-19, Written at the Half-Way Point

You have to write these things down while you still remember them. I was already beginning to forget that there was a toilet paper shortage. Similarly right now the popular thing is to point out that this was predictable and we should have listened to the experts. But the experts were predicting that this would be much less bad:

The consensus forecast generated by the individual responses indicates that we should expect roughly 19,000 reported cases by March 29

To be fair, they thought that the curve would behave similar as in other countries. They didn’t expect the US government to mess up its response this badly.

Another thing that people are already forgetting is what “flatten the curve” meant. It was supposed to be a strategy to avoid the quarantine lockdown that we all now live in. Western countries didn’t want to do what China did, and “flattening the curve” was the appealing alternative. A lot of these things can only be understood in context, because things are changing so incredibly quickly that it feels like we’re living in a whole new world every couple weeks, and we forget. So lets start at the beginning while we still remember:

Read the rest of this entry »

A New Strategy Genre Grows Up: Survival Chaos, my New Favorite Game

I’ve had an obsession recently with a mod for Warcraft 3. It’s called Survival Chaos. I want to talk about it because it’s part of a genre of strategy games that hasn’t had a big success yet, and this feels like a big evolution, maybe even a breakthrough. It’s rare to see a new video game genre emerge like this, and nobody ever writes about this while it’s happening. The history of Auto Chess, the other recent genre to come out of Warcraft 3, is almost completely lost. (I was able to find a very similar map called “Pokemon Defense” from 2010, but that’s about it…)

I am not sure if the genre has a good name yet since it’s never been big. In StarCraft 2 it’s called “Tug Of War” so I’ll go with that. The basic idea is to make a RTS where you don’t control your units. You just build buildings, the buildings automatically make units, and you watch the units fight automatically. You mostly make decisions about the macro: When to invest in your economy, what units you should invest in, what upgrades you should get. Before we go any further though, let’s just watch a video of someone playing the game:

Read the rest of this entry »

Why Video Game AI does not Use Machine Learning

I used to be an AI programmer working on video games, and I’m currently trying to learn machine learning. As part of this I find myself having to repeatedly explain why video games don’t use machine learning. People seem to find it interesting enough because it’s not just the obvious reasons (machine learning is hard and far from solved for game playing) but it’s also about developer control and about making an understandable game for the player. Video game AI is designed to deliver a certain experience, which is more difficult to do with machine learning. So this blog post lists the main reasons why video game AI does not use machine learning.

Read the rest of this entry »

Measuring Mutexes, Spinlocks and how Bad the Linux Scheduler Really is

This blog post is one of those things that just blew up. From a tiny observation at work about odd behaviors of spinlocks I spent months trying to find good benchmarks, (still not entirely successful) writing my own spinlocks, mutexes and condition variables and even contributing a patch to the Linux kernel. The main thing I’ll try to answer is to give some more informed guidance on the endless discussion of mutex vs spinlock. Besides that I found that most mutex implementations are really good, that most spinlock implementations are pretty bad, and that the Linux scheduler is OK but far from ideal. The most popular replacement, the MuQSS scheduler has other problems instead. (the Windows scheduler is pretty good though)

So this all started like this: I overheard somebody at work complaining about mysterious stalls while porting Rage 2 to Stadia. (edit disclaimer: This blog post got more attention than anticipated, so I decided to clarify that I didn’t work on the Rage 2 port to Stadia. As far as I know that port was no more or less difficult than a port to any other platform. I am only aware of this problem because I was working in the same office as the people who were working on the port. And the issue was easily resolved by using mutexes instead of spinlocks, which will become clear further down in the blog. All I did was further investigation on my own afterwards. edit end) The only thing those mysterious stalls had in common was that they were all using spinlocks. I was curious about that because I happened to be the person who wrote the spinlock we were using. The problem was that there was a thread that spent several milliseconds trying to acquire a spinlock at a time when no other thread was holding the spinlock. Let me repeat that: The spinlock was free to take yet a thread took multiple milliseconds to acquire it. In a video game, where you have to get a picture on the screen every 16 ms or 33 ms (depending on if you’re running at 60hz or 30hz) a stall that takes more than a millisecond is terrible. Especially if you’re literally stalling all threads. (as was happening here) In our case we were able to make the problem go away by replacing spinlocks with mutexes, but that leads to the question: How do you even measure whether a spinlock is better than a mutex, and what makes a good spinlock?

Read the rest of this entry »