Probably Dance

I can program and like games

C++ Coroutines Do Not Spark Joy

C++20 added minimal support for coroutines. I think they’re done in a way that really doesn’t fit into C++, mostly because they don’t follow the zero-overhead principle. Calling a coroutine can be very expensive (requiring calls to new() and delete()) in a way that’s not entirely under your control, and they’re designed to make it extra hard for you to control how expensive they are. I think they were inspired by C# coroutines, and the design does fit into C# much better. But in C++ I don’t know who they are for, or who asked for this…

Before we get there I’ll have to explain what they are and what they’re useful for. Briefly, they’re very useful for code with concurrency. The classic example is code with multiple state machines that change state at different times: for example, the state machine for reading data from the network is waiting for more bytes, and the code that provides those bytes is also a state machine (maybe it’s decompressing), which in turn gets its bytes from yet another state machine (maybe it’s handling the TCP/IP layer). This is easier to do when all of these can pretend to operate on their own, as if in separate threads, maybe communicating through pipes. But the code looks nicer if the streamer can just call into the decompressor using a normal synchronous function call that returns bytes. Coroutines allow you to do that without blocking your entire system when more bytes aren’t immediately available, because code can pause and resume in the middle of a function.

One of the best things the C++ standard did is to define the word “coroutine” as distinct from related concepts like “fibers” or “green threads”. (This very much went against existing usage, so for example Lua coroutines are not the same thing as C++ coroutines. I think that’s fine, since the thing that was added to C++ could be useful, and there is a good case to be made that the existing usage was wrong.) In the standard, a coroutine is simply a function that behaves differently when called multiple times: instead of restarting from the beginning on each call, it continues running after the point where it last returned. To do this, it needs some state that stores where it last returned and what the values of its local variables were at that point. In existing usage that meant you needed a program stack to store this state, but in C++ this is just stored in a struct.

To illustrate all of this, let’s build a coroutine in plain C++, without using the language coroutines:
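As a minimal sketch of the idea (my own illustration, not the code the post goes on to build): a struct stores the resume point and the locals, and a switch at the top of the call jumps back to wherever the previous call returned:

```cpp
#include <cassert>

// A hand-rolled coroutine: the struct stores where we paused (state) and
// the local variables that need to survive across calls (i), and
// operator() resumes from the last pause point via the switch.
struct CountTo3 {
    int state = 0; // which pause point to resume from
    int i = 0;     // a "local variable" living in the struct, not on the stack
    // Returns 1, 2, 3 on successive calls, then -1 once finished.
    int operator()() {
        switch (state) {
        case 0:
            for (i = 1; i <= 3; ++i) {
                state = 1;
                return i; // pause here...
        case 1:;          // ...and resume here on the next call
            }
        }
        state = 2; // finished
        return -1;
    }
};
```

Each call picks up inside the loop where the previous call left off, which is exactly the bookkeeping that the language feature generates for you behind the scenes.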

Read the rest of this entry »

A Logarithm is Just the Number of Digits

If you don’t have a good intuition for how logarithms work, here is how you can explain it to a third grader: it’s (roughly) the number of digits. For example, it’s the number of zeros in these:

log_{10}(100) = 2

log_{10}(1000) = 3

log_{10}(0.01) = -2

log_{10}(0.001) = -3

For this to work out, we have to transition through zero: log_{10}(1) = 0

If it bothers you that I’m off by one, because clearly 100 has 3 digits, not 2, you can get the actual number of digits by rounding away from 0 to the next integer.

Where it gets complicated is when you have a number between two round numbers:

log_{10}(123) = ?

We can’t just count the number of digits. The answer has to be somewhere between 2 and 3, coming out to roughly 2.09. But let’s not worry about those odd cases for now. Where this really helps is in building an intuition for the logarithm rules:
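For instance, the product rule falls right out of digit counting: multiplying two round numbers just adds their zero counts.

```latex
\log_{10}(100 \cdot 1000) = \log_{10}(100000) = 5 = 2 + 3 = \log_{10}(100) + \log_{10}(1000)
```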

Read the rest of this entry »

Happy Easter! I hid an Easter Egg in Just Cause 4 for You

One of my favorite things to have worked on was the Getting Over It easter egg in Just Cause 4. It was quite popular. Since I love Getting Over It, I decided to make three more easter eggs, one for each DLC. Turns out that was impossibly ambitious, but I did manage to get one out. One of the DLCs contains a second Getting Over It easter egg that nobody has found yet:

I already leaked this subtly a while ago, so you can find the coordinates on some websites. But I must have been too subtle… Last Easter didn’t feel like the right time for this bigger reveal (I had only recently left Avalanche, plus everyone’s brain was kind of fried thinking about covid), so this Easter seems more appropriate. So find the coordinates online, or go looking based on the video. Happy Easter egg hunting!

Read the rest of this entry »

Understanding why the CDC Gives the Guidelines that it Does

The CDC continues to issue very puzzling guidelines about the coronavirus. The most recent strange recommendation is that even if you already had covid, you should still get the vaccine. Which… makes no sense?

Let’s assume our goal is to reduce the number of deaths. Currently in the US, something like 26 million people have tested positive for SARS-CoV-2, and roughly 440,000 people have died of COVID-19. I’m using these terms to make a distinction: not everyone who gets the virus will get sick, and not everyone who gets sick will die. Let’s be generous and assume that 100 million people got the virus so far; then the mortality rate is 440,000/100,000,000 = 0.44%.

What do those same numbers look like for people who already had the virus before? Obviously the first number, the chance of getting the virus, should be roughly the same. But your body has fought this virus (or a very similar virus) before, so your chance of getting sick is much lower. For a long time it was uncertain whether you can get sick a second time at all, but by now there are enough confirmed cases of reinfection. How many of those have died? At least one, who was on chemotherapy at the time. Let’s estimate the number at 10, which would get us to a mortality rate of 10/26,000,000 = 0.000038%.

Should we really be giving people with that second risk the vaccine when not everyone in the first group has had a vaccine yet? Either the CDC worked with different numbers that are ten thousand times bigger, or their goal is not to reduce the number of deaths. What could that other goal be?
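Spelling the comparison out with the rough estimates above (my arithmetic, using the post’s numbers):

```cpp
#include <cassert>

// The back-of-the-envelope numbers from the paragraphs above:
// 440,000 deaths among an assumed 100 million first infections, versus
// an estimated 10 deaths among 26 million confirmed prior cases.
double first_infection_mortality() { return 440000.0 / 100000000.0; } // ~0.44%
double reinfection_mortality()     { return 10.0 / 26000000.0; }      // ~0.000038%
```

Dividing the two rates gives a ratio of roughly 11,000, which is where the “ten thousand times” in the text comes from.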

Read the rest of this entry »

Looking for Voter Fraud (in old elections) with Data Visualization

The 2020 US election is finally over, and all the election excitement of the last week reminded me of something I had been meaning to look into: Sergey Shpilkin’s data visualizations that clearly show fraud in Russian elections.

I generated the same visualizations for all US presidential elections from 2000 to 2016. The result is that I can’t find any evidence of fraud in any of those elections. But the visualizations show clear evidence of voter suppression of Democratic voters. On closer inspection that turns out to be an effect of the electoral college system, which leads to a very interesting conclusion: you might already know that if the president were elected by popular vote, the US would only have had four years of Republican presidencies from 1992 to 2024, with the rest being Democratic. But these visualizations suggest that just looking at the popular vote actually underestimates the distortion of the electoral college. It also acts akin to voter suppression of Democrats, without which national politics would swing even more strongly to the left than the popular vote suggests. But let’s start by looking at Sergey Shpilkin’s work:

Read the rest of this entry »

Using TLA+ in the Real World to Understand a Glibc Bug

TLA+ is a formal specification language that you can use to verify programs. It’s different from other formal verification systems in that it’s very pragmatic. Instead of requiring you to write proofs, it uses the simple method of exhaustively running all possible executions of a program. You write assertions, and if some possible execution violates one of them, it tells you the shortest path through your program that breaks your assertion.

In fact it’s so pragmatic that it even allows you to write your code in a language that looks similar to C.

I recently heard of a bug in the glibc condition variable implementation and since I had used TLA+ before to verify my own mutexes and condition variables, I thought I would investigate. Can you use it to find this bug in real-world complex code? Yes you can, barely, and it wasn’t easy, but it gives me hope that program verification is getting really good and is already able to deal with big and messy code:
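As a toy illustration of that workflow (my own throwaway spec, nothing to do with the glibc model), here is an invariant that the TLC model checker will refute, printing the shortest behavior that reaches the violation:

```tla
---- MODULE Counter ----
EXTENDS Naturals
VARIABLE x
Init == x = 0
Next == x' = x + 1
\* Checked in every reachable state; TLC reports the shortest
\* sequence of steps that makes it false.
NotTooBig == x <= 3
====
```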

Read the rest of this entry »

On Modern Hardware the Min-Max Heap beats a Binary Heap

The heap is a data structure that I use all the time and that others somehow use rarely. (I once had a coworker tell me that he knew some code was mine because it used a heap.) Recently I was writing code that could really benefit from using a heap (as most code can), but I needed to be able to pop items from both ends. So I read up on double-ended priority queues and how to implement them. These are rare, but the most common implementation is the “Interval Heap”, which can be explained quickly, has clean code, and is only slightly slower than a binary heap. But there is an alternative called the “Min-Max Heap” whose code isn’t pretty, but which has shorter dependency chains, and that matters on modern hardware. As a result it often ends up faster than a binary heap, even though it allows you to pop from both ends. Which means there might be no reason to ever use a binary heap again.
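To give a flavor of why the min-max heap is cheap to query at both ends (a sketch of the standard layout, not the post’s implementation): even levels of the implicit tree are min levels and odd levels are max levels, so the minimum sits at the root and the maximum is the larger of the root’s two children.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Min-max heap stored as an implicit binary tree in an array:
// level 0 (the root) is a min level, level 1 a max level, and so on.
// The minimum is always at index 0.
template <typename T>
const T& minmax_heap_min(const std::vector<T>& heap) {
    return heap.front();
}

// The maximum sits on the first max level: the larger of indices 1 and 2.
template <typename T>
const T& minmax_heap_max(const std::vector<T>& heap) {
    if (heap.size() == 1) return heap[0];
    if (heap.size() == 2) return heap[1];
    return std::max(heap[1], heap[2]);
}
```

Popping from either end then refills the hole by sifting down while alternating between min and max levels, which is where the not-so-pretty code lives.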

Read the rest of this entry »

Partial Scaling – How to do Half a Multiplication

Programmers have a habit of over-generalizing things, and so it happened that I found myself writing more generalized versions of rotation, translation and scaling, the three most common operations that you’d want to do to 3D objects. These were more generalized in that they had a parameter for how much you want to do a certain operation. Like “translate by (10, 0, 0), but only apply the operation 20%”. This is easy to do for translation: Just multiply by the percentage and only translate by (2, 0, 0). Rotations are also easy in many representations: If the angles are explicit, like in Euler angles, you can just multiply those by the percentage; if you’re using quaternions, you can slerp.

But scaling is more complicated. Internally scaling is just multiplication, but how do you do half a multiplication? What does it mean to say “multiply by 4, but only apply the operation 50%”? My first approach was to multiply by the power, so you’d get “multiply by 4^{0.5}=2” (or 4^{0.1} if you only want to apply 10% of the operation) and that seems to work when you’re close to 1, but the further away you are from 1, the more wrong it gets. The answer ends up being to take the median of multiplication, division and exponentiation, but let me further explain the problem first:
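The power-based first attempt can be sketched like this (my own toy code with a hypothetical name, just to pin down the idea):

```cpp
#include <cassert>
#include <cmath>

// Apply fraction t of "multiply by s" by interpolating in the exponent:
// t = 0 gives a factor of 1 (no scaling), t = 1 gives the full factor s,
// and t = 0.5 of "multiply by 4" gives 4^0.5 = 2.
double partial_scale_factor(double s, double t) {
    return std::pow(s, t);
}
```

This behaves well for scale factors near 1, which is why it looks right at first; the post’s point is that it gets more and more wrong as s moves away from 1.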

Read the rest of this entry »

Concepts for the Current US Mess

An unforeseen disaster is never the consequence of a single factor, but rather is like a whirlwind, a point of cyclonic depression in the consciousness of the world, towards which a whole multiplicity of converging causalities have conspired

– Carlo Emilio Gadda (in That Awful Mess on the Via Merulana)

It’s hard for me to write a focused blog post at the moment because there just seem to be too many active problems. I could have written a focused blog post about a programming topic, but that feels tone-deaf. So instead this will be a scatter-shot blog post about ways of thinking that could help us out of this mess. Also, since I usually write about programming, I will try to feed the lessons back to programming.

For context (if you’re reading this in the future or from another country): the US has had a really bad year. We nearly started a war with Iran, we impeached our president but couldn’t remove him from office, and then we completely failed in our response to the global pandemic. After initially doing nothing and hoping it would just go away, the US decided to react in the most costly way possible, causing mass unemployment while still proving mostly impotent in fighting the virus. Now, after that huge sunk cost, we have mostly given up on fighting the coronavirus, just in time for a new problem to arise: massive protests all over the country, some of which even turned into riots. The immediate cause is that the police killed another unarmed black man after he briefly resisted them. But of course it’s pent-up anger from years of police brutality. And of course it couldn’t have come at a worse time, with mass unemployment and a pandemic still raging through the country.

All of this didn’t have to be, so here are some helpful tools of thought:

Read the rest of this entry »

The Covid-Shutdowns are Actually a Great Civics Lesson

Currently much of the country is shut down to stop the spread of the coronavirus, and there is very active debate about how soon we should open up again. Some people say as soon as possible, others say immediately. Those might sound like similar viewpoints, but “as soon as possible” might be anything from two weeks to two months, depending on who you ask. There’s also a lot of debate about how deadly a second wave would actually be if we opened up the country with few or no restrictions. What percentage would get the virus? How many of those would die?

Uncertainty about all of those numbers is slowly decreasing and it seems like the reopening will happen sooner rather than later.

But I want to frame the debate about how it’s actually a great civics lesson. It shows how the government is really of the people, by the people and for the people, and how it can only do things that the people allow it to do. It also neatly shows how we need the government to do things that everyone wants to happen, but that they can’t make happen on their own.

Read the rest of this entry »