Probably Dance

I can program and like games

Bitcoin Mining for Space Heating

When googling for the above words you find lots of people making jokes about how Bitcoin mining hardware will turn into an expensive space heater quickly after purchase because the mining difficulty increases so fast. But using Bitcoin mining hardware as space heaters is not necessarily a bad idea.

In my apartment we mostly heat with electric heaters. Which is a giant waste of money, but we don’t have a choice. And electric heaters are basically devices that try to waste as much electricity as possible, because the more electricity they waste, the more heat they generate. Sounds kinda like Bitcoin mining, right? When I realized that, I turned off my electric heater and made my computer mine Bitcoin. By my estimates I will get something like 2 or 3 dollars worth of Bitcoin out of that by the end of the month. Which is 2 or 3 dollars more than my electric heater gives me.

If I had realized this at the beginning of the winter and had bought specialized mining hardware, I could have actually gone through the winter making a good amount of money. If I can’t convince my landlord to switch our heating solution by next winter, I’ll certainly be doing that next year.

Introducing the Asserting Mutex

Multithreaded code in C++ tends to become brittle over time. If you write your code well you’ll need almost no synchronization between different threads, but the price of that is that your code will be littered with undocumented conventions about when you can read or modify which state. In your average threaded C++ application there are countless potential race conditions, all of which never happen because people follow conventions about when to do what. Until someone doesn’t know about a convention that he has to follow, or until you change the conventions and forget to update one piece of code that you didn’t know about.

Enter the asserting mutex. The asserting mutex is a conditional breakpoint that triggers only if a potential race condition actually happens. I call it asserting mutex because you use it like a mutex to protect a critical section. It works very simply: If one thread locks the asserting mutex and a second thread attempts to lock the asserting mutex before the first thread unlocks it, you get an assert. And it guarantees that both threads will still be inside the critical section when you get the assert. The cost is one atomic increment and one atomic decrement, which is not free but cheap enough that you can place lots of asserting mutexes in your code before they cause problems. So you could use this to document many of your threading conventions. Used correctly this is a breakpoint that makes it very easy to find data races.
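The mechanism described above can be sketched with a single std::atomic counter (a minimal illustration of the idea, not the post's actual code):

```cpp
#include <atomic>
#include <cassert>

// Sketch of an asserting mutex: lock() records that one more thread is
// inside the critical section. If the counter was already non-zero,
// another thread is still inside, and the assert fires while both
// threads are in the critical section.
struct AssertingMutex
{
    void lock()
    {
        int previous = count.fetch_add(1);
        assert(previous == 0);
        (void)previous;
    }
    void unlock()
    {
        int previous = count.fetch_sub(1);
        assert(previous == 1);
        (void)previous;
    }
    std::atomic<int> count{0};
};
```

Unlike a real mutex, this never blocks: a second thread entering the section is a bug, not a wait condition.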

Here is the complete code:

Read the rest of this entry »

Alias templates with partial specialization, SFINAE and everything

Alias templates are a new way to do typedefs in C++11. You have probably seen them by now, but as a reminder here is what the standard considers to be an alias-declaration:

template<typename T>
using MyVector = std::vector<T, MyAllocator<T>>; // alias-declaration
MyVector<int> myvector; // instantiates std::vector<int, MyAllocator<int>>

So that’s cool. Unfortunately the standard also says that “Because an alias-declaration cannot declare a template-id, it is not possible to partially or explicitly specialize an alias template.” (Paragraph 14.5/3) But that would be a terribly useful thing to have.

Read the rest of this entry »

Type-safe Pimpl implementation without overhead

I like the pimpl idiom because I like to keep my headers as clean as possible, and other people’s headers are dirty. Unfortunately the pimpl idiom never feels like a good solution because it has runtime overhead that wouldn’t be needed if I didn’t care about clean headers so much.

If you’re not familiar with the Pimpl idiom: it stands for “pointer to implementation”, and you use it in C/C++ headers to use a class without having to include its header in your own header. You can also use it to hide your implementation from your users, so that you can change the internals of your class and nobody has to know. It’s used all over the place, but it has one disadvantage: you always need an extra heap allocation, and every method performs an extra pointer dereference.

This code fixes that, so that there can be zero runtime overhead. Here’s how to use it:

class btRigidBody;
class MyRigidBody
{
    // ...
    ForwardDeclaredStorage<btRigidBody, 768> bulletBody;
};
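One way such a ForwardDeclaredStorage could work is placement new into an inline buffer (a sketch under assumptions: the member names and the details here are my illustration, not the post's actual code, which likely also checks alignment):

```cpp
#include <cstddef>
#include <new>
#include <utility>

// Sketch: stores a T of forward-declared type in an inline char buffer,
// so the header needs neither the full type nor a heap allocation.
// The size must be provided manually and is verified where T is complete.
template<typename T, std::size_t Size>
struct ForwardDeclaredStorage
{
    template<typename... Args>
    void construct(Args &&... args)
    {
        // T must be complete here, i.e. in the .cpp file:
        static_assert(sizeof(T) <= Size, "increase the Size parameter");
        new (buffer) T(std::forward<Args>(args)...);
    }
    void destroy() { get().~T(); }
    T & get() { return *reinterpret_cast<T *>(buffer); }

    alignas(std::max_align_t) char buffer[Size];
};
```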

And the code is below:
Read the rest of this entry »

The Best Game Stories Are Told in the Past Tense

I just finished playing Analogue: A Hate Story, and it is a great game. While playing it I noticed a few patterns of game stories that I can’t find collected anywhere, so I’ll do that here.

The main one is that the best game stories are told in the past tense. Meaning most or all of the story has taken place before the player starts playing. Bioshock does this, Portal does it, Gone Home, To the Moon and Analogue also do it. It’s easy to come up with counter examples that also have a good story (just look at Christine Love’s previous game: don’t take it personally babe, it just ain’t your story) but it’s interesting that this pattern should prove so successful in an interactive medium.

Read the rest of this entry »

Source to Source Optimization

In April, Olaf Krzikalla gave a talk titled “Performing Source-to-Source Transformations with Clang” (videos available here).

In that talk he showed an auto vectorizer that doesn’t perform the optimization in the assembly, but which instead spits out new source code that performs the optimized operations. Here is a picture from his talk:

[Image: a source-to-source transformation, with the source code for a simple loop operating on an array of floats on the left hand side, and the same loop using SIMD instructions on the right hand side.]

And I want to explain why all compiler optimizations should do this.
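To make the picture concrete, here is a reconstruction of the kind of before/after pair such a tool produces (my sketch, not the actual slide):

```cpp
#include <xmmintrin.h> // SSE intrinsics

// Original scalar loop, as on the left-hand side of the slide:
void scale(float * data, float factor, int count)
{
    for (int i = 0; i < count; ++i)
        data[i] *= factor;
}

// What the emitted source might look like. Simplifying assumption:
// count is a multiple of 4; a real tool would also emit the remainder loop.
void scale_simd(float * data, float factor, int count)
{
    __m128 f = _mm_set1_ps(factor);
    for (int i = 0; i < count; i += 4)
        _mm_storeu_ps(data + i, _mm_mul_ps(_mm_loadu_ps(data + i), f));
}
```

The point of doing this at the source level is that the result is something you can read, check in, and debug.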

Read the rest of this entry »

4GB per Vector

I recently had a problem where I had a vector that was growing once, being iterated over once, and then deallocated. And it was bothering me how much time I spent on reallocating. The problem was that I also could not predict well how big the vector would be ahead of time. It could vary by orders of magnitude for input that looked similar at first.

As I was researching the minor page faults that I was hitting during the reallocations I came across this stack overflow question. There’s a person there who speaks of resizing an array by requesting a lot of memory from the OS, but then only actually committing that memory as the array needs to grow.

Which is smart, and I wonder why nobody has taken it further. I think all vectors should operate like that all the time. Thus I wrote my own vector class which always requests 4GB of memory from the OS, and which then only actually commits that memory one page at a time. After all we have this enormous address space in 64 bit that we won’t be using for a long time.
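On Linux the reserve-then-commit part of the idea can be sketched like this (my illustration, not the post's vector class: reserve address space with PROT_NONE, then make pages usable one at a time with mprotect):

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>

// Sketch: reserve a large range of address space up front (no usable
// memory yet), then commit it one page at a time as the data grows.
// Because the reservation never moves, pointers into it stay valid.
struct GrowingBuffer
{
    explicit GrowingBuffer(std::size_t max_bytes)
        : reserved(max_bytes), committed(0)
    {
        begin = static_cast<char *>(mmap(nullptr, reserved, PROT_NONE,
                                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
        assert(begin != reinterpret_cast<char *>(MAP_FAILED));
    }
    ~GrowingBuffer() { munmap(begin, reserved); }

    void grow_one_page()
    {
        std::size_t page = sysconf(_SC_PAGESIZE);
        int result = mprotect(begin + committed, page, PROT_READ | PROT_WRITE);
        assert(result == 0);
        (void)result;
        committed += page;
    }

    char * begin;
    std::size_t reserved;
    std::size_t committed;
};
```

A vector built on top of this never reallocates, so growing is just bumping the committed size, and iterators are never invalidated by growth.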

Read the rest of this entry »

Handmade Coroutines for Windows

In a previous post I implemented coroutines using the ucontext.h header. In this post I will show a replacement that should be much faster. I will also provide a Windows implementation.

I like to start off with some code, so here is the complete code for switching the stack in Linux:

switch_to_context:
    pushq %rbp
    movq %rsp, %rbp
    // store rbx and r12 to r15 on the stack. these will be restored
    // after we switch back
    pushq %rbx
    pushq %r12
    pushq %r13
    pushq %r14
    pushq %r15
    movq %rsp, (%rdi) // store stack pointer
switch_point:
    // set up the other guy's stack pointer
    movq %rsi, %rsp
    // and we are now in the other context
    // restore registers
    popq %r15
    popq %r14
    popq %r13
    popq %r12
    popq %rbx
    popq %rbp
    retq

Read the rest of this entry »

The problems with uniform initialization

C++11 made the {} syntax for initializing aggregates more widely usable. You can now also use it to call constructors and to initialize with a std::initializer_list. It also allows you to drop the type name in some circumstances. The general recommendation seems to be that you use it as much as possible. But when I started doing that, I found that it sometimes doesn’t do what I want, and that it may make maintenance more difficult.

Here’s what it looks like:

#include <iostream>

struct Widget
{
    Widget(int m) : m{m} {}
    operator int() const { return m; }
    int m;
};

Widget decrement_widget(Widget w)
{
    return { w - 1 };
}

int main()
{
    int a{1};
    std::cout << a << std::endl; // "1"
    Widget w{5};
    std::cout << w << std::endl; // "5"
    std::cout << decrement_widget({5}) << std::endl; // "4"
}

It can be used to initialize everything (hence the name uniform initialization) and, as you can see, it makes some code more convenient because I don’t even need to write the type name when the compiler already knows it.

It makes your code look a bit weird at first because you have squiggly braces everywhere, but after a while I found that I prefer it because it sets initialization apart from function calls.

But I have stopped using it and I recommend that you don’t use it either.
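One well-known instance of the kind of problem involved (my illustration here; the post's actual arguments are in the full entry) is that braces prefer a std::initializer_list constructor over everything else:

```cpp
#include <vector>

// Round parentheses call the (count, value) constructor; braces pick
// the std::initializer_list constructor instead:
std::vector<int> a(5, 1); // five elements, each equal to 1
std::vector<int> b{5, 1}; // two elements: 5 and 1
```

So changing `(` to `{` can silently change what a line of code means, which is exactly the kind of maintenance hazard the post is about.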

Read the rest of this entry »

A faster implementation of std::function

As I wrote in my last post, I consider std::function to be a very important class that will change how you design your code, because it means that you have to use inheritance less often. In that post I was very impressed with the performance of std::function when compiled with optimizations. Unfortunately std::function can be far slower than a virtual function call in debug builds.
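As a quick illustration of the design point (my example, not from the post): a callback can be a std::function member instead of an abstract listener interface with a virtual method.

```cpp
#include <functional>

// No Listener base class, no virtual on_click, no subclassing at the
// call site: the callback is just a std::function member.
struct Button
{
    std::function<void()> on_click;
    void click() { if (on_click) on_click(); }
};
```

Any lambda, free function, or bound member function can be assigned to on_click, which is exactly what would otherwise force an inheritance hierarchy.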

I wanted a std::function implementation that doesn’t have too big of a performance impact on your application when debugging it, so I wrote my own. It is also faster than all other implementations that I could find in release mode.

The code is in the public domain (I want all library writers to start using it) and here is a download link.

Read the rest of this entry »