4GB per Vector
by Malte Skarupke
I recently had a problem where I had a vector that was growing once, being iterated over once, and then deallocated. And it was bothering me how much time I spent on reallocating. The problem was that I also could not predict well how big the vector would be ahead of time. It could vary by orders of magnitude for input that looked similar at first.
As I was researching the minor page faults that I was hitting during the reallocations I came across this stack overflow question. There’s a person there who speaks of resizing an array by requesting a lot of memory from the OS, but then only actually committing the memory as he wants the array to grow.
Which is smart, and I wonder why nobody has taken it further. I think all vectors should operate like that all the time. Thus I wrote my own vector class which always requests 4GB of memory from the OS, and which then only actually commits that memory one page at a time. After all we have this enormous address space in 64 bit that we won’t be using for a long time.
I’ve called it a “settled vector” because it’s memory never moves around. (and because I’m replaying The Settlers 2 at the moment…) The source code is here. It is for Linux only, (you could port it to Windows using VirtualAlloc) 64 bit only (obviously) and it is slightly untested, but it works well enough that I am using it and that I can measure performance.
It’s fun: The program that I’m using this for right now has 844GB of memory reserved, while it is only actually using 72MB. If you’re wondering why my programs don’t blow up when I reserve several hundred gigabytes of memory on a machine that only has 8GB: I am just reserving the memory, not using it. And as long as I don’t use it, the operating system doesn’t actually map that memory to physical memory. The only thing I’m doing is telling the OS that I would like to have a 4GB range of continuous memory that I may or may not end up actually using.
For performance I measured my use case from above: A vector that does not reserve memory, grows once and is then discarded.
- When the vectors are less than 64kb in size, std::vector is faster. I assume because the settled vector talks to the OS more.
- Otherwise the settled vector can be up to 35% faster for vectors that are many megabytes in size.
- Iterators, pointers and references are never invalidated.
- Easier to determine out of range errors, use after free etc: Just lock the pages.
- It’s trivial to lock and unlock the memory at runtime for your own safety. (for example if you want to make sure that certain state is not modified while another thread reads from it)
- Easy to see where a pointer comes from: Every vector has it’s own 4GB range that will never change.
- Need to allocate at least one page (4kb)
- Bad performance for a small vector
- May blow up the operating system depending on how well it is expecting this (hasn’t happened to me yet)
That being said I haven’t quite understood the implications of this myself just yet. I think that other things will come out of this. For example I still have the whole capacity() reserve() etc concept. I probably should just get rid of that and have a zero overhead push_back() instead.
So I will continue to use std::vector for small allocations, but for anything that’s bigger than 128kb, I will from now on reserve 4GB of memory.