If I may, there’s a great resource that gradually introduces GA from the basics to the advanced stuff. It helped me greatly to grasp the full power of this framework.

]]>https://skillsmatter.com/skillscasts/10772-geometric-algebra-in-haskell

He also provided the source code in the presentation notes.

]]>That being said I just downloaded Morwenn’s code and ran this benchmark:

https://github.com/Morwenn/cpp-sort/blob/master/benchmarks/bench.cpp

Which, for me locally (running on the same machine as the other benchmark above), says that ska_sort runs in 37 cycles per item. That would be even faster than my own benchmark, since 20 ns equals 70 cycles on my 3.5 GHz machine.

So after looking into it, I still don’t have a good answer for you. It could be 37 cycles, it could be 70 cycles, or it could be 300 cycles. But what I can tell you is that when running Morwenn’s benchmark on random data, ska_sort easily beats all the other sorting algorithms. So it’s not as if my benchmarks were hand-tuned for my algorithm.

]]>https://github.com/skarupke/settled_vector

Thanks for letting me know that the links were dead.

* The L2 size starts to look too much like an L1 on its own (say, over 1/16th of the L1 for large tables);

* The L2 gets hit too often in operations vs. L1 hits (run-time statistics).

With run-time statistics, an optimized load factor for the L2 can probably be computed during resizing operations.

Note that resizing only the L2 involves moving less data, which is good. It only redistributes the collisions from the L1, which stay in place until the L1 itself grows.

I hope this all makes sense 🙂

]]>The generic idea:

* Have a primary hash table without any load factor at all.

* Each input hashes to some specific bucket/group where it is expected to be present.

* If it cannot be found in that specific bucket/group (or any **fixed number** of them, as you tune it), switch the operation to a more “conventional” second-level structure.

What this gets us:

* The L1 hash table can be completely filled (no more load-factor-directed sizing).

* Lookup in the L1 hash becomes a fixed number of steps, which is great for code optimization.

* When the L1 grows in powers of two (as many do), having a smaller L2 to deal with the collisions in the L1 can save a ton of memory vs. growing the L1.

Notes:

The L2 still operates with a load factor and could even resize independently of the L1.

]]>Please let me know what you think. I would like to implement rotations next.
