Skein Hash Breaking

So I think it is finally time to start updating this blog. On April 1st xkcd posted a comic titled "Externalities", which partly involved a competition to find a value that matches the most bits in a given Skein 1024 hash. I decided to write a program to automate the guessing process and see how well I could do.

With a good cryptographic hash like Skein there should be no way to predict the input used to generate a given hash. This means that the best strategy is to simply to compute as many random hashes as possible. I originally wrote a simple C# program using the SkeinFish library. It performed reasonably well at about 100,000 hashes per second, but was limited by the fact that it was single threaded and that the SkeinFish library was not the fastest. Running this for a few hours did produce a reasonable result of 411 bits different.

After a while I decided to rewrite the program properly, using an optimized Skein implementation and multithreading. The best option looked to be the Skein3Fish library in C. This was new territory for me, as all of my C experience had been on Arduino and other microcontrollers. Best way to learn is by doing!

The basic structure of the program is this: Each thread has a counter to generate input data to Skein (given how the hash works, there is no need to generate random input or anything... as far as I understand at least). A starting counter value is input when the program is started and each thread starts at this value plus a per-thread offset. The current counter value is mapped to ASCII characters and used to calculate a hash. Each thread keeps a “best” value. When a thread finds a better result, it takes a lock on the main thread and checks if its value beats the global best value. It updates the global value if it does, or updates its own best value if not. This prevents the threads from having to lock the main thread often.

I made some effort to optimize the code, although I'm sure there is still room for improvement. The worker threads only report status every ~1M hashes, the comparison of matching bits is done with the method described at http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable, and the code for converting the counter value to ASCII character is made as simple as possible. On a dual core i5 processor it reached almost 5 million hashes per second. A pretty nice improvement over the first version! Both myself and a friend ran the program overnight, completing about 300 billion hashes total and getting a best of 402 bits difference. Well behind the leaders but still decent in my opinion.

The source code is available here. The directory structure is somewhat messy due to starting from one of the Skein3Fish example programs, but it should be fairly easy to build using cmake. Given that it was my first serious C program and was not thoroughly tested there may still be some bugs. Feel free to let me know if you find anything, but with the competition being long over I am not planning to work on it any further myself.