I haven't found a faster solver in my benchmarks yet, so I would be happy to see other implementations! The big caveat is that it still relies purely on the Dancing Links technique and not on the newer Dancing Cells from Christine Solnon and TAOCP Fascicle 7. My implementation uses a different memory layout than the standard one (a separate array per field instead of one array of node structs that each contain all the fields), but Dancing Cells could still be an interesting addition to the provided algorithms. Nothing prevents adding it except the implementation effort involved.
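To make the "arrays of fields" layout a bit more concrete, here is a minimal Python sketch of Dancing Links where every node is just an index into parallel arrays (`left`, `right`, `up`, `down`, `col`, `row`). It is not taken from the project: the function name `solve_exact_cover` and the input format (items numbered from 0, each option a list of distinct item numbers) are assumptions made for illustration only.

```python
def solve_exact_cover(num_items, options):
    """Yield every exact cover as a sorted list of option indices (illustrative sketch)."""
    # One array per field; node 0 is the root, nodes 1..num_items are item headers.
    n = num_items + 1
    left = [(i - 1) % n for i in range(n)]
    right = [(i + 1) % n for i in range(n)]
    up = list(range(n))
    down = list(range(n))
    col = list(range(n))   # header node of the item each node belongs to
    row = [-1] * n         # option index of each node (-1 for headers)
    size = [0] * n         # number of live nodes per item

    for r, option in enumerate(options):
        first = None
        for item in option:  # assumes no item is repeated within an option
            c = item + 1
            node = len(up)
            # Link the new node at the bottom of its item's vertical list.
            up.append(up[c]); down.append(c)
            down[up[c]] = node; up[c] = node
            col.append(c); row.append(r); size[c] += 1
            # Link the new node into the option's circular horizontal list.
            if first is None:
                first = node
                left.append(node); right.append(node)
            else:
                left.append(left[first]); right.append(first)
                right[left[first]] = node; left[first] = node

    def cover(c):
        right[left[c]] = right[c]; left[right[c]] = left[c]
        i = down[c]
        while i != c:
            j = right[i]
            while j != i:
                down[up[j]] = down[j]; up[down[j]] = up[j]
                size[col[j]] -= 1
                j = right[j]
            i = down[i]

    def uncover(c):
        # Exact reverse of cover(): this is the "dance" that restores the links.
        i = up[c]
        while i != c:
            j = left[i]
            while j != i:
                size[col[j]] += 1
                down[up[j]] = j; up[down[j]] = j
                j = left[j]
            i = up[i]
        right[left[c]] = c; left[right[c]] = c

    chosen = []

    def search():
        if right[0] == 0:  # no uncovered items left
            yield sorted(chosen)
            return
        # Pick the uncovered item with the fewest remaining options.
        best = c = right[0]
        while c != 0:
            if size[c] < size[best]:
                best = c
            c = right[c]
        cover(best)
        r = down[best]
        while r != best:
            chosen.append(row[r])
            j = right[r]
            while j != r:
                cover(col[j]); j = right[j]
            yield from search()
            j = left[r]
            while j != r:
                uncover(col[j]); j = left[j]
            chosen.pop()
            r = down[r]
        uncover(best)

    yield from search()


# Small example: items a..g numbered 0..6.
options = [[2, 4], [0, 3, 6], [1, 2, 5], [0, 3, 5], [1, 6], [3, 4, 6]]
solutions = list(solve_exact_cover(7, options))
print(solutions)  # [[0, 3, 4]]
```

In Python the layout mostly affects readability, but in a compiled language keeping each field in its own contiguous array is the kind of choice that can change cache behavior compared to an array of node structs.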
It would be nice to read your thoughts on exact cover and the problems you solve with it! Maybe this project encourages somebody to look at this nice paradigm again :)