Non-crypto hashes in C++: Spooky Hash
[Hashing Home][Home]
SpookyHash was created by Bob Jenkins and was released on Halloween in 2010. It is one of the fastest non-cryptographic hashes around and is generally free of security problems. It can produce 32-bit, 64-bit and 128-bit hash values.
[Visual Studio solution].
|
Outline
The code is here:
Code
The following shows a C++ version of the code:
#include iostream #include fstream #include cstdlib #include cstring #include "SpookyV2.h" using namespace std; typedef uint64_t uint64; typedef uint32_t uint32; uint64_t swapVal(uint64_t x) { uint64_t y; char* px = (char*)&x; char* py = (char*)&y; for (int i = 0; i < sizeof(uint64_t); i++) py[i] = px[sizeof(uint64_t) - 1 - i]; return y; } void byte_swap32(unsigned int* pVal32) { unsigned char* pByte; pByte = (unsigned char*)pVal32; // move lsb to msb pByte[0] = pByte[0] ^ pByte[3]; pByte[3] = pByte[0] ^ pByte[3]; pByte[0] = pByte[0] ^ pByte[3]; // move lsb to msb pByte[1] = pByte[1] ^ pByte[2]; pByte[2] = pByte[1] ^ pByte[2]; pByte[1] = pByte[1] ^ pByte[2]; } #define BUFFSIZE 128 int main(int argc, char* argv[]) { string str = "abc"; char buff[128]; uint32 u32; uint64 u64; uint128_c_t u128; int seed = 0; strcpy_s(buff, str.c_str()); if (argc >= 2) { if (strlen(argv[1]) > BUFFSIZE - 1) return(0); strcpy_s(buff, argv[1]); } if (argc >= 3) seed = atoi(argv[2]); printf("String: %s\n", buff); printf("Seed: %d\n\n", seed); puts("\n===Spooky V2==="); uint32 u32s; uint64 u64s, u64s2; u32s = SpookyHash::Hash32(buff, strlen(buff), (uint32_t)0); printf("SpookyV2 32\t\t\t%lu Hex: %lx\n", u32s, u32s); u32s = SpookyHash::Hash32(buff, strlen(buff), (uint32_t)seed); printf("SpookyV2 32+Seed\t\t%lu Hex: %lx\n", u32s, u32s); u64s = SpookyHash::Hash64(buff, strlen(buff), (uint64_t)0); printf("SpookyV2 64\t\t\t%llu Hex: %llx\n", u64s, u64s); u64s = SpookyHash::Hash64(buff, strlen(buff), (uint64_t)seed); printf("SpookyV2 64+Seed\t\t%llu Hex: %llx\n", u64s, u64s); u64s = 0; //seed u64s2 = 0; SpookyHash::Hash128(buff, strlen(buff), &u64s, &u64s2); printf("SpookyV2 128\t\t\t%llx%llx\n", swapVal(u64s), swapVal(u64s2)); u64s = seed; //seed u64s2 = 0; SpookyHash::Hash128(buff, strlen(buff), &u64s, &u64s2); printf("SpookyV2 128+Seed\t\t%llx%llx\n", swapVal(u64s), swapVal(u64s2)); }
A sample run:
String: hello world Seed: 0 ===Spooky V2=== SpookyV2 32 2617184861 Hex: 9bff125d SpookyV2 32+Seed 2617184861 Hex: 9bff125d SpookyV2 64 14865987102431973981 Hex: ce4e98819bff125d SpookyV2 64+Seed 14865987102431973981 Hex: ce4e98819bff125d SpookyV2 128 5d12ff9b81984ece25103f0dee88e18b SpookyV2 128+Seed 5d12ff9b81984ece25103f0dee88e18b
A fuller test with SMHasher gives [here]:
Hash function MiB/sec cycl./hash cycl./map Quality problems ------------------------------------------------------------------ o1hash 12439661.09 16.77 166.13 insecure, no seed, zeros, fails all tests crc32_hw1 23208.73 46.74 179.70 insecure, 100% bias, collisions, distrib, BIC, machine-specific (x86 SSE4.2) t1ha0_aes_noavx 22785.26 38.71 180.61 LongNeighbors, machine-specific (x86 AES-NI) t1ha0_aes_avx1 22714.85 48.12 226.52 LongNeighbors, machine-specific (x64 AVX.txt) t1ha0_aes_avx2 22345.33 44.38 556.47 LongNeighbors, machine-specific (x64 AVX2) falkhash 20202.42 173.63 321.52 Sparse, LongNeighbors, machine-specific (x64 AES-NI) MeowHash64low 17378.06 85.48 237.60 Sparse, invertible, machine-specific (x64 AES-NI) MeowHash32low 17374.64 85.48 258.53 Sparse, invertible, machine-specific (x64 AES-NI) MeowHash 17371.91 85.48 247.96 Sparse, invertible, machine-specific (x64 AES-NI) FarmHash32 17112.05 47.7 214.71 machine-specific (x64 SSE4/AVX) xxh3 16538.52 32.81 184.86 DiffDist bit 7 w. 36 bits, BIC xxh3low 16462.36 32.77 199.79 farmhash32_c 16299.81 47.79 219.19 machine-specific (x64 SSE4/AVX) xxh128low 15174.85 33.79 187.05 xxh128 15174.14 40.46 195.65 farsh32 14053.09 74.29 245.33 insecure: AppendedZeroes, collisions+bias, MomentChi2, LongNeighbors metrohash64crc_2 14034.84 48.94 162.54 UB, Cyclic 8/8 byte, DiffDist, BIC, machine-specific (SSE4.2/NEON) metrohash64crc_1 14000.5 49.08 150.54 UB, Cyclic 8/8 byte, DiffDist, BIC, MomentChi2, machine-specific (SSE4.2/NEON) metrohash128crc_1 13948.67 65.2 168.08 UB, machine-specific (SSE4.2/NEON) metrohash128crc_2 13920.19 65.12 176.70 UB, machine-specific (SSE4.2/NEON) halftime_hash128 13478.23 97.79 252.14 wyhash32low 12911.09 29.59 205.43 2 bad seeds wyhash 12879 30.35 196.77 2^33 bad seeds CityCrc128 12343.43 74.5 209.75 fletcher2 12011.15 25.29 298.60 bad seed 0, UB, fails all tests fletcher4 11928.55 25.27 293.49 bad seed 0, UB, fails all tests halftime_hash256 11620.28 98.44 252.60 fibonacci 11339.87 26.33 705.64 UB, zeros, fails all tests ahash64 9862.62 27.32 181.68 rust Spooky128 9751.14 63.84 192.47 UB Spooky64 9747.47 62.2 191.71 UB Spooky32 9747.13 62.24 196.96 UB t1ha1_64le 9723.86 34.39 176.91 Avalanche cmetrohash64_1 9683.33 45.2 161.01 UB, LongNeighbors, BIC, MomentChi2 cmetrohash64_2 9669.95 44.75 149.67 LongNeighbors metrohash64_2 9668.51 44.45 164.30 UB, LongNeighbors metrohash64 9664.61 44.59 150.74 UB, LongNeighbors, BIC metrohash64_1 9664.57 45.37 152.31 UB, LongNeighbors, BIC, MomentChi2 cmetrohash64_1o 9658.31 42.84 163.45 UB, LongNeighbors, BIC, MomentChi2 FNV1a_YT 9643.42 32.06 175.19 bad seed, UB, fails all tests City128 9640.19 88.45 225.38 metrohash128_2 9584.94 59.1 167.43 UB, LongNeighbors metrohash128 9569.16 58.68 167.53 UB, LongNeighbors metrohash128_1 9558.17 59.04 175.94 UB, LongNeighbors FarmHash128 9409.63 74.52 210.25 VHASH_32 9404.99 77.01 250.57 sanity, Seed, MomentChi2 VHASH_64 9392.39 74.72 227.92 sanity, Seed, MomentChi2 farmhash128_c 9244.13 79.08 209.44 t1ha2_atonce 9237.12 38.94 194.32 Zeroes low3 City64noSeed 9090.42 32.23 171.53 Avalanche, Sparse, TwoBytes, MomentChi2, Seed City64low 9089.45 47.75 201.73 t1ha2_stream 9068.55 74.56 219.85 Sparse, Permutation, LongNeighbors City64 9066.9 47.81 197.78 Sparse, TwoBytes t1ha2_stream128 9065.5 93.19 236.50 Sparse, Permutation, LongNeighbors xxHash64 8936.63 51.31 174.34 pengyhash 8744.48 85.31 222.45 farmhash64_c 8713.16 47.96 201.00 FarmHash64 8684.76 48.13 200.51 crc64_hw 8440.13 34.94 141.15 insecure, 100% bias, collisions, distrib, BIC, machine-specific (SSE4.2/NEON) t1ha2_atonce128 8350.99 55.65 203.53 LongNeighbors PMPML_64 8161.19 53.2 179.16 unseeded: Seed, MomentChi2, BIC halftime_hash512 7681.62 125.81 274.01 tabulation 7621.75 42.19 179.93 t1ha1_64be 7481.37 38.16 193.22 Avalanche MUMlow 7225.18 37.85 197.92 UB, 5 bad seeds farsh64 7216.29 130.3 302.44 insecure: AppendedZeroes, collisions+bias, MomentChi2, LongNeighbors MUM 7134.56 37.85 172.34 UB, too many bad seeds, machine-specific (32/64 differs) nmhash32 7003.3 68.93 216.59 PMPML_32 6704.53 53.5 197.43 Avalanche >512, unseeded: Seed, BIC, MomentChi2, PerlinNoise nmhash32x 6342.95 56.41 217.75 crc32_hw 6330.42 35.55 170.16 insecure, 100% bias, collisions, distrib, BIC, machine-specific (SSE4.2/NEON) FNV2 6258.84 33.25 142.89 fails all tests FNV1A_Pippip_Yurii 6258.46 28.19 184.41 UB, sanity, fails all tests FNV1A_Totenschiff 6258.23 27.99 198.20 UB, zeros, fails all tests HighwayHash64 6242.58 99.55 248.41 mx3 6146.02 52.48 173.09 UB xxHash32 6040.87 51.77 177.91 LongNeighbors, collisions with 4bit diff, MomentChi2 220 prvhash64s_64 5481.48 170.05 325.39 mirhash 5413.73 39.68 154.47 UB, 2^36 bad seeds, LongNeighbors, machine-specific (32/64 differs) mirhash32low 5412.76 39.79 182.13 UB, 4 bad seeds, Cyclic, LongNeighbors, machine-specific (32/64 differs) Murmur3F 5226.4 52.18 175.85 UB prvhash64s_128 5161.33 260.96 442.70 t1ha0_32le 5132.18 54.83 193.53 Sparse, LongNeighbors halftime_hash64 4990.72 120.55 281.64 Murmur2B 4882.95 39.72 149.43 UB, 1.8% bias, collisions, 3.4% distrib, BIC seahash32low 4801.33 58.54 227.31 PerlinNoise 32, !msvc seahash 4796.97 58.55 201.58 PerlinNoise, !msvc fasthash32 4737.61 45.32 181.86 UB, insecure fasthash64 4737.21 42.79 164.87 UB, insecure, Moment Chi2 5159 ! umash32_hi 4662.92 54.22 214.20 umash64 4662.09 53.42 188.09 umash32 4633.19 53.42 216.33 t1ha0_32be 4585.59 55.98 183.45 Sparse, LongNeighbors clhash 4472.31 82.72 229.73 PerlinNoise, machine-specific (x64 SSE4.2) tabulation32 4317.34 35.45 197.20 collisions Murmur2C 4092.99 51.84 164.65 UB, 2^32 bad seeds, 91% bias, collisions, distr, BIC, LongNeighbors farsh128 3776.92 232.48 398.67 City32 3675.04 57.73 212.04 TSip 3228.14 57.96 211.71 !msvc Murmur3C 3197.63 67.9 198.00 UB, LongNeighbors, Text, DiffDist Crap8 3149.63 36.23 195.11 UB, 2.42% bias, collisions, 2% distrib Murmur2 3146.91 41.87 187.89 UB, 1 bad seed, 1.7% bias, 81x coll, 1.7% distrib, BIC Murmur2A 3146.79 46.87 191.96 UB, 1 bad seed, 12.7% bias, LongNeighbors aesnihash 2963.39 71.24 217.73 fails many tests, machine-specific (x64 AES-NI) BEBB4185 2951.62 222.03 343.63 UB, too many bad seeds, msvc-specific jodyhash6 2848.42 29.99 164.36 bias, collisions, distr, BIC, LongNeighbors wyhash32 2532.89 48.4 484.57 2 bad seeds, 32-bit umash128 2427.46 70.6 197.29 Murmur3A 2413.88 53.36 182.37 UB, 1 bad seed, Moment Chi2 69 prvhash64_64m 2386.19 51.18 186.87 prvhash64_128 2383.57 103.44 246.45 prvhash64_64 2375.72 51.61 190.97 hasshe2 2372.52 68.64 216.74 Permutation,TwoBytes,Zeroes,Seed PMurHash32 2344.78 58.48 196.43 1 bad seed, Moment Chi2 69 mirhashstrict32low 2218.87 65.48 190.59 1 bad seed, MomentChi2 9 mirhashstrict 2217.32 65.53 182.07 sha1ni 2019.96 135.84 564.40 insecure,sanity, Permutation, Zeroes, machine-specific sha1ni_32 2019.94 136.82 589.46 machine-specific superfast 1956.25 53.61 180.10 UB, bad seed 0, 91% bias, 5273.01x collisions, 37% distr, BIC sha2ni-256_64 1910.34 146.06 595.16 Zeroes, machine-specific sha2ni-256 1906.77 145.47 603.08 insecure,sanity, Permutation, Zeroes, machine-specific farsh256 1895.77 459.86 575.95 SipHash13 1889.1 89 199.95 0.9% bias Murmur1 1804.67 51.51 188.41 UB, 1 bad seed, 511x collisions, Diff, BIC lookup3 1658.31 48.84 194.15 UB, 28% bias, collisions, 30% distr, BIC pearsonbhash64 1486.34 104.32 185.03 poly_1_mersenne 1431.65 54.49 189.52 fails most tests jodyhash32 1428.37 44.36 185.85 bias, collisions, distr, BIC LongNeighbors pearsonbhash128 1347.03 121.75 214.84 poly_2_mersenne 1323.69 66.93 190.88 poly_3_mersenne 1323.59 74.86 206.77 poly_4_mersenne 1323.57 82.67 200.36 blake3_c 1285.91 340.01 552.63 no 32bit portability GoodOAAT 1052 71.62 192.19 pearsonbhash256 998.9 167.05 261.29 SipHash 980.88 127.77 246.19 MicroOAAT 977.6 59.61 185.06 100% bias, distrib, BIC FNV1a 791.84 69.69 177.84 bad seed, zeros, fails all tests sdbm 791.84 67.69 177.06 bad seed 0, fails all tests FNV64 791.82 70.24 159.29 fails all tests bernstein 791.82 68.63 180.71 bad seed 0, fails all tests beamsplitter 789.22 682.45 1150.3 UB, too many bad seeds HalfSipHash 755.78 114.47 243.72 zeroes chaskey 753.23 153.42 288.26 PerlinNoise x17 527.9 98.78 184.09 99.98% bias, fails all tests JenkinsOOAT_perl 452.49 118.78 194.78 bad seed 0, test FAIL MurmurOAAT 452.49 113.07 197.83 bad seed 0, 1.5-11.5% bias, 7.2x collisions, BIC, LongNeighbors JenkinsOOAT 452.48 142.85 213.93 bad seed 0, 1.5-11.5% bias, 7.2x collisions, BIC, LongNeighbors crc32 392.1 131.62 204.58 insecure, 8590x collisions, distrib, PerlinNoise sha1-160 364.95 1470.55 1794.1 Comb/Cyclic low32 rmd256 362.49 617.02 815.44 Bad seeds blake2b-256_64 356.97 1222.76 1435.0 blake2b-224 356.59 1228.5 1425.87 blake2b-160 356.08 1236.84 1458.15 blake2b-256 355.97 1232.22 1443.31 Sparse high 32-bit md5-128 353.76 638.29 803.39 md5_32a 353.64 629.85 799.56 Sparse high 32-bit sha1_32a 353.03 1385.8 1759.9 Cyclic low32, 36.6% distrib rmd128 334.36 659.03 838.32 Bad seeds blake2s-128 295.3 698.09 1059.2 pearsonhash64 287.95 174.11 196.50 Avalanche, Seed, SSSE3 only. broken MSVC pearsonhash128 287.95 171.72 194.61 Avalanche, Seed, SSSE3 only. broken MSVC pearsonhash256 264.51 184.87 218.79 Avalanche, Seed, SSSE3 only. broken MSVC blake2s-256 215.28 1014.88 1230.38 blake2s-160 215.01 1026.74 1239.54 blake2s-256_64 211.52 1044.22 1228.43 blake2s-224 207.06 1063.86 1236.50 rmd160 202.16 1045.79 1287.74 Bad seeds, Cyclic hi32 sha2-256_64 148.01 1376.34 1624.71 Bad seeds, Moment Chi2 7 sha2-256 147.8 1374.9 1606.06 Bad seeds, Moment Chi2 4 sha2-224_64 147.6 1360.1 1620.93 Bad seeds, Cyclic low32 sha2-224 147.13 1354.81 1589.92 Bad seeds, Comb low32 asconhashv12 144.98 885.02 1324.23 sha3-256 100.58 3877.18 4159.79 PerlinNoise sha3-256_64 100.57 3909 4174.63 PerlinNoise asconhashv12_64 86.73 684.02 606.93 floppsyhash 35.72 1868.92 1411.07 tifuhash_64 35.6 1679.52 1212.75 Cyclic low32
Problems include:
- Bad Seeds. A random seed is often used to initialise the hash. This problem relates to having a bad seed..
- Undefined behaviour (UB). This includes Misaligned, oob (Out of bounds), Signed integer overflow and Shift exponent overflow.
o1hash is the fastest, and a quick and dirty hashing method. It is generally not secure. t1h (Fast Positive Hash) requires AES-NI (Intel Advanced Encryption Standard Instructions) and is one of the fastest hashes at 22.785 Gbps. It has fairly good security levels. Meow comes in around 17.378 Gbps. We can see that SHA-1-160 runs as 364 Mbps, while Meow runs at 17.378 Gbps. Blake 3 is one of the fast cryptography hash and has a rate of 1.285 Gbps, and Blake 2 gives a rate around 215 Mbps. SHA-3 gives hashing rates of 100.58 Mbps.
Google recommend the following for 64-bit hashes without quality problems:
- xxh3low
- wyhash
- ahash64
- t1ha2_atonce
- FarmHash
- halftime_hash128
- Spooky32
- pengyhash
- nmhash32
- mx3
- MUM/mir
- fasthash32