ChargeHash
[Hashing Home][Home]
ChargeHash [1] is a non-cryptographic hashing function and created by William Stafford Parsons. It is an extremely small 32-bit hashing method with dual-threaded single-byte processing. A key advantage of the hash is that is doesn't use modulus, multiplication or division arithmetic operations, and where we process one by at a time. It can also run as two main threads, and which makes it run faster on systems with two CPUs. It remembers its state with a 64-bit state value, and can run on 32-bit or 64-bit systems (x86 and x64). Overall, it can be used as an alternative to CRC32 [here] and FNV1a [here].
|
Theory
Overall, there are many types of hashing methods, including ones which are data-oriented and ones that a security-oriented. A cryptographically secure hashing method should show that it is not possible to mathematically reverse the hash back to the original data, and where we cannot predict any of the bits that will change on the output. But, hashing methods are also used for hash tables, where we just need to make sure that we avoid hash collision. With this, we want a fast way of hashing data, and then using the hashed value to provide our look. Some of the most popular methods of this type of hash are xxHash64, City64 and Mumur. ChargeHash [1] has been created by William Stafford Parsons and is a light-weight 32-bit hashing method with dual-threaded single-byte processing:
uint32_t chargehash(const unsigned long input_count, const unsigned char *input) { uint32_t state = 0; uint32_t auxiliary = 0; unsigned long i = 0; while (i != input_count) { state += (state << 11) | (input[i] + 1111); state ^= state >> 5; auxiliary += ((auxiliary << 8) | input[i]) + 1; i++; } state += (((state << 9) | (state >> 23)) + state) << (17 - (state & 3)); auxiliary += (auxiliary << 11) | 1111; state += auxiliary >> 6; state ^= state >> 5; state ^= (((state << 21) | (state >> 11)) - state) >> (13 - (state & 7)); state += state << 15; state += (state >> 7) + (auxiliary >> 2) + auxiliary; return state; }
A sample run is here:
=== ChargeHash========= Message: 1000 Length: 4 Output: 83ef7029
As we can see the output has four bytes, and is thus 32 bits long. A key advantage of the hash is that is doesn't use modulus, multiplication or division arithmetic operations, and where we process one by at a time. It can also run as two main threads, and which makes it run faster on systems with two CPUs. It remembers its state with a 64 bit state value, and can run on 32-bit or 64-bit systems (x86 and x64). One tool that we use to assess hashes is SMhasher, and ChargeHash passes all of the main tests. verall it has been shown that it is faster than FNV, and could be useful for hashtable and Bloom filter integration.
A standard set of test vectors is:
Input ChargeHash "1000" 0x83EF7029 "1001" 0x939C5403 "1002" 0x85593E42 "1003" 0x30F55111 "1004" 0x2F68A2E3 "1005" 0x9E9AE708 "1006" 0xB48E54A8 "1007" 0x62BA91CE "1008" 0xA195B98B "1009" 0x2B1CEF82 "1010" 0xEE7F1907 "1011" 0x4F9BC15F "1012" 0xFCE92645 "1013" 0x0528EBEE "1014" 0xD2F6E341 "1015" 0xBC98E613 "1016" 0x2A573F7C "1017" 0x1E750136 "1018" 0x2B2B4080 "1019" 0x26DB9F49
Here is the C code used in the page:
#include "chargehash.h" #include "string.h" #include <stdio.h> uint32_t chargehash(const unsigned long input_count, const unsigned char *input) { uint32_t state = 0; uint32_t auxiliary = 0; unsigned long i = 0; while (i != input_count) { state += (state << 11) | (input[i] + 1111); state ^= state >> 5; auxiliary += ((auxiliary << 8) | input[i]) + 1; i++; } state += (((state << 9) | (state >> 23)) + state) << (17 - (state & 3)); auxiliary += (auxiliary << 11) | 1111; state += auxiliary >> 6; state ^= state >> 5; state ^= (((state << 21) | (state >> 11)) - state) >> (13 - (state & 7)); state += state << 15; state += (state >> 7) + (auxiliary >> 2) + auxiliary; return state; } int main(int argc, char *argv[]) { uint32_t entropy = chargehash(strlen(argv[1]), (unsigned char *)argv[1]); printf("=== ChargeHash=========\n"); printf("Message: %s\n",argv[1]); printf("Length: %d\n",strlen(argv[1])); printf("Output: %x\n",entropy); return 0; }
Refereces
[1] https://medium.com/@williamstaffordparsons/chargehash-is-a-new-tiny-32-bit-hashing-algorithm-with-dual-threaded-single-byte-processing-b4db82428cf6