For Crypto Speed, Taking The Highway is Great

Google — as a company — have vast data infrastructures. They thus need ways to quick identify data objects and match them. So, they have…

Photo by Eric Weber on Unsplash

For Crypto Speed, Taking The Highway is Great

Google — as a company — have vast data infrastructures. They thus need ways to quickly identify data objects and match them. So, they have been innovating in ways of producing ultra-fast hashing methods — both for short string and long string inputs. For their applications, the slowness of the cryptography hashes is just not good enough to provide the required scale. As a benchmark, a strong cryptography hash has a throughput of two or three cycles per byte (c/b), which means we can only process one byte in 2 or 3 cycles of the processor. The non-cryptography methods, though, can run a 0.2 c/b, and thus can run at many levels of magnitude faster than the cryptography methods.

And so, in 2017, Google released HighwayHash. It was created by Jyrki Alakuijala, Bill Cox, Jan Wassenberg [here] and based on SipHash. Overall, though, it has been assessed as running 5.2 times faster than SipHash for 1 KiB inputs. For many, SipHash is seen as one of the best non-cryptography hashes around and is generally free from security issues. HighwayHash is then based on the SipHash approach but has an enhancement that reduce the security level compared to SipHash, but which considerably improves the processing performance. As with SipHash, HighwayHash is a key-hashed method, and where we can apply a secret key to extend the range of hashes that are possible for a given input. If the key is kept secret it would be almost impossible to match input data to an output hash.

HighwayHash uses a tree-hash approach, and where we start by breaking up the input data and then hashing these together in a tree structure to eventually provide a root hash value. With hardware-acceleration this provides an ultra-fast processing method. As with SipHash, HighwayHash uses Intel’s SIMD (Single Instruction, Multiple Data) extensions, and extends this to use the16 SIMD registers to perform fast multiplications.

From this recent assessment, we can see that HighwayHash64 was at least six times faster than SipHash, but still slower than one of the fastest non-cryptography hashing methods (xxHash):

Hash function	    MiB/sec	cycl./hash	cycl./map Quality problems
------------------------------------------------------------------
xxh3 16538.52 32.81 184.86 DiffDist bit 7 w. 36 bits
xxh128low 15174.85 33.79 187.05
xxh128 15174.14 40.46 195.65
HighwayHash64 6242.58 99.55 248.41
SipHash 980.88 127.77 246.19

So, for security, SipHash looks strong, and for speed, with a reduced security level, HighwayHash looks strong. SipHash and HighwayHash compete well with the non-cryptography methods, but have so much more security (especially with the usage of the key).

Here is an implementation:

https://asecuritysite.com/encryption/smh_highway

and the Repl.it code:

The following shows a C++ version of the code:

#include 
#include
#include
#include

#include "highhash.h"
    using namespace std;
using std::cout;
using std::cin;

    typedef uint64_t uint64;
typedef uint32_t uint32;
    uint64_t swapVal(uint64_t x)
{
uint64_t y;
char* px = (char*)&x;
char* py = (char*)&y;
for (int i = 0; i < sizeof(uint64_t); i++)
py[i] = px[sizeof(uint64_t) - 1 - i];
return y;
}

  
int main(int argc, char* argv[])
{
    string str = "abc";
    char buff[128];
    uint32 u32;
uint64 u64;
    uint128_c_t u128;
int seed = 0;
strcpy_s(buff, str.c_str());
    if (argc >= 2) {
if (strlen(argv[1]) > BUFFSIZE - 1) return(0);
strcpy_s(buff, argv[1]);
}
if (argc >= 3) seed = atoi(argv[2]);
    printf("String: %s\n", buff);
printf("Seed: %d\n\n", seed);

    puts("\n=== HighwayHash ===");
uint64_t tkey[4] = { 0,0,0,0 };
uint64_t u64hi = HighwayHash64((uint8_t *)buff, strlen(buff), tkey);
    printf("\nHighwayHash 64\t\t\t%llu Hex: %llx\n", swapVal(u64hi), swapVal(u64hi));
    uint64_t t_hash[2];
HighwayHash128((uint8_t*)buff, strlen(buff), tkey,t_hash);
    printf("HighwayHash 128\t\t\t%llx %llx\n", swapVal (t_hash[0]), swapVal( t_hash[1]));
    uint64_t t_hash2[4];
HighwayHash256((uint8_t*)buff, strlen(buff), tkey, t_hash2);
    printf("HighwayHash 256\t\t\t%llx %llx %llx %llx\n", swapVal(t_hash2[0]), swapVal(t_hash2[1]), swapVal(t_hash2[2]), swapVal(t_hash2[3]));
    }

A sample run with all zero keys is:

String: test
=== HighwayHash ===
HighwayHash 64	2367525197889104785 Hex: 20db239fb0f76b91
HighwayHash 128 733a7c1ab2bccdc6 9236f7faf3a3f9a
HighwayHash 256 51c020e9c97902c7 56cff5e52e116be9 183e0ead97a97764 29a47a64f2df9db4

Conclusions

Too often in security, we max out the security levels and which gives us crazy security. Unfortunately, performance is often compromised. So, in designing systems that are secure-by-design, and which can perform to given limits, sometimes you have to compromise, and HighwayHash provides one such example. Google also have a fairly homogeneous infrastructure for the Cloud infrastructure, and where they can deploy code that can release extra functionality within the Intel x64 architecture.