SipHash: A Classic Tale of a Fix For A Problem

… and still one of the best hashing methods around

Photo by Eugen Str on Unsplash

SipHash: A Classic Tale of a Fix For A Problem

… and still one of the best hashing methods around

Around 2011/2012, JP Aumasson and Daniel J Bernstein (djb) outlined some major risks around many existing hashing methods [here]:

For this, YP outlined that a DoS could be created on a hashtable by sending the same hash (a multicollision) to a server. This results in a worst-case insert time, and can result in a DoS against the hashing method. The example given is to send 2MB of POST data consisting of 200,000 of the same 10 byte string and which results in 40,000,000,000 string comparisons — having an impact of a computation time of around 10 seconds on a 2 GHz machine. They found MurmurHash2/3 and Google’s CityHash as being vulnerable to these attacks. Google has since released FarmHash in order to address the weakness.

But for YP and jdb it was not all about outlining a weakness, but where they fixed the problem with SipHash [1]. This addressed some of the issues caused by DoS attacks against hashing methods [here]:

It contained the bold statement of “We propose that hash tables switch to the SipHash as a hash function”. Overall SipHash is a key-based hashing function, and where we can use a secret key to produce a hashed valued which cannot be reversed. The hash is thus generated from a key (K_in) to give SipHash(D_in,K_in). Without the knowledge of the key, it will be extremely difficult to determine the input data value.

SipHash is seen as one of the best non-cryptography hashes around and is generally free from security issues. It uses a tree hash approach, and where we start by breaking up the input data and then hashing these together in a tree structure to eventually provide a root hash value. SipHash uses Intel’s SIMD (Single Instruction, Multiple Data) extensions, and uses the 16 SIMD registers to perform fast multiplications. Overall, as with SipHash, it is a key-hashed method, and where we can apply a secret key to extend the range of hashes that are possible for a given input. SipHash uses a 128-bit key for the hash. Overall for MACs, SipHash is much faster than HMAC.

In the following we will use a key of ‘0123456789ABCDEF’, and which is 16 bytes (128 bits):

https://asecuritysite.com/encryption/smh_siphash?val1=a

I have check this against this [here] and for a key of ‘0123456789ABCDEF’ we should get 12398370950267227270 (0xac0fdcb09c334c86).

The following shows a C++ version of the code:

    #include "highhash.h"
    using namespace std;
using std::cout;
using std::cin;

    typedef uint64_t uint64;
typedef uint32_t uint32;
    uint64_t swapVal(uint64_t x)
{
uint64_t y;
char* px = (char*)&x;
char* py = (char*)&y;
for (int i = 0; i < sizeof(uint64_t); i++)
py[i] = px[sizeof(uint64_t) - 1 - i];
return y;
}

    void byte_swap32(unsigned int* pVal32) {
    unsigned char* pByte;
    pByte = (unsigned char*)pVal32;
// move lsb to msb
pByte[0] = pByte[0] ^ pByte[3];
pByte[3] = pByte[0] ^ pByte[3];
pByte[0] = pByte[0] ^ pByte[3];
// move lsb to msb
pByte[1] = pByte[1] ^ pByte[2];
pByte[2] = pByte[1] ^ pByte[2];
pByte[1] = pByte[1] ^ pByte[2];
    }

    #define BUFFSIZE 128
int main(int argc, char* argv[])
{
    string str = "abc";
    char buff[128];
    uint32 u32;
uint64 u64;
    uint128_c_t u128;
int seed = 0;
strcpy_s(buff, str.c_str());
    if (argc >= 2) {
if (strlen(argv[1]) > BUFFSIZE - 1) return(0);
strcpy_s(buff, argv[1]);
}
if (argc >= 3) seed = atoi(argv[2]);
    printf("String: %s\n", buff);
printf("Seed: %d\n\n", seed);
    puts("\n=== SipHash ===");
    unsigned char sipkey[16] = {'0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F' };
// We are using key = '0123456789ABCDEF'
    uint64_t u64sip = siphash(sipkey,(unsigned char *)buff, strlen(buff));
    printf("\nSipHash 64\t\t%llu Hex: %llx\n", (u64sip), (u64sip));
    u64sip = siphash13(sipkey, (uint8_t*)buff, strlen(buff));
    printf("SipHash13\t\t%llu Hex: %llx\n", (u64sip), (u64sip));
    uint32_t u32sip = halfsiphash(sipkey, (uint8_t*)buff, strlen(buff));
    printf("HalfSipHash\t\t%lu Hex: %lx\n", (u32sip), (u32sip));
    }

A sample run with a key of ‘0123456789ABCDEF’ is:

String: test
Seed: 0

=== SipHash ===
SipHash 64		5661403847363678525 Hex: 4e9158cdc424513d
SipHash13 10175878653605156576 Hex: 8d37fbb8a99d96e0
HalfSipHash 693022438 Hex: 294eaee6

The Repl.it code is here:

Conclusions

A great security research finds and understands a vulnerabilty, and is then able to fix it. YP and djb are two of the best cybersecurity researchers around.