The Wonder of Bloom

A Bloom filter gives a probabilistic answer on whether an item is in a set, and was created by Burton Howard Bloom (Bloom, 1970). A query reports that the value is either “possibly in the set” or “definitely not in the set”. Each added element is hashed with two or more hashing methods, and the generated values are used to set bits in a bit array. To test an element, we hash it in the same way and check whether all of its bits are set. In this example we use a 32-bit bit vector, with Murmur 2 and FNV for the hashes. Typically we use non-cryptographic hashes, in order to speed up the process.

In this demo, the first value is taken from Murmur 2, and the second one is from FNV. Each value gives a bit position in the 32-bit bit vector. We will add “fred”, “bert” and “greg”, which gives the Bloom filter below. The pair in square brackets after each name is the two bit positions for that name:

          01234567890123456789012345678901
Add fred: 00000000000000100000010000000000 fred [21,14]
Add bert: 00000000100000100000010000000100 bert [29,8]
Add greg: 00000000100100100000011000000100 greg [11,22]
We now have bit positions 8, 11, 14, 21, 22 and 29 set.

We can now test for “amy” and “greg”:

Now we can test for amy:
amy is not there [16,12]
Now we can test for greg:
greg may be in there [11,22]
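
The check itself can be sketched without any hashing, using only the bit positions quoted above. This is a minimal illustration rather than the demo's actual code: the class and method names (BloomPositionsDemo, MayContain) are made up for this example, and the final query is a hypothetical item, not one taken from the demo.

```csharp
using System;
using System.Collections;

public static class BloomPositionsDemo
{
    // True only if every given position is set in the filter.
    public static bool MayContain(BitArray bits, params int[] positions)
    {
        foreach (int p in positions)
        {
            if (!bits[p]) return false;
        }
        return true;
    }

    public static void Main()
    {
        var bits = new BitArray(32);

        // Bit positions quoted above for fred, bert and greg.
        int[][] added = { new[] { 21, 14 }, new[] { 29, 8 }, new[] { 11, 22 } };
        foreach (int[] pair in added)
        {
            foreach (int p in pair) bits[p] = true;
        }

        Console.WriteLine(MayContain(bits, 16, 12)); // amy:  False
        Console.WriteLine(MayContain(bits, 11, 22)); // greg: True

        // Hypothetical item that was never added but hashes to [8, 22].
        // Bit 8 was set by bert and bit 22 by greg, so it is a false positive.
        Console.WriteLine(MayContain(bits, 8, 22));  // True
    }
}
```

The last line shows why a positive result only means “possibly in the set”: both bits are set, but by two different names.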

Some sample C# code, which uses the HashLib library for the two hashes, is:

using System;
using System.Collections;
using System.Text;
using HashLib;

class Bloom
{
    // 32-bit filter; every bit starts as false
    public BitArray bits = new BitArray(32);

    // First bit position (0 to 31), taken from the Murmur 2 hash
    public Int16 hash1(String word)
    {
        var murmur = HashFactory.Hash64.CreateMurmur2();
        var val = (uint)murmur.ComputeString(word, Encoding.ASCII).GetULong();
        uint res = val % 32;
        return Convert.ToInt16(res);
    }

    // Second bit position (0 to 31), taken from the FNV hash
    public Int16 hash2(String word)
    {
        var fnv = HashFactory.Hash64.CreateFNV();
        var val = (uint)fnv.ComputeString(word, Encoding.ASCII).GetULong();
        uint res = val % 32;
        return Convert.ToInt16(res);
    }

    // Set both bit positions for the word
    public void add(String InString)
    {
        Int16 Point1 = this.hash1(InString);
        Int16 Point2 = this.hash2(InString);
        this.bits[Point1] = true;
        this.bits[Point2] = true;
    }

    // True only if both bit positions are set (the word may have been added)
    public bool contains(String InString)
    {
        Int16 Point1 = this.hash1(InString);
        Int16 Point2 = this.hash2(InString);
        return this.bits[Point1] && this.bits[Point2];
    }

    public string checkFor(String inkey)
    {
        if (this.contains(inkey)) return inkey + " may be in there";
        return inkey + " is not there";
    }
}
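
For readers without HashLib to hand, the same two-hash, 32-bit layout can be sketched with hand-written hashes. This is an assumption-laden variant, not the code used for the demo: SimpleBloom is a made-up name, 32-bit FNV-1a stands in for the FNV hash, and a djb2-style hash stands in for Murmur 2, so the bit positions will not match the ones shown earlier.

```csharp
using System;
using System.Collections;
using System.Text;

// Sketch only: stand-in hashes, so positions differ from the HashLib demo.
public class SimpleBloom
{
    private readonly BitArray bits = new BitArray(32);

    // 32-bit FNV-1a over the UTF-8 bytes of the word.
    private static uint Fnv1a(string word)
    {
        uint hash = 2166136261u;          // FNV offset basis
        foreach (byte b in Encoding.UTF8.GetBytes(word))
        {
            hash ^= b;
            hash *= 16777619u;            // FNV prime; wraps on overflow
        }
        return hash;
    }

    // djb2-style hash (hash * 33 + byte), used here in place of Murmur 2.
    private static uint Djb2(string word)
    {
        uint hash = 5381u;
        foreach (byte b in Encoding.UTF8.GetBytes(word))
        {
            hash = hash * 33u + b;
        }
        return hash;
    }

    // Set the two bit positions for the word.
    public void Add(string word)
    {
        bits[(int)(Fnv1a(word) % 32)] = true;
        bits[(int)(Djb2(word) % 32)] = true;
    }

    // True if both positions are set: "possibly in the set".
    // False if either is clear: "definitely not in the set".
    public bool Contains(string word)
    {
        return bits[(int)(Fnv1a(word) % 32)] && bits[(int)(Djb2(word) % 32)];
    }
}

public static class SimpleBloomDemo
{
    public static void Main()
    {
        var bloom = new SimpleBloom();
        foreach (string name in new[] { "fred", "bert", "greg" }) bloom.Add(name);

        Console.WriteLine(bloom.Contains("greg")); // True: added items are always found
        Console.WriteLine(bloom.Contains("amy"));  // depends on where amy's two hashes land
    }
}
```

An added name can never be reported as missing, since its two bits stay set. A name that was not added is reported as present only if both of its positions happen to have been set by other names.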

Conclusions

Bloom filters are used extensively in cyber security, as they allow us to add elements to a bit array, and then test whether an entity has already been added. This allows us to record many hashed values in a single, compact Bloom filter.