Homomorphic Hashing .. Is It Possible?

Let’s start with a problem. Bob, Alice, Carol and Dave have hashed their passwords with:

Homomorphic Hashing .. Is It Possible?

Let’s start with a problem. Bob, Alice, Carol and Dave have hashed their passwords with:

H(“Bob123”), H(“Alice111”), H(“Carol999”) and H(“Dave!!!”)

Can we now find a way that we have find the hash of:

H(“Bob123” || “Alice111” || “Carol999” || “Dave!!!”)

and where we do not need to find out what any of the passwords are. And if we could do this, could we also remote a hashed value from it? For this, we could remove Dave’s hash to get:

H(“Bob123” || “Alice111” || “Carol999” )

For this, we need incremental hashing (IH), or more generally, homomorphic hashing (HH).

Incremental hashing

Normally with hashing methods, we need to take all the previous data and new data in order to create a new hash value for our data. In a homomorphic hashing method, we can basically take the previous hash, and then add on any new data values. We can also remove previous data values from our hash, too. This is more efficient in computing, as we do not have to process all of our data into a new hash, and can use simple homomorphic operations for the computation. The method we will use (LtHash) was initially defined by Bellare and Micciancio [1] and defined as an incremental hash:

It has since been updated by Lewi et al at Facebook [2]:

Lewi et al focuses on a method by which an owner of a database — a distributor — is responsible for propagating updates to subscribers. With this, the distributor is then responsible for updating the subscribers with a signed hash. Overall, it may be possible to recompute the over hash to the data, but it would be more efficient to update the hash based on new data added. The distributer can then just take the previous hash, and then added the new data, and produce the new hash.

Homomorphic hashing

With homomorphic hashing, Bob (the distributor) needs to sign for updates to Grace (a subscriber). In a normally hashing method, Bob would take all the previous data and the new data, and compute a new hash (Figure 1). If we have large amounts of data, this may be a computationally intensive task. To overcome this we can use a homomorphic hashing method, where we take a previous hash and then compute the new hash using this and any new data:

In Golang, we initialize a 2,048-byte hash. In this case, we will use Blake2b, and where XOF is an extendable output function (and which is similar to a normal has, but can be of any size of output):

func New16() Hash {
 xof, _ := blake2b.NewXOF(2048, nil)
 return &hash16{xof: xof}
}

This initialises our hash value to a Nil hash value (“0000000000000000 … 00000000000000000”). To add a hash, we take the current hash (x) and add a new one (y). We then take two bytes (16 bits) at a time from each, and then add them together, and then put it back into the current value of x:

func add16(x, y *[2048]byte) {
 for i := 0; i < 2048; i += 2 {
   xi, yi := x[i:i+2], y[i:i+2]
   sum := binary.LittleEndian.Uint16(xi) +
     binary.LittleEndian.Uint16(yi)

   binary.LittleEndian.PutUint16(xi, sum)                                 }
}

To perform a removal of a value (y) from x (the current hash), we just subtract the hash two bytes(16 bits) at a time:

func sub16(x, y *[2048]byte) {
 for i := 0; i < 2048; i += 2 {
   xi, yi := x[i:i+2], y[i:i+2]
   sum := binary.LittleEndian.Uint16(xi) -
     binary.LittleEndian.Uint16(yi)

   binary.LittleEndian.PutUint16(xi, sum)                                 }
}

The code is [here]:

package main

import (
	"fmt"
	"os"

	"lukechampine.com/lthash"
)

func main() {

	m1 := "Edinburgh"
	m2 := "Glasgow"
	m3 := "Dundee"
	m4 := "Aberdeen"
	argCount := len(os.Args[1:])

	if argCount > 0 {
		m1 = os.Args[1]
	}
	if argCount > 1 {
		m2 = os.Args[2]
	}
	if argCount > 2 {
		m3 = os.Args[3]
	}
	if argCount > 3 {
		m4 = os.Args[4]
	}

	h := lthash.New16()
	fmt.Printf("Adding: %s\n", m1)
	fmt.Printf("Adding: %s\n", m2)
	fmt.Printf("Adding: %s\n", m3)
	h.Add([]byte(m1))
	h.Add([]byte(m2))
	h.Add([]byte(m3))
	oldSum := h.Sum(nil)
	fmt.Printf("Sum (first 32 bytes): %x\n", oldSum[:32])

	fmt.Printf("\nRemoving: %s\n", m1)
	fmt.Printf("Adding: %s\n", m4)
	h.Remove([]byte(m1))
	h.Add([]byte(m4))
	newSum := h.Sum(nil)
	fmt.Printf("Sum (first 32 bytes): %x\n", newSum[:32])

	fmt.Printf("\nNow checking with: %s, %s, %s\n", m2, m3, m4)
	h = lthash.New16()
	h.Add([]byte(m2))
	h.Add([]byte(m3))
	h.Add([]byte(m4))
	newSum = h.Sum(nil)
	fmt.Printf("Sum (first 32 bytes): %x\n", newSum[:32])

	fmt.Printf("\nThe length of the hash is: %d bytes\n", len(newSum))

A sample run is [here]:

Adding: Edinburgh
Adding: Glasgow
Adding: Dundee
Sum (first 32 bytes): 3a2031807078ccf8c9ae4fd21ce6c4407e558e9671aedd9eff8758766d690944

Removing: Edinburgh
Adding: Stirling
Sum (first 32 bytes): 3c02c125fd9d9d679f5f8a536dbd6498ff55b5d1ec85440b91cd9a6969100cbf

Now checking with: Glasgow, Dundee, Stirling
Sum (first 32 bytes): 3c02c125fd9d9d679f5f8a536dbd6498ff55b5d1ec85440b91cd9a6969100cbf

The length of the hash is: 2048 bytes

Conclusions

As we move into a world which uses homomorphic encryption for computing, we will see the rise of naturally encrypted data. The use of homomorphic hashing can have great applications within consensus networks, and in hashing extremely large amount of data. For one thing, it is not good for our carbon footprint to waste energy in rehashing data, so incremental hashes have one core advantage of saving energy — as every tick of a clock consumes a little more energy.

References

[1] Bellare, M., & Micciancio, D. (1997, May). A new paradigm for collision-free hashing: Incrementality at reduced cost. In International Conference on the Theory and Applications of Cryptographic Techniques (pp. 163–192). Springer, Berlin, Heidelberg.

[2] Lewi, K., Kim, W., Maykov, I., & Weis, S. (2019). Securing Update Propagation with Homomorphic Hashing. Cryptology ePrint Archive.