Hashing To A Field: Expanding The Message

Domain string tags (DSTs) are increasingly used in hashing methods. In Figure 1, Bob has a message and wants a hashed output of a given…

Hashing To A Field: Expanding The Message

Domain string tags (DSTs) are increasingly used in hashing methods. In Figure 1, Bob has a message and wants a hashed output of a given number of bytes. For this, we can add the Domain Separation Tag (DST) and a definition of the length of the output:

Figure 1: Applying a DST to the hashing fuction and producing a given hash length

Overall, we should not just concatenate the message and the DST, as it will not always get a unique set of (message, DST) pairs. For example, if we had two DSTs of “DEF” and “EF”, then, if we concatenate with a message of “ABC” and “ABCD”, we would get to identical pairs of (“ABC” || “DEF”) and (“ABCD” || “EF”) — as both would equal “ABCDEF”. The DST is thus used to initialise the hashing function rather than concatenating it to the message (as we would normally do when we add salt to a password).

Hash to curve

Let’s say we have almost 2²⁵⁶ integer point values. This will give us 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936 different (x,y) points. Can we find a way to hash some data onto one of these points, and so it is not possible to know the data that resulted in that point, and that it is highly unlikely for two different data elements to end up with the same data point?

In ECC (Elliptic Curve Cryptography) we often have to hash our data to a scalar value or to a point on the curve. Once hashed, it is difficult to reverse the operation (unless we did a brute-force search for it). We can either encode-to-curve (non-uniform) to hash-to-curve. Overall we often use the method of encoding/hashing to a curve with PAKE (password-authenticated key exchange), IBE (Identity-based Encryption) and within Verifiable Random Functions (VRF) and Oblivious Pseudorandom Functions (VOPRF).

One method of doing this is to hash to a value, and then find the nearest point — known as “try-and-increment” or “hunt-and-peck”. Unfortunately, these do not have a constant time implementation and could be open to side-channel analysis. One example of an attack against this is with WPA3, and where the Dragonblood vulnerability exploited a side-channel attack on the hashing of a password to the curve.

One method of overcoming these side channel attacks is proposed an Internet-Draft standard by the Internet Engineering Task Force (IETF)[here] related to “Hashing to Elliptic Curves”:

We encode and hashing a value into a point such as using the Cloudflare CIRLC library. The main hashing methods are SHA-256, SHA-384 and SHA-512, and these match to the curves of P-256, P384 and P521, respectively.

Domain seperation tag (DST)

Domain seperation in hashing, allows us to initialise the hashing method with a given domain string. This will help to seperate out different applications. For example, we might have an email application and a digital signature application. For this, we could define two application domains as strings, such as “Email”, and “Digital Signature”. The SHAKE128 hash, for example, for a 32-byte output for the email application can then be:

hash = SHAKE128(msg,"Email","",32)

and for “Digital Signature”:

hash = SHAKE128(msg,"Digital Signature","",32)

The general format is:

hash = SHAKE128(data, S,N,size)

and where S is the customization string for each application, and N is a function level string. With the function-level string, we should only use strings defined by NIST, but where the customization string can be adopted by different applications.

Expander

In most cryptographic hashing methods, we get a fixed length hash. For example, SHA1 gives us a 160-bit hash, and SHA-256 gives us a 256-bit hash. A cryptographic expander generates an number of arbitrary bytes using either an XOF (extendable output function) function or a hash function. This can be done with a Merkle-Damgård (MD) hash function of SHA256 or SHA512 as the hashing fuction, or as an XOF with SHAKE128 or SHA256.

For this, we need a DST, the length of expansion, the message, and either the MD or XOF method. In this case, we will use four domain tags for each of the different hashing/XOF functions: “QUUX-V01-CS02-with-expander-SHA256–128”, “QUUX-V01-CS02-with-expander-SHA256–128”, “QUUX-V01-CS02-with-expander-SHA512–256”, and “QUUX-V01-CS02-with-expander-SHAKE256”, but these could be defined for the application that we have:

 if method == "SHA256" {
  DST := "QUUX-V01-CS02-with-expander-SHA256-128"
  exp = expander.NewExpanderMD(crypto.SHA256, []byte(DST))
 }
 if method == "SHA512" {
  DST := "QUUX-V01-CS02-with-expander-SHA512-256"
  exp = expander.NewExpanderMD(crypto.SHA512, []byte(DST))
 }
 if method == "SHAKE128" {
  DST := "QUUX-V01-CS02-with-expander-SHAKE128"
  exp = expander.NewExpanderXOF(xof.SHAKE128, uint(128), []byte(DST))
 }
 if method == "SHAKE256" {
  DST := "QUUX-V01-CS02-with-expander-SHAKE256"
  exp = expander.NewExpanderXOF(xof.SHAKE256, uint(256), []byte(DST))
 }

We can then take the message and the the length of the expanded output and produce our hashed output:

got := exp.Expand([]byte(Msg), uint(lenBytes))

The full code is:

package main

import (
 "crypto"
 _ "crypto/sha256"
 _ "crypto/sha512"
 "fmt"
 "os"
 "strconv"

 "github.com/cloudflare/circl/expander"
 "github.com/cloudflare/circl/xof"
)

func main() {

 method := "SHA256"
 Msg := ""
 size := "0x20"
 argCount := len(os.Args[1:])

 if argCount > 0 {
  Msg = string(os.Args[1])
 }
 if argCount > 1 {
  method = string(os.Args[2])
 }
 if argCount > 2 {
  size = string(os.Args[3])
 }

 var exp expander.Expander

 DST := "QUUX-V01-CS02-with-expander-SHA256-128"
 exp = expander.NewExpanderMD(crypto.SHA256, []byte(DST))

 if method == "SHA256" {
  DST := "QUUX-V01-CS02-with-expander-SHA256-128"
  exp = expander.NewExpanderMD(crypto.SHA256, []byte(DST))
 }
 if method == "SHA512" {
  DST := "QUUX-V01-CS02-with-expander-SHA512-256"
  exp = expander.NewExpanderMD(crypto.SHA512, []byte(DST))
 }
 if method == "SHAKE128" {
  DST := "QUUX-V01-CS02-with-expander-SHAKE128"
  exp = expander.NewExpanderXOF(xof.SHAKE128, uint(128), []byte(DST))
 }
 if method == "SHAKE256" {
  DST := "QUUX-V01-CS02-with-expander-SHAKE256"
  exp = expander.NewExpanderXOF(xof.SHAKE256, uint(256), []byte(DST))
 }

 lenBytes, _ := strconv.ParseUint(size, 0, 64)

 got := exp.Expand([]byte(Msg), uint(lenBytes))

 fmt.Printf("Message:\t%s\n", Msg)
 fmt.Printf("Domain string:\t%s\n", DST)
 fmt.Printf("Method:\t\t%s\n", method)
 fmt.Printf("\nLength:\t%d\n", lenBytes)
 fmt.Printf("Result: %x\n", got)

}

A sample test vector from [here] is:

We can see that using [here], we get the same result:

The main application of this is with hashing to a curve, and is defined [here].

Conclusion

The expander function is useful within hashing a message to a curve, and which can include a DST as part of this.

You can find the expander function here:

https://asecuritysite.com/circl/go_expander

Note, that it has fixed DSTs for testing purposes, but where the DSTs would differ from the test values.