Creating An Encryption Key from a Passphrase

I’ve lost count of the number of programs I’ve reviewed where a hashing method has been used to generate an encryption key from a passphrase. Overall this is not good practice: a single fast hash makes it relatively easy to brute-force the passphrase behind the key, and many hashing methods have structural flaws of their own. For example, MD5, SHA-1 and SHA-256 are all open to the length extension attack because of the way they chain their internal state. To overcome these problems we can use a KDF (Key Derivation Function), and one of the most useful is the HMAC-based version: HKDF (HMAC-based Key Derivation Function).

Within HKDF we can use most of the standard hashing methods to generate an encryption key of a given length. For a 128-bit key we would generate 16 bytes, for a 192-bit key 24 bytes, for a 256-bit key 32 bytes, and for a 512-bit key 64 bytes. HKDF copes with this because it has a variable-length output, whereas most hashing functions only produce a fixed-length output. MD5, for example, has a 128-bit output, SHA-1 has 160 bits, and SHA-256 has 256 bits.
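To make the variable-length output concrete, here is a minimal sketch of the extract-and-expand steps that HKDF defines in RFC 5869, written with only Python's standard hmac and hashlib modules. The function name hkdf_sketch and its default arguments are my own illustrative choices, and this is a teaching aid rather than a replacement for a reviewed library implementation.

```python
import hashlib
import hmac


def hkdf_sketch(ikm: bytes, length: int, salt: bytes = b"", info: bytes = b"",
                hash_name: str = "sha256") -> bytes:
    """Derive `length` bytes from `ikm` using HKDF's extract-then-expand steps."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("HKDF output is limited to 255 hash-sized blocks")

    # Extract: squeeze the input keying material into one fixed-size key.
    if not salt:
        salt = b"\x00" * hash_len  # RFC 5869 default when no salt is supplied
    prk = hmac.new(salt, ikm, hash_name).digest()

    # Expand: chain HMAC blocks until enough output bytes exist.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]


passphrase = b"The quick brown fox jumps over the lazy dog"
for bits in (128, 192, 256, 512):
    key = hkdf_sketch(passphrase, bits // 8)
    print(f"{bits}-bit key ({len(key)} bytes): {key.hex()}")
```

Because the expand step simply keeps producing blocks, a shorter key from the same inputs is a prefix of a longer one, so keys for different purposes should be separated with different info values rather than different lengths.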

HKDF

One of the most widely used cryptography libraries for Python is cryptography. It has a number of KDF methods in its Hazmat primitives. In this case, we will generate a number of outputs for HKDF, and also for scrypt, PBKDF2 and ConcatKDF. For HKDF, we need the input key material as bytes, a salt value, the hashing method to use, and the output length. The standard format is:

hkdf = HKDF(algorithm=type, length=length,salt=salt, info=b"")
mykey=hkdf.derive(data)

Here type is the hashing algorithm used, and length is the number of bytes for the output. We can then wrap the HKDF call up into a function, and the code becomes:

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.kdf.scrypt import Scrypt
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
from cryptography.hazmat.primitives.kdf.concatkdf import ConcatKDFHash

import binascii
import sys

# Defaults, overridden by the command-line arguments further down
st = "00"          # input key material (the passphrase)
hex = False        # True when the key and salt are supplied as hex strings
showhex = "No"
length = 16        # output key length in bytes
slt = ""           # salt

def show_hash(name, hash_alg, data, length, salt):
    # HKDF over the chosen hash, with an empty info field
    hkdf = HKDF(algorithm=hash_alg, length=length, salt=salt, info=b"")
    mykey = hkdf.derive(data)

    hex_out = binascii.b2a_hex(mykey).decode()
    b64_out = binascii.b2a_base64(mykey).decode()  # ends with a newline
    print(f"HKDF {name}: {hex_out} {b64_out}")

def show_hash_pbkdf2(name, hash_alg, data, length, salt):
    # PBKDF2 with 1,000 iterations (every method prints with the same "HKDF" prefix)
    kdf = PBKDF2HMAC(algorithm=hash_alg, length=length, salt=salt, iterations=1000)
    mykey = kdf.derive(data)
    hex_out = binascii.b2a_hex(mykey).decode()
    b64_out = binascii.b2a_base64(mykey).decode()
    print(f"HKDF {name}: {hex_out} {b64_out}")

def show_hash_scrypt(name, data, length, salt):
    # scrypt with n=2**14, r=8, p=1
    kdf = Scrypt(length=length, salt=salt, n=2**14, r=8, p=1)
    mykey = kdf.derive(data)
    hex_out = binascii.b2a_hex(mykey).decode()
    b64_out = binascii.b2a_base64(mykey).decode()
    print(f"HKDF {name}: {hex_out} {b64_out}")

def show_hash_concat(name, hash_alg, data, length, salt):
    # ConcatKDF takes no salt; the argument is accepted only to keep the calls uniform
    kdf = ConcatKDFHash(algorithm=hash_alg, length=length, otherinfo=b"")
    mykey = kdf.derive(data)
    hex_out = binascii.b2a_hex(mykey).decode()
    b64_out = binascii.b2a_base64(mykey).decode()
    print(f"HKDF {name}: {hex_out} {b64_out}")


# Command-line arguments: key, "yes" for hex input, output length in bytes, salt
if len(sys.argv) > 1:
    st = str(sys.argv[1])

if len(sys.argv) > 2:
    showhex = str(sys.argv[2])

if len(sys.argv) > 3:
    length = int(sys.argv[3])

if len(sys.argv) > 4:
    slt = str(sys.argv[4])

if showhex == "yes":
    hex = True

try:
    # Treat the key and salt either as hex strings or as plain text
    if hex:
        data = binascii.a2b_hex(st)
        salt = binascii.a2b_hex(slt)
    else:
        data = st.encode()
        salt = slt.encode()

    print("Key: ", st)
    print(" Hex: ", binascii.b2a_hex(data).decode())

    print("Salt: ", slt)
    print(" Hex: ", binascii.b2a_hex(salt).decode())

    print()

    # The first entry uses BLAKE2b with a 64-byte digest, although its label reads "Blake2p"
    show_hash("Blake2p (64 bytes)", hashes.BLAKE2b(64), data, length, salt)
    show_hash("Blake2s (32 bytes)", hashes.BLAKE2s(32), data, length, salt)
    show_hash("MD5", hashes.MD5(), data, length, salt)
    show_hash("SHA1", hashes.SHA1(), data, length, salt)
    show_hash("SHA224", hashes.SHA224(), data, length, salt)
    show_hash("SHA256", hashes.SHA256(), data, length, salt)
    show_hash("SHA384", hashes.SHA384(), data, length, salt)
    show_hash("SHA3_224", hashes.SHA3_224(), data, length, salt)
    show_hash("SHA3_256", hashes.SHA3_256(), data, length, salt)
    show_hash("SHA3_384", hashes.SHA3_384(), data, length, salt)
    show_hash("SHA3_512", hashes.SHA3_512(), data, length, salt)
    show_hash("SHA512", hashes.SHA512(), data, length, salt)
    show_hash("SHA512_224", hashes.SHA512_224(), data, length, salt)
    show_hash("SHA512_256", hashes.SHA512_256(), data, length, salt)
    show_hash_pbkdf2("PBKDF2", hashes.SHA256(), data, length, salt)
    show_hash_scrypt("Scrypt SHA256", data, length, salt)
    show_hash_concat("Concat SHA256", hashes.SHA256(), data, length, salt)

except Exception as e:
    print(e)

A sample run with a passphrase of “The quick brown fox jumps over the lazy dog” and no salt gives:

Key:  The quick brown fox jumps over the lazy dog
Hex: 54686520717569636b2062726f776e20666f78206a756d7073206f76657220746865206c617a7920646f67
Salt:
Hex:

HKDF Blake2p (64 bytes): 2391a02087f2f31bae5d18c58640c26b I5GgIIfy8xuuXRjFhkDCaw==

HKDF Blake2s (32 bytes): 4a49cb3ccee24a80d197a84e0db4f3c2 SknLPM7iSoDRl6hODbTzwg==

HKDF MD5: 51d4215f07f07c624b1a00f56d8745e3 UdQhXwfwfGJLGgD1bYdF4w==

HKDF SHA1: 7972ff9f0a22487ceed6cef9621a4e09 eXL/nwoiSHzu1s75YhpOCQ==

HKDF SHA224: b19e1dda148a7bb06755f39d87cc3c53 sZ4d2hSKe7BnVfOdh8w8Uw==

HKDF SHA256: 5203a33ae802819576dfef424acaed0e UgOjOugCgZV23+9CSsrtDg==

HKDF SHA384: 8e2553493bcfd4cfb4001c8d8e6f4bf9 jiVTSTvP1M+0AByNjm9L+Q==

HKDF SHA3_224: e5f9b14ef6a6f389f2382f3cdc2f1110 5fmxTvam84nyOC883C8REA==

HKDF SHA3_256: 67296413e6d144339a22ed3c4f2dc8b5 ZylkE+bRRDOaIu08Ty3ItQ==

HKDF SHA3_384: 70e9ce8dc0b8f867ca6b33112b1d6b98 cOnOjcC4+GfKazMRKx1rmA==

HKDF SHA3_512: 4f41c1fbe1b8f2b746f8059003373b12 T0HB++G48rdG+AWQAzc7Eg==

HKDF SHA512: 97e73e136f1bc96515e55bf496aef006 l+c+E28byWUV5Vv0lq7wBg==

HKDF SHA512_224: 8275ff941870018b8269c8680a0349cb gnX/lBhwAYuCachoCgNJyw==

HKDF SHA512_256: 731b7eb0097a5b0e63f44fccf8325120 cxt+sAl6Ww5j9E/M+DJRIA==

HKDF PBKDF2: 88cc217342b5695d429cee202c412ead iMwhc0K1aV1CnO4gLEEurQ==

HKDF Scrypt SHA256: c6afcb0d03ed20937fb22786398bdde4 xq/LDQPtIJN/sieGOYvd5A==

HKDF Concat SHA256: 4946a3cc69d82fde0f63187818416d51 SUajzGnYL94PYxh4GEFtUQ==


scrypt and PBKDF2

Two other methods that we added are scrypt and PBKDF2. These are excellent KDFs for passphrases, as they are deliberately slow, and we can make them as slow as we like, which raises the cost of every guess an attacker makes at our passphrase. scrypt is also memory-hard, and it is this memory demand that limits how many guesses a GPU can run in parallel. With scrypt we define the salt, the length, and the cost parameters: n is the CPU/memory cost parameter, r is the block size, and p is the parallelization parameter:

hkdf = Scrypt(length=length,salt=salt,n=2**14,r=8, p=1)
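As a rough worked example of what those cost parameters mean, scrypt's working memory is approximately 128 × n × r bytes per derivation. The sketch below checks that arithmetic for n=2**14 and r=8, then derives a 256-bit key with the standard library's hashlib.scrypt instead of the cryptography class used above. The helper name derive_scrypt_key, the random 16-byte salt and the maxmem value are my assumptions for illustration.

```python
import hashlib
import os

N, R, P = 2**14, 8, 1  # same cost parameters as the Scrypt call above

# Approximate working memory needed for one derivation (one password guess).
approx_mem_bytes = 128 * N * R
print(f"approximate memory per guess: {approx_mem_bytes // 2**20} MiB")


def derive_scrypt_key(passphrase: bytes, salt: bytes, length: int = 32) -> bytes:
    """Derive `length` bytes with scrypt; maxmem is raised to leave headroom."""
    return hashlib.scrypt(passphrase, salt=salt, n=N, r=R, p=P,
                          maxmem=64 * 2**20, dklen=length)


salt = os.urandom(16)  # assumption: a fresh random salt, stored next to the ciphertext
key = derive_scrypt_key(b"my passphrase", salt)
print(f"256-bit key: {key.hex()}")
```

Doubling n doubles both the time and the memory per guess, which is why n is usually the first parameter to raise.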

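PBKDF2 works by hashing a number of times (known as iterations). Each iteration depends on the output of the previous one, so the work for a single guess is sequential and cannot be shortcut. With this we just define the number of iterations, and the more iterations, the slower it will be to crack. The 1,000 used here keeps the demo fast; a real deployment would normally use a far higher count: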

hkdf = PBKDF2HMAC(algorithm=type, length=length,salt=salt, iterations=1000)
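To show why the iterations have to run one after another, here is a small sketch that writes out the first output block of PBKDF2-HMAC-SHA256 by hand and compares it with the standard library's hashlib.pbkdf2_hmac. The function name pbkdf2_first_block and the fixed salt are illustrative assumptions, and the hand-written loop only covers outputs of up to 32 bytes.

```python
import hashlib
import hmac


def pbkdf2_first_block(passphrase: bytes, salt: bytes, iterations: int) -> bytes:
    """First 32-byte block of PBKDF2-HMAC-SHA256, written out step by step."""
    # Round 1 mixes the salt with the block index (1, as a 4-byte big-endian integer).
    u = hmac.new(passphrase, salt + (1).to_bytes(4, "big"), "sha256").digest()
    result = int.from_bytes(u, "big")
    for _ in range(iterations - 1):
        # Each round is keyed on the passphrase and fed the previous round's output,
        # so round i cannot start until round i-1 has finished.
        u = hmac.new(passphrase, u, "sha256").digest()
        result ^= int.from_bytes(u, "big")
    return result.to_bytes(32, "big")


salt = b"example-salt"  # illustrative only; use a random salt in practice
manual = pbkdf2_first_block(b"my passphrase", salt, 1000)
library = hashlib.pbkdf2_hmac("sha256", b"my passphrase", salt, 1000, dklen=32)
print("manual result matches hashlib:", manual == library)
```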

Both methods take away much of the advantage of GPU-based cracking rigs: instead of testing billions of passwords per second against a plain hash, an attacker may be limited to hundreds per second or fewer, depending on the cost parameters chosen.
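The drop in guess rate can be seen even on a single CPU core. The sketch below counts how many guesses per second a plain SHA-256 hash allows compared with PBKDF2 at 100,000 iterations, using only the standard library. The helper guesses_per_second, the 0.2-second sampling window and the iteration count are my assumptions, and the absolute figures say nothing about real cracking hardware; only the ratio between the two is of interest.

```python
import hashlib
import time


def guesses_per_second(derive, seconds: float = 0.2) -> float:
    """Call derive() on successive candidate passwords and return the rate."""
    count = 0
    start = time.perf_counter()
    while time.perf_counter() - start < seconds:
        derive(f"guess-{count}".encode())
        count += 1
    return count / (time.perf_counter() - start)


salt = b"example-salt"  # illustrative only
fast = guesses_per_second(lambda pw: hashlib.sha256(pw).digest())
slow = guesses_per_second(
    lambda pw: hashlib.pbkdf2_hmac("sha256", pw, salt, 100_000, dklen=32))

print(f"plain SHA-256: {fast:,.0f} guesses/s")
print(f"PBKDF2 (100,000 iterations): {slow:,.1f} guesses/s")
print(f"slowdown: roughly {fast / slow:,.0f}x")
```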

Conclusions

Quite simple … don’t hash your password to get an encryption key; use a proper KDF. If you need speed, use HKDF (which suits input that is already hard to guess), but if you need to defeat the crackers, use scrypt or PBKDF2.
