Format Preserving Encryption with PythonFormat Preserving Encryption (FPE) is a method which allows the basic formatting of a message to stay in a similar format, and where the value itself is encrypted. It can be used to hide credit card details. |
Outline
Within tokenization, we can apply format-preserving encryption (FPE) methods, which will convert our data into a format which still looks valid, but which cannot be mapped to the original value. For example, we could hide Bob’s credit card detail into another valid credit card number, and which would not reveal his real number. A tokenization server could then convert the real credit card number into a format which still looked valid. For this, we have a key which takes the data and then converts it into a form which the same length as the original.
The method we use is based on a Feistel structure, and where we have a number of rounds, and then apply the key through a Feistel function for each round:
We thus split the data into blocks (typically 64-bits), and then split into two parts. We then take these splits into the left part and the right part, and feed through each round, and then swap them over. The ⊕ symbol is an exclusive-OR operator.
So, we have a problem here. In most encryption methods we deal with block sizes, such as 64 bit for DES and 128 bits for AES. The output will then be a multiple of 64 bits or 128 bits, as we cipher one block at a time. In FPE we want to have something which will match to the length of the input data. The solution is Format-preserving, Feistel-based encryption (FFX) and which produces an output which matches the length of the input.
NIST has thus defined a standard known as SP 800–38G, and which defines two FF schemes: FF1 and FF3. While these work on 128-bit block sizes, they can also work on blocks which have fewer bits than this. For this we have a key (K) and which creates a permutation of the bits to create an invertible version of the output.
For FF1 we have 10 rounds and for FF3 we have eight rounds. First, we split an input value of n characters into a number of characters (u and v — and where n = u + v):
For the encrypting process we use a modular addition (EX-OR) and for decryption, we use a modular subtraction. For each round, we split into a and b. For the F function in each round, we generate an HMAC output (using SHA-1) from the key (K), the bᵢ, and the counter value (i).
An outline of the code is:
import pyffx import sys msg='hello' al='0123456789abcdefghijklmnopqrstuvwxyz' if (len(sys.argv)>1): msg=str(sys.argv[1]) if (len(sys.argv)>2): al=str(sys.argv[2]) # e = pyffx.Integer(b'secret-key', length=4) # print (e.encrypt(1234)) e = pyffx.String(msg.encode(), alphabet=al, length=len(msg)) print("Alphabet: ",al) print("Message: ",msg) print ("\nEncrypted: ",e.encrypt(msg))
A sample run is:
Alphabet: 0123456789abcdefghijklmnopqrstuvwxyz Message: hello Encrypted: 3tyxt
Presentation
The following is an outline of a presentation [slides]: