Sometimes in Cybersecurity, Boring is Good

Correcting Errors Is Good

Sometimes in Cybersecurity, Boring is Good

Correcting Errors Is Good

Sometimes in cybersecurity, boring is good, and so writes T.Lange for the submission of the McEliece Post Quantum Crypto (PQC) submission:

And while researchers gush about lattice methods and their learning with errors and learning with rounding, and about the beauty of isogenies, the good old boring McEliece method plods on, and does “what it says on the tin”.

And so the classic McEliece method still has a chance to be the PCQ method of choice, as it is one of the three finalists in the NIST competition for public key encryption/key exchange:

Overall, though, the McEliece method has a bit of a PR problem, and one of its weaknesses is that while there are many YouTube videos explaining lattice methods and isogenies, unfortunately, the McEliece method has been a bit forgotten about. Perhaps, it’s because it crosses over from data communications to cryptography, that there is a little bit of a misunderstanding? But, I think the method is just as intriguing as lattice, multivariates and isogenies, so I am going to explain a simple example.

Please bare with it. In this example, I will use Hamming codes to correct errors in the cipher, in order to explain the core principles. First I need to explain Hamming codes.

Hamming codes

With (7,4) Hamming code we take 4 bits of data and add 3 Hamming bits to give 7 bits for each 4-bit value. We create a code generator matrix G and the parity-check matrix H. The input data is multiplied by G, and then to check the result is multiplied by H:

If we have a data input of [1 0 1 0] then to create the message to send we get:

The bits transmitted will thus be “1 0 1 1 0 1 0”.

which identifies there are no errors. For a data input of “1010”, we get the following [here]:

Input           Coded (1 error)         Error Position
[1 0 1 0]       [1 0 1 1 0 1 0]         [0 0 0]
----------------------
Single Errors:
Input           Coded (1 error)         Error Position
[1 0 1 0]       [0 0 1 1 0 1 0]         [1 0 0]
[1 0 1 0]       [1 1 1 1 0 1 0]         [0 1 0]
[1 0 1 0]       [1 0 0 1 0 1 0]         [0 0 1]
[1 0 1 0]       [1 0 1 0 0 1 0]         [0 1 1]
[1 0 1 0]       [1 0 1 1 1 1 0]         [1 0 1]
[1 0 1 0]       [1 0 1 1 0 0 0]         [1 1 0]
[1 0 1 0]       [1 0 1 1 0 1 1]         [1 1 1]

It can be seen that the data is “1010” and after the Hamming bits are added the result is “1011010”. If there are no errors the result is “000”, which means there are no errors.

If we add an error of the first coded bit we get a received value of “0 0 1 1 0 1 0”. The result is then “100” which identifies that Bit 0 is in error. If we look at an error in the next bit, we get a value of “010” which identifies that position is in error, and so each result correctly identifies the error, and we can thus correct it (for single-bit errors). And now we will use this method to perform our encryption.

McEliece using Hamming codes

I illustrate the process with a (7,4) Hamming code that corrects one error and use the example from the previous section. The generator matrix is then:

             1 0 0 0 1 1 1
         G = 0 1 0 0 0 1 1
             0 0 1 0 1 0 1
             0 0 0 1 1 1 0

Bob then selected a scrambler matrix (S):

              1  1  0  1
          S = 1  0  0  1
              0  1  1  1
              1  1  0  0

And also a permutation matrix:

              0  1  0  0  0  0  0 
              0  0  0  1  0  0  0
              0  0  0  0  0  0  1
         P =  1  0  0  0  0  0  0
              0  0  1  0  0  0  0
              0  0  0  0  0  1  0
              0  0  0  0  1  0  0

Bob public key then becomes the dot product of these matrices

                  1 1 0 1 0 1 0
     G' = SGP   = 1 1 0 0 1 0 0
                  1 0 0 1 0 0 1
                  0 1 1 1 0 0 0

Bob will use G′ as a public key, but S, G and P are secret. Next, for Alice to send a ciphered message to Bob, she converts the message into binary vectors of length k. For a message of m, Alice will create randomly a binary n -vector of weight t, and randomly places t ones in a zero vector of length n). This is then sent to Bob:

Bob then decrypts with:

At this stage, we will use our H matrix to determine the syndrome, and identify where the error is. As the position of the error would have changed because of the permutation matrix, we need to remap the position of the error. The code to achieve this is here:

import numpy
import sys
import io

g =numpy.array([[1,0,0,0,1,1,1],[0,1,0,0,0,1,1],[0,0,1,0,1,0,1],[0,0,0,1,1,1,0]])

h=numpy.array([[1,0,0,0,1,1,1],[0,1,0,1,0,1,1],[0,0,1,1,1,0,1]])

s =numpy.array([[1,1,0,1],[1,0,0,1],[0,1,1,1],[1,1,0,0]])
p=numpy.array([[0,1,0,0,0,0,0],[0,0,0,1,0,0,0],[0,0,0,0,0,0,1],[1,0,0,0,0,0,0],[0,0,1,0,0,0,0],[0,0,0,0,0,1,0],[0,0,0,0,1,0,0]])

mapping=([4,1,5,2,7,6,3])

print('G:\n',g)
print('S:\n',s)
print('P:\n',p)

matrix = numpy.dot(s,g)%2
G_1 = numpy.dot(matrix,p)%2

print('Result:\n',G_1)

message = ([1,1,0,0])
e = ([0,0,0,0,1,0,0])

if (len(sys.argv)>1):
        message=numpy.genfromtxt(io.StringIO(sys.argv[1]),delimiter=',')

res=numpy.dot(message,G_1)%2

cipher = numpy.add(res,e)%2

print('\nCipher:',cipher)

p_1=numpy.linalg.inv(p)%2

print('\nP^{-1}:\n',p_1)

y_1 = numpy.dot(cipher,p_1)%2

syndrome = numpy.dot(h,numpy.transpose(cipher))%2

error = syndrome[0]+2*syndrome[1]+4*syndrome[2]
print("\nsyndrome\':",syndrome,error)
error=mapping[int(error-1)]
print ("Error position: ",error)
print ("y': ",y_1)
y_1[error-1]=(y_1[error-1]+1)%2
print ("y' (corrected): ",y_1)

s_1=numpy.linalg.inv(s)%2

print ('\nS^{-1}:\n',s_1)

# syn = ([1,0,0,0])

decipher = numpy.dot(y_1[0:4],s_1)%2

print ('\nDecipher:',decipher)

A sample run is [here]:

Message:
 [1. 0. 1. 1.]
G:
 [[1 0 0 0 1 1 1]
 [0 1 0 0 0 1 1]
 [0 0 1 0 1 0 1]
 [0 0 0 1 1 1 0]]
S:
 [[1 1 0 1]
 [1 0 0 1]
 [0 1 1 1]
 [1 1 0 0]]
P:
 [[0 1 0 0 0 0 0]
 [0 0 0 1 0 0 0]
 [0 0 0 0 0 0 1]
 [1 0 0 0 0 0 0]
 [0 0 1 0 0 0 0]
 [0 0 0 0 0 1 0]
 [0 0 0 0 1 0 0]]
Result (GSP):
 [[1 1 0 1 0 1 0]
 [1 1 0 0 1 0 0]
 [1 0 0 1 0 0 1]
 [0 1 1 1 0 0 0]]

Cipher: [0. 0. 1. 1. 1. 1. 1.]

P^{-1}:
 [[0. 0. 0. 1. 0. 0. 0.]
 [1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 0. 1. 0.]
 [0. 0. 1. 0. 0. 0. 0.]]

syndrome': [1. 1. 0.] 3.0
Error position:  5
y':  [0. 1. 1. 0. 1. 1. 1.]
y' (corrected):  [0. 1. 1. 0. 0. 1. 1.]

S^{-1}:
 [[1. 1. 0. 1.]
 [1. 1. 0. 0.]
 [0. 1. 1. 1.]
 [1. 0. 0. 1.]]

Decipher: [1. 0. 1. 1.]

And there you go. In real life, we use Goppa codes rather than Hamming codes, but the method is much the same.

Unfortunately the inventor of the method — Robert J. McEliece - passed away in May 2019, but his legacy lives on:

Robert J. McEliece, RIP: Who Laid Down The Foundations Of A Quantum Robust Future
We live in a world that is dominated by large (and faceless) corporation, but it is individuals who have really build…medium.com

Conclusion

To me, I love all the post-quantum methods. It is like someone took all the classic physics methods away, and said they would be replaced with a whole new set of methods. I just love the learning involved in investigating a whole new world of trust. One thing that is sure, is that RSA, ECC and DH are on their way out.