What goes around, comes around again … Welcome, Code-based Cryptography

I have to smile when I see code-based methods coming back within a post quantum work, and where our existing public key methods will be…

What goes around, comes around again … Welcome, Code-based Cryptography

I have to smile when I see code-based methods coming back within a post-quantum work, and where our existing public key methods will be replaced by lattice, isogenies, hash-based, and code-based methods. I smile, because I started my career in cybersecurity and networking through data communications, and it’s great to see it coming back again.

I remember, too, watching the demonstration of audio CD-ROM on Tomorrow’s World and being amazing by its ability to cope with scratches on the disc. And so, what goes around, comes around again.

Part of my current work is looking at Reed-Solomon codes, and how they can be used with Shamir Shares, and a good deal of the theory goes back to a paper published by the might Robert McEliece [here]:

In the paper, Robert outlines that Shamir Secret Shares are actually just an example of the Reed-Solomon code, but where we can only reveal one of the factors. And, of course, we see McEliece’s work coming back into relevance with the Classic McEliece key exchange method for Post Quantum Cryptography, and which was defined [here]:

So let’s dip our toe in the water, and look at error-correcting code.

Hamming codes

With error detection, such as using parity bits and CRC-32, we can only detect when we receive an error. With an error correction code, we can not only detect that there has been an error, but we can correct the error.

One of the most popular methods is the Hamming code and which is a forward error correction (FEC) scheme. The error correction bits are known as Hamming bits and the number that need to be added to a data symbol is determined by the expression:

2^n>=(m+n+1)

where m is number of bits in the data symbol and n is number of Hamming bits.

Hamming bits are inserted into the message character as desired. Typically, they are added at positions that are powers of 2, i.e. the 1st, 2nd, 4th, 8th, 16th-bit positions, and so on. For example, to code the character 011001 then, starting from the right-hand side, the Hamming bits would be inserted into the 1st, 2nd, 4th and 8th-bit positions.

With (7,4) Hamming code we take 4 bits of data and add 3 Hamming bits to give 7 bits for each 4-bit value. We create a code generator matrix G and the parity-check matrix H. The input data is multiplied by G, and then to check the result is multiplied by H [here]:

If we have a data input of [1 0 1 0] then to create the message to send we get:

The bits transmitted will thus be “1 0 1 1 0 1 0”.

which identifies there are no errors.

For a data input of “1010”, we get the following:

Input           Coded (1 error)         Error Position
[1 0 1 0]       [1 0 1 1 0 1 0]         [0 0 0]
----------------------
Single Errors:
Input           Coded (1 error)         Error Position
[1 0 1 0]       [0 0 1 1 0 1 0]         [1 0 0]
[1 0 1 0]       [1 1 1 1 0 1 0]         [0 1 0]
[1 0 1 0]       [1 0 0 1 0 1 0]         [0 0 1]
[1 0 1 0]       [1 0 1 0 0 1 0]         [0 1 1]
[1 0 1 0]       [1 0 1 1 1 1 0]         [1 0 1]
[1 0 1 0]       [1 0 1 1 0 0 0]         [1 1 0]
[1 0 1 0]       [1 0 1 1 0 1 1]         [1 1 1]

It can be seen that the data is “1010” and after the Hamming bits are added the result is “1011010”. If there are no errors the result is “000”, which means there are no errors.

If we add an error of the first coded bit we get a received value of “0 0 1 1 0 1 0”. The result is then “100” which identifies that Bit 0 is in error. If we look at an error in the next bit, we get a value of “010” which identifies that position is in error, and so each result correctly identifies the error, and we can thus correct it (for single-bit errors).

The following is the code used [here]:

import numpy
import sys

val1='1011'
if (len(sys.argv)>1):
  val1=sys.argv[1]

def encode(m, g):
  en = numpy.dot(m, g)%2
  return en

def decode(m, h):
  dec = numpy.dot(h, m)%2
  return dec

def flipbit(enc,bitpos):

  if (enc[bitpos]==1):
    enc[bitpos]=0
  else:
    enc[bitpos]=1
  return enc
	
def hamming(inp):
  g =  numpy.array([[1, 0, 0, 0, 1, 1, 1],[0, 1, 0, 0, 0, 1, 1],[ 0, 0 ,1 ,0 ,1 ,0 ,1],[0, 0, 0, 1, 1, 1, 0]])
  h = numpy.array([[ 1, 0, 1, 1, 1, 0, 0],[1 ,1 ,0 ,1 ,0 ,1 ,0],[ 1 ,1, 1, 0, 0, 0, 1],])
  
  inp = inp[:4]

  enc = encode(inp, g)

  dec = decode(enc, h)
  print ("Input  \t\tCoded (1 error)  \tError Position")
  print (inp,"\t",enc,"\t",dec)

  print ("----------------------")
  print ("Single Errors:")
  print ("Input  \t\tCoded (1 error)  \tError Position")
  for i in range (0,7):
    enc = encode(inp, g)
    enc1=flipbit(enc,i)
    dec = decode(enc1, h)
    print (inp,"\t",enc1,"\t",dec)

inp =  numpy.frombuffer(val1.encode(), dtype=numpy.uint8)-ord('0')
 
hamming(inp)

We can expand on this, and use any size of binary code:

https://asecuritysite.com/comms/hamm3

Reed-Solomon

In 1960, Irving Reed and Gustave Solomon proposed the Reed-Solomon method [1]:

With this we have k data bits and 2t parity bits to give us a total of n bits, and which is defined an an (n,k) code for m-bit symbols. The value of t is the number of bits that can be corrected, and the block length (n) is defined by (2m-1) symbols.

To encode we create a generator function that is a polynomial:

All of the codewords created by this generator will then be exactly divisible by it. We then create a polynomial from the message (p(x)) so that:

This can be generated using Lagrange interpolation (as we would with Shamir Shares). The sender then sends:

and thus the value of s(x) will be actually divisible by g(x). If it is not, the receiver can determine that there has been an error, and where the error is determined from receiving:

and where e(x) will provide the locations of the errors. The following is an outline [here]:

import reedsolo 
import random
import sys

strin='Hello World'

rs = reedsolo.RSCodec(10)

str1=rs.encode(strin)

print 'Input string:\t\t',strin
print 'String after SR:\t',str1,'\nHex:',str(str1).encode("hex")

print '\n=== 1 error==='

i = random.randrange(0,len(str1))

str1=str1.replace(chr(str1[i]),'X',1)

print 'String with error:\t',str1
print 'Corrected string:\t',rs.decode(str1)

print '\n=== 2 errors==='

i = random.randrange(0,len(str1))

str1=str1.replace(chr(str1[i]),'X',1)

print 'String with error:\t',str1
print 'Corrected string:\t',rs.decode(str1)

print '\n=== 3 errors==='

i = random.randrange(0,len(str1))

str1=str1.replace(chr(str1[i]),'X',1)
print 'String with error:\t',str1
print 'Corrected string:\t',rs.decode(str1)

print '\n=== 4 errors==='

i = random.randrange(0,len(str1))

str1=str1.replace(chr(str1[i]),'X',1)

print 'String with error:\t',str1
print 'Corrected string:\t',rs.decode(str1)

print '\n=== 5 errors==='

i = random.randrange(0,len(str1))

str1=str1.replace(chr(str1[i]),'X',1)

print 'String with error:\t',str1
print 'Corrected string:\t',rs.decode(str1)

A sample run is:

Input string:  falkirk
String after SR: falkirkWþ�11µÖöR 
Hex: 66616c6b69726b57fe003131b5d6f6521e

=== 1 error===
String with error: falkXrkWþ�11µÖöR
Corrected string: falkirk

=== 2 errors===
String with error: falkXXkWþ�11µÖöR
Corrected string: falkirk

=== 3 errors===
String with error: falkXXkWþX11µÖöR
Corrected string: falkirk

=== 4 errors===
String with error: falkXXkWþX11µXöR
Corrected string: falkirk

=== 5 errors===
String with error: falkXXkWþX11XXöR
Corrected string: falkirk

Conclusions

If you are interested in the Post Quantum Crypto methods, they are here:

https://asecuritysite.com/pqc

References

[1] Reed, I. S., & Solomon, G. (1960). Polynomial codes over certain finite fields. Journal of the society for industrial and applied mathematics, 8(2), 300–304.