Knowing Your Nearest Neighbour … Meet Euclidian Distances

We increasingly have complex datasets, and where we need to find the nearest match of a given set of values. For example, we may have a…

Euclidean Distances And Polynomials

Polynomial operations are increasingly used in cryptography as a replacement for matrix operations. This is especially true in lattice encryption. So, let’s look at an example of using polynomials, and apply it to detect the difference between two vectors.

We increasingly have complex datasets, and where we need to find the nearest match of a given set of values. For example, we may have a database of facial points, and where we hold the measurements for a scan of a face. In the following, we see measurements for Billy Connolly’s right pupil, the length of his nose, and the length of his upper lip [here]:

When a new face is being scanned, we will thus go through our database of these features and find the one that matches closest to it. For this, we often use a Euclidean distance to find the closest match.

Euclidian distance

With the Euclidean distance, we measure the distance between two vector points, which is often used to find out the nearest vector to a given point. The Euclidean distance between (6,4,3) and (9,2,1) will thus be:

and which gives the length of the line that connects the two points. Overall, we can use Euclidean distances in machine learning and cryptography to determine the nearest similarity to something.

Often in cryptography and machine learning, we represent these as polynomial values. For example, we might have three data sets of A=[5,4,1,11,10], B=[3,3,2,10,8] and C=[5,6,1,12,9], and then can compute the Euclidean distance between them using polynomial operations:

a=5x⁴+4x³+x²+11x+10

b=3x⁴+3x³+2x²+10x+8

c=5x⁴+6x³+x²+12x+9

In Python we can use Numpy to convert these [here]:

A=[5,4,1,11,10]
B=[3,3,2,10,8]
C=[5,6,1,12,9]

a=np.poly1d(A)
b=np.poly1d(B)
c=np.poly1d(C)

and then to compute the Euclidean distance with:

distab = np.linalg.norm(a - b)
distac = np.linalg.norm(a - c)
distbc = np.linalg.norm(b - c)

The advantage of using polynomials in machine learning and cryptography is that we can constrain our values within a finite field (and using mod p operations), along with being able to use add, subtract, multiply and divide operations. Here is an outline of the code [here]:

import numpy as np
import sys
A=[5,4,1,11,10]
B=[3,3,2,10,8]
C=[5,6,1,12,9]
if (len(sys.argv)>1):
    A=eval("["+sys.argv[1]+"]")
if (len(sys.argv)>2):
 B=eval("["+sys.argv[2]+"]")
if (len(sys.argv)>3):
    C=eval("["+sys.argv[3]+"]")
a=np.poly1d(A)
b=np.poly1d(B)
c=np.poly1d(C)
print ("A:\n",np.poly1d(a))
print ("B:\n",np.poly1d(b))
print ("C:\n",np.poly1d(c))
distab = np.linalg.norm(a - b)
distac = np.linalg.norm(a - c)
distbc = np.linalg.norm(b - c)
 
print(f"Euclidean distance (a-b): {distab}")
print(f"Euclidean distance (a-c): {distac}")
print(f"Euclidean distance (a-c): {distbc}")

A sample run is:

A:
    4     3     2
5 x + 4 x + 1 x + 11 x + 10
B:
    4     3     2
3 x + 3 x + 2 x + 10 x + 8
C:
    4     3     2
5 x + 6 x + 1 x + 12 x + 9
Euclidean distance (a-b): 3.3166247903554
Euclidean distance (a-c): 2.449489742783178
Euclidean distance (a-c): 4.358898943540674

We can check the distance from A to B, with:

Conclusions

So, it’s all about finding your nearest neighbour. If you want to know more about polynomial operations within a finite field, try here:

https://asecuritysite.com/principles/polyop

This is the type of method that is used in lattice encryption.