Mathematics in the Blood: The Lenstra Family

I know it’s a strange question to pose, but which family has most advanced the Internet and Cybersecurity? Well, the Lenstra family has a…

Mathematics in the Blood: The Lenstra Family

Apple [here] Spotify [here]

I know it’s a strange question to pose, but which family has most advanced the Internet and Cybersecurity? Well, the Lenstra family has a strong claim to that title. From their Dutch roots, they have contributed so much to our modern world — both from a theoretical and a practical point of view. I suppose there’s something in the nature of the Dutch that not only wants to solve real problems but do it in a scientific way. That approach is also the beating heart of academic research — to take major problems and solve them through collaborative efforts, and where each researcher breaks solves part of the puzzle. The Internet — and in fact our modern world — has been created through the coming together of all the amazing work of researchers over many decades.

Meet the Cryptography and Problem-Solving Brothers

My grandfather was an electrician, and my father was one too. I suppose electrical things are in my blood. It was where I started my career, and I have always had a love for everything that relates to electrons. I must admit I gave myself a little too many electrical shocks when I was a child, as the temptation to take things apart just seemed too strong for me. And, it still amazes me that we often just take electricity — and all its applications in our modern world — for granted. Where would we be without our control of electricity and all things electrical? You certainly wouldn’t be reading this article, now.

So, sometimes, there’s something in our blood that defines our future careers. And for the Lenstra family that was certainly the case, and from their Dutch roots, Hendrik, Arjen, Andries and Jan Karel have become important mathematicians.

One of the most famous of the brothers for those who have studied networking is the mighty Jan Karel Lenstra (J.K. Lenstra) and whose many major breakthroughs include scheduling, local searches and the travelling salesman problem. We can thus all thank Jan for his work on routing problems, and which led to the creation of routing protocols on the Internet:

The solving of the routing problem on the Internet, allowed the Internet to scale to levels that we see now, and where we have almost instant access to information from any part of the planet. We can thank J.K. Lenstra for providing that foundation.

Arjen followed a cryptography focus for his work, including many classic papers such as those related to the factorization of polynomials and a famous paper entitled “Ron was wrong. Whit is right” [here]:

But, Arjen’s most cited paper included his brother (Hendrik W. Lenstra Jr.) as a co-author [3]:

And, in these days of Microsoft Word and LaTeX, don’t you just love the pen markup on the paper? The brothers also collaborated on another classic paper — and which included the mighty J.M Pollard [2]:

Lenstra–Lenstra–Lovász (LLL)

When the two brothers worked together they created some of their best work, and it was the classic Factorizing Polynomials with Rational Coefficients paper [3] that led to the Lenstra–Lenstra–Lovász (LLL) method [paper]. The paper also included mighty Laszlo Lovász [here] (who has an h-index of 109):

This will use this method — defined as Lenstra–Lenstra–Lovász (LLL) — to crack the signature, and discover the private key used to digitally sign the message. This will search for the private key that has been used to sign a message with ECDSA. In this case we will generate two signatures, and then search for a private key.

One of the most common signatures is ECDSA (Elliptic Curve Digital Signature Algorithm) and which is used with Bitcoin and Ethereum. With this, Bob creates a random private key (priv), and then a public key from:

Next, in order to create a signature for a message of M, he creates a random number (k) and generates the signature of:

The signature is then (r,s) and where r is the x-co-ordinate of the point kG. H(M) is the SHA-256 hash of the message (M), and converted into an integer value. If the k value is revealed for any of the signatures, an intruder can determine the private key using:

This works because:

and so:

and for priv:

We can then use the code [here] to implement a searching method based on LLL:

import ecdsa
import random
import libnum
import olll
import hashlib
import sys

# https://blog.trailofbits.com/2020/06/11/ecdsa-handle-with-care/
 
G = ecdsa.NIST256p.generator
order = G.order()

print ("Curve detail")
print (G.curve())
print ("Order:",order)
print ("Gx:",G.x())
print ("Gy:",G.y())

priv = random.randrange(1,order)
 
Public_key = ecdsa.ecdsa.Public_key(G, G * priv)
Private_key = ecdsa.ecdsa.Private_key(Public_key, priv)
 
k1 = random.randrange(1, pow(2,127))
k2 = random.randrange(1, pow(2,127))

msg1="Hello"
msg2="Hello1"

if (len(sys.argv)>1):
	msg1=(sys.argv[1])
if (len(sys.argv)>2):
	msg2=(sys.argv[2])

m1 = int(hashlib.sha256(msg1.encode()).hexdigest(),base=16)
m2 = int(hashlib.sha256(msg2.encode()).hexdigest(),base=16)
 
sig1 = Private_key.sign(m1, k1)
sig2 = Private_key.sign(m2, k2)

print ("\nMessage 1: ",msg1)
print ("Message 2: ",msg2)
print ("\nSig 1 r,s: ",sig1.r,sig1.s)
print ("Sig 2 r,s: ",sig2.r,sig2.s)
print ("\nk1: ",k1)
print ("k2: ",k2)

print ("Private key: ",priv)

r1 = sig1.r
s1_inv = libnum.invmod(sig1.s, order)
r2 = sig2.r
s2_inv = libnum.invmod(sig2.s, order)
 
matrix = [[order, 0, 0, 0], [0, order, 0, 0],
[r1*s1_inv, r2*s2_inv, (2**128) / order, 0],
[m1*s1_inv, m2*s2_inv, 0, 2**128]]
 
search_matrix = olll.reduction(matrix, 0.75)
r1_inv = libnum.invmod(sig1.r, order)
s1 = sig1.s
 
for search_row in search_matrix:
    possible_k1 = search_row[0]
    try_private_key = (r1_inv * ((possible_k1 * s1) - m1)) % order
 
    if ecdsa.ecdsa.Public_key(G, G * try_private_key) == Public_key:
        print("\nThe private key has been found")
        print (try_private_key)

A sample run is [here]:

Curve detail
CurveFp(p=115792089210356248762697446949407573530086143415290314195533631308867097853951, a=-3, b=41058363725152142129326129780047268409114441015993725554835256314039467401291, h=1)
Order: 115792089210356248762697446949407573529996955224135760342422259061068512044369
Gx: 48439561293906451759052585252797914202762949526041747995844080717082404635286
Gy: 36134250956749795798585127919587881956611106672985015071877198253568414405109

Message 1:  hello
Message 2:  goodbye

Sig 1 r,s:  115147473306387600780958700123813228515236063210926878166205132442387398405974 78422551211706787416844282162734821752165856246967039833155909830188362436931
Sig 2 r,s:  72928835934664146344187979593177679887058837944881110039604237325952057142506 34831214671095490475430891005520988929988430486970993941519827388518136205821

k1:  2238116107289725910464212774221939217
k2:  23155266189808659522258191324208917771
Private key:  3126769432554995310932591745910468237140199425344791317304188208833915624553

It is a truly fantastic paper, and well worth a read.

In December 2019, a team led by Paul Zimmermann of INRIA announced the factorization of the largest ever RSA modulus (RSA-240):

RSA-240 = 124620366781718784065835044608106590434820374651678805754818788883289666801188210855036039570272508747509864768438458621054865537970253930571891217684318286362846948405301614416430468066875699415246993185704183030512549594371372159029236099RSA-240 = 509435952285839914555051023580843714132648382024111473186660296521821206469746700620316443478873837606252372049619334517
        × 244624208838318150567813139024002896653802092578931401452041221336558477095178155258218897735030590669041302045908071447

The factorization involved factorizing a 795-bit integer into its prime number factors. It caused industry experts to define that RSA would only be safe with at least 2,048-bit keys. Actually it was Arjen, in the 1990s, who was the first to crack the early RSA challenges, and managed to factorize 330 (100 decimal digits), 364, 426 and 430-bit modulus values [here]:

Factorizing with elliptic curves

Hendrik W. Lenstra Jr. became a Professor at the University of Amsterdam in 1979, and then, in 1987, he appointed to the University of California, Berkeley. One of his most famous PhD students is Daniel J. Bernstein [here], who is famous for producing standards such as Curve 25119, ChaCha20 and Poly1305. In the year Hendrik was appointed to Berkley, he outlined a method to factorize integrations using elliptic curve methods [1]:

The security of several public key methods relies on the difficulty in factorizing a modulus created from the multiplication of large prime numbers. RSA is a good example of this, and where we take two large prime numbers (p and q), and multiply them to create a modulus (N). The difficulty is then to be able to find p and q, if we know N. The two core methods we can use for this factorization are the general number field sieve (GNFS) method and ECM (Elliptic Curve Method).

ECM

With his method, we define the moving from a point P on an elliptic curve to 2P. For this, we find the tangent to the point P and use this to find 2P. This tangent will be defined with a slope (s) and with a gradient in the form of a/b. For this, we must find the modular inverse of b. If we cannot find the modular inverse of b, a factor of N is gcd(N,b), and where gcd() is the greatest common denominator.

If our curve is y²=x³+ax+b, the slope (s) will be:

as defined in differentiation. In detail, we first pick a random point P=(x0,y0) on an elliptic curve of:

We also select a random value of A and then calculate:

For two points on the curve: P=(Px,Py) and Q=(Qx,Qy) , the slope of the line between them will be:

With this we have s in the form of a/b (modN) .

Next we define:

and using:

If Px=Qx and Py=−Qy, we define as 0 [Identity], else we calculate R=P+P=2P=(Rx,−Ry):

Here are some test runs:

N=15 (Factor: 3 and 5 Try!
N=6,161 (Factors: 61 x 101) Try!
N=32,128 (Factors: 2 x 2 x 2 x 2 x 2 x 2 x 2 x 251) Try!
N=53,421 (Factor: 3 x 17,807) Try!
N=55,440 (Factors: 2 x 2 x 2 x 2 x 3 x 3 x 5 x 7 x 11) Try!
N=999,999 (Factor: 3 x 3 x 3 x 7 x 11 x 13 x 37) Try!
N=10,000,000 (Factor: 2 x 2 x 2 x 2 x 2 x 2 x 2 x 5 x 5 x 5 x 5 x 5 x 5 x 5) Try!
N=100,001 (Factor: 11 x 9091) Try!
N=455,839 (Factor: 559 x 761) Try!
N=3,789,829 (Factor: 3209 x 1181) Try!
N=7,388,399 (Factor: 3571 x 2069) Try!
N=392,524,199 (Factor: 431 x 919 x 991) Try!

An outline of the code used is [code]:

#!/usr/local/bin/python
# -*- coding: utf-8 -*-
import math
import random 
import sys#y^2=x^3+ax+b mod n prime=[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271 ]
# ax+by=gcd(a,b). This function returns [gcd(a,b),x,y]. Source Wikipedia
def extended_gcd(a,b):   
	x,y,lastx,lasty=0,1,1,0
	while b!=0:
		q=a/b
		a,b=b,a%b
		x,lastx=(lastx-q*x,x)
		y,lasty=(lasty-q*y,y)
	if a<0: 
		return (-a,-lastx,-lasty)
	else:
		return (a,lastx,lasty)# pick first a point P=(u,v) with random non-zero coordinates u,v (mod N), then pick a random non-zero A (mod N), 
# then take B = u^2 - v^3 - Ax (mod N).
# http://en.wikipedia.org/wiki/Lenstra_elliptic_curve_factorizationdef randomCurve(N):
	A,u,v=random.randrange(N),random.randrange(N),random.randrange(N)
        B=(v*v-u*u*u-A*u)%N
        return [(A,B,N),(u,v)]	# Given the curve y^2 = x^3 + ax + b over the field K (whose characteristic we assume to be neither 2 nor 3), and points
	# P = (xP, yP) and Q = (xQ, yQ) on the curve, assume first that xP != xQ. Let the slope of the line s = (yP - yQ)/(xP - xQ); since K 	       # is a field, s is well-defined. Then we can define R = P + Q = (xR, - yR) by
	#	s=(xP-xQ)/(yP-yQ) Mod N		
	#	xR=s^2-xP-xQ	Mod N
	#	yR=yP+s(xR-xP)	Mod N 
	# If xP = xQ, then there are two options: if yP = -yQ, including the case where yP = yQ = 0, then the sum is defined as 0[Identity]. 	       # thus, the inverse of each point on the curve is found by reflecting it across the x-axis. If yP = yQ != 0, then R = P + P = 2P = 	      # (xR, -yR) is given by
	#	s=3xP^2+a/(2yP)	Mod N	
	#	xR=s^2-2xP	Mod N
	#	yR=yP+s(xR-xP)	Mod N
	#	http://en.wikipedia.org/wiki/Elliptic_curve#The_group_law''')
	
def addPoint(E,p_1,p_2):
	if p_1=="Identity": return [p_2,1]
	if p_2=="Identity": return [p_1,1]
	a,b,n=E
	(x_1,y_1)=p_1
	(x_2,y_2)=p_2
	x_1%=n
	y_1%=n
	x_2%=n
	y_2%=n
	if x_1 != x_2 :
		d,u,v=extended_gcd(x_1-x_2,n)
		s=((y_1-y_2)*u)%n
		x_3=(s*s-x_1-x_2)%n
		y_3=(-y_1-s*(x_3-x_1))%n
	else:
		if (y_1+y_2)%n==0:return ["Identity",1]
		else:
			d,u,v=extended_gcd(2*y_1,n)
			s=((3*x_1*x_1+a)*u)%n
			x_3=(s*s-2*x_1)%n
			y_3=(-y_1-s*(x_3-x_1))%n	return [(x_3,y_3),d]	# http://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication
	#	Q=0 [Identity element]
	#	while m:
	#		if (m is odd) Q+=P
	#		P+=P
	#		m/=2
	#	return Q')def mulPoint(E,P,m):
	Ret="Identity"
	d=1
	while m!=0:
		if m%2!=0: Ret,d=addPoint(E,Ret,P)
		if d!=1 : return [Ret,d]  # as soon as i got anything otherthan 1 return 
		P,d=addPoint(E,P,P)
		if d!=1 : return [Ret,d]
		m>>=1
	return [Ret,d]
def ellipticFactor(N,m,times=5):
	for i in xrange(times):
		E,P=randomCurve(N);
		Q,d=mulPoint(E,P,m)
		if d!=1 : return d
	return Nfirst=True
n=455839
if (len(sys.argv)>1):
        n=int(sys.argv[1])print n,'=',for p in prime:
	while n%p==0:
		if (first==True):
			print p,
			first=False
		else:
			print 'x',p,
		n/=pm=int(math.factorial(2000))
while n!=1:
	k=ellipticFactor(n,m)
	n/=k
	if (first==True):
		print k,
		first=False
	else:
		print 'x',k,

Conclusions

And, so, there’s something in the Dutch approach to solving real problems and in using a scientific approach, and the Lenstra have highlighted that as well as any other.

And for the factorizing of integers, the great public key methods of the past, may not scale well into the 2020s, especially as factorization becomes more powerful and is actually beating Moore’s Law. If you are interested, here are other methods for factorization:

Difference of squares. Diffsq. This factorizes using the difference of squares method.
Factors of integers. Factors. Determine factors of an integer.
Pollard’s ρ method (Factoring integers). Pollard. The Pollard ρ method factorises integers.
Simplify ap (modN) Go. Simplify the a^p (modN) operation.
Smooth numbers. Go. Outline of smooth numbers.
Quadratic residue (mod p). Go. Outline of quadratic residue (mod p).
Quadratic residue (mod N) and where N=pq. Go. Outline of quadratic residue (mod N) with N made-up of two prime numbers.
Dixon. Go. Dixon Method.

References

[1] Lenstra Jr, H. W. (1987). Factoring integers with elliptic curves. Annals of mathematics, 649–673.

[2] Lenstra, A. K., Lenstra, H. W., & Lovász, L. (1982). Factoring polynomials with rational coefficients. Mathematische annalen, 261(ARTICLE), 515–534.

[3] Lenstra, A. K., Lenstra, H. W., Manasse, M. S., & Pollard, J. M. (1993). The number field sieve. In The development of the number field sieve (pp. 11–42). Springer, Berlin, Heidelberg.