Crypto is SHAttered!

SHA-1 is not a favoured son at the present time, and has been on borrowed time when it has shown to have theoretical collisions from its…

Crypto is SHAttered!

SHA-1 is not a favoured son at the present time, and has been on borrowed time when it has shown to have theoretical collisions from its hash signatures. But it was Google’s who finally produced a collision for SHA-1 in 2017. For this Google used some of the best computing power around to cause a practical collision [here]:

Google, and the University of Amsterdam, thus announced that they had cracked SHA-1 (Secure Hash Algorithm 1), and which has been a standard hashing method since 1993. SHA-1 produces a 160-bit hash signature, and NIST designed a new hash function for SHA-2, and which has four main hash sizes: 224-bits (SHA-224); 256-bits (SHA-256); 384 bits (SHA-384); and 512 bits (SHA-512). SHA-3 uses a new type of hashing method, and is optimised for IoT methods [here].

But wait … it took Google two years to create a single collision. Considering the resources at Google’s fingertips, the cracking of SHA-1 is well out-of-budget for most organisations [here]. The answer for many is the move to SHA-2/SHA-256 [here], but this can be costly, especially to replace digital certificates which use SHA-1. Basically we have evolved from MD5 (a 128-bit hash signature), and then onto SHA-1 (a 160-bit hash signature), but SHA-256 ( a 256-bit hash signature) is now being recommended. And just in case SHA-2 gets cracked, there’s a new method defined with SHA-3 (Keccak) [here].

The research team produced two PDF files which have the same hash signature: (PDF 1 and PDF 2). This type of collision is known

Overall SHAttered took 9,223,372,036,854,775,808 computations, with the equivalent of 6,500 years of single-CPU computations (such as from a desktop computer) and 110 years of single-GPU computations. With 110 GPUs, of course, it will only take one year. So, if you can afford lots of GPUs, then you could find a collision, but its going to be costly in terms for energy requirements. SHAttered is 100,000 faster than a brute force attack that uses the birthday paradox, and where we would it would take 12,000,000 GPU one year to crack.

So how did they manage to beat brute-force? Well they used a few tricks in the PDF format, in order to stuff bytes in certain ways, that allowed them to produce the same SHA-1 hash:

MD5 problems

Moore’s Law predicted that computing power doubles every 18 months or so, so if we have a code which takes 100 years to crack, within 18 months, with the equivalent cost of a system, it will only take 50 years. To simplify things we must project that computing power doubles every year, so we find that a code which takes 100 years to crack, will, after 10 years, only takes a matter of weeks to crack (7 weeks). But the trend of improving hardware is now being overtaken by the Cloud, and the standard cryptography we have been using for years is now being push off-the-shelf.

The first to feel the heat is MD5, created by Ron Rivest, and has been a standard method for creating a digital fingerprint of data. It is used extensively in checking that data has not been changed and in providing identity. In the past it has been used to store hashed values of passwords, but its application in this area is reducing fast, as many of the common hashed MD5 values for words have been resolved. One of the key things that is important for MD5 is that the different data does not produce a collision — where different data, especially in the same type of context does not produce the same hash signature. Recently, though, Mat McHugh showed that he could produce the same hash signature for different images, using hashclash, and for just 65 cents on the Amazon GPU Cloud, and took just 10 hours to process.

For 10 hours of computing on the Amazon GPU Cloud, Mat created these two images which generate the same hash signature (Figure 1). If we check the hash signatures we get:

C:\openssl>openssl md5 hash01.jpg
MD5(hash01.jpg)= e06723d4961a0a3f950e7786f3766338
C:\openssl>openssl md5 hash02.jpg
MD5(hash02.jpg)= e06723d4961a0a3f950e7786f3766338

Figure 1: Images

Hashing

With MD5 we use a 128-bit digital fingerprint for our data, and a change of one bit should change the complete signature. For example:

  • Try “The quick brown fox jumps over the lazy dog.”. Try!, which should give a MD5 of: E4D909C290D0FB1CA068FFADDF22CBD0
  • Try “The quick brown fox jumps over the lazy dog”. Try!, which should give a MD5 of: 9E107D9D372BB6826BD81D3542A419D6

The method has been shown that it has flaws, where a change in a few of the bits, does not change the output. A collision occurs when there are two different values that produce the same hash signature. In the following example we use a hex string to define the data element (as the characters would be non-printing).

  • Try!, which should give a MD4 hash of: 79054025255FB1A26E4BC422AEF54EB4.
  • Try!, which should give a MD4 hash of: 79054025255FB1A26E4BC422AEF54EB4

Programs too can be modified to give the same hash value, for example these files(goodbye.exe and hello.exe).

C:\openssl>openssl md5 erase.exe
MD5(erase.exe)= cdc47d670159eef60916ca03a9d4a007
C:\openssl>openssl md5 hello.exe
MD5(hello.exe)= cdc47d670159eef60916ca03a9d4a007
C:\openssl>erase.exe
This program is evil!!!
Erasing hard drive...1Gb...2Gb... just kidding!
Nothing was erased.
(press enter to quit)
C:\openssl>hello.exe
Hello, world!
(press enter to quit)

Cracking the Cloud

The Cloud is becoming the largest (and most inexpensive) super computer ever created, and when GPU (Graphical Processing Units) are added it becomes superfast in cracking ciphers. With this we can parallelise the processing for the cracking so that a range of hash values can be allocated to processing elements running in parallel so that a 1,000 processing element array will take approximately 1/1,000th of the time.

There are two ways to defeat a computer with cryptography. The first is to give it a problem that it really struggles with, such as: “If I take a value and raise it to the power of a secret number and then divide by a certain number, and give you the remainder, you’ll not be able to find what my secret number is within a reasonable amount of time” — basis of El Gamal and Diffie-Hellman. The other method is to have too many of something — such as keys — so that it will take too long for the computer to find the right key that fits.

Most of the cryptography methods are thus defined to make it too difficult to find the right fit. Unfortunately the benchmarking for many of the methods is built around fair old computers, running with just a few cores, and non-optimized for the task. With GPUs in the Cloud, the task becomes so much easier.

An extension of this is to customise for the method used, such as using ASICs to crack Bitcoins, but, compared with running the cracking on the Cloud, this is more expensive in costs.

It’s your birthday

Mat used the birthday attack which is one of the methods used for a brute-force attack, and is based on the birthday problem in probability theory. It defines that we take a set of n randomly chosen people, and there will be a certain percentage that will have the same birthday. A group size of only 70 people results in a 99.9% chance of two people sharing the same birthday. Using this method, if we take an m-bit output there are 2^m messages, and the same hash value would only require 2^(m/2) random messages.

Here are some examples:

  • Same birthday with 1 person should give 0%. Calc
  • Same birthday with 10 people should give 11.7%. Calc
  • Same birthday with 20 people should give 41.14%. Calc
  • Same birthday with 23 people should give 50.73%. Calc
  • Same birthday with 30 people should give 70.63%. Calc
  • Same birthday with 60 people should give 99.41%. Calc

Conclusions

Two years is a long time, and, as far as Google knows, no-one in the wild has implemented that crack. The industry is moving towards SHA-2 and SHA-3, both of which should be safe for billions of years (or until the next crack comes along).

Finally is this a good bet …

If we have classes of 30 students, and I take the money if there is at least two students in the class with the same birthday, otherwise you can have the money.

Try this link.