Some Companies Are Still Storing Passwords With MD5 and Unsalted!


Photo by American Public Power Association


In the car industry, if you find out that something is truly unsafe, you recall the car and replace the component with something that fixes the vulnerability. But in cybersecurity, for some reason, in places, we just never learn. Cybersecurity has its own cryptographic naughty step, and on that step sits the MD5 hashing method. It shares its position with another hashing method (SHA-1) and with the symmetric key encryption methods of DES and RC4. These methods should never be seen in a production environment, and should be long gone.

To ever use MD5 for hashing passwords is the equivalent of driving without a seat belt. It produces a hash of just 128 bits, and not only is it possible to crack the hash, it is also possible to craft two different programs or documents that share the same, valid-looking hash. To give an example, it is relatively easy for someone who can prepare both files to build a clean version of the Putty executable and a version with a backdoor inserted, and still end up with the same MD5 hash for both. It is basically as bad as it gets in cybersecurity.

But, Électricité de France (EDF) has just been fined EUR 600,000 by CNIL (Commission Nationale de l’Informatique et des Libertés) [here]. One of the rulings related to “Sur le manquement à l’obligation d’assurer la sécurité des données” (failure to meet the obligation to keep data secure). Along with this, it was also singled out for sending out marketing material without consent, collecting data on customers without any clear reason, and not handling requests from customers to have their data deleted.

And, what was even worse, there was no salting of the hashed passwords (for EDF, around 25,800 customer passwords were still just hashed with MD5, and without salting). This moves things from it being easy to crack a hashed password to it being trivial. A simple rainbow table can crack the hashed password, and Hashcat can easily be set up to try many millions of passwords per second, especially with the usage of GPUs. Most password storage systems these days use a slow hashing method such as bcrypt, scrypt, Argon2 or PBKDF2. These methods apply the hash over a configurable number of rounds, so every guess costs the attacker far more computation. scrypt and Argon2 also demand a significant amount of memory for each guess, which limits how well cracking scales across the many cores of a GPU.
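As a minimal sketch of what salted, slow hashing looks like in practice, the following uses PBKDF2 from Python's standard hashlib module. The function names and the iteration count are assumptions chosen for illustration and are not taken from the CNIL ruling; bcrypt, scrypt or Argon2 would follow the same store-and-verify pattern.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration only; tune it to your own hardware.
ITERATIONS = 600_000


def hash_password(password: str, iterations: int = ITERATIONS):
    """Return (salt, iterations, derived_key) to store with the account."""
    salt = os.urandom(16)  # a fresh random salt for every password
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, iterations, key


def verify_password(password: str, salt: bytes, iterations: int, key: bytes) -> bool:
    """Recompute the key with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, key)


if __name__ == "__main__":
    salt1, n1, key1 = hash_password("Password123")
    salt2, n2, key2 = hash_password("Password123")
    print(key1.hex())
    print(key2.hex())  # same password, different salt, different stored value
    print(verify_password("Password123", salt1, n1, key1))
    print(verify_password("password123", salt1, n1, key1))
```

Because the salt is random, the two stored values for the same password differ, so a precomputed table is of no use, and the iteration count makes each guess deliberately expensive.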

EDF, too, had around 2,400,000 passwords that were hashed with SHA-512 but not salted. Using SHA-512 is no real defence against Hashcat, as SHA-512 is a fast hashing method, and with a GPU it is possible to test billions of hashes per second. As no salting is used, too, the hash of any password that has already been computed can be trivially looked up in a rainbow table.
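The weakness of unsalted hashes can be sketched in a few lines of Python. The account names, passwords and word list below are made up for illustration, and the dictionary is only a tiny stand-in for a real precomputed table:

```python
import hashlib
import os


def sha512_hex(data: bytes) -> str:
    return hashlib.sha512(data).hexdigest()


# Hypothetical leaked table of unsalted SHA-512 password hashes.
leaked = {
    "alice": sha512_hex(b"letmein"),
    "bob": sha512_hex(b"qwerty"),
    "carol": sha512_hex(b"letmein"),  # same password as alice, so same hash
}

# Tiny stand-in for a precomputed table: hash -> password.
wordlist = ["123456", "password", "qwerty", "letmein"]
lookup = {sha512_hex(w.encode()): w for w in wordlist}

# Without a salt, cracking is a dictionary lookup per account.
recovered = {user: lookup.get(h) for user, h in leaked.items()}
print(recovered)

# With a random per-user salt, equal passwords no longer give equal hashes,
# and the precomputed table no longer matches anything.
salted_alice = sha512_hex(os.urandom(16) + b"letmein")
salted_carol = sha512_hex(os.urandom(16) + b"letmein")
print(salted_alice == salted_carol, salted_alice in lookup)
```

A salt only defeats the precomputed lookup. Salted SHA-512 is still a fast hash, so a slow method such as bcrypt, scrypt, Argon2 or PBKDF2 is still needed to resist brute force.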

A trivial task

A hash collision is created when we take two different inputs of data, and they produce the same hash. One way of doing this is to take two data elements and keep adding random data to them until the hashes match. With GPUs, and with the MD5 method, it is now possible to take two images and eventually create the same hash value for them.
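To get a feel for the search, here is a small sketch that looks for a collision on only the first three bytes (24 bits) of the MD5 hash. The function names and the "message-N" inputs are assumptions for illustration. This is a plain brute-force (birthday) search on a shortened hash, and not the cryptanalytic technique used to collide the full 128-bit MD5, which exploits weaknesses in the method itself.

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, nbytes: int) -> bytes:
    """First nbytes of the MD5 digest of data."""
    return hashlib.md5(data).digest()[:nbytes]


def find_truncated_collision(nbytes: int = 3):
    """Try message-0, message-1, ... until two share a truncated hash."""
    seen = {}  # truncated hash -> first message that produced it
    for i in count():
        msg = f"message-{i}".encode()
        tag = truncated_md5(msg, nbytes)
        if tag in seen:
            return seen[tag], msg, tag
        seen[tag] = msg


a, b, tag = find_truncated_collision(3)
print(a, b, tag.hex())
```

With 24 bits, a match typically turns up after a few thousand attempts, since a birthday search needs roughly the square root of the number of possible hash values.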

For MD5 we have a hash of 128 bits, and so there are 2¹²⁸ different hashes. Unfortunately, it doesn’t take too long to create a collision, where we have different content producing the same hash. Nat McHugh showed that he could produce the same hash signature for different images, using HashClash, for just 65 cents on the Amazon GPU Cloud, and with just 10 hours of processing. He created two images which generate the same hash signature (Figure 1). If we check the hash signatures we get:

C:\openssl>openssl md5 hash01.jpg
MD5(hash01.jpg)= e06723d4961a0a3f950e7786f3766338
C:\openssl>openssl md5 hash02.jpg
MD5(hash02.jpg)= e06723d4961a0a3f950e7786f3766338
Figure 1: Images

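But can we find instant collisions? Well, we can if we build on a collision that is already known. For this, if we have two data elements a and b of the same length, and H(a)=H(b), then we also have H(a || c) = H(b || c) for any c, where “||” is concatenation. This works because MD5 processes data block by block, and collisions of this kind leave it in the same internal state, so any data appended to both inputs keeps the hashes equal. An example of a collision in MD5 is: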

0e306561559aa787d00bc6f70bbdfe3404cf03659e704f8534c00ffb659c4c8740cc942feb2da115a3f4155cbb8607497386656d7d1f34a42059d78f5a8dd1ef
0e306561559aa787d00bc6f70bbdfe3404cf03659e744f8534c00ffb659c4c8740cc942feb2da115a3f415dcbb8607497386656d7d1f34a42059d78f5a8dd1ef

If we now add “hello” to this data we get [here]:

b'0e306561559aa787d00bc6f70bbdfe3404cf03659e704f8534c00ffb659c4c8740cc942feb2da115a3f4155cbb8607497386656d7d1f34a42059d78f5a8dd1ef'
Hex: cee9a457e790cf20d4bdaa6d69f01e41
b'0e306561559aa787d00bc6f70bbdfe3404cf03659e744f8534c00ffb659c4c8740cc942feb2da115a3f415dcbb8607497386656d7d1f34a42059d78f5a8dd1ef'
Hex: cee9a457e790cf20d4bdaa6d69f01e41

Adding: hello
b'0e306561559aa787d00bc6f70bbdfe3404cf03659e704f8534c00ffb659c4c8740cc942feb2da115a3f4155cbb8607497386656d7d1f34a42059d78f5a8dd1ef68656c6c6f'
Hex: 4d0c8baa8a036cff537f00d6e26bbef5
b'0e306561559aa787d00bc6f70bbdfe3404cf03659e744f8534c00ffb659c4c8740cc942feb2da115a3f415dcbb8607497386656d7d1f34a42059d78f5a8dd1ef68656c6c6f'
Hex: 4d0c8baa8a036cff537f00d6e26bbef5

We see that the original data gives the same MD5 hash value (cee9a457e790cf20d4bdaa6d69f01e41), and when we add the string of “hello”, we also get a collision of “4d0c8baa8a036cff537f00d6e26bbef5”.

An outline of the code is [here]:

import hashlib
import sys
from binascii import unhexlify, hexlify

# Two different 64-byte messages that give the same MD5 hash
m1 = unhexlify('0e306561559aa787d00bc6f70bbdfe3404cf03659e704f8534c00ffb659c4c8740cc942feb2da115a3f4155cbb8607497386656d7d1f34a42059d78f5a8dd1ef')
m2 = unhexlify('0e306561559aa787d00bc6f70bbdfe3404cf03659e744f8534c00ffb659c4c8740cc942feb2da115a3f415dcbb8607497386656d7d1f34a42059d78f5a8dd1ef')
# X marks a position where the two messages differ:
# '0e306561559aa787d00bc6f70bbdfe3404cf03659e7X4f8534c00ffb659c4c8740cc942feb2da115a3f415dcbb8607497386656d7d1f34a42059d78f5a8dd1ef'

# Other colliding pairs to try (uncomment one pair to use it instead):
# m1 = unhexlify('4dc968ff0ee35c209572d4777b721587d36fa7b21bdc56b74a3dc0783e7b9518afbfa200a8284bf36e8e4b55b35f427593d849676da0d1555d8360fb5f07fea2')
# m2 = unhexlify('4dc968ff0ee35c209572d4777b721587d36fa7b21bdc56b74a3dc0783e7b9518afbfa202a8284bf36e8e4b55b35f427593d849676da0d1d55d8360fb5f07fea2')
# m1 = unhexlify('d131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f8955ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5bd8823e3156348f5bae6dacd436c919c6dd53e2b487da03fd02396306d248cda0e99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f9652b6ff72a70')
# m2 = unhexlify('d131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f8955ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5bd8823e3156348f5bae6dacd436c919c6dd53e23487da03fd02396306d248cda0e99f33420f577ee8ce54b67080280d1ec69821bcb6a8839396f965ab6ff72a70')
# Differing positions marked with - and X:
# d131dd02c5e6eec4693d9a0698aff95c2fcab58-12467eab4004583eb8fb7f8955ad340609f4b30283e4888325-1415a085125e8f7cdc99fd91dbdX280373c5bd8823e3156348f5bae6dacd436c919c6dd53e2-487da03fd02396306d248cda0e99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f965-b6ff72a70

# Data to append to both messages ("hello" by default, or the first command-line argument)
word = "hello"
if (len(sys.argv)>1):
    word=str(sys.argv[1])
a = word.encode()

# Hash each message with the appended data and show that the hashes still match
mm1 = hashlib.md5()
mm2 = hashlib.md5()
mm1.update(m1+a)
mm2.update(m2+a)
print (hexlify(m1+a),"\nHex:",mm1.digest().hex())
print ("\n",hexlify(m2+a),"\nHex:",mm2.digest().hex())
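As a follow-on sketch, the check below contrasts appending data with prepending it, and compares the same two messages under SHA-256. The helper name md5_hex and the choice of SHA-256 for comparison are assumptions for illustration; the two messages are the colliding pair listed above.

```python
import hashlib
from binascii import unhexlify

# The colliding 64-byte pair from the example above.
m1 = unhexlify('0e306561559aa787d00bc6f70bbdfe3404cf03659e704f8534c00ffb659c4c8740cc942feb2da115a3f4155cbb8607497386656d7d1f34a42059d78f5a8dd1ef')
m2 = unhexlify('0e306561559aa787d00bc6f70bbdfe3404cf03659e744f8534c00ffb659c4c8740cc942feb2da115a3f415dcbb8607497386656d7d1f34a42059d78f5a8dd1ef')


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


extra = b"hello"

print("Inputs differ:         ", m1 != m2)
print("MD5 matches:           ", md5_hex(m1) == md5_hex(m2))
# Appending the same data to both keeps the collision.
print("Match with suffix:     ", md5_hex(m1 + extra) == md5_hex(m2 + extra))
# Prepending shifts the colliding bytes within the blocks and breaks it.
print("Match with prefix:     ", md5_hex(extra + m1) == md5_hex(extra + m2))
# The pair was built against MD5 only, so another hash tells them apart.
print("SHA-256 matches:       ", hashlib.sha256(m1).digest() == hashlib.sha256(m2).digest())
```

The collision survives any common suffix but not a common prefix, which is why an attacker places the colliding blocks first and the shared content after them.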

Conclusions

For such a large company as EDF, where are the cybersecurity teams? Where are the auditors? Where’s the CEO asking questions about how passwords are stored? The usage of legacy systems is well known in the industry, and every company needs to examine the way it deals with its customers’ data, and move toward things that are cryptographically sound as a starting point. But companies should also be looking to move their security levels up, especially in the usage of zero-knowledge proof techniques and multifactor authentication (MFA).

If you want to learn more about hashing methods, try here:

https://asecuritysite.com/hash

Enjoy your learning!
