Let’s Ask Satoshi Three Questions … Why Two Hashes? Why RIPEMD160 and SHA-256? Why RIPEMD160?

Satoshi over-engineered Bitcoin, and made it ultrasecure

I strongly believe that Satoshi Nakamoto was a highly technical person with a background in cryptography and cybersecurity research. As a result, the technical selection of the component parts of Bitcoin has stood up to modern cracking. I care little about whether cryptocurrency is a good thing or a bad thing for our society (that is for our society, governments, and the markets to decide) and have never owned any Bitcoins, but I do sit back and wonder at the amazing cryptographic machine that Satoshi created.

And so, on 3 January 2009, Satoshi set the machine in motion and created a new digital world, and one that was mathematically sound. It cared little about the paper-based approaches of the past and wet signatures, and brought a whole new level of security. Overall, not many things in cybersecurity could run for over 14 years without breaking down in some way. For this, Satoshi basically over-engineered everything in the design, knowing that a single flaw would have brought the whole infrastructure down, never to be recovered. When you lose all your money from your bank, you don’t go back a second time with your money!

While the proof-of-work method has proven to be a significant weak point (not because of the method selected, but due to the waste of energy used), the cryptography construct has been shown to be fully robust. I appreciate that quantum computers could change this, but, at the moment, it’s an engine that has run for around 14 years without any major failures.

It is unlikely that we will ever know why Satoshi selected the things they did for Bitcoin, but it is probably one of the greatest breakthroughs in the history of computer science. For this, Satoshi had a dream that anyone, anywhere, could simply generate a random 256-bit number (their private key) and store it in a wallet. They could then multiply the base point of the secp256k1 elliptic curve by their private key in order to produce a public key, and then hash this two times to produce a public identifier. Then, whenever a transaction is signed with the private key (using an ECDSA signature), the public key is revealed and checked against the public identifier, and this proves that the signer is the person who has the unique private key … just genius!

No databases or registered keys! No PKI (Public Key Infrastructure)! No digital certificates (Yuk!). Just pure cryptography, elliptic curve mathematics and digital hashing in all their beauty! Oh, and a Merkle Tree and some blocks, too, just to make sure it was ultra trustworthy.

Let’s ask Satoshi some questions …

But, there are quite a few questions to ask … apart from who Satoshi actually was. Here we ask three: why Satoshi hashed twice, why Satoshi selected RIPEMD160 and not SHA1 (which also has a 160-bit hash value), and why Satoshi selected both SHA2 and RIPEMD160. As we will find, Satoshi created something that completely over-engineered security, but aimed to produce an efficient solution to Bitcoin transactions. In a world where we often see minimal levels of security, Satoshi really went to town on making Bitcoin (almost) unbreakable.

And, so before we begin to analyse, let’s look at how to generate a Bitcoin ID:

I will explain each element of this later, but before we do this, we need to look at the reason that Satoshi selected two hashes: to prevent length-extension attacks.

Length extension attack

Before we go into the hashing methods used, let’s look at a problem with hashing: the length extension attack. The original hash methods were often based on the Merkle-Damgård (MD) construction. With this, we create a hash function using blocks of data. Based on the MD construct, Ron Rivest created the MD5 hashing method, and it was widely adopted in the industry. It works by taking a static initialisation vector (IV) and then feeding this into a one-way function (f), along with a block of the message. We feed this output into the next stage, and so on until we get to a message pad at the end:

The one-way function (f) will generally compress the data and produce fewer bits out than are fed in. Unfortunately, the MD construct has many weaknesses, and one of the most serious is the length extension attack. With this, an adversary (Eve) can take a hash for an unknown message, and then add additional data to produce a new valid hash.
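The chaining described above can be sketched in a few lines of Python. This is a toy illustration and not MD5 itself: the one-way function `f` is built from SHA-256 purely for convenience, and the all-zero `IV`, the 64-byte block size, and the names `pad` and `md_hash` are assumptions made for this example.

```python
import hashlib
import struct

BLOCK = 64      # bytes per message block (assumed for this toy)
IV = bytes(32)  # static initialisation vector (all zeros, chosen arbitrarily)

def f(state: bytes, block: bytes) -> bytes:
    """Toy one-way function: compresses 32 + 64 bytes down to 32 bytes."""
    return hashlib.sha256(state + block).digest()

def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, and the 64-bit message length in bits."""
    zeros = (-(len(message) + 9)) % BLOCK
    return message + b"\x80" + bytes(zeros) + struct.pack(">Q", len(message) * 8)

def md_hash(message: bytes) -> bytes:
    """Feed each padded block through f, chaining the output into the next stage."""
    state = IV
    data = pad(message)
    for i in range(0, len(data), BLOCK):
        state = f(state, data[i:i + BLOCK])
    return state

print(md_hash(b"hello").hex())
```

Note that the final hash is simply the last chaining state, with nothing applied on top of it. That detail is what the length extension attack exploits.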

So Bob could take a password that he and Alice know (“qwerty123”), append a message (“hello”) to it, and then hash the result to produce:

H(Password || Message)

where “||” identifies the appending of one string onto another. Thus when Bob sends a message to Alice, she will prepend the message with the shared password, and generate the same hash. In this way, Bob has proven the message and that he knows the secret password. This is a message authentication code (MAC) and validates that Bob knows a shared secret and the message. But, the MD method is flawed as it is possible for Eve to take a previous hash for a known message, and then append a new message to produce:

H(Password || Original Message || New Message)

In this way, Eve does not know the password but can still generate a valid hash, and add her message onto it. An outline of the code is [here]:

import hashpumpy
import hashlib
import sys

# Defaults, optionally overridden from the command line
password = b'password'
message = b'message'
addition = b'addition'
if len(sys.argv) > 1:
    password = sys.argv[1].encode()
if len(sys.argv) > 2:
    message = sys.argv[2].encode()
if len(sys.argv) > 3:
    addition = sys.argv[3].encode()

# Compute a previous hash for H(Password || Message)
m = hashlib.sha1()
m.update(password + message)
rtn = m.hexdigest()
print("Previous hash: ", rtn)

# Compute a hash for H(Password || Message || Addition)
# Eve supplies only the old hash, the message, her addition and the password length
rtn = hashpumpy.hashpump(rtn, message, addition, len(password))
print("New hash: ", rtn[0])
print("New message: ", rtn[1])

# Check: recompute the hash with the real password over the extended message
m = hashlib.sha1()
m.update(password + rtn[1])
rtn = m.hexdigest()
print("Computing new hash (password+newdata): ", rtn)

A sample run for a message of “message” and with the addition of the text of “addition” is [here]:

Previous hash:  22583ca8f00efff6296b4b571b9c2e1bcf22a99a
New hash: dd448d0874b738ca1b85bc00e151fbf16393ce4a
New message: b'message\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00xaddition'Computing new hash (password+newdata): dd448d0874b738ca1b85bc00e151fbf16393ce4a

In this case, the hash of H(Password || “message”) is ‘22583ca8f00efff6296b4b571b9c2e1bcf22a99a’, and Eve can now use this to generate a new valid hash, without the knowledge of the password. We can see that Eve can generate some extra bytes in the message, and then add a new message and create a valid hash.
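The extra bytes in the new message are not random: they are the padding that SHA-1 itself would have added to the original input. A minimal sketch of how Eve could rebuild them, assuming the standard SHA-1 padding rule (a 0x80 byte, zero bytes up to a 64-byte boundary, then the 64-bit big-endian bit length), is shown here. The function name `sha1_glue_padding` is made up for this example.

```python
import struct

def sha1_glue_padding(message_len: int) -> bytes:
    """Padding that SHA-1 appends to an input of message_len bytes."""
    zeros = (-(message_len + 9)) % 64
    return b"\x80" + bytes(zeros) + struct.pack(">Q", message_len * 8)

# Eve knows the message and the password length (8 bytes), not the password
password_len = 8
message = b"message"
glue = sha1_glue_padding(password_len + len(message))

# The message that Eve presents alongside her new hash
new_message = message + glue + b"addition"
print(new_message)
```

The last padding byte is 0x78 (120, the bit length of the 15-byte original input), which prints as the “x” just before “addition” in the sample run.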

RIPEMD160 and the length extension attack

On its own, RIPEMD160 is vulnerable to the length extension attack. With this we might append a secret value (k) to a known message (m) and then produce a message authentication code of:

h = RIPEMD_160(m || k)

Then, if Eve knows h along with the lengths of m and k, she can easily compute another hash value h′:

h′ = RIPEMD_160(m || k || p || z)

where p is a well-known bit string (the padding that the hash added to m || k) and where Eve can pick whichever value of z she wants.
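The attack can be sketched with a toy Merkle-Damgård hash, using the same m, k, p and z names. This is an illustration under stated assumptions: `toy_hash` stands in for RIPEMD160 (Hashlib does not expose the internal state needed to resume a real hash), and its compression function, block size and IV are invented for the example. The last lines show why a second hash on top helps: Eve then sees only a hash of the chaining state, not the state itself.

```python
import hashlib
import struct

BLOCK = 64
IV = bytes(32)

def f(state, block):
    """Toy compression function (an assumption, not RIPEMD160's)."""
    return hashlib.sha256(state + block).digest()

def padding(total_len):
    """0x80, zero bytes, then the 64-bit length in bits."""
    zeros = (-(total_len + 9)) % BLOCK
    return b"\x80" + bytes(zeros) + struct.pack(">Q", total_len * 8)

def toy_hash(data, state=IV, already_hashed=0):
    """Toy MD hash; state/already_hashed let a caller resume from a digest."""
    data = data + padding(already_hashed + len(data))
    for i in range(0, len(data), BLOCK):
        state = f(state, data[i:i + BLOCK])
    return state

m = b"known message"
k = b"secret"           # unknown to Eve; she only knows len(k)
h = toy_hash(m + k)     # the value Eve observes

# Eve's side: uses only h, len(m) and len(k)
p = padding(len(m) + len(k))
z = b"Eve's data"
h_forged = toy_hash(z, state=h, already_hashed=len(m) + len(k) + len(p))

# Verification by the key holder: the forged hash is valid
print(h_forged == toy_hash(m + k + p + z))   # True

# With a second hash on top, resuming from the published value fails
outer = toy_hash(h)
h_outer_forged = toy_hash(z, state=outer, already_hashed=BLOCK)
print(h_outer_forged == toy_hash(toy_hash(m + k + p + z)))   # False
```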

Bitcoin key

So, why did Satoshi Nakamoto select RIPEMD160 and SHA-2 for Bitcoin?

First, let’s look at how the private key and Bitcoin address are created. For Bitcoin, we initially generate a random 256-bit value for the private key (s), and then multiply the base point of the secp256k1 curve by it to produce the ECDSA public key (vk):

sk = ecdsa.SigningKey.from_string(bytes.fromhex(s), curve=ecdsa.SECP256k1)
vk = sk.verifying_key
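To show what the library is doing when it derives the public key, here is a minimal pure-Python sketch of the elliptic curve multiplication on secp256k1. The constants are the published curve parameters; the function names are made up for this example, and the code is not constant-time, so it should be read as an illustration and not used for real keys.

```python
# secp256k1: y^2 = x^3 + 7 over the prime field F_P, with base point G of order N
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                                   # a + (-a) = infinity
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P    # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P   # chord slope
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)

def scalar_mult(k, point=G):
    """Double-and-add: returns k * point."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

def public_key_hex(private_key_hex):
    """Uncompressed public key: 0x04 || x || y, hex-encoded."""
    x, y = scalar_mult(int(private_key_hex, 16))
    return "04" + format(x, "064x") + format(y, "064x")

print(public_key_hex("01"))   # a private key of 1 gives the base point G
```

Going from the private key to the public key is a few hundred point additions. Going the other way is the elliptic curve discrete logarithm problem, for which no efficient classical method is known.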

Next, we hash the ECDSA public key with SHA256, and then hash again with RIPEMD160:

ripemd160 = hashlib.new('ripemd160')
# The public key is hashed in its uncompressed form: 0x04 || x || y
ripemd160.update(hashlib.sha256(b'\x04' + vk.to_string()).digest())

Finally, we add a version byte and a four-byte checksum, and convert the result to a public identifier using Base58.

So why did Satoshi hash twice?

Well, the double hash certainly makes the generation of the Bitcoin address ultra secure, as it would be near impossible to reverse a Bitcoin address back to the public key, and then back to the private key.

So why did Satoshi use RIPEMD160?

Well, SHA256 produces a 256-bit hash, and RIPEMD160 produces a 160-bit hash. On its own, a 160-bit hash (as with SHA-1) would not be seen as secure enough to fully protect the key. But, in this case, we have the strong security of SHA-256, and then apply RIPEMD160. The 160-bit output gives the Bitcoin ID a shorter address. For this, we have, after the ‘1’ identifier, 33 Base58 characters:

186FdYTQQbU7KvaLrsticTgPzv6wfgpc8G

Each Base58 character carries just under six bits (log2(58) is around 5.86). The encoded value is 25 bytes long: a version byte, the 160 bits for the identifier, and a four-byte checksum.
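The arithmetic behind the address length can be checked with a short sketch. This assumes the usual Base58Check layout for this type of address (one version byte, the 160-bit hash, and the first four bytes of a double SHA-256 as the checksum); the function name `base58check_encode` is invented for this example.

```python
import hashlib
import math

B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check_encode(version: int, payload: bytes) -> str:
    """Version byte + payload + 4-byte checksum, written in Base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    data += checksum
    n = int.from_bytes(data, "big")
    encoded = ""
    while n > 0:
        n, rem = divmod(n, 58)
        encoded = B58[rem] + encoded
    # Each leading zero byte is written as a '1'
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded

bits_per_char = math.log2(58)      # about 5.86 bits per Base58 character
total_bits = 8 * (1 + 20 + 4)      # version + 160-bit hash + checksum
print(round(bits_per_char, 2), total_bits)

# Address for an all-zero 160-bit hash (version byte 0)
print(base58check_encode(0, bytes(20)))
```

The version byte of zero is what produces the leading ‘1’. The remaining 24 bytes (192 bits) need around 33 Base58 characters, which matches the sample address.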

So, that is the possible reason for the 160 bits, but why do we use RIPEMD160, and not SHA-1 (which also has a 160 bit hash)?

Coding

RIPEMD is a family of cryptographic hash functions with 128-bit, 160-bit, 256-bit or 320-bit outputs, and was created by Hans Dobbertin, Antoon Bosselaers and Bart Preneel [1]. It is used in TrueCrypt, and is open source. The 160-bit version is seen as an alternative to SHA-1, and is part of ISO/IEC 10118 [Theory].

The following defines the coding for the generation of Bitcoin keys [here]. In this case, we use the RIPEMD160 hash to create the public identifier (the address) from the public key:

import ecdsa
import secrets
import hashlib

# Base58 alphabet (no 0, O, I or l)
b58 = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def privateKeyToWif(key_hex):
    # 0x80 is the version byte for a WIF private key
    return base58CheckEncode(0x80, bytes.fromhex(key_hex))

def privateKeyToPublicKey(s):
    sk = ecdsa.SigningKey.from_string(bytes.fromhex(s), curve=ecdsa.SECP256k1)
    vk = sk.verifying_key
    # 0x04 marks an uncompressed public key (x || y)
    return (b'\x04' + vk.to_string()).hex()

def pubKeyToAddr(s):
    # RIPEMD160(SHA256(public key)); needs RIPEMD160 support in Hashlib
    ripemd160 = hashlib.new('ripemd160')
    ripemd160.update(hashlib.sha256(bytes.fromhex(s)).digest())
    return base58CheckEncode(0, ripemd160.digest())

def keyToAddr(s):
    return pubKeyToAddr(privateKeyToPublicKey(s))

def base58encode(n):
    result = ''
    while n > 0:
        result = b58[n % 58] + result
        n //= 58
    return result

def base58CheckEncode(version, payload):
    s = bytes([version]) + payload
    # Checksum is the first four bytes of SHA256(SHA256(data))
    checksum = hashlib.sha256(hashlib.sha256(s).digest()).digest()[0:4]
    result = s + checksum
    leadingZeros = countLeadingChars(result, 0)
    return '1' * leadingZeros + base58encode(base256decode(result))

def base256decode(s):
    result = 0
    for c in s:
        result = result * 256 + c
    return result

def countLeadingChars(s, ch):
    count = 0
    for c in s:
        if c == ch:
            count += 1
        else:
            break
    return count

# 256-bit private key as 64 hex characters (secrets is used here in place of random)
private_key = secrets.token_hex(32)
print('Private key: ', private_key)
pubKey = privateKeyToPublicKey(private_key)
print('\nPublic key: ', pubKey)
print('\nWif: ', privateKeyToWif(private_key))
print('\nAddress: ', keyToAddr(private_key))

A sample run is:

Private key:  97c5a919495d4869c11e0872480939c0a81bfe3674767a2e3e2c66490b6113a8

Public key: 04ece141ea5eaa448cd5cabf7f4aeef7a529f5b140fb4e6d2840a142a9325f09801c844db4067b39a1a9cdb96d41f98c6abaf6f9f11cbe4793954b0f08e9e1e951

Wif: 5Jy8PwjawKaJASM2zsracU5cHEQ9FhHtq5cMrTrpKwsPsQjjyeN

Address: 186FdYTQQbU7KvaLrsticTgPzv6wfgpc8G

OpenSSL 3.0 moved RIPEMD160 into its legacy provider, which can stop Hashlib from using it on some systems. In this case, we can use Pycryptodome to implement:

from Crypto.Hash import RIPEMD160
s="Hello"

h = RIPEMD160.new()
h.update(s.encode())
print (h.hexdigest())

RIPEMD160

While OpenSSL 3.0 no longer enables RIPEMD160 by default, LibreSSL still supports it. A test vector for RIPEMD-160 is:

RIPEMD-160("The quick brown fox jumps over the lazy dog") =
37f332f68db77bd9d7edd4969571ad671cf9dd3b

And an implementation using LibreSSL is here:

https://asecuritysite.com/hash/ripemd

Conclusions

I appreciate this was a long article, but it needs to be, in order to show how well Satoshi did with the Bitcoin approach. For the selection of RIPEMD160, Satoshi wanted as small a public ID as possible. For RIPEMD160 rather than SHA-1, Satoshi perhaps wanted to mix up the hashing methods, in case SHA-1 and SHA-256 were broken.

Satoshi dreamed, analysed, engineered, and implemented, and then stepped back to watch the machine in motion — and took no additional credit for it. Overall, Bitcoin is far from perfect but it broke the mould and started the rebuilding of cybersecurity and trust — in the way it should have been created.

Reference

[1] Dobbertin, H., Bosselaers, A., & Preneel, B. (1996, February). RIPEMD-160: A strengthened version of RIPEMD. In International Workshop on Fast Software Encryption (pp. 71–82). Berlin, Heidelberg: Springer Berlin Heidelberg.