What’s That? … It’s a Blockchain!

I presented at a blockchain event a few years ago, and shared the stage with the head of blockchain at a major company. As I was flicking…

What’s That? … It’s a Blockchain!

I presented at an industry-focused blockchain event a few years ago, and shared the stage with the lead for blockchain at a major company. As I was flicking through my slides before the start of the presentation, he said to me, “What’s that? It looks interesting”, I looked at my slide, and said, “It’s a Merkle Tree”. “And, what’s that?” I paused for a few seconds as I considered what Ralph Merkle would have said in reply, but blurted out, “That’s a blockchain!”. “Ah, that’s nice. Never heard of that before”, he said.

I knew then, that the whole area of blockchain has in trouble of existing in a world which some people — even the head of blockchain chain at one of the major technology companies in the world — just used the word (/term) without really knowing what was undernealth. For me, it was like an electrical engineer asking me what the colours on the wires are, “Is blue live or neutral? I always get mixed up with them”.

Blockchain isn’t really a thing. It is part of our tool kit that we use to implement trust in our flawed digital work. It’s Merkle Trees, public key signing, identity checking, zero knowledge proofs, Pedersen commitments, and so much more.

In five pages, create a blockchain

It was Ralph Merkle, who, in five short pages, laid out the method that has become what we would know as blockchain [here]:

The patent was assigned to Stanford, and where it outlined a way to validate data through a tree of hashes, and where the top-level hash could be used to validate all the data within a block:

In this, data element Y1 and Y2 were hashed, and then a hash created from this, and eventually we reached the top of the tree, and where a proof of the validity of all the data could be proven from the top-level hash, and any changes quickly traced within the tree. It is such a beautiful patent, and just say what it is, and finishes. It has four basic claims, but, basically, it just built around a core figure for the tree.

Beautiful maths at its most wonderful form.

Satoshi Nakamoto

In Satoshi’s white paper of Bitcoin, he/she lays out a more trusted world, and at its core is the usage of Merkle Trees. With this we create a tree of hashes and then it this to create a root hash. In blockchain, it is this root hash that we add to our blockchain, in order to create a chaining effect.

So let’s say we have the names of a few countries: ‘France’, ‘Germany’, ‘Denmark’, ‘Sweden’, ‘UK’, ‘Spain’, and ‘Italy’. We can then hash as [here]:

Input:		 [ 'France', 'Germany', 'Denmark', 'Sweden', 'UK', 'Spain', 'Italy' ]
Root hash: 65C7C87E2731AE97D0BE89EA3299A544FD28DE6A
Tree depth 3
Tree levels: 4
Tree nodes: 7
Level  0
[ '65C7C87E2731AE97D0BE89EA3299A544FD28DE6A' ]
Level  1
[ '8D832398ACABF7B669C023174169C8876F9764DC',
'2BBB8E1E1D16671B763D0C026C9A2BA43FD73381' ]
Level  2
[ '0F705D70B6CF6FFE1FE570FC24947767438541FA',
'A5D0D57347C4BE03B6E91E0176D5E53614AA861E',
'F8A5C1FC111019BDCD2E6F7877299878F9699640',
'AD79EF0F076D3A686AB9738925F4DD2C7E69D7D1' ]
Level  3
[ 'E3772AC4B4DB87B4A8DBFA59EF43CD1A8AD29515',
'17D53E0E6A68ACDF80B78D4F9D868C8736DB2CEC',
'89DA124E04DFE1AD9946CD37D91A119E1D028898',
'72DDD2B619AF6D6A73FEBF80F7FCAD22495498CD',
'7770D04C7BFB791E7C95959F12D6797BEE87A8D8',
'20A8DF9B760336178FCA425339EC1C7E542A2463',
'AD79EF0F076D3A686AB9738925F4DD2C7E69D7D1' ]

The root of the tree is then 65C7C87E2731…299A544FD28DE6A, and we have seven nodes and four levels. We can now draw the Merkle Tree:

The SHA-1 hash of ‘France’ is E3772AC4B4D…43CD1A8AD29515. Notice that because we have an odd number of nodes, the hash for ‘Italy’ is used for the next level. The following is some sample code for node.js [here]:

var merkle = require('merkle');
const args = process.argv.slice(2);
const str= args[0];
var arr = str.split(',')
console.log("Input:\t\t",arr);
var tree = merkle('sha1').sync(arr);
console.log("Root hash:\t",tree.root());
console.log("Tree depth\t",tree.depth());
console.log("Tree levels:\t",tree.levels());
console.log("Tree nodes:\t",tree.nodes());
var i;
for (i = 0; i < tree.levels(); i++) {
console.log("\nLevel ",i);
console.log(tree.level(i));
}

If we create with no hashing method, we can clearly see the structure of the true. In this case we use ‘1’,’2',’3',’4' and ‘5’ for our data [here]:

var merkle = require('merkle');Input: [ ‘1’, ‘2’, ‘3’, ‘4’, ‘5’ ]
Type: none
Root hash: 12345
Tree depth 3
Tree levels: 4
Tree nodes: 6
Level 0[ ‘12345’ ]
Level 1[ ‘1234’, ‘5’ ]
Level 2[ ‘12’, ‘34’, ‘5’ ]
Level 3[ ‘1’, ‘2’, ‘3’, ‘4’, ‘5’ ]

The great advantage of using a Merkle Tree is that we can easily check that a value is included in the root hash. In the diagram below, to test that ‘Germany’ (Node 2) is in the tree, we would need only need Node 1, Node 34 and Node 567 in order to generate the root (Node 1234567):

The reason we only need nodes 1, 34 and 567, is that we can regenerate Node 12 and Node 1234 from these values. Bitcoin, for example, uses this method in order to check the validity of a transaction within the block. We can then add our transactions into a Merkle Tree, and produce a root hash. Within the block we can then add the root hash of the previous block and the root hash of the next block, and thus interlock the blocks into a chain:

The Merkle Patricia Tree

With the Merkle tree, we create a tree of hashes, and where the root hash provides the overall consistency of the data within the tree. Its core advantage is that we can easily check that data is within the tree by analysing a subtree [here].

A Merkle-Patricia-Tree enhances this by using a key (normally defined as a string) to store associative arrays. Patricia is defined as a Practical Algorithm To Retrieve Information Coded In Alphanumeric — paper.

A node is then associated with a key. This is defined as a trie — a digital tree. This differs from a Merkle tree in that the actual key for each node is not stored, but its position in the tree is used to define the key. The nodes which are below a given node are defined with the same prefix as the string at that node, and the root of the tree is then an empty string [here].

So let’s take an example with five words to index: flower, flows, far, pitching and pitches. We can now draw a tree that will index these strings:

If a user enters ‘f’, we progress to the second level, and then enters ‘low”, and onto the third level. And finally, they enter ‘s’, and we move to the fourth level. We can see that the data is now ordered and associated, and where the position in the tree defines the key to which that data element is associated with.

In the Ethereum blockchain, we use a modified Merkle Patricia Trie — as defined in the Yellow Paper [here] — in order to create a trie that contains all the transactions. In this way, we can generate a complete worldview of all the transactions that have ever been made.

As we are dealing with transaction IDs, each of the keys has X hexadecimal characters. Every node in the trie can then have 16 possible children. The maximum depth of the trie will be X.

Ref: https://i.stack.imgur.com/YZGxe.png

Each node can then be an extension, a branch or a leaf. A leaf is an endpoint and will contain the transaction value. In the diagram, we see that the root hash is the hash of all of the transactions. We then have an extension to define the top-level node, and where we can see that there are four transactions and which are defined by the keys of “a711355”, “a77d337”, “a7f9365” and “a77397”. The transaction amounts are for 45.0 ETH, 1.00 WEI, 1.1 ETH and 0.12 ETH, respectively.

We can then follow the tree to find the transactions. We start with the key value of “a7…” (ROOT: extension node). Then we have a leaf at “a7..1355” (and where “1355” is the end part of the key). The transaction value here is 45.0 ETH. We have another leaf node with “a7..9365” (and where “9365” is the end part of the key), and with a transaction value of 1.1 ETH.

Next, we have an extension to “a7d3”. Finally, we get to the last two transactions with end leaves (“a7d33..7” and “a7d39..7”), and which have transaction amounts of 1.00 WEI and 0.12 ETH, respectively.

In the following, we have the Top 2,500 Web sites in the world and implemented with a Radix Trie (aka PATRICIA trie) [here]:

Conclusions

We live in an almost completely untrusted digital world. With Merkle Trees we can add strong levels of trust, and easily compute the validity of transactions. If you are interested, here is an outline of the greatness of Ralph Merkle: