The Most Amazing Machine on the Planet Isn’t a Machine

In data, we must find a bridge between our world and theirs

For every second of your existence on the planet, you have access to one of the most amazing machines ever created. It isn’t an iPhone X, the Amazon AWS Cloud, the Ethereum blockchain, or a Tesla car. It is something a whole lot more intelligent … it is your brain.

Our computing world is now filled with machines which are trying to match the human brain by focusing on single tasks such as face recognition, licence plate detection, and object tracking. Masses of computing power are going into self-driving cars, yet driving a car is something we almost take for granted. We can do all of these things without even thinking about it. Imagine a world where you could not recognise your partner, just because they had shaved their beard off or had a haircut, or where you could not look at a photograph and recognise someone that you know. It all happens in our brain, and it is the thing that helps us survive and thrive.

We live in an analogue world: we see light reflecting off the screen, and detect changes in colour and brightness. This allows us to see shapes, and then convert these into objects. Around a third of the human brain has a core function of analysing the things that we see (the visual system), and this is contained within the cortex. For shapes, such as the letters of the alphabet, we need the object cortex, which lets us differentiate between things that have the same shape, such as a ball, the letter ‘O’, and an apple. The visual cortex can also make sense of seeing a car at different viewing angles, while still knowing that it is a car, even if the car is upside down. Computers, though, can only process in a binary manner and must convert our analogue world into a digital form. For them, the shape that we identify in our brain as the letter ‘e’ is represented by the binary pattern “0110 0101”.

We thus differ in the ways that we identify things around us. Our human processing system is extremely powerful at making sense of our world, and at spotting objects no matter how they are presented to us. As humans, this is fundamental and is a core part of keeping us safe. The fast recognition that a car is speeding towards you allows you to quickly decide on the best course of action to take. For a machine, the data must be converted into a binary form before the processing can take place. We thus have a problem.

How can we communicate effectively with a common format, so that humans can define things to a machine (such as binary patterns), and so that a machine can create something that a human would understand? We thus encode our data, and show it in simple forms that humans can understand. For binary patterns, such as encryption keys, we might use a hexadecimal format, and for non-printable characters (such as a tab), we could use a special escape character. And so, while humans are fast at recognising objects, our brains need complex models of objects and have to continually realign for our analogue world. For a machine, once captured, an ‘e’ becomes just eight bits of data.

Binary, hex, octal

On a computer system, code and data are represented as binary, but humans find it difficult to deal with binary formats, so other formats are used to represent the binary values. Two typical formats used to represent characters are ASCII and UTF-16. With extended ASCII we have 8-bit values, which support up to 256 different characters (2^8). UTF-16 extends the characters to 16-bit values, and thus gives a total of 65,536 values (2^16) in a single 16-bit unit. Within ASCII coding, we map printable characters, such as ‘a’ and ‘b’, to decimal, binary and hexadecimal values:

ASCII  Binary     Hex   Decimal
-------------------------------
e      0110 0101  0x65  101
E      0100 0101  0x45  69
' '    0010 0000  0x20  32
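
As a quick check, the rows of this table can be reproduced in Python with the built-in ord() and format() functions. This is only a sketch, and the helper name ascii_row is made up for illustration:

```python
def ascii_row(ch):
    """Return (binary, hex, decimal) for a single ASCII character."""
    code = ord(ch)                       # numeric code of the character
    bits = format(code, "08b")           # eight binary digits, zero-padded
    binary = bits[:4] + " " + bits[4:]   # split into two four-bit groups
    return binary, "0x{:02X}".format(code), code


for ch in ["e", "E", " "]:
    print(repr(ch), *ascii_row(ch))
```

Each printed row should match the corresponding row in the table above.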

We also have other ‘non-printing’ characters which typically have a certain control function. These include CR (Carriage Return), LF (Line Feed) and HT (Horizontal Tab):

ASCII  Binary     Hex   Decimal  Character representation
CR     0000 1101  0x0D  13       \r
LF     0000 1010  0x0A  10       \n
HT     0000 1001  0x09  9        \t
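
A short Python sketch shows how these control characters appear as escape sequences, and what their numeric codes are. The sample string is made up for illustration:

```python
# A made-up sample containing a tab, a carriage return and a line feed.
text = "a\tb\r\n"

# repr() shows the escape sequences rather than the raw control characters.
print(repr(text))

# ord() gives the numeric code of each control character.
codes = [ord(c) for c in "\r\n\t"]
print(codes)                              # decimal codes for CR, LF, HT
print([format(c, "#04x") for c in codes]) # the same codes in hexadecimal
```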

Within text files we are likely to have line breaks, which are created by CR and LF characters. In Microsoft Windows-type systems, we use CR followed by LF at the end of a line (\r\n), while a Linux/Mac-type system uses only LF for a new line (\n). Normally, when we encrypt into ciphertext, the result is a bitstream which contains non-printing characters, and we thus need to represent it in a printable way. We may also need to represent our encryption keys in a printable and/or distributable manner. For this we often use a hexadecimal or Base-64 format, as it allows us to represent binary values in a printable format.
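
To illustrate, here is a minimal Python sketch that turns a few bytes into hexadecimal and Base-64 text, and back again. The bytes are made up and simply stand in for ciphertext or a key; they are not the output of any real cipher:

```python
import base64

# Made-up bytes standing in for ciphertext; several are non-printing.
raw = bytes([0x00, 0x0D, 0x0A, 0xFF, 0x65])

as_hex = raw.hex()                               # two hex digits per byte
as_b64 = base64.b64encode(raw).decode("ascii")   # four characters per three bytes

print(as_hex)
print(as_b64)

# Both printable forms convert back to exactly the same bytes.
assert bytes.fromhex(as_hex) == raw
assert base64.b64decode(as_b64) == raw
```

Hexadecimal is easier to read byte by byte, while Base-64 gives a shorter string for the same data.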

The most common format for representing standard English characters is ASCII. In its standard form, it uses a 7-bit binary code to represent characters (giving a range of 0 to 127), but it is rather limited in its scope as it does not support symbols such as Greek letters. To increase the number of symbols which can be represented, extended ASCII is used, which has an 8-bit code. Some important non-printable ASCII characters are: New line (0x0A); Carriage Return (0x0D); Tab (0x09); and Backspace (0x08), while a Space is represented by 0x20. The representations for ‘A’ and ‘B’ are:

Char  Dec  UTF-16             ASCII     Hex  Oct  HTML
A     65   00000000 01000001  01000001  41   101  &#65;
B     66   00000000 01000010  01000010  42   102  &#66;
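
The columns of this table can be generated with a few lines of Python. This is a sketch only: the function name char_row is hypothetical, the UTF-16 column is assumed to be big-endian without a byte-order mark, and the HTML column is assumed to be a numeric character reference:

```python
def char_row(ch):
    """Build the table columns for one character."""
    code = ord(ch)
    utf16 = ch.encode("utf-16-be")   # big-endian, no byte-order mark
    return {
        "dec": code,
        "utf16": " ".join(format(b, "08b") for b in utf16),
        "ascii": format(code, "08b"),
        "hex": format(code, "X"),
        "oct": format(code, "o"),
        "html": "&#{};".format(code),  # numeric character reference
    }


for ch in "AB":
    print(ch, char_row(ch))
```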

Hexadecimal and Base-64

The conversion to hexadecimal format involves splitting the bit stream into groups of four bits (Figure 2), and for Base-64 into groups of six bits (Figure 3). With the hexadecimal format, we have values from 0 to 15, which are represented by four-bit values from 0000 to 1111. For Base-64, we take six bits at a time. For example, if we take “fred”, then we get:

ASCII  f        r        e        d
Binary 01100110 01110010 01100101 01100100

To convert to Base-64, we regroup the bits into blocks of six:

Binary 011001 100111 001001 100101 011001 00

And then map these to a Base-64 table:

Binary 011001 100111 001001 100101 011001 00
Decimal 25 39 9 37 25 0
Base-64 Z n J l Z A

The result is ZnJlZA, which becomes ZnJlZA== once it is padded with “=” to a multiple of four characters.
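
The grouping above can be sketched in Python as follows. The function name base64_manual is made up for this example, and it is written for clarity rather than speed; in practice the standard base64 module would be used:

```python
import string

# Standard Base-64 alphabet: A-Z, a-z, 0-9, then '+' and '/'.
B64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def base64_manual(data):
    """Encode bytes to Base-64 by regrouping the bits six at a time."""
    bits = "".join(format(b, "08b") for b in data)
    bits += "0" * (-len(bits) % 6)            # zero-fill the final six-bit group
    groups = [bits[i:i + 6] for i in range(0, len(bits), 6)]
    chars = "".join(B64_ALPHABET[int(g, 2)] for g in groups)
    return chars + "=" * (-len(chars) % 4)    # pad to a multiple of four characters


print(base64_manual(b"fred"))
```

The output can be compared against base64.b64encode(b"fred") from the standard library.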

Figure 2: Hex conversion
Figure 3: Base64 conversion

With Base-64, the output is built in groups of four Base-64 characters. We pad the final six-bit value with zeros, and then add the “=” character until the output length is a multiple of four:

test -> 01110100 01100101 01110011 01110100
test -> 011101 000110 010101 110011 011101 00[0000] = =
test -> d G V z d A = =
help -> 01101000 01100101 01101100 01110000
help -> 011010 000110 010101 101100 011100 00[0000] = =
help -> a G V s c A = =
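
These padded results can be checked against Python's standard base64 module. The sketch below also encodes two shorter, made-up inputs (“hel” and “he”) to show how the number of “=” characters depends on the input length:

```python
import base64


def b64(word):
    """Base-64 encode an ASCII string and return the result as text."""
    return base64.b64encode(word.encode("ascii")).decode("ascii")


for word in ["test", "help", "hel", "he"]:
    encoded = b64(word)
    print(word, "->", encoded, "(padding: {})".format(encoded.count("=")))
```

Three input bytes fill exactly four Base-64 characters, so “hel” needs no padding, while “he” needs one “=” and the four-letter words need two.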

Unfortunately, some characters look similar when they are printed, such as a zero (‘0’) and a capital ‘O’. Base-64 suffers from this, as it contains similar-looking characters: 0 (zero), O (capital o), I (capital i) and l (lower case L), along with the non-alphanumeric characters + (plus) and / (slash). One solution is Base-58, used in Bitcoin applications, where we remove these characters.

For Base-58, we convert the ASCII characters into binary, treat the result as one large integer, and then keep dividing by 58, converting each remainder to a Base-58 character. The alphabet becomes:

‘123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz’

Let’s take an example of ‘e’. It has a decimal value of 101, so we divide by 58 to get:

1 remainder 43

and next we divide 1 by 58 and we get:

0 remainder 1

We then look up position 1 and position 43 (counting from zero) in the alphabet:

123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz

which gives:

2k

If we now take ‘ef’, we get 25958 (102 + 101 × 256), as each earlier character moves up one byte. In general, we take the binary value of the whole string (‘ef’ is ‘01100101 01100110’), and then repeatedly divide by 58 and take the remainders.
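
The repeated division can be sketched in Python as below. The function name base58_encode is made up for this example, and it covers only the division steps described here; it is an assumption-laden sketch that does not handle leading zero bytes or the checksum used in Bitcoin addresses:

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data):
    """Encode bytes as Base-58 by repeated division by 58."""
    # Treat the whole byte string as one integer, first byte most significant.
    value = int.from_bytes(data, "big")
    digits = ""
    while value > 0:
        value, remainder = divmod(value, 58)
        digits = B58_ALPHABET[remainder] + digits   # last remainder ends up first
    return digits


print(int.from_bytes(b"ef", "big"))   # the integer value of 'ef'
print(base58_encode(b"e"))
print(base58_encode(b"ef"))
```

For ‘e’ the remainders are 43 and then 1, and they are read in reverse order to give the final string.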

Conclusions

And there you go. Please, please, please, do not waste this most amazing machine, and try to exercise it. Education is one of the most important things that we have, so don’t sit back on the knowledge you have; go and learn something new.