Base64 is Malleable!

First, let’s start with some basics.

Base64 is a way of representing binary data as text, and is used for exchanging encryption keys, ciphertext, file attachments in emails, and so on. It can also be used to create a thumbprint for a data object, and to represent a digital signature. It is thus important that it gives us a reliable result. And what does malleable mean in coding? Well, it is where we apply an operation, reverse it, and end up with a different result. For example, if we add five to a value and then subtract five, we should get the original value. If not, our code is malleable.

A bit of magic

Let’s take a Base64 encoded string, and decode it to a byte array and print the answer. Next, we will take this byte array and convert it back into a Base64 string and print it. And — as if by magic — the value for the Base64 string generated differs from the original.

So let’s try a range of values, using every pair of hex digits from “00” (0x00) to “ff” (0xFF) as the two Base64 characters, followed by “==” [here]:

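When we run this program, we can see that re-encoding the decoded byte into Base64 gives a string that differs from the original in every row shown, with a “w” or a “Q” replacing the second Base64 character. Each row lists the original string and the byte it decodes to (in hex), followed by the re-encoded string and the byte that it decodes to: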

00== d3 0w== d3
10== d7 1w== d7
20== db 2w== db
30== df 3w== df
40== e3 4w== e3
50== e7 5w== e7
60== eb 6w== eb
70== ef 7w== ef
80== f3 8w== f3
90== f7 9w== f7
a0== 6b aw== 6b
b0== 6f bw== 6f
c0== 73 cw== 73
d0== 77 dw== 77
e0== 7b ew== 7b
f0== 7f fw== 7f
01== d3 0w== d3
11== d7 1w== d7
21== db 2w== db
31== df 3w== df
41== e3 4w== e3
...
ee== 79 eQ== 79
fe== 7d fQ== 7d
0f== d1 0Q== d1
1f== d5 1Q== d5
2f== d9 2Q== d9
3f== dd 3Q== dd
4f== e1 4Q== e1
5f== e5 5Q== e5
6f== e9 6Q== e9
7f== ed 7Q== ed
8f== f1 8Q== f1
9f== f5 9Q== f5
af== 69 aQ== 69
bf== 6d bQ== 6d
cf== 71 cQ== 71
df== 75 dQ== 75
ef== 79 eQ== 79
ff== 7d fQ== 7d

And, perhaps Golang is the problem, so let’s try some Python code [here]:
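The listing itself is not reproduced here, so the following is only a minimal sketch of the same round trip. It assumes Python 3 and the standard `base64` module, and the helper name `round_trip` is illustrative rather than taken from the original code:

```python
import base64


def round_trip(text):
    """Decode a Base64 string, then re-encode the decoded bytes."""
    raw = base64.b64decode(text)                  # Base64 text -> bytes
    again = base64.b64encode(raw).decode("ascii")  # bytes -> Base64 text
    return raw, again


raw, again = round_trip("00==")
print(raw, again)
```

Under Python 3 the decoded value prints as a bytes literal (`b'\xd3'`), whereas the output shown in this article has the older `'\xd3'` form.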

And magically, it is wrong again (D3 is 211 in decimal):

('\xd3', '0w==')
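To check the whole range in Python as well, a sketch along the following lines could be used. It assumes Python 3 and the standard `base64` module. The names `HEX_DIGITS` and `sweep` are illustrative, and the loop order (first character changing fastest) is chosen only to match the order of the rows listed earlier:

```python
import base64
from itertools import product

HEX_DIGITS = "0123456789abcdef"


def sweep():
    """Yield (original, decoded bytes, re-encoded) for every hex-digit pair."""
    # product() varies its last element fastest, so unpack as (second, first)
    # to make the first character change fastest.
    for second, first in product(HEX_DIGITS, repeat=2):
        original = first + second + "=="
        raw = base64.b64decode(original)
        yield original, raw, base64.b64encode(raw).decode("ascii")


rows = list(sweep())
mismatches = [row for row in rows if row[0] != row[2]]

for original, raw, again in rows[:3]:
    print(original, raw.hex(), again)
print(len(mismatches), "of", len(rows), "strings changed")
```

None of the 16 hex digits has a Base64 value whose low four bits are all zero, so every one of the 256 strings comes back changed.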

Why?

Basically, what is happening here relates to the conversion from Base64 into bytes. With Base64 we take six bits at a time and encode them as a Base64 character (see table below). If we do it long-hand, then a “0” has the value 52 (110100b, or 0x34), and so we get:

“00==” is 110100 110100, which regroups as 11010011 0100

Here “=” is just padding to bring the string up to a multiple of four characters.

But, in the conversion to a byte array, the Base64 decoder cannot use anything that is not a multiple of eight bits, and so it chops off the four bits at the end. When the byte is re-encoded we get (where “_” marks a truncated bit):

“0w==” is 110100 11 _ _ _ _

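The eight bits that remain (11010011) are 211 in decimal, and this is the value returned. We thus only return a single byte (rather than the 12 bits of the original), and the encoder fills the truncated positions with zeros, which makes the second character 110000, or “w”.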
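The bit arithmetic can be checked by hand with a short sketch. It assumes the standard Base64 alphabet (A to Z, a to z, 0 to 9, then “+” and “/”), and the names `ALPHABET` and `split_bits` are illustrative:

```python
import string

# Standard Base64 alphabet: a character's index here is its 6-bit value.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def split_bits(two_chars):
    """Return (the 8 bits kept, the 4 bits dropped) for two Base64 characters."""
    bits = "".join(format(ALPHABET.index(c), "06b") for c in two_chars)
    return bits[:8], bits[8:]


for group in ("00", "0w"):
    kept, dropped = split_bits(group)
    print(group, kept, dropped, int(kept, 2))
# prints:
# 00 11010011 0100 211
# 0w 11010011 0000 211
```

Both strings keep the same eight bits and differ only in the four bits that the decoder throws away, which is why they decode to the same byte.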

And so, it’s solved.

Conclusions

Don’t just implement Base64 signatures in your code and assume that they will work. A signature, once applied, cannot be undone. Make sure that you are always dealing with multiples of eight bits in your code, and that you test your code. Any bit string whose length is not a multiple of eight bits will most likely give a different Base64 string when converted back.
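One way to test for this is to accept a Base64 string only if it survives the round trip unchanged. The sketch below assumes Python 3 and the standard `base64` and `binascii` modules. The function name `is_canonical_base64` is hypothetical, and this is one possible check rather than a complete validation scheme:

```python
import base64
import binascii


def is_canonical_base64(text):
    """True only if text is valid Base64 and re-encoding its bytes gives text back."""
    try:
        raw = base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return False  # bad characters, bad padding, or non-ASCII input
    return base64.b64encode(raw).decode("ascii") == text


print(is_canonical_base64("0w=="))  # True
print(is_canonical_base64("00=="))  # False
```

Comparing the re-encoded string with the input does not rely on the decoder itself rejecting stray bits, so the check behaves the same whether or not the decoder is strict about them.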