Cybersecurity Debasing

Photo by Amirhossein Azandarian Malayeri on Unsplash

In cybersecurity, we often try to detect sequences of characters or bytes. The formats of our files can vary, and we might find data encoded as hex, binary, or Base64. There are many bases we can use, and each is defined by the number of characters in its alphabet:

Base2 [01]
Base3 [123]
Base5 [01234]
Base10 [0123456789]
Base26 [A-Z]
Base32 [A-Z2-7=]
Base45 [0-9A-Z $%*+-./:]
Base58 (bitcoin) [1-9A-HJ-NP-Za-km-z]
Base62 [0-9A-Za-z]
Base64 [A-Za-z0-9+/=]
Base67 [A-Za-z0-9-.!~_]
Base85 (Ascii85) [!"#$%&'()*+,-./0-9:;<=>?@A-Z[\]^_`a-u]
Base91 [A-Za-z0-9!#$%&()*+,./:;<=>?@[]^_`{|}~"]

Base58 is interesting, as it is used in Bitcoin addresses. Its character set is “[1-9A-HJ-NP-Za-km-z]”, which leaves out characters that can be mistaken for one another: there is no “0” (zero), no “I” (capital I), no “O” (capital O), and no “l” (lowercase L).
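As a quick check of that character set, here is a minimal Python sketch that builds the Base58 alphabet by dropping the four ambiguous characters from the 62 digits and letters. The variable names are illustrative and not taken from any particular library.

```python
import string

# Characters that are easy to confuse when read or typed by hand.
AMBIGUOUS = "0OIl"

# Start from digits, upper-case, then lower-case (62 characters in total),
# and drop the ambiguous ones, keeping the original order.
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE58 = "".join(ch for ch in BASE62 if ch not in AMBIGUOUS)

print(len(BASE58))  # 58
print(BASE58)
```

The result is in the same order as the “[1-9A-HJ-NP-Za-km-z]” set given above.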

So, let’s try to automatically detect a few:

  • Message=“2JdhtsPysrBHd” Try.
  • Message=“bafFq9NqDpx … 7LReQEhtrZCKbQ” (Base64) Try.
  • Message=“5468657265206973 … 77320746F646179” (Base16, hex) Try.
  • Message=“315807065 … 153814190302191993” Try.
  • Message=“0101010001101 … 00100001” Try.
  • Message=“3122211321 … 221121211213332131133321” (Base3) Try.
  • Message=“B2BIECH44.OE-3E-34/*AHWE5/DC3DEVE” (Base45) Try.
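One simple way to approach this kind of detection is to test which alphabets contain every character of the message and then prefer the smallest one. The sketch below does only that. It is an assumption about how such a detector could work, not a description of the tool behind the “Try” links, and the `candidate_bases` name is hypothetical. The alphabets are copied from the list of bases above, with Base16 (hex) added as an extra entry.

```python
import string

DIGITS = string.digits
UPPER = string.ascii_uppercase
LETTERS = string.ascii_letters

# Alphabets as listed earlier in this article.
# Base16 is not in that list; it is added here as an assumption.
ALPHABETS = {
    "Base2": "01",
    "Base3": "123",
    "Base5": "01234",
    "Base10": DIGITS,
    "Base16": DIGITS + "ABCDEFabcdef",
    "Base26": UPPER,
    "Base32": UPPER + "234567=",
    "Base45": DIGITS + UPPER + " $%*+-./:",
    "Base58": "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz",
    "Base62": DIGITS + LETTERS,
    "Base64": LETTERS + DIGITS + "+/=",
    "Base67": LETTERS + DIGITS + "-.!~_",
    "Base85": "".join(chr(c) for c in range(ord("!"), ord("u") + 1)),
    "Base91": LETTERS + DIGITS + "!#$%&()*+,./:;<=>?@[]^_`{|}~\"",
}


def candidate_bases(message: str) -> list[str]:
    """Return every base whose alphabet covers the message, smallest first."""
    chars = set(message)
    matches = [name for name, alphabet in ALPHABETS.items()
               if chars <= set(alphabet)]
    # Heuristic: a smaller alphabet that still fits is the more likely guess.
    return sorted(matches, key=lambda name: len(ALPHABETS[name]))


print(candidate_bases("B2BIECH44.OE-3E-34/*AHWE5/DC3DEVE"))
# ['Base45', 'Base85']
```

A character-set test like this can only narrow the options down. A string of 0s and 1s, for instance, fits Base2 but also fits Base10 and Base64, so a real detector would likely also try decoding each candidate and check whether the result looks sensible.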

Base45

The Base45 format is used in applications such as the QR codes within vaccination passports. With this, we take two bytes at a time [A, B] and then determine the values of [C, D and E] for: (A×256)+B=C+(D×45)+(E×45×45). For this, we compute (A×256)+B, then repeatedly divide by 45 and note the remainder each time: the first remainder is C, the second is D, and the third is E. Each remainder is then mapped to a character with a lookup table. The table for the conversion is:

Value Encoding  Value Encoding  Value Encoding  Value Encoding
00    0         12    C         24    O         36    Space
01    1         13    D         25    P         37    $
02    2         14    E         26    Q         38    %
03    3         15    F         27    R         39    *
04    4         16    G         28    S         40    +
05    5         17    H         29    T         41    -
06    6         18    I         30    U         42    .
07    7         19    J         31    V         43    /
08    8         20    K         32    W         44    :
09    9         21    L         33    X
10    A         22    M         34    Y
11    B         23    N         35    Z

For example, for “hello”, we have:

h         e         l         l         o
0110 1000 0110 0101 0110 1100 0110 1100 0110 1111

The first two bytes are A=64+32+8=104 and B=64+32+4+1=101, which gives:

(104*256)+101 = 26,725

We then divide by 45 and note the remainder:

26725 / 45 = 593 r 40 -> ‘+’
593 / 45 = 13 r 8 -> ‘8’
13 / 45 = 0 r 13 -> ‘D’

The result is:

h         e         l         l         o
0110 1000 0110 0101 0110 1100 0110 1100 0110 1111
Encoded: +8D VDL2
Decoded: hello
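The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation, and the `base45_encode` name is hypothetical. The article only describes the two-byte case; the handling of a single leftover byte (encoded as two characters, which is where the final “L2” comes from) follows the Base45 specification, RFC 9285.

```python
# Lookup table from the conversion table above: index = value, character = encoding.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def base45_encode(data: bytes) -> str:
    out = []
    # Full pairs of bytes [A, B] become three characters [C, D, E].
    for i in range(0, len(data) - 1, 2):
        n = data[i] * 256 + data[i + 1]
        n, c = divmod(n, 45)   # first remainder  -> C
        e, d = divmod(n, 45)   # second remainder -> D, final quotient -> E
        out += [ALPHABET[c], ALPHABET[d], ALPHABET[e]]
    # A single leftover byte becomes two characters [C, D].
    if len(data) % 2:
        d, c = divmod(data[-1], 45)
        out += [ALPHABET[c], ALPHABET[d]]
    return "".join(out)


print(base45_encode(b"hello"))  # +8D VDL2
```

For the first pair, “he”, the loop computes 26,725 and produces the remainders 40, 8 and 13, which map to “+8D” as in the working above.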

Normally, the program needs a few more characters in order to detect the base, so we will try “+8D VD82E59DV2FGECDZC$FF0$E”:

https://asecuritysite.com/coding/unbase?m=%2B8D%20VD82E59DV2FGECDZC%24FF0%24E

And we should get:

If we just try “+8D VDL2” on its own, it will not be detected correctly, as such a short string could also match other base formats.
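To check a Base45 string without the web page, the encoding can be reversed locally. The sketch below is a minimal decoder under the same assumptions as the lookup table above. The `base45_decode` name is hypothetical, and the two-character case for a final single byte follows RFC 9285 rather than anything stated in this article.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def base45_decode(text: str) -> bytes:
    # Map each character back to its value; str.index raises ValueError
    # for any character outside the Base45 alphabet.
    values = [ALPHABET.index(ch) for ch in text]
    out = bytearray()
    for i in range(0, len(values), 3):
        chunk = values[i:i + 3]
        if len(chunk) == 3:
            # [C, D, E] -> C + D*45 + E*45*45 -> two bytes [A, B]
            n = chunk[0] + chunk[1] * 45 + chunk[2] * 45 * 45
            if n > 0xFFFF:
                raise ValueError("group does not fit in two bytes")
            out += n.to_bytes(2, "big")
        elif len(chunk) == 2:
            # [C, D] -> C + D*45 -> one byte
            n = chunk[0] + chunk[1] * 45
            if n > 0xFF:
                raise ValueError("group does not fit in one byte")
            out.append(n)
        else:
            raise ValueError("a Base45 string cannot end in a single character")
    return bytes(out)


print(base45_decode("+8D VDL2"))                     # b'hello'
print(base45_decode("+8D VD82E59DV2FGECDZC$FF0$E"))  # b'hello how are you?'
```

Decoding successfully is not proof that the input really was Base45, since a short string may decode without error under several bases. That is one reason a longer sample gives a detector more to work with.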