We can analyse encoded input values using a range of methods. This page computes the top matches for a given input string.
Cracking Codes - Detecting the Encoding Method
Theory
If you are into cybersecurity, you should know about the hexadecimal and Base64 formats. With hex, we have a character set of [0-9A-F]. But we also have many other Base character sets, such as Base58 for Bitcoin, and Base45 for the EU Green Passport.
Base2
For “fred”, we can represent each ASCII character as eight bits, which gives:
01100110 01110010 01100101 01100100
f        r        e        d
This is actually a Base-2 form.
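As a minimal sketch, the bit pattern above can be reproduced with the Python standard library. The helper name `to_base2` is hypothetical and is only used here for illustration:

```python
def to_base2(text: str) -> str:
    """Return each ASCII character as eight bits, separated by spaces."""
    return " ".join(format(byte, "08b") for byte in text.encode("ascii"))


print(to_base2("fred"))  # 01100110 01110010 01100101 01100100
```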
Base16
Base2 takes up too many characters, so we often group the bits into fours and map each group to its equivalent hex character. This then gives us:
0110 0110 0111 0010 0110 0101 0110 0100
6    6    7    2    6    5    6    4
The Base16 form of “fred” is thus “66726564”, and the Base16 form of “help” is “68656C70”. Here is an example of the conversion:
Figure 2: Conversion to hex
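The grouping into four-bit nibbles can be sketched in Python as follows. The function names `nibbles` and `to_base16` are hypothetical; the hex conversion itself uses the built-in `bytes.hex()` method:

```python
def nibbles(text: str) -> str:
    """Split the bit stream of an ASCII string into 4-bit groups."""
    bits = "".join(format(byte, "08b") for byte in text.encode("ascii"))
    return " ".join(bits[i:i + 4] for i in range(0, len(bits), 4))


def to_base16(text: str) -> str:
    """Return the upper-case hex form of an ASCII string."""
    return text.encode("ascii").hex().upper()


print(nibbles("fred"))    # 0110 0110 0111 0010 0110 0101 0110 0100
print(to_base16("fred"))  # 66726564
print(to_base16("help"))  # 68656C70
```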
Base-64
Another common format is Base64, which uses a character set of “[A-Za-z0-9+/=]”. With “help” we have:
01101000 01100101 01101100 01110000
h        e        l        p
011010 000110 010101 101100 011100 00
a      G      V      s      c      A      =  =
The final group (00) is filled with zero bits to give 000000, which maps to “A”. We also need the number of output characters to be a multiple of four, so we pad the end of the Base64 string with “=”. Thus “help” is “aGVscA==” in Base64, and “fred” is “ZnJlZA==”. The Base64 mapping is:
Figure: Conversion to Base-64
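The 6-bit grouping can be sketched by hand in Python and checked against the standard `base64` module. The names `ALPHABET` and `to_base64` are hypothetical, and this is an illustrative sketch rather than a replacement for the library function:

```python
import base64
import string

# Standard Base64 alphabet: index 0 is 'A', index 63 is '/'
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def to_base64(data: bytes) -> str:
    """Encode bytes by splitting the bit stream into 6-bit groups."""
    bits = "".join(format(byte, "08b") for byte in data)
    # Fill the final group with zero bits so that it is six bits long
    bits += "0" * (-len(bits) % 6)
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Pad the output with '=' up to a multiple of four characters
    return "".join(chars) + "=" * (-len(chars) % 4)


print(to_base64(b"help"))  # aGVscA==
print(to_base64(b"fred"))  # ZnJlZA==

# Cross-check against the standard library
assert to_base64(b"fred") == base64.b64encode(b"fred").decode("ascii")
```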
Base58
Base58 is used in Bitcoin, where we have the character set of [1-9A-HJ-NP-Za-km-z]. This was created to get rid of the characters that could be misread in a Bitcoin wallet address. These are ‘0’, ‘I’, ‘O’, and ‘l’. An example is:
Input: fred
Type: base58
Coding: 3ctAMq
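A minimal sketch of Base58 encoding treats the input bytes as one large integer and repeatedly divides by 58. The names `B58` and `to_base58` are hypothetical, and the handling of leading zero bytes follows the Bitcoin convention, which is an assumption beyond what is described above:

```python
# Bitcoin Base58 alphabet: no '0', 'I', 'O' or 'l'
B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def to_base58(data: bytes) -> str:
    """Encode bytes as Base58 by repeated division of the integer value."""
    value = int.from_bytes(data, "big")
    out = ""
    while value > 0:
        value, remainder = divmod(value, 58)
        out = B58[remainder] + out
    # Assumed Bitcoin-style rule: each leading zero byte becomes a leading '1'
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + out


print(to_base58(b"fred"))  # 3ctAMq
```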
Base45
Base45 format is used in applications such as QR codes within vaccination passports. With this we take two bytes at a time [A, B] and then determine the values of [C, D and E] for: (A×256)+B=C+(D×45)+(E×45×45). For this we compute (A×256)+B, then repeatedly divide by 45 and note the remainders, which give C, D and E in turn. We then have a lookup table that maps each remainder value to a character.
An example is:
Input: test
Type: base45
Coding: 7WE QE
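The divide-by-45 procedure can be sketched in Python as below. The names `B45` and `to_base45` are hypothetical. The rule for a single trailing byte (two output characters) is taken from the Base45 specification and is not described in the text above:

```python
# Base45 lookup table: index 0 is '0', index 44 is ':'
B45 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def to_base45(data: bytes) -> str:
    """Encode bytes as Base45, two bytes at a time."""
    out = []
    for i in range(0, len(data) - 1, 2):
        n = data[i] * 256 + data[i + 1]   # (A x 256) + B
        n, c = divmod(n, 45)              # first remainder gives C
        e, d = divmod(n, 45)              # second remainder gives D, quotient is E
        out += [B45[c], B45[d], B45[e]]
    if len(data) % 2:
        # Assumed rule for an odd final byte: two characters only
        d, c = divmod(data[-1], 45)
        out += [B45[c], B45[d]]
    return "".join(out)


print(to_base45(b"test"))  # 7WE QE
```

For the first pair of “test”, (116×256)+101 = 29797, which gives C=7, D=32 and E=14, and these look up to “7”, “W” and “E”.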
Some character sets
Here are some character sets for a few others:
Base2 [01]
Base3 [123]
Base5 [01234]
Base10 [0123456789]
Base26 [A-Z]
Base32 [A-Z2-7=]
Base45 [0-9A-Z $%*+-./:]
Base58 (bitcoin) [1-9A-HJ-NP-Za-km-z]
Base62 [0-9A-Za-z]
Base64 [A-Za-z0-9+/=]
Base67 [A-Za-z0-9-.!~_]
Base85 (Ascii85) [!"#$%&'()*+,-./0-9:;<=>?@A-Z[\]^_`a-u]
Base91 [A-Za-z0-9!#$%&()*+,./:;<=>?@[]^_`{|}~"]
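A rough first step toward detecting an encoding is to check which of these character sets can contain every character of the input. The sketch below is an assumption about how such a filter might look, not the method this page uses. `CHARSETS` and `candidate_bases` are hypothetical names, and only a few of the listed sets are included:

```python
import string

# A subset of the character sets listed above
CHARSETS = {
    "base2": "01",
    "base10": string.digits,
    "base16": string.digits + "ABCDEF",
    "base32": string.ascii_uppercase + "234567=",
    "base45": string.digits + string.ascii_uppercase + " $%*+-./:",
    "base58": "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz",
    "base62": string.digits + string.ascii_letters,
    "base64": string.ascii_letters + string.digits + "+/=",
}


def candidate_bases(text: str) -> list:
    """Return the bases whose character set covers the text, smallest set first."""
    symbols = set(text)
    matches = [name for name, chars in CHARSETS.items() if symbols <= set(chars)]
    # A smaller alphabet that still fits is treated as a tighter match
    return sorted(matches, key=lambda name: len(CHARSETS[name]))


print(candidate_bases("ZnJlZA=="))  # ['base64']
print(candidate_bases("7WE QE"))    # ['base45']
print(candidate_bases("3ctAMq"))    # ['base58', 'base62', 'base64']
```

A character-set test only narrows the candidates. A string such as “66726564” fits several sets, so a real detector would also need to try decoding and score the result.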
Coding
The code is:
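import sys
import codext

# Defaults, overridden by command-line arguments when given
encoding = "base2"
message = "Testing"

if len(sys.argv) > 1:
    encoding = sys.argv[1]
if len(sys.argv) > 2:
    message = sys.argv[2]

print("Message:\t", message)
print("Type:\t\t", encoding)

# Avoid naming variables "type" or "str", which would shadow Python built-ins
coded = codext.encode(message, encoding)
print("Coding:\t", coded)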