We use Base-58 in Bitcoin strings, with an encoding alphabet of '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz', from which we have removed '0' (zero), 'O' (capital o), 'I' (capital i) and 'l' (lowercase L).
Theory
We have all been in the position of not knowing whether a character is a zero ('0') or a capital 'O'. So how do we avoid this? To encode non-printing characters as printable ones we often use Base-64, where we take six bits at a time and convert each group to a Base-64 character.
But Base-64 still contains similar-looking characters: 0 (zero), O (capital o), I (capital i) and l (lowercase L), along with the non-alphanumeric characters + (plus) and / (slash). The solution is Base-58, where we remove these characters.
So we convert the ASCII characters into binary, treat the result as one large integer, and then keep dividing by 58, converting each remainder to a Base-58 character. The alphabet becomes:
'123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
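As a quick check on the alphabet above, the following sketch rebuilds it from the digits and letters by dropping the four look-alike characters. The variable names are illustrative only and do not come from any Bitcoin library.

```python
import string

# Start from digits, then uppercase, then lowercase (62 characters).
candidates = string.digits + string.ascii_uppercase + string.ascii_lowercase

# The four look-alike characters that Base-58 leaves out.
ambiguous = set('0OIl')

alphabet = ''.join(ch for ch in candidates if ch not in ambiguous)

print(alphabet)
print(len(alphabet))
```

Base-64 also uses '+' and '/', which never appear in the candidate set here, so 62 - 4 leaves the 58 characters.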
So let's take an example of 'e'. With 'e' we have a decimal value of 101, so we divide by 58 to get:
1 remainder 43
and next we divide 1 by 58 and we get:
0 remainder 1
So we take the characters at position 1 and at position 43 (counting from zero, with the last remainder first), to give:
123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz
which gives:
2k
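The division steps for 'e' can be traced with a short sketch. The helper name `base58_steps` is hypothetical and is used here only to record each quotient and remainder.

```python
ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def base58_steps(n):
    """Return the (quotient, remainder) pairs from repeatedly dividing n by 58."""
    steps = []
    while True:
        n, r = divmod(n, 58)
        steps.append((n, r))
        if n == 0:
            break
    return steps

steps = base58_steps(ord('e'))  # ord('e') is 101
for q, r in steps:
    print(q, 'remainder', r, '->', ALPHABET[r])

# The last remainder is the most significant digit, so read them in reverse.
encoded = ''.join(ALPHABET[r] for _, r in reversed(steps))
print(encoded)  # 2k
```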
If we now take 'ef', we get 25958 (102 + 101 * 256), where each character further to the left is moved up by one byte (multiplied by 256). Basically, we take the binary value of the whole string as a single integer, and then repeatedly divide by 58 and take the remainders. So 'ef' is '01100101 01100110' in binary.
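A minimal sketch of the 'ef' case follows, assuming Python 3 and using the built-in `int.from_bytes` to build the integer rather than summing powers of 256 by hand. The variable names are illustrative.

```python
ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

data = b'ef'

# Treat the two bytes as one big-endian integer: 101 * 256 + 102.
value = int.from_bytes(data, 'big')
print(value)                  # 25958
print(format(value, '016b'))  # 0110010101100110

# Repeatedly divide by 58, prepending the character for each remainder.
digits = ''
n = value
while n > 0:
    n, r = divmod(n, 58)
    digits = ALPHABET[r] + digits
print(digits)  # 8iZ
```

The remainders come out as 32, 41 and 7, which map to 'Z', 'i' and '8', read in reverse order.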
Code
An outline of the code is here:
# Code taken from https://github.com/bitcoin/bitcoin/blob/master/contrib/testgen/base58.py
# (adapted here to run under Python 3)

val = 'e'

# 58 character alphabet used
__b58chars = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
__b58base = len(__b58chars)
b58chars = __b58chars

def b58encode(v):
    """ encode v, which is a string of bytes, to base58. """
    # Accept a text string as well, converting it to bytes first
    if isinstance(v, str):
        v = v.encode('utf-8')
    # Build one large integer from the bytes (big-endian)
    long_value = 0
    for (i, c) in enumerate(v[::-1]):
        long_value += (256**i) * c
    # Repeatedly divide by 58, prepending the character for each remainder
    result = ''
    while long_value >= __b58base:
        div, mod = divmod(long_value, __b58base)
        result = __b58chars[mod] + result
        long_value = div
    result = __b58chars[long_value] + result
    # Bitcoin does a little leading-zero-compression:
    # leading 0-bytes in the input become leading-1s
    nPad = 0
    for c in v:
        if c == 0:
            nPad += 1
        else:
            break
    return (__b58chars[0]*nPad) + result

def b58decode(v, length = None):
    """ decode v into a string of len bytes """
    # Rebuild the large integer from the Base58 characters
    long_value = 0
    for i, c in enumerate(v[::-1]):
        pos = __b58chars.find(c)
        assert pos != -1
        long_value += pos * (__b58base**i)
    # Convert the integer back into bytes, most significant byte first
    result = bytes()
    while long_value >= 256:
        div, mod = divmod(long_value, 256)
        result = bytes([mod]) + result
        long_value = div
    result = bytes([long_value]) + result
    # Each leading '1' in the input stands for a leading zero byte
    nPad = 0
    for c in v:
        if c == __b58chars[0]:
            nPad += 1
            continue
        break
    result = bytes(nPad) + result
    if length is not None and len(result) != length:
        return None
    return result

print('Input:\t', val)
res = b58encode(val)
print('Base58:\t', res)
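To check that encoding and decoding round-trip, here is a compact Python 3 sketch. It is not the listing above: the function names `encode` and `decode` are hypothetical, it relies on `int.from_bytes` and `int.to_bytes`, and it may differ from the listing in edge cases such as an input made up entirely of zero bytes.

```python
ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def encode(data: bytes) -> str:
    """Encode bytes as Base58, mapping each leading zero byte to '1'."""
    n = int.from_bytes(data, 'big')
    out = ''
    while n > 0:
        n, r = divmod(n, 58)
        out = ALPHABET[r] + out
    pad = len(data) - len(data.lstrip(b'\x00'))  # count of leading zero bytes
    return ALPHABET[0] * pad + out

def decode(text: str) -> bytes:
    """Decode a Base58 string, mapping each leading '1' to a zero byte."""
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)  # raises ValueError on an invalid character
    pad = len(text) - len(text.lstrip(ALPHABET[0]))  # count of leading '1's
    body = n.to_bytes((n.bit_length() + 7) // 8, 'big')
    return b'\x00' * pad + body

print(encode(b'e'))           # 2k
print(encode(b'ef'))          # 8iZ
print(encode(b'\x00\x00ef'))  # 118iZ
print(decode('118iZ'))        # b'\x00\x00ef'
```

The leading '1' characters carry the zero bytes that would otherwise vanish when the input is turned into an integer.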