What’s the difference between ‘A’ and ‘a’? … 1

My education around the Internet grew-up around the understanding of data communications, and then onto networking, and finally a core…

What’s the difference between ‘A’ and ‘a’? … 1

My education around the Internet grew-up around the understanding of data communications, and then onto networking, and finally a core understand of how the whole of the Internet worked. At the communications layer, you very quickly understand that there’s a lot of noise around, and where you were often sending digital pulses along lines that were designed for voice communications. The core concept was thus understanding the good part of the message: the signal, and the bad part of it: the noise. For this, I am always reminded of Peter Gabriel’s lyrics:

And in this place, can you reassure me
With a touch, a smile — while the cradle’s burning
All the while the world is turning to noise
Oh the more that it’s surrounding us
The more that it destroys
Turn up the signal
Wipe out the noise

At the core of this was Shannon theory, and who defined that the capacity of a channel varied with the ratio of the signal to the noise:

C = B log2 (1+S/N)

and where B is the bandwidth. So if the bandwidth was 100 MHz, and the signal to noise ratio is 10:1, we get:

C = 10e6 log2(1+10) = 34 Mbps

And so we have noisy channels, where the noise sometimes caused errors in our bits. And thus we implemented error detection methods where we could tell if at least of our bits was sent in error. A good example of this is CRC-32 [here]. But in many applications, we could not request a resend, so we created error correction codes — such as those used in CD-ROMs. A good example of this is Hamming codes [here] and Reed-Solomon [here].

Within the understanding of errors, we need understand how much we need to change our bit pattern to produce something else. A core concept of this is the Hamming Distance. For example the number of bits that must change between101 and 110 is two [here]. If the probability of changing one bit is 1 in 100, then the probably of changing two bits is 1 in 10,000.

The following uses a simple bit-shift operation and X-OR to implement the method, along with the import of a library (commpy) in order to check the answer:

import sys
import commpy 
from bitstring import BitArray

val1='0110'
val2='1110'

if (len(sys.argv)>1):
	val1=sys.argv[1]
if (len(sys.argv)>1):
	val2=sys.argv[2]

def hamming2(str1, str2):
	if  len(str1) <> len(str2):
		print "Strings differ is size"
		return 0
	a = int(str1, 2)
	b = int(str2, 2) 
	count= 0
	c = a ^ b
	while c:
        	count += 1
        	c &= c - 1 
	return count

print "Binary form:\t",val1," - ",val2
print "Decimal form:\t",int(val1,2)," - ",int(val2,2)
print ""

print "Hamming distance is ",hamming2(val1,val2)

v1=BitArray(bin=val1)
v2=BitArray(bin=val2)

print "Hamming distance is ",commpy.utilities.hamming_dist(v1,v2)

Try an example:

Distance between 111 and 101 gives a Hamming distance of 3. Calc
Distance between 10101010 and 011111111 gives a Hamming distance of 5. Calc

Conclusions

I started with a title of “What’s the difference between ‘A’ and ‘a’?. Well, ‘A’ is 0x41 (0100 0001) and ‘a’ is 0x61 (0110 0001) and where you can see that we only need to change one bit to get from ‘A’ to ‘a’. The Hamming distance is thus 1 [here].