What’s Your Favouriate Algorithm?

I was browsing Twitter when I saw this tweet and it got me thinking [here]:

What’s Your Favouriate Algorithm?

I was browsing Twitter when I saw this tweet and it got me thinking [here]:

For me, there are so many create algorithms, and from a cryptographic point-of-view, the RSA method sticks out as something that is so useful, but so powerful. But the one method that was continually mentioned in the replies is the mighty DCT (Discrete Cosine Transformation):

We take so many things for granted in this modern age. On our cameras, we create images using hi-res cameras, which we can easily upload. But, uncompressed, the files for a 20 Megapixel camera would be 50MB each, and if we recorded video at 30 frames per second (30 fps) it would be 90,000,000,000 Bytes (90TB) for every minute of video. So how do our cameras, DVD players and digital TV manage to cram so much information into such a small amount of data?

The magic all happens underneath, and where the compression system knows exactly what your eyes will see. Imagine the bandwidth that would be required for watching a football match on TV if it wasn’t for compression — and especially the DCT (Discrete Cosine Transform) method? One minute of our football match would eat up the bandwidth our the whole of your area.

Remember too that video is basically a whole lot of images being played back … a bit like a flick book that you played with as a child.

Your wonderful eyes

Your eyes are wonderful things and can process and make sense of complex things so much faster than a computer can. Overall our eyes are more sensitive to changes in luminosity (brightness) and less sensitive to changes of colour. Unfortunately most computer systems store colours with three bytes of Red, Blue and Green (24-bit colour for RGB), but this is not really a thing we can process, so we often convert into Y (Luminosity), U (Cb — “Blueness”) and V (Cr — “Redness”). A standard conversion is:

Y=0.299R+0.587G+0.114B
Cb=0.1687R–0.3313G+0.5B
Cr=0.5R–0.4187G+0.0813B

You can see that Green has a strong effect on the brightness, and Blue has the least effect. In compression we can thus reduce the resolution on the U and V elements, and make sure that Y has a high accuracy. We can also reduce some of the high-frequency changes. As we can see, the image can look fine when we view it, but ends up with pixel blocks when we zoom in:

And if we zoom in a bit more:

In DCT compression, as we use in JPEG and MPEG, we convert the pixels into pixel blocks (such as 8x8 blocks), and then convert each of these to spatial frequency elements. Often we see these pixel blocks when our images or video have problems. If you are interested this is the conversion:

and which gives us an array of 8x8. The lowest frequency component is F(0,0) and the highest frequency is F(7,7). We can easily lose some of the higher level components as your eye will not see these (unless we zoom in). If you are interested in how DCT works, the following shows some different examples of pixel blocks. The following shows an input from a block:

[[21 21 21 22 22 22 22 22]
[21 21 21 21 21 21 21 21]
[21 21 21 21 21 21 21 21]
[21 21 21 21 21 20 20 20]
[22 22 22 22 21 21 21 21]
[24 24 24 23 23 22 22 22]
[26 26 25 25 24 24 24 23]
[27 27 27 26 25 25 25 24]]

After taking the DCT we get:

[[179   3   0   0   0   0   0   0]
[-11 -3 0 0 0 0 0 0]
[ 7 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0]
..
[ 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0]]

Then after we quantize we get:

[[ 200.  0.  0.  0.  0.  0.  0.  0.]
[ -12. 0. 0. 0. 0. 0. 0. 0.]
[ 14. 0. 0. 0. 0. 0. 0. 0.]
..
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]]

And then we can take the inverse DCT to give:

[[25 25 25 25 25 25 25 25]
[24 24 24 24 24 24 24 24]
[23 23 23 23 23 23 23 23]
[22 22 22 22 22 22 22 22]
[23 23 23 23 23 23 23 23]
[25 25 25 25 25 25 25 25]
[28 28 28 28 28 28 28 28]
[29 29 29 29 29 29 29 29]]

The quantization process allows us to then divide each of the components by a given value. The larger the value, the more chance that the result will become zero.

In the following [here], we have used a fairly standard quantization matrix, which will divide F[0,0] by 16, and F[7,7] by 99 [view here]:

but if we now increase the divisor for F[0,0] we get [view here]:

where you can see that we have lost a low-frequency component can start to see some of the pixel blocks appearing.

Remember, in digital images and video, it is Y that is King, and U and V are just foot soldiers.

Conclusions

And, so, perhaps DCT is one of the most wonderful (and useful) algorithms ever produced. It has truly transformed our media world. What’s disappointing — perhaps — is that so many people take all this technology for granted, and forget about the decades of research work that when into those lovely HD video images.