Lesson 1 in Secure Architecture and Development: Understanding Entropy

Cryptography is a complex subject and needs special care and attention — it needs a first class understand of both theory and practice. If…

Lesson 1 in Secure Architecture and Development: Understanding Entropy

Cryptography is a complex subject and needs special care and attention — it needs a first class understand of both theory and practice. If we have one without the other, we could create so many problems. I have done so many code reviews where the methods used to generate keys and artefacts significantly weaken the whole infrastructure (often to the point of trivial cracking).

In code reviews, the first on my list is to investigate the source of randomness and trace how this is done. If we can get at least 128 bits of key entropy (256 bits is preferred, of course), I can move on and test with as many random values as I can generate. I then look for the probability of collisions, and this can lead to nonce reuse and also highly lower entropy in the range of seeds produced.

Lesson 1: Understand entropy

So, lesson one for a secure developer is possibly the understanding of entropy, as the security of a 256-bit key — which is almost unbreakable — can be significantly reduced by a lack of a proper random seed value. Developers often just call up rand() without any real care about how the random values are generated.

I once did a review, where the seed was a static value in the program, and something like:

seed=9999999999999999999999999
r = rand(seed)
key = r.Next()

I asked why all the 9’s, and the answer was that it was a big number. And, then I asked about the values that the key would have. And the answer was that it would be random. And, then I asked to run the program, and show the values of the key, and guess what? Every sequence of keys was the same — every time we ran the program.

On going back, the developer discovered that the program always produced the same set of keys, and where we had the same key generated, every time the program was restarted.

But, the fundamental weaknesses here is that we are using security-by-obfuscation, and where an adversory can just probe the values to find the seed, and produce the same keys. Any, 99,999,999,999,999,999,999 is not a large number, and only has a key entropy of 66.4 bits:

>>> import math

>>> math.log10(99999999999999999999)/math.log10(2)
66.43856189774725

For 66.4 bits, it is with range of brute forcing the seed value, and once discovered, there is no need to discover it again. I would take less than a day to match the keys produced to the random seed value. A value more than 72-bits would be extremely costly, and probably more in the range of the NSA. If possible, we would always look for at least 128 bits of entropy, and nearer 256 bits is best.

Test, test and test

The other area of problems is where we generate the same random nonce value, as we can often crack the cipher when this occurs. Normally I ask for the random values to be generated in their core form (as an integer or hex value), and then look for the occurance of collisions. These can then be predicted using the Birthday attack method. And, so, this will give us a true measure the entropy, both by looking at collision, and also measuring the Shannon entropy of the values.

Conclusions

The most fundamental is making sure there is good entropy. Here’s how not to do it …