Apart from sounding like a strange ’70s prog-rock band, the title of this posting actually has something to do with random numbers. As some of you are aware, the company I work for has produced a small USB device which we intend to be a source of random numbers for situations where a large number of IID random values are needed. However, rather than aiming at the megabits per second which some companies produce for around £400 or so, our goal is a device more than capable of around 16kbits per second for around £30 or so.

My question is related to the mixing of streams of random numbers. Our device has two hardware RNGs which produce bitstreams. I am already using Ueli M. Maurer’s “A Universal Statistical Test for Random Bit Generators” to estimate the entropy gathered by the RNGs. What I am trying to determine is whether it is reasonable and safe to use a hashing function such as SHA256 in the following way (a rough sketch in code follows the list):

  1. Assume the two RNG streams have had their entropy estimated (and derated slightly).
  2. Now create a hash state and feed it bytes of data from each of the two streams, summing the estimates of the entropy being fed into the hash as we go.
  3. Now, once the sum of estimated entropy reaches some threshold (perhaps 1.5 times the output size of the hash in bits), we finalise the state and emit the hash.
  4. Repeat from 2.
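For concreteness, here is a rough sketch of steps 2–4 in Python (the device itself would of course run something much leaner in C). The rng1/rng2 iterators and the per-byte entropy figures are placeholders standing in for the hardware bitstreams and the derated Maurer estimates; none of this is actual firmware.

    import hashlib

    HASH_BITS = 256
    THRESHOLD_BITS = 1.5 * HASH_BITS   # step 3: 1.5x the hash output size, i.e. 384 bits


    def pool_blocks(rng1, rng2, entropy1_per_byte, entropy2_per_byte):
        """Yield 32-byte SHA256 digests, each fed at least THRESHOLD_BITS of
        estimated input entropy.

        rng1 / rng2 are placeholders for the two hardware bitstreams, treated
        here as iterators yielding byte values (ints 0-255).  The per-byte
        entropy figures stand in for the derated Maurer estimates from step 1.
        """
        rng1, rng2 = iter(rng1), iter(rng2)
        while True:
            state = hashlib.sha256()            # step 2: fresh hash state
            gathered = 0.0                      # running sum of estimated entropy (bits)
            while gathered < THRESHOLD_BITS:
                state.update(bytes([next(rng1), next(rng2)]))  # a byte from each stream
                gathered += entropy1_per_byte + entropy2_per_byte
            yield state.digest()                # step 3: finalise and emit 256 bits
            # step 4: loop round and start a new state

Note that, per step 2, each emitted block starts from a brand-new hash state. With derated estimates of, say, six bits per byte from each stream, a block would consume 32 byte-pairs of raw input before being emitted.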

What I am trying to determine is whether it would then be reasonable to estimate the entropy in each emitted hash as being anywhere near the output size of the function, e.g. to assume that the 256 bits coming out of the SHA256 hash contain perhaps 200 bits of entropy.

Ultimately I am trying to come up with a way to process the streams (on a microcontroller, so nothing too onerous on CPU or RAM) such that the data coming off the device can confidently be treated as having a high number of shannons of entropy in every byte. As I stated above, my goal is 16kbits/second of entropy, and if that came in the form of 20kbits/second of data (2500 bytes/second) for which an assumption of at least six bits of entropy per byte was sound, then I would be very, very pleased indeed.
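As a back-of-envelope check on those numbers (this is just my own arithmetic on the figures above, using the 1.5× threshold from step 3):

    # Rough feasibility arithmetic for the 20kbits/second output target.
    OUTPUT_BYTES_PER_SEC = 2500                 # 20kbits/second of emitted data
    BLOCK_BYTES = 32                            # one SHA256 digest
    INPUT_ENTROPY_PER_BLOCK = 1.5 * 256         # 384 bits, the step-3 threshold

    blocks_per_sec = OUTPUT_BYTES_PER_SEC / BLOCK_BYTES          # 78.125 blocks/second
    raw_entropy_needed = blocks_per_sec * INPUT_ENTROPY_PER_BLOCK
    print(raw_entropy_needed)                   # 30000.0 -- bits/second of derated,
                                                # estimated entropy from both RNGs combined

In other words, the two hardware RNGs together would need to supply roughly 30kbits/second of derated, estimated entropy, i.e. about 15kbits/second each; and crediting 200 bits per 32-byte block works out at 6.25 bits per byte of output, just over the six-bit mark.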

Some of you familiar with the general area of randomness, and particularly with testing randomness, might have noticed that 2500 bytes (20,000 bits) is exactly the sample size over which the FIPS 140-1 and 140-2 statistical tests are run. My goal is to aim at as high a FIPS 140 rating as I can.
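For reference, the FIPS 140-1 and 140-2 power-up statistical tests (monobit, poker, runs and long-run) all operate on a single 20,000-bit sample, which is why 2500 bytes lines up so neatly. A minimal sketch of just the monobit test is below; the 9,725–10,275 pass bounds are the FIPS 140-2 ones as I recall them, so check them against the standard rather than trusting the snippet.

    def fips_monobit(sample: bytes) -> bool:
        """FIPS 140-2 monobit test over a single 20,000-bit (2500-byte) sample.

        Counts the 1 bits in the sample; the test passes if the count lies
        strictly between 9,725 and 10,275 (bounds as given in FIPS 140-2).
        """
        assert len(sample) == 2500, "the test is defined over exactly 20,000 bits"
        ones = sum(bin(byte).count("1") for byte in sample)
        return 9725 < ones < 10275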

If you think you can help, please do email me.
