One of the most popular pseudorandom number generators (PRNGs) is the Mersenne Twister. It was created by Makoto Matsumoto and Takuji Nishimura in 1997 and it derives its name from its period length - which the number of values before the sequence repeats - and which is a Mersenne prime. This page contains on the most commonly used version of the Mersenne Twister algorithm is MT19937 and generates a 32-bit word length (using the Mersenne prime of \(2^{19937}−1]\). The method is named after Marin Mersenne and who was a 17th Century French Minim friar. His main contribution included his investigations of prime numbers, along with being seen as the father of acoustics:
Mersenne Twister |
Theory
Mersenne Twister is used extensively within many applications including mt_rand() in PHP and random.random() in Python. A Mersenne prime - named after Marin Mersenne - is defined as a prime number that is one less than a power of two. The general form is \(M_n = 2^n − 1\) where n is an integer.
The largest found is \(2^{77232917}-1\), and which was discovered by Jonathan Paceon on Boxing Day in 2017. Since 1997, the Great Internet Mersenne Prime Search (GIMPS) distributed system has been used to find new Mersenne primes.
We can try prime numbers (2,3,5,7,11...) to see if \(2^{p-1}-1\) is also prime:
- p=2 (\(2^2-1\) = 3 which IS prime)
- p=3 (\(2^3-1\) = 7 which IS prime)
- p=5 (\(2^5-1\) = 31 which IS prime)
- p=7 (\(2^7-1\) = 127 which IS prime)
- p=11 (\(2^{11}-1\) = 2047 is NOT prime). Ignore, p=11!
The sequence becomes:
2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127, 521, 607, 1279, 2203, 2281, 3217, 4253, 4423, 9689, 9941, 11213, 19937, 21701, 23209, 44497, 86243, 110503, 132049, 216091, 756839, 859433, 1257787, 1398269, 2976221, 3021377, 6972593, 13466917, 20996011, 24036583, 25964951, 30402457, 32582657, 37156667,...
A typical value used is 19,937.
The basic parameters are:
- w: word size (in number of bits)
- n: degree of recurrence
- m: middle word, an offset used in the recurrence relation defining the series x, 1 ≤ m < n
- r: separation point of one word, or the number of bits of the lower bitmask, 0 ≤ r ≤ w - 1
- a: coefficients of the rational normal form twist matrix
- b, c: TGFSR(R) tempering bitmasks
- s, t: TGFSR(R) tempering bit shifts
- u, d, l: additional Mersenne Twister tempering bit shifts/masks
and where \(2^{nw − r} − 1\) is a Mersenne prime.
The details of the maths behind the method is [here]
The coefficients for MT19937 are:
(w, n, m, r) = (32, 624, 397, 31) a = 9908B0DF16 (u, d) = (11, FFFFFFFF16) (s, b) = (7, 9D2C568016) (t, c) = (15, EFC6000016) l = 18
The following code is taken from [here]:
class mersenne_rng(object): def __init__(self, seed = 5489): self.state = [0]*624 self.f = 1812433253 self.m = 397 self.u = 11 self.s = 7 self.b = 0x9D2C5680 self.t = 15 self.c = 0xEFC60000 self.l = 18 self.index = 624 self.lower_mask = (1<<31)-1 self.upper_mask=1<<31 # update state self.state[0] = seed for i in range(1,624): self.state[i] = self.int_32(self.f*(self.state[i-1]^(self.state[i-1]>>30)) + i) def twist(self): for i in range(624): temp = self.int_32((self.state[i]&self.upper_mask)+(self.state[(i+1)%624]&self.lower_mask)) temp_shift = temp>>1 if temp%2 != 0: temp_shift = temp_shift^0x9908b0df self.state[i] = self.state[(i+self.m)%624]^temp_shift self.index = 0 def get_random_number(self): if self.index >= 624: self.twist() y = self.state[self.index] y = y^(y>>self.u) y = y^((y<>self.l) self.index+=1 return self.int_32(y) def int_32(self, number): return int(0xFFFFFFFF & number) import sys seed=123 if (len(sys.argv)>1): seed=int(sys.argv[1]) if __name__ == "__main__": rng = mersenne_rng(seed) print ("Seed:\t",seed) for i in range(20): print (rng.get_random_number())
PRNGs are deterministic and periodic. With the periodic nature, we repeat the random sequence for the length of all the possible outputs. So if you are watching a slot machine, you might guest that the sequence for the output will eventually repeat. With the Mersenne Twister the period is a massive 2^{19937}-1, which means that it is unlikely to repeat in a reasonable time frame. An important element is thus the seed value, as, if this can be guessed, it may be possible to replicate the whole sequence.
Another possible attack is on the state machine for generating the sequence, as a PRNG is deterministic. If some of these values are discovered, it may be possible to recreate others from it.
Cryptographically Secure PRNG (CSPNG) should never repeat its sequence (non-periodic), and uses randomness based on random events (such as the time between system interrupts). Examples of CSPNG include /dev/urandom and CryptGenRandom. A CSPNG should also be non-deterministic. An attack on Twister is defined [here]