Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics, data analysis and chemometrics......click here for more.


Random Number Generators

In classical statistical textbooks random numbers are created by drawing numbered bullets out of a box containing a known number of them. While this procedure may be feasible with a small set of bullets, it becomes increasingly unmanageable with an increasing number of bullets. Apart from this, there is another issue which is commonly overlooked: it is doubtful whether drawing bullets out of a large box is really a random process with equal chances for all bullets.

For this and other reasons, random number generators have been computerized. In fact, any higher level programming language offers at least one form of random number generator. The generation of random numbers, however, is not an easy task for a computer, since the computer is a deterministic machine with no built-in randomness. Thus it is impossible to create true random numbers without any additional hardware.

What can be done, is to create pseudo random numbers which behave almost like random numbers but which are repeated after a fixed (mostly quite long) period. These pseudo random numbers are generated by linear congruential generators (LCG). The principle of an LCG is quite simple: a new pseudo random number is generated on the basis of the previous random number by adding a certain offset and wrapping the result if it exceeds a certain limit. The process can be denoted by the following equation:

xi = (a + bxi-1) mod c

[The mod operator calculates the remainder of the division of (a+bxi-1)/c.]

Pseudo random numbers as generated by an LCG have both advantages and disadvantages:
 

    fast calculation: the calculation can be performed with a few integer arithmetic instructions. This is the reason why LCGs are almost universally used in computers.
    "random" numbers can be repeated exactly. This can be of a great advantage when looking for a bug in some calculations which rely on random numbers.
    the lower (rightmost) bits of the random numbers may not be random at all. In general, you should never use parts of a pseudo random number as another random number.
    some bit combination may never occur. If the constants a,b, and c in the formula above are not carefully chosen, the random numbers may not cover the whole range of possible numbers.
    as the sequence of pseudo random numbers is fixed, there is a correlation among sequential random numbers. This effect can prove to be a major problem if one uses random numbers to create points in a k-dimensional space (as with Monte Carlo methods). The points will not fill up the space but will line up on n-dimensional hyperplanes. The number of hyperplanes is roughly the kth root of the constant c.


If you are interested in having a look at some of these aspects, start the following  interactive example .