Random Numbers and Computers

1. Intro

There are several scenarios in computer programming where we find ourselves in the need of random output; be it a word processor which displays a random quote from a set of quotes each time it starts, or a videogame in which a certain figure has to perform a task selected at random from a set of tasks. Random sequences are also required in encryption algorithms, and that is where it gets serious. In case a 'random sequence' used in encryption is not truly random and its terms are related to each other and are predictable, a security breach is likely to occur. The solution appears simple: Make a reliable algorithm that produces random numbers. But wait! The problem with random numbers is a different one and is not as simple as it appears.

A computer is a deterministic machine. It performs tasks based on instructions provided to it. The state change of each and every capacitor in our computer's memory is governed by the instructions given at a higher level (There are billions of capacitors in a RAM). So if nothing inside a computer happens at random, how can it produce a sequence of random numbers? How can there be an algorithm which produces random numbers when the word 'Algorithm' itself stands for 'a set of rules'?

By now, I think I have made my point. A computer can never produce truly random output. We need to integrate computers with physical processes to produce random output. Dice and coin flipping are some primitive example of physical random number generators. In today's world, a physical random number generator can be based on an essentially random atomic or subatomic physical phenomenon like nuclear decay. But we usually prefer to avoid using physical methods when it comes to computers mostly because these are expensive, hard to integrate with computer hardware and also tend to slow down the processing speed. And we cannot just ask every user to buy decaying Uranium-235!

So we are back at square one: Algorithms. We cannot write a computer code that generates truly random sequences, but we have managed to come up with algorithms which give us output which appears to be random over a certain range. These are called pseudo-random  number generators (PRNGs). The simple ones are less reliable, have a short range (the terms might start repeating after a certain interval; we will later know why) and are crackable (someone familiar with the algorithm might be able to predict the output), while the better ones (extremely complicated and used in cases where high apparent randomness is required) are highly unbreakable even for someone who knows how the algorithm works, and also have a significantly larger range. But it must never be forgotten that these generators are always crackable at some point and can never produce numbers that are truly random; in fact, older algorithms are being broken and better ones are being created even today.

In the upcoming sections, we will investigate some commonly used simple PRNGs used in common computer applications and also scratch the surface of the complex ones used in cryptography.

2. Random number generation in C/C++

You can skip this section if you are not a programmer.

C++ comes with a built in PRNG defined in the cstdlib library. It is reliable enough to be used in simple computer programs, but using the same generator while writing a program for the NSA will most probably get you fired. The implementations of the generator may vary from compiler to compiler, but they are usually  Linear Congruential Generators (see section 3 for congruential generators). This section is aimed only at learning to use the built in functionality effectively, not at understanding how it works.

The generator involves two functions rand() and srand(), both defined in the cstdlib header file.

int rand(void);

void srand(unsigned int seed);

The function rand() returns a random number between 0 and RAND_MAX (usually equal to 32767) based on a seed value provided to it through the srand() function. The seed is the initial input value that the algorithm needs to start with; it sets a starting point for the sequence of random numbers. The following two points must be kept in mind while using these functions:

  • The output sequence is directly dependent on the seed, although if the seed is not known, it is almost impossible to find a relation between the successive pseudo random numbers produced by rand(). Same seed will always lead to the same sequence and hence a different seed is required every time the application is executed.
  • Seeding should only done once. If the generator is reseeded before every call to rand(), then given the relations between the successive seeds, the respective outputs of the rand() function can be predicted.

The problem of getting a new seed value each time the program runs is solved by using the system time as the value of the seed. System time is the number of seconds elapsed since 1 January 1970 00:00:00 (this date is known as 'Epoch'). In C++, system time can be retrieved by using the time() function defined in the time.h header file. The function returns the data type time_t which is an int in most systems.

time_t time( time_t *second );

Finally, here is a demo program that generates five random numbers:

#include <iostream>
#include <cstdlib>
#include <time.h>
using namespace std;
int main()
{
    srand((unsigned int)time(NULL));
    for(int i = 0; i < 5 ; i++)
    {
        cout<<rand()<<"  ";
    }
    return 0;
}


The program was executed twice and produced the following output:

Execution 1: 32629  13582  1459  10534  402
Execution 2: 32711  20148  22080  22293  28890

In most cases, we need the numbers to fall inside a specific range. This task can be accomplished by using the following C++ expression:

num = MIN + rand()%(MAX-MIN+1);

Here, num will be a random number which falls between the numbers MIN and MAX (inclusive of both MIN and MAX).

3. Congruential Generators

[to be updated]

Comments

Popular posts from this blog

Setting up a first person camera in OpenGL to move around in your 3D world

Science with OpenGL | #1 | Superposition of two waves

A web based Vector Feild Plotter (2D) written in Javascript and HTML Canvas (With source code)