Anyone with any programming experience understands that computers are deterministic machines. If you provide the same input, you’ll always get the same output. That’s why having computers generate something by chance is trickier than it may seem.
Computers use random numbers for everything from cryptography to gambling, generative algorithms, video games, and more. However, computers are inherently incapable of being random. Instead, programmers rely on pseudorandom number generators (PRNGs). These are simply a category of algorithms that programmatically generate new random numbers from a given starting value called the seed.
These algorithms are not without their own limitations. Since the random numbers are programmatically generated, if someone were able to identify the seed value and the PRNG algorithm you were using, they’d be able to predict the next random number in the sequence. This would allow an attacker to break encryption, predict the next playing card in a sequence, cheat in a video game, etc.
Despite this concern, PRNGs are extremely useful in situations involving modeling and simulations as it allows you to “replay” a series of random events by initializing your random number generator with the same seed.
In situations where the randomness of the random numbers is critical, we use a “true” random number generator (TRNGs). Unlike PRNGs that have an arbitrary seed value, TRNGs pick a seed value from their environment / external data.
Here are a few potential options:
- Mouse movements
- Fan noise
- Atmospheric pressure
- Number of microseconds since the last whole second
We just need to pick a seed that an attacker wouldn’t be able to predict. This seed value will then be passed into an algorithm, similar to PRNGs, that will generate a random number to use.
The use case will generally dictate whether a PRNG will suffice or if a “true” RNG is needed. Regardless, it’s important to understand the practical differences between both approaches.
PRNGs are faster than TRNGs and their determinism is extremely useful in cases where you want to replay a series of “random” events. Additionally, some PRNGs are periodic in nature, but modern PRNGs with the right initialization parameters have a period long enough that it’s not a major concern. Conversely, TRNGs are slower than PRNGs, are non-deterministic, and are not periodic.
Linear Congruential Generator
Let’s take a look at implementing a simple PRNG. We’ll implement a variant called the linear congruential generator (LCG) algorithm. LCG was previously one of the most commonly used and studied PRNGs (more info).
Here’s the recurrence relation for LCG:
The Wikipedia page on LCG documents a few commonly used values for modulus, multiplier, and increment. There’s no consensus on the best values to use hence the differing values across implementations.
We have to be mindful of what values we use for these parameters. Choosing the wrong values can create a period that is too short which would render our random number generator useless.
In the image below, you can see that small changes to our parameters can greatly impact the period length.
For our implementation, we’ll use the values documented in previous standards of the C languages (C90/C99/ANSI C, and C11).
a = 1103515245
m = 2³¹
c = 12345
Whatever PRNG algorithm you choose should result in a uniform distribution of random numbers and a sufficiently long period.
Here’s a simple implementation in Swift:
Simulating Dice Rolls
Let’s say you wanted to simulate a dice roll.
It might seem reasonable to change the modulus to 6, but this would create a period far too short to be usable. We need to stick with well-chosen and tested values for our parameters.
Instead, using the approach in the code above, we can simulate 40,000 dice rolls:
Dice Value: 40,000 Roll Simulations
Looking at the results, we can see that it is indeed a uniform distribution of values.
Generating Random Numbers In A Range
Next, let’s consider generating random numbers that fall in a range. Again, we shouldn’t change our parameters without fully understanding how it affects the period.
Instead, we should map our PRNG’s results to values in our desired range.
After a million simulations across the specified range [30, 80)
If you’re interested in a more modern PRNG, I’d recommend exploring the Mersenne-Twister approach. It’s currently the most popular PRNG algorithm and is currently used in Python (numpy), Ruby, PHP, R, and C++. This was meant to be a high-level introduction to this topic. If you’re interested in learning more about this topic, here are a few other PRNGs to consider.
Hope you enjoyed this article! Feel free to check out my other articles below!