BASIC’s Rnd function – a peculiar sort of function that takes no argument and is different each time you use it – furnishes (pseudo) random numbers that are uniformly distributed between 0 and 1. [*] How can we obtain an analog of the Rnd function that furnishes random numbers with a different probability distribution?
Let X be the analog to Rnd with the desired probability distribution over its sample space from a to b, which could be –∞ to ∞, the whole real line.
A probability distribution can be specified by its “probability density” function, defined from a to b. Let f (x) be the desired probability density function. That means that for any x, the probability that X is within an interval of length Δx at x is f (x) times Δx,
The cumulative distribution can serve as the definition of the probability distribution of X without reference to its probability density, which is then the derivative of F,
The two functions, the probability density function f(x) and the cumulative distribution function F(x), go hand in hand, one determines the other, thus either one can be used to specify the probability distribution of a random variable. [****]
What are these two functions when X = Rnd? Rnd’s probability density is everywhere 1,
Now we return to the general case of the random variable X. Consider the inverse of the desired cumulative distribution function:
Thus we can reduce any continuous probability distribution to the uniform probability distribution.
In practice it helps if the given probability density is integrable in closed form using simple functions and that the integral is invertible likewise, but if not one can resort to numerical algorithms to integrate and invert. [*****]
We illustrate the technique with an example that can be worked out easily in closed form, the following probability density over the entire real line
As a BASIC program function (where Log is the natural logarithm):
Function Bell As Single Local r As Single r = Rnd Function = Log(r / (1 - r) End Function
Note that care must be taken to call Rnd only once though its value might be used more than once in F –1(x). Another way of handling this in the problem above is:
Function CDinverse(x As Single) As Single Function = Log(x / (1 - x) End Function Function Bell As Single Function = CDinverse(Rnd) End Function
In the example above, suppose we wanted Bell to be between certain limits. We could have the main program compute the cumulative distribution at the limit points, then have Bell reject Rnd outside those limits. But better, the rejection can be done before computing CDinverse. In the code below, Bell is limited to between – limit and + limit, where “limit” is some specified constant.
Global CDlowerlimit As Single, CDupperlimit As Single Function CD(x As Single) As Single 'called twice in Main Function = 1 / (1 + Exp(-x)) End Function Function Bell As Single Local r As Single Do r = Rnd If r >= CDlowerlimit And r <= CDupperlimit Then Exit Do Loop Function = Log(r / (1 - r)) 'CDinverse(r) End Function Function Main ... CDlowerlimit = CD(-limit) CDupperlimit = CD( limit) ... End Function
Rnd furnishes “uniformly distributed” numbers between 0 and 1. That is, uniformly in the long run. There will always be clustering. Suppose that you wanted to do something with N random numbers and afterwards you wanted the N numbers to really be uniformly distributed, that is dispersed over 0 to 1 with no clustering.
Rnd does not do this. We have no idea how it will turn out after N times except that in the long run, as N → ∞ the numbers will be more-or-less uniformly distributed. We want the distribution to be truly uniform for finite N.
Well, in that case there is nothing random about the end result; the numbers must be
All that is obvious for the uniform probability density but suppose we wanted to do the same thing with some other probability density?
The method described in the first section, of obtaining any random distribution from Rnd, provides a solution. It reduces the given probability density to Rnd, and we can do to Rnd what we just did above, then disperse the points according to the given probability density.
Function CDinverse(x As Single) As Single Function = Log(x / (1 - x)) 'ill-defined when x = 0 or 1 End Function ' Given: the array a(0 To %N - 1) ' Return: a random permutation of a() Sub Shuffle (a() As Long) Local i As Long For i = %N - 1 To 1 Step -1 Swap a(i), a(Rnd(0, i)) Next End Sub Function Main Local i As Long Dim index(%N - 1) As Long Dim X(%N - 1) As Single For i = 0 To %N - 1 index(i) = i 'pointers into X() ' adjust fraction to avoid CD = 0 and 1 ‘ otherwise could use i / (%N – 1) X(i) = CDinverse((i + 1) / (%N + 1)) Next Shuffle index() For i = 0 To %N - 1 ... X(index(i)) ... Next End Function