Get Any Probability Distribution from the Uniform Distribution

<< Algorithms

Any Probability Distribution From the Uniform Distribution
- Any Random Variable from Rnd -

BASIC’s Rnd function is a peculiar sort of function that takes no argument and is different each time you use it. You could say time itself is the hidden argument. It furnishes what appear to be random numbers that are uniformly distributed between 0 and 1. [*] How can we obtain an analog of the Rnd function that furnishes random numbers with a different, given, probability distribution?

Let X be the analog to Rnd with the desired probability distribution over its sample space from a to b, which could be –∞ to ∞, the whole real line.

A probability distribution can be specified by its “probability density” function, defined from a to b. Let f (x) be the desired probability density function. That means that for any x, the probability that X is within an interval of length Δx at x is f (x) times Δx,

P (x ≤ X ≤ x + Δx) = f (x) Δx

in the limit Δx → 0. To say the same thing without using an infinitesimal, consider this variable definite integral of the probability density, called the “cumulative distribution”:

F (x) = ∫_{_a}^{^x} f (t) dt
for x = a to b

F(x) goes from 0 to 1 as x goes from a to b, always increasing. [**] It is the probability that X is less than or equal to x,

F (x) = P (X ≤ x)

for x from a to b

The cumulative distribution can serve as the definition of the probability distribution of X without reference to its probability density, which is then the derivative of F,

f (x) = F^’(x) [***]

The two functions, the probability density function f(x) and the cumulative distribution function F(x), go hand in hand, one determines the other, thus either one can be used to specify the probability distribution of a random variable. [****]

What are these two functions when X = Rnd? Rnd’s probability density is everywhere 1,

f (x) = 1

for x = 0 to 1

and its cumulative distribution is linear,

F (x) = x

for x = 0 to 1

That second statement means

P (Rnd ≤ x) = x

a fact we will use later.

Now we return to the general case of the random variable X. Consider the inverse of the desired cumulative distribution function:

F ^–1(x)

where x = 0 to 1

We claim that if x is chosen according to the uniform probability density Rnd then F ^–1(x) will have the given cumulative distribution F(x), and hence its probability density will be the desired one as well. The proof is simple. Its cumulative distribution (see the second definition above) is:

P ( F ^–1(Rnd) ≤ x )

Apply F to both sides of the inequality inside P():

= P ( Rnd ≤ F (x) )

and that, tah dah, (see the discussion of Rnd above):

= F (x)

Thus we can reduce any continuous probability distribution to the uniform probability distribution.

In practice it helps if the given probability density is integrable in closed form using simple functions and that the integral is invertible likewise, but if not one can resort to numerical algorithms to integrate and invert. [*****]

We illustrate the technique with an example that can be worked out easily in closed form, the following probability density over the entire real line

f (x) = e^x / (1 + e^x) ^²

The reader can verify that its graph is a symmetrical bell-shaped curve,

f (–x) = f (x)
f (0) = ¼
f (x) → 0 as x → ± ∞

and that the corresponding cumulative distribution function,

F (x) = ∫_{_–∞}^{^x} f (t) dt

= 1 / (1 + e^–x)

(which shows that the integral of the given probability density over the entire sample space, F(∞), is 1 as it should be). So that finally

F ^–1(x) = ln ( x / (1 – x) )

As a BASIC program function (where Log is the natural logarithm):

Function Bell As Single
  Local r As Single
  r = Rnd
  Function = Log(r / (1 - r) 
End Function

Bell can by used just like Rnd and instead of its probability density being uniform it will be bell shaped per f (x) above.

Note that care must be taken to call Rnd only once though its value might be used more than once in F ^–1(x). Another way of handling this in the problem above is:

Function CDinverse(x As Single) As Single
  Function = Log(x / (1 - x)  
End Function

Function Bell As Single
  Function = CDinverse(Rnd)
End Function

-oOo-
Restricting X Within Limits

In the example above, suppose we wanted Bell to be between certain limits. We could have the main program compute the cumulative distribution at the limit points, then have Bell reject Rnd outside those limits. But better, the rejection can be done before computing CDinverse. In the code below, Bell is limited to between – limit and + limit, where “limit” is some specified constant.

Global CDlowerlimit As Single, CDupperlimit As Single

Function CD(x As Single) As Single   'called twice in Main
  Function = 1 / (1 + Exp(-x))
End Function

Function Bell As Single
  Local r As Single
  Do
    r = Rnd
    If r >= CDlowerlimit And r <= CDupperlimit Then Exit Do
  Loop
  Function = Log(r / (1 - r))        'CDinverse(r)
End Function   

Function Main
   ...
  CDlowerlimit = CD(-limit)
  CDupperlimit = CD( limit)
  ...
End Function

-oOo-
Truly Dispersed Distributions

Rnd furnishes “uniformly distributed” numbers between 0 and 1. That is, uniformly in the long run. There will almost always be clustering. Suppose that you wanted to do something with N random numbers and afterwards you wanted the N numbers to really be uniformly distributed, that is dispersed over 0 to 1 with no clustering.

Rnd does not do this. We have no idea how it will turn out after N times except that in the long run, as N → ∞ the numbers will be more-or-less uniformly distributed. We want the distribution to be truly uniform for finite N.

Well, in that case there is nothing random about the end result; the numbers must be

0, 1 / (N – 1), 2 / (N – 1), ..., 1

We could introduce some randomness along the way to that end by choosing a random permutation of the numbers.

All that is obvious for the uniform probability density but suppose we wanted to do the same thing with some other probability density?

The method described in the first section, of obtaining any random distribution from Rnd, provides a solution. It reduces the given probability density to Rnd, and we can do to Rnd what we just did above, then disperse the points according to the given probability density.

Function CDinverse(x As Single) As Single
  Function = Log(x / (1 - x))    'ill-defined when x = 0 or 1
End Function

' Given:  the array a(0 To %N - 1)
' Return: a random permutation of a()
Sub Shuffle (a() As Long)
  Local i As Long
  For i = %N - 1 To 1 Step -1
    Swap a(i), a(Rnd(0, i))
  Next
End Sub

Function Main
  Local i As Long
  Dim index(%N - 1) As Long
  Dim X(%N - 1) As Single
                             
  For i = 0 To %N - 1
    index(i) = i                'pointers into X() 
    ' adjust fraction to avoid CD = 0 and 1
    ‘ otherwise could use  i / (%N – 1)
    X(i) = CDinverse((i + 1) / (%N + 1))    
  Next
  
  Shuffle index()

  For i = 0 To %N - 1
    ... X(index(i)) ...
  Next
End Function

* That is, the probability of a number being in any interval doesn’t depend on the location of the interval. We have to express the uniformity in terms of intervals instead of points because the probability of any point in a continuum sample space is zero whatever the probability density function. (We could consider densities involving the Dirac delta function to get around that.)

** Or at any rate never decreasing. However we will assume that the probability density f(x) is always positive so that F(x) is indeed always increasing, and consequently invertible.

*** The cumulative distribution can be used to determine the probability of any interval in the sample space.

P(x₀ ≤ X ≤ x₁) = F(x₁) – F(x₀)

Intervals and unions of intervals in the sample space are called “events.”

By the way, we can write either < or ≤, it doesn’t matter, because the probability that X equals any one point is zero. In a continuum only intervals can have positive probability. (Again, using the Dirac delta function can get around that restriction, in which case it would matter.)

**** Any function can be a probability density if it is everywhere non-negative and its integral over the whole sample space a to b is one. Any function can be a cumulative distribution if it is everwhere non-decreasing and is zero at a and one at b.

***** The inverse cumulative distribution function is called the quantile function.