Continuous Distributions

Uniform Distribution

X is said to have a uniform distribution on the interval [A,B] (X~U[A,B]) if it has density
f(x) = 1/(B-A) for A<x<B, and f(x) = 0 otherwise.

Exponential Distribution

X is said to have an exponential distribution with rate λ (X~E(λ)) if it has density
f(x) = λexp(-λx) for x>0, and f(x) = 0 otherwise.

We have previously talked about the memoryless property, and the fact that among discrete distributions on N it is unique to the geometric rv. Now we have

Theorem
X has an exponential distribution iff X is a positive continuous r.v. and P(X>s+t | X>s) = P(X>t) for all s,t > 0.
Proof:
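Assume X~E(λ). Then P(X>x) = exp(-λx) for x>0, and so
P(X>s+t | X>s) = P(X>s+t, X>s)/P(X>s) = P(X>s+t)/P(X>s) = exp(-λ(s+t))/exp(-λs) = exp(-λt) = P(X>t)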

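On the other hand assume X is continuous with density f and P(X>s+t | X>s) = P(X>t) for all s,t > 0. In the proof above we saw that this implies P(X>s+t)=P(X>s)*P(X>t). Let h(x) = P(X>x) and let ε>0. Note h(0) = P(X>0) = 1 because X is positive. Then
h(x+ε) - h(x) = h(x)h(ε) - h(x) = h(x)(h(ε) - h(0))
Dividing by ε and letting ε→0 gives h'(x) = h'(0)h(x), and because h(0)=1 this means h(x) = exp(h'(0)x). Now h is decreasing with h(x)→0 as x→∞, so h'(0)<0. Setting b = -h'(0) we find P(X>x) = exp(-bx) for x>0,

and so we see X~E(b)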
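
The memoryless property is easy to check numerically in R. The following is a minimal sketch; the values of lambda, s0 and t0 are arbitrary choices for illustration and do not come from the notes.

```r
# Memoryless property of the exponential: P(X>s+t | X>s) = P(X>t)
lambda <- 1.5   # rate (arbitrary illustration value)
s0 <- 0.7
t0 <- 1.2
# conditional probability P(X>s0+t0 | X>s0) = P(X>s0+t0) / P(X>s0)
lhs <- pexp(s0 + t0, rate = lambda, lower.tail = FALSE) /
  pexp(s0, rate = lambda, lower.tail = FALSE)
# unconditional probability P(X>t0)
rhs <- pexp(t0, rate = lambda, lower.tail = FALSE)
print(c(lhs = lhs, rhs = rhs))
memoryless_ok <- isTRUE(all.equal(lhs, rhs))
```

Both printed numbers should agree up to rounding error.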

The Gamma Distribution

Recall the gamma function:
Γ(α) = ∫_0^∞ t^(α-1) exp(-t) dt for α>0

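The gamma function is famous for many things, among them the relationship Γ(α+1) = αΓ(α), which follows from integration by parts:
Γ(α+1) = ∫_0^∞ t^α exp(-t) dt = [-t^α exp(-t)]_0^∞ + α∫_0^∞ t^(α-1) exp(-t) dt = 0 + αΓ(α)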

Together with Γ(1) = 1 this implies Γ(n) = (n-1)! for positive integers n, so the gamma function is a continuous version of the factorial. The gamma function has many other interesting properties, for example Γ(1/2) = √π
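
These properties of the gamma function can be checked with the built-in R functions gamma and factorial. This is only a small sketch; the value a = 2.7 is an arbitrary choice.

```r
# gamma function versus factorial: gamma(n) = (n-1)!
n <- 1:8
factorial_ok <- isTRUE(all.equal(gamma(n), factorial(n - 1)))

# recursion gamma(a+1) = a*gamma(a) for a non-integer a
a <- 2.7
recursion_ok <- isTRUE(all.equal(gamma(a + 1), a * gamma(a)))

# gamma(1/2) = sqrt(pi)
half_ok <- isTRUE(all.equal(gamma(0.5), sqrt(pi)))

# the integral definition, evaluated numerically
gamma_int <- integrate(function(t) t^(a - 1) * exp(-t),
                       lower = 0, upper = Inf)$value
print(c(integral = gamma_int, builtin = gamma(a)))
print(c(factorial_ok, recursion_ok, half_ok))
```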

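Now X is said to have a gamma distribution with parameters (α,β) (X~Γ(α,β)) if it has density
f(x) = x^(α-1) exp(-x/β) / (Γ(α)β^α) for x>0, and f(x) = 0 otherwise. Here α>0 and β>0.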

By definition we have X>0, and so the Gamma is the basic example of a r.v. on (0,∞), or a little more generally (using a change of variables) on any open half-infinite interval.

show_gamma shows that for different values of alpha the density has different shapes and for different values of beta it has the same shape but different scales.

Note if X~Γ(1,β) then X~E(1/β).

Another important special case is if X~Γ(n/2,2), then X is called a Chi-square r.v. with n degrees of freedom, denoted by X~χ²(n)
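
Both special cases can be checked in R by comparing distribution functions. In this sketch the second gamma parameter is passed through the scale argument of pgamma, and the values of x, b and n are arbitrary illustration values.

```r
x <- c(0.1, 0.5, 1, 2, 5)

# gamma with first parameter 1 and scale b versus exponential with rate 1/b
b <- 2.5
expo_ok <- isTRUE(all.equal(pgamma(x, shape = 1, scale = b),
                            pexp(x, rate = 1 / b)))

# gamma with parameters (n/2, 2) versus chi-square with n degrees of freedom
n <- 7
chisq_ok <- isTRUE(all.equal(pgamma(x, shape = n / 2, scale = 2),
                             pchisq(x, df = n)))
print(c(expo_ok = expo_ok, chisq_ok = chisq_ok))
```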

There is an important connection between the gamma and the Poisson distributions:

Theorem if X~Γ(n,β) for a positive integer n and Y~P(x/β), that is Y has a Poisson distribution with rate x/β, then
P(X≤x) = P(Y≥n)
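proof
For n=1 we have P(X≤x) = 1-exp(-x/β) = 1-P(Y=0) = P(Y≥1). For n>1 write X_n~Γ(n,β), then integration by parts gives
P(X_n≤x) = ∫_0^x t^(n-1) exp(-t/β) / ((n-1)!β^n) dt = -(x/β)^(n-1) exp(-x/β)/(n-1)! + ∫_0^x t^(n-2) exp(-t/β) / ((n-2)!β^(n-1)) dt = P(X_(n-1)≤x) - P(Y=n-1)

and the theorem follows by induction. Here is an R check with n=4, β=1/2.3 and x=0.75, but note that R uses pgamma(x,α,1/β), that is its third argument is the rate 1/β:
pgamma(0.75,4,2.3) = 0.09696469
1-ppois(4-1,0.75*2.3) = 0.09696469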

The Beta Distribution

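X is said to have a beta distribution with parameters α and β (X~Beta(α,β)) if it has density
f(x) = Γ(α+β)/(Γ(α)Γ(β)) x^(α-1) (1-x)^(β-1) for 0<x<1, and f(x) = 0 otherwise. Here α>0 and β>0.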

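It is easy to calculate the moments of a beta distribution, using the fact that every beta density integrates to 1:
E[X^k] = Γ(α+β)/(Γ(α)Γ(β)) ∫_0^1 x^(α+k-1) (1-x)^(β-1) dx = Γ(α+β)Γ(α+k)/(Γ(α)Γ(α+β+k))
In particular E[X] = α/(α+β).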

By definition we have 0<X<1, and so the Beta is the basic example of a r.v. on (0,1), or a little more generally (using a change of variables) on any open finite interval. Various examples are shown by show_beta()

Special case: Beta(1,1) = U[0,1]
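
Here is a short R sketch that checks the beta density, its mean and the special case Beta(1,1) numerically. It assumes the usual form of the beta density with normalizing constant Γ(α+β)/(Γ(α)Γ(β)) and the mean α/(α+β); the values a = 2 and b = 5 are arbitrary.

```r
a <- 2
b <- 5
x <- seq(0.05, 0.95, by = 0.15)

# density written out with gamma functions versus the built-in dbeta
dens_manual <- gamma(a + b) / (gamma(a) * gamma(b)) *
  x^(a - 1) * (1 - x)^(b - 1)
dens_ok <- isTRUE(all.equal(dens_manual, dbeta(x, a, b)))

# mean by numerical integration versus a/(a+b)
mean_num <- integrate(function(u) u * dbeta(u, a, b), 0, 1)$value
print(c(numerical = mean_num, formula = a / (a + b)))

# Beta(1,1) is the uniform on the unit interval
unif_ok <- isTRUE(all.equal(pbeta(x, 1, 1), punif(x)))
print(c(dens_ok = dens_ok, unif_ok = unif_ok))
```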

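Let's go back to the gamma distribution for a moment. Say X and Y are independent Γ(α,β) and let Z=X+Y. Then for z>0
fZ(z) = ∫_0^z fX(x) fY(z-x) dx = exp(-z/β)/(Γ(α)²β^(2α)) ∫_0^z x^(α-1) (z-x)^(α-1) dx
and with the change of variables x=zt
fZ(z) = z^(2α-1) exp(-z/β)/(Γ(α)²β^(2α)) ∫_0^1 t^(α-1) (1-t)^(α-1) dt = z^(2α-1) exp(-z/β)/(Γ(2α)β^(2α))
because the last integral is Γ(α)²/Γ(2α), the normalizing constant of a Beta(α,α) density,

so we see that Z~Γ(2α,β). In other words, the sum of independent gamma r.v.'s with the same β is again gamma.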

Some special cases:
1) X,Y iid E(λ) then X+Y~Γ(2,1/λ) (and not exponential)
2) X,Y iid χ²(n), then X+Y~χ²(2n)
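
The result on sums can be checked numerically by evaluating the convolution integral with integrate and comparing it with the gamma density. This is a sketch under the assumption that the second gamma parameter is a scale, so it is passed as scale in dgamma; the helper name conv_density and all parameter values are made up for this illustration.

```r
# density of X+Y at z0 for independent X, Y with common density dens
conv_density <- function(z0, dens) {
  integrate(function(x) dens(x) * dens(z0 - x), lower = 0, upper = z0)$value
}
z <- c(0.5, 2, 6)

# two independent gammas with the same parameters (a, b)
a <- 2
b <- 1.5
conv_gamma <- sapply(z, conv_density,
                     dens = function(x) dgamma(x, shape = a, scale = b))
print(cbind(z, convolution = conv_gamma,
            gamma_2a = dgamma(z, shape = 2 * a, scale = b)))

# two independent exponentials with rate lambda
lambda <- 2
conv_exp <- sapply(z, conv_density,
                   dens = function(x) dexp(x, rate = lambda))
print(cbind(z, convolution = conv_exp,
            gamma_2 = dgamma(z, shape = 2, scale = 1 / lambda)))
```

In both tables the two density columns should agree up to the accuracy of the numerical integration.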

Cauchy Distribution

A r.v. X has a Cauchy distribution if it has density
f(x) = 1/(π(1+x²)) for -∞<x<∞

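The Cauchy has one interesting property:
E[|X|] = ∫_{-∞}^{∞} |x|/(π(1+x²)) dx = (2/π) ∫_0^∞ x/(1+x²) dx = (1/π) [log(1+x²)]_0^∞ = ∞

so the Cauchy has no mean (and therefore no higher moments either). The reason is that it has thick "tails", that is the probability of observing a large value (+ or -) is relatively large.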
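
The divergence can be seen numerically: the integral of x times the standard Cauchy density over [0,u] equals log(1+u²)/(2π), which keeps growing with u. The following sketch compares this closed form with numerical integration; the upper limits are arbitrary.

```r
upper <- c(10, 100, 1000)

# integral of x*f(x) from 0 to u for the standard Cauchy density
partial_mean <- sapply(upper, function(u)
  integrate(function(x) x * dcauchy(x), lower = 0, upper = u)$value)

# closed form of the same integral
closed_form <- log(1 + upper^2) / (2 * pi)

print(cbind(upper, partial_mean, closed_form))
```

The partial integrals grow like log(u)/π and never settle down, which is why the mean does not exist.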

The Normal (Gaussian) Distribution

X is said to have a normal distribution with mean μ and variance σ² (X~N(μ,σ)) if it has density
f(x) = 1/√(2πσ²) exp(-(x-μ)²/(2σ²)) for -∞<x<∞

If μ=0 and σ=1 it is called a standard normal, and often denoted by Z instead of X.
Careful: some papers and textbooks define the normal as X~N(μ,σ²), that is they use the variance instead of the standard deviation.

Theorem
a) Z~N(0,1) then X=μ+σZ~N(μ,σ)
b) X~N(μ,σ), then Z=(X-μ)/σ~N(0,1)

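proof
a) because σ>0 we have P(X≤x) = P(μ+σZ≤x) = P(Z≤(x-μ)/σ), and differentiating with respect to x gives
fX(x) = (1/σ) fZ((x-μ)/σ) = 1/√(2πσ²) exp(-(x-μ)²/(2σ²))
b) is shown in the same way, using P(Z≤z) = P(X≤μ+σz).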

one consequence of this theorem is that we can often do a proof for the standard normal, and then quickly generalize it to all normals.
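
In R the functions pnorm and dnorm take the mean and the standard deviation as arguments, so the standardization can be checked directly. A small sketch with arbitrary values of mu and sigma:

```r
mu <- 3
sigma <- 2
x <- c(-1, 2, 3, 4.5, 8)

# distribution function: P(X<=x) = P(Z<=(x-mu)/sigma)
cdf_ok <- isTRUE(all.equal(pnorm(x, mean = mu, sd = sigma),
                           pnorm((x - mu) / sigma)))

# density: fX(x) = fZ((x-mu)/sigma)/sigma
pdf_ok <- isTRUE(all.equal(dnorm(x, mean = mu, sd = sigma),
                           dnorm((x - mu) / sigma) / sigma))
print(c(cdf_ok = cdf_ok, pdf_ok = pdf_ok))
```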

Example show that the function above is indeed a pdf.
a) fX(x)≥0 for all x
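b) first we show that the density integrates to 1 for a standard normal. Let I = ∫_{-∞}^{∞} exp(-x²/2) dx, then
I² = ∫_{-∞}^{∞} ∫_{-∞}^{∞} exp(-(x²+y²)/2) dx dy = ∫_0^{2π} ∫_0^∞ r exp(-r²/2) dr dθ = 2π [-exp(-r²/2)]_0^∞ = 2π
so I = √(2π) and ∫_{-∞}^{∞} fZ(x) dx = I/√(2π) = 1.
The change of variables x = r cos(θ), y = r sin(θ) above is of course called the change to polar coordinates.
The general case now follows easily:
P(-∞<X<∞) = P(-∞<(X-μ)/σ<∞) = P(-∞<Z<∞) = 1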
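
As a numerical sanity check one can integrate the normal density in R. The sketch below also evaluates the radial integral that appears after the change to polar coordinates; the values 3 and 2 for the mean and standard deviation are arbitrary.

```r
# total probability of a standard normal
total_std <- integrate(dnorm, lower = -Inf, upper = Inf)$value

# radial integral from the polar coordinates argument, which should be 1
radial <- integrate(function(r) r * exp(-r^2 / 2),
                    lower = 0, upper = Inf)$value

# total probability of a general normal
total_gen <- integrate(function(x) dnorm(x, mean = 3, sd = 2),
                       lower = -Inf, upper = Inf)$value

print(c(total_std = total_std, radial = radial, total_gen = total_gen))
```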

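Example find the mean and the standard deviation of a normal rv.
For a standard normal E[Z] = ∫_{-∞}^{∞} z fZ(z) dz = 0 because the integrand is an odd function, and integration by parts gives E[Z²] = ∫_{-∞}^{∞} z² fZ(z) dz = 1, so V[Z] = 1. For X=μ+σZ we find E[X] = μ+σE[Z] = μ and V[X] = σ²V[Z] = σ², so the standard deviation is σ.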

Theorem
Say X~N(μ,σ) then
1) the mgf of X is given by MX(t)=exp(μt+σ²t²/2)
2) P(X>μ) = P(X<μ) = 1/2
3) P(X>μ+x) = P(X<μ-x)
4) say X~N(μ,σ) then

proof
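1) first we find the mgf of a normal, again starting with a standard normal:
MZ(t) = E[exp(tZ)] = ∫_{-∞}^{∞} 1/√(2π) exp(tz-z²/2) dz = exp(t²/2) ∫_{-∞}^{∞} 1/√(2π) exp(-(z-t)²/2) dz = exp(t²/2)
because the last integrand is the density of a N(t,1). Now
MX(t) = E[exp(t(μ+σZ))] = exp(μt) MZ(σt) = exp(μt+σ²t²/2)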

2)-4) are easy.
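
The mgf can be checked numerically by integrating exp(tx) against the normal density and comparing with exp(μt+σ²t²/2). In this sketch the integrand is computed on the log scale to avoid overflow for large x; the parameter values are arbitrary.

```r
mu <- 1
sigma <- 2
tvals <- c(-0.5, 0.2, 0.4)

# E[exp(t0*X)] by numerical integration; log = TRUE keeps the product finite
mgf_num <- sapply(tvals, function(t0)
  integrate(function(x) exp(t0 * x + dnorm(x, mu, sigma, log = TRUE)),
            lower = -Inf, upper = Inf)$value)

# closed form exp(mu*t + sigma^2*t^2/2)
mgf_formula <- exp(mu * tvals + sigma^2 * tvals^2 / 2)
print(cbind(t = tvals, mgf_num, mgf_formula))

# properties 2) and 3): symmetry around the mean
half <- pnorm(mu, mean = mu, sd = sigma)
sym_ok <- isTRUE(all.equal(pnorm(mu + 1.3, mu, sigma, lower.tail = FALSE),
                           pnorm(mu - 1.3, mu, sigma)))
print(c(half = half, sym_ok = sym_ok))
```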

Example We have seen before that the Cauchy rv. has very thick tails, that is the probabilities P(X>t) are large. On the other hand the normal distribution has very thin tails. cauchy_normal() draws the two.
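
To put numbers on this, the following sketch tabulates the upper tail probabilities P(X>t) of a standard Cauchy and a standard normal; the grid of t values is arbitrary.

```r
tval <- 1:6
tails <- data.frame(
  t = tval,
  cauchy = pcauchy(tval, lower.tail = FALSE),  # P(X>t) for the Cauchy
  normal = pnorm(tval, lower.tail = FALSE)     # P(X>t) for the normal
)
# how many times more likely a value larger than t is under the Cauchy
tails$ratio <- tails$cauchy / tails$normal
print(tails)
```

The ratio column grows very quickly with t, which is what is meant by thick versus thin tails.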

Multivariate Normal RV

Let μ=(μ1,..,μn)^T be a vector and Σ=[σij] be a symmetric positive definite matrix (ie x^TΣx>0 for all x≠0), then the random vector X = (X1,..,Xn)^T has a multivariate normal distribution if it has joint density
f(x) = (2π)^(-n/2) |Σ|^(-1/2) exp(-(x-μ)^T Σ^(-1) (x-μ)/2)
where |Σ| is the determinant of Σ
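
Base R has no function for the multivariate normal density, but it is short to write. The sketch below assumes the usual density with the inverse and the determinant of the covariance matrix; the function name dmvn and all numbers are made up for this illustration. For a diagonal covariance matrix the joint density should equal the product of the univariate normal densities.

```r
# multivariate normal density at a single point x (hypothetical helper)
dmvn <- function(x, mu, Sigma) {
  n <- length(mu)
  d <- x - mu
  q <- drop(t(d) %*% solve(Sigma) %*% d)   # (x-mu)^T Sigma^(-1) (x-mu)
  exp(-q / 2) / sqrt((2 * pi)^n * det(Sigma))
}

mu <- c(1, -2, 0.5)
Sigma <- diag(c(4, 1, 0.25))   # diagonal: variances 4, 1 and 0.25
x <- c(0, -1, 1)

joint <- dmvn(x, mu, Sigma)
prod_marginals <- prod(dnorm(x, mean = mu, sd = sqrt(diag(Sigma))))
print(c(joint = joint, prod_marginals = prod_marginals))
```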

Theorem
a) E[Xi] = μi
b) V[Xi] = σii
c) cov(Xi,Xj) = σij
without proof

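Example bivariate normal. Say we write μ=(μX,μY)^T and
Σ = [ σX²     ρσXσY ]
    [ ρσXσY   σY²   ]
then (X,Y) has joint density
f(x,y) = 1/(2πσXσY√(1-ρ²)) exp( -[ (x-μX)²/σX² - 2ρ(x-μX)(y-μY)/(σXσY) + (y-μY)²/σY² ] / (2(1-ρ²)) )
here ρ=cor(X,Y)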

Theorem let (X,Y) be a bivariate normal rv, then X and Y are independent iff cor(X,Y)=0
proof one direction is always true. For the other we have if ρ=0 then
f(x,y) = 1/(2πσXσY) exp( -(x-μX)²/(2σX²) - (y-μY)²/(2σY²) ) = fX(x) fY(y)

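Example the joint distribution of two normal rv's need not be normal
Say X~N(0,1) and let Y=-X if |X|>1 and Y=X if |X|≤1, then, because -X~N(0,1) as well,
P(Y≤y) = P(X≤y, |X|≤1) + P(-X≤y, |X|>1) = P(X≤y, |X|≤1) + P(X≤y, |X|>1) = P(X≤y)

so Y~N(0,1) as well, but (X,Y) is not a bivariate normal: for example f(-2,-2)=0, because X=-2 implies Y=2, whereas the density of a bivariate normal is positive everywhere.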
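
A small simulation illustrates this example. It is only a sketch; the seed and the sample size are arbitrary. Note that X+Y is exactly 0 whenever |X|>1, which happens with probability 2P(Z<-1), so X+Y is a linear combination of X and Y that is not normal.

```r
set.seed(1)
x <- rnorm(1e5)
y <- ifelse(abs(x) > 1, -x, x)   # flip the sign of x when |x|>1

# y behaves like a standard normal
print(c(mean_y = mean(y), sd_y = sd(y)))

# but x+y has a point mass at 0
prop_zero <- mean(x + y == 0)
print(c(simulated = prop_zero, exact = 2 * pnorm(-1)))
```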

On the other hand we have the following characterization of a multivariate normal distribution:

Theorem
Let X=(X1,..,Xn)T. Then X has a multivariate normal distribution if and only if every linear combination t1X1+..+tnXn has a normal distribution
proof one direction is obvious because the marginals of a multivariate normal rv are normal and the sum of jointly normal rv's is normal. The other direction can be shown using mgf's, where the mgf of X is given by
MX(t) = E[exp(t^TX)] = exp(t^Tμ + t^TΣt/2)

Here are some more facts about normal rv's without proof:

Theorem

say (X,Y) is a bivariate normal rv, then
a) Z = X + Y ~ N(μX+μY, √(σX²+σY²+2σXσYρ))
b) Z = X|Y=y ~ N(μX+ρ(σX/σY)(y-μY), σX√(1-ρ²))

R itself does not generate multivariate normals, but there is a package of R routines that does, called MASS. You can use these by typing
library("MASS")
then play around with this using show_mvn
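
Here is a minimal sketch of how this might look, using the function mvrnorm from MASS to generate a bivariate normal sample. The seed, the sample size and the parameter values are arbitrary choices for illustration. The sample quantities should be close to the mean vector, the correlation and the standard deviation √(σX²+σY²+2ρσXσY) of X+Y.

```r
library(MASS)
set.seed(2024)

mu <- c(1, 2)
sx <- 2      # standard deviation of X
sy <- 1      # standard deviation of Y
rho <- 0.6   # correlation of X and Y
Sigma <- matrix(c(sx^2, rho * sx * sy,
                  rho * sx * sy, sy^2), 2, 2)

xy <- mvrnorm(n = 1e5, mu = mu, Sigma = Sigma)

sample_means <- colMeans(xy)
sample_cor <- cor(xy[, 1], xy[, 2])
sample_sd_sum <- sd(xy[, 1] + xy[, 2])
theory_sd_sum <- sqrt(sx^2 + sy^2 + 2 * rho * sx * sy)

print(sample_means)
print(c(sample_cor = sample_cor, rho = rho))
print(c(sample_sd_sum = sample_sd_sum, theory_sd_sum = theory_sd_sum))
```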