Continuous Distributions

Uniform Distribution

X is said to have a uniform distribution on the interval [A,B] (X~U[A,B]) if it has density

$$f(x) = \frac{1}{B-A}, \quad A < x < B$$

Exponential Distribution

X is said to have an exponential distribution with rate λ (X~E(λ)) if it has density

$$f(x) = \lambda e^{-\lambda x}, \quad x > 0, \ \lambda > 0$$

We have previously talked about the memoryless property, and the fact that among discrete distributions on N it is unique to the geometric rv. Now we have

Theorem
X has an exponential distribution iff X is a positive continuous r.v. and P(X>s+t | X>s) = P(X>t) for all s,t > 0.
Proof:
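Assume X~E(λ). Then

$$P(X>s+t \mid X>s) = \frac{P(X>s+t,\ X>s)}{P(X>s)} = \frac{P(X>s+t)}{P(X>s)} = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X>t)$$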

On the other hand, assume X is continuous with density f and P(X>s+t | X>s) = P(X>t) for all s,t > 0. In the proof above we saw that this implies P(X>s+t) = P(X>s)P(X>t). Let h(x) = P(X>x) and let ε>0. Note that h(0) = P(X>0) = 1 because X is positive.

and so we see X~E(β)
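The step from the functional equation to the exponential form can be sketched as follows. This is one standard route and not necessarily the one intended above; it assumes that h has a right derivative at 0 and writes β = -h'(0), which equals f(0) when f is right-continuous at 0.

```latex
\begin{align*}
h(x+\varepsilon) &= h(x)\,h(\varepsilon) \\
\frac{h(x+\varepsilon)-h(x)}{\varepsilon}
  &= h(x)\,\frac{h(\varepsilon)-h(0)}{\varepsilon}
  \;\longrightarrow\; h(x)\,h'(0) \quad (\varepsilon \to 0) \\
h'(x) &= -\beta\,h(x), \qquad \beta := -h'(0) \\
h(x) &= h(0)\,e^{-\beta x} = e^{-\beta x} \\
f(x) &= -h'(x) = \beta e^{-\beta x}, \qquad x>0
\end{align*}
```

Here β>0, because h is non-increasing and β=0 would give h(x)=1 for all x, which is impossible for a proper r.v.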

The Gamma Distribution

Recall the gamma function:

$$\Gamma(\alpha) = \int_0^\infty t^{\alpha-1} e^{-t}\,dt, \quad \alpha > 0$$

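The gamma function is famous for many things, among them the relationship Γ(α+1) = αΓ(α), which follows from integration by parts:

$$\Gamma(\alpha+1) = \int_0^\infty t^{\alpha} e^{-t}\,dt = \left[-t^{\alpha}e^{-t}\right]_0^\infty + \alpha\int_0^\infty t^{\alpha-1}e^{-t}\,dt = \alpha\,\Gamma(\alpha)$$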

Together with Γ(1)=1 this implies Γ(n)=(n-1)! for integers n≥1, so the gamma function is a continuous version of the factorial. The gamma function has many other interesting properties, for example Γ(1/2) = √π.
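These identities are easy to check numerically. The sketch below uses base R's gamma() and factorial(); the test values n = 1,...,10 and α = 3.7 are arbitrary choices made for illustration.

```r
# Check Gamma(n) = (n-1)! for n = 1,...,10
n <- 1:10
gamma_vs_factorial <- isTRUE(all.equal(gamma(n), factorial(n - 1)))

# Check the recursion Gamma(alpha+1) = alpha*Gamma(alpha) at a non-integer alpha
alpha <- 3.7
recursion_ok <- isTRUE(all.equal(gamma(alpha + 1), alpha * gamma(alpha)))

# Check Gamma(1/2) = sqrt(pi)
half_ok <- isTRUE(all.equal(gamma(0.5), sqrt(pi)))

print(c(gamma_vs_factorial = gamma_vs_factorial,
        recursion_ok = recursion_ok,
        half_ok = half_ok))
```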

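Now X is said to have a gamma distribution with parameters (α,β) (X~Γ(α,β)) if it has density

$$f(x) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\,x^{\alpha-1}e^{-x/\beta}, \quad x>0, \ \alpha>0, \ \beta>0$$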

By definition we have X>0, and so the Gamma is the basic example of a r.v. on (0,∞), or a little more generally (using a change of variables) on any open half-infinite interval.

show_gamma() shows that for different values of α the density has different shapes, while for different values of β it has the same shape but a different scale.

Note if X~Γ(1,β) then X~E(1/β).

Another important special case: if X~Γ(n/2,2), then X is called a chi-square r.v. with n degrees of freedom, denoted by X~χ²(n).
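Both special cases can be checked against R's built-in distribution functions. This is a small sketch; the evaluation points and the values β = 2.5 and n = 7 are arbitrary, and the scale= argument is used so that β has the same meaning as in the text.

```r
x <- c(0.1, 0.5, 1, 2, 5, 10)   # arbitrary evaluation points

# Gamma(1, beta) versus exponential with rate 1/beta
beta <- 2.5
exp_case <- isTRUE(all.equal(pgamma(x, shape = 1, scale = beta),
                             pexp(x, rate = 1 / beta)))

# Gamma(n/2, 2) versus chi-square with n degrees of freedom
n <- 7
chisq_case <- isTRUE(all.equal(pgamma(x, shape = n / 2, scale = 2),
                               pchisq(x, df = n)))

print(c(exp_case = exp_case, chisq_case = chisq_case))
```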

There is an important connection between the gamma and the Poisson distributions:

Theorem if X~Γ(n,β), where n is a positive integer, and Y~P(x/β) (that is, Y is Poisson with rate x/β) then
P(X≤x) = P(Y≥n)
proof
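A sketch of the argument, assuming the density t^(n-1) e^(-t/β) / ((n-1)! β^n) for Γ(n,β) and writing X_n for a Γ(n,β) r.v. (the subscript is notation introduced only for this sketch). For n=1 both sides equal 1-e^(-x/β). For the step from n to n+1, integrate by parts:

```latex
\begin{align*}
P(X_{n+1}\le x)
 &= \int_0^x \frac{t^{n}}{n!\,\beta^{n+1}}\,e^{-t/\beta}\,dt \\
 &= \left[-\frac{t^{n}}{n!\,\beta^{n}}\,e^{-t/\beta}\right]_0^x
    + \int_0^x \frac{t^{n-1}}{(n-1)!\,\beta^{n}}\,e^{-t/\beta}\,dt \\
 &= -\frac{(x/\beta)^{n}}{n!}\,e^{-x/\beta} + P(X_n\le x) \\
 &= P(X_n \le x) - P(Y=n)
\end{align*}
```

So if P(X_n ≤ x) = P(Y ≥ n), then P(X_{n+1} ≤ x) = P(Y ≥ n) - P(Y = n) = P(Y ≥ n+1).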

and the theorem follows by induction. Here is an R check with n=4, x=0.75 and β=1/2.3; note that R parametrizes the gamma by its rate, so Γ(α,β) corresponds to pgamma(x,α,1/β):
pgamma(0.75,4,2.3) = 0.09696469
1-ppois(4-1,0.75*2.3) = 0.09696469
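The same identity can be checked for several values of n at once. A minimal sketch, reusing x = 0.75 and rate 2.3 from the check above; the range n = 1,...,8 is an arbitrary choice.

```r
x <- 0.75
rate <- 2.3          # rate = 1/beta in the notation of the theorem
n <- 1:8

lhs <- pgamma(x, shape = n, rate = rate)     # P(X <= x) for X ~ Gamma(n, beta)
rhs <- 1 - ppois(n - 1, lambda = x * rate)   # P(Y >= n) for Y ~ Poisson(x/beta)

print(cbind(n, lhs, rhs))
identity_holds <- isTRUE(all.equal(lhs, rhs))
print(identity_holds)
```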

The Beta Distribution

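X is said to have a beta distribution with parameters α and β (X~Beta(α,β)) if it has density

$$f(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}, \quad 0<x<1, \ \alpha>0, \ \beta>0$$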

it is easy to calculate the moments of a beta distribution:
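A sketch of that calculation, assuming the standard density Γ(α+β)/(Γ(α)Γ(β)) x^(α-1) (1-x)^(β-1) on (0,1): the integrand of E[X^k] is an unnormalized Beta(α+k,β) density, so the integral is a ratio of gamma functions.

```latex
\begin{align*}
E[X^k] &= \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}
          \int_0^1 x^{\alpha+k-1}(1-x)^{\beta-1}\,dx
        = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\cdot
          \frac{\Gamma(\alpha+k)\Gamma(\beta)}{\Gamma(\alpha+\beta+k)}
        = \frac{\Gamma(\alpha+\beta)\,\Gamma(\alpha+k)}{\Gamma(\alpha)\,\Gamma(\alpha+\beta+k)} \\
E[X] &= \frac{\alpha}{\alpha+\beta}, \qquad
E[X^2] = \frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)} \\
Var[X] &= E[X^2]-E[X]^2 = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}
\end{align*}
```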

By definition we have 0<X<1, and so the Beta is the basic example of a r.v. on (0,1), or a little more generally (using a change of variables) on any open finite interval. Various examples are shown by show_beta().

1) Special case: Beta(1,1) = U[0,1]

2) Special case: X~Beta(p,1) then f(x) = c x^(p-1) (1-x)^(1-1) = c x^(p-1) = p x^(p-1), 0<x<1, p>0
E[X] = p/(p+1), Var[X] = p/[(p+1)²(p+2)]

Let's go back to the gamma distribution for a moment. Say X and Y are independent Γ(α,β) and let Z=X+Y. Then

so we see that Z~Γ(2α,β). In other words, the sum of independent gamma r.v.'s with the same β is again gamma.
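One way to see this, sketched here with moment generating functions (which may differ from the route taken above), assumes the parametrization in which β is a scale parameter, so that Γ(α,β) has mgf (1-βt)^(-α) for t<1/β. Independence then gives

```latex
\begin{align*}
M_X(t) &= \int_0^\infty e^{tx}\,\frac{x^{\alpha-1}e^{-x/\beta}}{\Gamma(\alpha)\beta^{\alpha}}\,dx
        = (1-\beta t)^{-\alpha}, \qquad t<1/\beta \\
M_Z(t) &= E\!\left[e^{t(X+Y)}\right] = M_X(t)\,M_Y(t) = (1-\beta t)^{-2\alpha}
\end{align*}
```

which is the mgf of a gamma with first parameter 2α and second parameter β; since the mgf determines the distribution, the claim follows.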

Some special cases:
1) X,Y iid E(λ) then X+Y~Γ(2,1/λ) (and not exponential)
2) X,Y iid χ²(n), then X+Y~χ²(2n)
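The result for sums can also be checked numerically by evaluating the convolution integral for the density of Z = X + Y and comparing it with the gamma density with doubled first parameter. A minimal sketch using base R's integrate() and dgamma(); the values α = 2, β = 1.5 and the evaluation points are arbitrary, and conv_density is a helper name introduced here.

```r
alpha <- 2
beta <- 1.5   # scale parameter

# Density of Z = X + Y at z via the convolution integral over (0, z)
conv_density <- function(z) {
  integrate(function(x) dgamma(x, shape = alpha, scale = beta) *
                        dgamma(z - x, shape = alpha, scale = beta),
            lower = 0, upper = z)$value
}

z <- c(0.5, 1, 3, 6, 10)                      # arbitrary evaluation points
numeric_density <- sapply(z, conv_density)    # convolution, computed numerically
gamma_density <- dgamma(z, shape = 2 * alpha, scale = beta)

print(cbind(z, numeric_density, gamma_density))
```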

Cauchy Distribution

A r.v. X has a (standard) Cauchy distribution if it has density

$$f(x) = \frac{1}{\pi}\,\frac{1}{1+x^2}, \quad -\infty < x < \infty$$

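The Cauchy has one interesting property:

$$E|X| = \frac{2}{\pi}\int_0^\infty \frac{x}{1+x^2}\,dx = \frac{1}{\pi}\Big[\log(1+x^2)\Big]_0^\infty = \infty$$

so the Cauchy has no mean (and therefore no higher moments either). The reason is that it has thick "tails", that is, the probability of observing a very large value (positive or negative) is comparatively large.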

The Normal (Gaussian) Distribution

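X is said to have a normal distribution with mean μ and variance σ² (X~N(μ,σ)) if it has density

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}, \quad -\infty<x<\infty$$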

If μ=0 and σ=1 it is called a standard normal, and often denoted by Z instead of X.
Careful: some papers and textbooks define the normal as X~N(μ,σ²), that is they use the variance instead of the standard deviation.

Theorem
a) if Z~N(0,1) then X=μ+σZ~N(μ,σ)
b) if X~N(μ,σ) then Z=(X-μ)/σ~N(0,1)

proof
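A sketch of part a), writing Φ and φ for the cdf and density of a standard normal (notation introduced here) and assuming σ>0; part b) follows by the same argument applied to the inverse transformation.

```latex
\begin{align*}
F_X(x) &= P(\mu+\sigma Z\le x) = P\!\left(Z\le \frac{x-\mu}{\sigma}\right)
        = \Phi\!\left(\frac{x-\mu}{\sigma}\right) \\
f_X(x) &= \frac{d}{dx}F_X(x) = \frac{1}{\sigma}\,\varphi\!\left(\frac{x-\mu}{\sigma}\right)
        = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}
\end{align*}
```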

one consequence of this theorem is that we can often do a proof for the standard normal, and then quickly generalize it to all normals.

Example show that the function above is indeed a pdf.
a) fX(x)≥0 for all x
b) next we need to show that the density integrates to 1. First we show this for a standard normal:
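The standard argument, sketched here, squares the integral I = ∫ exp(-x²/2) dx over the real line (I is notation introduced for this sketch) and evaluates the resulting double integral with x = r cos θ, y = r sin θ:

```latex
\begin{align*}
I^2 &= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\,dx\,dy
     = \int_0^{2\pi}\!\int_0^\infty e^{-r^2/2}\,r\,dr\,d\theta \\
    &= 2\pi\left[-e^{-r^2/2}\right]_0^\infty = 2\pi \\
\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx &= \frac{I}{\sqrt{2\pi}} = 1
\end{align*}
```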

the change of variables above is of course called the change to polar coordinates
the general case now follows easily:
P(-∞<X<∞) = P(-∞<(X-μ)/σ<∞) = P(-∞<Z<∞) = 1

Example find the mean and the standard deviation of a normal rv.
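A sketch of the calculation, done first for Z~N(0,1) and then transferred to X = μ+σZ: the first integrand is odd, and the second integral is handled by parts with u = z and dv = z exp(-z²/2) dz.

```latex
\begin{align*}
E[Z] &= \int_{-\infty}^{\infty} z\,\frac{1}{\sqrt{2\pi}}e^{-z^2/2}\,dz = 0 \\
E[Z^2] &= \frac{1}{\sqrt{2\pi}}\left(\left[-z\,e^{-z^2/2}\right]_{-\infty}^{\infty}
          + \int_{-\infty}^{\infty} e^{-z^2/2}\,dz\right) = 1 \\
E[X] &= \mu+\sigma E[Z] = \mu, \qquad
Var[X] = \sigma^2\,Var[Z] = \sigma^2, \qquad sd(X)=\sigma
\end{align*}
```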

Theorem
Say X~N(μ,σ) then
1) the mgf of X is given by M_X(t) = exp(μt+σ²t²/2)
2) P(X>μ) = P(X<μ) =1/2
3) P(X>μ+x) = P(X<μ-x)
4) say X~N(μ,σ) then

proof
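1) the case of a standard normal, M_Z(t) = exp(t²/2), was done previously, as part of the proof of the Central Limit Theorem. Then if X~N(μ,σ) we can write X=μ+σZ and

$$M_X(t) = E\left[e^{t(\mu+\sigma Z)}\right] = e^{\mu t}M_Z(\sigma t) = \exp\left(\mu t + \sigma^2t^2/2\right)$$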

2)-4) are easy.

Example We have seen before that the Cauchy rv. has very thick tails, that is the probabilities P(X>t) are large. On the other hand the normal distribution has very thin tails. cauchy_normal() draws the two.
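The difference is easy to see numerically. A small sketch comparing upper tail probabilities of a standard Cauchy and a standard normal with base R's pcauchy() and pnorm(); the cutoffs t are arbitrary.

```r
t <- c(1, 2, 3, 5, 10)   # arbitrary cutoffs

# Upper tail probabilities P(X > t)
cauchy_tail <- pcauchy(t, lower.tail = FALSE)
normal_tail <- pnorm(t, lower.tail = FALSE)

print(data.frame(t, cauchy_tail, normal_tail,
                 ratio = cauchy_tail / normal_tail))
```

The ratio column grows very quickly with t, which is what "thick" versus "thin" tails means here.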

Multivariate Normal RV

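Let μ=(μ1,..,μn)ᵀ be a vector and Σ=[σij] be a symmetric positive definite matrix (ie xᵀΣx>0 for all x≠0), then the random vector X = (X1,..,Xn)ᵀ has a multivariate normal distribution if it has joint density

$$f(x) = (2\pi)^{-n/2}\,|\Sigma|^{-1/2}\exp\left\{-\tfrac12 (x-\mu)^T\Sigma^{-1}(x-\mu)\right\}$$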

where |Σ| is the determinant of Σ

Theorem
a) E[Xi] =μi
b) V[Xi] = σii
c) cov(Xi,Xj) = σij
without proof

Example bivariate normal. Say we write

here ρ=cor(X,Y)
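In the usual parametrization of the bivariate case (assumed here, with the vector written as (X,Y)ᵀ, means μ_X, μ_Y and standard deviations σ_X, σ_Y), the mean vector, covariance matrix and joint density are

```latex
\begin{align*}
\mu &= \begin{pmatrix}\mu_X\\ \mu_Y\end{pmatrix}, \qquad
\Sigma = \begin{pmatrix}\sigma_X^2 & \rho\,\sigma_X\sigma_Y\\
                        \rho\,\sigma_X\sigma_Y & \sigma_Y^2\end{pmatrix}, \qquad
|\Sigma| = \sigma_X^2\sigma_Y^2(1-\rho^2) \\
f(x,y) &= \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}}
  \exp\left\{-\frac{1}{2(1-\rho^2)}\left[
    \frac{(x-\mu_X)^2}{\sigma_X^2}
    - 2\rho\,\frac{(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y}
    + \frac{(y-\mu_Y)^2}{\sigma_Y^2}\right]\right\}
\end{align*}
```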

Theorem let (X,Y) be a bivariate normal rv, then X and Y are independent (X⊥Y) iff cor(X,Y)=0
proof one direction is always true. For the other we have if ρ=0 then
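A sketch of that step, assuming (X,Y) has the standard bivariate normal density with means μ_X, μ_Y and standard deviations σ_X, σ_Y: with ρ=0 the cross term in the exponent vanishes and the joint density factors into the product of the two marginal densities.

```latex
\begin{align*}
f(x,y) &= \frac{1}{2\pi\sigma_X\sigma_Y}
          \exp\left\{-\frac12\left[\frac{(x-\mu_X)^2}{\sigma_X^2}
                                  +\frac{(y-\mu_Y)^2}{\sigma_Y^2}\right]\right\} \\
       &= \frac{1}{\sqrt{2\pi\sigma_X^2}}\exp\left\{-\frac{(x-\mu_X)^2}{2\sigma_X^2}\right\}
          \cdot
          \frac{1}{\sqrt{2\pi\sigma_Y^2}}\exp\left\{-\frac{(y-\mu_Y)^2}{2\sigma_Y^2}\right\}
        = f_X(x)\,f_Y(y)
\end{align*}
```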

Example the joint distribution of two normal rv's need not be normal
Say X~N(0,1) and let Y=-X if |X|>1 and Y=X if |X|≤1, then

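so Y~N(0,1) as well, but (X,Y) is not bivariate normal: for example it puts no probability near the point (-2,-2), because X<-1 forces Y=-X>1, whereas a bivariate normal density is positive everywhere.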
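The claim Y~N(0,1) can be sketched as follows, using only the symmetry of the standard normal, that is the facts that -X has the same distribution as X and that |-X| = |X|:

```latex
\begin{align*}
P(Y\le y) &= P(X\le y,\ |X|\le 1) + P(-X\le y,\ |X|>1) \\
          &= P(X\le y,\ |X|\le 1) + P(X\le y,\ |X|>1) \\
          &= P(X\le y)
\end{align*}
```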

On the other hand we have the following characterization of a multivariate normal distribution:

Theorem
Let X=(X1,..,Xn)T. Then X has a multivariate normal distribution if and only if every linear combination t1X1+..+tnXn has a normal distribution
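proof one direction is obvious because the marginals of a multivariate normal rv are normal and a sum of jointly normal rv's is normal. The other direction can be shown using mgf's, where the mgf of X is given by

$$M_X(t) = E\left[e^{t^TX}\right] = \exp\left\{t^T\mu + \tfrac12\,t^T\Sigma\,t\right\}$$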

Here are some more facts about normal rv's without proof:

Theorem

say (X,Y) is a bivariate normal rv, then
a) Z = X + Y ~ N(μX+μY, √(σX²+σY²+2σXσYρ))
b) Z = X|Y=y ~ N(μX+ρ(σX/σY)(y-μY), σX√(1-ρ²))

Base R has no routine for generating multivariate normals, but the package MASS provides one, called mvrnorm. You can load the package by typing
library("MASS")
then play around with this using show_mvn
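As a minimal sketch of what this looks like, the code below draws a bivariate normal sample with mvrnorm() from MASS and compares the sample mean, covariance and correlation with the parameters. The values of mu, the standard deviations, ρ = 0.6, the seed and the sample size are arbitrary choices for illustration.

```r
library(MASS)

set.seed(1234)                      # fixed seed so the sketch is reproducible
mu <- c(1, -2)                      # mean vector (mu_X, mu_Y)
sigma_x <- 1; sigma_y <- 2; rho <- 0.6
Sigma <- matrix(c(sigma_x^2,               rho * sigma_x * sigma_y,
                  rho * sigma_x * sigma_y, sigma_y^2), nrow = 2)

xy <- mvrnorm(n = 10000, mu = mu, Sigma = Sigma)  # 10000 x 2 matrix of draws

print(colMeans(xy))   # should be close to mu
print(cov(xy))        # should be close to Sigma
print(cor(xy)[1, 2])  # should be close to rho
```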