We have previously talked about the memoryless property, and the fact that among discrete distributions on N it is unique to the geometric rv. Now we have
Theorem
X has an exponential distribution iff X is a positive continuous r.v. and P(X>s+t | X>s) = P(X>t) for all s,t > 0.
Proof:
Assume X~E(λ). Then
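The computation this sentence leads into can be sketched as follows. It assumes the rate parametrization, in which X~E(λ) has survival function P(X>x) = e^{-λx} for x>0, and it uses that the event {X>s+t} is contained in {X>s}.

```latex
\begin{aligned}
P(X>s+t \mid X>s) &= \frac{P(X>s+t,\ X>s)}{P(X>s)} = \frac{P(X>s+t)}{P(X>s)} \\
&= \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X>t)
\end{aligned}
```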
Conversely, assume X is continuous with density f and P(X>s+t | X>s) = P(X>t) for all s,t > 0. As in the first part of the proof, P(X>s+t | X>s) = P(X>s+t)/P(X>s), so this implies P(X>s+t)=P(X>s)*P(X>t). Let h(x) = P(X>x) and let ε>0. Note h(0) = P(X>0) = 1 because X is positive.
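One way to finish the argument from here is sketched below. It assumes h is differentiable from the right at 0 (which holds if the density f is continuous there, since h = 1 - F) and writes β := -h'(0) = f(0). That identification of β is an assumption of this sketch, not something stated in the text.

```latex
\begin{aligned}
\frac{h(x+\varepsilon)-h(x)}{\varepsilon}
  &= \frac{h(x)h(\varepsilon)-h(x)}{\varepsilon}
   = h(x)\,\frac{h(\varepsilon)-h(0)}{\varepsilon} \\
\text{letting } \varepsilon \to 0:\quad
h'(x) &= h'(0)\,h(x) = -\beta\,h(x), \qquad h(0)=1 \\
\Rightarrow\quad h(x) &= e^{-\beta x}, \qquad F(x) = 1-h(x) = 1-e^{-\beta x}, \quad x>0
\end{aligned}
```

Here β>0 is forced, because h(x) = P(X>x) must tend to 0 as x grows.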
and so we see X~E(β)

Now X is said to have a gamma distribution (X~Γ(α,β)) with parameters α>0 and β>0 if it has density
f(x) = x^(α-1) e^(-x/β) / (Γ(α) β^α) for x>0
By definition we have X>0, and so the gamma is the basic example of a r.v. on (0,∞) or, more generally (using a change of variables), on any open half-line.
show_gamma shows that for different values of alpha the density has different shapes and for different values of beta it has the same shape but different scales.
Note if X~Γ(1,β) then X~E(1/β).
Another important special case is X~Γ(n/2,2); then X is called a chi-square r.v. with n degrees of freedom, denoted by X~χ²(n).
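Both special cases can be checked numerically in R. The sketch below assumes that β in Γ(α,β) is a scale parameter (so it is passed to pgamma through the `scale` argument); the evaluation points and parameter values are arbitrary choices made for illustration.

```r
# Numerical check of the two special cases, assuming beta is a SCALE parameter
x    <- c(0.1, 0.5, 1, 2.5, 7)  # arbitrary evaluation points
beta <- 1.7                     # arbitrary scale
n    <- 5                       # arbitrary degrees of freedom

# Gamma(1, beta) versus exponential with rate 1/beta
gamma_vs_exp <- max(abs(pgamma(x, shape = 1, scale = beta) -
                        pexp(x, rate = 1 / beta)))

# Gamma(n/2, 2) versus chi-square with n degrees of freedom
gamma_vs_chisq <- max(abs(pgamma(x, shape = n / 2, scale = 2) -
                          pchisq(x, df = n)))

# Both differences should be zero up to rounding error
print(c(gamma_vs_exp, gamma_vs_chisq))
```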
There is an important connection between the gamma and the Poisson distributions:
Theorem if X~Γ(n,β) with n a positive integer, x>0 and Y~P(x/β) then
P(X≤x) = P(Y≥n)
proof
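A sketch of the induction step, using integration by parts. It assumes the scale parametrization of the gamma density, f(t) = t^{n-1} e^{-t/β} / ((n-1)! β^n) for integer n, and writes X_n for a Γ(n,β) r.v.; the subscript is notation introduced only for this sketch.

```latex
\begin{aligned}
n=1:\quad P(X_1 \le x) &= 1 - e^{-x/\beta} = 1 - P(Y=0) = P(Y \ge 1) \\[4pt]
n \ge 2:\quad P(X_n \le x)
 &= \int_0^x \frac{t^{n-1} e^{-t/\beta}}{(n-1)!\,\beta^{n}}\,dt \\
 &= \left[ -\frac{t^{n-1} e^{-t/\beta}}{(n-1)!\,\beta^{n-1}} \right]_0^x
    + \int_0^x \frac{t^{n-2} e^{-t/\beta}}{(n-2)!\,\beta^{n-1}}\,dt \\
 &= -\frac{(x/\beta)^{n-1}}{(n-1)!}\,e^{-x/\beta} + P(X_{n-1} \le x) \\
 &= P(X_{n-1} \le x) - P(Y = n-1)
\end{aligned}
```

If P(X_{n-1} ≤ x) = P(Y ≥ n-1), the last line equals P(Y ≥ n).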

and the theorem follows by induction. Here is an R check with x=0.75, n=4 and β=1/2.3; note that R parametrizes the gamma by the rate 1/β, so P(X≤x) is pgamma(x,α,1/β):
pgamma(0.75,4,2.3) = 0.09696469
1-ppois(4-1,0.75*2.3) = 0.09696469

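X is said to have a beta distribution (X~Beta(α,β)) with parameters α>0 and β>0 if it has density f(x) = Γ(α+β)/(Γ(α)Γ(β)) x^(α-1) (1-x)^(β-1) for 0<x<1.
By definition we have 0<X<1, and so the beta is the basic example of a r.v. on (0,1) or, more generally (using a change of variables), on any open finite interval. Various examples are shown by show_beta()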
1) Special case: Beta(1,1) = U[0,1]
2) Special case: X~Beta(p,1) then f(x) = c x^(p-1) (1-x)^(1-1) = c x^(p-1) = p x^(p-1), 0<x<1, p>0
E[X] = p/(p+1), Var[X] = p/[(p+1)²(p+2)]
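The Beta(p,1) formulas can be verified numerically. This is a minimal sketch using R's integrate and dbeta; the value p = 3 is an arbitrary choice for illustration.

```r
p <- 3                                   # arbitrary illustrative value of p
f <- function(x) p * x^(p - 1)           # claimed density of Beta(p, 1)

# The claimed density should agree with R's built-in beta density
xs <- seq(0.05, 0.95, by = 0.05)
density_gap <- max(abs(dbeta(xs, p, 1) - f(xs)))

# Total mass, mean and variance by numerical integration over (0, 1)
total <- integrate(f, 0, 1)$value
m     <- integrate(function(x) x * f(x), 0, 1)$value
v     <- integrate(function(x) (x - m)^2 * f(x), 0, 1)$value

# Compare with the closed forms p/(p+1) and p/((p+1)^2 (p+2))
print(c(total = total, mean = m, mean_formula = p / (p + 1),
        var = v, var_formula = p / ((p + 1)^2 * (p + 2))))
```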
Let's go back to the gamma distribution for a moment. Say X and Y are independent Γ(α,β) and let Z=X+Y. Then
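The density of Z can be obtained from the convolution formula. The sketch below assumes the scale parametrization f(x) = x^{α-1} e^{-x/β} / (Γ(α) β^α), x>0, and uses the beta integral ∫₀¹ t^{α-1}(1-t)^{α-1} dt = Γ(α)²/Γ(2α).

```latex
\begin{aligned}
f_Z(z) &= \int_0^z f_X(x)\, f_Y(z-x)\,dx
        = \frac{e^{-z/\beta}}{\Gamma(\alpha)^2 \beta^{2\alpha}}
          \int_0^z x^{\alpha-1}(z-x)^{\alpha-1}\,dx \\
       &= \frac{z^{2\alpha-1} e^{-z/\beta}}{\Gamma(\alpha)^2 \beta^{2\alpha}}
          \int_0^1 t^{\alpha-1}(1-t)^{\alpha-1}\,dt \qquad (x = zt) \\
       &= \frac{z^{2\alpha-1} e^{-z/\beta}}{\Gamma(\alpha)^2 \beta^{2\alpha}}
          \cdot \frac{\Gamma(\alpha)^2}{\Gamma(2\alpha)}
        = \frac{1}{\Gamma(2\alpha)\,\beta^{2\alpha}}\; z^{2\alpha-1} e^{-z/\beta},
          \qquad z>0
\end{aligned}
```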
so we see that Z~Γ(2α,β). In other words, the sum of independent gamma r.v.'s with the same β is again gamma.
Some special cases:
1) X,Y iid E(λ), that is Γ(1,1/λ), then X+Y~Γ(2,1/λ) (and not exponential)
2) X,Y iid χ²(n), then X+Y~χ²(2n)


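X is said to have a normal distribution with mean μ and standard deviation σ>0 (X~N(μ,σ)) if it has density
f(x) = exp(-(x-μ)²/(2σ²)) / √(2πσ²) for all real x
If μ=0 and σ=1 it is called a standard normal, and often denoted by Z instead of X.
Careful: some papers and textbooks write X~N(μ,σ²), that is, they use the variance instead of the standard deviation as the second parameter.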
Theorem
a) Z~N(0,1), then X=μ+σZ~N(μ,σ)
b) X~N(μ,σ), then Z=(X-μ)/σ~N(0,1)
proof
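A sketch of the argument via distribution functions, assuming σ>0 and writing Φ and φ for the standard normal cdf and density (notation introduced here for the sketch).

```latex
\begin{aligned}
\text{a)}\quad F_X(x) &= P(\mu+\sigma Z \le x) = P\!\left(Z \le \tfrac{x-\mu}{\sigma}\right)
                       = \Phi\!\left(\tfrac{x-\mu}{\sigma}\right) \\
f_X(x) &= \frac{1}{\sigma}\,\varphi\!\left(\tfrac{x-\mu}{\sigma}\right)
        = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \\[6pt]
\text{b)}\quad F_Z(z) &= P\!\left(\tfrac{X-\mu}{\sigma} \le z\right) = F_X(\mu+\sigma z) \\
f_Z(z) &= \sigma\, f_X(\mu+\sigma z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}
\end{aligned}
```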

one consequence of this theorem is that we can often do a proof for the standard normal, and then quickly generalize it to all normals.
Example show that the function above is indeed a pdf.
a) fX(x)≥0 for all x
b) first we show this for a standard normal:
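The standard computation squares the integral and changes variables to x = r cos θ, y = r sin θ, whose Jacobian is r. The symbol I is introduced here only for this sketch.

```latex
\begin{aligned}
I &= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\,dx \\
I^2 &= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}
       \frac{1}{2\pi}\, e^{-(x^2+y^2)/2}\,dx\,dy
     = \int_0^{2\pi}\!\int_0^{\infty} \frac{1}{2\pi}\, r\, e^{-r^2/2}\,dr\,d\theta \\
    &= \int_0^{\infty} r\, e^{-r^2/2}\,dr
     = \left[-e^{-r^2/2}\right]_0^{\infty} = 1
\end{aligned}
```

Since I>0, it follows that I=1.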

the change of variables above is of course called the change to polar coordinates
the general case now follows easily:
P(-∞<X<∞) = P(-∞<(X-μ)/σ<∞) = P(-∞<Z<∞) = 1
Example find the mean and the standard deviation of a normal rv.
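For X~N(μ,σ) the mean is μ and the standard deviation is σ. A quick numerical confirmation in R is sketched below; the values of mu and sigma are arbitrary, and integrate is used only as a sanity check, not as a derivation.

```r
mu    <- 1.5   # arbitrary illustrative mean
sigma <- 2.3   # arbitrary illustrative standard deviation

# E[X] = integral of x f(x) dx
m <- integrate(function(x) x * dnorm(x, mu, sigma), -Inf, Inf)$value

# Var[X] = integral of (x - mu)^2 f(x) dx
v <- integrate(function(x) (x - mu)^2 * dnorm(x, mu, sigma), -Inf, Inf)$value

# The numerical mean and sd should be close to mu and sigma
print(c(mean = m, sd = sqrt(v)))
```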
Theorem
Say X~N(μ,σ) then
1) the mgf of X is given by MX(t)=exp(μt+σ²t²/2)
2) P(X>μ) = P(X<μ) =1/2
3) P(X>μ+x) = P(X<μ-x)
4) say X~N(μ,σ) then

proof
1) the case of a standard normal was done previously, as part of the proof of the Central Limit Theorem. Then if X~N(μ,σ)
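The general case can be sketched as follows, taking as given the standard normal mgf M_Z(t) = exp(t²/2) and writing X = μ+σZ with Z~N(0,1).

```latex
\begin{aligned}
M_X(t) &= E\!\left[e^{tX}\right] = E\!\left[e^{t(\mu+\sigma Z)}\right]
        = e^{\mu t}\, E\!\left[e^{(\sigma t) Z}\right]
        = e^{\mu t}\, M_Z(\sigma t) \\
       &= e^{\mu t}\, e^{\sigma^2 t^2/2}
        = \exp\!\left(\mu t + \tfrac{1}{2}\sigma^2 t^2\right)
\end{aligned}
```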
2)-4) are easy.
Example We have seen before that the Cauchy rv. has very thick tails, that is, the probabilities P(X>t) go to 0 only slowly as t grows. On the other hand the normal distribution has very thin tails. cauchy_normal() draws the two.
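The difference in tail behavior can also be seen in numbers. The sketch below compares P(X>t) for a standard Cauchy and a standard normal at a few arbitrarily chosen values of t, using the base R functions pcauchy and pnorm.

```r
t <- c(1, 2, 3, 5, 10)   # arbitrary thresholds

tails <- data.frame(
  t      = t,
  cauchy = pcauchy(t, lower.tail = FALSE),  # P(X > t), standard Cauchy
  normal = pnorm(t, lower.tail = FALSE)     # P(Z > t), standard normal
)

# How many times larger the Cauchy tail is than the normal tail
tails$ratio <- tails$cauchy / tails$normal

print(tails)
```

The ratio column grows rapidly with t, which is the sense in which the Cauchy tails are thick relative to the normal.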

Theorem
a) E[Xi] =μi
b) V[Xi] = σii
c) cov(Xi,Xj) = σij
without proof
Example bivariate normal. Say we write

here ρ=cor(X,Y)
Theorem let X and Y be jointly normal rv's (that is, (X,Y) is bivariate normal), then X and Y are independent iff cor(X,Y)=0
proof one direction is always true: independent rv's are uncorrelated. For the other we have if ρ=0 then
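The factorization can be sketched as follows. It assumes the bivariate normal density is written in the usual form with means μ_X, μ_Y, standard deviations σ_X, σ_Y and correlation ρ; setting ρ=0 removes the cross term.

```latex
\begin{aligned}
f(x,y) &= \frac{1}{2\pi\sigma_X\sigma_Y}
          \exp\!\left\{-\frac12\left[\left(\frac{x-\mu_X}{\sigma_X}\right)^2
          + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2\right]\right\} \\
       &= \frac{1}{\sqrt{2\pi\sigma_X^2}}\, e^{-(x-\mu_X)^2/(2\sigma_X^2)}
          \cdot \frac{1}{\sqrt{2\pi\sigma_Y^2}}\, e^{-(y-\mu_Y)^2/(2\sigma_Y^2)}
        = f_X(x)\, f_Y(y)
\end{aligned}
```

The joint density is the product of the marginals, so X and Y are independent.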

Example the joint distribution of two normal rv's need not be normal
Say X~N(0,1) and let Y=-X if |X|>1 and Y=X if |X|≤1, then

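so Y~N(0,1) as well, but (X,Y) is not bivariate normal: for example f(-2,-2)=0, because X=-2 forces Y=2, whereas a bivariate normal density is positive at every point.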
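A small simulation illustrates this example. It is a sketch only: the seed and sample size are arbitrary, and the check on X+Y relies on the fact that X+Y is exactly 0 whenever |X|>1, which cannot happen with positive probability for a normal rv.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
n <- 1e5                           # arbitrary sample size
x <- rnorm(n)
y <- ifelse(abs(x) > 1, -x, x)     # Y = -X if |X| > 1, Y = X otherwise

# Y should look standard normal: mean near 0, sd near 1
y_summary <- c(mean = mean(y), sd = sd(y))

# X + Y has a point mass at 0 of size P(|X| > 1), so it is not normal
frac_zero     <- mean(x + y == 0)
expected_zero <- 2 * pnorm(-1)     # P(|X| > 1) for a standard normal

print(y_summary)
print(c(simulated = frac_zero, exact = expected_zero))
```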
On the other hand we have the following characterization of a multivariate normal distribution:
Theorem
Let X=(X1,..,Xn)T. Then X has a multivariate normal distribution if and only if every linear combination t1X1+..+tnXn has a normal distribution
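proof one direction is obvious because the marginals of a multivariate normal rv are normal and a sum of jointly normal rv's is normal. The other direction can be shown using mgf's, where the mgf of X is given by M(t) = exp(tᵀμ + tᵀΣt/2), with μ the mean vector and Σ the covariance matrix of X.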
Here are some more facts about normal rv's without proof:
Theorem
say (X,Y) is a bivariate normal rv, then
a) Z = X + Y ~ N(μX+μY, √(σX²+σY²+2σXσYρ))
b) Z = X|Y=y ~ N(μX+ρ(σX/σY)(y-μY), σX√(1-ρ²))
Base R has no routine for generating multivariate normals, but the package MASS provides one (mvrnorm). You can use it by typing
library("MASS")
then play around with this using show_mvn
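As an illustration of generating bivariate normal data with MASS, the sketch below simulates (X,Y) with mvrnorm and compares the sample mean and standard deviation of X+Y with the values μX+μY and √(σX²+σY²+2σXσYρ). All parameter values, the seed and the sample size are arbitrary choices, and the variable names are made up for this example.

```r
library(MASS)    # provides mvrnorm()

set.seed(2)      # arbitrary seed, for reproducibility only
mu_x <- 1;  mu_y <- -2            # arbitrary means
s_x  <- 2;  s_y  <- 1.5           # arbitrary standard deviations
rho  <- 0.6                       # arbitrary correlation

# Covariance matrix built from the standard deviations and the correlation
Sigma <- matrix(c(s_x^2,           rho * s_x * s_y,
                  rho * s_x * s_y, s_y^2), nrow = 2)

xy <- mvrnorm(n = 1e5, mu = c(mu_x, mu_y), Sigma = Sigma)
z  <- xy[, 1] + xy[, 2]           # Z = X + Y

sim    <- c(mean = mean(z), sd = sd(z), cor = cor(xy[, 1], xy[, 2]))
theory <- c(mean = mu_x + mu_y,
            sd   = sqrt(s_x^2 + s_y^2 + 2 * s_x * s_y * rho),
            cor  = rho)

print(rbind(sim, theory))
```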