We have previously talked about the memoryless property, and the fact that among discrete distributions on N it is unique to the geometric rv. Now we have
Theorem
X has an exponential distribution iff X is a positive continuous r.v. and P(X>s+t | X>s) = P(X>t) for all s,t > 0.
Proof:
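Assume X~E(λ), so that P(X>x) = exp(-λx) for x>0. Then
P(X>s+t | X>s) = P(X>s+t, X>s)/P(X>s) = P(X>s+t)/P(X>s) = exp(-λ(s+t))/exp(-λs) = exp(-λt) = P(X>t)
On the other hand, assume X is continuous with density f and P(X>s+t | X>s) = P(X>t) for all s,t > 0. In the proof above we saw that this implies P(X>s+t) = P(X>s)*P(X>t). Let h(x) = P(X>x) and let ε>0. Note h(0) = P(X>0) = 1 because X is positive. Then
(h(x+ε)-h(x))/ε = (h(x)h(ε)-h(x))/ε = h(x)(h(ε)-h(0))/ε
Letting ε→0 gives h'(x) = h'(0)h(x), where h'(0) is the right-hand derivative at 0. The solution of this differential equation with h(0)=1 is h(x) = exp(h'(0)x), and h'(0)<0 because h decreases to 0. Setting β = -h'(0) we find P(X>x) = exp(-βx),
and so we see X~E(β)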
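The memoryless property is easy to check numerically. Here is a small sketch in R; the rate and the values of s and t are arbitrary choices made for illustration, and pexp is R's cdf of the exponential in the rate parametrization.

```r
# Memoryless property of the exponential: P(X>s+t | X>s) = P(X>t)
# lambda, s0 and t0 are arbitrary illustration values (assumptions)
lambda <- 1.5
s0 <- 0.7
t0 <- 1.2
# lower.tail = FALSE gives the survival function P(X > x)
cond_prob <- pexp(s0 + t0, rate = lambda, lower.tail = FALSE) /
  pexp(s0, rate = lambda, lower.tail = FALSE)
uncond_prob <- pexp(t0, rate = lambda, lower.tail = FALSE)
print(c(conditional = cond_prob, unconditional = uncond_prob))
```

The two numbers should agree up to rounding, whatever positive values are used.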

Now X is said to have a gamma distribution (X~G(α,β)) with parameters (α,β), α>0 and β>0, if it has density
f(x) = x^(α-1) exp(-x/β) / (Γ(α) β^α) for x>0
where Γ(α) = ∫_0^∞ t^(α-1) exp(-t) dt is the gamma function.
By definition we have X>0, and so the gamma is the basic example of a r.v. on (0,∞), or a little more generally (using a change of variables) on any open half-infinite interval (c,∞).
show_gamma shows that for different values of α the density has different shapes, whereas different values of β give the same shape on different scales.
Note if X~G(1,β) then X~E(1/β).
Another important special case: if X~G(n/2,2), then X is called a chi-square r.v. with n degrees of freedom, denoted by X~χ²(n)
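Both special cases can be checked in R. The sketch below assumes that the second parameter of the gamma is a scale (R's scale argument) and that the exponential is parametrized by its rate; the evaluation points and parameter values are arbitrary.

```r
# Special cases of the gamma, with the second parameter taken as a scale (assumption)
x <- c(0.5, 1, 2.5, 7)   # arbitrary evaluation points
beta <- 2.3              # arbitrary scale
# G(1, beta) versus E(1/beta)
gamma_cdf <- pgamma(x, shape = 1, scale = beta)
exp_cdf <- pexp(x, rate = 1 / beta)
# G(n/2, 2) versus chi-square with n degrees of freedom
n <- 5
gamma_cdf2 <- pgamma(x, shape = n / 2, scale = 2)
chisq_cdf <- pchisq(x, df = n)
print(rbind(gamma_cdf, exp_cdf, gamma_cdf2, chisq_cdf))
```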
There is an important connection between the gamma and the Poisson distributions:
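Theorem if X~G(n,β), where n is a positive integer, and Y~P(x/β) for some x>0, then
P(X≤x) = P(Y≥n)
proof For n=1 we have P(X≤x) = 1-exp(-x/β) = 1-P(Y=0) = P(Y≥1). For n>1 write X_n for a G(n,β) r.v., then integration by parts gives
P(X_n≤x) = ∫_0^x t^(n-1) exp(-t/β) / ((n-1)! β^n) dt = -(x/β)^(n-1) exp(-x/β)/(n-1)! + ∫_0^x t^(n-2) exp(-t/β) / ((n-2)! β^(n-1)) dt = P(X_(n-1)≤x) - P(Y=n-1)
and the theorem follows by induction. Here is an R check, but note that R parametrizes the gamma by the rate 1/β, that is it uses pgamma(x,α,1/β)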
pgamma(0.75,4,2.3) = 0.09696469
1-ppois(4-1,0.75*2.3) = 0.09696469
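The same check can be run for several values of n at once. This is only a sketch: x and β are the values from the check above, and the range of n is an arbitrary choice.

```r
# Gamma-Poisson connection P(X <= x) = P(Y >= n) for n = 1,..,8
x <- 0.75
beta <- 1 / 2.3          # scale, so that R's rate is 1/beta = 2.3
n <- 1:8                 # arbitrary range of shape parameters
lhs <- pgamma(x, shape = n, scale = beta)
# P(Y >= n) = P(Y > n-1), which is the upper tail of ppois at n-1
rhs <- ppois(n - 1, lambda = x / beta, lower.tail = FALSE)
print(cbind(n, lhs, rhs))
```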

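X is said to have a beta distribution (X~Beta(α,β)) with parameters α>0 and β>0 if it has density
f(x) = Γ(α+β)/(Γ(α)Γ(β)) x^(α-1) (1-x)^(β-1) for 0<x<1
By definition we have 0<X<1, and so the beta is the basic example of a r.v. on (0,1), or a little more generally (using a change of variables) on any open finite interval. Various examples are shown by show_beta()
Special case: Beta(1,1) = U[0,1]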
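A quick numerical look at the beta in R, as a sketch only: it compares the Beta(1,1) density with the uniform density and checks that a beta density integrates to 1. The shape values 2.5 and 4 are arbitrary.

```r
# Beta(1,1) has the same density as U[0,1]
x <- seq(0.1, 0.9, by = 0.2)
beta11 <- dbeta(x, shape1 = 1, shape2 = 1)
unif <- dunif(x, min = 0, max = 1)
print(rbind(beta11, unif))
# A beta density integrates to 1 over (0,1); the shapes are arbitrary choices
total <- integrate(dbeta, lower = 0, upper = 1, shape1 = 2.5, shape2 = 4)$value
print(total)
```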
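Let's go back to the gamma distribution for a moment. Say X and Y are independent G(α,β) and let Z=X+Y. Then for z>0
fZ(z) = ∫_0^z fX(x) fY(z-x) dx = exp(-z/β) / (Γ(α)² β^(2α)) ∫_0^z x^(α-1) (z-x)^(α-1) dx
With the change of variables x=zu the integral becomes z^(2α-1) ∫_0^1 u^(α-1) (1-u)^(α-1) du = z^(2α-1) Γ(α)²/Γ(2α), because a Beta(α,α) density integrates to 1. Therefore
fZ(z) = z^(2α-1) exp(-z/β) / (Γ(2α) β^(2α))
so we see that Z~G(2α,β). In other words, the sum of independent gamma r.v.'s with the same β is again gamma.
Some special cases:
1) X,Y iid E(λ) then X+Y~G(2,1/λ) (and not exponential)
2) X,Y iid χ²(n), then X+Y~χ²(2n)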
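The first special case can be checked by doing the convolution numerically. The sketch below assumes the exponential is parametrized by its rate λ, so that the sum has a gamma density with shape 2 and scale 1/λ; the rate and the evaluation points are arbitrary.

```r
# Density of X+Y for X,Y iid E(lambda), found by numerical convolution
lambda <- 2   # arbitrary rate
conv_density <- function(z) {
  sapply(z, function(zi) {
    integrate(function(x) dexp(x, rate = lambda) * dexp(zi - x, rate = lambda),
              lower = 0, upper = zi)$value
  })
}
z <- c(0.2, 0.5, 1, 2)   # arbitrary evaluation points
conv_vals <- conv_density(z)
# Gamma density with shape 2 and scale 1/lambda
gamma_vals <- dgamma(z, shape = 2, scale = 1 / lambda)
print(rbind(conv_vals, gamma_vals))
```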


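X is said to have a normal distribution (X~N(μ,σ)) with parameters (μ,σ), σ>0, if it has density
fX(x) = 1/√(2πσ²) exp(-(x-μ)²/(2σ²)) for -∞<x<∞
If μ=0 and σ=1 it is called a standard normal, and often denoted by Z instead of X.
Careful: some papers and textbooks define the normal as X~N(μ,σ²), that is they use the variance instead of the standard deviation.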
Theorem
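a) Z~N(0,1) then X=μ+σZ~N(μ,σ)
b) X~N(μ,σ), then Z=(X-μ)/σ~N(0,1)
proof a) Because σ>0 we have FX(x) = P(μ+σZ≤x) = P(Z≤(x-μ)/σ) = FZ((x-μ)/σ), and differentiating gives fX(x) = fZ((x-μ)/σ)/σ = 1/√(2πσ²) exp(-(x-μ)²/(2σ²)). b) follows in the same way from FZ(z) = P(X≤μ+σz) = FX(μ+σz).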

one consequence of this theorem is that we can often do a proof for the standard normal, and then quickly generalize it to all normals.
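In R the standardization looks as follows. This is a small sketch with arbitrary values of μ and σ; note that pnorm, like these notes, takes the standard deviation and not the variance.

```r
# P(X <= x) for X~N(mu, sigma), directly and via Z = (X - mu)/sigma
mu <- 3       # arbitrary
sigma <- 2    # arbitrary standard deviation
x <- c(-1, 2, 3, 4.5, 8)
direct <- pnorm(x, mean = mu, sd = sigma)
standardized <- pnorm((x - mu) / sigma)
print(rbind(direct, standardized))
```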
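Example show that the function above is indeed a pdf.
a) fX(x)≥0 for all x
b) first we show that the density integrates to 1 for a standard normal. Let I = ∫_-∞^∞ exp(-x²/2) dx, then
I² = ∫_-∞^∞ ∫_-∞^∞ exp(-(x²+y²)/2) dx dy = ∫_0^2π ∫_0^∞ exp(-r²/2) r dr dθ = 2π [-exp(-r²/2)]_0^∞ = 2π
where we used x = r cos(θ), y = r sin(θ). So I = √(2π) and ∫_-∞^∞ 1/√(2π) exp(-x²/2) dx = 1.
the change of variables above is of course called the change to polar coordinates
the general case now follows easily:
P(-∞<X<∞) = P(-∞<(X-μ)/σ<∞) = P(-∞<Z<∞) = 1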
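A numerical version of this example, as a sketch: the density is typed in by hand (the function name normal_pdf is made up for this illustration) and integrated with R's integrate routine. The values of μ and σ in the second call are arbitrary.

```r
# Normal density written out by hand (normal_pdf is a name made up here)
normal_pdf <- function(x, mu = 0, sigma = 1) {
  exp(-(x - mu)^2 / (2 * sigma^2)) / sqrt(2 * pi * sigma^2)
}
# Standard normal
total_std <- integrate(normal_pdf, lower = -Inf, upper = Inf)$value
# General case with arbitrary mu and sigma
total_gen <- integrate(normal_pdf, lower = -Inf, upper = Inf, mu = 3, sigma = 2)$value
print(c(total_std, total_gen))
```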
Example find the mean and the standard deviation of a normal rv.
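The answer is E[X]=μ and sd(X)=σ. Here is a numerical sketch of this in R, using integrate on the first two moments; the values of μ and σ are arbitrary.

```r
# Mean and standard deviation of N(mu, sigma) by numerical integration
mu <- 3       # arbitrary
sigma <- 2    # arbitrary
mean_x <- integrate(function(x) x * dnorm(x, mean = mu, sd = sigma),
                    lower = -Inf, upper = Inf)$value
var_x <- integrate(function(x) (x - mean_x)^2 * dnorm(x, mean = mu, sd = sigma),
                   lower = -Inf, upper = Inf)$value
sd_x <- sqrt(var_x)
print(c(mean = mean_x, sd = sd_x))
```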
Theorem
Say X~N(μ,σ) then
1) the mgf of X is given by MX(t) = exp(μt+σ²t²/2)
2) P(X>μ) = P(X<μ) = 1/2
3) P(X>μ+x) = P(X<μ-x)
4) say X~N(μ,σ) then

proof
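1) first we find the mgf of a normal, again starting with a standard normal. Completing the square, tz-z²/2 = t²/2-(z-t)²/2, gives
MZ(t) = ∫_-∞^∞ exp(tz) 1/√(2π) exp(-z²/2) dz = exp(t²/2) ∫_-∞^∞ 1/√(2π) exp(-(z-t)²/2) dz = exp(t²/2)
because the last integrand is the density of a N(t,1). For the general case write X=μ+σZ, then
MX(t) = E[exp(t(μ+σZ))] = exp(μt) MZ(σt) = exp(μt+σ²t²/2)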

2)-4) are easy.
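Parts 1) to 3) can be checked numerically. In this sketch the values of μ, σ, t and x are arbitrary, and the mgf integrand is computed on the log scale so that it cannot overflow far out in the tails.

```r
mu <- 1        # arbitrary
sigma <- 0.5   # arbitrary standard deviation
t0 <- 0.8      # arbitrary argument of the mgf
# 1) mgf by numerical integration of exp(t*x) * f(x)
mgf_numeric <- integrate(
  function(x) exp(t0 * x + dnorm(x, mean = mu, sd = sigma, log = TRUE)),
  lower = -Inf, upper = Inf)$value
mgf_formula <- exp(mu * t0 + sigma^2 * t0^2 / 2)
print(c(numeric = mgf_numeric, formula = mgf_formula))
# 2) and 3): symmetry around mu
x0 <- 0.7
upper <- pnorm(mu + x0, mean = mu, sd = sigma, lower.tail = FALSE)  # P(X > mu+x)
lower <- pnorm(mu - x0, mean = mu, sd = sigma)                      # P(X < mu-x)
print(c(upper = upper, lower = lower))
```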
Example We have seen before that the Cauchy r.v. has very thick tails, that is the probabilities P(X>t) go to 0 only slowly as t grows (for the standard Cauchy like 1/(πt)). On the other hand the normal distribution has very thin tails. cauchy_normal() draws the two.
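To put some numbers on the difference in the tails, here is a sketch comparing P(X>t) for a standard Cauchy and a standard normal; the values of t are arbitrary.

```r
# Tail probabilities P(X > t) for standard Cauchy and standard normal
t_vals <- c(1, 2, 3, 5, 10)
cauchy_tail <- pcauchy(t_vals, lower.tail = FALSE)
normal_tail <- pnorm(t_vals, lower.tail = FALSE)
tail_ratio <- cauchy_tail / normal_tail
print(data.frame(t = t_vals, cauchy = cauchy_tail, normal = normal_tail,
                 ratio = tail_ratio))
```

The ratio grows very quickly with t, which is what thick versus thin tails means in practice.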

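X=(X1,..,Xn)T is said to have a multivariate normal distribution with mean vector μ=(μ1,..,μn)T and positive definite covariance matrix Σ=(σij) if it has density
f(x) = (2π)^(-n/2) |Σ|^(-1/2) exp(-(x-μ)TΣ⁻¹(x-μ)/2)
Theorem
a) E[Xi] = μi
b) V[Xi] = σii
c) cov(Xi,Xj) = σij
without proof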
Example bivariate normal. Say we write μ = (μX,μY)T and
Σ =
| σX²     ρσXσY |
| ρσXσY   σY²   |
here ρ=cor(X,Y)
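Theorem let (X,Y) be a bivariate normal r.v., then X and Y are independent iff cor(X,Y)=0
proof one direction is always true: independent r.v.'s are uncorrelated. For the other, if ρ=0 then Σ is diagonal and the joint density factors:
f(x,y) = 1/(2πσXσY) exp(-(x-μX)²/(2σX²) - (y-μY)²/(2σY²)) = fX(x) fY(y)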

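Example the joint distribution of two normal r.v.'s need not be normal
Say X~N(0,1) and let Y=-X if |X|>1 and Y=X if |X|≤1. Because -X has the same distribution as X and |-X|=|X|, we have
P(Y≤y) = P(X≤y, |X|≤1) + P(-X≤y, |X|>1) = P(X≤y, |X|≤1) + P(X≤y, |X|>1) = P(X≤y)
so Y~N(0,1) as well. But X<-1 forces Y=-X>1, so the joint distribution puts no probability near the point (-2,-2), for example f(-2,-2)=0, whereas a bivariate normal density is positive everywhere.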
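This example is easy to simulate. The sketch below (the seed and sample size are arbitrary) checks that Y looks like a standard normal, and that X+Y is exactly 0 whenever |X|>1. So X+Y has a point mass at 0, which a normal r.v. cannot have.

```r
set.seed(1)                       # arbitrary seed
x <- rnorm(1e5)
y <- ifelse(abs(x) > 1, -x, x)    # Y = -X if |X|>1, Y = X otherwise
# Y should behave like a N(0,1)
print(c(mean = mean(y), sd = sd(y), P_le_half = mean(y <= 0.5), pnorm_half = pnorm(0.5)))
# X+Y is exactly 0 whenever |X|>1, which has probability 2*pnorm(-1)
prop_zero <- mean(x + y == 0)
print(c(simulated = prop_zero, exact = 2 * pnorm(-1)))
# No observations at all with both X and Y below -1
both_low <- sum(x < -1 & y < -1)
print(both_low)
```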
On the other hand we have the following characterization of a multivariate normal distribution:
Theorem
Let X=(X1,..,Xn)T. Then X has a multivariate normal distribution if and only if every linear combination t1X1+..+tnXn has a normal distribution
proof one direction follows because a linear combination of the components of a multivariate normal r.v. is again normal. The other direction can be shown using mgf's, where the mgf of X is given by
MX(t) = exp(tTμ + tTΣt/2)
Here are some more facts about normal rv's without proof:
Theorem
say (X,Y) is a bivariate normal r.v., then
a) Z = X + Y ~ N(μX+μY, √(σX²+σY²+2σXσYρ))
b) Z = X|Y=y ~ N(μX+ρ(σX/σY)(y-μY), σX√(1-ρ²))
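Both facts can be checked by simulation using only base R. In this sketch the bivariate normal is built from two independent standard normals, and all parameter values, the seed and the sample size are arbitrary. For b) we use that the conditional mean is a straight line in y, so a least squares fit of x on y should recover the slope ρσX/σY and the residual standard deviation σX√(1-ρ²).

```r
set.seed(1)                      # arbitrary seed
n <- 1e5
mu_x <- 1; mu_y <- 2             # arbitrary means
sigma_x <- 2; sigma_y <- 3       # arbitrary standard deviations
rho <- 0.5                       # arbitrary correlation
z1 <- rnorm(n)
z2 <- rnorm(n)
y <- mu_y + sigma_y * z1
x <- mu_x + sigma_x * (rho * z1 + sqrt(1 - rho^2) * z2)
# a) mean and standard deviation of X+Y
z <- x + y
print(c(mean(z), mu_x + mu_y))
print(c(sd(z), sqrt(sigma_x^2 + sigma_y^2 + 2 * sigma_x * sigma_y * rho)))
# b) slope and spread of X given Y=y
fit <- lm(x ~ y)
slope <- unname(coef(fit)[2])
resid_sd <- summary(fit)$sigma
print(c(slope, rho * sigma_x / sigma_y))
print(c(resid_sd, sigma_x * sqrt(1 - rho^2)))
```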
Base R does not have a routine to generate multivariate normals, but there is a package of R routines that does, called MASS (the routine is mvrnorm). You can use these by typing
library("MASS")
then play around with this using show_mvn
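Here is a short sketch of how mvrnorm from MASS can be used. The mean vector, the standard deviations, the correlation, the seed and the sample size are all arbitrary choices. Note that mvrnorm takes the covariance matrix Σ, so the standard deviations have to be squared.

```r
library(MASS)
set.seed(1)                          # arbitrary seed
mu <- c(1, 2)                        # arbitrary mean vector
sigma_x <- 2; sigma_y <- 3; rho <- 0.5
# Covariance matrix of a bivariate normal
Sigma <- matrix(c(sigma_x^2,               rho * sigma_x * sigma_y,
                  rho * sigma_x * sigma_y, sigma_y^2), nrow = 2)
xy <- mvrnorm(n = 1e5, mu = mu, Sigma = Sigma)
# Sample mean vector and covariance matrix should be close to mu and Sigma
sample_mean <- colMeans(xy)
sample_cov <- cov(xy)
print(sample_mean)
print(sample_cov)
```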