Inequalities

Inequalities are very important in probability, both in theory and in practical applications. We start with a lemma:

Lemma Let a and b be any positive numbers, and let p and q be any positive numbers with 1/p+1/q=1 (so p>1 and q>1). Then

$$\frac{a^p}{p} + \frac{b^q}{q} \ge ab$$

with "=" iff $a^p = b^q$.

proof
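Fix b, and consider the function g with

$$g(a) = \frac{a^p}{p} + \frac{b^q}{q} - ab, \quad a > 0$$

Then $g'(a) = a^{p-1} - b$, so g has a minimum at the point $a_0$ with $a_0^{p-1} = b$. So

$$g(a) \ge g(a_0) = \frac{a_0^p}{p} + \frac{(a_0^{p-1})^q}{q} - a_0 \cdot a_0^{p-1} = a_0^p \left( \frac{1}{p} + \frac{1}{q} - 1 \right) = 0$$

because (p-1)q=p (1/p+1/q=1 → 1+p/q=p → p/q=p-1)
Moreover the minimum of g is unique because g is strictly convex for a>0 ($g''(a) = (p-1)a^{p-2} > 0$), so "=" holds iff $a^{p-1} = b$, which is the same as $a^p = b^q$ (raise both sides to the power q and use (p-1)q=p).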
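A quick numerical check can make the lemma concrete. The sketch below is only an illustration, not part of the proof: the helper name `young_gap`, the choice p=3 and the grid of test values are all assumptions made for this example. It evaluates the difference between the two sides on a grid and at a point where $a^p = b^q$.

```python
def young_gap(a: float, b: float, p: float) -> float:
    """Return a^p/p + b^q/q - a*b, where q is the conjugate exponent of p."""
    q = p / (p - 1.0)  # solves 1/p + 1/q = 1
    return a**p / p + b**q / q - a * b

p = 3.0  # illustrative choice; any p > 1 works
grid = [0.1 * i for i in range(1, 51)]

# The gap should be nonnegative everywhere (small tolerance for rounding)
assert all(young_gap(a, b, p) >= -1e-12 for a in grid for b in grid)

# Equality case: choose b = a^(p-1), so that a^p = b^q
a = 1.7
b = a ** (p - 1.0)
print(young_gap(a, b, p))  # zero up to rounding error
```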

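Hölder's Inequality Let X and Y be any two rvs, and let p and q be as above. Then

$$|E[XY]| \le E|XY| \le (E|X|^p)^{1/p} (E|Y|^q)^{1/q}$$

proof
The first "≤" follows from -|XY|≤XY≤|XY|. For the second "≤" define

$$a = \frac{|X|}{(E|X|^p)^{1/p}}, \quad b = \frac{|Y|}{(E|Y|^q)^{1/q}}$$

Applying the lemma to a and b gives

$$\frac{|X|^p}{p \, E|X|^p} + \frac{|Y|^q}{q \, E|Y|^q} \ge \frac{|XY|}{(E|X|^p)^{1/p} (E|Y|^q)^{1/q}}$$

Taking expectations, the left side becomes 1/p+1/q=1, which is the second "≤".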

The case p=q=2 is so famous it has its own name:
Cauchy-Schwarz Inequality

$$|E[XY]| \le E|XY| \le (E X^2)^{1/2} (E Y^2)^{1/2}$$

These inequalities are stated here in terms of expectations, but they hold in general for sums and integrals as well.
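As an illustration of the version for sums, the following sketch compares the two sides of Hölder's inequality for a pair of finite sequences. The function name `holder_sides` and the sample vectors are assumptions made for this example; p=2 gives the Cauchy-Schwarz case.

```python
def holder_sides(x, y, p):
    """Return (sum |x_i y_i|, ||x||_p * ||y||_q) with 1/p + 1/q = 1."""
    q = p / (p - 1.0)  # conjugate exponent
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    norm_x = sum(abs(a) ** p for a in x) ** (1.0 / p)
    norm_y = sum(abs(b) ** q for b in y) ** (1.0 / q)
    return lhs, norm_x * norm_y

# Illustrative data, not taken from the text
x = [1.0, -2.0, 3.0]
y = [4.0, 0.5, -1.0]

for p in (1.5, 2.0, 3.0, 10.0):
    lhs, rhs = holder_sides(x, y, p)
    assert lhs <= rhs + 1e-12  # Hölder: left side never exceeds right side
    print(p, lhs, rhs)
```

The left side does not depend on p, while the right side changes with p, so different choices of p give bounds of different quality for the same data.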

If we set Y=1 we get
$E|X| \le (E|X|^p)^{1/p}$, 1<p<∞
For 1<r<p, if we replace |X| by $|X|^r$, we get
$E|X|^r \le (E|X|^{pr})^{1/p}$
and writing s=pr (which implies s>r) and raising both sides to the power 1/r we get
Liapunov's Inequality $(E|X|^r)^{1/r} \le (E|X|^s)^{1/s}$ for 1<r<s<∞
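Liapunov's inequality says that $(E|X|^r)^{1/r}$ is nondecreasing in r. A small sketch can check this on the empirical distribution of a simulated sample, for which the inequality must hold exactly up to rounding. The helper `moment_norm`, the normal sample and the seed are assumptions chosen for illustration.

```python
import random

def moment_norm(sample, r):
    """Empirical version of (E|X|^r)^(1/r)."""
    return (sum(abs(x) ** r for x in sample) / len(sample)) ** (1.0 / r)

random.seed(0)  # fixed seed so the run is reproducible
sample = [random.gauss(0.0, 1.0) for _ in range(1000)]

orders = (1.5, 2.0, 3.0, 4.0)
norms = [moment_norm(sample, r) for r in orders]

# Each norm should be no larger than the next one
assert all(u <= v + 1e-12 for u, v in zip(norms, norms[1:]))
for r, value in zip(orders, norms):
    print(r, value)
```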

Next a new type of inequality:

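Markov's Inequality:
If X takes on only nonnegative values, then for any a>0

$$P(X \ge a) \le \frac{EX}{a}$$

proof:
Let I be the indicator of the event {X≥a}. Because X≥0 we have X ≥ X·I ≥ a·I, and taking expectations

$$EX \ge a \, E[I] = a \, P(X \ge a)$$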

Markov's inequality implies what is perhaps the most famous inequality in probability:

Chebyshev's Inequality:
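If X is a r.v. with mean μ and variance σ², then for any k>0:

$$P(|X-\mu| > k\sigma) \le \frac{1}{k^2}$$

proof:
If σ=0 the left side is 0. Otherwise apply Markov's inequality to the nonnegative r.v. $(X-\mu)^2$ with $a = k^2\sigma^2$:

$$P(|X-\mu| > k\sigma) \le P((X-\mu)^2 \ge k^2\sigma^2) \le \frac{E(X-\mu)^2}{k^2\sigma^2} = \frac{1}{k^2}$$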

Example Consider the uniform random variable with f(x) = 1 if 0<x<1, 0 otherwise. We already know that the mean is μ=0.5 and the standard deviation is σ=1/√12 = 0.2887. Now Chebyshev says
$P(|X-0.5| > k \cdot 0.2887) \le 1/k^2$
For example
$P(|X-0.5| > 1 \cdot 0.2887) \le 1/1^2 = 1$ (rather boring!)
or
$P(|X-0.5| > 3 \cdot 0.2887) \le 1/3^2 = 1/9$
Actually P(|X-0.5|>0.866) = 0 because |X-0.5| can never exceed 0.5, so this is not a very good upper bound.
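To see how loose the bound is over a range of k, the sketch below compares the Chebyshev bound with the exact probability for this uniform example. For 0≤t≤0.5 the exact value is P(|X-0.5|>t) = 1-2t, and it is 0 for larger t. The function names and the chosen values of k are assumptions made for illustration.

```python
import math

SIGMA = 1.0 / math.sqrt(12.0)  # standard deviation of U(0,1)

def exact_tail(k):
    """Exact P(|X - 0.5| > k*sigma) for X uniform on (0,1)."""
    return max(0.0, 1.0 - 2.0 * k * SIGMA)

def chebyshev_bound(k):
    """Upper bound 1/k^2 from Chebyshev's inequality."""
    return 1.0 / k**2

for k in (1.0, 1.5, 2.0, 3.0):
    exact, bound = exact_tail(k), chebyshev_bound(k)
    assert exact <= bound  # the bound always holds
    print(f"k={k}: exact={exact:.4f}, bound={bound:.4f}")
```

The bound makes no use of the shape of the distribution, which is why it can be far from the exact probability in a specific case like this one.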

For the last inequality first recall
Definition
A function g is said to be convex if
$g(\lambda x + (1-\lambda) y) \le \lambda g(x) + (1-\lambda) g(y)$
for all x and y and all 0<λ<1

Now
Jensen's Inequality
For any rv X, if g is a convex function we have Eg(X)≥g(EX)

proof Let l(x) be a tangent line to g(x) at the point x=EX. Write l(x)=a+bx for some a and b. Now by the convexity of g we have g(x)≥a+bx for all x, and so
Eg(X)≥E(a+bX)=a+bEX=l(EX)=g(EX)

Example $g(x) = x^2$ is convex, and so $EX^2 \ge (EX)^2$, which implies $V(X) = EX^2 - (EX)^2 \ge 0$

Example For x>0, g(x)=1/x is convex, so for any rv X that takes only positive values E(1/X)≥1/EX
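Both examples can be checked numerically on the empirical distribution of a sample, where Jensen's inequality also holds. This is only a sketch: the uniform sample on (0.5, 2), the seed and the helper `mean` are assumptions chosen so that all values are positive.

```python
import random

def mean(values):
    """Arithmetic mean, the empirical analogue of an expectation."""
    return sum(values) / len(values)

random.seed(1)  # fixed seed so the run is reproducible
sample = [random.uniform(0.5, 2.0) for _ in range(10_000)]  # positive values only

m = mean(sample)

# g(x) = x^2: E[X^2] >= (E[X])^2
assert mean([x**2 for x in sample]) >= m**2

# g(x) = 1/x on x > 0: E[1/X] >= 1/E[X]
assert mean([1.0 / x for x in sample]) >= 1.0 / m

print(mean([x**2 for x in sample]), m**2)
print(mean([1.0 / x for x in sample]), 1.0 / m)
```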