Markov's inequality:

If X is a non-negative random variable and a is a positive constant, then

P( X >= a ) <= E[X]/a

Proof: Let I be the indicator random variable that is 1 when X >= a and 0 otherwise. Then

I <= X/a (if X >= a then X/a >= 1 = I, and otherwise I = 0 <= X/a since X is non-negative), so

E[I] = P( I = 1 ) = P( X >= a ) <= E[ X/a ] = E[X]/a
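
As a quick numerical illustration, here is a minimal Python sketch (assuming NumPy is available; the exponential(1) distribution and the threshold a = 3 are arbitrary choices for illustration, not part of the statement above):

import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)   # non-negative samples with E[X] = 1
a = 3.0

empirical = np.mean(x >= a)      # estimate of P(X >= a)
markov_bound = x.mean() / a      # E[X]/a, estimated from the same samples

print(f"P(X >= {a}) ~ {empirical:.4f}  <=  E[X]/a ~ {markov_bound:.4f}")

For the exponential(1) distribution the true tail probability P(X >= 3) = e^-3, about 0.05, while the Markov bound is 1/3, which shows how loose the bound can be.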

This is the best possible bound if all we know is the expectation, but it is often too weak. We can often calculate the variance as well, which allows us to use

Chebyshev's inequality:

If X is a random variable with finite variance and a is a positive constant, then

P( |X - E[X]| >= a ) <= var[X]/a^2

Proof: P( |X - E[X]| >= a ) = P( ( X - E[X] )^2 >= a^2 ) <= E[ ( X - E[X] )^2 ]/a^2 = var[X]/a^2

where the inequality is an application of Markov's inequality.
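
The same kind of numerical check works here; this sketch again assumes NumPy and reuses the arbitrary exponential(1) example, for which E[X] = 1 and var(X) = 1:

import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)     # E[X] = 1, var(X) = 1
a = 3.0

empirical = np.mean(np.abs(x - x.mean()) >= a)   # estimate of P(|X - E[X]| >= a)
chebyshev_bound = x.var() / a**2                 # var(X)/a^2

print(f"P(|X - E[X]| >= {a}) ~ {empirical:.4f}  <=  var(X)/a^2 ~ {chebyshev_bound:.4f}")

For this distribution the deviation event is X >= 4, with true probability e^-4, about 0.02, so the bound of 1/9 is valid but still far from tight.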


It is often useful to calculate the variance of a sum of random variables when applying Chebyshev's inequality. Unlike expectation, variance is not linear. For example, if X takes the values 0 and 1, each with probability 1/2, and Y = -X, then var(X) = var(Y) = 1/4, but X + Y is always zero, so its variance is 0. Luckily, if X and Y are independent, we can sum the variances:

var( X + Y ) = E[ ( X + Y )^2 ] - E[ X + Y ]^2
= E[X^2] + 2E[XY] + E[Y^2] - E[X]^2 - 2E[X]E[Y] - E[Y]^2
= E[X^2] - E[X]^2 + E[Y^2] - E[Y]^2
= var(X) + var(Y)

Here we have used linearity of expectation and the fact that the expectation of the product of independent random variables is the product of the expectations.
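
Both the counterexample and the derivation are easy to check numerically. This sketch assumes NumPy and uses the 0/1 coin from the example above, together with an independent copy of it (the independent copy is an illustrative addition, not from the text):

import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

x = rng.integers(0, 2, size=n)        # X in {0, 1}, each with probability 1/2
y_dep = -x                            # Y = -X: perfectly dependent on X
y_ind = rng.integers(0, 2, size=n)    # an independent variable with the same distribution

print(np.var(x), np.var(y_dep))       # both ~ 0.25
print(np.var(x + y_dep))              # ~ 0.0, not var(X) + var(Y) = 0.5
print(np.var(x + y_ind))              # ~ 0.5 = var(X) + var(Y), as derived above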