Markov's inequality:

If X is a non-negative random variable and a is a positive constant, then

P( X >= a ) <= E[X]/a

Proof: Let I be the indicator random variable that is 1 when X >= a and 0 otherwise. Then

I <= X/a (if X >= a then X/a >= 1 = I, and otherwise I = 0 <= X/a since X is non-negative), so

E[I] = P( I = 1 ) = P( X >= a ) <= E[ X/a ] = E[X]/a
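A quick simulation makes the bound concrete. Here X is uniform on {0, ..., 9} (so E[X] = 4.5) and the threshold a = 6 is an arbitrary choice for illustration:

```python
import random

# Empirical check of Markov's inequality for a non-negative random
# variable: X ~ Uniform{0, 1, ..., 9}, so E[X] = 4.5.
random.seed(0)
samples = [random.randrange(10) for _ in range(100_000)]
mean = sum(samples) / len(samples)

a = 6
p = sum(1 for x in samples if x >= a) / len(samples)  # estimate of P(X >= a)
bound = mean / a                                      # Markov bound E[X]/a

print(p, bound)
```

The true probability P(X >= 6) is 0.4, while the Markov bound is 4.5/6 = 0.75: the inequality holds, but it is far from tight here.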

This is the best we can do if all we know is the expectation, but it is often too weak. We can often calculate the variance as well, which allows us to use

Chebyshev's inequality:

P( |X - E[X]| >= a ) <= var(X)/a^2

Proof: P( |X - E[X]| >= a ) = P( ( X - E[X] )^2 >= a^2 ) <= E[ ( X - E[X] )^2 ]/a^2 = var(X)/a^2

where the inequality is an application of Markov's inequality.
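A simulation of the same flavor checks Chebyshev's inequality. Here X is a sum of 10 fair coin flips (E[X] = 5, var(X) = 10/4 = 2.5), and the deviation a = 3 is an arbitrary choice for illustration:

```python
import random

# Empirical check of Chebyshev's inequality for X = sum of 10 fair
# coin flips: E[X] = 5, var(X) = 10 * (1/4) = 2.5.
random.seed(0)
n_flips, trials = 10, 100_000
samples = [sum(random.randrange(2) for _ in range(n_flips))
           for _ in range(trials)]

mu = n_flips / 2      # E[X] = 5
v = n_flips / 4       # var(X) = 2.5

a = 3
p = sum(1 for x in samples if abs(x - mu) >= a) / trials  # P(|X - E[X]| >= a)
bound = v / a**2                                          # Chebyshev bound

print(p, bound)
```

The true tail probability P(|X - 5| >= 3) is 112/1024, about 0.109, against a Chebyshev bound of 2.5/9, about 0.278.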

It is often useful to calculate the variance of the sum of random variables when applying Chebyshev's inequality. Unlike expectation, variance is not linear. For example, if X has values 0 and 1, each with probability 1/2, and Y = - X, then var(X) = var(Y) = 1/4, but X + Y is always zero, so its variance is 0. Luckily, if X and Y are independent we can sum the variances:

var( X + Y ) = E[ ( X + Y )^2 ] - E[ X + Y ]^2
= E[X^2] + 2E[XY] + E[Y^2] - E[X]^2 - 2E[X]E[Y] - E[Y]^2
= E[X^2] - E[X]^2 + E[Y^2] - E[Y]^2
= var(X) + var(Y)

Here we have used linearity of expectation and the fact that the expectation of the product of independent random variables is the product of the expectations.
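Both halves of this point can be seen in a short simulation: with the dependent pair Y = -X from the example above, the variance of the sum collapses to zero, while an independent copy of X gives var(X + Y) close to var(X) + var(Y) = 1/2:

```python
import random

def var(xs):
    # Population variance: E[(X - E[X])^2], estimated from samples.
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

random.seed(0)
n = 100_000
x = [random.randrange(2) for _ in range(n)]      # X in {0, 1}, each w.p. 1/2
y_dep = [-xi for xi in x]                        # Y = -X: fully dependent
y_ind = [random.randrange(2) for _ in range(n)]  # independent copy of X

v_dep = var([u + w for u, w in zip(x, y_dep)])   # var(X + Y) with Y = -X
v_ind = var([u + w for u, w in zip(x, y_ind)])   # var(X + Y), independent

print(v_dep, v_ind)   # ~0.0 and ~0.5
```

The dependent sum is identically zero, so its variance is exactly 0, while the independent sum's variance matches var(X) + var(Y) = 1/4 + 1/4 up to sampling noise.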