Assume that all n outcomes are equally likely.
After one observation, you've seen one outcome.
After two observations, you've seen the same outcome twice with probability 1/n and two different outcomes with probability 1 - 1/n.
After three observations, you've seen the same outcome three times with probability 1/n2. To see three different outcomes, you must have seen two different outcomes on the first two observations. The conditional probability of a new outcome on the third observation is 1 - 2/n, so the probability of three different outcomes is (1 - 1/n.) * (1 - 2/n). The case of two different outcomes can be split into two subcases depending on whether the first two observations were the same or different. The formula is:
1/n * (1 - 1/n) + (1 - 1/n) * 2/n
We could extend this approach, but the formula looks like it's getting messy, so let's try another approach. The probability of seeing a new coupon only changes after each new coupon is seen. While waiting for the ith coupon (after having seen (i - 1) different coupons) the probability of a new coupon is
pi = 1 - (i - 1)/n
so if we let Xi be the waiting time between coupons i-1 and i we get a random variable with geometric distribution and expectation n/(n - i + 1). Let X be the number of coupons bought until all coupons have been collected. X is the sum of the Xi, and (by linearity of expectation) its expectation is the sum of n/(n - i + 1). This sum is n times the nth Harmonic number, so it is approximately n * ln n
We can easily calculate this value, and we can use Markov's inequality to (for example) bound the probability that we have to wait N times this long by 1/N. Chebyshev's inequality gives a better bound, but to use it we need to calculate the variance of X. The Xi are independent, since the probabilities just depend on i and n. This means that the variance is the sum of the variances of the geometric random variables Xi.
For a geometric random variable, G, with parameter p we have seen that the expectation is 1/p. Now we need to calculate
E[G2] = sum over
By differentiating the series used for E[G] term by term,
we get (2 - p)/p2, so
var( G ) = (2 - p)/p2 - 1/p2
= (1 - p)/p2 < 1/p2
The sum we get for the variance is similar to the one for the
expected value, except that each term is squared. In this case,
the series converges to pi2/6, so var(X) <
(n * pi)2/6
By differentiating the series used for E[G] term by term, we get (2 - p)/p2, so
var( G ) = (2 - p)/p2 - 1/p2 = (1 - p)/p2 < 1/p2
The sum we get for the variance is similar to the one for the expected value, except that each term is squared. In this case, the series converges to pi2/6, so var(X) < (n * pi)2/6