A **hypothesis** is a function on the sample space, giving a value
for each point in the sample space. If the possible values are {0, 1}
then we can identify a hypothesis with the subset of those points
that are given value 1. The **error** of a hypothesis is the
probability of the set of points where it disagrees with the
true hypothesis. **Learning from examples** is the process of making
independent random observations and eliminating the hypotheses
that disagree with those observations.

How many examples do you need to see so that, with a given level
of confidence, any hypothesis consistent with your observations
has error less than *E*?
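The elimination process can be sketched as a small simulation. Everything below is an illustrative choice, not from the text: a sample space of 20 points, a hypothesis class of prefix sets [0, *k*), and one of them designated as the true hypothesis.

```python
import random

def learn_by_elimination(hypotheses, true_h, points, m, rng):
    """Keep only the hypotheses consistent with m random labeled examples."""
    version_space = list(hypotheses)
    for _ in range(m):
        x = rng.choice(points)          # independent random observation
        label = x in true_h             # label given by the true hypothesis
        # eliminate every hypothesis that disagrees with this observation
        version_space = [h for h in version_space if (x in h) == label]
    return version_space

# Illustrative setup: 20 points, hypotheses are the prefixes [0, k),
# and the target concept is [0, 12).
points = list(range(20))
hypotheses = [frozenset(range(k)) for k in range(21)]   # |H| = 21
true_h = frozenset(range(12))

survivors = learn_by_elimination(hypotheses, true_h, points, 30,
                                 random.Random(0))
print(len(survivors), true_h in survivors)
```

The true hypothesis is consistent with every observation, so it always survives; the question in the text is how large *m* must be before the other survivors are also low-error.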

Suppose we have a finite set of hypotheses, *H*, and that we make *m*
observations. If *h* is a hypothesis with error greater than *E*,
then the probability that it will be consistent with a given observation
is less than 1 - *E*, and the probability that it will be consistent
with all *m* observations is less than (1 - *E*)^*m*,
which is at most exp( -*Em* ) since 1 - *E* ≤ exp( -*E* ). Therefore,
by the union bound, the total probability that some hypothesis with
error greater than *E* survives all *m* observations is less than
|*H*| exp( -*Em* ). We can set this bound at some
desired level, say *d*, and solve for *m*, giving

*m* > (ln |*H*| + ln(1/*d*)) / *E*
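Plugging numbers into the bound is straightforward; here is a minimal sketch, where the function name and the particular values (|*H*| = 1000, *E* = 0.05, *d* = 0.05) are illustrative assumptions:

```python
import math

def sample_complexity(h_size, error, delta):
    """Smallest integer m with |H| * exp(-error * m) <= delta,
    i.e. m >= (ln|H| + ln(1/delta)) / error."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / error)

# Illustrative values: 1000 hypotheses, target error 0.05, confidence 95%.
m = sample_complexity(1000, 0.05, 0.05)
print(m)  # → 199

# Sanity check: with this m the failure bound is indeed at most delta.
assert 1000 * math.exp(-0.05 * m) <= 0.05
```

Note that *m* grows only logarithmically in |*H*| and in 1/*d*, but linearly in 1/*E*.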

### Reference:

Tom M. Mitchell, *Machine Learning*, McGraw-Hill, 1997.