Next: Three Problems for HMMs
An Introduction to Hidden Markov Models
James G. Schmolze
Spring 2001
This summary is taken largely from ``An Introduction to Hidden Markov Models''
by Rabiner and Juang, IEEE ASSP Magazine, 3(1), January 1986, 4-16. Several
additions have been made to in this paper.
We begin with a definition of a finite (non-hidden) Markov model (MM),
which is intended to model certain types of stochastic processes.
- There is a finite set of states,
Q = {q1,..., qN}, whose cadinality
is N, and the process being modeled is in exactly one state from Q
at all times. Moreover, the process enjoys the Markov property, namely,
that the history that led to the current state is irrelevant to the future behavior
of the process. All that matters to future behavior is the current state.
- At each clock time, the process transitions to the next state (which may be
the same as the previous state) based upon a transition probability distribution,
ai, j, that depends only upon the previous state. Here ai, j
is the probability that, when in state i the process will transition
next to state j.
Clearly, a proper MM enjoys the property that the probability of transitioning
from any given state to some next state is 1:
A finite hidden Markov model (HMM) is similar but the state information
is hidden from the observer. (Some call an HMM a partially observable
Markov process.) Moreover, there is a second stochastic process where the process
produces an observation in each state.
- There is a finite set of possible observations,
V = {v1,..., vM},
whose cardinality is M.
- There is a probability distribution of producing a given observation in each
state. Here, bi(k) is the probability of observing vk when
the process is in state i.
Clearly, a proper HMM enjoys the properties of MMs and also the fact that the
probability of producing some observation in each state is 1:
Moreover, we can view a MM as a special case of an HMM where V = Q and
where the observation produced at each state is always the state itself,
i
Q.bi(i) = 1.
To complete our notation, we let
be the initial state distribution.
Here,
represents the probability that the process begins in
state i where
We use
to refer to a particular HMM model, i.e., to a particular
< Q, V, a, b,
>.
In what follows, we will work with respect to a specific sequence of observations
that we call
O = < O1,..., OT > where T is the length of
the observation sequence and where each
Ot
V. We will often refer
to the state sequence that produced O and will call it
I = < I1,..., IT >.
Next: Three Problems for HMMs
Webmaster: Jim Schmolze, schmolze@eecs.tufts.edu. [Get postscript version]