next up previous
Next: Three Problems for HMMs

An Introduction to Hidden Markov Models

James G. Schmolze

Spring 2001

This summary is taken largely from ``An Introduction to Hidden Markov Models'' by Rabiner and Juang, IEEE ASSP Magazine, 3(1), January 1986, 4-16. Several additions have been made to in this paper.

We begin with a definition of a finite (non-hidden) Markov model (MM), which is intended to model certain types of stochastic processes.

Clearly, a proper MM enjoys the property that the probability of transitioning from any given state to some next state is 1:

$\displaystyle \forall$i $\displaystyle \in$ Q.$\displaystyle \left(\vphantom{ \sum _{j\in Q}a_{i,j}}\right.$$\displaystyle \sum_{j\in Q}^{}$ai, j$\displaystyle \left.\vphantom{ \sum _{j\in Q}a_{i,j}}\right)$ = 1.

A finite hidden Markov model (HMM) is similar but the state information is hidden from the observer. (Some call an HMM a partially observable Markov process.) Moreover, there is a second stochastic process where the process produces an observation in each state.

Clearly, a proper HMM enjoys the properties of MMs and also the fact that the probability of producing some observation in each state is 1:

$\displaystyle \forall$i $\displaystyle \in$ Q.$\displaystyle \left(\vphantom{ \sum _{k\in V}b_{i}(k)}\right.$$\displaystyle \sum_{k\in V}^{}$bi(k)$\displaystyle \left.\vphantom{ \sum _{k\in V}b_{i}(k)}\right)$ = 1.

Moreover, we can view a MM as a special case of an HMM where V = Q and where the observation produced at each state is always the state itself, $ \forall$i $ \in$ Q.bi(i) = 1.

To complete our notation, we let $ \pi$ be the initial state distribution. Here, $ \pi_{i}^{}$ represents the probability that the process begins in state i where

$\displaystyle \left(\vphantom{ \sum _{i\in Q}\pi _{i}}\right.$$\displaystyle \sum_{i\in Q}^{}$$\displaystyle \pi_{i}^{}$$\displaystyle \left.\vphantom{ \sum _{i\in Q}\pi _{i}}\right)$ = 1.

We use $ \lambda$ to refer to a particular HMM model, i.e., to a particular < Q, V, a, b,$ \pi$ >.

In what follows, we will work with respect to a specific sequence of observations that we call O = < O1,..., OT > where T is the length of the observation sequence and where each Ot $ \in$ V. We will often refer to the state sequence that produced O and will call it I = < I1,..., IT >.




next up previous
Next: Three Problems for HMMs
Webmaster: Jim Schmolze, schmolze@eecs.tufts.edu.  [Get postscript version]