One method is to use the law of total probability and condition on all possible
state sequences, I, of length T:
P(O|
) =
P(O| I,
)×P(I|
).
The probability of a given sequence of states is simple:
P(I|
) =
×aI1, I2×...×aIT - 1, IT.
The probability of a given observation sequence given a state sequence is also
simple:
P(O| I,
) = bI1(O1)×...×bIT(OT).
But the number of state sequences is exponential, so this is a poor method.
The standard approach is to use the forward-backward procedure, which takes advantage of the Markov property, namely that the path to a state is irrelevant to the future behavior of the process after being in that state. First we define
To see this, we first apply the chain rule and get
(j) = P(It = j|
)×P(O1,..., Ot| It = j,
).
Next we note that the law of total probability tells us that
Moverover,
is easily calculated iteratively:
| 1. |
| 2. |
Notice that calculating
P(O|
) requires on the order of N×T
calculations using
as compared to requiring an exponential
number of calculations.
represents the forward part, and
represents the backward part. Let
We can also use
to determine
P(O|
).
First we calculate
P(O1|
):
Next, we calculate
P(O2,..., OT| O1,
):
But once we condition on I1 = j then we can ignore the conditions O1 because knowing the obsevation produced at time 1 gives us no more information once we know the state at time 1. Thus, we get the simplification
Putting these together, we get
is also calculated easily.
| 1. |
| 2. |
There are a few useful quantities that we can calculate with
and
that we now note.