The Aho-Corasick Algorithm

Definition: Given a set of patterns {P1, P2, ..., Pz} the keyword tree K is a rooted directed tree with edges labelled by single characters and no two edges out of a node labelled by the same character. Each leaf corresponds in the natural way to a pattern, and each pattern corresponds to a node (which may not be a leaf for a pattern that is a prefix of another pattern.)

Temporary assumption: no pattern is a substring of another pattern

Definitions: For a node v, L(v) is the concatenation of labels on the path from the root to v, and lp(v) is the length of the longest proper suffix of L(v) that is a prefix of some pattern. A failure link is (v, nv), where nv is the node in K labelled by this suffix.