MATLAB: Hidden Markov Models
We can model many things with Markov Chains, but sometimes we do not have access to the full information.
For instance, we can observe the movements of a stock price. Price changes follow trends in the economy, but we never get to see the state of the economy itself, which is too big and complicated to observe directly.
\[\begin{matrix} H_1 & \rightarrow & H_2 & \rightarrow & \cdots & \rightarrow & H_{T-1} & \rightarrow & H_T & \rightarrow & \cdots \\ \downarrow & & \downarrow & & & & \downarrow & & \downarrow \\ V_1 & & V_2 & & \cdots & & V_{T-1} & & V_T \end{matrix}\]
In a Hidden Markov Model, we have a first-order stationary process \(H_1, H_2, \ldots\)
- These take values in a finite set of states
- They move between these states according to transition probabilities
On top of this, there is a mechanism that produces variables \(V\), which are observable, while the \(H\) variables stay hidden.
In other words, each \(V_t\) depends on the other variables only through its own hidden state \(H_t\).
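Because of this structure, the joint distribution factorises (a standard HMM identity):
\[\mathbb{P}(H_1, \ldots, H_T, V_1, \ldots, V_T) = \mathbb{P}(H_1)\,\mathbb{P}(V_1 \mid H_1) \prod_{t=2}^{T} \mathbb{P}(H_t \mid H_{t-1})\,\mathbb{P}(V_t \mid H_t)\]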
Use cases of Hidden Markov Models
Speech Recognition
Suppose we have a sound recording, and we want to recover the text from the sounds. Now, text follows a Markov Chain very closely.
Speech can be thought of as emissions from the text, and we can easily estimate the parameters of this Markov Chain.
Joint Distribution
Here is a Hidden Markov Model with:
- Hidden states \(s_1, s_2, s_3\)
- Transition probabilities \(p\)
- Emission probabilities \(b\) (found from the emission matrix \(B\))
We calculate the probability of a joint state as:
\[\mathbb{P}(H_1 = s_3, H_2 = s_1, H_3 = s_2, H_4 = s_1, V_1 = x_2, V_2 = x_2, V_3 = x_1, V_4 = x_3) \\ = p_3^{\text{initial}} \, p_{3,1} \, p_{1,2} \, p_{2,1} \, b_{3,2} \, b_{1,2} \, b_{2,1} \, b_{1,3}\]
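In code, this product is just a running multiplication along the path. Here is a minimal sketch (the function name joint_prob is a choice made here, not from the notes; M, p and B are the transition matrix, initial probabilities and emission matrix, as in the MATLAB section below):
function pr = joint_prob(M,p,B,h,v)
% JOINT_PROB(M,p,B,h,v) multiplies out the joint probability of a
% hidden path h and an observation sequence v for the HMM with
% transition matrix M, initial probabilities p and emission matrix B
pr = p(h(1)) * B(h(1),v(1));                 % initial state, first emission
for t = 2:length(h)
    pr = pr * M(h(t-1),h(t)) * B(h(t),v(t)); % transition, then emission
end
end
For the joint state above, joint_prob(M,p,B,[3 1 2 1],[2 2 1 3]) reproduces the product term by term.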
Filtering
Filtering problems are about finding the distribution of the final hidden state of the HMM, given the observations so far.
Suppose we have the following observations:
\[v_1, v_2, \ldots, v_T\]
We want to find the following probability for every state \(s_i\):
\[\mathbb{P}(H_T = s_i, v_1, \ldots, v_T)\]
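This is a joint probability; once we have it for every state, the conditional (filtered) distribution follows by normalising:
\[\mathbb{P}(H_T = s_i \mid v_1, \ldots, v_T) = \frac{\mathbb{P}(H_T = s_i, v_1, \ldots, v_T)}{\sum_{j=1}^N \mathbb{P}(H_T = s_j, v_1, \ldots, v_T)}\]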
Naive Approach
To do this, we just sum over all paths that end in \(H_T = s_i\).
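Spelled out in MATLAB, the naive sum might look like this (a sketch for checking only; the function name alpha_bruteforce and the integer path-decoding are choices made here, not from the notes):
function aT = alpha_bruteforce(M,p,B,v)
% ALPHA_BRUTEFORCE(M,p,B,v) sums the joint probability over every
% possible hidden path, grouped by the final state -- for checking only
N = size(M,1);
T = length(v);
aT = zeros(N,1);
for k = 0:N^T-1
    % decode k into a hidden path h(1),...,h(T) with states 1..N
    h = mod(floor(k ./ N.^(0:T-1)), N) + 1;
    % joint probability of this path and the observations
    pr = p(h(1)) * B(h(1),v(1));
    for t = 2:T
        pr = pr * M(h(t-1),h(t)) * B(h(t),v(t));
    end
    aT(h(T)) = aT(h(T)) + pr;
end
end
Its output should match the last column of alpha_dynamic below, but the loop visits all \(N^T\) paths, so it is only usable for very small \(T\).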
But if there are \(N\) states, then there are \(N^{T-1}\) such paths, so the amount of work grows exponentially with \(T\). Fortunately, there is a better way of doing this.
Alpha
We are interested in the probability alpha, which is:
\[\alpha_T(i) = \mathbb{P}(H_T = s_i, v_1, \ldots, v_T)\]
Every path that ends in state \(s_j\) at time \(T\) has to pass through one of the \(N\) states at time \(T-1\):
\[\begin{matrix} \text{time} & & \cdots && T-1 && T \\ &&&&&p_{i,j} & \tiny \leftarrow \text{(transition probability)} \\ \text{hidden states} &&&& \bullet & \searrow &&& \\ \\ &&&& \bullet & \rightarrow & \bullet \; s_j \\ \\ &&&& \bullet & \nearrow &&& \\ \\ \text{emission} && \cdots && V_{T-1} && V_T \end{matrix}\]
So the alphas satisfy the recursion
\[\alpha_T(j) = \mathbb{P}(H_T = s_j, v_1, \ldots, v_T) = \mathbb{P}(v_T \mid H_T = s_j) \sum_{i=1}^N \alpha_{T-1}(i) \, p_{i,j} = b_{j, v_T} \sum_{i=1}^N \alpha_{T-1}(i) \, p_{i,j}\]
Now each alpha needs only \(N\) operations and there are \(NT\) values of alpha, for a total cost of order \(N^2 T\): much better than summing over \(N^{T-1}\) paths! This sort of trick goes under the name of dynamic programming.
MATLAB Implementation
function a = alpha_dynamic(M,p,B,v)
% ALPHA_DYNAMIC(M,p,B,v) calculates the matrix of alphas for the
% HMM with transition matrix M, emission matrix B, and initial
% probabilities p, given the observations v
[np, ~] = size(p);
[N, ~] = size(M);
T = length(v);
% initialise the alphas
a = zeros(N,T);
% make sure p is a column vector
if np == 1
    p = p';
end
% alpha_1 from the initial probabilities and the first emission
a(:,1) = p.*B(:,v(1));
% subsequent alphas from the recursion: alpha_t = (M' * alpha_{t-1}) .* B(:,v_t)
for t = 2:T
    a(:,t) = M' * a(:,t-1) .* B(:,v(t));
end
end
Example
\[M = \left( \begin{matrix} 0.8 & 0.2 \\ 0.3 & 0.7 \end{matrix} \right) \quad B = \left( \begin{matrix} 0.3 & 0.4 & 0.1 & 0.2 \\ 0.2 & 0.2 & 0.3 & 0.3 \end{matrix} \right) \quad p = \left( \begin{matrix} 0.4 \\ 0.6 \end{matrix} \right)\]
Observations in the sequence:
\[x_4, x_1, x_2\]
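A quick run of alpha_dynamic with these parameters (the observation indices and the expected numbers in the comments were worked out by hand, so treat them as approximate):
M = [0.8 0.2; 0.3 0.7];
B = [0.3 0.4 0.1 0.2; 0.2 0.2 0.3 0.3];
p = [0.4; 0.6];
v = [4 1 2];                 % indices of the observed x_4, x_1, x_2
a = alpha_dynamic(M,p,B,v)
% expected result, up to rounding:
% a = 0.0800    0.0354    0.0147
%     0.1800    0.0284    0.0054
% sum(a(:,end)) is then P(v_1, v_2, v_3), about 0.0201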