I love you! I listened to my professor's lecture and couldn't even understand what they were trying to say. I listened to you and things are so clear and easy to understand! I wish you were my professor! Also very entertaining!
I don't think anyone is gonna hit the dislike button on this series of videos. Prof Patterson truly explained the abstract concepts from an intuitive point of view. A million thanks, Prof Patterson!
This is the best series on HMMs. Not only does the professor explain the concepts and workings of HMMs, but most importantly he teaches the core mathematics behind them.
One of the main things that has always confused me about HMMs is the duration T. For some reason, I thought the duration T needed to be fixed, and that every sequence needed to have the same duration. Now I believe I finally understand the principles of the HMM. Thank you!
There's a small mistake in the equation for the update of b_j(k), see 22:37. In both the denominator and the numerator, gamma_t(i) should be gamma_t(j) instead. Other than that, this is a fantastic series!
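For reference, here is a sketch of the corrected update with gamma_t(j) in both places, written in the notation of the Rabiner tutorial. The symbol v_k for the k-th observation symbol is an assumption here, since the comment above does not name it:

```latex
\bar{b}_j(k) = \frac{\sum_{t=1,\; O_t = v_k}^{T} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}
```

The numerator counts the expected time spent in state j while observing v_k, and the denominator counts the expected time spent in state j overall, so the index has to be j throughout.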
Thanks for the great series. It helped me clearly understand the basics of HMMs. Hope you'll make more educational videos! Greetings from Germany!
There is a good chance that I am wrong, but I think that your description of beta is backwards. You say (e.g., at 7:40) it answers "what is the probability that the robot is here knowing what is coming next", but it should be "what is the probability of what is coming next, knowing that I am here". (In any case, thanks a lot! I am trying to learn this in detail, and I found the Rabiner paper quite hard to digest, so your videos are super helpful.)
Your lectures are great, thanks. One note: beta is expressed incorrectly in your video. It should be the following: β_t(i) is the probability of seeing the observations O_{t+1} to O_T, given that we are in state S_i at time t and given the model λ. In other words, what is the probability of getting a specific remaining sequence from a specific model if we know the current state?
@@djp3 At 7:00 you said that beta captures the probability that we would be in a given state knowing what's going to come in the future. It's the other way round: you should condition on the current state, not on the future observations.
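A minimal sketch of the point made in the comments above, assuming a discrete HMM stored as plain Python lists. The function names and the toy numbers are made up for illustration and are not from the video. It computes beta with the usual backward recursion and compares it with a brute-force sum over every future state path, conditioned on the current state:

```python
from itertools import product

def backward(A, B, obs):
    """beta[t][i] = P(obs[t+1:] | state i at time t) for a discrete HMM."""
    N, T = len(A), len(obs)
    beta = [[1.0] * N for _ in range(T)]  # beta at the last time step is 1 by convention
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N)
            )
    return beta

def brute_force_future(A, B, obs, t, i):
    """Same quantity, obtained by enumerating every possible future state path."""
    N = len(A)
    future = obs[t + 1:]
    total = 0.0
    for path in product(range(N), repeat=len(future)):
        p, prev = 1.0, i  # start from the known current state i
        for state, o in zip(path, future):
            p *= A[prev][state] * B[state][o]
            prev = state
        total += p
    return total

# Toy model (made-up numbers): 2 states, 2 observation symbols.
A = [[0.7, 0.3], [0.4, 0.6]]   # A[i][j] = P(next state j | current state i)
B = [[0.9, 0.1], [0.2, 0.8]]   # B[j][k] = P(symbol k | state j)
obs = [0, 1, 1, 0]
beta = backward(A, B, obs)
```

The two computations agree only when the state is the thing being conditioned on, which matches the reading "probability of what is coming next, knowing that I am here".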
Hi Donald, thanks for putting together this easy-to-understand HMM series. I would like to know a little more about how to apply it in other fields. How can I connect with you to discuss this?
Excellent content. If I got it right, you state that the EM algorithm is called gradient ascent or descent. As far as I know, these are not the same. The two algorithms can end up in the same local optimum, but they are not the same algorithm.
If you abstract the two algorithms enough, they are the same. But most computer scientists would recognize them as different algorithms that both find local optima.
Ahem: "ξ" ("xi") is pronounced either "ksee" or "gzee". You were pronouncing "xi" as if it were Chinese. But... still a great video on HMM and Baum-Welch. Thank you!
05:45 Is this the right interpretation of alpha? Alpha is P(O1...Ot, qt=Si), which is the probability of observing O1...Ot AND being in state Si at time point t. But you said it is the probability of being in state Si at time point t GIVEN the observations O1...Ot. That would be P(qt=Si | O1...Ot), which is different.
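To make the difference concrete, here is a small sketch, assuming a discrete HMM held in plain lists (the names `forward` and `filtered` and the toy numbers are hypothetical). The forward pass returns the joint probability, and dividing by its sum over states gives the conditional one:

```python
def forward(pi, A, B, obs):
    """alpha[t][i] = P(obs[:t+1], state i at time t), a joint probability."""
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append(
            [sum(prev[i] * A[i][j] for i in range(N)) * B[j][o] for j in range(N)]
        )
    return alpha

def filtered(alpha_t):
    """P(state i at time t | obs[:t+1]): alpha_t rescaled so it sums to 1."""
    total = sum(alpha_t)
    return [a / total for a in alpha_t]

# Toy model (made-up numbers): 2 states, 2 observation symbols.
pi = [0.5, 0.5]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]

alpha = forward(pi, A, B, [0, 1, 1, 0])   # a sequence with T = 4
alpha_short = forward(pi, A, B, [0, 1])   # the same model, T = 2
```

The joint values in each `alpha[t]` sum to P(O1...Ot), which is below 1, while `filtered(alpha[t])` sums to 1. The same function also accepts sequences of any length T, as the two calls show.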
I noticed this too. It is better to use the alternate formulation for gamma, which is \gamma_t(i) = \alpha_t(i) * \beta_t(i) / \sum_j (\alpha_t(j) * \beta_t(j)). This should give you the correct dimension.
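A short sketch of that formulation, assuming alpha and beta are already available as T x N nested lists. The function name is hypothetical and the numbers below are placeholders chosen only to show the shapes, not values from a fitted model:

```python
def gamma_from_alpha_beta(alpha, beta):
    """gamma[t][i] = alpha[t][i] * beta[t][i] / sum_j alpha[t][j] * beta[t][j]."""
    gamma = []
    for alpha_t, beta_t in zip(alpha, beta):
        joint = [a * b for a, b in zip(alpha_t, beta_t)]
        total = sum(joint)  # normalizer, summed over all states at time t
        gamma.append([x / total for x in joint])
    return gamma

# Placeholder inputs: T = 3 time steps, N = 2 states.
alpha = [[0.45, 0.10], [0.04, 0.16], [0.01, 0.07]]
beta = [[0.20, 0.15], [0.30, 0.50], [1.00, 1.00]]
gamma = gamma_from_alpha_beta(alpha, beta)
```

The result has one row per time step and one entry per state, and every row sums to 1 because the normalizer is recomputed at each t.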
There is a matrix of gammas, one for each t and each i, and a 3-D array of xis, one for each t, i, j. Each gamma_t(i) is the sum over a set of xis at that time (the sum over j). You could also notate gamma as gamma(t,i) and xi as xi(t,i,j).
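As a rough sketch of that indexing, assuming xi is stored as a nested list with shape (T-1) x N x N: the helper names and the numbers are made up, and the transition update shown is the standard Baum-Welch re-estimate rather than anything confirmed from the video.

```python
def gamma_from_xi(xi):
    """gamma[t][i] = sum over j of xi[t][i][j], for t = 0 .. T-2."""
    return [[sum(row) for row in xi_t] for xi_t in xi]

def update_transitions(xi):
    """a[i][j] = sum_t xi[t][i][j] / sum_t gamma[t][i] (standard re-estimate)."""
    gamma = gamma_from_xi(xi)
    N = len(xi[0])
    A_new = []
    for i in range(N):
        denom = sum(gamma_t[i] for gamma_t in gamma)
        A_new.append([sum(xi_t[i][j] for xi_t in xi) / denom for j in range(N)])
    return A_new

# Placeholder xi: 2 transitions (T = 3), N = 2 states; each slice sums to 1.
xi = [
    [[0.30, 0.20], [0.10, 0.40]],
    [[0.25, 0.25], [0.20, 0.30]],
]
gamma = gamma_from_xi(xi)
A_new = update_transitions(xi)
```

Because each gamma is a row sum of xi, every row of the re-estimated transition matrix sums to 1 by construction, which is a handy sanity check when debugging.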
@@alexmckinney5761 Yes, I did that. However, I am still getting an error in my code. My a matrix goes to 1 on one side and zero on the other. I am still trying to figure out the problem, but without success so far.