Correction: 2:39 I said likelihood=0.03 for mu=30, but mu=28 is in the equation. Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
This channel, along with 3Blue1Brown, makes me feel like a mathematician who is actually discovering these techniques for the first time in a logical way, rather than having them handed to me through some obscure formulas that one just takes for granted, because most academics forget about the importance of visual intuition.
You know how I feel right now after taking this lesson? I feel like I had a very bad stomach after lunch and have just gotten out of the toilet. It feels so calming and relaxing.
I learned this concept a long time ago, and several times I felt that I needed to review it, but I put it off for a long time. Now I have done it in only 10 minutes, skipping the things that I could remember.
Is there also a video this good explaining Bayesian estimation, which is another way of finding a point or interval estimate of some occurrence? (From the chapter on parameter estimation.)
In this case, however, the denominator of the sd is not n-1. Yes, we are not estimating the population sd, but wouldn't n-1 still be right for the same reason, namely the underestimation of the variance? Thanks a lot for these videos!
This will give you the "maximum likelihood estimate" of the variance, and, as we saw in this StatQuest: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-sHRBg6BhKjI.html , that estimate, the maximum likelihood estimate, although it fits the data great, it underestimates the true variance.
Thank you for the informative video. I have a question about estimating parameters in statistical analysis. Could you explain the difference between: Using Maximum Likelihood Estimation (MLE) to estimate the parameters of a normal distribution, where the variance formula includes n in the denominator, and Calculating an unbiased estimate of the population variance from a sample, where n−1 is used in the denominator? Both methods use measurements to estimate the parameters of the population from which the measurements are derived, yet they approach the calculation of variance differently. This can be confusing. Could you clarify this?
I have a video about why we use n-1 when we estimate variance from a small set of data here: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-sHRBg6BhKjI.html
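The n versus n-1 question in this thread can be checked numerically. The following is a minimal sketch, not anything from the video: it assumes normally distributed data with a made-up true standard deviation of 2 (true variance 4) and tiny samples of 5, and the function names are hypothetical. It averages both estimates over many simulated samples; the divide-by-n (maximum likelihood) average should land near 4 * (n-1)/n = 3.2, while the divide-by-(n-1) average should land near 4.

```python
import random

def variance_mle(xs):
    """Maximum likelihood estimate: sum of squared residuals divided by n."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

def variance_unbiased(xs):
    """Unbiased estimate: sum of squared residuals divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def average_estimates(true_sd=2.0, n=5, trials=20000, seed=0):
    """Average both estimators over many small simulated samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_mle = 0.0
    total_unbiased = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, true_sd) for _ in range(n)]
        total_mle += variance_mle(xs)
        total_unbiased += variance_unbiased(xs)
    return total_mle / trials, total_unbiased / trials

avg_mle, avg_unbiased = average_estimates()
print("average MLE variance (divide by n):    ", avg_mle)
print("average unbiased variance (divide n-1):", avg_unbiased)
```

Both formulas use the same measurements; the only difference is the denominator, which is why the maximum likelihood version fits the sample well but comes out too small on average.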
I'm a teaching assistant in stats at both UCL and Birkbeck (University of London), and I can say, hands down, I have rarely ever seen teaching as phenomenal as this in stats and maths. I wish I had had access to your videos when I was learning and struggling with all these topics (with teachers who didn't have anywhere near as much commanding knowledge, and tended to over-complicate things), and not only then: now I learn from you how to teach effectively (and obviously, still, a lot of maths & stats :). You don't begin to appreciate how much in-depth understanding a teacher needs to make stats & maths so seemingly simple and quick to swallow until you start teaching these things yourself. Bravo! (This is one of my few comments on anything online ever)
@@haleemamisal2432 Unless you end up in a good group with either a supervisor who is still active in getting their hands dirty with experiments, etc., or strong postdoc(s) in your group (edit: who are obliged/willing to help), PhD studentships can be like walking in a pitch-dark forest with your hands tied behind your back whilst having overdosed on hay-fever medication. It's ridiculous! I would take a strong and/or enthusiastic group/leader in a shitty uni over a shitty group/supervisor at a Russell Group/famous-department/uni any day.
I was reluctant to start this video, seeing its length. I thought it would be some other university guy explaining it mundanely. I am glad I clicked. Even more glad that I was wrong.
I am amazed how you teach these concepts and make them easy to understand. It is the first time I fully understand the standard deviation. Having this 'Eureka' moment was only possible because of your effort and your skills! You are really good! Thank you so much!
Ahhhhhhh I’m so mad! I found this channel while studying for my stats FINAL and it’s so good. My life would have been so much easier this semester if I had found this earlier! You explain things so intuitively so I always understand what is really happening instead of just watching numbers move around completely confused. You’re doing a great thing man, wish I could have been able to use your channel more haha.
You are a true hero!! Especially now with online classes and everything. Thanks soo much! This lesson was super intimidating to me, but learning from this video has made me realize it's not that hard at all. Good teaching makes miles of difference. Thank you so much!!
Classic. I have seldom seen a teacher taking his students/audience from the base of the hill to the top in such a smooth fashion! Excellent. P.S. I can only pity those who disliked this video; better get admitted into kindergarten :)
I love this comment!!! Thank you. I love the analogy - some teachers just yell down from the top of the hill, other teachers have heard what the top of the hill is like, but have never been there themselves, so they just repeat what they have heard, and other teachers, like sherpas, know the way and are willing to do the trek. Also, I think of the "likes" and "dislikes" as a lesson about the world. The "dislikes" tell me that no matter how I try, I can't help everyone. However, the "likes" tell me that if I try really hard, I can help a lot of people - and that's what motivates me.
Truth is, even after years of working extensively with data, going back to study the most basic concepts in Statistics is not only always informative, it is also a really relaxing moment to listen to you teaching. It takes my head off overcomplicated matters, cleanses my reasoning and lets me walk out of this 20-minute coffee break feeling more refreshed than any caffeine intake. Thank you!
This explained what I tried to learn in our uni course for 3 weeks in just 20 minutes. If the teacher just showed the class this video there wouldn't be any confusion anymore
Man, I really really hope from the bottom of my heart that you teach actual classes somewhere IRL too. You could have been the difference between me failing out of university because of math and stat and finishing it with flying colors. Too bad I only found you now.
That is absolutely amazing: I'm from France so English isn't my native language, though I still understand it better than my French teacher's lessons! A hundred thanks
I'm a current graduate student in a Machine Learning class, and I just wanted to take the time to say thank you from the bottom of my heart! It's online (due to the pandemic) and my professor's videos are honestly hit or miss (and his MLE video for me was definitely a miss....) The example REALLY helped me wrap my head around what we are doing. Thank you so so so so much... definitely donating (or buying some merch lol) as soon as I'm able!
OMG, it felt just like riding a roller coaster. When you brought up the formula with the derivative, my head started bumping. But when you explained slope = 0 through to the end, it was just like riding down the slope, and it was so exciting. LOL thank you Josh
When I compare the usefulness of this video to me, engineers, society in general and I see it gets 131k views as opposed to a justin bieber song...I feel sad. Thanks man, you helped a lot! Any recommendations for good reference books for these topics? Thanks again! BEST EXPLANATION IN AGES!
Thank you very much! You made me laugh a little bit. Yes, this video is useful, but maybe Justin Bieber brings people happiness... I wonder if I could combine the two! Maybe one day Justin Bieber will narrate a StatQuest? Something to think about. Unfortunately I don't have a good reference for Statistics. :(
You are making statistics fun, enjoyable and, most importantly, easy to understand.....wait for it.....(infinite)*"bam!!!" Thanks for clearing our doubts.
Question @ 4:46: for the standard dev....you need to have the data(x), the mean (µ), and the number of samples (n)... if we have all those values then why do we need to plot the likelihood of sigma? Answer @ end of video: EXACTLY!!!
I learned the likelihood 20 years ago, and taught it to college students for 2 years before I went abroad. I never, never really understood it before watching these three videos of yours. I knew the formula and knew how to calculate it. But it's you who let me see the visual meaning.
Can only second @Pejvak. I too give little feedback, but this video definitely warrants it. Thanks so much for explaining supposedly complex theories in very simple ways.
tldr: when guessing the population mean from a sample, your best bet is your sample mean; and when guessing the population variance, it's most likely to be the sum of the squared residuals divided by the sample size. (proven by maff)
Now that I have the solution for the normal and Bernoulli distribution proof, off I go to understand the Poisson, chi-square and t distributions, if they have any 😭😭😭
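The tldr above can be sanity-checked without calculus. This is a rough sketch under stated assumptions: the three measurements are made up, the grid ranges and step size are arbitrary choices, and `log_likelihood` is a helper written for this illustration. It computes the closed-form answers (sample mean, and the square root of the sum of squared residuals divided by n), then brute-force searches a grid of mu and sigma values for the pair with the highest normal log-likelihood. The grid winner should sit within one grid step of the closed-form values.

```python
import math

def log_likelihood(xs, mu, sigma):
    """Log-likelihood of a normal distribution given measurements xs."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)  # sum of squared residuals around mu
    return -n * math.log(sigma) - 0.5 * n * math.log(2 * math.pi) - ss / (2 * sigma ** 2)

# Hypothetical measurements, chosen only for illustration.
data = [30.0, 32.0, 34.0]
n = len(data)

# Closed-form maximum likelihood estimates.
mu_hat = sum(data) / n
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / n)

# Brute-force grid search: mu from 28 to 36, sigma from 0.5 to 4, step 0.05.
best_mu, best_sigma, best_ll = None, None, -math.inf
for i in range(161):
    mu = 28.0 + 0.05 * i
    for j in range(71):
        sigma = 0.5 + 0.05 * j
        ll = log_likelihood(data, mu, sigma)
        if ll > best_ll:
            best_mu, best_sigma, best_ll = mu, sigma, ll

print("closed form:", mu_hat, sigma_hat)
print("grid search:", best_mu, best_sigma)
```

The grid search is only there as an independent check; the closed-form estimates are what setting the slope of the log-likelihood to 0 gives you.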
I was struggling with a research paper n using restricted maximum likelihood estimation for analysis... n this video made me understand the whole thing clearly!!! Bam * infinity!!!
I am truly staggered by how well-made this explanation video is. A lot of textbooks don't bother to explain their derivations and proofs, as they consider them to be "self-explanatory". Thank you a lot Mr. StatQuest, I learned more through your videos than in any statistics lecture! Keep up the good work!
Hi Josh, thanks for your video. Why do we multiply the likelihoods of the individual data points when we have two or more? For example, why don't we add these values? And by the way, when do you think you will upload a video about Time Series Analysis with FbProphet?
Because, based on probability theory, when we have "something AND something else", we multiply them together. If we had "something OR something else", we would add them.
@@statquest So if there are data points in the dataset that we need to consider together, we multiply. Thank you very much for your answer. And I just wonder, do you think you will upload a video about Time Series Analysis with FbProphet?
We want to find the maximum of the likelihood function, which is the probability that (x1, x2, ..., xn) were drawn from some distribution with some parameters. But each x_i was drawn independently, so we can just multiply the individual probabilities that each x_i was drawn from the distribution. Correct me if it's wrong and sorry for my bad Engrish :)
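The "AND means multiply" point in this thread can be written out directly. Here is a small sketch, assuming independent measurements from a normal distribution; the three data values and the candidate parameters are made up, and the function names are hypothetical. The joint likelihood is the product of the individual likelihoods, and taking the log turns that product into a sum, which is why the log-likelihood gets used in the derivation.

```python
import math

def normal_pdf(x, mu, sigma):
    """Likelihood of one measurement x under a normal(mu, sigma) curve."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def likelihood(xs, mu, sigma):
    """Joint likelihood: x1 AND x2 AND ... AND xn, so multiply."""
    result = 1.0
    for x in xs:
        result *= normal_pdf(x, mu, sigma)
    return result

def log_likelihood(xs, mu, sigma):
    """The log of a product is a sum of logs."""
    return sum(math.log(normal_pdf(x, mu, sigma)) for x in xs)

# Hypothetical measurements, chosen only for illustration.
data = [30.0, 32.0, 34.0]

for mu in (28.0, 32.0):
    print("mu =", mu,
          "likelihood =", likelihood(data, mu, 2.0),
          "log-likelihood =", log_likelihood(data, mu, 2.0))
```

A curve centered on the data (mu = 32) gives a larger product than one centered away from it (mu = 28), and the ranking is the same whether you compare likelihoods or log-likelihoods.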
Holy hellllll, now I know where those formulas come from. It's funny that kids are taught to use formulas that were found by geniuses in the past but don't know how they were found
I really felt like you were telling me a bedtime story here, and rewinding was like asking you to go over the good bits. The ending, where the mean equals the mean and the standard deviation equals the standard deviation, was a pretty good ending.
Thank you so much. This is the first video where I got my answer about what the likelihood function actually does and how it's different from the ordinary least squares method under the regression approach
What a hero. It takes a lot of time to do things in such detail that even the worst could still understand everything without having to ponder at why a step is valid ! GG