Amazing videos; if you only knew the way in which your channel had changed things for me… what started as a “hey this video looks cool” has largely been responsible for my wife enrolling into a data science postgrad program in the University of Waterloo.
Amazing videos, just wanted to thank you for all you have taught me. Now I am doing a Master's Degree in Artificial Intelligence and plan to do a Ph.D., all thanks to your videos.
Why do I have to attend college classes when you explain these concepts so thoroughly and in a third of the time? Great video as always man you are a Godsend
When can I learn why we divide by n-1 for the sd? I have seen your videos on sd and know we need to subtract something, but why 1? You said we need to know about expected values first. So when can I learn? Love your videos. Thanks a lot
Thanks for that and the other videos with all those useful explanations... I'll keep binge watching. Regarding your explanation at 10:06... The second term and all the following ones appear wrong to me, because the intervals are 10s each and not 20s, 30s, 40s... thus the area of the rectangle should be likelihood x 10(s) for all terms (discrete rectangle). You are the expert, but I can't wrap my head around that.
I'm not sure I fully understand your equation. The area of each rectangle (the probability) is 10 x likelihood (obviously the values are approximated and rounded in the figure). The time, 's', should only be used to determine the likelihood at a specific point.
I'm in the library and I did not realize my laptop speaker was on, and the video started, as always, with a song. :D I'm not even embarrassed. Everybody should hear your songs, they are amazing haha
Amazing video Josh!!! I'm waiting for the video on why we divide by n-1 when we compute the sample variance. Thank you for the very informative content that you put out.
Currently I'm studying Statistics at Federal University of São Carlos (UFScar) and just wanted to thank you for all the helpful and fun content that you've been posting... Not only has it helped me to understand, but it has also made me like Statistics even more
The number is not right around 8:25; the rectangle area should be approximately 0.3894. I was very surprised that the given rectangle area of 0.4 is bigger than the given integrated result of 0.39, because the rectangle area looks slightly smaller than the area under the curve ... still I love these videos. And of course, for the purpose of this lecture, having more digits here is distracting and not helpful.
I think I've had too much to drink, because I read the notification as "Star Quest with Josh Starmer", and I was confused about your sudden shift to astronomy. I'm going to have to watch this tomorrow.
I am a bit confused: is the value on the y-axis the likelihood of meeting a person after x seconds, or the number of people we meet after x seconds? During the initial explanation, the dots at each x were the number of people we met after that time.
@@samieweee7468 At 2:37, each dot represents an individual person. In other words, StatSquatch is creating a histogram. However, histograms have problems - gaps in the data and the data are not continuous (they have to be put in bins). Thus, we use an exponential distribution to approximate the histogram. The exponential distribution doesn't have gaps and is continuous. And, from there on out, we use the distribution, which gives us likelihoods on the y-axis, rather than the histogram, which gives us the number of people on the y-axis.
Josh, I'm really thankful for all your videos, they're enlightening, lol. Could you please make some videos on distribution functions (e.g., normal distribution)?
I've got a basic video on the normal distribution here: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-rzFX5NWojp0.html and a video on maximum likelihood estimation with the normal distribution here: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-Dn6b9fCIUpM.html All of my videos can be found here: statquest.org/video-index/
1) Why can't we just take the average of all waiting times to get how long we would wait on average? 2) How often in real life can we use some sort of formula, like the exponential, to describe a distribution? I think in most cases we can use an NN to fit the distribution
Expected values are helpful for doing Statistics and getting a sense of how likely future events will be. This is why we go through the trouble to do all this math instead of just taking an average or fitting an NN to the data.
At 9:00 we multiply the y-axis values (the likelihoods) for each rectangle by 10 because it is the distance between each tick-mark on the x-axis. This gives us the probabilities of each event. However, the events, or outcomes, themselves, are the x-axis values. So, starting at 9:10, we multiply the probabilities by the outcomes (the x-axis coordinates). When we add these products up, we get a weighted average of the outcomes (weighted by their probability of occurring). Is that what you are asking about?
Did we end up with the same final equation as at 7:40? Are both just the area under the curve? Width wasn't being used before 11:00 - we were just using the specific outcome x, so how did both width and specific outcome end up in the final equation?
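A minimal sketch of the weighted average described in the reply above. It assumes an exponential distribution with rate 0.05 (a value inferred from the numbers quoted in this thread, not one stated here), ten 10-second rectangles from 0 to 100, heights read at the middle of each rectangle, and each rectangle labeled by its right-hand tick mark. The names `LAM`, `WIDTH`, `likelihood` and `approx_expected_value` are made up for illustration.

```python
import math

LAM = 0.05    # assumed rate (events per second); inferred, not stated in the thread
WIDTH = 10.0  # distance between tick marks on the x-axis


def likelihood(x, lam=LAM):
    """Exponential density: the y-axis value at x."""
    return lam * math.exp(-lam * x)


def approx_expected_value(n_bins=10, width=WIDTH):
    """Sum of (probability of each rectangle) * (outcome label of that rectangle)."""
    total = 0.0
    for k in range(n_bins):
        midpoint = (k + 0.5) * width                 # where the height is read (assumption)
        probability = likelihood(midpoint) * width   # rectangle area = height * width
        outcome = (k + 1) * width                    # label: 10, 20, 30, ...
        total += probability * outcome               # weight the outcome by its probability
    return total


approx = approx_expected_value()
print(round(approx, 2))
```

With these coarse rectangles and right-edge labels, the result lands above the exact value of 1 / LAM = 20; it is only meant to show the mechanics of "probability times outcome, summed".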
The equation at 7:40 is just the equation for the area under a curve defined by an exponential distribution, not the equation for the expected value of an exponential distribution. The equation for the expected value of an exponential distribution doesn't show up until 12:39. The big difference is that we multiply the formula for the exponential distribution by 'x', a specific outcome. Width isn't part of this equation because we take the limit earlier as the number of rectangles goes to infinity and their widths go to 0.
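For reference, the two equations contrasted in the reply above can be written out as follows. The notation is chosen here for illustration and may differ from the video's: λ is the rate of the exponential distribution, Δx is the rectangle width, and x_i is the outcome assigned to the i-th rectangle.

```latex
\begin{align*}
\text{Area under the curve:}\quad
  & \int_0^\infty \lambda e^{-\lambda x}\,dx = 1 \\[4pt]
\text{Expected value:}\quad
  E[X] &= \lim_{\Delta x \to 0} \sum_i x_i \,\lambda e^{-\lambda x_i}\,\Delta x
        = \int_0^\infty x\,\lambda e^{-\lambda x}\,dx \\
       &= \Big[-x e^{-\lambda x}\Big]_0^\infty + \int_0^\infty e^{-\lambda x}\,dx
        = 0 + \frac{1}{\lambda} = \frac{1}{\lambda}
\end{align*}
```

The second line uses integration by parts with u = x and dv = λe^{-λx} dx. The width Δx turns into the dx of the integral, which is why it no longer appears as a separate factor.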
Looks like the math is wrong for the approximation of the expected value of the distribution:
1) For each bin we compute f(x) = lambda * e^(-lambda * x), where f(x) is the value of the probability density function (PDF) at the middle point.
2) We compute the probability of each bin: p(x) = f(x) * delta, where delta is the width of the bin, in our case 10.
3) This is the step with the mistake: we calculate the contribution of each bin and sum everything up: E(x) = sum(p(x) * x), where x is the middle/average point of each bin, but in your video you took the upper bound instead of the average value for each bin.
If you do the calculation this way, you get 19.23, which is closer to the true value
I wouldn't say the math is wrong because the purpose is only to illustrate a concept, rather than how the math is actually done. In practice, we don't do a summation, we take the integral.
@@statquest Sure, in practice we take the integral. But for an approximation it makes more sense to take the average value for each bin rather than its top value: [5, 15, 25, ... 95] instead of [10, 20, 30, ... 100]. In case that's you and not some hired assistant who's answering comments here: thank you for your work, you're amazing🤗 You're the main reason I managed to remember everything I learned in university more than 10 years ago, started to master ML and deep learning, and began working as a data analyst
@@VladLanz That's me! Thanks! :) (and I still wouldn't say taking the edge is 'wrong' - different, and maybe it doesn't make as much sense for the sake of getting the best approximation, but not wrong).
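To make the "different, not wrong" point in this exchange concrete, here is a rough sketch comparing the two labeling choices as the rectangles get narrower. It assumes a rate of 0.05 (inferred from numbers quoted elsewhere in the thread) and cuts the sum off at 400 seconds, where the remaining tail is negligible; `riemann_expected_value` and its parameters are hypothetical names.

```python
import math

LAM = 0.05  # assumed rate; the exact expected value is then 1 / LAM = 20


def riemann_expected_value(width, label="mid", upper=400.0, lam=LAM):
    """Approximate E[X] with rectangles of the given width.

    The height of each rectangle is read at its midpoint. The outcome it
    is multiplied by is either the midpoint ("mid") or the upper edge ("edge").
    """
    n_bins = int(round(upper / width))
    total = 0.0
    for k in range(n_bins):
        mid = (k + 0.5) * width
        probability = lam * math.exp(-lam * mid) * width
        outcome = mid if label == "mid" else (k + 1) * width
        total += probability * outcome
    return total


for width in (10.0, 1.0, 0.1):
    mid = riemann_expected_value(width, label="mid")
    edge = riemann_expected_value(width, label="edge")
    print(f"width={width:>4}: midpoint={mid:.3f}  upper edge={edge:.3f}")
```

Under these assumptions the midpoint labels are closer to 20 at every width, but both versions approach 20 as the width shrinks, which is the limit the integral captures.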
Hi Josh. What video(s) after this should I watch to get closer to understanding why we divide by (n - 1) when finding the sample variance and sample covariance? An intuitive explanation for this seems nowhere to be found across the entirety of the internet, and the StatQuest channel has thus far been a divine gift of comprehension.
Have you seen this one: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-sHRBg6BhKjI.html Other than that, the best I can do is refer you here: online.stat.psu.edu/stat415/lesson/1/1.3 One day I'll turn that page into a StatQuest, but not for a while.
20 represents the specific, discretized, outcome. In other words, we have discrete time points, 10 seconds, 20 seconds, 30 seconds, 40 seconds, etc. that we have to wait before we meet someone in StatLand. Since we have a histogram, everyone we meet between 0 and 10 seconds is categorized as "someone we met in 10 seconds". Likewise, everyone we meet between 10 and 20 seconds is categorized as "someone we met in 20 seconds". etc.
Probability = (height * width) of the rectangle. How did you find that? Can you please explain how it indicates the probability? Also, can you explain what the value of "x" is in the "mean" of continuous random variables?
The area under the curve between two points represents the probability of something happening between those two points (see: 4:48). This is simply how probability distributions are defined. We can solve for that probability exactly using calculus (see: 4:49), or we can approximate it using rectangles (height * width).
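A small sketch of the two routes mentioned in the reply above, exact integration versus a single rectangle, for the interval from 0 to 10 seconds. The rate of 0.05 is an assumption inferred from the numbers quoted in this thread, and the function names are illustrative only.

```python
import math

LAM = 0.05  # assumed rate (events per second); not stated in the thread


def density(x, lam=LAM):
    """Exponential density (the likelihood on the y-axis)."""
    return lam * math.exp(-lam * x)


def exact_probability(a, b, lam=LAM):
    """Area under the density from a to b, using the closed-form integral."""
    return math.exp(-lam * a) - math.exp(-lam * b)


def rectangle_probability(a, b, lam=LAM):
    """One rectangle: width (b - a) times the height at the midpoint."""
    return (b - a) * density((a + b) / 2, lam)


exact = exact_probability(0, 10)
rect = rectangle_probability(0, 10)
print(f"exact={exact:.4f}  rectangle={rect:.4f}")
```

With these assumptions both values come out close to 0.39, with the rectangle slightly below the exact area, which matches the 0.3894 and 0.39 figures quoted earlier in the comments.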
5 years later: if you watched this video hoping to learn exactly why we divide by n-1, you are one step closer to understanding this mystery, but not quite there yet.
THANK YOU for your great videos. I just have a question: I think (I am not sure) that the data in this example follows a Poisson distribution, not an exponential one! Am I right?
Poisson is discrete and models something very differently from what we are modeling here with an exponential distribution. For details, see: en.wikipedia.org/wiki/Poisson_distribution
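One way to see the difference is a small simulation, sketched here under assumptions that are not from the video: a rate of 0.05 meetings per second and a hypothetical 60-second counting window. The gaps between meetings are continuous waiting times drawn from an exponential distribution; the number of meetings per window is a whole-number count, which is the kind of quantity a Poisson distribution describes.

```python
import random

LAM = 0.05         # assumed rate: meetings per second
WINDOW = 60.0      # hypothetical window length in seconds
N_WINDOWS = 20_000

rng = random.Random(0)  # fixed seed so the run is repeatable
counts = [0] * N_WINDOWS
end = WINDOW * N_WINDOWS

t = rng.expovariate(LAM)           # continuous waiting time until the first meeting
while t < end:
    counts[int(t // WINDOW)] += 1  # discrete count of meetings in that window
    t += rng.expovariate(LAM)      # exponential gap until the next meeting

mean_count = sum(counts) / N_WINDOWS
var_count = sum((c - mean_count) ** 2 for c in counts) / (N_WINDOWS - 1)
print(f"mean count per window: {mean_count:.2f}")
print(f"variance of counts:    {var_count:.2f}")
```

For a Poisson count the mean and variance are both equal to rate times window length (here 0.05 * 60 = 3), so the two printed values should be close to each other, while the waiting times themselves stay continuous.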
Thanks for all of your videos, they are great. I'm watching the full playlist "statistics fundamentals" and have to admit that, although I understood all of the previous videos thanks to your great explanations, I was lost on this one when you used integrals :/ I think it's the video that requires the most calculus knowledge.
Dear Sir, could you do a video on Linear Mixed Models and GEE? Maybe you could create a separate donation target with a threshold which you need to reach to make one? It would be extremely useful. It is a long-awaited and hard topic with a lot of contradictory info.
Dear Professor, I would like to ask if you have good lecture notes for the book Introduction to Mathematical Statistics by Robert V. Hogg. I am ready to pay for them; I like your methodology in presenting.
@@statquest Ok, I would like to take this chance again to ask you if possible to add topics about Bayesian statistics. Many thanks and I am still following you :)
When you approximate the expected value, it is confusing that you add areas whose base is not the constant 10 interval. In your example, the areas of the rectangles should be the variable height given by the formula times 10, and not times 10, 20, … Otherwise the rectangles are meaningless.
I'm sorry that was confusing to you. However, each rectangle has the same width (as you can see in the illustration). However, each rectangle represents a different specific outcome. So 10, and 20 are not different widths, but different outcomes. 10 is the outcome represented by the first rectangle (with width = 10) and 20 is the outcome represented by the second rectangle (also with width = 10).
@@statquest Then it should be = (10*0.4) + (20*(0.4+0.2)) + (30*(0.4+0.2+0.1)) and so on, because for x=10 the probability will be the area under the curve up to 10, which is 0.4, and for x=20 the probability will be the area under the curve up to 20, which is 0.63 (roughly equal to 0.6)... Please correct me if I am wrong
@@priyankjain9970 No, it's (10*0.4) + (10 * 0.2) + (10 * 0.1) + (10 * 0.09) + ... etc. To approximate the area under the curve, we add up the area of each rectangle. The area of each rectangle is the width (10) times the height (0.4 for the first, 0.2 for the second, 0.1 for the third, etc.).
@@statquest Thanks for the reply. Actually the height is 0.04 for the first, 0.02 for the second, 0.01 for the third and so on (as explained by you @8.29 in the video). Therefore the area of the rectangle will be 0.4 for the first, 0.2 for the second, 0.1 for the third and so on. My concern is the following. As you stated, E(X) = Σ x * p(X=x). This means E(X) = 10*(probability till 10) + 20*(probability till 20) + 30*(probability till 30) and so on.
Now, probability till 10 means area under the curve till 10, which is = 0.4
probability till 20 means area under the curve till 20, which is = 0.6 (approx)
probability till 30 means area under the curve till 30, which is = 0.7 (approx)
Therefore E(X) should be (as per my understanding) = 10*0.4 + 20*0.6 + 30*0.7 + ....
Please help me to understand this
@@priyankjain9970 Sorry about the typos with the area vs height. That said, the probability of observing an event between 10 and 20 seconds is not the cumulative probability of observing an event between 0 and 10 or between 10 and 20. Your equations use the cumulative probabilities, which is not correct in this situation. To clarify, the expected value is "the probability of observing an event between 0 and 10 seconds times the outcome, 10 (this is just the label for any event that occurs between 0 and 10 seconds) + the probability of observing an event between 10 and 20 times the outcome, 20 (again this is just the label for any event that occurs between 10 and 20) + the probability of observing an event between 20 and 30 seconds times the outcome, 30 + ...."
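A short sketch of why the per-interval probabilities, and not the cumulative ones, are the right weights. It assumes a rate of 0.05 (inferred from the numbers in this thread), uses exact interval probabilities from the cumulative distribution function instead of rectangle areas, and stops at 400 seconds, where the remaining tail is negligible. All names are illustrative.

```python
import math

LAM = 0.05    # assumed rate (events per second)
WIDTH = 10.0  # interval width in seconds


def cdf(x, lam=LAM):
    """Probability of waiting at most x seconds (area under the curve up to x)."""
    return 1.0 - math.exp(-lam * x)


edges = [k * WIDTH for k in range(41)]  # 0, 10, 20, ..., 400
labels = edges[1:]                      # outcome label for each interval: 10, 20, ...

# Probability of landing inside each interval (what the expected value needs)
bin_probs = [cdf(b) - cdf(a) for a, b in zip(edges, edges[1:])]
# Probability of landing anywhere up to each label (the cumulative version)
cum_probs = [cdf(b) for b in labels]

print("sum of per-interval probabilities:", round(sum(bin_probs), 4))
print("sum of cumulative probabilities:  ", round(sum(cum_probs), 4))

weighted_avg = sum(p * x for p, x in zip(bin_probs, labels))
print("weighted average of the labels:   ", round(weighted_avg, 2))
```

The per-interval probabilities add up to essentially 1, so multiplying them by the labels gives a genuine weighted average. The cumulative probabilities count the early intervals over and over, add up to far more than 1, and so cannot serve as weights.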