Variance: Why n-1? Intuitive explanation of concept and proof (Bessel's correction)

statsandscience

15K views

Published: 6 Sep 2024

Comments: 71

@cannot-handle-handles 2 years ago

The explanation given at 24:55 why we divide by 2 ("each value is in here twice") does not seem intuitive: If we only added (x_i-x_j)^2 for i

@statsandscience 2 years ago

Not sure if I understand you correctly, but your suggestion is basically to only look at one of the triangles of the matrix without the diagonal. Is that correct? In that case you would have to divide by n(n-1) only and not by 2 to get the same value.

@cannot-handle-handles 2 years ago

@statsandscience I'll try and elaborate more: The sum of all the squares is 812. Divided by n^2 (because that's the number of squares we're considering), that's 16.571… Divided by 2, that's 8.28571… And finally, with Bessel's correction, it's 9.666… The sum of half of the squares is 406. Divided by n(n-1)/2 (because that's the number of squares we're considering), that's 19.333… So we still have to divide by 2 to get 9.666… I'm not saying the formula is wrong, just the explanation ("each value is in here twice"). In both cases, you have to divide by the number of squares AND by 2. So the division by 2 is not explained by counting the squares twice.
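The arithmetic in this comment can be reproduced on any data set. The sketch below uses hypothetical values (the data from the video are not given in this thread, so the totals differ from 812 and 406) and compares the full matrix of squared differences with one triangle of it. All variable names are illustrative.

```python
from itertools import combinations

# Hypothetical data: NOT the values from the video, only an illustration.
data = [2, 4, 4, 5, 7, 9, 11]
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the sample mean.
ss = sum((x - mean) ** 2 for x in data)
var_n = ss / n          # divisor n
var_n1 = ss / (n - 1)   # divisor n - 1 (Bessel's correction)

# Full matrix: all n^2 ordered pairs, including the zero diagonal.
full_sum = sum((a - b) ** 2 for a in data for b in data)

# One triangle: only the n(n-1)/2 pairs with i < j, no diagonal.
tri_sum = sum((a - b) ** 2 for a, b in combinations(data, 2))

# Average squared difference, then halve it.
via_full = full_sum / n**2 / 2                   # matches the divisor-n variance
via_triangle = tri_sum / (n * (n - 1) / 2) / 2   # matches the divisor-(n-1) variance

print(var_n, via_full)
print(var_n1, via_triangle)
```

In both routes the sum is first turned into an average squared difference by dividing by the number of cells considered, and that average is then halved, which matches the commenter's point that the 2 is not a count of duplicated cells.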

@cannot-handle-handles 2 years ago

@statsandscience But the number of squares in one of the triangles is 1+2+3+4+5+6+7=21, not 42.

@statsandscience 2 years ago

@cannot-handle-handles Yes, I deleted my comment because I noticed my mistake prior to your answer. What you say makes sense; I had not done this test before. Do you have a good intuitive explanation for the 2?

@statsandscience 2 years ago

Is it because you basically calculate the means of all pairs of points?
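One standard way to see where the 2 comes from, offered here as a sketch and not as the explanation given in the video: assume X and Y are two independent draws with the same mean \mu and variance \sigma^2 (these symbols are introduced here for illustration). The squared difference then carries the variability of both draws:

```latex
\[
\begin{aligned}
\mathbb{E}\big[(X - Y)^2\big]
  &= \mathbb{E}\big[\big((X - \mu) - (Y - \mu)\big)^2\big] \\
  &= \mathbb{E}\big[(X - \mu)^2\big] + \mathbb{E}\big[(Y - \mu)^2\big]
     - 2\,\mathbb{E}\big[(X - \mu)(Y - \mu)\big] \\
  &= \sigma^2 + \sigma^2 - 0 \;=\; 2\sigma^2 .
\end{aligned}
\]
```

The cross term vanishes because of independence. Under that assumption, an average squared difference between two distinct observations estimates 2\sigma^2, so it has to be halved regardless of how many cells of the matrix were averaged.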

@J-sh4gf 7 months ago

“The expected difference between the "correct" formula of the variance and the "wrong" one with n and the sample mean, is equal to the variance of the sample mean!” This one sentence untied a knot! Thank you very much. I have watched various explanations of degrees of freedom etc., and this video is by far the best I've seen.
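The quoted sentence can be written out as a short identity. This is a sketch under the usual assumption of n independent observations x_i with common mean \mu and variance \sigma^2, with \bar{x} denoting the sample mean; the notation is chosen here and is not taken from the video.

```latex
\[
\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2
  \;=\; \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 \;+\; (\bar{x} - \mu)^2
\]
\[
\underbrace{\mathbb{E}\Big[\tfrac{1}{n}\sum_i (x_i - \mu)^2\Big]}_{\sigma^2}
  \;-\;
\mathbb{E}\Big[\tfrac{1}{n}\sum_i (x_i - \bar{x})^2\Big]
  \;=\; \mathbb{E}\big[(\bar{x} - \mu)^2\big]
  \;=\; \operatorname{Var}(\bar{x}) \;=\; \frac{\sigma^2}{n}
\]
```

The first line holds for any sample because the cross term 2(\bar{x} - \mu)\sum_i (x_i - \bar{x}) is zero. Rearranging the second line gives an expected value of \frac{n-1}{n}\sigma^2 for the version with n, which is exactly what multiplying by n/(n-1) undoes.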

@MathAndComputers 3 years ago

Thanks for the explanations! I'd been meaning to learn about this for ages, but just hadn't gotten around to it, haha. 😅 Something that might be helpful is that if you put times and labels in a list in the description, YouTube will now automatically split up the play bar into chapters, as long as the first one is 0:00, so something like:
0:00 - Intro
1:33 - Terminology
4:28 - Estimating the mean or variance
9:20 - Why is the version with n biased?
17:03 - Why does n-1 save it? (explanation 1)
21:56 - Why does n-1 save it? (explanation 2)
28:42 - Summary

@statsandscience 3 years ago

That was super helpful, thanks! And extra thanks for providing all the correct time stamps!

@phil2888 8 months ago

This is great, I have grappled with this for quite a while.

@user-kh5ju1du4d 10 months ago

Thank you SO much for this video. It has been so hard to find a proper explanation of this.

@gokulkrishna2667 2 years ago

The greatest video on this aspect on the internet!

@statsandscience 2 years ago

Thanks, glad you liked it!

@mariuskornovan5520 3 months ago

Great video! Helped me finally understand the derivation of the sample standard deviation

@pramodabandaru3566 5 months ago

I did not get how (sample mean - population mean) squared, i.e. the variance of the sample mean, is equal to the variance of the population divided by n (the sample size). Could anyone please explain? 20:00

@danielheckel2755 2 years ago

Very enjoyable explanation. Thank you! Greetings from Mexico.

@ckq 1 year ago

Before I watch, basically the average cuts the variance by a factor of n, but when we find the difference between a sample value and the sample average, the average contains 1/nth of that sample so the calculated variance is shrunk by a factor of (1-1/n).

@statsandscience 1 year ago

I am not sure if I understand correctly, but I think there might be something to framing the mean as containing 1/n of the information within the sample... Would you say you were right after watching?
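The shrink factor of (1-1/n) mentioned in this exchange can be checked exactly on a small case. The sketch below assumes a hypothetical four-value population sampled with replacement (so the draws are independent) and enumerates every possible sample of size 3; the values and helper names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population, sampled WITH replacement (independent draws).
population = [1, 3, 4, 8]
n = 3  # sample size


def var_div_n(values):
    """Variance with divisor len(values), computed exactly with fractions."""
    m = Fraction(sum(values), len(values))
    return sum((Fraction(v) - m) ** 2 for v in values) / len(values)


sigma2 = var_div_n(population)  # true variance of the population

# Every ordered sample of size n is equally likely.
samples = list(product(population, repeat=n))
expected_biased = sum(var_div_n(s) for s in samples) / len(samples)

# Ratio of the expected divisor-n sample variance to the true variance.
shrink_factor = expected_biased / sigma2
print(shrink_factor, 1 - Fraction(1, n))
```

Because the enumeration is exhaustive and uses exact fractions, the ratio is not a simulation estimate; multiplying the divisor-n variance by n/(n-1) removes exactly this factor on average.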

@DanTee2_718 1 year ago

This was honestly such a great watch, thank you for the video

@statsandscience 1 year ago

Thank you!

@lurkertech 10 months ago

Thanks for the best video I've ever seen on the n-1. Referring to the key questions at 8:37 I was hoping to find the answer to a more specific question #2: not just "why isn't it n-2 or n-pi" but "why does the correction factor n/(n-1) not depend on the ratio between the sample size and the population size?" That is, if I know the population is 1000, and I choose a sample of 10 vs. a sample of 999, why wouldn't I use different correction factors to get the best answer? After all, my sample of 999 is going to be darn close to the true population variance whereas my sample of 10 is going to be way off. Your video kind of implies, but doesn't say directly (wish it did) that the n-1 "solution" provides the "average" correction factor you might need for any possible sample size relative to the population size, or to say the same thing in another way, the n-1 is the best you can do if you don't actually know the population size. Is that correct? If we DO know the population size exactly, then can we choose a better correction factor that is tailored to that particular sample size : population size ratio?

@lurkertech 10 months ago

To make an even clearer statement of the problem...suppose my sample size is always 499. Now suppose that the actual population is either 500 or 1000. So that's 2 cases in total. According to the n-1 rule, I should apply the same correction (499/(499-1)) to 499 samples in a 500 population as I should apply for 499 samples in a 1000 population. That doesn't seem to be the best we can do if we know the actual population size, since I should not need to correct as hard when sample size is very close to population size. So is the n-1 rule designed only for the case where one does not know the population size? If we do know the population size, can we do better? Using what formula?

@statsandscience 9 months ago

Sorry that I did not come around earlier to answer this question. You put a lot of effort into this and I hope you still benefit from an answer! When you take intro stats classes, a quite basic assumption that lurks basically everywhere is that the population you are dealing with is infinite. Of course, this assumption is also basically always wrong. Usually that does not matter though, as populations are usually "big enough", so that wrong estimates of the actual population size do not influence our outcomes to a degree we would care about. The same is true here in this formula: It is not the "average" correction factor for all possible samples, but the one for an infinite population - again, it usually does not matter what the actual size is, except when the sample size comes close to the population size. Now you correctly identified that this can cause problems because in this case you actually know a lot more than what the formula is giving you credit for. What people came up with for this case is the Finite Population Correction (FPC) - I would advise you to just google it and look for yourself as the space here is quite limited (of course you can also ask follow-up questions about that here if you like!). However, in a nutshell this correction does what you pointed out - it prevents you from correcting "too hard".
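As a rough illustration of the finite-population case, the sketch below enumerates every sample drawn without replacement from a small hypothetical population. It checks two textbook results for simple random sampling without replacement: the n-1 sample variance averages to N/(N-1) times the divisor-N population variance, and the variance of the sample mean is the infinite-population value multiplied by the factor (N-n)/(N-1). The population values and all names are assumptions made for this example, not material from the video.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite population; the values are made up for illustration.
population = [1, 3, 4, 8, 9]
N = len(population)
n = 4  # sample size, close to N on purpose


def variance(values, ddof):
    """Exact variance with divisor len(values) - ddof."""
    m = Fraction(sum(values), len(values))
    return sum((v - m) ** 2 for v in values) / (len(values) - ddof)


sigma2 = variance(population, ddof=0)  # population variance, divisor N

# All equally likely samples of size n drawn WITHOUT replacement.
samples = list(combinations(population, n))

# Average of the usual n-1 sample variance over all samples.
mean_s2 = sum(variance(s, ddof=1) for s in samples) / len(samples)

# Variance of the sample mean over all samples.
sample_means = [Fraction(sum(s), n) for s in samples]
var_of_mean = variance(sample_means, ddof=0)

fpc = Fraction(N - n, N - 1)  # finite population correction factor
print(mean_s2, sigma2 * Fraction(N, N - 1))
print(var_of_mean, sigma2 / n * fpc)
```

In this setting the n-1 estimator overshoots the divisor-N population variance by the factor N/(N-1), which is negligible for large N and noticeable only when the population is small.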

@lurkertech 9 months ago

@statsandscience Thank you, it is a very useful answer. I didn't know about that assumption, and so when your examples had a population size of 7, I was extra confused. Thanks for clearing it up. That makes it clearer why the correction should be greater when the sample size is smaller. Maybe mention that assumption in your video description to help others in the same boat as me?

@Sid-ge9vb 1 year ago

This is an amazing explanation, thank you so much! I was so frustrated by the hand-wavy explanations on YouTube, even in lectures!

@statsandscience 1 year ago

Thank you, I really appreciate it!

@jkally123 2 years ago

How did he get to the statement made at 20:00 - that var(sample mean) is equal to population variance divided by n?

@statsandscience 2 years ago

I brushed over this a bit because it was not the focus here. Intuitively, I think it makes sense that the variance of the sample mean must be smaller than the population variance and that this depends on n: as I explained, there is no way to get the most extreme observed values as means, and the mean will always become "less extreme" in comparison the higher n is. However, I don't know an intuitive explanation for the exact formula, but the reasoning goes like this: You try to calculate the variance of the sample mean, that is, the sum of the observations divided by n, like so: Var((obs1 + obs2 + obs3 + ...)/n). You can rewrite this as Var((1/n)*obs1 + (1/n)*obs2 + (1/n)*obs3 + ...). For independent observations, a linear combination like this has a variance equal to the sum of each squared factor (in this case 1/n^2) times the variance of the corresponding component: (1/n^2)*Var(obs1) + (1/n^2)*Var(obs2) + ... When you then assume identical variances for the observations, this equals (1/n^2)*n*Var(obs), which is Var(obs)/n. You can also find that formatted a bit more nicely here: online.stat.psu.edu/stat414/lesson/24/24.4 Hope this helps, thank you for the comment!
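The result Var(sample mean) = Var(obs)/n can also be checked by brute force. The following is a minimal sketch, assuming a hypothetical four-value population and independent draws (sampling with replacement); it lists every possible sample of size 3 and computes the variance of the resulting means exactly.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population; draws are independent (sampling WITH replacement).
population = [2, 3, 7, 10]
n = 3  # sample size


def variance(values):
    """Exact variance with divisor len(values)."""
    m = Fraction(sum(values), len(values))
    return sum((v - m) ** 2 for v in values) / len(values)


sigma2 = variance(population)  # Var(obs)

# Mean of every equally likely ordered sample of size n.
sample_means = [Fraction(sum(s), n) for s in product(population, repeat=n)]

# Variance of the sample mean across all possible samples.
var_of_mean = variance(sample_means)
print(var_of_mean, sigma2 / n)
```

Changing n in this sketch shows the variance of the mean falling in proportion to 1/n, which is the step used at 20:00.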

@Titurel 8 months ago

I was wondering too!

@Number_Cruncher 3 years ago

Nice, now it is clear to me.

@statsandscience 3 years ago

Great, thank you for the comment!

@faresmhaya 6 months ago

The explanation for why we divide by 2n² in the second formula is not intuitive to me, despite it working on a small example I tested. I feel there is redundancy in dividing by both 2 and n². If we have two instances of each distance measurement, okay, we can divide by two, reducing the number of distances we're taking into consideration. But why would we then need to also divide by a second n if we reduced the number of distances we're taking into consideration from n² when we divided by 2?

@iwatchtvwithportal5367 10 months ago

I always thought the n-1 was related to the degrees of freedom spent, but actually it isn't!

@statsandscience 9 months ago

Well, it is, but you can sort of get around that in an explanation like this one. If you are interested, feel free to watch my video on degrees of freedom. :)

@osaabd390 1 year ago

Thank you so much for this great video. I appreciate it. I have to give you feedback though on the quality of the sound. I found it sometimes difficult to hear well what you say. Two things I would suggest you do, as I think your understanding of these concepts and ability to communicate them visually need not go to waste. The two solutions I suggest are 1. a better microphone (Shure and Rode are the best and not that expensive) and 2. read slower pleeaassee. I had to stop multiple times and go back to understand fully what you say. If you think you need to keep your videos below a certain time threshold, then cut off unnecessary words from your script, using shorter words, trim wordy phrases (e.g. use 'most' instead of 'the majority of'). Thanks again for the great effort, keep it going.

@statsandscience 1 year ago

Hey, thank you so much for taking the time to give detailed feedback. 1) I am actually using such a microphone, but maybe it wasn't well positioned? I will check that. 2) Thanks, I will try! It is not that I want to shorten videos, I am just used to talking fast, I guess...

@osaabd390 1 year ago

@statsandscience Good luck with your work and thank you from the bottom of my heart, I really do understand why we divide by n-1 now :D

@Hossein_am98 1 year ago

Thanks for the video, really good way to explain! Frankly, to me it seemed you were reading from a written text, because your speaking was too constant (no stress on the words, no ups and downs, no nothing), and that made it really difficult for me to understand what you're saying.

@statsandscience 1 year ago

Thanks! I will try to improve speaking next time!

@milanradovanovic3693 1 year ago

This puzzled me for a long time... Thanks for the explanation... P.S. I always thought it was a spelling mistake in the book(s).

@andrew.schaeffer4032 1 year ago

What kind of statistics exactly do I need to learn in order to follow along? This looks really interesting, but I don't fully understand how it all works. Thanks!

@statsandscience 1 year ago

You will probably find the general concept in any applied statistics textbook. As I said it is a basic step from descriptive statistics where you only draw conclusions about a particular sample to inferential statistics where you use a sample to draw conclusions about a bigger population and that is basically what is always needed and taught in applied statistics. The issue is that those books tend to be shallow in that regard and other books with more detail might only be helpful with a serious understanding of the math behind it. Which is why I made the video to bridge between these two. Let me know if that was what you had in mind!

@ajaydalvi1378 2 years ago

Finally Understood !...

@shpensive 2 years ago

Fantastic, I've been wondering this for a long time..

@chris_7711 7 months ago

Many thanks! Very insightful!

@user-ws5sq8fm4k 1 year ago

Thank you for this great video. I hope you continue uploading more videos. Do you have a written text for this video? As a non-native English speaker, I have some difficulty following your speech. I need to replay many parts of the video to catch the words.

@statsandscience 1 year ago

Thank you! Yes, I do have that and I always wanted to make proper subtitles but just did not get to it yet. YouTube auto-generates subtitles, as you probably know, but I don't really like them. I will try to look into that soon and let you know.

@user-ws5sq8fm4k 1 year ago

@statsandscience Thank you for your reply. I will wait for this precious script.

@statsandscience 1 year ago

@user-ws5sq8fm4k English subtitles are up now! I hope you will find them helpful.

@user-ws5sq8fm4k 1 year ago

Does the explanation using pairwise differences apply in sampling without replacement where diagonal zeros don't occur?

@statsandscience 1 year ago

They would still occur, wouldn't they? Because the margins of the table are identical either way, so there would be zeros on the diagonal. Sampling without replacement is also a separate issue, as for instance discussed here: stats.stackexchange.com/questions/70124/unbiased-estimator-of-variance-for-samples-without-replacement

@user-ws5sq8fm4k 1 year ago

@statsandscience Thank you for your reply. Zeros occur when we subtract each data point from itself, and this doesn't happen when sampling without replacement.

@se0271 2 years ago

So instead of the sample lying somewhere much lower than the true population mean, what if it's lying much higher? Would it be correct to use n+1 instead of n-1 in order to deliberately make the sample variance smaller?

@statsandscience 2 years ago

The main problem is that you don't know that. Remember that we do all this with samples because we do not have access to the population - and this is a problem that happens because of sampling, but not when you can use the population values. Imagine a student who goes to the school cafeteria every day, and who knows that the staff tends to hand out portions that are too small most days. So they ask for something extra every day (and receive it). This will move the portion size to the optimum most days, but on days where the portion size was correct in the first place or even greater, the request will make it worse. However, this is still better because on most days the size is too small, so the average will be closer to the optimum. Does that help?

@se0271 2 years ago

@statsandscience Thank you for your response. It definitely helps but I still have the question of how you would know that the data values from a sample are too small. You cannot infer that it's too large, but why can you infer that it's too small? Shouldn't it go both ways? Maybe naturally, samples tend to gravitate around smaller data values as with the portion size example you gave? If that's the case then it does actually make sense since you'd typically not want to exceed the normal portion size so you don't run out (and this idea of scarcity can be applied to any other examples).

@statsandscience 2 years ago

@se0271 You indeed don't know that for a particular value. It can be too big or too small. It is just more likely that it is smaller. I'm afraid that if I went into more detail I would just repeat what I said in the video, but if you have specific questions I would be happy to help!

@se0271 2 years ago

@statsandscience I see, I appreciate the explanation, thank you!