Another approach I'm thinking about is using an independent z-test, where the null hypothesis would be that there is no difference between the proportions of heads and tails. If the null hypothesis is rejected, we can say that the coin is not fair.
We could use the binomial distribution; it would be more accurate. We would compute the probability of getting 30 or fewer heads plus the probability of getting 70 or more heads, and then compare that to the 0.05 threshold.
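A sketch of that exact calculation (assuming a fair coin under the null, n = 100 flips, and the two tails taken as 30-or-fewer and 70-or-more heads), summing the binomial tails directly with `math.comb`:

```python
from math import comb

n = 100            # number of flips
total = 2 ** n     # equally likely outcome sequences under a fair coin

# P(X <= 30): probability of 30 or fewer heads under the null p = 0.5
lower_tail = sum(comb(n, k) for k in range(0, 31)) / total

# P(X >= 70): probability of 70 or more heads under the null p = 0.5
upper_tail = sum(comb(n, k) for k in range(70, n + 1)) / total

p_value = lower_tail + upper_tail
print(p_value)  # roughly 7.9e-05, far below the 0.05 threshold
```

Because p = 0.5 makes the distribution symmetric, the two tails are equal and the result is just twice one tail.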
Remember, we are learning this for interviewing. A z-test computes p-values from the cumulative distribution function of the normal distribution. Since the binomial distribution converges to the normal by the CLT, it's much easier to use. The alternative is to use the cumulative (mass) distribution function of the binomial distribution itself, which changes with n, making it much harder to do in practice.
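A minimal sketch of that z-test for the 70-heads-in-100-flips setup (assuming the null p = 0.5, with the normal CDF built from `math.erf`):

```python
from math import sqrt, erf

n, heads = 100, 70
p0 = 0.5                      # null hypothesis: the coin is fair
p_hat = heads / n             # observed proportion of heads

se = sqrt(p0 * (1 - p0) / n)  # standard error under the null: 0.05
z = (p_hat - p0) / se         # z statistic: about 4.0

def normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1 + erf(x / sqrt(2)))

p_value = 2 * (1 - normal_cdf(z))  # two-sided p-value, about 6.3e-05
print(z, p_value)
```

Note the standard error uses the null p = 0.5, not the observed 0.7, because the test is computed under the null hypothesis.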
The variance you've computed here is the Bernoulli variance, and I would like to know how it follows the normal distribution as n increases. Isn't it the binomial distribution that does so?
Hi! Though each outcome is binary, you can represent the result as a proportion. When your sample size is large enough, by the CLT the sampling distribution of the proportion becomes approximately normal. The z-test assumes the sampling distribution is normal for the test to work, and that condition is satisfied here, as mentioned above. That's why the z-test for a proportion works in this case.
What if we calculate the confidence interval for the population proportion using the standard error of the estimate, sqrt(0.7*0.3/100), and check whether 0.5 lies in the resulting interval?
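A quick sketch of that check (assuming the observed proportion 0.7, the standard error sqrt(0.7*0.3/100), and 1.96 as the 95% two-sided normal quantile):

```python
from math import sqrt

p_hat, n = 0.7, 100
se = sqrt(p_hat * (1 - p_hat) / n)  # standard error of the estimate, about 0.0458

z_crit = 1.96                       # 95% two-sided normal critical value
lower = p_hat - z_crit * se         # about 0.610
upper = p_hat + z_crit * se         # about 0.790

# 0.5 falls outside the interval, so fairness would be rejected at the 5% level
print(lower, upper, lower <= 0.5 <= upper)
```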
It's based on the CLT (central limit theorem). I'm not sure how you intend to use MLE for it; maybe I don't understand enough, but the MLE for this kind of task is just the sample mean. If you can generate more data, the law of large numbers would let you estimate mu directly!
@@heyman620 The LLN just says that the sample mean of a stochastic process converges asymptotically if the process is stationary. The CLT will already start converging by 100 data points. That being said, MLE is a statistical method for estimating the PDF of the data; it doesn't make statements like "bias". You still need to do hypothesis testing, whether that be chi-square or otherwise.
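For the chi-square route mentioned here, a goodness-of-fit sketch for 70 heads / 30 tails against the expected 50/50 split (one degree of freedom, for which the 5% critical value is about 3.84):

```python
observed = [70, 30]  # heads, tails
expected = [50, 50]  # under the null of a fair coin

# Pearson chi-square statistic: sum of (O - E)^2 / E over the categories
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi2)  # 16.0, well above the df=1 critical value of about 3.84
```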
@@kylerasmussen4921 Remember that in the end, all you test is whether the means of two groups differ, and you can do that a gazillion ways. When you have a small amount of data you can make some assumptions about the distribution, e.g. assume it's normal. That being said, you don't have to; you could actually use Chebyshev's inequality to mimic the test, I believe. But assuming normality makes a lot of sense in this setup. What I think is that all you need is the means, the variance, and a way to know how likely an estimation error is. I guess I just stated it implicitly, but you are right. Given infinite data and finite variance you don't need a test, though, i.e., LLN :). Very nice comment, thanks!
It's a nice solution, but I think that once you've figured out you can use the normal distribution here, talking about "computing the z value" is a little third-grade. What's more important is knowing the assumptions, i.e., independence, and understanding that this test is, in fact, based on the CLT (this mean is sampled from the distribution of the means!). Honestly, sorry, I'm not sure I would give you a perfect score in the interview, since you just used a statistical test blindly (a pass for sure, though).
So much better to just simulate it... You can assume normality because of the convergence in distribution to the normal in this setup, which stems from the CLT. However, that convergence happens as n -> infinity, and here n is fixed at 100. Instead, you can actually simulate with P=0.5 and get a better estimate. Just find how many outcomes in your simulation have at least 70 heads; let K be that count and N the number of simulations. Your p-value is K/N (the null hypothesis is that P=0.5, and you count instances at least as extreme as observed in the description). That's a form of bootstrapping.
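A sketch of that simulation (the seed is fixed only for reproducibility; the estimated one-sided tail probability should land near the exact binomial value of about 3.9e-05):

```python
import random

random.seed(42)     # fixed seed so the run is reproducible

n_flips = 100
n_sims = 100_000    # N: number of simulated experiments under the null P = 0.5

# K: count simulations that are at least as extreme as the observed 70 heads
k = 0
for _ in range(n_sims):
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    if heads >= 70:
        k += 1

p_value = k / n_sims  # Monte Carlo estimate of P(heads >= 70 | fair coin)
print(p_value)        # should be tiny, nowhere near the 0.05 threshold
```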