
False Discovery Rates, FDR, clearly explained 

StatQuest with Josh Starmer
1.2M subscribers
209K views

One of the best ways to prevent p-hacking is to adjust p-values for multiple testing. This StatQuest explains how the Benjamini-Hochberg method corrects for multiple testing by controlling the false discovery rate (FDR).
For a complete index of all the StatQuest videos, check out:
statquest.org/video-index/
If you'd like to support StatQuest, please consider...
Buying The StatQuest Illustrated Guide to Machine Learning!!!
PDF - statquest.gumroad.com/l/wvtmc
Paperback - www.amazon.com/dp/B09ZCKR4H6
Kindle eBook - www.amazon.com/dp/B09ZG79HXC
Patreon: / statquest
...or...
RU-vid Membership: / @statquest
...a cool StatQuest t-shirt or sweatshirt:
shop.spreadshirt.com/statques...
...buying one or two of my songs (or go large and get a whole album!)
joshuastarmer.bandcamp.com/
...or just donating to StatQuest!
www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
/ joshuastarmer
#statistics #pvalue #fdr
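For readers who want to poke at the method the video describes, here is a minimal sketch of the Benjamini-Hochberg p-value adjustment in Python (my own illustration, not code from the video; the name `bh_adjust` is made up):

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values.

    Each p-value of rank i (1-based, ascending) is scaled by m/i;
    a running minimum taken from the largest rank downward keeps the
    adjusted values monotonic, and everything is capped at 1.
    """
    m = len(pvals)
    # sort ascending, remembering each p-value's original position
    order = sorted(range(m), key=lambda k: pvals[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # walk from the largest p-value down to the smallest
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = min(running_min, 1.0)
    return adjusted
```

For example, `bh_adjust([0.01, 0.02, 0.03, 0.5])` gives adjusted values of 0.04 for the three small p-values and 0.5 for the last one, mirroring how the video builds adjusted p-values from rank and total test count; R users get the same numbers from `p.adjust(p, method = "BH")`.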

Published: Jan 9, 2017
Comments: 427
@statquest 2 years ago
Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
@ronnieli0114 3 years ago
My PhD dissertation relies heavily on bioinformatics and biostatistics, although my background is neuroscience. Naturally, I had a lot of learning to do, and your videos have helped me immensely. Every time I want to learn about a stats concept, I always type in my Google search, "[name of concept] statquest." Seriously, this is almost too good to be true, and I just wanted to thank you for providing this absolute gold mine.
@statquest 3 years ago
Wow! Thank you very much and good luck with your dissertation.
@simonpirlot2720 4 years ago
You make without a doubt the best videos about statistics on RU-vid: funny, clear, intuitive, visual. Thank you so much.
@statquest 4 years ago
Thank you! :)
@chetanatamadaddi8370 3 years ago
Totally second this...
@dysnomia6413 4 years ago
God bless you, I made screenshots of this video to explain this concept to my lab. This isn't the first time you've helped me with RNA-seq procedures. I have bumbled through a differential expression analysis. Trying to understand the statistical methods and knowing which option amongst several is the most logical is a mental hurdle. I am the only student in my lab currently undertaking bioinformatics and I am essentially trying to teach myself. There is a huge vacuum of knowledge in this realm amongst biologists and it's daunting. We all can generate data until we're blue in the face, but it doesn't do anyone any good until someone knows how to analyze it.
@statquest 4 years ago
Awesome! Good luck learning Bioinformatics.
@meg7617 3 years ago
Can't thank you enough!! Your methods are truly amazing. Being able to deliver them to us so cleverly is a true indication of how much effort you must have put into understanding these concepts .
@statquest 3 years ago
Wow, thank you!
@didismit1766 5 years ago
BAM BAM BAM, thanks a lot man...Your 20 minutes most likely saved hours of trying to understand from wikipedia...
@statquest 5 years ago
Sweet!!! Glad I could help you out. :)
@Demonithese 7 years ago
Fantastic video, thank you for taking the time to put this together.
@wenbaoyu 3 years ago
Wow wow wow how intuitive and visual. Can’t thank you enough for saving me from spending hours struggling to understand this concept🙏
@statquest 3 years ago
You're very welcome!
@zebasultana930 6 years ago
Awesome explanation !! Thanks for taking the time to make these videos and also answering questions from viewers so well. Going through them already answered some queries that I had :)
@user-gx3eg5sz9n 3 years ago
I'm from China and I watched your channel on Bilibili, but I couldn't get enough, so I followed you all the way here, to a paradise of data science! Thank you Josh, wish you the best!
@statquest 3 years ago
Wow, thank you!!!!
@fmetaller 4 years ago
I love you ❤️. I was so afraid of FDR adjustment because I thought the math behind was empirical and worked like magic but you made it surprisingly intuitive.
@statquest 4 years ago
Thank you! :)
@kakusniper 6 years ago
I'm currently learning to do RNA-seq data analysis; these videos are extremely helpful.
@daeheepyo3053 5 years ago
OMG!! This is the most beautiful explanation I've ever experienced...... Thank you so much professor.
@statquest 5 years ago
Awesome!!! Thanks so much.
@docotore 3 years ago
Simple, informative, and to the point. Absolutely perfect.
@statquest 3 years ago
Glad you liked it!
@chunhuigu4086 4 years ago
Great tutorial for FDR. The adjusted p-value is a p-value for the results that remain after cutting off the ones you can tell are not significant just from the distribution. It would be better if you could also say something about the q-value and how it reflects the quality of an experiment.
@yanggao8840 4 years ago
I have always hated math and you just make it clear and interesting! Can't thank you enough
@statquest 4 years ago
Hooray!!! I'm glad the video is helpful. :)
@loretaozolina8414 2 years ago
Thank you! This was really helpful and made me smile during my intense evening revision :)
@statquest 2 years ago
Glad it helped!
@RobertWF42 6 years ago
Cool, thanks for posting this, very intuitive! An equivalent method for eyeballing the # of true null hypotheses is to plot ranked 1 - p-value on the x-axis and the hypothesis test rank on the y-axis, then fit a line to the scatter plot, starting at the origin. Where the line hits the y-axis is your estimate of the # of true null hypotheses. Would like to see an intuitive explanation for the Benjamini-Yekutieli procedure, used in studies where the tests are not completely independent!
@niklasfelix7126 4 years ago
Thanks for the awesome explanation! Really informative and easy to follow. And the DOUBLE BAM in the end actually made me laugh out loud :D
@statquest 4 years ago
Awesome! :)
@broken_arrow1813 5 years ago
The clearest explanation of BH correction so far. Quadruple BAM!
@FengyuanHu 6 years ago
This is simply great!!! Thanks for sharing Joshua.
@shanoodi 7 years ago
This is the best video that explains FDR. Thank you,
@ryanruthart581 6 years ago
Great video, your example was clear and very well illustrated.
@hossam86 3 years ago
This is amazing. Very well explained and easy to understand!
@statquest 3 years ago
Glad it was helpful!
@AnkitDhankhar-uv6qd 1 month ago
First and foremost, I extend my heartfelt gratitude for providing such a series that elucidates concepts in an easily comprehensible manner. Bam !☺
@statquest 1 month ago
Thank you!
@frrraggg 1 year ago
As always, by far the best explanation on the web!
@statquest 1 year ago
Thanks!
@ericshaker9377 3 years ago
Wow was seriously struggling with my research since I dont know the first thing about statistics and I love this so so so much. So instructional I had to like
@statquest 3 years ago
BAM! :)
@tinAbraham_Indy 2 years ago
Thank you very much indeed for the perfect explanations and examples of the FDR concept. I really get my answer.
@statquest 2 years ago
Thanks!
@ramazanaitkaliyev8248 1 month ago
Great explanation, thanks ! clear explanation, amazing balance between theory and examples
@statquest 1 month ago
Thank you!
@fgfanta 7 days ago
From the way my university teachers (didn't) explain to me Benjamini-Hochberg, and after watching this video, I can claim I now understand Benjamini-Hochberg better than them, at a 99.7% confidence level!
@statquest 7 days ago
BAM! :)
@archanaydv995 5 years ago
Just wow!! Thank you for this.
@li-wenlilywang8856 6 years ago
Thank you so much for this great movie!! Great explanation.
@abdullahalfarwan1458 3 months ago
Thanks Josh, much appreciated. A concise and useful video.
@statquest 3 months ago
Thank you!
@diegocosta2383 3 years ago
Nice video, simple and fast.
@statquest 3 years ago
Thanks!
@reflections86 1 year ago
Josh is a genius. Really appreciate your work statquest.
@statquest 1 year ago
Thank you! :)
@ieserbes 7 months ago
As always, it is a great explanation. Thank you Josh 👏
@statquest 7 months ago
Thank you!
@PedroRibeiro-zs5go 6 years ago
Dude thanks so much, this video is AWESOME!!!
@weihe3639 6 years ago
Very nice explanation!
@rodrigohaasbueno8290 5 years ago
I have to keep saying that I love this channel so much
@statquest 5 years ago
Hooray!!! Thank you so much!!! :)
@annawchin 3 years ago
This was SUPER helpful, thank you!
@statquest 3 years ago
Thank you! :)
@adelinemorez8072 11 months ago
I love you StatQuest. Thank you for never letting me down. You were always present to answer my deepest and most shameful doubts. You never abandoned me during the darkest hours of my PhD.
@statquest 11 months ago
I'm so happy to hear my videos helped you. BAM! :)
@afraamohammad1001 4 years ago
Thanks for your effort and simplified explanation!!! life saver ))
@statquest 4 years ago
Glad it helped!
@junymen223 7 years ago
Thanks a lot. Mr. Joshua
@user-dk4ss4gp3l 1 year ago
This is my first time fully understanding FDR ...
@statquest 1 year ago
bam!
@agnellopicorelli4751 3 years ago
I just love your videos. Thank you so much!
@statquest 3 years ago
Thank you! :)
@RavindraThakkar369 2 years ago
Nicely explained.
@statquest 2 years ago
Thank you!
@telukirIY 6 years ago
Good explanation
@karinamatos4253 3 years ago
Great explanations!
@statquest 3 years ago
Thanks!
@maryamsediqi3625 3 years ago
Thank you sir, was very useful 🙏
@statquest 3 years ago
Glad it helped
@timokvamme 3 years ago
Nice explanation!
@statquest 3 years ago
Thanks!
@barbaramarqueztirado7567 2 years ago
Thank you very much for the explanation, very very clear!!
@statquest 2 years ago
Muchas gracias!
@kezhang1460 3 years ago
BAM!!! Finally I understand it, after it confused me for half a year!!
@statquest 3 years ago
BAM! :)
@ygbr2997 1 year ago
the second half is hard to understand, but I know I will come back later and watch it again, and again, and again until I finally understand it
@statquest 1 year ago
Let me know if you have any specific questions.
@yoniashar3179 5 years ago
This is a great video. Also, could you help me understand how the intuitive understanding (the histograms of p-values coming from two distributions) connects to the mathematical steps of the B-H procedure? Thank you!
@isaiasprestes 6 years ago
1 thumb down is a case of FDR :)
@statquest 4 years ago
So true! :)
@ilveroskleri 4 years ago
Thanks, that was precious (and spared me hours of frustration)
@statquest 4 years ago
Thanks! :)
@poiskkirpitcha2003 4 years ago
Thank you, bro!
@sunjulie 3 years ago
It's so good, I want to give it more than one thumb up!
@statquest 3 years ago
Double BAM! :)
@noahsplayground2564 3 years ago
Hey Josh, love you videos on stats, specifically centered around hypothesis testing. Can you do more videos on the different techniques of hypothesis testing, like (group) sequential testing and multi-armed bandit?
@statquest 3 years ago
I'll keep that in mind.
@unavaliableavaliable 1 year ago
This video is so beautiful.. Thank you so much
@statquest 1 year ago
I'm glad you like it!
@arem2218 3 years ago
Thank you, nicely explained
@statquest 3 years ago
You are welcome!
@worldofinformation815 3 years ago
Thank you Sir🌹
@statquest 3 years ago
Thank you!
@RobertWF42 6 years ago
One part I don't quite understand is how the intuitive eyeball method translates into the B-H p-value adjustments you explain starting at ~15:00. To me, plotting a line along the H0 = True p-values sounds like you would be fitting a linear regression & identifying the outliers < .05.
@karolnowosad886 3 years ago
I love the explanation!
@statquest 3 years ago
Thank you! :)
@torquehan9404 1 year ago
I don't understand one thing. If samples are taken from the same population, wouldn't the p-value bins NOT be evenly distributed? I'd expect them to be skewed toward p = 1, because the data are normally distributed and samples close to the average are the most likely to be picked.
@statquest 1 year ago
By definition, p-values are uniformly distributed under the null. By definition, a p-value threshold of 0.05 means that 5% of the random tests will give results equal to or more extreme; a threshold of 0.1 means 10%, etc. etc. etc.
@torquehan9404 1 year ago
Thanks a lot!
@rongruo2624 4 years ago
I'd like to know why when samples come from the same distribution, the p values are uniformly distributed? Thank you!
@vaibhavijoshi6443 4 years ago
This is amazing. thank youu.
@statquest 4 years ago
Thank you! :)
@BadalFamily 4 years ago
Hi Josh! Great stuffs here. Could you please make a video on "Significance Analysis of Microarrays". Mainly how it differs from T-stat/Anova. Really appreciate you for all the videos.
@statquest 4 years ago
I'll keep it in mind, but I can't promise I'll get to it soon.
@biancaphone 5 years ago
Would love a video about the target decoy approach
@statquest 5 years ago
OK. I've added it to the to-do list. :)
@congchen170 7 years ago
Very nice video and I learned a lot from it. The only thing is, when you give examples you tell us that if you calculate p-values 10,000 times, the distribution of p-values will look like this or like that, but I don't know whether that's true or not. So I am wondering, can you explain a little bit more, or is there any further reading I can do about p-values and adjusted p-values?
@dingdingdingwen 2 years ago
Great channel and fantastic content! I am wondering if you could make an episode about IDR, Irreproducible discovery rate. It is difficult to find a good explanation or usage guide on it.
@statquest 2 years ago
I'll keep that in mind.
@ucheogbede 1 year ago
This is very great!!!
@statquest 1 year ago
Thank you!
@hedaolianxu2748 4 years ago
AWESOME! Thank you!
@statquest 4 years ago
:)
@thomasalderson368 6 years ago
thanks josh!
@statquest 6 years ago
You are welcome!!! I'm glad you like the video! :)
@Ken-vp6xc 5 years ago
Hey thanks for the video. Just a question, don't you have higher chance of having samples that come from the middle of the distribution than the tails resulting having more large p-values than small ones? I don't get why p-values are uniformly distributed? Thanks :)
@statquest 5 years ago
You know, I found this puzzling as well. However, imagine we are taking two different samples from a single normal distribution. If we did a t-test on those samples, 5% of the time the p-value would be less than 0.05. Now imagine we created 100 random sets of samples and did 100 t-tests. 5 of those p-values will be less than 0.05. 10 will be less than 0.1, 15 will be less than 0.15.... 50 will be less than 0.5.... 90 will be less than 0.90, etc. This isn't a mathematical proof, but it makes sense - the whole idea of having any p-value threshold, x, is that we are only expecting, x percent of the tests with random noise to be below that threshold. Thus, we have a uniform distribution of p-values.
@RobertWF42 5 years ago
Also keep in mind that when computing p-values for the difference between two sample means, p-values of .05 or less cover a wider range of x values than say p-values between .50 and .55.
@Tbxy1 3 years ago
@@statquest Wow, I had the same question as Ken. Thanks for giving this super intuitive explanation!
@lizheltamon 1 year ago
@@Tbxy1 me too! been struggling to understand that part and thank god Ken asked 😅
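Josh's explanation in this thread can be checked with a quick simulation. Below is my own sketch (not code from the video): it draws many pairs of samples from one normal distribution, computes a two-sided two-sample p-value (using the standard normal in place of the t distribution, which is a good approximation at n = 50), and confirms that roughly 5% of the p-values fall below 0.05 and roughly 50% fall below 0.5 — the uniform behavior described above. The function name `two_sided_p` is made up.

```python
import math
import random

def two_sided_p(x, y):
    """Two-sample test p-value, using the standard normal in place of
    the t distribution (a good approximation for the n = 50 used here)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    z = (mx - my) / math.sqrt(vx / nx + vy / ny)
    # two-sided p-value from the standard normal CDF (via erf)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

random.seed(42)
pvals = [
    two_sided_p([random.gauss(0, 1) for _ in range(50)],
                [random.gauss(0, 1) for _ in range(50)])
    for _ in range(2000)
]
# Under the null, p-values are (approximately) uniform on [0, 1]:
frac_below_05 = sum(p < 0.05 for p in pvals) / len(pvals)
frac_below_50 = sum(p < 0.50 for p in pvals) / len(pvals)
```

The two fractions should land close to 0.05 and 0.50 respectively, up to sampling noise.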
@zijianchen4775 5 years ago
It is crystal clear about FDR and the BH method, much better than what my professor said.
@StephenRoseDuo 6 years ago
Awesome, this may be too niche but could you do a video on local FDR please?
@thiagomaiacarneiro2829 1 year ago
Great video! Congratulations. I've seen the paper of Benjamini and Hochberg 1995, but (guided by my very limited knowledge of math) I was not able to find the formula in the way you explained. Please, could you give some clarifications on this issue, as some kind of transformation of the mathematical procedure? Thank you very much. Best wishes.
@statquest 1 year ago
I'll keep that in mind.
@donnizhang5960 5 months ago
I have the same questions. Did you figure out the logic behind the mathematical procedure? Thank you!
@mihaellid 4 years ago
BAMMMM! Thank you!
@statquest 4 years ago
Hooray! I'm glad you like the video. :)
@tysonliu2833 5 months ago
I think you previously talked about how to calculate p value for one sample set that tells us how likely the sample set belongs to the distribution, but in here we are calculating the p-value of two sample sets, and try to tell whether they belong to the same distribution, how is it calculated? Or is it simply just comparing one sample set to the distribution and another and if they both likely belong to the same distribution we say we fail to reject the null hypothesis?
@statquest 5 months ago
In this video I believe I'm using t-tests. To learn about those, first learn about linear regression (don't worry, it's not a big deal): ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-nk2CQITm_eo.html and then learn how to use linear regression to compare two samples to each other with a t-test: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-NF5_btOaCig.html
@mihirgada5585 1 year ago
Thanks for these videos! They are great!! Can you help me understand the intuition behind why the p-values are uniformly distributed in the samples from the same distribution?
@statquest 1 year ago
Think about how p-values are defined. If there is no difference, the probability of getting a p-value between 0 and 0.05 is... 0.05. And the probability of getting a p-value between 0.05 and 0.1 is also 0.05, etc.
@thomasmatthew9515 7 years ago
Question on the application of the B-H method: I have a distribution of p-values and KS D-values from comparing two distributions: 1) a distribution of transcriptional changes (observed), and 2) a distribution of transcriptional changes formed from random shuffling (null). I wish to adjust the p-values to weed out any false positives. When I rank the p-values, can I simply choose all p-values in the "< 0.05 bin" of the observed distribution? That kind of mimics what you did in the first example starting @ 14.47. But in the second example @ 17:07, how did you actually compute the adjust p-vales? Did you just repeat the method on the blue boxes (observed) and on the red boxes (null) separately? Thanks, and keep up the great videos!
@thomasmatthew9515 7 years ago
That makes sense. Your approach eliminates p-value adjustment: just select a cutoff where no more than 5% of the combined (and sorted) p-values come from the permuted set. Then for any p-value from that combined set I can say "this p-value has an FDR of
@thomasmatthew9515 7 years ago
Joshua Starmer I'll try all three and see which samples get eliminated. Thanks again for your feedback, you're more helpful than most of my professors!
@JadAssaf 6 years ago
Thank you so much.
@statquest 6 years ago
Hooray! I'm glad you like the video! :)
@JadAssaf 6 years ago
I've been reading publications for an hour and you solved my problem in 10 minutes.
@statquest 6 years ago
Awesome!!! This is definitely one of those things that's easier to "see" then to read about. Glad I could help. :)
@krisdang 7 years ago
This is awesome. Imma save it for later reference hah
@TaylanMorcol 1 year ago
Hi Dr. Josh, I'm curious to get your thoughts on a simulation I'm running. It's very similar to the simulation in this video where you calculate 10,000 p-values by sampling from the same distribution. When I run my simulation using a Welch t-test and n=3, only ~3.5% of p-values are less than 0.05. The percentage converges on 5% when I increase the sample size or use the Student's t-test. It seems as though forgoing the equal variances assumption sacrifices some power, especially at low sample sizes. But I'm still trying to grasp why that is and what the implications are for using the Welch t-test with low sample size in real-life situations. For example, if the null hypothesis is that both samples come from the same population, then why not just assume equal variances and use Student's t-test all the time? (I know that last question is probably conflating some concepts that should be separate, but I'm having a hard time keeping track of it all, and I'm really interested to hear how you would respond to that question). You seem to have a great way of explaining things like this intuitively. I'm curious to hear your thoughts. Thanks so much! I've benefited greatly from your videos.
@statquest 1 year ago
It makes sense to me that welch's t-test has less power with low sample sizes because it makes less assumptions - and thus, has to squeeze more out of the data by estimating more parameters.
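The commenter's observation can be reproduced with a short simulation. This is my own sketch (not the commenter's code; function names are made up, and the t-distribution tail probability is computed by numerically integrating the density so only the standard library is needed). With equal sample sizes, the Student and Welch t statistics are identical; only the degrees of freedom differ, and Welch's smaller Welch-Satterthwaite df is what makes it conservative at n = 3.

```python
import math
import random

def t_tail(t, df):
    """P(T > t) for Student's t with df degrees of freedom, computed by
    midpoint-rule integration of the density over [t, t + 50]."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    n_steps, width = 2000, 50.0
    h = width / n_steps
    return h * sum(c * (1 + (t + (i + 0.5) * h) ** 2 / df) ** (-(df + 1) / 2)
                   for i in range(n_steps))

def t_stat_and_dfs(x, y):
    """Shared t statistic plus (Student df, Welch-Satterthwaite df).
    With equal sample sizes the two tests share the same t statistic."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    se2 = vx / nx + vy / ny
    t = (mx - my) / math.sqrt(se2)
    df_student = nx + ny - 2
    df_welch = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df_student, df_welch

random.seed(1)
trials, n = 1000, 3
student_rej = welch_rej = 0
for _ in range(trials):
    x = [random.gauss(0, 1) for _ in range(n)]
    y = [random.gauss(0, 1) for _ in range(n)]
    t, df_s, df_w = t_stat_and_dfs(x, y)
    if 2 * t_tail(abs(t), df_s) < 0.05:
        student_rej += 1
    if 2 * t_tail(abs(t), df_w) < 0.05:
        welch_rej += 1

student_rate = student_rej / trials  # close to the nominal 0.05
welch_rate = welch_rej / trials      # noticeably smaller at n = 3
```

Because the t statistic is shared and the Welch df is never larger than the Student df, every Welch rejection is also a Student rejection here, which is the power loss the commenter measured.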
@TheRonakagrawal 8 months ago
@statquest: Josh, Thank you. I have a follow-up though. Sure, we could adjust the p-values to reduce the False positives, but could this adjustment cause an increase in False negatives? Is there a way to quantify that? Apologies if I am missing something obvious.
@statquest 8 months ago
There are different methods to control the number of false positives, some do a better job than others at keeping the number of false negatives small. FDR is one of the best methods for limiting both types of errors. In contrast, the Bonferroni correction is one of the worst.
@Priestessfly 3 years ago
great video
@statquest 3 years ago
Thank you!
@urjaswitayadav3188 6 years ago
Great video! I have a question on distribution of p-values: I am doing a likelihood ratio test and calculating significance p-values from Chi-square test. I see that the distribution of my uncorrected p-values is not uniform near p-value 1. It has a large peak at p-value 1 i.e. most of my data-points has p-value of 1. Do you have any insights on how that might happen? And what can be the best way to correct for multiple hypothesis testing in this case. Because, using BH, I lose all the significance :( Thanks!
@urjaswitayadav3188 6 years ago
Thanks Joshua!
@chimiwangmo1512 3 months ago
Thank you for the intuitive video. I am awfully new to statistics, so I have three questions. Suppose it is a classification problem: 1. Are "samples" referred to as "classes" (types of genes), or are they samples of genes? 2. Will the null hypothesis be that there is no dependency between the gene and the samples? 3. Why 10,000 times? (I am a bit confused about the relationship between the 10,000 genes and the 10,000 tests, as I understand that for each test the distribution plot is based on values of genes.)
@statquest 3 months ago
1) I'm not sure I understand the question, because we are trying to classify the expression as being "the same" or "different" between two groups of mice or humans. 2) The null hypothesis is that all of the measurements come from the same population. 3) When we do this sort of experiment, we test between 10,000 and 20,000 genes to see if they are expressed the same or differently between two groups of mice or humans or whatever. So, for each gene in the genome, we do a test to see if it is the same or different. This allows us to identify genes that play a role in cancer or some other disease.
@TheJosephjeffy 2 years ago
I am glad to see this video as i am doing some FDR tests in my project. I have a question: what if the false positive samples remained after adjustment? Is it still acceptable if FDR is < 0.05?
@statquest 2 years ago
You can not eliminate false positives, but you can use FDR to control how many there are. So typically people call all tests with FDR < 0.05 "significant".
@jbeebe2 6 years ago
Thanks
@zeyads.el-gendy4227 3 years ago
I truly love you...
@statquest 3 years ago
Thank you! :)
@bzaruk 2 years ago
At 6:45, when you mentioned the p-value of 3 technical samples - how do you calculate a p-value for 3 technical samples as one number? Do you average them before calculating the p-value? Sum them up? Or average the p-values of each of the 3 technical samples?
@statquest 2 years ago
I'm not sure I understand your question. However, essentially what I'm saying at that point is that we start with a single normal distribution and randomly select values from it (for details, see: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-XLCWeSVzHUU.html ), then we perform a statistical test (for example, a t-test) to calculate the p-value. We then repeat this process 10,000 times and create a histogram of the p-values. This will create a histogram of p-values for when the null hypothesis is true (for details, see: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-0oc49DyA3hU.html )
@karimnaufal9792 4 years ago
Holy freaking nuts!! Thank you haha...
@statquest 4 years ago
Yes! :)
@zihanyang7565 4 years ago
Could you kindly explain the post hoc tests for ANOVA?
@sergiooterinosogo4286 3 years ago
Thank you for your very helpful video. I have one question here: what I have understood from the calculation of the FDR is that it will make only the smaller p-values still be significant after the correction, am I right? (you suggested it at 12:09) Nevertheless, I got distracted at 17:20 because there are smaller values in the red area that, based on this, would not be "false positives" if I got your explanation. Could you clarify this? Thank you :)
@statquest 3 years ago
The numbers in the blue boxes are p-values that were created from two separate distributions. Some of those p-values are below the standard threshold of 0.05 and some are not. The ones that are not are "false negatives". The numbers in the red boxes are p-values that were created from a single distribution. Some of those p-values are below the standard threshold of 0.05 and some are not. The ones below the threshold are false positives. However, in this specific example, after we apply the BH procedure (at 18:02), all of the false positives end up with p-values > 0.05 and are no longer considered statistically significant, so the false positives are eliminated.
@yuyangluo7292 3 years ago
i love how he made that joke about wild type with monotone lol
@statquest 3 years ago
:)
@annas.1403 5 years ago
Hey sorry to bother you (or anyone else who reads this comment) but I am currently trying to understand the connection between FDR and p-hacking. I am not sure if I understood this right but: Can an inflated FDR appear if researchers trying to get a significant result through multiple comparisons by running more than one independent test on the same data set. Or have I misunderstood FDR completely?
@chadmoon3139 10 months ago
Awesome!!
@statquest 10 months ago
Thanks!
@oliveros9 6 years ago
1000 thanks! One naive question: why is the distribution of p-values flat when testing samples taken from the same distribution? I'd rather have expected a distribution skewed toward high p-values (non-significant). Thanks again!
@oliveros9 6 years ago
Thanks! In fact, we just made a simulation (in R language) and we obtained the described behaviour (flat distribution). And it is true for any number of replicates. Your explanation is crystal clear to me. Thanks again! Nice channel!
@mausunk 4 years ago
Bump, I have the exact same question