Introduction to Bayesian data analysis - Part 2: Why use Bayes?

Views: 81K

Try my new interactive online course "Fundamentals of Bayesian Data Analysis in R" over at DataCamp: www.datacamp.c...
----
This is part two of a three-part introduction to Bayesian data analysis. This second part aims to explain why Bayesian data analysis is useful. If you haven't watched part one, I really recommend that you do that first: • Introduction to Bayesi...
More Bayesian stuff can be found on my blog: sumsar.net. :)

Published:

28 Sep 2024

Comments: 81

@DavenH 3 years ago

This was so clearly and amusingly demonstrated. Great video!

@hellojeyjey 4 years ago

I would be happy to delve further into the interpretations of ML algorithms from a Bayesian perspective, which you talk about at around 20 minutes into the video. I get the linear regression, but I'm curious to learn more.

@MyDarkestFlower 6 years ago

"I think he is smoking tobacco, but I don't know" hahahaha

@alvarosalgado3121 5 years ago

Great explanations, thanks! I just didn't quite understand why, in the A/B testing, you should accept only results that match "both" datasets at the same time. Is it simply so you can apply arithmetic to the resulting generated data (like you did for getting the difference between A and B)? Thanks!

@ruthsindie2660 5 years ago

Finally I have an idea of how to apply Bayesian analysis to optimization.

@markstrong3018 5 years ago

Please, could you provide a method for calculating the alpha and beta parameters of a beta distribution given a particular probability? I noticed that in your example the probability of success ranges between 5% and 15%, and you generated a distribution with params Beta(3,28). So the question is: how did you arrive at alpha = 3 and beta = 28? Thanks!

@nickjames1066 4 years ago

I would also like to see explicit code for how one creates beta distributions in R. I'm currently trying with dbeta() but I can't get the same distribution you get for alpha=3, beta=25.

@ElGtheTS 4 years ago

@nickjames1066 Use rbeta, just like runif and rbinom. E.g. hist(rbeta(n = 10000, shape1 = 3, shape2 = 25), col='darkgreen')
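A rough way to check a candidate prior against a stated range, sketched in Python (standard library only) rather than R. It draws from Beta(3, 25), the parameters quoted in the replies above, and counts how many draws fall between 5% and 15%. The seed and number of draws are arbitrary assumptions, and random.betavariate plays the role that rbeta plays in R.

```python
import random

random.seed(1)  # arbitrary seed so the sketch is repeatable

# Beta(3, 25): the parameters quoted in this thread (assumed to be the video's prior)
alpha, beta = 3, 25
draws = [random.betavariate(alpha, beta) for _ in range(20_000)]

# Analytic mean of a Beta(alpha, beta) distribution
prior_mean = alpha / (alpha + beta)

# Share of simulated rates that land inside the range the CEO is quoted as giving
share_in_range = sum(0.05 <= d <= 0.15 for d in draws) / len(draws)

print(f"prior mean: {prior_mean:.3f}")
print(f"share of draws between 5% and 15%: {share_in_range:.2f}")
```

Since the mean of a Beta distribution is alpha / (alpha + beta) and a larger alpha + beta gives a narrower distribution, one plausible approach is to pick a mean near the middle of the range and then adjust the sum by trial and error until most draws land inside it.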

@angelf.escalante7825 6 years ago

Dude, you're a hero!

@luigineri4364 7 years ago

Congrats on the videos. They are amazing. I really like the way you explain. It's brilliant. Also, your voice is a good fit for videos. When you calculate the profit for method B, shouldn't you take out 300 for the salmon and still 30 for the mail, since you send both?

@rasmusab 7 years ago

Ah, but the postage system works in mysterious ways. If you're already sending a salmon, then you don't need to pay postage for the brochure, which was the main cost of the 30 kr. :)

@clapdrix72 4 years ago

Why do we need to discard samples when only one draw matches the real data (A:4, B:10)? Why not just throw away the sample for the non-matching method or even just sample them separately? It isn't a bivariate distribution so each method's draws are independent of the other method's.

@alebachewtaye7454 4 years ago

It is very interesting! How do I import data from other software into WinBUGS for analysis?

@wahabfiles6260 5 years ago

Please, please make a video on Gaussian processes... None of the videos on YouTube give intuition like your videos do.

@RottenMonkeyderAffenkopf 6 years ago

Really, really good.

@avidreader100 5 years ago

Can we use the posterior of one round of computation as the informative prior of the next round of improved estimate? When it has no relation to reality such as an expert opinion, and still has its origin from a wild guess of uniform distribution, would it be a better estimate?

@shaelynnwatt2107 4 years ago

I came with the same question!!
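On using one round's posterior as the next round's prior: for a Beta prior with binomial data the update has a simple closed form, so the idea can be sketched without simulation. The example below is a minimal illustration in Python. The function name update_beta and the data split into two rounds are hypothetical, chosen only to show that updating in two steps gives the same posterior as analysing all the data at once.

```python
def update_beta(alpha, beta, successes, trials):
    """Conjugate update: Beta(alpha, beta) prior + binomial data -> Beta posterior."""
    return alpha + successes, beta + (trials - successes)

# Hypothetical data, split into two rounds (numbers chosen for illustration only)
uniform_prior = (1, 1)  # Beta(1, 1) is the uniform distribution on [0, 1]
after_round_1 = update_beta(*uniform_prior, successes=6, trials=16)
after_round_2 = update_beta(*after_round_1, successes=4, trials=10)

# The same data analysed in one go, starting from the same uniform prior
all_at_once = update_beta(*uniform_prior, successes=10, trials=26)

print(after_round_2, all_at_once)
```

Under these assumptions the posterior after round two already contains everything learned in round one, which is why reusing it as a prior does not count the same evidence twice as long as the rounds use different data.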

@frashertseng9426 6 years ago

Hi, thanks for the great talk. May I ask a question: do I need to keep the order of rateA and rateB when computing the diff, or can I randomly draw the values from A and B to compute the diff?

@rasmusab 6 years ago

Generally order matters, but in this specific case it doesn't as the two rates are completely unrelated. :)

@JohnDraper1993 4 years ago

@rasmusab A great video, thank you. I do have a question though ;) If my arrays of rates A and B are of different sizes, how can I calculate the rate_diff distribution? Or have I made a mistake and they should be of equal size?

@benjaminthomas1369 2 years ago

@JohnDraper1993 Nice lecture! I actually came across the same problem - do you have a solution for that? Thanks so much for your teachings - will check out your new course for sure!!!

@benjaminthomas1369 2 years ago

@rasmusab Nice lecture! I actually came across the same problem as John - do you have a solution for that? Thanks so much for your teachings - will check out your new course for sure!!!
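One possible answer to the different-size question, sketched in Python under stated assumptions. If A and B are fitted separately, each rejection sampler keeps a different number of draws. Because the two rates are unrelated in this model (as the reply above says), the draws can be paired in any order, so the longer list can be truncated or both can be resampled to a common size. The data (6 of 16 for A, 10 of 16 for B) are an assumption inferred from the 0.38 and 0.63 rates quoted elsewhere in the comments, and posterior_draws is a hypothetical helper.

```python
import random

random.seed(2)  # arbitrary seed for repeatability

def posterior_draws(successes, trials, n_prior_draws):
    """Rejection sampling: keep uniform-prior rates whose simulated data match."""
    kept = []
    for _ in range(n_prior_draws):
        rate = random.random()  # draw from a uniform prior
        simulated = sum(random.random() < rate for _ in range(trials))
        if simulated == successes:
            kept.append(rate)
    return kept

# Assumed data: 6 of 16 sign-ups for A, 10 of 16 for B
post_a = posterior_draws(6, 16, 20_000)
post_b = posterior_draws(10, 16, 30_000)  # more prior draws, so more kept draws

# Option 1: truncate both lists to the shorter length and pair them up
n = min(len(post_a), len(post_b))
diff_truncated = [b - a for a, b in zip(post_a[:n], post_b[:n])]

# Option 2: resample each posterior (with replacement) to a common size
size = 5_000
diff_resampled = [random.choice(post_b) - random.choice(post_a) for _ in range(size)]

print(len(post_a), len(post_b), len(diff_truncated), len(diff_resampled))
```

Neither option is valid when the parameters depend on each other. In that case the draws have to come from one joint run, and they will then have equal length automatically.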

@ayushpandey5261 7 years ago

Unable to find part 3. Could you help me with it?

@rasmusab 7 years ago

ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-Ie-6H_r7I5A.html

@siqizhang 6 years ago

Do you realize that your Nordic accent is very sexy?

@pedrocolangelo5844 3 years ago

Rasmus, you should really start making videos again. Seriously, the way you teach is, by far, one of the best I've seen on RU-vid (and I've watched plenty of videos here since I am a self-taught student of economics). I am amazed how tricky and deep concepts seem so simple when explained by you. Thank you for this series on Bayesian Statistics.

@qqq_Peace 1 year ago

I can't agree more with what you said, so I second!

@andrewkostandy9510 4 years ago

Thank you for the excellent introduction series! Shouldn’t the profit calculation at 17:58 also take into consideration the cost of the campaigns for the non-respondents? So it would be profitA = (rateA*(1000-30))+(1-rateA)*-30. And then profitB = (rateB*(1000-300))+(1-rateB)*-300

@BiancaAguglia 4 years ago

I too was a little confused about the results, then I realized Rasmus calculated the average profit per person. Think of it this way: let's say you use campaign A for n people and your signup rate is r. Then:
1. your total profit is: total_profit = n*r*1000 - n*30
2. your average profit is: average_profit = total_profit / n = (n*r*1000 - n*30) / n = r*1000 - 30
Using Rasmus's notation, average_profit = rateA*1000 - 30 😊 His numbers are right. I hope this helps.
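The two ways of writing the profit in this exchange can be checked against each other with a few lines of Python. This is only an arithmetic sketch: the function names are hypothetical and the rate of 0.375 is an illustrative value, not a result from the video.

```python
def average_profit(rate, revenue_per_signup, cost_per_recipient):
    """Expected profit per recipient: everyone costs money, only sign-ups pay."""
    return rate * revenue_per_signup - cost_per_recipient

def average_profit_split(rate, revenue_per_signup, cost_per_recipient):
    """The same quantity, split into respondents and non-respondents."""
    respondents = rate * (revenue_per_signup - cost_per_recipient)
    non_respondents = (1 - rate) * (-cost_per_recipient)
    return respondents + non_respondents

rate_a = 0.375  # illustrative sign-up rate, not a result from the video
simple = average_profit(rate_a, 1000, 30)
split = average_profit_split(rate_a, 1000, 30)
print(simple, split)
```

Expanding the split version gives rate*1000 - rate*30 - 30 + rate*30, and the two rate*30 terms cancel, so the cost for non-respondents is already included in the shorter formula.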

@HansPeterSloot 5 years ago

One question: why don't you keep the 72% rate2 for model B at t=268, instead of throwing both tries away?

@antonsuess3238 7 years ago

Really like your 3 part introduction on Bayesian modelling! Clearly structured, focussed and entertaining - thank you!

@pascaltorvic6246 6 years ago

Nothing to add. You summed up my feelings perfectly.

@ivan_toriya 2 years ago

"How much we should trust our CEO? I think he is smoking tobacco, but I don't know"

@nicolapsychotica3350 5 years ago

Do you make the slides available? Such great explanations. Love your presentations. Thank you so much 🙏

@panagiotissourtzinos9175 7 years ago

Great videos!!! Thank you very much. I have a question though that troubles me. I didn't understand why, while calculating the posteriors for both the A and B methods, we have to consider one together with the other. Why do we need to keep the probability values only when both model responses agree with the observed responses? Couldn't we produce the posteriors by running the models independently?

@rasmusab 7 years ago

Yes, for this specific model, that is correct! If you can figure out which parameters are independent of each other then you can run those parts independently. In general, however, you would have to run it all at the same time. :)

@jenyasidyakin8061 5 years ago

You need the normalizer P(data) to be the same for both A & B in order to compare them, so you need to accept the parameters for both at the same time.

@jenyasidyakin8061 5 years ago

Answer to the question "Why accept both A & B at the same time": I think I got this one :). For both A & B you need a normalizer, often called the evidence or the marginal likelihood, P(data). So it must be proportional and the same for A & B; that's why, when you want to compare A and B, you must accept both. It is the evidence for the joint distribution of A & B.
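The point made in the author's reply, that independent parameters can be fitted separately, can be checked with a small simulation. The Python sketch below keeps draws of rate A under both rules (accept only when A and B both match, or accept whenever A alone matches) and compares the results. The observed counts of 6 and 10 out of 16 are an assumption inferred from the 0.38 and 0.63 rates quoted elsewhere in the comments, and the function name is hypothetical.

```python
import random

random.seed(4)  # arbitrary seed for repeatability

def simulate_signups(rate, trials=16):
    """Generative model: number of sign-ups among `trials` recipients."""
    return sum(random.random() < rate for _ in range(trials))

obs_a, obs_b = 6, 10  # assumed observed sign-ups out of 16 for A and B

joint_a = []      # rate_a kept only when BOTH simulations match the data
separate_a = []   # rate_a kept whenever A's simulation alone matches

for _ in range(100_000):
    rate_a, rate_b = random.random(), random.random()  # independent uniform priors
    match_a = simulate_signups(rate_a) == obs_a
    match_b = simulate_signups(rate_b) == obs_b
    if match_a:
        separate_a.append(rate_a)
    if match_a and match_b:
        joint_a.append(rate_a)

mean_joint = sum(joint_a) / len(joint_a)
mean_separate = sum(separate_a) / len(separate_a)
print(len(joint_a), len(separate_a))
print(round(mean_joint, 2), round(mean_separate, 2))
```

In this sketch both rules target the same posterior for rate A, but the joint rule throws away far more draws. The joint rule becomes necessary once the model links the parameters, for example through a shared prior.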

@karannchew2534 2 years ago

17:30 should be profitB = rateB x 1000 - (300 + 30)

@BharathAllu 7 years ago

Is there an R tutorial for this video?

@yairgs9899 5 years ago

¡Breathtaking! ¡Bravo! No words, just thank you for sharing.

@jiayupeng3515 6 years ago

Great introduction to Bayesian thinking. Much clearer than textbooks.

@jrdetka 5 years ago

Thank you Rasmus! A very clear and accessible introduction with lots of opportunities to actually apply what we learned. Thank you for all your work on this.

@KeisOhtsuka 2 years ago

17:34 Method B: Don't you send out a brochure (30 Kr) with a salmon (300 Kr), unless it is already included (Fish 270 Kr, Brochure 30 Kr)? Don't forget to include a QR code. :)

@karannchew2534 2 years ago

Notes for my future revision. Why Bayesian data analysis?
0:29 It is easy to change a Bayesian model while the computation stays the same.
0:32 You have great flexibility when building Bayesian models, and can focus on that rather than on computational (algorithmic) issues.
There are often computational (processing) issues in fitting a Bayesian model. But since there is a clean separation between specifying and fitting a model in a Bayesian framework, you often don't have to focus too much on how your model is computed when you construct it. That means you can focus on what assumptions are reasonable and what information you should use, rather than on the algorithm, when doing the actual modelling.
There are many tools to help with fitting Bayesian models (Stan, PyMC); just specifying the model might be enough.
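The separation between specifying and fitting a model that these notes describe can be illustrated with a small Python sketch. The fitting function below is a generic rejection sampler that never needs to know which prior it is given, so changing the model means changing one function. Everything here is an assumption made for illustration: the helper names, the data (6 sign-ups out of 16) and the use of Beta(3, 25) as the informative prior mentioned elsewhere in the comments.

```python
import random

random.seed(5)  # arbitrary seed for repeatability

def fit(prior_draw, observed, trials, n_draws):
    """Generic rejection sampler: it never needs to know which prior is used."""
    kept = []
    for _ in range(n_draws):
        rate = prior_draw()
        if sum(random.random() < rate for _ in range(trials)) == observed:
            kept.append(rate)
    return kept

# Model specification: only these two functions differ between the analyses
def uniform_prior():
    return random.random()

def informative_prior():
    return random.betavariate(3, 25)  # prior parameters quoted in the comments

def mean(values):
    return sum(values) / len(values)

post_uniform = fit(uniform_prior, observed=6, trials=16, n_draws=30_000)
post_informed = fit(informative_prior, observed=6, trials=16, n_draws=30_000)

print(round(mean(post_uniform), 2), round(mean(post_informed), 2))
```

Tools such as Stan or PyMC take the place of the fit function with far more efficient algorithms, but the division of labour is the same in spirit.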

@avidreader100 5 years ago

At 11:50 you say the CEO suggested the rate of sign up is usually between 5% and 50%. But a little after 9:12, he was quoted as saying 'between 5% and 15%'. I guess the accuracy does not matter so much. You are perhaps interested in getting closer than the initial model with uniform distribution.

@KeisOhtsuka 2 years ago

7:50 I am confused by rate_diff. Simulated subscription rates for Method A and B are not related in any way other than the order in which they are generated (cf. unlike repeated measures pre- and post-test scores). Is it meaningful to calculate difference scores between two numbers that are not related? Can we just look at confidence intervals?

@randomized6105 3 years ago

Finally Bayesian is making sense

@schinkelaner93 5 years ago

All three parts are super helpful. Thanks a lot!

@buffalobill212 6 years ago

Great video! For the decision analysis, why do you need Bayesian analysis? You could just use the maximum likelihood estimate with the cost equation and see that the brochure alone is better (0.38*1000 - 30 > 0.63*1000 - 300). If you're making a decision based on this, it doesn't seem beneficial to do the Bayesian analysis.
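One way to see what the point-estimate comparison leaves out is to look at how uncertain the ranking is. The Python sketch below assumes a uniform prior and 6 of 16 (A) and 10 of 16 (B) sign-ups, which matches the 0.38 and 0.63 in the comment. Under that assumption the posteriors are Beta(7, 11) and Beta(11, 7) by conjugacy, so they can be sampled directly. The profit formulas are the ones quoted in this thread.

```python
import random

random.seed(3)  # arbitrary seed for repeatability

# Assumption: uniform prior with 6/16 (A) and 10/16 (B) sign-ups, which gives
# Beta(1 + 6, 1 + 10) and Beta(1 + 10, 1 + 6) posteriors by conjugacy.
n_draws = 20_000
rate_a = [random.betavariate(7, 11) for _ in range(n_draws)]
rate_b = [random.betavariate(11, 7) for _ in range(n_draws)]

# Profit per recipient for each posterior draw
profit_a = [r * 1000 - 30 for r in rate_a]
profit_b = [r * 1000 - 300 for r in rate_b]

# Point-estimate comparison, as in the comment
point_a = 6 / 16 * 1000 - 30
point_b = 10 / 16 * 1000 - 300

# What the point estimates cannot show: how often A beats B across the posterior
prob_a_better = sum(a > b for a, b in zip(profit_a, profit_b)) / n_draws

print(point_a, point_b, round(prob_a_better, 2))
```

The point estimates agree that A looks better, but the posterior draws also say how confident that ranking is, which matters when a wrong decision is costly.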

@rezaghaiumy5415 4 years ago

At 17:57, I think profit B should be rateB*1000 - (300 for the salmon + 30 for the brochure)

@josuesdf 4 years ago

Very good explanation. Shouldn't profitB = rateB*1000-300-30?

@donolegario 6 years ago

Awesome video! But I think the cost of the brochure is missing in method B: only the cost of the salmon has been considered, so profitB would be (1000rateB-330) instead of (1000rateB-300). Anyway, the whole idea is perfectly explained. Cheers!

@rasmusab 6 years ago

Ah! So when you pay for shipping the salmon, due to the postage system in Scandinavia, you can slip in a brochure at no extra cost in postage :)

@nickjames1066 4 years ago

@rasmusab Do they also give you the paper and print it for free? :p - Great Bayes series btw, thank you!

@victoriaeshelby766 4 years ago

@rasmusab, there is a small mistake in your tutorial at 4:40. The second rate is not incorporated into the model, so 64 =/= 72 (which was from the previous draw).

@sakkariyaibrahim2650 1 year ago

Great lecture

@angeld5093 6 years ago

Great introduction, thank you!

@alexlev4631 7 years ago

Congratulations! Perfect video and brilliant explanation! I wonder, what's wrong with your BayesianAid project? I missed it on cran.r-project.org?! In fact I use it and even have made a little fix for RNG.

@Apolozx4 7 years ago

Thanks for posting these videos, man. I'm an economics student and I am very interested in this kind of thing.

@kjyfhjjj 5 years ago

Very clear and helpful! Best resources I've seen so far.

@paulmihalyov7799 6 years ago

Thanks for the upload, your explanations are great. Just FYI, during the 13th minute, your x axis label on the "Informative" histogram might be a mistake? It caused a little confusion for me.

@rasmusab 6 years ago

Yep! Definitely a mistake, nicely spotted. Both axes should read "Posterior on the rate of signup". :)

@stefanstojanovic1735 7 years ago

Shouldn't we deduct 330 from expected profit of B since we are sending both salmon and a pamphlet?

@stefanstojanovic1735 7 years ago

Ah nvm, saw your answer below :) Great video by the way.

@jamesmburu886 7 years ago

Very informative. Many thanks.

@CalzOmon 6 years ago

Thanks for the great videos! Learnt a lot

@ashleyjones1054 6 years ago

Thanks Rasmus, damn good!

@fissehaberhane7341 7 years ago

Simply great!

@megis127 5 years ago

hail papous with tsimpouk

@user-or7ji5hv8y 6 years ago

really like the practical example

@danroche8014 4 years ago

Excellent content!

@taylorallred 5 years ago

Nice work!

@tukmyjob 7 years ago

Please answer a question. How to generate Informative rates using beta(3,25) for n draws?

@rasmusab 7 years ago

Hi tukmyjob. I'm sorry, but I'm not sure I understand the question... :)

@ralphdamico5627 6 years ago

If you go to time point 11:20 the bottom distribution shows a discrete graph of the continuous Beta distribution. The values are randomly created from this distribution. The fastest way to generate random values of specific distributions is to use uniform random numbers and plug them into the inverse of the cumulative distribution curve. What does the inverse of the Beta distribution look like? Alternate methods also exist. This question assumes a pre-supplied package of functions is not being used. Have you had any experience with Polynomials (see Peter Fleischmann, 1978), which was enhanced by Todd Headrick (2002), for generating random values from non-Gaussian distributions? Sadly, this is something that I just recently became acquainted with. Thank you for your videos!!!
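The inverse-CDF idea described in this comment can be sketched in Python for a special case where the inverse is available in closed form. For Beta(a, 1) the cumulative distribution function is x**a, so the inverse is u**(1/a). This is a deliberately simplified assumption: a general Beta such as Beta(3, 25) has no closed-form inverse, and would need numerical inversion or another method (for example building a Beta draw from two gamma draws).

```python
import random

random.seed(6)  # arbitrary seed for repeatability

def beta_a_1_inverse_cdf(u, a):
    """Inverse CDF of Beta(a, 1): the CDF is x**a, so the inverse is u**(1/a)."""
    return u ** (1.0 / a)

a = 3  # illustrative shape parameter
# Inverse transform sampling: push uniform draws through the inverse CDF
draws = [beta_a_1_inverse_cdf(random.random(), a) for _ in range(50_000)]

sample_mean = sum(draws) / len(draws)
analytic_mean = a / (a + 1)  # mean of Beta(a, 1)
print(round(sample_mean, 3), analytic_mean)
```

Comparing the sample mean with the analytic mean is only a quick sanity check that the transformed uniform draws follow the intended distribution.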