
Statistics 101: Multiple Regression, Forward Selection 

Brandon Foltz
290K subscribers
27K views

In this Statistics 101 video, we explore the regression model building process known as forward selection. We also take an in-depth look at how the sum of squares is allocated in the full model. This is done through conceptual explanations and by analyzing computer output from JMP. Enjoy!
My playlist table of contents, Video Companion Guide PDF documents, and file downloads can be found on my website: www.bcfoltz.com
JMP by SAS: www.jmp.com/en_us/software.html
Happy learning!
#statistics #machinelearning #datascience

Published: 13 Apr 2021

Comments: 47
@cutestbear3327 1 month ago
Not only have I learned more about forward selection, I feel like I have taken a masterclass on how to present information in a clear and concise manner. Thank you very much, Mr. Foltz.
@gauravagrawal1192 3 years ago
As always, Brandon at his best. I read other notes on forward selection and could not understand anything, but after watching the video, things are so much clearer. Thank you so much, Brandon.
@aaronm9491 2 years ago
Brandon, the way you teach is just amazing. Thanks so much.
@monkpad 3 years ago
Brandon, I've been watching your videos since 2017 and I have to say this channel is GOLD!! Thank you for creating this.
@BrandonFoltz 3 years ago
Wow, thanks! Thank you for watching, Vishnu!
@user-ze2ju3rm7u 2 years ago
Thank you for making the series! I'm a linguistics student but want to switch to NLP. Your videos help me a lot.
@newsupdates3622 3 years ago
Thanks for demystifying the FS technique in such a simple way. Beautiful!
@BrandonFoltz 3 years ago
Glad you like it!
@QZainyQ 3 years ago
Great job, Brandon. Sending appreciation from a place far away.
@BrandonFoltz 3 years ago
🌍
@killthedark7283 2 years ago
Once again, this is fantastic! Hope you can keep doing more videos like this! Good for you.
@smtxtv 3 months ago
Awesome explanation and analysis. Thx!
@MAX-ho6wg 3 years ago
Thanks, Brandon, for the video.
@Eduardo_Martinelli 3 years ago
Amazing video! Please do a series on machine learning.
@shivsharma9153 1 year ago
You are the best of the best!!!
@pinkhairedlily 2 years ago
Another factor in the increased price of homes might be inflation. Regardless, that was indeed a good model. And kudos for helping me brush up on regression model building!
@jasonthomas2908 2 years ago
I like your videos, thanks.
@jansafar5371 2 years ago
Masterful
@rohitekka2674 3 years ago
An amazing video yet again. You've brilliantly addressed the concept. Thank you for simplifying it.
@arokiageorge9673 2 years ago
Excellent teacher. We could benefit even more if you took up more machine learning topics.
@jagritibhattacharyya5114 2 years ago
Your videos are so helpful. Do you have videos on CFA, EFA, and other regression techniques like Cox/Probit/Tobit?
@villejunttila1425 3 years ago
"Massaging" my data sounds so much better than "tampering" :) I think I'll use that.
@stefanodepaoli 2 years ago
I suppose that where you have Baths in the model, it should be Bedrooms instead (I guess there was an error in naming the columns in the dataset used for this video). The dataset I downloaded from your website gives the same parameters (I used SPSS) but with Bedrooms rather than Baths (i.e., the column naming is correct there).
@BrandonFoltz 2 years ago
Correct! I fixed that in the next video. Somehow the columns got switched.
@seanpitcher8957 1 year ago
Oh, this is a very cool series! I do a LOT of MLE and this explains so much that I'd seen happening and wondered about. A question: at 14:50 you talk about the combined explanatory power of the variables. If we see this in our regression, should we add an interaction term to account for it?
@ehsanshahini6146 1 year ago
Thanks for the amazing video. What software did you use for the calculation?
@alexviveslliset1464 2 years ago
Great video! Could anyone tell me what the formula is for the F ratio calculated for each model?
@eraleks 3 years ago
Hi, and thanks for the fantastic videos! I am working on a paper and found your videos by searching for the specific topic of forward selection, so I haven't seen them all in succession. Apologies if you have explained this in a previous video, but how can I include categorical variables in the forward selection method? I have a dataset which contains both categorical and continuous variables. Your video has helped me greatly in understanding which continuous variables I should keep and which I should skip, but I can't find out how to interpret categorical variables in the same way.
@BrandonFoltz 3 years ago
Hello! You can absolutely have categorical features in your model. How to do that depends on your software. Some programs / packages just let you enter them as is, as text. Others might want you to create dummy variables. ALL programs are creating dummy variables / one-hot encoding behind the scenes whether you do it manually or not. What software are you using?
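To make the dummy-variable step in the reply above concrete, here is a minimal sketch in plain Python. The `dummy_code` helper, the `style` feature, and its levels are hypothetical and invented for illustration; they are not from the video's dataset, and real packages may choose a different reference level.

```python
def dummy_code(values, prefix):
    """Turn one categorical column into 0/1 dummy columns.

    Hypothetical helper: the first level (alphabetically) is used as the
    reference category and gets no column, so the dummies are not
    perfectly collinear with the model's intercept.
    """
    levels = sorted(set(values))
    return {
        f"{prefix}_{level}": [1 if v == level else 0 for v in values]
        for level in levels[1:]
    }


# Made-up categorical feature with three levels.
style = ["ranch", "colonial", "ranch", "split", "colonial"]
dummies = dummy_code(style, "style")

for name, column in dummies.items():
    print(name, column)
# style_ranch [1, 0, 1, 0, 0]
# style_split [0, 0, 0, 1, 0]
```

Each resulting column can then be offered to a selection procedure like any numeric feature; "colonial" is absorbed into the intercept as the baseline.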
@impossiblemission4ce 2 years ago
Hey Brandon, thanks for the clear explanation. However, in the dataset that you linked to in the description, it seems to me you've switched the values (or the column names, of course) of beds and baths compared to how you have them in the video. The values are the same, however, so if you're calculating along with the video, that's something to keep in mind.
@vangelis9911 3 years ago
Brandon, it's time for that meta-analysis course to hit YouTube. You are a great teacher. Thank you for your effort.
@QZainyQ 3 years ago
If I may ask a question: in the context of forward selection and other model building techniques, how do we account for diagnostics such as checking and adjusting for multicollinearity? Do we do them before or after? Thanks again.
@BrandonFoltz 3 years ago
Hello! Diagnostics and EDA should always be part of model building to understand the zero-order relationships. In general, model building techniques are based on the rule that only the unique contribution of each variable (its reduction in SSE) determines entry (the opposite holds for backward elimination, and both apply in stepwise). The _redundant_ sum of squares (overlap) is accounted for. So, in theory, if two variables are highly correlated, only the one with the higher explanatory power may make it into the model. Finally, as with all models, in the end the analyst must use domain knowledge, subjective judgement, and common sense to choose the best model. Hope that helps!
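The entry rule described in this reply can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not JMP's procedure: the `sse` and `forward_select` helpers, the F-to-enter threshold of 4.0, and the toy data are all made up here, and JMP's own stopping rules (for example, p-value thresholds) may differ.

```python
def sse(columns, y):
    """SSE of an OLS fit of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
               for i in range(n))


def forward_select(features, y, f_to_enter=4.0):
    """Add, one at a time, the feature with the largest SSE reduction.

    f_to_enter is a hypothetical partial-F threshold for entry.
    """
    selected, remaining = [], list(features)
    current = sse([], y)  # intercept-only model, so SSE equals SST
    n = len(y)
    while remaining:
        base = [features[name] for name in selected]
        trials = {name: sse(base + [features[name]], y) for name in remaining}
        best = min(trials, key=trials.get)
        df_resid = n - (len(selected) + 2)  # intercept + selected + candidate
        if df_resid <= 0:
            break
        f_stat = (current - trials[best]) / (trials[best] / df_resid)
        if f_stat < f_to_enter:
            break
        selected.append(best)
        remaining.remove(best)
        current = trials[best]
    return selected


# Toy data: y depends on x1 and x2; x3 is unrelated to y.
features = {
    "x1": [-1, -1, -1, -1, 1, 1, 1, 1],
    "x2": [-1, -1, 1, 1, -1, -1, 1, 1],
    "x3": [-1, 1, -1, 1, -1, 1, -1, 1],
}
y = [2.5, 3.5, 7.5, 6.5, 13.5, 12.5, 16.5, 17.5]
print(forward_select(features, y))  # ['x1', 'x2']
```

Because each candidate is judged by the SSE it removes given the features already in the model, a candidate that is largely redundant with an entered feature shows only a small additional reduction and tends to stay out.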
@QZainyQ 3 years ago
@BrandonFoltz It really did, I can't thank you enough 🙏
@Onedance175 3 years ago
Hey Brandon! Really awesome video and series. I can't thank you enough for how much you have helped me! Quick question: in the "Keeping or Excluding Features" video you came to the conclusion that bathrooms did not impact the DV strongly enough to be included in the regression model. But in this video, bathrooms is included in the regression model due to its impact. Why is there a difference in the variables that were ultimately chosen?
@BrandonFoltz 3 years ago
Hi Prerna! Thanks for your comment. I explained in this video that somehow the feature headers got switched. The numbers are the same but the columns are swapped. Sorry about that. Just one of those things that happens when my eyes get blurry and I click or move something.
@Abha-com 2 years ago
Hi Brandon, I religiously follow and appreciate your videos. I have a question: which F ratio are we talking about at 12:59? As I see it, there are two different F ratios in linear regression. The first is in the ANOVA table and describes the significance of the whole model. The second is the F ratio used to compare two models, which you discussed in Video 6. Which of them are you discussing at 12:59? Are they the same?
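For reference, the two F ratios named in this question have the standard textbook forms below. The notation is an assumption made here, not the video's: n observations, k predictors in the full model, and q predictors added to a reduced model to form the full one. Which of the two appears at 12:59 is not stated on this page.

```latex
% Overall F ratio (ANOVA table): tests the whole model against the intercept-only model
F_{\text{overall}} = \frac{SSR / k}{SSE / (n - k - 1)} = \frac{MSR}{MSE}

% Partial F ratio: tests the q predictors added to a reduced model
F_{\text{partial}} = \frac{\left( SSE_{\text{reduced}} - SSE_{\text{full}} \right) / q}
                          {SSE_{\text{full}} / (n - k - 1)}
```

The overall ratio is the special case of the partial ratio in which the reduced model is intercept-only, so that SSE_reduced = SST and q = k. With a single predictor in the model (k = q = 1), the two coincide.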
@user-ll8dr9bm5v 8 months ago
Can you explain what exactly the SS column is expressing in terms of notation? It's clearly not the sequential SSR one might find in Minitab. Sequential SSR adds up to the SSR for the selected variables, which doesn't happen here. Instead, there is a part of the SSR that is missing, as you said.
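One way to see the distinction raised here is to compute both kinds of sums of squares on a small example. The sketch below assumes the SS column reports each variable's unique (partial) contribution; that is a reading suggested by the replies on this page, not something the source confirms. The `sse` helper and the toy data with two correlated predictors are made up for illustration.

```python
def sse(columns, y):
    """SSE of an OLS fit of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
               for i in range(n))


# Toy data: x1 and x2 are correlated (r = 0.5), so they share explained variation.
x1 = [-1, -1, -1, -1, 1, 1, 1, 1]
x2 = [-1, -1, -1, 1, -1, 1, 1, 1]
y = [5.5, 4.5, 5.0, 9.0, 11.0, 15.0, 15.5, 14.5]

sst = sse([], y)                      # intercept-only SSE, i.e. SST = 153
sse_full = sse([x1, x2], y)
ssr = sst - sse_full                  # 152

# Sequential SS: each variable is credited in order of entry (x1 first).
seq_x1 = sst - sse([x1], y)           # 128
seq_x2 = sse([x1], y) - sse_full      # 24

# Unique SS: each variable is credited only for what it adds last.
unique_x1 = sse([x2], y) - sse_full   # 54
unique_x2 = sse([x1], y) - sse_full   # 24

shared = ssr - (unique_x1 + unique_x2)  # 74, the overlap credited to neither

print(round(seq_x1 + seq_x2, 6), round(ssr, 6))   # 152.0 152.0
print(round(unique_x1 + unique_x2, 6))            # 78.0
print(round(shared, 6))                           # 74.0
```

Sequential sums of squares always add up to the model SSR, while unique sums of squares fall short of it here by exactly the shared portion, which matches the "missing" part the comment describes. With other data (for example, suppressor variables) the unique contributions can also exceed the SSR.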
@areejkhalid5496 2 years ago
Can you please make a video on the Venn diagram, using the same house example?
@ankurmondal7613 3 years ago
How are you calculating the overlapping sum of squares?
@BrandonFoltz 3 years ago
Venn diagrams! 😊
@nickvarney8148 2 years ago
@BrandonFoltz Would love to see a step-by-step video on how it's done. Huge fan of the channel, btw!
@robertpollock8617 1 year ago
With respect to the Venn diagram and the lecture notes: 117234 + 18283 + 46393 does not equal 125304, if I am understanding what you are saying in the lecture.