Thank you so much and well done! I love your content and your Excel approach. Video suggestion: statistical classification methods (supervised and unsupervised)
Exactly the motivation behind this video :) Logit/probit are actually much easier to estimate and interpret than many people think, and working through them in Excel is very much possible and quite rewarding!
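The video walks through this in Excel with Solver; the same estimation can be sketched in Python for comparison. This is a minimal illustration on synthetic data (the data, variable names, and single-predictor setup are assumptions, not the video's spreadsheet): maximise the logit log-likelihood by minimising its negative.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic illustration (not the video's data): one predictor x, binary outcome y
rng = np.random.default_rng(0)
x = rng.normal(size=200)
true_b0, true_b1 = -0.5, 1.2  # assumed "true" parameters for the simulation
p = 1 / (1 + np.exp(-(true_b0 + true_b1 * x)))
y = rng.binomial(1, p)

def neg_log_likelihood(beta):
    # Logistic probability for each observation
    z = beta[0] + beta[1] * x
    prob = np.clip(1 / (1 + np.exp(-z)), 1e-12, 1 - 1e-12)
    # Negative sum of log-likelihood contributions for the 1s and the 0s
    return -np.sum(y * np.log(prob) + (1 - y) * np.log(1 - prob))

# Same idea as Excel Solver: vary the parameters to maximise log-likelihood
res = minimize(neg_log_likelihood, x0=[0.0, 0.0])
b0_hat, b1_hat = res.x
```

With enough observations, the fitted coefficients land close to the values used to simulate the data, mirroring what Solver finds in the spreadsheet.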
Congratulations on the clear and instructive video. How would that work with a dependent variable that, rather than being binary (0, 1), is defined on more than two categories (e.g. 1, 2, 3)?
Hi Fernando, and thanks for the excellent question! To estimate a model like that, you can use ordered logit or ordered probit. The technique is the same in spirit but slightly different in implementation, and I might touch upon that in a later video someday!
Dear Sava, sorry for bothering you, but I am wondering what the alpha is here for the p-value's two-tailed test: is it set at 0.05, as is conventional? Plus, I am not sure about the role of the constant here... Sorry for my very beginner-level questions
Hi Marina, and many thanks for the questions! The significance testing for coefficients here works exactly the same way: a p-value below 10%, 5%, or 1% means the relationship between an independent variable and the binary variable is significant at the respective level. The constant here can be interpreted as the log-odds (for logit) or the z-value (for probit) for the binary variable if all other independent variables are zero. Most of the time, we are not necessarily interested in the statistical significance of the constant. Hope this helps!
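The significance test described above can be reproduced numerically: the two-tailed p-value for a coefficient follows from its z-statistic under the standard normal. A minimal sketch (the function name and the example z-values are illustrative):

```python
from scipy.stats import norm

def two_tailed_p(z):
    # Two-tailed p-value under the standard normal distribution
    return 2 * (1 - norm.cdf(abs(z)))

# z of about 1.96 corresponds to p of about 0.05 (significant at the 5% level),
# and z of about 2.58 to p below 0.01 (significant at the 1% level)
p_5pct = two_tailed_p(1.96)
p_1pct = two_tailed_p(2.58)
```

This is the same calculation Excel performs with NORM.S.DIST when building the coefficient table.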
@NEDLeducation Aaa ok ok! The constant is maybe what we call the intercept b0, whose probability is calculated through the inverse logit P = e^b0 / (1 + e^b0), when x = 0! I think now everything is clear EVEN to me! THANK YOU!
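The inverse-logit formula above can be checked with a couple of lines of code; a minimal sketch (the function name is illustrative):

```python
import math

def inverse_logit(b0):
    # Probability implied by the intercept when all independent variables are zero:
    # P = e^b0 / (1 + e^b0)
    return math.exp(b0) / (1 + math.exp(b0))

# An intercept of 0 implies a 50% probability; positive intercepts push P above 0.5
p_zero = inverse_logit(0.0)
p_one = inverse_logit(1.0)
```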
Thanks for your great video, I have two questions please. First, how do we know that our data contains a sufficient number of "1" values? Meaning, what is the desired number of "1"s and "0"s? How did you come to the conclusion that 25% is OK? Suppose I collect 10,000 observations and only 12% of them are "1" values; how do I know whether this is sufficient or not? Is there any independent methodology to resolve such a dilemma? Second, I have this dilemma when I want to insert a dummy explanatory variable in a simple OLS regression. Is there any a priori methodology to know whether it is worth inserting the dummy variable based on the number of "1" values, or in this case do we leave it to the p-value to determine whether the dummy variable has any impact?
Hello @NEDL, I searched through the Google Drive, but there is no file for this video? If you have it, could I get it to understand your explanation better? Thanks
Hello, another great video!!!! I have one question: is it OK to use the logit model on variables that are not categorical or binary? For instance, I want to know what variables impact the market share of a company; market share data is bounded by 0 and 1 but is not binary. Is it OK to estimate a logit model in this case? If not, what kind of model should I use? Thanks in advance and keep making your excellent videos!!!!
Hi Victor, and thanks for the excellent question! For this, a censored regression model such as Tobit would be ideal. I have got a video on that here: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-QS3OAYML2nM.html
Hi. Thanks for the concise explanation. Can you do a video on how to estimate the parameters when the dependent variable has more than two categories, or how to perform a multinomial regression in Excel?
You can find the spreadsheets for this video and some additional materials here: drive.google.com/drive/folders/1sP40IW0p0w5IETCgo464uhDFfdyR6rh7 Please consider supporting NEDL on Patreon: www.patreon.com/NEDLeducation
Hi Akash, and thanks for the question! Keeping things simple, the logic of maximising log-likelihood is to get the best possible fit of the model (the best possible explanation of what happens) to the data by varying the parameters. This is due to the probability density function representing the derivative of the distribution function and being interpretable as a likelihood. Log-likelihood is maximised instead of the simple likelihood because manipulating a sum is computationally easier than manipulating a product (the log converts a product into a sum of logs). Hope it helps!
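The product-to-sum point above is easy to demonstrate: summing the logs of the per-observation likelihoods gives the log of their product, and the sum stays numerically well behaved where the raw product underflows. A small sketch (the likelihood values are made up for illustration):

```python
import math

# Hypothetical per-observation likelihoods (probabilities of the observed outcomes)
likelihoods = [0.8, 0.6, 0.9, 0.7]

# Product form: multiply the individual likelihoods together
product = math.prod(likelihoods)

# Log form: sum of logs, easier to manipulate and numerically more stable
log_sum = sum(math.log(p) for p in likelihoods)

# With many observations the raw product underflows to zero,
# while the sum of logs remains a perfectly usable finite number
many = [0.5] * 2000
prod_many = math.prod(many)                      # underflows to 0.0
log_many = sum(math.log(p) for p in many)        # finite: 2000 * log(0.5)
```

The same reason makes the spreadsheet approach sum LN() terms across observations rather than multiplying probabilities.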
Hi Ramesh, in terms of how regulators view prediction of loan defaults in various models I would suggest this source: www.bis.org/ifc/events/ifc_8thconf/ifc_8thconf_4c4pap.pdf
Hi, and thanks for the question! It is on the top right of the Data tab. If you have not got this add-in installed, you can go to File -> Options -> Add-ins -> Excel add-ins -> Go and tick Analysis ToolPak in the menu that appears (similar to how you enable Solver). Hope this helps!