Ordinal Logistic Regression or Proportional Odds Logistic Regression with R

R file: drive.google.com/file/d/1B8lp...
TIMESTAMPS
00:00 Ordinal Logistic Regression with R
00:06 Read Data
02:36 Partition Datasets
03:28 Ordinal Logistic Regression Model
05:47 Calculating p-values
07:25 Prediction
09:08 Equations for Calculating the Probabilities
12:29 Model Building with all Variables
16:18 Confusion Matrix for Training Dataset
17:12 Confusion Matrix for Test Dataset
Time-Series videos: goo.gl/FLztxt
Machine Learning videos: goo.gl/WHHqWP
Becoming Data Scientist: goo.gl/JWyyQc
Introductory R Videos: goo.gl/NZ55SJ
Deep Learning with TensorFlow: goo.gl/5VtSuC
Image Analysis & Classification: goo.gl/Md3fMi
Text mining: goo.gl/7FJGmd
Data Visualization: goo.gl/Q7Q2A8
Playlist: goo.gl/iwbhnE
R is a free software environment for statistical computing and graphics, and it is widely used in both academia and industry. R runs on both Windows and macOS. It was ranked no. 1 in a KDnuggets poll on top languages for analytics, data mining, and data science. RStudio is a popular, user-friendly environment for R.

Published: 20 Jul 2024

Comments: 131

@gnomzb5070 5 years ago

While I was looking for an example project on the ordered logit model in R, I came across this superb video. Thanks a lot, Bharatendra!

@bkrai 5 years ago

Thanks for comments!

@flamboyantperson5936 6 years ago

Really great tutorial. Thank you Sir.

@euphorockz 4 years ago

This video really helps a lot with my project! Thank you!!!!!

@bkrai 4 years ago

Thanks for the feedback!

@jc.nogueira 2 years ago

Great video! Many thanks for sharing this wonderful material. I will subscribe to your channel. Greetings from Uruguay, South America! All the best, jc

@bkrai 2 years ago

Thanks and welcome!

@victorhenostroza1871 4 years ago

Thank you so much for this contribution...congratulations from Peru

@bkrai 4 years ago

Thanks for comments!

@hermanhyde7000 7 years ago

Absolute genius. I would pay a million bucks to be your student.

@user-mo4gb2xb2h 2 years ago

Thank you so much!! This video is extremely helpful and clear!!!

@bkrai 2 years ago

You're so welcome!

@gabriellamartinez7985 2 years ago

Hello, thank you for this video, it's been super helpful! I have a question regarding the dependent variables. How would you interpret the polr function output for dependent variables that are factors? For example, if Tendency (levels: -1, 0, 1) was used as a dependent variable, how would you interpret each of the coefficients?

@datascience1274 2 years ago

Hello Professor. Great lesson. Quick question: I was wondering if we could have used as.ordered(data$Tendency) instead of as.factor. Can you please shed some light on this? Thanks a lot in advance.

@aadvikpanda3339 7 years ago

Hello Sir, great video. I did not understand how you calculated the probability from the t-stat using this formula: pnorm(abs(ctable[, "t value"]), lower.tail = FALSE) * 2. Could you please explain each term you have used in this formula and why?
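The p-value expression quoted in this question can be unpacked term by term. What follows is a minimal sketch in base R, assuming two illustrative t values rather than the ones from the video. polr labels the column "t value", but the expression treats each statistic as approximately standard normal (a large-sample Wald test), which is why pnorm() is used rather than pt().

```r
# Illustrative statistics (coefficient / standard error); hypothetical, not from the video
t_value <- c(a = 1.6381, b = -3.3186)

# abs(...)             : the test is two-sided, so only the size of the statistic matters
# pnorm(..., lower.tail = FALSE)
#                      : area under the standard normal curve to the right of |t|
# * 2                  : add the equal area in the left tail
p_value <- pnorm(abs(t_value), lower.tail = FALSE) * 2

print(round(p_value, 4))
```

A statistic far from zero leaves little area in the tails, so it gives a small p-value.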

@Sandra-tq6yb 2 years ago

Very helpful video. Thank you very much!

@bkrai 2 years ago

You're welcome!

@hayonimengi4171 5 years ago

How would you interpret the predicted probabilities from a reference category of a categorical predictor? In other words I’m trying to present the probabilities which I get in my model however I’m confronted with my reference category and hence what would be the best way to derive these? Thanks

@wasafisafi612 2 years ago

Big thanks for your video. It helps a lot

@bkrai 2 years ago

You are welcome!

@DAMGood73 5 years ago

Perfect, thanks for sharing!

@bkrai 5 years ago

Thanks for comments!

@abhishekbansal5182 4 years ago

Thanks for making this video; it's very helpful for us. Please, sir, can you explain how we get the alpha values for the categories? Is there any formula to calculate the alpha (α)? Please explain it.

@88MSRobby 7 years ago

Very good video!

@dr.bheemsainik4316 2 years ago

Hi Sir... can you please explain the Ordered Probit model for the same data with a tendency with 3 levels as the dependent variable?

@fileniaantoniou8649 5 years ago

Hello and great video! Would you suggest this model for modelling the results of a football game where the points earned in the end are 0,1 or 3?

@bkrai 5 years ago

Yes, it should work for such data.

@mangaikalai82 4 years ago

Sir, This video was helpful. Can you make a video on Brant test for proportional odds assumption?

@bkrai 4 years ago

Will try

@parthshah9451 4 years ago

Great video, Dr. Rai. Could you also help with the partial proportional odds model?

@bkrai 4 years ago

Thanks, I've added it to my list.

@WillIsGoodAtStatistics 4 years ago

Excellent video. Thank you

@bkrai 4 years ago

You are welcome!

@shoumicshahid9315 4 years ago

Hello Professor, how can I rank the significant variables from an ordinal logit model? I previously performed dominance analysis on the binary logit model but in case of an ordinal logit model that seems inappropriate.

@bkrai 3 years ago

One way could be to use p-value.

@yubarsubedi2781 3 years ago

Hello Sir, thank you so much for this tutorial. I learned a lot. However, I encountered a problem. When I ran the summary command, I got the message "Error in svd(X) : infinite or missing values in 'x'". How can I fix this problem?

@ganneesh 6 years ago

Indeed, it's a great video on ordinal logistic regression. Thanks, Professor. I am trying to create a model for my data set and I am facing an issue. When I ran the predict command on my training data set, I got very small probability values (the probabilities do not sum to one). What could be the reason?

@bkrai 3 years ago

Seeing this today. Probably resolved by now.

@leliaglass1568 5 years ago

thanks for the video! very helpful

@bkrai 5 years ago

Thanks for comments!

@internetjunkie247 3 years ago

Thanks for the video. To calculate probabilities, why did you use alpha - b1x1 - ... and not the conventional alpha + b1x1 + ...? It seems different software uses different forms of the equations (?). I believe it is the former in R, and perhaps in SPSS too.
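On the sign question above: MASS::polr is documented as using logit P(Y <= k | x) = zeta_k - eta, where zeta_k is the k-th intercept (cutpoint) and eta = b1*x1 + b2*x2 + ... has no intercept of its own. The following is a small base R sketch of that arithmetic for a three-level response. All cutpoints, slopes, and predictor values here are hypothetical and chosen only to show the steps.

```r
# Hypothetical fitted values for a 3-level ordinal response
zeta <- c("1|2" = -1.0, "2|3" = 0.5)   # cutpoints (intercepts)
beta <- c(x1 = 0.8, x2 = -0.4)         # slopes
x    <- c(x1 = 1.5, x2 = 2.0)          # one observation

eta <- sum(beta * x)                   # linear predictor, no intercept term

# Cumulative probabilities: P(Y <= 1) and P(Y <= 2)
cum_prob <- plogis(zeta - eta)

# Category probabilities are successive differences of 0, the cumulative values, and 1
class_prob <- diff(c(0, cum_prob, 1))
names(class_prob) <- c("1", "2", "3")

print(round(class_prob, 4))
```

With the minus sign, a positive slope lowers P(Y <= k) and so shifts probability toward the higher categories. Software that writes alpha + b*x describes the same model with the slope signs flipped.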

@khadijabenmoussa8064 4 years ago

Hello, thanks a lot for your video; it is very helpful. Could you please explain the meaning of the confusion matrix error? Also, how can we compute the R-square of our model?

@bkrai 4 years ago

For confusion matrix you may refer to: ru-vid.com/group/PL34t5iLfZddvv-L5iFFpd_P1jy_7ElWMG Also note that when response is a factor variable, we do not use R-square.

@smitagupta1771 5 years ago

What should change in the input file if the independent variables have 3-4 ordinal levels? Should the independent variable be coded as 1, 2, 3, 4 and then converted to an ordinal factor, like you did for NSP?

@bkrai 5 years ago

You can use ordered() for an independent ordinal variable. Some researchers also recommend changing them to numeric variables, as it leads to a much simpler model.
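The two options mentioned in this reply lead to different design matrices. Here is a minimal base R sketch, assuming a hypothetical four-level predictor and R's default contrast settings. An ordered factor is expanded with polynomial contrasts into k - 1 columns, while a numeric coding keeps a single column and therefore a single slope.

```r
# Hypothetical ordinal predictor with four levels
level <- c(1, 2, 3, 4, 2, 3, 1, 4)

as_ordered <- ordered(level)      # ordered factor: 1 < 2 < 3 < 4
as_numeric <- as.numeric(level)   # plain numeric score

# Ordered factor: intercept plus three polynomial contrast columns
# (linear, quadratic, cubic; R names them with .L, .Q, .C suffixes)
X_ordered <- model.matrix(~ as_ordered)
print(colnames(X_ordered))

# Numeric: intercept plus one column, so the effect is forced to be linear in the score
X_numeric <- model.matrix(~ as_numeric)
print(colnames(X_numeric))
```

The numeric version is simpler, at the cost of assuming equal spacing between adjacent levels.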

@miccoligno1 5 years ago

Hi Bharatendra, my response variable is a Likert-scale score from 0 (the worst condition) to 4 (the best). Should I use the function as.ordered? If yes, should I keep 4 as the best condition and 0 as the worst? Thanks

@bkrai 5 years ago

Yes, that would work fine.

@nimeshcheedella8124 6 years ago

Sir, very nicely explained. I tried with my data by following your video step by step, but I have one issue. In my data the independent variables are also ordinal in nature. I converted them to categorical; is that correct? Which regression do you suggest for predicting an ordinal variable when the independent variables are also ordinal?

@bkrai 3 years ago

The method depends on the dependent variable and not much on the independent variable.

@1612kanika 6 years ago

how to calculate bias and variance for ordinal.

@taniamendoza9247 4 years ago

Dr., thanks a lot for your example. Could you help me with a question: what is the difference between clm and polr? I was trying to use polr on financial rates to estimate ratings, but when I use it I get this warning: "Warning message: glm.fit: fitted probabilities numerically 0 or 1 occurred". If I use clm, that doesn't happen. Could you help me understand these two functions? Thanks a lot. Regards from Ecuador

@bkrai 4 years ago

Note that a warning message in R is not an error; the model is still fitted.

@drkim2 6 years ago

excellent

@dearcollynn3498 7 years ago

Hello, thank you for your great video. I have a question. Is AIC important here? Isn't AIC here big for the model since it is larger than 1000 already?

@bkrai 7 years ago

Yes it is high. In the same example when we made a model with three variables, it was over 1700. By adding more variables it came down to about 1038, which is a significant improvement.

@Astronoom 4 years ago

When you add more variables the AIC goes down, but then you select variables which have a significant level >0.1 and the AIC goes back up, isn’t it? Wouldn’t you use the model with the lowest AIC, and if not why use the AIC at all? Can I compare models with the AIC as well when in some models variables are log transformed as in others they are not log transformed?

@MKmadhurima 2 years ago

Is there any way to do an ordinal logistic regression for panel data?

@alainataylor4181 1 year ago

is there a way to add nested effects into the model???

@AddisuYohannes-h8p 12 days ago

My dependent variable is anaemia status, and my thesis is on "mixed effect ordinal logistic regression".
1. How can I obtain a table of the percentage of anaemia status by region in R?
2. How can I obtain a table of the prevalence of anaemia status by predictors of anaemia among women of reproductive age in R?
3. How can I obtain a table of adjusted odds ratios (AOR) with 95% CIs for a mixed effect ordinal logistic regression in R?

@mdtanimhasan3312 3 years ago

The video is really helpful. I am struggling to see the dependent variable's factors outcome combined by or | Could anyone please explain? TIA

@bkrai 3 years ago

1|2 is the intercept (cutpoint) that separates level 1 from level 2 and above, and 2|3 is the cutpoint that separates level 2 and below from level 3 and above.

@mdtanimhasan3312 3 years ago

Dr. Bharatendra, could you please explain how to interpret the outcome of the dependent variable combined with |? For example, here are the summary and p-values of my model. I am struggling to interpret the dependent variable outcome. TIA.

Coefficients:
        Value  Std. Error  t value
H     0.10955     0.06687   1.6381
AGR   0.05929     0.06825   0.8687
NP2  -1.00909     0.30407  -3.3186
NP3  -1.69956     0.40289  -4.2184
NP4  -0.28106     0.44589  -0.6303

Intercepts:
       Value  Std. Error  t value
1|2  -1.1571      0.6301  -1.8363
2|3  -0.0505      0.6090  -0.0829
3|4   0.9036      0.6022   1.5005
4|5   2.2627      0.7164   3.1584
5|6   5.1148      1.5859   3.2253
6|7  16.5213      9.1049   1.8145

Residual Deviance: 631.3888
AIC: 653.3888

           Value  Std. Error     t value  p-value
H     0.10954539  0.06687426   1.6380799   0.1014
AGR   0.05928751  0.06825109   0.8686676   0.3850
NP2  -1.00909459  0.30407139  -3.3186107   0.0009
NP3  -1.69956102  0.40288860  -4.2184390   0.0000
NP4  -0.28105858  0.44589078  -0.6303306   0.5285
1|2  -1.15712803  0.63014735  -1.8362817   0.0663
2|3  -0.05048673  0.60902379  -0.0828978   0.9339
3|4   0.90356996  0.60219631   1.5004575   0.1335
4|5   2.26273192  0.71641548   3.1584073   0.0016
5|6   5.11484231  1.58585762   3.2252847   0.0013
6|7  16.52126027  9.10488998   1.8145480   0.0696
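One hedged way to read the output posted above, assuming it comes from MASS::polr with the parameterisation logit P(Y <= k) = zeta_k - eta: each row such as 1|2 is a cutpoint on the logit scale, and exp(coefficient) is the odds ratio for falling in a higher category. The sketch below only reuses the numbers from the comment. The "baseline" is an assumption (H = 0, AGR = 0, NP at its reference level) and may not be a realistic observation in that data set.

```r
# Values copied from the comment above
coefs <- c(H = 0.10955, AGR = 0.05929, NP2 = -1.00909, NP3 = -1.69956, NP4 = -0.28106)
zeta  <- c("1|2" = -1.1571, "2|3" = -0.0505, "3|4" = 0.9036,
           "4|5" = 2.2627, "5|6" = 5.1148, "6|7" = 16.5213)

# Odds ratios for being above any given cutpoint; values below 1 shift toward lower levels
odds_ratio <- exp(coefs)
print(round(odds_ratio, 3))

# Assumed baseline (eta = 0): cumulative probabilities P(Y <= k) at each cutpoint
cum_prob <- plogis(zeta)

# Probability of each of the 7 levels at that baseline
class_prob <- diff(c(0, cum_prob, 1))
names(class_prob) <- 1:7
print(round(class_prob, 4))
```

The p-values on the cutpoint rows test whether a cutpoint differs from zero, which is rarely of interest. The coefficient rows are the ones that describe predictor effects.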

@subashghimire1604 7 years ago

Do you have any tutorial for goodness of fit test for ordinal logistic regression?

@tariqawanish 6 years ago

Is the goodness-of-fit test applied in Stata?

@bkrai 4 years ago

It already includes test of significance.

@SandeepKumar-me6qr 5 years ago

Very Nice explanation sir. Can you please upload the Cardiotocographic.csv file?

@bkrai 5 years ago

Here is the link: goo.gl/Xc4G7J

@kaapiglass 4 years ago

I'm getting this kind of error. Do you know what it means? Warning message: In polr(AccessOnlineRecord ~ ., trainHint, Hess = TRUE) : design appears to be rank-deficient, so dropping some coefs..........

@bkrai 4 years ago

It is just a warning message, not an error.

@AymanTurkistani 2 years ago

Thank you!

@bkrai 2 years ago

You are welcome!

@nageshgoud4266 6 years ago

Hi Sir, It's a nice video, I always follow you other videos, they are very good. I am running the ordinal LR on my own data i.e., insurance to find the EMlevel and this dependent variable contains 6 levels i.e., 1,2,3....6. So as per your instructions I converted EMlevel variable to ordered and str is appearing as "EMLevel : Ord.factor w/ 6 levels "1"

@Astronoom 4 years ago

Is this approach equal to the CatReg function in SPSS with ranking?

@bkrai 4 years ago

I've not checked it in SPSS. But I guess results should be same.

@sunilbobb 6 years ago

Sir, can you show how we interpret the abalone data from Kaggle or UCI?

@bkrai 4 years ago

I saw this today, hope it's taken care of.

@alfredkik3675 3 years ago

Excellent tutorial!

@bkrai 3 years ago

Thanks!

@alfredkik3675 3 years ago

@bkrai Hello again Dr Rai, I tried to perform an OLR but the Brant test assumption did not hold. The p-values for the omnibus test and for other variables were less than 0.05. What else should I do? Is there an alternative test for ordinal dependent variables? Your kind advice will be greatly appreciated.

@hayonimengi4171 5 years ago

Superb!!!!

@bkrai 5 years ago

Thanks!

@nasamumusa5044 7 years ago

Thank you, Bharatendra Rai. I follow your explanation and have adapted my work well following the steps shown in your video. I have one issue, please. For columns of independent categorical data with 3 or more levels, like the "Tendency" column shown in your video, the model gives a different "Value", "Std. Error", "t value" and "p value" for each level of the variable. This is challenging and confusing to interpret when writing out the equation of the model, as the p-values of some levels may not be significant, which suggests removing them while the significant levels are left in. How can such a model be clearly written out and explained? Gracias!

@bkrai 7 years ago

When an independent variable is categorical and takes three values, the correct way to represent it in a regression-based model is with the help of 3 - 1 = 2 dummy variables. That's what you see here. When Tendency0 and Tendency1 are both zero, then Tendency = -1. When Tendency0 = 1 and Tendency1 = 0, then Tendency = 0. When Tendency0 = 0 and Tendency1 = 1, then Tendency = 1. Note that in the equation Tendency0 and Tendency1 can only be 0 or 1.
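The dummy coding described in this reply can be inspected directly with model.matrix(). This is a minimal base R sketch using a small hypothetical vector of Tendency values (not the tutorial's data) and R's default treatment contrasts, where the first level, -1, is the reference.

```r
# Hypothetical values of a three-level categorical predictor
Tendency <- factor(c(-1, 0, 1, 0, 1, -1))

# Design matrix: intercept plus 3 - 1 = 2 dummy columns
X <- model.matrix(~ Tendency)
print(colnames(X))

# Each original value next to its dummy coding
print(data.frame(Tendency, X[, -1]))
```

Rows with Tendency = -1 have zeros in both dummy columns, so each reported coefficient is a contrast against that reference level.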

@nasamumusa5044 7 years ago

Bharatendra Rai, in my case I used dummy values of 1, 2, 3 for the three levels of my independent categorical data. (Probably I should start with zero?) I converted them to factors. With some independent variables that were continuous or categorical and the dependent variable, I ran the model using polr. The output always gave me one coefficient for each continuous independent variable, whereas the categorical ones had a different coefficient for each level. As with yours, Tendency0 had a different coefficient and p-value from Tendency1, and both were significant. However, when I checked the significance of my variables from their p-values, I observed that the p-values of the various levels differ within some variables (e.g. edu with levels 1, 2, 3: R chose level 1 as the reference level, and level 3 had a p-value greater than 0.05 while level 2 had a p-value less than 0.05). I suppose I should remove level 3 too, as I remove the non-significant variables from the equation. How can I do so, and what would the resulting interpretation be? Thanks for your kind offer to help.

@bkrai 7 years ago

For categorical variables, even if only one level is significant, do not drop the variable from the model.

@nasamumusa5044 7 years ago

Bharatendra Rai, I sincerely appreciate your explanation. It is noted.

@bkrai 7 years ago

+Nasamu Bawa great 👍

@lauualb 7 years ago

hi sir, how do you know the variable Max is causing the warning?

@bkrai 7 years ago

+lauualb it was based on trial and error.

@subashghimire1604 7 years ago

Hello, could you please tell me how did you get equations for probability, at 9:31/19:21 in above video

@bkrai 7 years ago

It is similar to steps shown in the link below at 4:13, ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-fDjKa7yWk1U.html

@yujiaoli947 6 years ago

I have the same question. Only z-statistics' p-value can be calculated by pnorm() while hereby it is t-statistic.

@Pinky-pb6od 6 years ago

Hi sir. Can u please code support vector learning with ordinal regression

@bkrai 6 years ago

Thanks for the suggestion, I'm adding it to my list for future.

@landersebastian7886 1 year ago

good day professor how can I use Ordinal Logistic regression with bmi

@bkrai 1 year ago

See if this research paper helps: www.researchgate.net/publication/260273192_Does_Consumer_Behaviour_on_Meat_Consumption_Increase_Obesity_-_Empirical_Evidence_from_European_Countries

@nicolasaguirre8170 4 years ago

how can i fit a model with ordinal response without proportional odds?

@bkrai 4 years ago

You can try this: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-dJclNIN-TPo.html

@R.K.3010 7 years ago

Hello sir, I am getting the following error "Error in optim(s0, fmin, gmin, method = "BFGS", ...) : initial value in 'vmmin' is not finite In addition: Warning message: glm.fit: fitted probabilities numerically 0 or 1 occurred" can you explain this?

@bkrai 7 years ago

Send the code that you used so I can take a look.

@R.K.3010 7 years ago

mod

@natasabajic7072 7 years ago

@Rahul Kadge could you find a solution for this error, I got the same and would like to know how you solved it. Thanks,

@zahradidarali5804 4 years ago

What are your thoughts on AIC?

@bkrai 4 years ago

It estimates model-related error. It is a lower-is-better metric that helps to assess model quality, and it is used for model selection or comparison.
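As a worked check of how AIC is built, the figures from the polr summary posted earlier in this thread can be reproduced by hand. This is a sketch assuming AIC = residual deviance + 2 * (number of estimated parameters), where the residual deviance is -2 times the log-likelihood and that model has 5 coefficients plus 6 intercepts.

```r
# Figures taken from the model summary posted earlier in the comments
residual_deviance <- 631.3888   # -2 * log-likelihood
n_coefficients    <- 5          # H, AGR, NP2, NP3, NP4
n_intercepts      <- 6          # 1|2 through 6|7

# Penalty of 2 per estimated parameter
n_params <- n_coefficients + n_intercepts
aic <- residual_deviance + 2 * n_params

print(aic)   # matches the AIC reported in that summary
```

Because of the penalty term, adding a variable lowers AIC only when it reduces the deviance by more than 2 per extra parameter. Comparisons are meaningful only between models fitted to the same response on the same rows of data.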

@micheleannarumma4690 5 years ago

thank you :)

@bkrai 5 years ago

Thanks for your comment!

@adarsha1981 6 years ago

Sir, are Ordinal Regression and Ordinal Logistic Regression one and the same, or are they different?

@bkrai 6 years ago

Ordinal logistic regression is one type of ordinal regression.

@adarsha1981 6 years ago

OK. What kind of ordinal regression would you suggest for a situation where I have 15 features: 3 integer, 3 numeric, 8 categorical (binary), and 1 count variable (dependent)? I tried ordinal logistic regression but did not get a good result. I have a zero-inflated count and tried a ZIP model too, which was not that great, and the cumulative link model (clm) does not fit well either. Kindly suggest.

@bkrai 6 years ago

what is your response variable?

@adarsha1981 6 years ago

@bkrai It's a count, and I also tried ranking it. I have a lot of zeros.

@seant7907 4 years ago

What does it mean to be 'rank deficient'?

@bkrai 3 years ago

Which part of the video are you referring to?

@Nientjuh22 4 years ago

Does anyone know if there is a maximum of independent factors R can handle for this model? I have 6 factors and it gives me an error. However, if I only use 5 of them, no matter which of them, R works perfectly normal

@bkrai 4 years ago

It must be some other issue. In this example I've used 21 variables without any problem.

@Nientjuh22 4 years ago

@bkrai Thanks! But the error I get is: attempt to find suitable starting values failed In addition: Warning messages: 1: glm.fit: algorithm did not converge 2: glm.fit: fitted probabilities numerically 0 or 1 occurred