I am actively searching for a job. Sometimes I feel like I won't get one, but after watching your videos I feel I have really learned something, and it gives me some confidence. Thanks for the video. Please keep sharing videos.
It would be really helpful if you could provide a video lecture in which you put all the assumptions to the test on a Kaggle dataset (any one). Cheers, great work, sir.
@UnfoldDataScience All the assumptions, as in multicollinearity, normality of residuals, autocorrelation: all these assumptions applied on a real dataset (basically executing all the assumption checks in Python).
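A minimal sketch of what such an assumption-check script might look like. It uses a small synthetic dataset as a stand-in for a real Kaggle dataset (the data and thresholds here are illustrative assumptions); in practice you would load your own data and could use `statsmodels` for the same diagnostics.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Synthetic data standing in for a real Kaggle dataset (hypothetical)
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + rng.normal(scale=0.5, size=n)

# Fit ordinary least squares with an intercept column
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# 1. Multicollinearity: pairwise correlation between predictors
corr = np.corrcoef(x1, x2)[0, 1]

# 2. Normality of residuals: sample skewness should be near 0
skew = np.mean(residuals**3) / np.std(residuals)**3

# 3. Autocorrelation: Durbin-Watson statistic, values near 2 mean none
dw = np.sum(np.diff(residuals)**2) / np.sum(residuals**2)

print(round(corr, 2), round(skew, 2), round(dw, 2))
```

On well-behaved data like this, the predictor correlation and residual skewness sit near 0 and the Durbin-Watson statistic near 2; large deviations from those values on a real dataset flag the corresponding assumption for a closer look.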
Hi, could you please answer how we should approach this situation in a regression problem: the target variable is distributed in a skewed manner (50% of the values lie in the range 0-300, 30% in 300-500, and 10% in the remaining 500-1000). How would you approach such a scenario?
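One common approach for a right-skewed target like this is to model a log-transformed target instead. A minimal sketch, using a hypothetical exponential target to mimic the skew described (the data here is assumed, not from any real dataset):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical target with the skew described: most values small, long right tail
y = rng.exponential(scale=250, size=1000)

# log1p compresses the long tail so the fit is not dominated by large values
y_log = np.log1p(y)

def skewness(v):
    """Sample skewness: 0 for a symmetric distribution."""
    return np.mean((v - v.mean())**3) / v.std()**3

print(round(skewness(y), 2), round(skewness(y_log), 2))
```

You would then train the regression on `y_log` and map predictions back with `np.expm1`. Alternatives worth trying on the real data include quantile regression or tree-based models, which are less sensitive to target skew.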
Thanks. I just have a couple of questions. 1. What are the disadvantages of multicollinearity? 2. In several cases, the distribution of the error vector does not follow the normal distribution. How can I deal with that?
Hi Aman, I think we need strong justification for points 3, 4, and 5. Why should they not happen? I was asked this in an interview and was not able to justify points 3, 4, and 5. Could you please elaborate a little more on these points?
Could you please answer my question: what are the similarities and differences between a generalised linear model (GLM) and a gradient boosting machine (GBM)?
1. Linear relationship 2. Very low/no multicollinearity (independent variables not correlated with each other) 3. No heteroscedasticity (constant error variance) 4. No autocorrelation 5. Normally distributed errors 6. All the observations are independent of each other
Thank you so much for this wonderful content. It was really helpful. In the multicollinearity part, I have a small doubt. I understood through your example that it is better to remove one feature out of two if they are positively correlated. Does the same apply to negatively correlated features too? I mean, should I drop one feature in case two features are strongly negatively correlated?
How do we remove multicollinearity from the dataset if the features are highly correlated? Can we solve the problem without removing any features or losing any information? Is PCA helpful?
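PCA can indeed help here: it rotates the correlated features into uncorrelated components, so no column is dropped outright (though interpretability of the original features is lost). A minimal sketch on a hypothetical pair of highly correlated features, using an SVD-based PCA:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + rng.normal(scale=0.2, size=n)   # highly correlated with x1
X = np.column_stack([x1, x2])

# PCA via SVD of the centered data matrix
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Xc @ Vt.T   # rotated features: mutually uncorrelated by construction

corr_before = np.corrcoef(X[:, 0], X[:, 1])[0, 1]
corr_after = np.corrcoef(components[:, 0], components[:, 1])[0, 1]
print(round(corr_before, 2), round(abs(corr_after), 6))
```

The regression is then fit on `components` instead of `X`; keeping all components loses no information, while dropping the low-variance ones trades a little information for a simpler model.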
I think we can use the VIF (variance inflation factor) and then decide which features should be included in the model. In addition, we also need to check the significance values from the OLS regression model. There is a threshold limit (generally, a VIF above about 5-10 signals problematic multicollinearity).
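The VIF idea above can be sketched directly: the VIF of feature j is 1/(1 - R²) from regressing that feature on all the others. A minimal numpy version on hypothetical data (in practice `statsmodels` provides `variance_inflation_factor` for the same computation):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)                    # independent feature
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF of column j: regress x_j on the other columns, return 1/(1 - R^2)."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - resid.var() / y.var()
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in range(X.shape[1])]
print([round(v, 1) for v in vifs])
```

Here the two collinear columns get very large VIFs while the independent one stays near 1, which is exactly the signal used to decide which feature to drop.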
Just a note: the relationship between the dependent and independent variables should be linear in terms of the coefficients, not necessarily in the variables. When we do polynomial regression, linearity between a variable and the target no longer holds, since we have raised-power terms, yet the model remains linear in its coefficients.
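The point above can be shown concretely: a quadratic relationship in x is still fit by an ordinary linear least-squares solve, because the model is linear in the coefficients. A small sketch with hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(-2, 2, size=200)
# True relationship is curved in x (quadratic), with a little noise
y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(scale=0.1, size=200)

# Design matrix with a squared column: nonlinear in x,
# but linear in the coefficients, so plain OLS recovers them.
X = np.column_stack([np.ones_like(x), x, x**2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 1))
```

The recovered coefficients land close to the true (1.0, 0.5, 2.0), even though a scatter plot of y against x is clearly not a straight line.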
Can you explain more deeply with a real-life example? What you explained are the basics with no in-depth explanation, so if possible please explain in depth with a good example, even if the video becomes longer.
You are extremely good at teaching, exactly what I am looking for. Could I have your email address? I am from Bangladesh, a beginner in research (M.Phil.), and I am struggling with some topics in data analysis. I would like to contact you if you approve. Thanks.