
From Scratch: How to Code Linear Regression in Python for Machine Learning Interviews! 

Emma Ding
Subscribe 55K
15K views

Linear Regression in Python | Gradient Descent | Data Science Interview | Machine Learning Interview
Correction:
8:40 gradients should be divided by "m" instead of "n".
🟢Get all my free data science interview resources
www.emmading.com/resources
🟡 Product Case Interview Cheatsheet www.emmading.com/product-case...
🟠 Statistics Interview Cheatsheet www.emmading.com/statistics-i...
🟣 Behavioral Interview Cheatsheet www.emmading.com/behavioral-i...
🔵 Data Science Resume Checklist www.emmading.com/data-science...
✅ We work with Experienced Data Scientists to help them land their next dream jobs. Apply now: www.emmading.com/coaching
// Comment
Got any questions? Something to add?
Write a comment below to chat.
// Let's connect on LinkedIn:
/ emmading001
====================
Contents of this video:
====================
00:00 Linear Regression Overview
03:58 Gradient Descent
06:32 Linear Regression Implementation
10:48 Time and Space Complexity
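
For reference, here is a minimal from-scratch sketch of the gradient-descent implementation the video walks through, with the corrections above folded in (gradients averaged over the m training examples, and beta_other used consistently). Names such as compute_gradient, beta_0 and beta_other follow the ones mentioned in the video and the comments, but the code below is a reconstruction under those assumptions, not the exact code shown on screen.

def linear_regression(x, y, learning_rate=0.01, iterations=100):
    # x: list of m training examples, each a list of n features; y: list of m targets.
    m, n = len(x), len(x[0])
    beta_0, beta_other = 0.0, [0.0] * n
    for _ in range(iterations):
        gradient_beta_0, gradient_beta_other = compute_gradient(x, y, beta_0, beta_other, m, n)
        # Step against the gradient.
        beta_0 -= learning_rate * gradient_beta_0
        for j in range(n):
            beta_other[j] -= learning_rate * gradient_beta_other[j]
    return beta_0, beta_other

def compute_gradient(x, y, beta_0, beta_other, m, n):
    gradient_beta_0, gradient_beta_other = 0.0, [0.0] * n
    for i in range(m):
        y_i_hat = beta_0 + sum(beta_other[j] * x[i][j] for j in range(n))
        derror_dy = 2 * (y_i_hat - y[i])      # derivative of the squared error w.r.t. y_hat
        gradient_beta_0 += derror_dy / m      # averaged over m examples, per the correction
        for j in range(n):
            gradient_beta_other[j] += derror_dy * x[i][j] / m
    return gradient_beta_0, gradient_beta_other

# Tiny usage check: fit y = 3x + 1.
x_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [1.0, 4.0, 7.0, 10.0]
b0, b_other = linear_regression(x_train, y_train, learning_rate=0.05, iterations=2000)
print(b0, b_other)   # roughly 1.0 and [3.0]

With m examples and n features, each iteration of this sketch does O(m*n) work and uses O(n) extra space, which is consistent with the O(MN) per-iteration cost discussed in the comments.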

Published: Aug 1, 2024

Comments: 32
@diegozpulido · 3 years ago
Hi Emma. Thank you very much for your videos. Thanks to them I got a Senior Data Scientist position at Facebook. I will forever thank you for your exceedingly good work.
@emma_ding · 3 years ago
Glad to hear the videos are helpful. Congrats on your position at Facebook!
@jijyisme · 3 years ago
The explanation is very clear and concise! Thank you so much. Please keep it going.
@jeoffleonora4612 · 3 years ago
I like the step-by-step guide, and I learned a lot from your implementation.
@sitongchen6688 · 3 years ago
Thanks Emma!!
@XuJiBoY · 3 years ago
Very helpful video, clear and concise! One minor correction: at around 8:40 (updating the gradients in method compute_gradient()) I think you meant to divide by m instead of n.
@emma_ding · 3 years ago
Yes, the gradients should be divided by m. Thanks for the correction!
@emma_ding · 3 years ago
Correction: 1. Thanks to Ji Xu - at 9:39, it should be divided by m instead of n. 2. Thanks to Hui Yi - at 9:39, beta_1 should be beta_other.
@leon_0907 · 3 years ago
Should the beta_1 here be beta_other at 9:39? beta_1 was not previously defined in the function.
@YEIYEAH10 · 3 years ago
Excellent, please keep it up!
@emma_ding · 2 years ago
Thanks, will do!
@LuciaCasucci · 3 years ago
Thank you for the videos, Emma! I am on the very last round at Amazon and Microsoft for Senior Data Scientist, and I am finding your material excellent for prep and review! One thing I cannot understand: what does the subscript j mean? How is it any different from i (slides around 6:00 and 6:30)?
@jessieshao1397 · 3 years ago
Waiting for the new video for a week! It's coming!
@cccspwn · 2 years ago
One of the important interview questions that I've seen for linear regression is: "What are the linear regression assumptions?"
@emma_ding · 2 years ago
Thank you for mentioning this! What a great tip to share with us!
@brandonhuang508 · 2 years ago
great video
@jamiew3986 · 2 years ago
Hi Emma, thanks for your video. I wonder how we can explain concepts like gradient descent, maximum likelihood, and log-likelihood verbally in an interview?
@nanfengbb · 3 years ago
Thanks for posting a great video. Quick question: I noticed that you chose iterations = 100 and learning rate = 0.01 here. Is there a relationship between the number of iterations and the learning rate, e.g. iterations * learning_rate = 1?
@user-me2mm2xu7j · 3 years ago
Bear with my rusty math... When calculating the derivative of y over beta_i at 6:27, how come it becomes X_ji? It would make more sense if it were the derivative of y_hat over beta_i, consistent with the earlier slide at 2:50?
@techedu8776 · 5 months ago
The time complexity assumes this is the code that would actually be used, which does not leverage a GPU. With parallelization, a vector multiplication may be computed in constant time, reducing the time complexity from O(MN) to O(M).
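To make that point concrete, here is a hedged sketch of what a vectorized version could look like with NumPy; the names are mine, not from the video. On a CPU the arithmetic is still O(m*n) per iteration, but the per-feature Python loop is replaced by array operations that a GPU or SIMD hardware can execute in parallel.

import numpy as np

def linear_regression_vectorized(X, y, learning_rate=0.01, iterations=100):
    # X: (m, n) feature matrix, y: (m,) target vector.
    m, n = X.shape
    beta_0, beta_other = 0.0, np.zeros(n)
    for _ in range(iterations):
        y_hat = X @ beta_other + beta_0              # all m predictions at once
        derror_dy = 2 * (y_hat - y)                  # derivative of squared error w.r.t. y_hat
        beta_0 -= learning_rate * derror_dy.mean()   # gradient averaged over m examples
        beta_other -= learning_rate * (X.T @ derror_dy) / m   # averaged gradient for the n coefficients
    return beta_0, beta_other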
@florachen9654 · 3 years ago
I believe in the gradient descent step it should be beta[i] -= (gradient_beta_other[i] * learning_rate), since traversing down a slope requires taking the opposite sign of the computed gradient.
@shisk1 · 3 years ago
As she explained at 10:28, it's because the error term is calculated the other way around (derror_dy = 2 * (y_i_hat - y[i])), where y[i] is subtracted from y_i_hat. If you were to reverse it the traditional way, that is, subtract y_i_hat from y[i], then your suggestion would work.
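For readers following this sub-thread: the two sign conventions move the coefficients in the same direction, as long as the sign of derror_dy and the sign used in the update agree. A tiny self-contained illustration (the numbers and names here are made up for illustration, not taken from the video):

# One feature, one example, one update step under both conventions.
x_ij, y_i, y_i_hat, learning_rate, m = 2.0, 5.0, 6.0, 0.1, 1

# Convention A: derror_dy is the true derivative dE/dy_hat, so the update subtracts it.
derror_dy_a = 2 * (y_i_hat - y_i)            # +2.0 when y_i_hat > y_i
step_a = -learning_rate * derror_dy_a * x_ij / m

# Convention B: derror_dy is defined with the opposite sign, so the update adds it.
derror_dy_b = 2 * (y_i - y_i_hat)            # -2.0 when y_i_hat > y_i
step_b = +learning_rate * derror_dy_b * x_ij / m

assert step_a == step_b   # both conventions nudge the coefficient the same way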
@Han-ve8uh · 3 years ago
Thanks for sharing the implementation details. The explanation at 10:30 was hard to understand, specifically the reasoning that "if yhat > y, derror_dy is negative and that's why we add". Beta is being updated here, so the gradient is with respect to beta: beta feeds into yhat, and yhat contributes to E. That slide shows dE/dy, but where is dy/dbeta? It feels like part of the chain rule was missing in that explanation; it jumped directly to the "negative gradient", which involves dE/dy * dy/dbeta, but the latter term is not discussed on that slide.
@junqichen6241 · 3 years ago
Since y_hat = b1*x + b0, the derivative of y_hat with respect to b0 would be 1 and the derivative of y_hat with respect to b1 would be x.
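Writing out the chain-rule step this sub-thread is discussing, for a single example with squared error E = (y_hat - y)^2 and prediction y_hat = beta_0 + sum_j beta_j * x_j:

dE/dbeta_j = (dE/dy_hat) * (dy_hat/dbeta_j) = 2 * (y_hat - y) * x_j
dE/dbeta_0 = (dE/dy_hat) * (dy_hat/dbeta_0) = 2 * (y_hat - y) * 1

So the x_j factor (the X_ji on the slide, for example i) comes from differentiating y_hat, not y, which is why the gradient picks up the feature value.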
@arieljiang8198 · 2 years ago
Hi Emma, thanks for the video, but n shouldn't be in the compute_gradient function; it should be m instead.
@jiahuili2133 · 3 years ago
Is the gradient computed correctly at 6:20? The negative sign seems to be missing when taking the derivative of the error with respect to yhat.
@annad8214 · 2 years ago
Yeah, I agree; in the chain rule it should be the derivative of y_hat with respect to beta instead of y with respect to beta.
@nhandam1168 · 2 years ago
Yes, I agree with both of you. It should be the partial derivative of error w.r.t yhat, not y.
@Bookerer · 2 years ago
That's what I thought too!
@wongkitlongmarcus9310 · 4 months ago
Is this code too long for an interview?