
Linear Regression 2 [Matlab] 

Steve Brunton
358K subscribers
19K views

Published: 21 Aug 2024

Comments: 30
@sergiohuaman6084 3 years ago
These videos should have 10K+ views. The instructor is amazing and the methodology is clear and can be applied directly. Congratulations Steve, keep up the great work!
@interminas08 4 years ago
Looks like the housing dataset isn't available in Matlab R2020a anymore. However, you can find it in the DATA zip file for Python on the book website. :)
@Eigensteve 4 years ago
Thanks for the tip
@syoudipta 1 year ago
Thank you for pointing that out. This awesome course would have felt incomplete without this exercise.
@qatuqatsi8503 6 months ago
Thanks, I'm looking at this a full 4 years later and I couldn't find the dataset anywhere! XD
@slevon28 4 years ago
Thank you very much for the great content! I like the fact that you keep this topic more on the mathematical side rather than just a "use toolbox X" approach. This is of high value for me. I am currently working through all your videos and I also just bought your book a minute ago. Finally, this channel seems massively underrated to me. Best regards from Germany.
@gustavoexel3341 2 years ago
Choosing whether to watch the Matlab or the Python version is like choosing between a dubbed movie and one with subtitles: on one hand you're watching the version made originally by the author; on the other hand, if you watched the other version you would understand it much better.
@Eigensteve 2 years ago
That is such a good analogy! FWIW, we always watch subbed over dubbed :)
@_J_A_G_ 1 year ago
Why choose when you can watch both? I also found it interesting to read the comments and see that some questions are repeated, while some seem to come from the different mindset of those choosing the respective language (rather than being language related). Though the observation is not statistically significant. :)
@songurtechnology 4 months ago
Thank you Steve ❤
@alikadhim2558 4 years ago
Thanks for the great illustration.
@delaramra5572 3 years ago
Many thanks. Is it similar if you give FFT data (the Fourier transform of the input and output) to the regression command? It can give a better answer in some cases compared with time-domain data. Should both parts of the FFT data (real + imaginary) be given to the regression solver? Is there any key point?
@ankitchatterjee5343 4 years ago
Sir can you share more insights on the validation part?
@athenaserra8010 2 years ago
Multiple linear regression?
@kouider76 3 years ago
Excellent as usual, thanks.
@nasirbudhah3063 3 years ago
To interpret regression correctly, both x and y must be collected randomly. If x is a series, such as a time sequence, then this cannot be called regression; instead it is called least-squares line fitting.
@_J_A_G_ 1 year ago
I don't think there is such a distinction; maybe I misunderstood your point. "Linear regression" would usually have the least-squares distance as the objective to minimize. Either way you look at it, you have a linear combination of features approximating a known (possibly approximate) target value.
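
To make that concrete, here is a minimal MATLAB sketch of linear regression as a least-squares problem (synthetic data and illustrative variable names, not the code from the video):

% Fit b ~ A*x by minimizing the sum of squared residuals ||A*x - b||_2.
A = [randn(100,2), ones(100,1)];   % two features plus a constant (intercept) column
x_true = [2; -1; 0.5];             % ground-truth coefficients for the toy problem
b = A*x_true + 0.1*randn(100,1);   % noisy observations of the linear model
x_hat = A\b;                       % backslash returns the least-squares solution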
@FelipeCondo 3 years ago
The video is pretty helpful, thank you Professor. How could I put a sinusoidal wave, from internal waves in the ocean, into the b = Ax form? The model is SSH = A cos(k x cos(theta) + k x cos(theta) - w t - phi), where A, theta, and phi are my variables. I mean, I am trying to fit a plane wave, but I do not get the direction (theta) right. Could you give me some tip, please?
@happysong4631 4 years ago
I think there is something wrong with the xlabel? Since we only have 4 ingredients.
@Eigensteve 4 years ago
Good call. This label would be more accurate if it was "mixture of ingredients".
@David-pe2dt 4 years ago
Can someone explain this: when plotting the significance / correlation of the different attributes, the response vector b has been sorted in the previous section, but A and A2 have not been sorted accordingly prior to performing the new multilinear regression... surely by doing this, the attribute matrix and response vector do not match as intended?
@David-pe2dt 4 years ago
Another question that I would like to ask concerns computing Pearson or Spearman correlation coefficients between the original attributes matrix A and the response vector b. If the correlation coefficient for a given attribute has opposite sign to the slope of that attribute from multilinear regression, does that imply that the linear model is not a good fit for that particular attribute?
@_J_A_G_ 1 year ago
> b has been sorted in the previous section, but A and A2 have not been sorted

Old question, but it still seems relevant! Looking at 9:39, line 19 calls sort(b) but also gets sortind back. This sortind is then used on line 22 to rearrange A for the plot. Instead, it would have been better to update A itself (as was done with b) to make sure it was correctly sorted everywhere. Line 32 uses the original A (which is ok) to create A2. The regression on line 39 should then have used sortind on A2 (or used the original, unsorted b). I've only looked at the code on screen; perhaps this is fixed elsewhere.

About the correlation coefficient: does that even happen? If it's close to zero, a sign flip of course doesn't cause much error. If it's big, yes, that sounds like a bad fit; on the other hand, it might also be that the feature is useless, but then the correlation should have indicated that.
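
A minimal sketch of the fix described above, assuming the variable names (A, b, sortind) and the regress call from the code shown in the video, with A2 standing in for the attribute matrix padded with a column of ones:

% Sort the response and apply the SAME permutation to the attribute matrix,
% so the rows of A keep lining up with the entries of b.
[b, sortind] = sort(b);
A = A(sortind, :);
A2 = [A, ones(size(A,1), 1)];   % padded attribute matrix (intercept column)
x = regress(b, A2);             % or A2\b if the Statistics Toolbox is unavailable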
@PenningYu 1 year ago
I believe that is a mistake. It makes no sense to do the regression with a sorted b.
@nami1540 2 years ago
I am confused about the design of the data matrix A at this point. Didn't we state at the beginning that each column is a snapshot? How can each row contain a measurement now? It makes sense when I look at this from the perspective of regression. It does not combine well, though.
@_J_A_G_ 1 year ago
If by "snapshot" you mean "sample", I agree that the initial videos stacked the features in columns and each column was a sample (a set of measurements for the same situation). It was sometimes even an entire image reshaped into a column. To me that was confusing; I'm used to putting samples into rows. Unfortunately the convention seems to differ between situations, as you say, and I don't think he mentioned the change.

My understanding is that the columns of a matrix X could be the rows of another matrix T. That would be T = X' = V S U' (same U, S, V as from svd(X) = U S V'), so in essence you have the correlation among columns and rows either way. See the earlier video for that hint: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-WmDnaoY2Ivs.html
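
A quick way to check that relationship on toy data in MATLAB (note that for the economy-size SVD, S is square, so S' = S):

X = randn(6,4);              % e.g. features in rows, samples in columns
[U,S,V] = svd(X,'econ');     % X = U*S*V'
norm(X' - V*S'*U')           % ~1e-15: transposing X just swaps the roles of U and V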
@user-ym8rz6mw5r 2 years ago
So here you use the SVD, which reduces the data to a square matrix of the same size as the number of variables. What happens if you use fewer components than variables? Is that even possible?
@_J_A_G_ 1 year ago
If you remember, the components are ordered by importance by the SVD. Discarding components gives a less accurate approximation, but sometimes that's fine (perhaps you had lots of noise in the measurements, and getting rid of that is actually a bonus). This is also related to "feature reduction", where you can figure out that some of the data (e.g. shoe size) is only marginally relevant for your target (e.g. house price) and should be excluded. That said, the "features" or components selected by the SVD are rarely physically meaningful or matched to the actual features you had in your data.

The other aspect of this was covered in the Linear Systems video. Depending on your matrices, the system may be underdetermined or overdetermined. If you have only a few variables in X, the degrees of freedom are limited. Such a heavily overdetermined system leads to a more approximate solution, and you may find that it "wasn't useful" even if it is possible.
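
A sketch of what keeping fewer components than variables looks like (assumed names: A is the m-by-n attribute matrix, b the response; not code from the video):

[U,S,V] = svd(A,'econ');                        % economy SVD: S and V are n-by-n
r = 5;                                          % keep only the r most significant components (r <= n)
Ur = U(:,1:r);  Sr = S(1:r,1:r);  Vr = V(:,1:r);
x_r = Vr*(Sr\(Ur'*b));                          % rank-r (truncated) pseudoinverse solution to A*x ~ b

With r = n this reproduces the full least-squares solution; smaller r trades accuracy for robustness to noise in the data.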
@camiloruizmendez4416 3 years ago
Sorry to bother, but housing.data is missing from the webpage.
@camiloruizmendez4416 3 years ago
This code fixes it:

filename = 'housing.txt';
% Download the raw dataset (full https:// URL assumed; the protocol prefix was missing in the original comment)
urlwrite('https://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data', filename);
inputNames = {'CRIM','ZN','INDUS','CHAS','NOX','RM','AGE','DIS','RAD','TAX','PTRATIO','B','LSTAT'};
outputNames = {'MEDV'};
housingAttributes = [inputNames, outputNames];
% Fixed-width format; the final conversion '%[^\n\r]' (skip the rest of each line) is assumed
formatSpec = '%8f%7f%8f%3f%8f%8f%7f%8f%4f%7f%7f%7f%7f%f%[^\n\r]';
fileID = fopen(filename, 'r');
dataArray = textscan(fileID, formatSpec, 'Delimiter', '', 'WhiteSpace', '', 'ReturnOnError', false);
fclose(fileID);
housing = table(dataArray{1:end-1}, 'VariableNames', {'VarName1','VarName2','VarName3','VarName4','VarName5','VarName6','VarName7','VarName8','VarName9','VarName10','VarName11','VarName12','VarName13','VarName14'});
% Delete the downloaded file and clear temporary variables
clearvars filename formatSpec fileID dataArray ans;
delete housing.txt
housing.Properties.VariableNames = housingAttributes;
X = housing{:,inputNames};
y = housing{:,outputNames};