
The Beauty of Linear Regression (How to Fit a Line to your Data) 

Richard Behiel
68K subscribers
153K views

In this video, we'll explore the concepts surrounding linear regression. Linear regression is very useful in math, science, and engineering, and is a gateway to other kinds of regression, and optimization problems in general.
Download the Linear Regression Example Code here: pastebin.com/7cgh951s
Thanks to fesliyanstudios.com for the background music! :)

Published: 25 Nov 2022

Comments: 256
@RichBehiel · 1 year ago
Hi everyone, this video has been getting a lot of views lately so I just wanted to say thank you, and I really appreciate all the positive feedback. It’s great to see such a positive response, and I’m glad that so many people are enjoying linear regression! :) I also appreciate the constructive criticism! A few of you have pointed out that the music is distracting, the motion is too repetitive, and the pace is a bit slow. I didn’t see that when posting the video, but I can totally see where you’re coming from, so I’ll definitely take that into account when making future videos. This was one of my earlier videos and I was still figuring things out. So I really appreciate your feedback, and I hope these videos will get better over time.
@myetis1990 · 1 year ago
You are not only teaching math but also teaching how to think. Thank you very much for the great video. Really inspiring, glad I discovered this channel. Waiting for the videos about the Jacobian, translation, rotation, and quaternions
@ehfik · 9 months ago
the constant animation loop gets a bit annoying. reversing, stopping and changing the animation from time to time would be a solution (and your newer videos are even better anyway!)
@RichBehiel · 9 months ago
I agree. Honestly I look back on this video and cringe at a few of the details, like how the animation loop goes on and on and is a bit nauseating, and music is too loud. But you live and learn! 😅 When I first started making these videos I really had no idea what I was doing.
@phenixorbitall3917 · 7 months ago
@RichBehiel 18:19 On the left-hand side you used the Laplace symbol instead of the nabla symbol. But apart from that: great video! 👍
@atticmuse3749 · 3 months ago
With regards to pacing, I want to say that I really enjoy your general presentation style. You're not simply reading a script and getting the perfect take, you're actually doing a "live" presentation and I really appreciate the way you ad lib or go off on little tangents. I burst out laughing in your buoyancy video when you read the integral "zndS" phonetically.
@patricktanoeyjaya4430 · 1 year ago
I really love how calmly you speak and how the lines you say feel unscripted. Makes it feel very personal. You also speak so clearly and concisely. I was able to get the gist of this with only high school calculus! This is making me like math again.
@RichBehiel · 1 year ago
I’m very glad to hear that! :)
@user-pw5do6tu7i · 1 year ago
Unbelievably crisp explanation of gradient descent. It is remarkable to see it play out in those dimensions. Thank you
@w花b · 1 year ago
And he repeats the animation so we can assimilate what's going on instead of quickly switching to the next thing. Very relaxed explanation which is nice.
@matteokimura1449 · 1 year ago
Another beautiful way to get a linear regression formula is to take the vector space of all real-valued functions that are defined on the x values, choose the hypothetical ideal function that maps all of the x's to their y's, and orthogonally project that hypothetical function onto the subspace of linear functions. By defining the inner product as the Cartesian dot product between the outputs of the functions at the x values, you'll see that the distance the projection minimizes is the error between the linear function and the ideal function.
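This projection picture can be checked numerically in a few lines (an illustrative numpy sketch; variable names are mine): build the Gram matrix of inner products for the basis {x, 1} and confirm that the projection coefficients match ordinary least squares.

```python
import numpy as np

# Sample data: the "ideal" function is just the vector of y-values.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Basis of the subspace of linear functions, evaluated at the x's.
basis = np.column_stack([x, np.ones_like(x)])  # columns: x, 1

# Orthogonal projection of y onto span{x, 1} via the normal equations,
# using the Cartesian dot product as the inner product.
G = basis.T @ basis                       # Gram matrix of inner products
coeffs = np.linalg.solve(G, basis.T @ y)  # [a, b]

# Same answer as ordinary least squares:
print(np.allclose(coeffs, np.polyfit(x, y, 1)))  # True
```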
@andreiimbru6835 · 1 year ago
As an Econ Major, you have no idea how much this helped me understand the behind the scenes of regression lines and everything I've done in Statistics this semester, I've learned soo many new techniques with equation manipulation so, thank you!
@RichBehiel · 1 year ago
Glad to hear that! :)
@zeyogoat · 1 year ago
A rare video that's technically adept and, most importantly, not condescending or pedantic! Well done, from a chemist and educator =)
@TheRiverNyle · 1 year ago
As an Applied Math (Stats/Probability Theory focused) major, this really got me excited!
@tommyproductions891 · 1 year ago
great video! I love how at the start you explain the equation of a straight line and by the end it's multivariable vector calculus
@Liberty5_3000 · 1 year ago
It's so beautiful! Thank you a lot! I hope your channel is gonna grow fast soon
@RichBehiel · 1 year ago
Thank you! :)
@berndkopera7723 · 1 year ago
Absolutely beautiful visualization! Simple, smart and intuitive.
@mroygl · 2 months ago
This is a piece of art, a captivating blend of deep understanding of the matter, beauty of plain graphics, voice acting, matrices, and "simple" software.
@simonleonard5431 · 1 year ago
Thank you! I've been playing with a spherical geometry problem and there's so much I've forgotten from my school days. This video reminded me of so many things, including ways of expanding my approaches to problem solving. Brilliant 👌
@M.KRISHNAKANTACHARY · 2 months ago
Thanks a lot for clearly explaining the concept of fitting a linear regression so beautifully.
@ivopfaffen · 1 year ago
Sooo cool! As a cs major struggling with a numerical analysis class, this helped me understand linear regression so much better. Thanks man!
@MattHudsonAtx · 3 months ago
I saw the calculus approach coming a mile away but it's great to see the linear algebra done so clearly. I need to take that again.
@ehfik · 9 months ago
this was SO satisfying! hope to see many more explanations, such a great execution!
@RichBehiel · 9 months ago
Thanks! :)
@johnstuder847 · 8 months ago
Thank you! This is definitely one of YouTube's math gems! It ties so many ideas together. I would love for you to do a video on Fourier epicycles. For reference, GoldPlatedGoof's 'Fourier for the rest of us' is a great starting point. I'm sure you could do a beautiful, refined version showing how the inner product, Fourier, QM, function spaces, and art all come together in a beautiful way. Thank you so much for sharing your videos!
@RichBehiel · 7 months ago
Thanks for the kind comment, John! :) I touch on Fourier analysis in my upcoming video on relativistic QM, the Klein-Gordon equation. Hoping to upload it within a week.
@Ayesha_F · 1 year ago
Oh this was so SATISFYING! I don't think i have ever seen regression explained this way. It's like parts of how i understand it, is being so wonderfully articulated by someone who obviously knows the subject matter well. I have had to teach myself mathematics and statistics, and I've always been drawn to this intuitive and philosophical way of understanding it. Thank you for this!
@RichBehiel · 1 year ago
Thanks for the kind comment, and I’m glad you enjoyed the video! :)
@TheScepticalChymist · 1 year ago
I cannot finish the video because your voice is SO charming and comforting and makes me feel so safe, I just cannot pay attention to the maths.
@jiadong2246 · 1 year ago
Great work! Thank you, and I'm looking forward to the linear regression and gradient descent videos you mentioned at the end of the video
@sujalgvs987 · 1 year ago
I absolutely loved this video. Please do more videos on regression and machine learning as a whole.
@enricolucarelli816 · 3 months ago
Wow! This is perfection explaining/visualizing complexity and its beauty! ❤❤❤❤ 👏👏👏👏👏
@anthonyrojas9989 · 1 year ago
This was amazing! So fun to watch and appreciate this concept.
@RichBehiel · 1 year ago
Thanks, glad you enjoyed the video! :)
@xxge · 1 year ago
Great video! Coming from a linear algebra heavy background I still think taking the singular value decomposition of X, inverting it, and multiplying by y to find b is a more elegant and simple approach especially for multiple linear regressions, but I imagine if you have more experience with physics this approach would be more familiar and easier to digest. Keep these videos coming!
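As a sketch of that SVD route (illustrative numpy, not from the video): pseudo-invert the design matrix X through its singular value decomposition, then multiply by y to get the fit parameters.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.9, 3.1, 5.0, 7.2])

# Design matrix: one column for the slope, one for the intercept.
X = np.column_stack([x, np.ones_like(x)])

# Pseudo-invert X via its SVD, then multiply by y to get b = [slope, intercept].
U, s, Vt = np.linalg.svd(X, full_matrices=False)
b = Vt.T @ np.diag(1.0 / s) @ U.T @ y

print(np.allclose(b, np.polyfit(x, y, 1)))  # True
```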
@dadamczyk · 1 year ago
Great video! With those animations it would be wonderful to see an essay about bayesian linear regression since it is quite different and powerful approach to similar topic.
@benwinstanleymusic · 1 year ago
Really enjoyed this, you're great at explaining stuff
@Aziqfajar · 1 year ago
This is beautifully explained and visualized! I'm glad to be on the first wagon for the ride of this video.
@RichBehiel · 1 year ago
Thanks, I’m glad you liked the video! It’s one of my favorite mathematical concepts, so it’s great to see others enjoying it too :)
@davidandrewthomas · 1 year ago
This is beautifully put together! What a great explanation!
@RichBehiel · 1 year ago
Thanks! :)
@alexkushnir8073 · 1 year ago
Cool music, Richard, it opens my mind and makes me understand things better! It's like combining hypnosis and a class ;-) I wish my math teacher at school had explained it to us that way 🙂
@atticmuse3749 · 3 months ago
12:16 "it should keep you up at night" Very apropos considering it's almost 4:30 am right now and I've been watching your videos for hours 😅
@wishIKnewHowToLove · 1 year ago
he just dropped the most beautiful linear regression video and thought we wouldn't notice
@jwilliams8210 · 1 year ago
Fantastic presentation!
@marktahu2932 · 1 year ago
Really very helpful. I'm no professional in any of these fields, just an old technician who is being reminded of all those brain neurons that have lain dormant for decades.
@tesstera · 1 year ago
Amazing! Thanks for showing us how to solve a maths problem in a physics way. Even though this method is already used in today's AI, it is still very interesting to see it work outside AI. The conceptual journey you've taken reminds me of my attempts at machine proving (ATP), and it does a great deal to take the intimidation out of numerical analysis. Thanks!
@bernard2735 · 1 year ago
Beautifully explained, thank you. Liked and subscribed and looking forward to more.
@levimillerfandom · 1 year ago
I was really stuck on a practical. I had to make a graph of my readings; the book stated that I should get a straight line, but instead I got curves, which was really stressful. Thankfully I found your video. It really helped ❤ Thanks again
@ABKW119 · 1 year ago
Why do your videos only get recommended to me at 1am, they send me straight down a rabbit hole 😂
@RichBehiel · 1 year ago
Sorry 😂
@Cristi4n_Ariel · 1 year ago
This was interesting! Thanks for sharing.
@coreymonsta7505 · 1 year ago
I love code and taught calc 3 a couple of times (my favorite class), but never learned about this topic in school (only heard its name a lot). That was really interesting
@zeb4827 · 1 year ago
very cool video, this connected some dots that I've been struggling to reconcile
@user-hl8sv1if7j · 1 year ago
wow. So well explained. Thank you
@benjaminshropshire2900 · 1 year ago
IIRC there *is* a way to leverage that outer product observation: if D is a matrix where each column is [xᵢ 1] and Y is a matrix where each row is [yᵢ], then the entire left Σ becomes DDᵀ and the entire right Σ becomes DY. Also (I think) this actually generalizes to linear equations with more terms by adding the data as more rows in D. And the added rows can also be functions of existing simpler terms (e.g. Nth powers of x to get polynomial fits, sin(nx)/cos(nx) to get discrete Fourier transforms, etc.).
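A quick numerical check of this observation (a numpy sketch with my own variable names): the two sums in the normal equations collapse to DDᵀ and Dy, and adding a row of x² to D gives a quadratic fit with the same machinery.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.5, 2.4, 4.6, 6.4, 8.5])

# D has one column [x_i, 1] per data point, so D is 2 x N.
D = np.vstack([x, np.ones_like(x)])

# The two sums in the normal equations collapse to D D^T and D y.
params = np.linalg.solve(D @ D.T, D @ y)   # [a, b]

# Generalizing: add a row of x**2 to D for a quadratic fit.
D3 = np.vstack([x**2, x, np.ones_like(x)])
quad = np.linalg.solve(D3 @ D3.T, D3 @ y)  # [a2, a1, a0]

print(np.allclose(params, np.polyfit(x, y, 1)))  # True
print(np.allclose(quad, np.polyfit(x, y, 2)))    # True
```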
@Lado916 · 1 year ago
Great video! I absolutely love the visual and dynamical proofs in math. I just wanted to add that there is a beautiful point-line duality between the two spaces: While a dot in parameter space corresponds to a line in real space, a line in parameter space defines a family of curves in real space that intersect at the same point. Moreover, if you map your datapoints to their corresponding dual lines, the center of mass of these lines will be a dual point to the best fit line of the data! Hope you find this as cool as I do.
@RichBehiel · 1 year ago
That’s really cool! I’ve read about that kind of thing in an intro to differential geometry book, but hadn’t connected the dots in the context of this video. Thanks for a very interesting comment :)
@TranquilSeaOfMath · 1 year ago
I really like all you put into this video. It helps connect ideas in interesting ways. Thank you for including the Python code.
@mskiptr · 1 year ago
The parameter space is a super powerful concept. Especially in computer vision, where you can take a bunch of pixels and quickly detect all the lines they approximately form
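The computer-vision technique being described is the Hough transform, and a toy version fits in a few lines (illustrative sketch, not from the video): each point votes for every line rho = x·cos(theta) + y·sin(theta) through it, and points on a common line pile their votes into one accumulator cell.

```python
import numpy as np
from collections import defaultdict

# Points that approximately form the line y = 2x + 1.
pts = [(x, 2 * x + 1) for x in range(10)]
thetas = np.linspace(0, np.pi, 180, endpoint=False)

# Accumulator over the (theta, rho) parameter space.
acc = defaultdict(int)
for x, y in pts:
    for i, t in enumerate(thetas):
        rho_bin = int(round(x * np.cos(t) + y * np.sin(t)))
        acc[(i, rho_bin)] += 1

# The strongest cell collects a vote from every point on the line.
print(max(acc.values()))  # 10
```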
@CarlosHlavacek · 1 year ago
Really beautiful class.
@kalaiselvan6907 · 1 year ago
❤️❤️❤️This is Gold ❤️❤️❤️ Thank you
@8megabitz706 · 1 year ago
I've been waiting for this for too long 10:17
@IAmTheFuhrminator · 1 year ago
Such a great video! I had a lecture about this years ago in my engineering analysis class in undergrad, but I took such poor notes that I was never able to reproduce this function. Now as homework I'm going to take your process and solve for other functions like parabolas or cubics which will require me to use 3 and 4 dimensional parameter spaces. Thanks again for the great video!
@RichBehiel · 1 year ago
That’s awesome, I love to hear that! Challenge for you: can you solve it for a general N-degree polynomial? Like with some kind of recursive algorithm. I actually don’t know if this is possible but it seems like a fun puzzle!
@IAmTheFuhrminator · 1 year ago
@@RichBehiel that would be a fun problem to solve! And even if it can't be solved, I'm sure proving or disproving the possibility of a solution would make a great paper!
@chrislau9835 · 1 year ago
Very good explanation 👍🏻👍🏻
@micahwithabeard · 1 year ago
I just liked, subbed, and commented :D I don't think I can be any more "violently complimentary" than that. This was excellent, thanks!
@andytroo · 1 year ago
Introducing the Jacobian could be a nice extension. The error contours around the best fit form an ellipse, which can make converging towards the best solution hard, as many of the gradient directions in the top half of your example are not pointed towards the best solution, simply towards that valley of best fit. Reshaping the gradients to make that ellipse a circle allows much quicker convergence.
@RichBehiel · 1 year ago
Great idea! I’d love to do a video on that someday.
@michahejman6712 · 1 year ago
Great video! 30 minutes felt like 5 :) Thanks!!!
@RichBehiel · 1 year ago
Thanks, glad you enjoyed the video! :)
@AfroNyokki · 1 year ago
Great explanation, loving it so far. I'm majoring in applied math with a focus in numerical analysis, so this stuff is always fascinating haha. I noticed around 18:20, you started using delta instead of del. Thought it might be a typo but just wanted to check!
@RichBehiel · 1 year ago
Yeah that’s a typo, sorry! 😅 Thanks for pointing that out.
@RocaSeba · 1 year ago
This video is genius. Subscribed.
@nooks12 · 1 year ago
Satisfying video. Took me back to University.
@피클모아태산 · 1 year ago
Great video! I had never thought of the parameter space in terms of an 'error force'.
@rouninph6349 · 1 year ago
It looks like you are trying to hypnotize your listener. 😂 Great explanation btw. Using physical arguments to explain a mathematical concept, I like that.
@williamfurtado1555 · 1 year ago
This video is wonderful. How did you create the interactive visualization with the "Parameter Space" and "Real Space" subplots? I'd love to be able to create one on my own.
@RichBehiel · 1 year ago
Thanks William! :) For this video I used Python, specifically matplotlib. You can get it by downloading Anaconda, which installs Python and some scientific modules, then calling "from matplotlib import pyplot as plt". After that line, you can use things like plt.figure() and plt.plot() to make a figure and plot things.

In this case the parameter space and real space are two subplots in a figure. They refresh at 60 frames per second in a loop which sets the dot's position in the parameter space while drawing the line in the real space, based on the current a and b values. To turn on the error landscape, I also added some code to evaluate the error metric (objective function) at all points in the parameter space for each a and b. Then for the error force I calculated and plotted the negative gradient of that. For the part where the dot descends down the gradient, I used ma = F - kv, with mass parameter m and friction-ish parameter k, to make the dot roll down the hill and then stop at the optimal point.

I'll be more careful in future videos to post the source code of the animations too. Well, at least for videos after the one I'm going to post this week; for that one, and the previous videos, I was very sloppy with the code and it wouldn't be too helpful to see them. But there have been a few comments now about how these animations were made, so I figure the best answer is the code itself. In the future I'll be better about writing cleaner animation code and sharing it.
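Based on that description, a rough static sketch of the setup might look like the following (Python/matplotlib; all names, constants, and data here are my own guesses, not the video's actual code). It draws the error landscape and a damped dot rolling down to the best fit.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; remove this line to view interactively
from matplotlib import pyplot as plt

# Made-up noisy data around y = 1.5x + 2.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 20)
y = 1.5 * x + 2 + rng.normal(0, 1, x.size)

# Error landscape over the (a, b) parameter space.
a_grid, b_grid = np.meshgrid(np.linspace(-5, 5, 200), np.linspace(-5, 5, 200))
E = sum((a_grid * xi + b_grid - yi) ** 2 for xi, yi in zip(x, y))

def grad(a, b):
    """Gradient of the sum-of-squares error at (a, b)."""
    r = a * x + b - y
    return np.array([2 * np.sum(r * x), 2 * np.sum(r)])

# Damped descent: ma = -grad(E) - k*v, integrated with semi-implicit Euler.
pos, vel = np.array([-4.0, 4.0]), np.zeros(2)
m, k, dt = 1.0, 10.0, 1e-3
for _ in range(50000):
    accel = (-grad(*pos) - k * vel) / m
    vel += accel * dt
    pos += vel * dt

fig, (axL, axR) = plt.subplots(1, 2, figsize=[8, 4.5])
axL.contourf(a_grid, b_grid, np.sqrt(E), levels=30)  # sqrt flattens the landscape
axL.plot(*pos, "wo")
axL.set_title("Parameter Space")
axR.scatter(x, y)
axR.plot(x, pos[0] * x + pos[1], "r")
axR.set_title("Real Space")
fig.savefig("linreg.png")

print(np.allclose(pos, np.polyfit(x, y, 1), atol=1e-2))  # settles at the best fit
```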
@rocknroll909 · 1 year ago
​@@RichBehiel wow, you're awesome for such an in-depth reply to this. Thank you, I might try this on my own
@user-pn1lm3pi6p · 1 year ago
Very good!
@maxfitzkin9422 · 1 year ago
I really loved how you put this video together! What did you use to animate and edit everything? It was really clean!
@RichBehiel · 1 year ago
Thanks! :) I used matplotlib in Python.
@ydl6832 · 1 year ago
Yeah, this is a nice explanation. A neural network is just a more sophisticated version of line fitting with more parameters.
@StudyEnggFocus · 3 months ago
Hello, Richard! Could you explain what you meant by error metric? Thanks
@einsteingonzalez4336 · 1 year ago
That’s awesome! But what happens if we let N approach infinity where the data points are in a finite domain?
@scienceuser4014 · 1 year ago
Perfect video
@torquencol · 1 year ago
Lmao thank you for this, this video came into my recommendations right when I needed it most: I've been stressed these last few days doing laboratory reports, where I have to use the regression line a lot 🛌 It made me hate it less
@alexander_adnan · 1 year ago
Thank you 🙏 ❤❤❤
@GradientAscent_ · 1 year ago
Very cool animations
@brianli3493 · 1 year ago
electric potential actually helped me understand this omg
@Osniel02 · 1 year ago
just gorgeous!!!
@peterwolf8092 · 1 year ago
😂 I really love this and wish my high school students would understand it so I could share it with them.
@trustnoone81 · 1 year ago
Do I understand correctly that the "valley" in the error landscape is the set of all lines that pass through the point (x-bar, y-bar)?
@RichBehiel · 1 year ago
Great question, and I’m actually not sure. Anyone know the answer?
@account4345 · 1 year ago
Just gotta remind myself this is why I must master linear algebra.
@RichBehiel · 1 year ago
Mastering linear algebra is a great and enduring source of spiritual fulfillment 🙏
@guslackner9270 · 1 year ago
This video is a wonderful explainer! You've listed in the description that linear regression is "very useful in math, science, and engineering" to which I would like to add economics, which is what I am studying. This video and Jazon Jiao's work (ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-3g-e2aiRfbU.html) are the best explanations of the concept that I have seen in video, lecture, or textbook form. I look forward to seeing what else you share on this channel!
@tylerbakeman · 1 year ago
Instead of calculating Δy, it might be better to calculate the distance a point is from the line (especially for smaller data sets, where Δy could be large but in fact the line could be very close).
@jursamaj · 9 months ago
And you can fit to other curves with simple transforms of one or both axes, like log or exp.
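For example (an illustrative sketch, not from the video), an exponential y = A·e^(bx) becomes a line after a log transform of the y-axis, so a straight-line fit recovers both parameters:

```python
import numpy as np

# Noisy exponential data around y = 3 * exp(0.7 * x).
rng = np.random.default_rng(1)
x = np.linspace(0, 5, 50)
y = 3.0 * np.exp(0.7 * x) * np.exp(rng.normal(0, 0.01, x.size))

# Straighten it out: fit a line to (x, log y), then map the intercept back.
b, log_A = np.polyfit(x, np.log(y), 1)
A = np.exp(log_A)

print(round(A, 1), round(b, 2))  # close to the true 3.0 and 0.7
```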
@badermuteb1012 · 8 months ago
How did you code these interactive plots? Thanks
@jamesmcfarlane3469 · 1 year ago
Is this method, or something similar, applicable to nonlinear least squares? I did a project over Christmas using nonlinear least squares regression and this would've been super helpful 😅
@RichBehiel · 1 year ago
The same concept of minimizing a least squares objective function by setting the gradient to zero applies to nonlinear least squares, but there are also extra steps involved.
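One common version of those extra steps is Gauss-Newton iteration: linearize the model around the current parameter guess and solve a small linear least-squares problem each round. A minimal sketch (my own example, for the hypothetical model y = A·exp(b·x) with noiseless data and a reasonable starting guess):

```python
import numpy as np

# Data generated from y = 2 * exp(1.3 * x).
x = np.linspace(0, 2, 30)
y = 2.0 * np.exp(1.3 * x)

A, b = 1.8, 1.2  # starting guess (e.g. from a log-linear fit)
for _ in range(20):
    model = A * np.exp(b * x)
    # Jacobian of the model with respect to (A, b).
    J = np.column_stack([np.exp(b * x), A * x * np.exp(b * x)])
    # Solve the linearized least-squares problem and step.
    step, *_ = np.linalg.lstsq(J, y - model, rcond=None)
    A, b = A + step[0], b + step[1]

print(round(A, 3), round(b, 3))  # converges to the true A = 2.0, b = 1.3
```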
@JOHNSMITH-ve3rq · 1 year ago
Wow. Seen so many videos, read so many papers and books - but this one takes the cake. Would love to see you doing this but for more complex models with fixed effects and all sorts of other bells and whistles. Impressive!!
@nofalldamage · 1 year ago
Great video. Is the matrix at the end always invertible?
@RichBehiel · 1 year ago
Great question! It's invertible as long as its determinant isn't zero. It has the form [A, B; B, N], where A and B are real numbers and N is a positive integer, so its determinant is AN - B^2. For this to be zero would require that AN = B^2, in other words for the sum of x_i^2 times N to equal (the sum of x_i)^2. I'm not sure if this can happen; it feels like it can be proven one way or the other without a ton of work, but I've gotta go. So I leave that as an exercise for the reader! :)
@nofalldamage · 1 year ago
@@RichBehiel I think one of the cases where the matrix is not invertible is if all the points are on a vertical line. Kind of makes sense since then the form y = ax + b doesn't really work.
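For what it's worth, the Cauchy-Schwarz inequality settles the exercise: (Σxᵢ)² ≤ N·Σxᵢ², with equality exactly when all the xᵢ are equal, so the vertical-line case above is the only singular one. A quick numerical check (illustrative, my own function name):

```python
import numpy as np

def det_of_normal_matrix(x):
    """Determinant N * sum(x^2) - (sum x)^2 of the 2x2 normal-equations matrix."""
    N = len(x)
    return N * np.sum(x**2) - np.sum(x)**2

# Distinct x values: strictly positive determinant, so invertible.
print(det_of_normal_matrix(np.array([1.0, 2.0, 3.0])))  # 6.0
# All x values equal (a vertical line of data): singular.
print(det_of_normal_matrix(np.array([2.0, 2.0, 2.0])))  # 0.0
```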
@akidnag · 1 year ago
Great vid, thank you! I'm struggling to work out how you visualize the "parameter space" in Python.
@akidnag · 1 year ago
I did a meshgrid for a and b from -5 to 5 with 100 points, X, Y. Then I calculated the magnitude as Z = the square root of the sum of the squares of each equation and did a contourf(X, Y, Z), but no luck :/
@akidnag · 1 year ago
I think the quiver plot is ok as quiver(X,Y,eq1,eq2)
@RichBehiel · 1 year ago
I did a contourf and a quiver. If the contourf isn’t working, it’s possible the color limits are off? Oh, actually come to think of it, I might have taken the log or sqrt of the error, to flatten out the landscape so it’s easier to see. Basically applying a nonlinear colormap.
@akidnag · 1 year ago
@@RichBehiel Thanks a lot! Keep up the great work!
@akidnag · 1 year ago
Still no good, I'm sorry. So contourf gets V (or log(V) or sqrt(V)) and quiver gets Fa, Fb, over the spanned a and b, right? Sorry to bother you; I feel like I understand, but not getting the same results makes me doubt what I'm doing wrong :/ Is it too much to ask if you could share the code for visualizing the parameter space?
@potatochipbirdkite659 · 1 year ago
Do you have the blue dot following a Lissajous curve?
@RichBehiel · 1 year ago
I forget what I did for that, I think I just had some sines and cosines of different frequency in x and y.
@PatrickDoolittle · 1 year ago
Like Sujal Gupta, I watched this video because I am studying machine learning. I have been studying simple linear regression for the past couple of weeks now! Just yesterday I started to think about how the Moore-Penrose pseudo-inverse generalizes the idea of an inverse to situations where the matrix is not square.

I call linear maps to a higher-dimensional space "embeddings" and linear maps to a lower-dimensional space "projections". For a square matrix, which is neither an embedding nor a projection but a linear operator in the same dimension, we can undo the linear mapping by finding the inverse X^-1. In the case of projections, there are many high-dimensional vectors that can be projected down to a given low-dimensional vector, so there is no unique inverse. However, we can solve the system Xb = y for b using the Moore-Penrose *pseudo*-inverse: (X^T X)^-1 X^T. Applying it to the vector of response variables y gives the coefficients b such that Xb is the orthogonal projection of y onto the column space of X, the space spanned by the data, and that is the beauty of the Moore-Penrose pseudo-inverse!
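In numpy this whole route is a couple of lines (an illustrative sketch; np.linalg.pinv computes the Moore-Penrose pseudo-inverse directly):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.2, 2.8, 5.1, 6.9])
X = np.column_stack([x, np.ones_like(x)])  # design matrix with intercept column

# Pseudo-inverse by hand, (X^T X)^-1 X^T, versus numpy's built-in.
b_manual = np.linalg.inv(X.T @ X) @ X.T @ y
b_pinv = np.linalg.pinv(X) @ y

print(np.allclose(b_manual, b_pinv))  # True
```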
@davidmurphy563 · 1 year ago
I code DNNs too. Um. I understood your words but not your point. Genuinely curious here. So we can calculate the inv matrix. Take the reciprocal of the determinant and multiply it by the matrix with the diagonal swapped and the upper/lower negated. This spits out a new matrix with the property that if you multiply that by the original you get the identity (assuming linear independence). Ok fine, all very useful. But what's that got to do with the price of fish?
@peterwolf8092 · 1 year ago
Is it possible to get a "second best" valley? A pseudo-best solution?
@RichBehiel · 1 year ago
Not for linear regression, but for fits with more parameters yes. Gradient descent can sometimes get stuck in a local minimum, a valley other than the best one. If there’s an analytic solution, it might involve the roots of a polynomial or something, so you can have multiple values which are locally optimal. In that situation, the height of the objective function at each optimum can be quickly compared, since the list should be pretty short.
@PrismaticCatastrophism · 1 year ago
Could you make similar video about parabolic graphs?
@RichBehiel · 1 year ago
I’d like to someday! The procedure is very similar, but ax^2 + bx + c instead of ax + b. It’s a 3D parameter space, but the same techniques work.
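For reference, a sketch of that 3D-parameter-space version in numpy (illustrative, not the video's code): same normal-equations machinery, just one more column in the design matrix.

```python
import numpy as np

# Noisy data around y = 0.5x^2 - x + 2.
rng = np.random.default_rng(2)
x = np.linspace(-3, 3, 25)
y = 0.5 * x**2 - 1.0 * x + 2.0 + rng.normal(0, 0.1, x.size)

# Columns x^2, x, 1 span the 3D parameter space (a, b, c).
X = np.column_stack([x**2, x, np.ones_like(x)])
params = np.linalg.solve(X.T @ X, X.T @ y)

print(np.allclose(params, np.polyfit(x, y, 2)))  # True
```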
@denisbaranov1367 · 10 months ago
The beauty of: Linear Regression
@kummer45 · 1 year ago
Imagine you have a surface with a magnet. That's a game changer. Understanding the concepts of statistics by doing physics is the correct way of UNDERSTANDING mathematics and PHYSICS. However, physics has nothing to do with mathematics and mathematics has nothing to do with physics. The magic of this is MODELING. Linear regression, the average, and the Gauss curve are concepts of fundamental use in statistical mechanics. Eventually higher mathematical physics will launch the student into the field of MODEL MAKING.
@SD-ni9jh · 1 year ago
beautiful vid
@RichBehiel · 1 year ago
Thanks! :)
@m9l0m6nmelkior7 · 3 months ago
But is that matrix invertible if there is more than one extremum ?
@bronga645 · 1 year ago
sub, like and comment for your effort, even if you dont make much on yt you are a great mathematician! And i am sure you will make it in life and be a help to humanity as a whole. thank you
@RichBehiel · 1 year ago
Thanks for the kind comment! :)
@sarthakjain1824 · 1 year ago
That was on the level of 3 blue 1 brown videos
@RichBehiel · 1 year ago
Thanks! :) Grant is a role model for sure. The aesthetics of his videos are much better than mine though 😅 But I’ll get better over time.
@benandrew9852 · 1 year ago
Holy shit, I have genuinely never even come close to thinking about it like this. Top marks, no notes
@DavyCDiamondback · 1 year ago
Nice video on OLS. I've often wondered, though, why lessons on regression focus on OLS rather than Deming regression, as OLS seems objectively inferior; with so many projections based on the inferior model, we are shooting our research methods in the foot from the start
@RichBehiel · 1 year ago
Good point. Frankly I think it’s because OLS is easier, and gets the job done in most situations. But I agree that there are times when Deming regression is better. Although someone who uses Deming would presumably have learned OLS first. OLS is also conceptually ideal for explaining how calculus can be used to minimize fit error, so it’s a good go-to image to have in mind when solving fancier optimization problems.
@DavyCDiamondback · 1 year ago
@@RichBehiel I completely understand. In fact, this subject is making me think about applied mathematics, because if we go deeper, it's not as though linear regression in any form is the best way to actually model most data. So I'm thinking about dividing a function into splines to create a good fit. You can go too far and smoothly fit every point into a function, but then your function is skewed toward the data set, losing the ability to make good projections. It's an interesting puzzle (and I hated applied mathematics in college)
@turun_ambartanen · 1 year ago
Well, there are quite a few advantages of OLS compared to a total least squares fit. For one, in any measurement where x is tightly controlled and y is the thing you want to learn about, OLS is the right tool: because there are no, or only negligible, errors in x, the distance of data points to the prediction in x doesn't matter and must not be included in the fit. It also works much better with arbitrary functions than total least squares. For an arbitrary function I don't think there even is _any_ way to calculate the total least squares error. Only well-behaved functions work, and even then you have to define the derivative to perform a total least squares fit.
@BrunoJMR · 1 year ago
When calculating the zero gradient, how do you avoid the local minimum problem? They are also zeroes of the gradient
@RichBehiel · 1 year ago
True! For more complicated fits, the parameter space becomes more textured and you’ll often have multiple local minima. But with an analytic solution, these minima can be quickly calculated, for example as roots of a polynomial. Then there’s just a small list of points at which the objective function can be evaluated and compared, and the minimum can be chosen from the list.
@BrunoJMR · 1 year ago
@@RichBehiel Thanks! So the analytic solution gives us all the minima and we then can just check which one is the lowest. Cool
@RichBehiel · 1 year ago
Yup. There may be some maxima and saddle points in there too, since those also have zero gradient, but those can either be filtered out analytically by solving some second derivatives for additional constraints, or just kept in the list and they won’t be the minimum so it doesn’t matter. In practice, people almost always do the latter. The only exception would be if the data rate is very high and there’s some benefit to solving those equations in exchange for a marginally faster routine. So in super high performance scenarios, the second derivatives are worth looking at.
@sgtreckless5183 · 1 year ago
Is the direct product in the final formula always non-singular, and so always has an inverse?
@RichBehiel · 1 year ago
I believe so, but I’m not 100% sure actually. As a good exercise in math, you can explore if it might be noninvertible under some conditions, just set the determinant to zero and see what a dataset would have to be like in order for that to happen. I’ve done millions, maybe billions, of linear regressions (on data streams) and have never run into this problem though.
@sgtreckless5183
@sgtreckless5183 Год назад
@@RichBehiel Doing just the quickest amount of working out with a dataset of 3 values, I think the sum of outer products would only be singular if all the x values are the same, which obviously isn't going to happen. It's fairly easy to show that if we have a dataset like this, the matrix is singular (the first row of the matrix is just the second multiplied by x_i), though I'm not sure how you'd prove it the other way around (i.e. that the matrix is non-singular in all other cases).
@RichBehiel
@RichBehiel Год назад
That makes sense! Btw, these equations are equivalent to a force and torque balance, if the residuals are imagined as elastic springs, so physically it makes sense that it would only be singular if the x values are all the same, or something like that.
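For the straight-line fit, the converse can in fact be shown: det(XᵀX) = n Σx² − (Σx)² = n Σ(x − x̄)², which is zero exactly when every x value is the same. A quick numerical check of both cases (the helper `normal_matrix` is a hypothetical name, pure numpy):

```python
import numpy as np

def normal_matrix(x):
    """Sum of the outer products [x_i, 1][x_i, 1]^T, i.e. X^T X for a line fit."""
    X = np.column_stack([x, np.ones_like(x)])
    return X.T @ X

# All x equal: the columns of X are proportional, so X^T X is singular
d_same = np.linalg.det(normal_matrix(np.array([2.0, 2.0, 2.0])))

# Any spread in x makes it invertible
d_spread = np.linalg.det(normal_matrix(np.array([1.0, 2.0, 3.0])))

print(d_same, d_spread)
```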
@account4345
@account4345 Год назад
Could you not also just sum the magnitude of the distance between the points and a potential line of best fit, then find the minimum total distance possible? Maybe through some differential optimisation process? I imagine there is some reason why this wouldn't work. It's been too long since I have done any of that sort of math, so I can't really test it out myself; I would have no idea what I am doing lmao.
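That approach can work numerically, even though the absolute value resists the clean calculus-based solution. A rough sketch (synthetic data, brute-force grid search instead of a proper optimizer): minimize the summed absolute perpendicular distances to a candidate line.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 5, 40)
y = 1.5 * x + 0.5 + rng.normal(0, 0.3, x.size)

# Sum of absolute *perpendicular* distances to the line y = a x + b,
# minimized by brute force over a coarse (a, b) grid -- no derivatives
# needed, which sidesteps the non-differentiability of |.| at zero
aa, bb = np.meshgrid(np.linspace(0, 3, 301), np.linspace(-1, 2, 301))
cost = np.abs(y - aa[..., None] * x - bb[..., None]).sum(axis=-1) / np.sqrt(1 + aa**2)
i = np.unravel_index(np.argmin(cost), cost.shape)
a_best, b_best = aa[i], bb[i]
print(a_best, b_best)
```

The recovered line lands near the true parameters (a = 1.5, b = 0.5); the grid resolution limits the precision, which is why the analytic least-squares route is preferred when it's available.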
@willbedeadsoon
@willbedeadsoon Год назад
When I run the code in VS Code it shows nothing, but if I start debugging line by line, at "plt.subplots(figsize=[8, 4.5])" it shows the matplotlib window. It's weird to me. What's going on here?
@RichBehiel
@RichBehiel Год назад
Hmm… I’m not sure, tbh. Do you have all the modules installed? I’d recommend installing Anaconda, then running the code in Spyder (it comes with Anaconda). That way you’ll have a lot of mathematical and scientific modules already installed. Plus, Spyder looks cool.
@SkrtlIl
@SkrtlIl Год назад
Not sure why you get the window in debugging mode, but for normal Python scripts you usually have to call plt.show() manually, while notebooks render figures inside the corresponding cell. So you could also change your .py to .ipynb and run that in VS Code.
@zukofire6424
@zukofire6424 Год назад
Beautiful, and I'm surprised I never knew some of what you explained. I wanna add something irrelevant: you are so handsome!
@Null_Simplex
@Null_Simplex Год назад
Why is it that in statistics mean squared error seems to be the go-to measure of dispersion, as in this video, or standard deviation rather than mean absolute deviation? My only guess is that mean squared error resembles the Pythagorean theorem, and the Pythagorean theorem comes into play when you are dealing with perpendicular lines, and perpendicular lines sort of represent the idea of independent measurements. Probably just a bunch of rambling.
@RichBehiel
@RichBehiel Год назад
More than just rambling, you’ve hit on something deep there. It’s kind of weird that the distance between two points is the Pythagorean theorem, and not just dx + dy, right? Sort of related to that, the derivative of y^2 is 2y everywhere, which is nice and easy to work with. But |y| is sharp and not differentiable at y = 0. So it’s easier to work with y^2. You could do it with |y| but it’s more tricky.
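A tiny numerical illustration of that point (assumed example, not from the video): the gradient of y² is smooth and linear everywhere, which is what makes the zero-gradient equations of least squares linear and easy to solve, while the gradient of |y| jumps discontinuously at zero.

```python
import numpy as np

y = np.linspace(-2, 2, 9)

# d/dy of y^2 is 2y: smooth and linear, so setting the gradient
# to zero yields clean linear (normal) equations
grad_sq = 2 * y

# d/dy of |y| is sign(y): jumps from -1 to +1 across y = 0, so the
# gradient never passes smoothly through zero there
grad_abs = np.sign(y)

print(grad_sq)
print(grad_abs)
```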
@code_explorations
@code_explorations Год назад
Thanks
@RichBehiel
@RichBehiel Год назад
Thank you! :)