Regarding the problem at 26:28: it would be solved if the matrix U were [1 0; 0 -1] (replace the 1 at the bottom right of the 2x2 identity with -1). This can be found by following the argument that Prof Strang makes in this video: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-rYz83XPxiZo.html (skip to 19:37). The problem is that the columns of U that we found are [1, 0]' and [0, 1]', but they should be [1, 0]' and [0, -1]'. The negative sign in the 2nd column allows the scaling term (sigma) to be strictly positive. Let S = Sigma (for ease of typing). I think the main problem is that the general form A = U*S*V' does not by itself force the diagonal entries of S to be positive. So even though the eigenvalues of A'A and AA' are the squares of the singular values, simply choosing the positive roots is not enough. We also need to choose the right sign for the eigenvector that goes with each positive root. E.g. [1,0] and [-1,0] are both unit eigenvectors for the same eigenvalue, so we have to decide which to use. Hence, we need to check the cases and manually negate vectors in U or V so that S can be positive. However, if we follow what Prof Strang does in the video linked earlier in this comment, then this is accounted for by the computation.
I think the main reason is that, when choosing u2 and v2, it is also important to make sure A*v2 = sigma2*u2. It is easy to verify that if u2 = [0,1], this equation does NOT hold: the two sides differ by a sign. It is very important to keep in mind that u2 should be computed as u2 = (1/sigma2)*A*v2, which yields 'the' particular unit eigenvector of A*A^t that matches v2.
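A quick numerical sketch of this recipe, in plain Python. It assumes the first example's matrix A = [4 4; -3 3] and the unit eigenvectors v1 = [1, 1]/sqrt(2), v2 = [1, -1]/sqrt(2) of A'A quoted elsewhere in this thread; the helper names `matvec` and `norm` are made up for illustration.

```python
import math

# Matrix from the first example, as quoted in this thread.
A = [[4.0, 4.0], [-3.0, 3.0]]

def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def norm(x):
    """Euclidean length of x."""
    return math.sqrt(sum(xi * xi for xi in x))

s = 1 / math.sqrt(2)
V_cols = [[s, s], [s, -s]]  # assumed unit eigenvectors v1, v2 of A'A

sigmas = []
U_cols = []
for v in V_cols:
    Av = matvec(A, v)
    sigma = norm(Av)                        # sigma_i = |A v_i|, positive by construction
    sigmas.append(sigma)
    U_cols.append([c / sigma for c in Av])  # u_i = A v_i / sigma_i

print(sigmas)  # approximately sqrt(32) and sqrt(18)
print(U_cols)  # u1 is [1, 0]; u2 comes out as [0, -1], not [0, 1]
```

Computing u this way never requires a separate sign decision, because the sign is inherited from A*v.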
Douglas is correct. I mean, yes, it can be solved by 'calculating' like SulinWang said, but the essence is that [the sigmas don't have to be positive, it's just that WE CHOOSE them to be positive]. [u2 / sigma2 / v2] - ANY one of them can carry the sign. It's just our choice, like -p=q*r and p=-q*r and p=q*-r are all equivalent. (The order of the columns and sigmas is also our choice.) @babyboo: it's just the same thing, as mentioned above.
After three or more years of studying linear algebra, I finally understood the fundamental meaning of SVD ... really took me a long long way to get here... Thank you Dr. Strang.
For anyone watching this who doesn't feel like they are completely getting the concept, Prof Strang has an updated lecture in 18.065 on the SVD, which I think lays it out in a cleaner way: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-rYz83XPxiZo.html
Also, I found it very helpful to first go through the Chapter 7.1 - Image Processing by Linear Algebra, from the Introduction to Linear Algebra by Strang, Fifth Edition. At first, I could not catch on with the lecture even though I have seen most of his past lectures and read the book. 7.1 really helped me.
The reason u does not agree with the equation is that, after sigma and V have been determined, we no longer have any freedom in constructing u. The u obtained from AA^T is still a valid eigenvector, but it requires a different choice of sigma and V to accommodate it. U has to equal [1 0; 0 -1] to agree with this sigma and V pair.
Note1) (26:02) It can be solved by 'calculating' u2 = 1/sigma2*A*v2, but the essence is that [the sigmas don't have to be positive, it's just that WE CHOOSE them to be positive]. [u2 / sigma2 / v2] - ANY one of them can carry the sign. It's just our choice, like -p=q*r and p=-q*r and p=q*-r are all equivalent. (Of course the order of the columns and sigmas is our choice as well - that's how the sigmas end up ordered like that.) Note2) In the 2nd example, the professor doesn't do the SVD in the typical general way, but uses the fact that A is a rank 1 matrix (we don't have to find U, V in the typical SVD way because we have _no choice_ since A's rank is 1). Related to this, it would be helpful to think about the "4 fundamental subspaces of linear algebra". Also in the 2nd example, the sign (+/-) could have been a problem again - we can't simply say that sigma is the positive root of 125. If we want sigma to be positive, we should adjust the sign of U or V accordingly. Note3) (15:43) A_t*A is positive "semi"-definite, in general. If all columns of A are independent (for example, if A is non-singular), _then_ A_t*A is positive definite.
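I don't suppose every MIT lecture is like this, but I am stunned to watch a professor of around 80 lay out the thoughts in his head so transparently, and deliver material he must already have taught hundreds of times as vividly and passionately as if he were explaining it to students for the first time. This may be an overgeneralization, and some may dismiss it as a narrow and unfair judgment, but if I may pluck up the courage to grumble: it makes me doubt whether Korean universities even fall within the definition of a university. I would like to show this lecture to the many would-be scholars in this country who, once past 60, act as if they were great masters, treat even their lectures with detachment, look only to the government or Samsung, and cause their students a great deal of suffering.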
I love how confident and spontaneous Professor Strang is. He is not afraid to make mistakes in front of his students and label them as “to finish” examples for later examination as he moves forward and tries another example in hope of “doing it right” on the second try. His hesitation, which is a result of high brain activity, may confuse you sometimes and throw you off on a tangent, but it emphasizes the need for continuous examination and pondering while you are presenting the material. To the naive critic, he could have prepared better for his lectures to have smoother sailing, but he seems to present himself on purpose like a student who is doing it for the first time, which is really quite sly and entertaining if you don’t find it confusing. For that reason you may have to rewind his lectures and watch them a second or even third time to fully appreciate his style. Thanks a bunch for having such a brilliant instructor with decades of teaching and research experience!
I just have to comment about this: I love how Prof Strang has this hierarchy of good matrices and not so good matrices and superior ways of decomposing them and so on. It's like every lecture he introduces another type of matrix and goes "oh yeah and this type is a really good one. I mean, it's a special case of this other type of matrix, but this one is better than the rest, it's wonderful"
Professor Strang mentions that U and V form bases for the four fundamental subspaces of A, but it's not clear to me how C(A) = C(Ur) and C(A') = C (Vr'). I know that U and V were determined by the eigenvectors of AA' and A'A, respectively, but how are these related to the column and row spaces of A?
This is how I see it. Suppose A is m by n, which means that the row vectors are n-dimensional. When we write Ax = b, x is also an n-dimensional vector, and it lives either in the row space of A or in the null space of A, if it exists (Fundamental Theorem of Linear Algebra). On the other hand, b lives in the column space of A (or the left null space of A, if it exists). So, when we start from A*v = sigma*u (in matrix form: A = U * Sigma * V'), v is in the row space, and u is in the column space. Then come the clever steps to determine u and v using A'A and AA'. Does that help?
@@szymontuzel8182 It's a nice way to see it, but your argument is not fully correct. x need not lie in the row space or the null space of A. The Fundamental Theorem of Linear Algebra states that dimension of image + dimension of kernel = dimension of the vector space (the rank-nullity theorem). What is correct is that the kernel is the orthogonal complement of the row space, which implies that any vector x can be written as v + w (v in the row space and w in the kernel). Now Ax = Av + Aw = Av. So you can think of it as only the row space contributing to the column space (indeed A maps the row space onto the column space as an isomorphism, since both are vector spaces of equal dimension), and hence you only care about basis vectors of the row space and extend them with a basis of the kernel.
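A small sketch of the split x = v + w in plain Python. The rank-1 matrix A = [4 3; 8 6], the test vector x, and the helper names are assumptions chosen for illustration; r and n are unit vectors spanning the row space and the null space of that matrix.

```python
# Hypothetical rank-1 matrix chosen for illustration: row 2 is twice row 1.
A = [[4.0, 3.0], [8.0, 6.0]]

r = [0.8, 0.6]    # unit vector spanning the row space, parallel to [4, 3]
n = [0.6, -0.8]   # unit vector spanning the null space: 4*0.6 + 3*(-0.8) = 0

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def matvec(M, x):
    return [dot(row, x) for row in M]

x = [2.0, 1.0]                       # arbitrary vector, in neither subspace
v = [dot(x, r) * ri for ri in r]     # component of x in the row space
w = [dot(x, n) * ni for ni in n]     # component of x in the null space

Ax = matvec(A, x)
Av = matvec(A, v)
Aw = matvec(A, w)

print(v, w)    # v + w reproduces x up to rounding
print(Ax, Av)  # the two agree up to rounding
print(Aw)      # zero up to rounding
```

Only the row space component v reaches the column space; the null space component w is sent to zero.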
There are a lot of moments in this video that show why it would be better if the cameraman could follow the subject matter: the camera is left showing the wrong part of the blackboard.
The problem with that calculation is that, given the eigenvalues, the choice of orthonormal eigenvector matrix is not unique (we can multiply a column by -1!). If we choose U not as the identity but as [[1,0],[0,-1]], the calculation will be alright.
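To see the two choices side by side, here is a sketch in plain Python that multiplies out U * Sigma * V' for both candidates. It assumes sigma1 = sqrt(32), sigma2 = sqrt(18), v1 = [1, 1]/sqrt(2) and v2 = [1, -1]/sqrt(2) from the first example; `matmul` and `reconstruct` are illustrative helper names.

```python
import math

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

s = 1 / math.sqrt(2)
Sigma = [[math.sqrt(32), 0.0], [0.0, math.sqrt(18)]]
V_T = [[s, s], [s, -s]]                 # rows of V' are the assumed v1 and v2

U_identity = [[1.0, 0.0], [0.0, 1.0]]   # the choice that went wrong
U_flipped = [[1.0, 0.0], [0.0, -1.0]]   # second column negated

def reconstruct(U):
    return matmul(matmul(U, Sigma), V_T)

A_from_identity = reconstruct(U_identity)
A_from_flipped = reconstruct(U_flipped)

print(A_from_identity)  # close to [[4, 4], [3, -3]]: second row has the wrong sign
print(A_from_flipped)   # close to [[4, 4], [-3, 3]]: the original A
```

Both matrices U are orthonormal eigenvector matrices of AA', so only the multiplication reveals which one pairs with this V.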
But it should've worked. I also get this weird result when I calculate it myself. Math should work independently of the choices you make, if those choices are valid.
In video 32 at 31:38 he says “a sign went wrong...” and explains it. It takes about 3 minutes. If you work it out (he doesn’t), the eigenvectors for U are [1;0] and [0;-1] (where originally he got [1;0] and [0;1]).
Looks like on this particular day the camera guy was lost in thought a few times. Several times the camera didn't follow the teacher immediately.
There are infinitely many possible sets of eigenvectors. But in this problem, we know from the definition that sigma × u = A v, so we have to choose u according to the v chosen. In this case u2 = [0, 1] does not correspond to the v2 chosen. Solving sigma2 × u2 = A v2 gives u2 = [0, -1].
For the issue at 26:38, basically the conclusion that the eigenvectors of A^TA and AA^T form the columns of V and U is a necessary but not a sufficient condition. This is because we "squared" the matrix A to form AA^T and A^TA, and the signs cancel in the squaring. That is why we have to go back to the "linear form" AV = U\Sigma in order to determine the signs correctly. The second example at 34:48 has no such problem because the null vectors multiply the 0 in \Sigma and hence have no effect.
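The same point in symbols, as a sketch for a single triple (u, \sigma, v) with \sigma > 0: the "squared" equation is satisfied by both u and -u, while the linear equation fixes u once v has been chosen.

```latex
\begin{aligned}
AA^{T}u &= \sigma^{2}u \;\Longrightarrow\; AA^{T}(-u) = \sigma^{2}(-u), \\
Av &= \sigma u,\ \sigma > 0 \;\Longrightarrow\; u = \frac{1}{\sigma}\,Av .
\end{aligned}
```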
@38:51, if A is a matrix of m x n, (m rows and n columns) Each row vector is in Rn space. There are m rows. Each column vector is in Rm space. There are n columns. Then shouldn't Vi be from v1 to vm ? and not vn? Since there are m rows?
Nice lecture. I watched it over and over again, but I still don't get it completely. When he finds the eigenvalues and eigenvectors of A'A, which is symmetric positive definite, how does he carry them over to A, which isn't symmetric positive definite? (A' is A transpose.)
Can we solve the problem by setting the n-th entry of the n-th eigenvector to 1, so that we get [1,1] and [-1,1] in the first example? I tried another 2x2 case and it seems to work, but this is just a naive idea and I can't prove it.
AA' isn't always positive definite. It's PD only when the rows of A are independent (for a square A, that means A is invertible). It *is* always positive-semi-definite though, but that's different. Mentioning it because I think he said it's positive definite at one point.
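A one-line derivation of why, written in LaTeX with A^T for A' and x any vector of matching size:

```latex
x^{T}\left(AA^{T}\right)x = \left(A^{T}x\right)^{T}\left(A^{T}x\right) = \lVert A^{T}x \rVert^{2} \ge 0,
\qquad \text{with equality exactly when } A^{T}x = 0 .
```

So AA' is positive definite precisely when A'x = 0 forces x = 0, and only semi-definite otherwise.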
The u_2 problem in Example 1 is about the sign of the scaling factor. Dr. Strang said that all the scaling factors should be larger than 0. But if you set u_2 to [0;1], as in this lecture, a sign issue arises:
Av_2 = [ 4 4 ; -3 3 ] [ 1/sqrt(2) ; -1/sqrt(2) ] = [ 0 ; -sqrt(18) ] = -sqrt(18) [ 0 ; 1 ]  Incorrect -> sigma can't be negative.
If we keep sigma_2 positive instead:
Av_2 = [ 4 4 ; -3 3 ] [ 1/sqrt(2) ; -1/sqrt(2) ] = [ 0 ; -sqrt(18) ] = sqrt(18) [ 0 ; -1 ]  Correct!
So the u_2 we get is [ 0 ; -1 ].
1st Example: U = [[0, 1], [1, 0]], Sigma = [[sqrt(18), 0], [0, sqrt(32)]], V = V^T = [[-sqrt(2)/2, sqrt(2)/2], [sqrt(2)/2, sqrt(2)/2]], where [a, b] is interpreted as a row vector. A = [[4, 4], [-3, 3]] = U * Sigma * V^T .... try it. We know what we're doing, but we're not computers...we make little errors once in a while that we can't track successfully. The core concept was brilliantly delivered, and Professor Strang did an exceptional job making us understand SVD. I didn't bother to find his computational error ☺️
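Taking up the "try it": a sketch in plain Python that multiplies out these three matrices exactly as given in the comment above. The helper `matmul` is an illustrative name, not from the lecture.

```python
import math

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

h = math.sqrt(2) / 2
U = [[0.0, 1.0], [1.0, 0.0]]
Sigma = [[math.sqrt(18), 0.0], [0.0, math.sqrt(32)]]
V = [[-h, h], [h, h]]
V_T = [list(col) for col in zip(*V)]  # transpose; equal to V here since V is symmetric

product = matmul(matmul(U, Sigma), V_T)
print(product)  # close to [[4, 4], [-3, 3]]
```

Note that this factorization lists the singular values in increasing order (sqrt(18) before sqrt(32)), which is the reverse of the usual convention but equally valid as a product.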
Trust us... this isn't the worst camera man in the history of MIT OpenCourseWare... we've seen worse. If we are lucky, we can hide it by cutting to the slides or a still frame.
@@sunritroykarmakar4406 Thanks for replying! However, it's not a positive definite matrix up to that point. A is not necessarily a positive definite matrix in SVD. If so, why are sigma1 and sigma2 still > 0?
This lecture is gonna be a little ambiguous at first BUT once you have a firm grip on the previous lectures from 18.06, you’ll surprisingly realise it is the most beautiful lecture on SVD available in all of RU-vid.
Do you care to give a little explanation of why SVD is so important? My main interests for applications of linear algebra are quantum physics and number theory, but I'm not really sure why this subject is so important. It feels silly, because I was following the course up until this lecture, and I see a lot of people claiming it is the best one yet. Prof Strang even says it is the "climax of linear algebra" in his previous lecture...