[7:54] Announcing that we need to look at when the Hessian is equal to zero appears to come out of nowhere. This could use some discussion of why this is the case.
You just set it to zero in order to find r1 and r2 because these roots lead to a simpler expression for the roots of the original cubic. There must be a theory as to why this happens but this is probably way beyond the scope of this lecture.
@@rob876 Right, it just seems to be convenient for the problem to look at this other expression, maybe just announce that's how it is, then it becomes clear why we do it (before the reveal and deduction (you'll have to think about what was the connection, no connection?) later)
Exciting traditional algebra. One not-so-obvious next step is when all 3 solutions are real: 2 now appear as sums of complex conjugates, not in real form. An alternative I looked at as a student secured all real solutions (I found out this year it was well known to Viète, as the French call him). The trick is to make an affine change so the equation has no square term (Cardano saw that early on) and so that the p coefficient of x in x^3+px+q=0 becomes +/-3/4 by an adequate change of scale.
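A minimal sketch of the rescaling described in the comment above, assuming the cubic is already in the form x^3 + px + q = 0 with p < 0 and three real roots. The function name `trig_real_roots` is hypothetical. Setting x = s*t with s = sqrt(-4p/3) turns the coefficient of t into -3/4, so the equation becomes 4t^3 - 3t = -4q/s^3, and t = cos(theta) reduces it to cos(3*theta) = -4q/s^3.

```python
import math

def trig_real_roots(p, q):
    """Real roots of x^3 + p*x + q = 0, assuming p < 0 and three real roots."""
    s = math.sqrt(-4.0 * p / 3.0)      # scale x = s*t so that p/s^2 = -3/4
    rhs = -4.0 * q / s**3              # 4t^3 - 3t = rhs, i.e. cos(3*theta) = rhs
    if abs(rhs) > 1.0:
        raise ValueError("not the three-real-root case")
    theta = math.acos(rhs) / 3.0
    # The three angles theta + 2*pi*k/3 all have the same cos(3*theta).
    return [s * math.cos(theta + 2.0 * math.pi * k / 3.0) for k in range(3)]

# Example: x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3)
example_roots = sorted(trig_real_roots(-7.0, 6.0))
```

No complex numbers appear at any point, which seems to be the advantage the comment is pointing at.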
24:21 Honestly, that describes what all the video felt like. Bunch of algebra that works at the end, but hard to see any of it as meaningful or interesting in its own right
Apparently, to reduce a cubic to have no quadratic term, the substitution can be found by finding the unique inflection point and translating the polynomial so that point is at 0. And you can do some extra stuff using the half-turn symmetry at that point to derive the z - p/(3z) substitution that yields the values of x from quadratic solutions
@@gerardvanwilgen9917 Most "unfriendly" approaches to the cubic are honestly just bad approaches, unless they are historically justified (like the trigonometric solutions to the casus irreducibilis which avoid cancelling complex roots)
Here's my take as someone that has taught/tutored at elementary school level (though most of my experience is high school and early undergrad math like calculus, diff eq, and linear algebra): common core's main benefit is standardization. It's painful to admit, but many schools left to their own devices will either teach to no curriculum or will teach poorly to one (and that's why the rise of EdTech is simultaneously good and predatory--edtech companies seek out failing school districts to sell them a product that aligns schools to these standards). State standards (Florida, Virginia, etc.) are hit or miss but more importantly inconsistent. A good standardized curriculum doesn't mean we'll suddenly be enriching enough to the brightest students, but it does get one step closer to avoiding the long standing issue of often less-privileged students struggling in math also giving up on math and, in the worst cases, dropping out of school. That being said, common core's means to create standard uniformity has created some media worthy memes and arguably done damage to "intuitions" that should be developed at a young age. For example, I was sent a tiktok of a teacher explaining how to do the area of a 14 by 15 (something like this) rectangle by the "area method," which ends up being this contrived setup that is basically foiling (10 + 4)(10 + 5). Of course, elementary school students can't regularly expand binomials.
@@garrettthompson3286 Expanding out 14x15 with the area method may not be a good method of computation, but I don't like the idea of telling people to just use the multiplication algorithm with no sense of why it works. It demonstrates the distributive property visually, and it establishes that we can write the product in terms of a sum of various multiples of powers of 10 with the coefficients depending on the digits in the numbers. The multiplication algorithm is obviously a more efficient way of doing this, but you won't appreciate it unless you understand how the distributive property and the base 10 number system are enabling it. Given that people can almost always have calculators available, the most important thing about learning any computational methods is to understand the ideas underneath, and so we should present ones where clever shortcuts are avoided so that the main idea is clearest.
@@garrettthompson3286 "A teacher on tiktok did X" doesn't demonstrate the standards or their shortcomings. As for the example: 14*15 = 7*30 ⇒ [7*3 = 21 and tack the 0 back on the end] for 210.
@@Keithfert490 But he can only write P and Q in terms of the roots of that quadratic polynomial when the Hessian is zero. If the Hessian is not zero, what are r1 and r2 supposed to be?!
@@bjornfeuerbacher5514 what i know from Khan Academy video, if Hessian is not equal to zero, then r1 and r2 will both be either saddle point or maximum/minimum value
The wiki page is not very helpful. I felt like it created more questions than answers. From en.wikipedia.org/wiki/Hessian_matrix#Applications: "If f is a homogeneous polynomial in three variables, the equation f=0 is the implicit equation of a plane projective curve. The inflection points of the curve are exactly the non-singular points where the Hessian determinant is zero. It follows by Bézout's theorem that a cubic plane curve has at most 9 inflection points, since the Hessian determinant is a polynomial of degree 3."
The roots of H do not "correspond" to the roots of the reduced cubic, they are related with the roots of the reduced cubic by some formulas Michael Penn showed us.
I like how you give references; not everybody does that. And if you go through those old journals, you'll find lots of interesting math (as you obviously know).
Hmm, the cbrt(r1*r2) = -cbrt(P), because the quadratic equation is z² + (Q/P)z - P = 0. Moreover, if we write down r1 and r2 explicitly, then we'll get after simplification: z = -A + cbrt(-Q/2 + P*sqrt(D/4)) + cbrt(-Q/2 - P*sqrt(D/4)), where D/4 = Q²/(4P²) + P = (1/P²)(Q²/4 + P³). So, in fact, this is "Cardano's" formula.
Recently I have investigated methods for solving third-degree equations and found on French Wikiversity an approach called "Méthode de Sotta". Almost nothing is known about Sotta himself: his first name is allegedly Bernard and he is a mathematician from Marseille, but I haven't found anything else about him.
So, the main idea for solving the cubic equation ax³ + bx² + cx + d = 0 (1) is to construct the quadratic equation Ay² + By + C = 0, where
A = b² - 3ac,
B = bc - 9ad,
C = c² - 3bd.
Let's denote by r1 and r2 the roots of Ay² + By + C = 0 (here we consider A≠0 and B²-4AC ≠ 0, so r1≠r2). Then the solutions of (1) are given by the formula below:
x = (r1 - w^j*cbrt(gamma)*r2)/(1 - w^j*cbrt(gamma)), j = 0,1,2, (2)
where w = exp(i*2pi/3), gamma = (b + 3a*r1)/(b + 3a*r2). Note that gamma≠1 since r1≠r2, so the denominator of (2) is never equal to 0 in this case.
The derivation of the formula is based on the linear recurrence relation Ay_n = -By_(n-1) - Cy_(n-2) with y0 = a, y1 = -b/3, y2 = c/3, y3 = -d. If we know the general formula for y_n: y_n = k1*r1^n + k2*r2^n, where k1 and k2 are constants, then (1) can be rewritten as
k1(x - r1)³ + k2(x - r2)³ = 0 (3)
From (3) we get x = (r1 - w^j * cbrt(gamma) * r2)/(1 - w^j * cbrt(gamma)), where gamma = -k2/k1. Finding the constants k1 and k2 from y0 = a and y1 = -b/3 will finally lead us to (2). The formula for the case B²-4AC = 0 (i.e. r1 = r2) is derived in the same way. Note that the case A = b² - 3ac = 0 gives a complete cube: ax³ + bx² + cx + d = a(x + b/(3a))³ - b³/(27a²) + d, and thus the solution is found pretty easily.
After this whole long journey through formulae one may ask: why is this method better than Cardano's? Well, here is at least one advantage. In Cardano's formula you have to calculate cube roots of arbitrary complex numbers in the case of three real roots. However, in Sotta's formula there is only one complex number under the cube root (which is kind of Viète's trick), and moreover, its modulus is always equal to 1! Indeed,
|gamma| = |b + 3a*r1| / |b + 3a*r2| = /* r1 = z, r2 = bar(z) */ = |(b + 3a*Re(z)) + i*3a*Im(z)| / |(b + 3a*Re(z)) - i*3a*Im(z)| = 1.
In fact, this allows us to perform the computations in terms of Euler's formula: exp(i*phi) = cos(phi) + i*sin(phi).
After all the simplifications I got the final formula for a real root of the cubic equation in the 3-root case:
x = Re(r) - Im(r)*ctg(phi/6),
where r is a complex root of Ay² + By + C = 0 and gamma = exp(i*phi) = (b + 3a*r) / (b + 3a*bar(r)).
I guess the other two roots are found in the same way, but now I'm too tired to write out all these computations)
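A minimal numerical sketch of formula (2) and of the ctg form from the comment above. It is only an illustration: the function names `sotta_roots` and `sotta_real_root_trig` are hypothetical, the degenerate cases A = 0 and r1 = r2 are not handled, and the trigonometric helper assumes B² - 4AC < 0, which is the situation where the comment takes r1 and r2 to be complex conjugates.

```python
import cmath
import math

def sotta_roots(a, b, c, d):
    """All three roots of a*x^3 + b*x^2 + c*x + d = 0 via formula (2)."""
    A = b * b - 3 * a * c
    B = b * c - 9 * a * d
    C = c * c - 3 * b * d
    disc = B * B - 4 * A * C
    if A == 0 or disc == 0:
        raise ValueError("degenerate case (A = 0 or r1 = r2) not handled here")
    sq = cmath.sqrt(disc)
    r1 = (-B + sq) / (2 * A)
    r2 = (-B - sq) / (2 * A)
    gamma = (b + 3 * a * r1) / (b + 3 * a * r2)
    t0 = gamma ** (1.0 / 3.0)            # principal cube root of gamma
    w = cmath.exp(2j * cmath.pi / 3)     # primitive cube root of unity
    return [(r1 - w**j * t0 * r2) / (1 - w**j * t0) for j in range(3)]

def sotta_real_root_trig(a, b, c, d):
    """One real root via x = Re(r) - Im(r)*cot(phi/6), assuming B^2 - 4AC < 0."""
    A = b * b - 3 * a * c
    B = b * c - 9 * a * d
    C = c * c - 3 * b * d
    disc = B * B - 4 * A * C
    if A == 0 or disc >= 0:
        raise ValueError("this sketch needs A != 0 and B^2 - 4AC < 0")
    r = (-B + cmath.sqrt(disc)) / (2 * A)
    gamma = (b + 3 * a * r) / (b + 3 * a * r.conjugate())
    phi = cmath.phase(gamma)             # gamma = exp(i*phi), |gamma| = 1
    return r.real - r.imag * math.cos(phi / 6) / math.sin(phi / 6)

# Example: x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
roots = sotta_roots(1, -6, 11, -6)
x_trig = sotta_real_root_trig(1, -6, 11, -6)
```

For this example A = 3, B = -12, C = 13, so r = 2 ± i/sqrt(3) and gamma = -1, which has modulus 1 as claimed.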
Here's another method I've seen: Divide by a. Substitute x = k - b/3 (with b now the coefficient after dividing by a), expand in k, and combine: this gets rid of your quadratic term. Here you could substitute into the stereotypical cubic formula, but I'll keep going: Substitute k = z - p/(3z) (stage 2 cubic: k³+pk+q=0), then combine and solve a quadratic in z³. Back-substitute z into k and then x, and there you have it!
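The steps in the comment above can be sketched as follows. This is an illustration under stated assumptions, not a robust solver: the function name `cubic_roots_vieta` is hypothetical, and the coefficients are assumed to be ordinary floats. After k = z - p/(3z), the depressed cubic k³ + pk + q = 0 becomes z⁶ + q*z³ - p³/27 = 0, a quadratic in z³.

```python
import cmath

def cubic_roots_vieta(a, b, c, d):
    """Roots of a*x^3 + b*x^2 + c*x + d = 0 via x = k - b/3, k = z - p/(3z)."""
    b, c, d = b / a, c / a, d / a                # divide by a
    p = c - b * b / 3                            # depressed cubic k^3 + p*k + q = 0
    q = 2 * b**3 / 27 - b * c / 3 + d
    sq = cmath.sqrt(q * q / 4 + p**3 / 27)
    # Either root of the quadratic in z^3 works; take the larger one to avoid z = 0.
    z3 = max(-q / 2 + sq, -q / 2 - sq, key=abs)
    if z3 == 0:                                  # p = q = 0: triple root
        return [-b / 3] * 3
    z0 = z3 ** (1.0 / 3.0)                       # principal cube root
    w = cmath.exp(2j * cmath.pi / 3)
    roots = []
    for j in range(3):
        z = z0 * w**j                            # the three cube roots of z3
        k = z - p / (3 * z)                      # back-substitute z into k
        roots.append(k - b / 3)                  # and k into x
    return roots

# Example: x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
vieta_roots = cubic_roots_vieta(1.0, -6.0, 11.0, -6.0)
```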
For the reduced cubic, z^3 + 3Pz + Q = 0, write z = u - P/u to obtain a quadratic equation for v = u^3: v^2 + Qv - P^3 = 0. The two roots of this equation satisfy v_1 * v_2 = (-P)^3, so we may write z = v_1^(1/3) + v_2^(1/3). For the example the quadratic becomes v^2 + 12v = 8, leading to z = (sqrt(44)-6)^(1/3) - (sqrt(44)+6)^(1/3) = -1.4702... Which demonstrates that explicit solutions of cubic equations are mostly useless!
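A quick numerical check of the recipe in the comment above, as a sketch. The helper names `real_cbrt` and `reduced_cubic_real_root` are hypothetical, and the code assumes Q²/4 + P³ >= 0 so that both v_1 and v_2 are real. The example v^2 + 12v = 8 corresponds to P = 2, Q = 12, i.e. z^3 + 6z + 12 = 0.

```python
import math

def real_cbrt(v):
    """Real cube root, valid for negative v as well."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def reduced_cubic_real_root(P, Q):
    """One real root of z^3 + 3*P*z + Q = 0, assuming Q^2/4 + P^3 >= 0."""
    disc = Q * Q / 4 + P**3
    if disc < 0:
        raise ValueError("three real roots: v_1 and v_2 are complex here")
    v1 = -Q / 2 + math.sqrt(disc)    # roots of v^2 + Q*v - P^3 = 0
    v2 = -Q / 2 - math.sqrt(disc)
    return real_cbrt(v1) + real_cbrt(v2)

z = reduced_cubic_real_root(2.0, 12.0)
residual = z**3 + 6 * z + 12         # should be close to 0
```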
When the cubic polynomial is written in the depressed form (what can always be done) x^3+ax+b, its roots are: x = sqrt(0.5aWq(2b^2/(b^3))) where Wq is the Lambert-Tsallis function with the q parameter equal to 1/2. Wq(x) is a multivalued function, so the same formula represents all the roots. One can find more examples of Wq here: "Analytical solutions of cubic and quintic polynomials in micro and nanoelectronics using the Lambert-Tsallis Wq function".
Dear @@MarcoMate87 , Thanks for your question. The correct formula is x = sqrt(0.5aWq(2b^2/(a^3))). So, there is no simplification. I posted the correct formula too as soon as I noticed my mistake.
The derivation seems more complicated than the usual one. Maybe if there was a fully worked example, including finding the quadratic equation, I could judge whether it was easier.
Nice, but certainly not more "friendly" than just using the u+v substitution... PS: and for numerical stability, just like with the u+v substitution, it is better to express the solution closest to zero not in this way (since there can be loss of accuracy by subtractions!). Keep only the largest two and re-express the third as w₃ = -C/(w₁ w₂), using the fact that their product is -C.
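A small sketch of the re-expression suggested in the comment above, assuming a monic cubic w³ + A*w² + B*w + C whose three roots have already been computed approximately. The function name `refine_smallest_root` is hypothetical. It keeps the two roots of largest modulus and recomputes the third from the product of the roots, which equals -C.

```python
def refine_smallest_root(roots, C):
    """Replace the smallest-modulus root by -C / (w1 * w2)."""
    w1, w2, _ = sorted(roots, key=abs, reverse=True)   # keep the two largest
    return [w1, w2, -C / (w1 * w2)]

# Example: w^3 - 6w^2 + 11w - 6 has roots 3, 2, 1 and C = -6;
# the slightly inaccurate smallest root is replaced by 6 / (3 * 2).
refined = refine_smallest_root([1.0000001, 3.0, 2.0], -6.0)
```

The division involves no subtraction of nearly equal quantities, which is the point of the suggestion.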
in 26:00 you multiplied 3 by -2, but forgot to multiply sqrt(11) by -2, so the proper answer is -1+cbrt(6+2 sqrt(11))+cbrt(6-2 sqrt(11)) verified with calculator
I find this a very interesting approach; but still I do not think it is easier or friendlier than reducing the cubic and then, in your notation, sub in z = u + v and demanding u*v + p = 0.
Interesting, there was quite a bit of work done on solving cubics in a practical manner in the 20th century that seems to have largely been forgotten with the rise of numerical methods and modern computers. You should check out 'Solution of Cubic and Quartic Equations' by S. Neumark, which uses a trick to allow the solving of cubics using hyperbolic functions. He actually used early computers to create tables, included in the book, to allow for quick solutions, but even without these he presents a method that allows cubics (and quartics) to be fairly quickly solved just with tables of trigonometric and hyperbolic functions.
Wonder if that is the same method described in the Math CRC handbooks? They give an algorithm (easily programmed these days) that is about half a page long and about as computationally difficult as Michael's, but then, after that, have two pages on the "trigonometric solution".
@@user-gs6lp9ko1c It's possibly related, I don't have the book (or one of the Math CRC handbooks, for that matter) in front of me at the moment and I don't want to try to give the exact formula off the top of my head as I'd likely make a mistake (and the solutions are broken down into three cases anyway), but it is at least related to the 'Trigonometric and hyperbolic solutions' method in the Wikipedia 'Cubic Equation' article; however, it gives all the roots, not only the real roots, the computation is substantially simplified by pre-calculating a delta value directly from the coefficients that, if I recall properly, removes the need to actually reduce the cubic to its depressed form, and the method is expanded to quartics as well, though I believe that might require some algebraic manipulation if I'm not mistaken. I would say the computation is simpler than the method Michael gave in this video and the derivation is certainly much simpler. It's obviously built on earlier work, from Viete onwards, but it is much refined and I bring it up because it is the most streamlined and elegant solution to cubics and quartics I have encountered (even without the very nice tables relating delta values and roots he created and puts at the end of the book). I'll try to remember to look up and post the specifics when I get home this evening.
I got z = ³√[2√(11) + 6] - ³√[2√11 - 6], pls excuse the pedantry. I love seeing clever approaches for cubics, but I do not think this is any sort of computational advantage over so-called Cardano's method. I also do not see it as in any way more enlightening, but it belongs in the compendium of solutions.
I understand this notion from a pre-complex-numbers view of mathematics, but why are people so eager to avoid the complex roots of a cubic? It's not black magic, and you can even write it in a way that combines all three solutions into one formula just like the quadratic formula does it. The only difference is a factor of (-1±i√3)/2 before either cube root.
@@Fluorineer standard cubic formula involves adding related conjugates of square-roots of cubic roots. At the very least, it's hard to implement numerically on a computer, let alone compute by hand in closed form. Especially since for complex numbers either you use iterative methods, which may take time to converge(using which you might as well use iterative methods for the cubic), or get the phase using the log, divide by 3 and reconvert using exp, effectively using sin and cos, which brings transcendental functions into the mix. For general complex coefficients, we might need the full formula, but for real coefficients this method is dead easy.
Math CRC handbooks (mine is the 27th edition) have a little cookbook algorithm for solving cubic equations that looks like it takes a similar amount of computation. (This method also includes a few cube root equations.) Straightforward, these days, to write a little program in Matlab or Mathematica, or even Excel. Actually, it was a fun and useful computer science exercise in my high school class in the late 70's. Of course, we didn't have to know how the algorithm works, just that it does. Just found the same method in a 1941 edition of the Math CRC handbook--might have come from the 1935 paper? There are no references in the handbook, and the algorithm is a little different. Glad I didn't grow up in the days when they had to solve these by hand!
That's amazing. This is simply the fastest method to find the roots of a cubic I've ever seen, extremely useful. The only problem is when the cubic has three real roots. In this case, if we apply these formulas, we can't avoid expressing two of them using the (non-real) cube roots of 1. This is the well-known "casus irreducibilis" of Cardano.
Did you know there's a general hypergeometric formula for cubic roots? It might not be as nice as the trigonometric formulas, but certainly an interesting option. We can also express general quintic roots as hypergeometric functions or so I've heard. I don't remember if you had a video about hypergeometric functions, but I think it's a nice topic to cover, as even some STEM students don't know about them
Many years ago Wolfram Research published a poster on solving quintic equations, which I had on my wall for years. Hypergeometric functions are one of the approaches they covered.
If you have the x,y points where the Hessian of f(x,y) is zero you have a nice point where the function will be linear f(x,y)=f(rx,ry)+J(rx,ry)+0+q(x,y) where q is dim 3 wrt x&y. From here you can match polynomial coefficients, which is what I think he spent most of the video doing. 😅
Why do you always insert a mistake in the last line? Do you ever check the result? Or is it intentional, in order to see, if the viewers are still awake? You've messed up the multiplication with -2 = r1*r2...
Curious why Mathematica views r1^1/3 different from cube root of r1 and so on. When I would evaluate the cubic from the example using cube roots it came out different with the roots found vs when I wrote them in terms of r1^1/3, r1^2/3, r2^1/3, and r2^2/3 the cubic correctly evaluated to zero.
@@Milan_Openfeint I was missing that if x < 0 and you use CubeRoot[x] you get (-1)*CubeRoot[-x] while if you do x^(1/3) you get E^(pi*I/3)*CubeRoot[-x], so it's just taking a different 3rd root of -1
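The same distinction shows up outside Mathematica. As a hedged analogue in Python (not Mathematica's own functions), raising a negative number to the power 1/3 in complex arithmetic gives the principal root, exp(i*pi/3) times the real cube root of |x|, while the real cube root has to be built by hand. The names below are illustrative only.

```python
import cmath
import math

x = -8.0

# Principal cube root: the analogue of x^(1/3) for negative x.
principal = complex(x) ** (1.0 / 3.0)

# Real cube root: the analogue of CubeRoot[x], built from |x| and the sign of x.
real_root = math.copysign(abs(x) ** (1.0 / 3.0), x)

# Both cube to x; they differ by which cube root of -1 is chosen.
factor = principal / abs(real_root)      # close to exp(i*pi/3)
```

Mixing the two conventions inside one formula picks inconsistent branches, which would explain a cubic that fails to evaluate to zero.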
The traditional derivation is easier to follow as you said, but the formulas for the solutions showed by Michael Penn are incredibly fast to memorize and apply.
What is more friendly here? And compared to what? There are tons of insightful videos with symmetries of polys explained, which I recommend. Yours I dislike, it is an enormously overinflated bunch of algebra with Hessians etc. (why?). Cardano's original way of doing this is way simpler, and there are even better ways of showing how to complete a cube. Almost as bad as the fluke when deriving Euler-Lagrange eqs.
I really don't see why Michael considers this approach to be "much more friendly"... Ok, the end result looks a bit neater, but the intermediate steps are even more complicated than in the usual approach. And I don't even understand the approach - why exactly does he set the Hessian to zero??? That's the first video by Michael ever where I give a "thumbs down". :/
It's a bit unfortunate he doesn't still have grad students working for him, as an excellent "bit" would be to have different presenters for all the different substitutions done. (one for the w's, one for the z's, etc.)