
Wait, so comparisons in floating point only just KINDA work? What DOES work? 

SimonDev

An introduction to floating point numbers (IEEE 754) and some of the oddities surrounding them.
🛒 Recommended books (on Amazon): www.amazon.com/hz/wishlist/ls...
❤️ Support me on Patreon: / simondevyt
🌍 My Gamedev Courses: simondev.teachable.com/
Disclaimer: Commission is earned from qualifying purchases on Amazon links.
Follow me on:
Twitter: / iced_coffee_dev
Instagram: / beer_and_code
Github: github.com/simondevyoutube/
Some great resources:
docs.oracle.com/cd/E19957-01/...
randomascii.wordpress.com/cat...
Some more great stuff:
en.wikipedia.org/wiki/Floatin...
en.wikipedia.org/wiki/Subnorm...
en.wikipedia.org/wiki/Unit_in...
www.h-schmidt.net/FloatConver...
cowboyprogramming.com/2007/01...

Science

Published: 4 Oct 2021
Comments: 622
@simondev758 · 2 years ago
Btw, please support me for more videos! My Courses: simondev.teachable.com/ Patreon: www.patreon.com/simondevyt
@kittysplode · 1 year ago
this is completely incomprehensible. you don't actually understand how to teach. you're rambling and scribbling things that have literally nothing to do with the data you're presenting. everything in a lesson should help to understand that lesson. this is like explaining something in a loud cafe on a napkin, except you've recorded it. your sheer incompetence at your chosen occupation is admirable.
@MisterDan · 2 years ago
Many years ago when designing the Sheerpower programming language for business applications, we spent a ton of money (over $100K) on this exact problem. We ended up with a data type called "real" with integer and fractional components located in their own memory locations. The hard part was making the runtime performance fast. Once done, it has been enjoyable never worrying about all of the FP pitfalls that you very well explained. In fact, this is the best explanation and clarity I have ever seen! Thank you.
@simondev758 · 2 years ago
Interesting! It sounds a lot like fixed point?
@MisterDan · 2 years ago
@@simondev758 Fixed point using separate memory locations to speed up things like "convert to an integer" where one just clears the fraction part memory location... no calculations required.
@bpark10001 · 1 year ago
This sounds like "integer" format (with number of bits twice the number of bits in your word length) scaled by 2^(-n) where n is the word length. Why not use double-word integers?
@MisterDan · 1 year ago
@@bpark10001 We used two int64s. The use of two memory locations made many frequent operations (truncating numbers, etc) much faster.
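To make the idea concrete, here is a minimal Python sketch of a two-slot "real" value along the lines described above. The names and the decimal scale are illustrative assumptions, not Sheerpower's actual design, and it only handles non-negative values:

```python
from dataclasses import dataclass

SCALE = 10**9  # fractional resolution per unit; an assumed choice

@dataclass
class Real:
    whole: int  # integer component in its own slot
    frac: int   # fractional component, 0 <= frac < SCALE

    def truncate(self) -> "Real":
        # "Convert to integer" just clears the fraction slot, no arithmetic.
        return Real(self.whole, 0)

    def __add__(self, other: "Real") -> "Real":
        carry, frac = divmod(self.frac + other.frac, SCALE)
        return Real(self.whole + other.whole + carry, frac)

print(Real(9, 950_000_000) + Real(0, 50_000_000))  # Real(whole=10, frac=0)
```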
@Un4GivNX · 1 year ago
@@simondev758 The 'decimal' type in C#
@KevinInPhoenix · 1 year ago
In the 70's and 80's we called floating point computer math: "floating point approximation". Someone in marketing dropped the word "approximation" sometime over the years.
@niclash · 3 months ago
When we designed a language for PLC use in 1984, the language didn't have "Compare Equal" for the REAL (floating point) type, but a "Compare Tolerance" with an explicit tolerance argument provided (as described in the video). Many customers were confused at first, until they realized that "measurements" are not exact and need to be treated as approximations everywhere. I was young and inexperienced at the time, but the boss was an old-school veteran in analog computers, sensor technology and much more, so he insisted: "no compare-equals for REALs. It is not possible!"
@0LoneTech · 3 months ago
@@niclash It should typically be two tolerances, one relative and one absolute. The fun part with subnormals is that they have variable relative precision, but their absolute precision remains the minimum available, so with both tolerance checks they don't need special handling.
@idiotsinwhips · 3 months ago
@@0LoneTech Could you explain how both of those numbers would be used?
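One way the two tolerances combine, sketched in Python; the specific cutoff values are arbitrary examples, not universal constants. The relative check scales with the magnitude of the inputs, and the absolute check takes over near zero, where any relative tolerance collapses:

```python
def nearly_equal(a: float, b: float,
                 rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    # Pass if within rel_tol of the larger magnitude, OR within abs_tol outright.
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

print(nearly_equal(0.1 + 0.2, 0.3))  # True: off by ~1 ulp, relative check passes
print(nearly_equal(1.0, 1.1))        # False: ~10% relative difference
print(nearly_equal(1e-15, -1e-15))   # True only thanks to the absolute check
```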
@soundspark · 3 months ago
There was that scandal when one of Intel's processors did the approximation incorrectly (the Pentium FDIV bug).
@Kebabrulle4869 · 1 year ago
My favorite floating point hack is that 7/3 - 4/3 - 1 will always give you machine epsilon. I don't quite remember how, but I found a comment in the depths of stackoverflow that claimed it worked regardless of programming language, OS and computer. As long as it's using the IEEE standard it works.
@simondev758 · 1 year ago
Super cool! I found a reference for it here, problem 3: rstudio-pubs-static.s3.amazonaws.com/13303_daf1916bee714161ac78d3318de808a9.html
@volbla · 1 year ago
Oh, that makes sense! A third is like the perfect middle step between powers of 2, so the mantissa is all ones. But 7/3 has an exponent one greater than 4/3, so it's "missing" a binary digit that's presumably rounded up. The difference between them cancels out everything except 1 and the least significant digit of 4/3, making it 1 + ulp. What a cool trick.
@Kebabrulle4869 · 1 year ago
@@volbla Thanks for the intuition! It makes a lot of sense when you explain it that way.
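The trick is easy to verify in Python, whose floats are IEEE 754 doubles; note that it relies on the default round-to-nearest-even mode:

```python
import sys

x = 7/3 - 4/3 - 1                   # the thirds are inexact, the 1 is exact
print(x)                            # 2.220446049250313e-16
print(x == sys.float_info.epsilon)  # True under round-to-nearest-even
```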
@Insightfill · 1 year ago
I remember that the old Windows 3.1 calculator had a bug where 3.11-3.1 (the two major Windows releases at the time) would equal 0.00. Good times.
@gregorymorse8423 · 1 year ago
It's called rounding. The rounding mode in IEEE defaults to round-to-nearest-even, so your trick only works in some rounding modes. Meaning your condition of "any IEEE implementation" is too loose, and not understanding that the trick lies in rounding is a blunder.
@RPG_Hacker · 1 year ago
My boss recently told me a story of a game he once worked on. If you left it running for about 28 hours or so, all kinds of weird shit would start happening. Like the rendering would break completely, certain things would stop moving etc. The reason was that certain things in the game kept some kind of on-going timer. This was usually a timer of accumulated delta times and in the range of seconds. Turns out that after the amount of time mentioned above, these accumulated timers got so big that a delta time of 1/60 was no longer large enough to affect them in any way, thus they froze entirely. It's basically one of the floating point issues you mentioned in the video. This specific bug never got a proper fix, just a workaround, which was to simply pause the game on inactivity.
@simondev758 · 1 year ago
Absolutely. This is part of the reason developers do soak tests.
@coopergates9680 · 1 year ago
Incremental stuff should always use a char (byte), short, int, or long, and every once in a while it's fine to convert that millisecond or nanosecond figure into a float of seconds. Given that a double has more significant figures than a 32-bit int, if a timer goes far enough to lose this much resolution in a double, it's ticking stupidly far anyway and should be reset or redesigned.
@RPG_Hacker · 1 year ago
@@coopergates9680 Yeah, switching to integers and using millisecond delta times in general is one of the proposals my boss had to fix this problem for good. Just tedious and dangerous to do in an already existing game, so it's something we'll likely be doing for future games.
@vadiks20032 · 1 year ago
how do you normally fix it?
@hartmutbraun6712 · 1 year ago
Since the time interval is constant (1/60) you should use fixed point instead of floating point: use integers and count the number of 1/60ths of a second, i.e. the least significant bit is interpreted as 1/60 of a second. With a 32-bit unsigned integer you can then run it for 828 days (add more bits if needed!)
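The freeze is easy to reproduce with 32-bit floats; the numbers below are illustrative, not taken from the game in the story. Once the accumulated time reaches about 2^19 seconds (~6 days), a 1/60 s delta falls below half an ulp and vanishes, while integer ticks keep counting exactly:

```python
import numpy as np

dt = np.float32(1/60)
t = np.float32(524288.0)   # 2**19 seconds accumulated in float32
print(t + dt == t)         # True: the ulp here is 0.0625, so dt rounds away

# The integer-tick fix: count 1/60ths of a second exactly.
ticks = 524288 * 60
ticks += 1                 # one frame always advances the clock
print(ticks / 60.0)        # convert to seconds only for display
```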
@petrie911 · 1 year ago
My "favorite" thing about floats is that float operations are nonassociative. That is, (a + b) + c need not equal a + (b + c), and the same goes for multiplication.
@Niohimself · 1 year ago
My rule of thumb when programming with floating point numbers is to just assume that two floating point numbers are never equal. The only time an FP is equal to another FP is when they were obtained by copying. FPs can be compared as "less than" or "greater than" as a sort of "inside/outside" check, with the "equals" case implicitly bundled into either one of those two.
@sackboy1665 · 1 year ago
@@piisfun
@dylangergutierrez · 1 year ago
It's good never to rely on them being equal, but it doesn't solve all your problems. Like, 50 billion and one is bigger than 50 billion, but if X=50,000,000,000 and Y=50,000,000,001, then Y>X will return false.
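Whether this bites depends on the width: in a 32-bit float the +1 is far below the ulp near 50 billion (about 4096 there), while a 64-bit double still represents both values exactly. A quick check with NumPy:

```python
import numpy as np

x = np.float32(50_000_000_000.0)
y = x + np.float32(1.0)  # the +1 is swallowed: ulp at 5e10 is 4096 in float32
print(y > x, y == x)     # False True

# In doubles, 5e10 is far below 2**53, so the +1 survives:
print(50_000_000_000.0 + 1.0 > 50_000_000_000.0)  # True
```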
@coopergates9680 · 1 year ago
@@dylangergutierrez Hence it's a percent error issue, it's more like abs(X/Y - 1.0) < 0.000001. We all know that bug in old Minecraft when the player is far from the origin lol
@CuulX · 1 year ago
@@dylangergutierrez 50B + 1 is larger than 50B if the former can be stored. Otherwise the result of the addition is 50B, and you would be comparing 50B with 50B.
@devforfun5618 · 4 months ago
That is what Unity does in its animation system: floats can't have an equals comparison, only integers can
@parasharkchari · 1 year ago
I think this is why Sun did that big push to evangelize interval arithmetic. It basically covered for all the imprecision of floats by simply treating them as fuzzy intervals. Things like == comparisons are now interval overlap checks and operations that make the error worse actually make the intervals grow. You basically avoid a lot of these headaches by just assuming that error will always be there and developing your arithmetic around that assumption.
@monad_tcp · 1 year ago
Which incidentally is one of the few non-pitfall ways of using floats: keeping an error counter and controlling the interval manually. It's a PITA doing it in C though
@bpark10001 · 1 year ago
...or do it in integer and know the "error" is zero. The problem with the floating-point fuzzy scheme is that the error builds with the number of chained computations, which the math doesn't know about. Of course, if you stack irrational (such as trig) computations, this error appears no matter what the number representation scheme. Floating point disease was so bad because as soon as it was introduced, everybody "had to have it" and it was a boasting point for computer manufacturers. It went so far that some computers had ONLY that format, even when working with integers. An early desktop HP computer calculated 2^2 = 3 (it used logs to compute exponentials).
@elietheprof5678 · 1 year ago
Interval arithmetic is also incredibly useful for anything scientific
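A toy version of the idea (a sketch, not Sun's actual library): each value carries bounds that only ever widen, and equality becomes an overlap test.

```python
import math
from dataclasses import dataclass

@dataclass
class Interval:
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        # Widen outward by one ulp so rounding can never shrink the bounds.
        return Interval(math.nextafter(self.lo + other.lo, -math.inf),
                        math.nextafter(self.hi + other.hi, math.inf))

    def overlaps(self, other: "Interval") -> bool:
        return self.lo <= other.hi and other.lo <= self.hi

def exact(x: float) -> Interval:
    return Interval(x, x)

s = exact(0.1) + exact(0.2)
print(s.overlaps(exact(0.3)))  # True: "equality" as interval overlap
```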
@coopergates9680 · 1 year ago
@@bpark10001 "computers had ONLY that format, even when working with integers" *cough* JavaScript *cough*
@bpark10001 · 1 year ago
@@coopergates9680 Yes they did. HP made a desktop computer in the 1970s that had ONLY floating point format. When loop counters and other integers were needed, the computer internally TRUNCATED to integer; that's what caused the "2^3 = 7" problem. (I had to add "+ 0.5" to any exponentiation calculation to get 2^n iterations of the loop.) I guess this "simplified" the machine, as there was only ONE type of variable, of a fixed size. Remember, in those days a lot of the math was done by dedicated HARDWARE, and it is simpler to have fixed-size fields in memory. Most of the calculators also used this format, not changeable. "JavaScript" in 1970 had something to do with coffee & writing, and nothing else.
@scaredyfish · 1 year ago
It kind of makes sense that floating point values don’t play well with equality, because the real numbers are infinitely divisible. In the real world, when you’re comparing things, you are always working to a certain degree of precision. The only way for two objects to be the exact same length would be for them to have the same number of atoms, which is an integer comparison.
@quatricise · 2 months ago
Yes and even then we're still making a lot of assumptions about the nature of atoms.
@neintonine · 1 year ago
I love the little HTML changes you make on the websites. The description at 7:32 makes the title even better :D
@kirbofn524 · 3 months ago
And at 0:36
@GeorgeGeorgalis · 3 months ago
@@kirbofn524 thanks for pointing that out! haha
@ellaa_nashwara · 2 years ago
This is an amazingly dense video! I have to watch it multiple times to completely absorb it.
@simondev758 · 2 years ago
Heh yeah I hate repeating myself and figure you can always just rewind.
@rickarmbruster8788 · 2 years ago
@@simondev758 thats why you are so clear ;)
@zemoxian · 1 year ago
Years ago, I recall reading the specs for a Java3D library and I think they had a 256bit fixed point library. IIRC, you could represent Planck lengths in the same model as the observable universe. Though I imagine there would be performance costs for that, with 32 byte numbers. A spacetime coordinate system would use 128 bytes for 3 space and 1 time coordinates. Or even just homogeneous space coordinates.
@simondev758 · 1 year ago
Oooh fixed point is awesome, I've been meaning to make a vid on that.
@Islacrusez · 1 year ago
@@simondev758 Ooh, did you ever get anywhere with that?
@vulpo · 1 year ago
@@simondev758 Maybe you can also discuss the Java BigDecimal class.
@simondev758 · 1 year ago
@@Islacrusez I have notes and stuff jotted down, but I kinda go with what I get excited about at any given time. I was happy to dive back into graphics a bit the last few months.
@Sollace · 1 year ago
12:27 A perfect example of this in effect is Minecraft (specifically something called the Far Lands on the wiki), back before there was a world border. I'd encourage you to go check it out! It's really interesting and works wonderfully to visualise these floating point errors in action.
@rya1701 · 1 year ago
In Minecraft Bedrock, there are also the Stripe Lands
@jakesto · 1 year ago
This was so helpful! I always wondered why in MATLAB I sometimes have to do abs(a - b) < 0.00001 instead of a == b to compare two values.
@SomeStrangeMan · 1 year ago
eps(num) gives you the minimum value that can be added to num. Usually useful to do something like abs(a-b)
@smorrow · 11 months ago
@@SomeStrangeMan All well and good and practical, but I hate when people use "epsilon" (like, from analysis) to mean "really small number". The point of epsilon in analysis is that it's the _arbitrarily_ small number.
@somestrangescotsman · 11 months ago
@@smorrow in MATLAB the eps function returns the smallest number that may be added to the floating point number given to it as an argument.
@smorrow · 11 months ago
@@somestrangescotsman Yeah, but the _name_ of it is obviously a mistaken reference to epsilon from analysis. And epsilon in analysis really means "the smallest number you can possibly imagine, except for zero", sort of like an inverse infinity. It doesn't have an "actual value" that you could in principle write down, whereas the Matlab eps' whole point is to be an actual value, so it's really inappropriate to name one after the other.
@somestrangescotsman · 11 months ago
The smallest number you can imagine is the smallest number you can add to another. Within the rules of floating point numbers, that IS epsilon.
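Python grew a direct analogue of MATLAB's eps(num) in version 3.9, math.ulp, which makes the magnitude-dependent gap easy to see:

```python
import math

print(math.ulp(1.0))   # 2.220446049250313e-16 (machine epsilon)
print(math.ulp(1e10))  # 1.9073486328125e-06 (much coarser up here)

a, b = 0.1 + 0.2, 0.3
print(abs(a - b) < 4 * math.ulp(max(a, b)))  # True: within a few ulps
```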
@shaggygoat · 1 year ago
A minor niggle: What is described as a “mantissa” here is really a significand, one which is linear within the range allowed for a given exponent value. Mantissas, as in log tables, are logarithmic. If the exponent in a floating point format were represented as a binary fixed point (so that the usual significand would no longer be needed), the fractional part of the exponent would truly be the mantissa (and in the language of log and antilog tables, the integer part of the exponent would be called the “characteristic”). (Watch out for negative exponents, since the mantissa still has a positive sense in log tables. For M=0.113943, C = [−1, 0, 1, 2], 10^(C+M) yields [0.13, 1.3, 13, 130].)
@jgharston · 1 year ago
Yes, it really is the significand, but in adopting mathematical techniques into computer engineering, the word mantissa was used, and became the defined word.
@Asdayasman · 2 years ago
8:23 scared the shit out of me.
@simondev758 · 2 years ago
Some say I'm a master of horror...
@Asdayasman · 2 years ago
@@simondev758 All we know, is he's called The Stig.
@n8programs733 · 2 years ago
Omg I can't believe you are a time traveller. What other exciting computer science moments have you witnessed?
@simondev758 · 2 years ago
The end of the gpu shortage was crazy, can't believe how it went. Wait, has that happened yet?
@DryIceyboi · 2 years ago
@@simondev758 no wait... There is a shortage TIME TO BUY SOME GPUS!!! LOL
@theonewhobullies · 3 months ago
@@simondev758 all hail Proof of Stake
@Ivan-pr7ku · 1 year ago
The fundamental problem with FP arithmetic is that real numbers are not a natural fit for binary computers. There's no way to directly map values with a moving decimal point into a register of fixed length without accumulating large errors. That leaves you with the fixed point format option, where you have to choose between limited range or limited precision, but not both. The convoluted way FP arithmetic is implemented within binary logic constraints makes it possible to have both (range and precision), at the cost of added complexity and a thick book of rules/limitations -- the IEEE 754 standard -- that historically made high-perf FP hardware implementation even more expensive.
@Anohaxer · 1 year ago
The fundamental problem with FP arithmetic is that real numbers are not a natural fit for computers, full stop. It doesn't matter what base you're working in: 1/3 is unrepresentable in base 10, since it's 0.333... repeating. You will run into this issue at some point. You have a finite space in any case, and you need to cram in an uncountable infinity. There are more reals than there are integers, so covering *any* continuous subspace exactly is impossible. Even if you only had to represent the interval [0.00001, 0.000011] exactly, you would have to use either FP or fixed point, and in either case lose a lot of precision. What FP does do is provide acceptable precision in the vast majority of cases, through the observation that the small numbers we often work with have smaller differences between them.
@smlgd · 1 year ago
It's fun to think that floating point units were so complex that the first processors didn't even have them; you had to use a coprocessor that was often larger than the processor itself (like the Intel 8087, which had almost double the transistor count of the 8086). Today we have GPUs with thousands of FPUs on a single die.
@monad_tcp · 1 year ago
@@smlgd At some point we are going to have to ditch digital computers and use analog voltages. Just look how terribly inefficient machine learning is on GPUs.
@dinoscheidt · 1 year ago
All machine learning engineers agree
@henrycgs · 1 year ago
real numbers are, by definition, not fit for COMPUTERS. the definition of computation and computational problems requires that ALL inputs must be representable with a finite sequence of symbols. otherwise, it literally is not computation. the real number set (or any continuous subset of it) is not entirely representable with finite symbols. however, nothing stops us from picking a few real numbers and sticking some labels on them. and that's what floats are. (smartly ordered) labels for a (smartly picked) finite subset of the real numbers. in case you're wondering "hey, but what if we could have infinite inputs?", that's called hypercomputation. good luck with that.
@BillDemos · 3 months ago
I don't know if you have ever been a lecturer, but I can tell you you are really good, and all people who have had you around are very lucky. Great coverage, great dissecting of the subject, extremely well presented. Thank you, subscribed just from one video. :)
@simondev758 · 3 months ago
Never been a lecturer, but I've spent a lot of time as a mentor because apparently I'm good at that. These videos are a great way for me to work on collecting my thoughts into a more cohesive form and working on my presentation skills.
@gloweye · 1 year ago
I understand the basics of floats, and I've decided to just use integers whenever possible. Far fewer issues.
@TheEulerID · 1 year ago
A basic principle of writing business application software, especially involving money...
@simondev758 · 1 year ago
Yep, but games historically had to put all their chips in the "performance" pile
@calyodelphi124 · 1 year ago
This is the clearest explanation of what floats are that I have ever, ever, _ever_ seen. Thank you for this. :D
@crazeelazee7524 · 1 year ago
Remember, the outsider thinks computer science is magic. The novice programmer will tell you about how computer science makes perfect sense. The experienced programmer *knows* computer science is magic.
@smorrow · 11 months ago
You need quantum physics to understand semiconductors and that's the closest thing to magic.
@RAFMnBgaming · 10 months ago
dark magic, possibly made of bees.
@nonickch · 2 months ago
Hey, thanks for the quick refresher. It's been over 25 years since I was in school, and all I remembered was "stay the hell away from floats, they be crazy"
@im_cloudy · 2 years ago
Unfortunate that you hadn't made this video by May; I was learning this for my final examination at university. It's always better to watch someone explain it this way than to read a bunch of papers. Keep up the good work, I've currently seen all of your videos.
@mohl-bodell2948 · 1 year ago
Whenever I can, I use integers instead of floating point. I just pick a smallest unit, e.g. 1mm, and count how many of those I have in all my measurements. If you work in 64 bit integers, you have enough range to cover a whole lot that way.
@simondev758 · 1 year ago
Yeah that approach works super well, basically a simplified fixed point?
@Tomaskom · 1 year ago
That's what KiCad (tool for designing PCBs etc) does. Uses 32 bit integers with nanometer increments and you get a reasonable upper limit of just over 2m for the PCB size.
@mohl-bodell2948 · 1 year ago
@@simondev758 Yes, well, it *is* fixedpoint. That is all fixed point is: Choosing a minimal unit that is some specific fraction of your base unit and counting how many such fractions you have.
@jaimeduncan6167 · 3 months ago
The issue is the propagation of error. That works if you have to do a few (in computer terms) operations, but if you have to do many, like in simulation, it's one of the worst approaches. Each multiplication, for example, produces a loss of precision "identical" to truncation. It's simply not viable for problems of modern scale.
@mohl-bodell2948 · 3 months ago
@@jaimeduncan6167 It is actually exact; that is the whole point of using fixed point. Floating point has a lot of precision problems, but when you are counting a specific number of your minimal units, you are simply counting an exact number of those units. There is no error and thus no error propagation. You have to accept that whatever you are counting is quantized: if you are doing a flight sim, your planes will be snapped to a 1mm grid (or whatever minimal unit you decide to use). As long as that is fine, you have no error and no error propagation. With floating point you do get an issue of error propagation, which makes many operations much more complicated. Like the video says, you can't directly compare two floating point numbers for equality. If you are adding an array of numbers, you have to sort them by exponent and add the smallest ones first, before adding larger ones. If you don't, adding a number with an exponent 53 higher than a smaller number will make the smaller number vanish with no effect (the whole mantissa is too small to have an impact). In a summation of many numbers, that small number could have made a contribution if it had been added to an only slightly larger number first and thus been propagated up to the big numbers. This means that the order of addition matters in floating point: you effectively lose associativity. Without the ability to reorder and regroup a summation freely, a lot of algebra is lost as well, making many other things much harder.
@PvblivsAelivs · 2 years ago
For using floating-point numbers as a black box, intermediate calculations should have double the precision of the final result; tests for equality should pass if two numbers are within "epsilon" times one of the numbers (it doesn't matter which you choose) or absolutely the smallest normalized number in the target precision. These, of course, can be given as defined constants. If you actually want to _understand_ floating-point computation, IEEE is not a good place to start. It's great for a standard to put into microchips. But, for learning, a good starting place is to represent sign, exponent, and mantissa as integer values (fixed point) in their own right, so that, by implementing them, you see how you are handling rounding errors.
@GeorgeGeorgalis · 3 months ago
Wow, thanks! I've known about scientific notation, binary, integers, and significant digits for a while; I've even supported scientific compute where these problems come up. But with the underlying algebra you have shown us exactly why: no more blind attribution to intuitive real-to-binary conversion errors...
@Delsto5 · 2 months ago
When I clicked on this I wasn't expecting Bob from Bob's burgers to educate me on some complex concepts
@kenhaley4 · 3 months ago
Hint for business programmers: Use integers if you're dealing with money. (Some languages support a "numeric" data type, which is nothing but an integer with an implied decimal point.) But avoid floating point for monetary values!
@PaulSpades · 1 year ago
DEC64 is a proposal for a floating point implementation with a decimal exponent, which would fix operations with decimal fractions.
@static-san · 3 months ago
I caused a bit of a stir on the old Risks list a few decades ago commenting on the error characteristics of base 2 floating point versus base 10 floating point. This was about the time that people were finding PC spreadsheet programs were making mistakes with currency values because the program was using binary floating point instead of something in base 10. I knew something about this because Texas Instruments had implemented a base 100 floating point system in their home computers - and had documented it in the BASIC manual!
@decare696 · 1 year ago
Small correction: a number like 1E-9 usually means 1*10^-9 (or possibly 1*2^-9, I'm not 100% sure on that one), but 1*e^-9 is something different entirely (e = 2.718... is Euler's number)
@MatthijsvanDuin · 1 year ago
10^-9 yes, a power-of-two exponent is only used for hexadecimal float literals (0x1p-9 would be 1*2^-9)
@HiddenWindshield · 1 year ago
The capitalization of the "E" doesn't matter in floating-point literals. "1E-9" and "1e-9" both mean "one times ten to the minus ninth power" (though the capital version is preferred to avoid confusion). Euler's number is represented in an entirely different way, depending on the exact programing language in question (e.g. "M_E" for C and most of its descendants).
@galoomba5559 · 1 year ago
I was confused why he was using base e lol
@farfa2937 · 11 months ago
This is why I like the Decimal type: it stores the integer portion and the decimal portion as 2 integers, so there are no precision errors. Especially for money: you can't tell people it may appear or vanish because floating points are weird.
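Python's decimal module behaves the same way, provided values are constructed from strings rather than from already-rounded binary floats:

```python
from decimal import Decimal

print(0.1 + 0.2 == 0.3)                                   # False in binary floats
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True in decimal

# Constructing from a float inherits the binary error exactly:
print(Decimal(0.1))  # 0.1000000000000000055511151231257827021181583404541015625
```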
@voxelfusion9894 · 2 years ago
To visualize the "gap between 1 and 2 is cut up into parts of size 1.19*10⁻⁷)" that's like measuring a distance of 1m with a precision of 12µm (micro meters) = 0.012mm, which is a tad smaller than a thin human hair.
@freshtauwaka7958 · 1 year ago
If I am not mistaken, 1.19*10⁻⁷ meters is actually 0.00012mm, which is around the diameter of a single coronavirus according to Google.
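That 1.19×10⁻⁷ figure is exactly 2⁻²³, the float32 gap at 1.0, which is easy to confirm:

```python
import numpy as np

print(np.spacing(np.float32(1.0)))  # 1.1920929e-07, the gap to the next float32
print(2.0**-23)                     # 1.1920928955078125e-07
```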
@RicardoValero95 · 2 years ago
I'm no computer scientist nor a mathematician, just a casual web dev... but I've never understood why floating point is the norm and not rationals (like Ruby's Rational class). I get that we cannot represent all numbers as rationals (because of irrational numbers like pi, obviously), but many problems with floating points would be spared. Like the 0.1 + 0.2 == 0.3 case: in rationals, 1/10 + 2/10 = 3/10. I guess I'm going deeper down the rabbit hole. Great video!
@simondev758 · 2 years ago
Hah! Happy to have increased the amount of confusion! I didn't read into the decision process itself, but if I had to guess, floating point is a better tradeoff as a general purpose data type, with its massive range compared to fixed point.
@bluesillybeard · 1 year ago
The reason (as far as I can tell) is that floating point is insanely simple to implement in hardware, and is extremely fast. I'm not certain; I should probably use the internet to find the answer
@AnarchistEagle · 1 year ago
Rational datatypes often suffer from representing the same value multiple times. If you're using two 16-bit integers to store the numerator and denominator, then you have 65,000 ways to have 0, 32,000 ways to have 1/2, etc. This can cause problems with comparisons, overflow, etc. So most implementations I've seen reduce rationals to their lowest terms, which decreases performance since it requires a GCD reduction after each calculation. But you're still left with massive holes in your datatype, and you've slowed down all your algorithms tremendously anyway. Floating point represents a much wider range of values, with higher precision for small values, and there are no duplicate values to deal with (with an asterisk for NaNs and for systems where subnormals are truncated to 0).
@latedriver9019 · 1 year ago
Research how it's implemented in the hardware. The hardware limitations give rise to software limitations. This key understanding of hardware is the difference between programmers and computer scientists. With that said, 1/10 and .1 are the same.
@melonenlord2723 · 1 year ago
@@AnarchistEagle 1e+1 and 10e+0 are also the same and still no problem to understand; they simply both get converted to 0.1e+2, or 10. Something like that could also be done here by converting everything to an integer and a power integer. So 0.032445 would get converted to 00000032445 and -6, 134.31 to 00000013431 and -2, 12300000 to 123 and 5. But I think speed is the key: maybe mathematical operations aren't that fast with this format.
@jp5000able · 1 year ago
A few years ago, when I started programming a universe-sized environment, I first used floating point. I quickly learned that was a big mistake. I switched to 64-bit and 128-bit integers, which are 100% accurate.
@saultube44 · 3 months ago
It was concise: giving a lot of information clearly and in a few words; brief but comprehensive. Thank you Sir 😊👍
@donnydarko7624 · 3 months ago
This is really great knowledge with regards to writing audio plugins for digital audio workstations as well.
@mikebauer6917 · 1 year ago
Related: when summing numbers with a large range of values you need to sort by absolute value, in case you have a lot of very small numbers and a few very large ones. Otherwise an unsorted add of a large value can saturate the available precision, and the small values (no matter how many, even if their sum is large) will be ignored.
@simondev758 · 1 year ago
Good point, wish I had thought of that for the video.
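A sketch of the effect, plus two mitigations (sorting small-to-large, and compensated summation via math.fsum):

```python
import math

# One big value plus a million small ones whose total is significant.
values = [1e16] + [1.0] * 1_000_000

print(sum(values))          # 1e+16: each lone 1.0 is rounded away
print(sum(sorted(values)))  # 1.0000000001e+16: small values combine first
print(math.fsum(values))    # 1.0000000001e+16: exact compensated sum
```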
@nBodyResearch · 3 months ago
Integers with bit flipping algorithms is the only reason I haven’t lost my mind
@sosasees · 5 months ago
After watching this video I feel good about choosing to represent collectible crystals in my game code not as a floating point number, but as an integer which counts 12ths, only converted to float for display: ``float crystals = crystal_shards / 12.0``
@Awezify · 2 years ago
I really appreciated this video, I have thought about it on and off during the last week, thanks for quality content. I'm always excited for your new videos. Keep it up Simon!
@TommyLikeTom · 1 year ago
The way you included that joke about Konrad Zuse without drawing any attention to it, and then actually read and liked my comment on a video with over 100K views, makes you one of my favourite people in the world, and you were already pretty high up there. Just an interesting note: tonight in a South African comedy club I saw the actual Darryl Philbin from Dunder Mifflin perform live musical comedy. I was supposed to perform but I got bumped to next week. I got to show him the 3D caricatures I made of his coworkers (edit: costars), and now I'm going to make a caricature of him (his name is Craig Robinson) and show it to him before he leaves my country. Can't wait to meet you some day too! I'm working on a game! Almost done!
@simondev758 · 1 year ago
Hah, I mean it doesn't take long to go through the comments, there's not a million of them. If you take the time to write a comment, I'll definitely read it. re: music, that's super neat! I loved the Office when it aired!
@razoras · 1 year ago
This is a really fantastic video! Floating Point is one of those things in my early Computer Science 101-level coursework that kind of blew my mind.
@T33K3SS3LCH3N · 1 year ago
I'm honestly surprised how rarely this has actually given me trouble. I know some languages offer types like decimals to be absolutely sure, but I believe I've never actually had to use one. Most problems fall into a "if it's roughly right, it's fine" category after all. The only case that's regularly important for me is using epsilon to check for equality. I usually use a pretty big one like 1e-4, since false positives tend to be better than false negatives in my experience. One time I actually dove into the float implementation to encode a bitmask into a texture on the GPU, and I was curious if I could avoid bitshifts... only to find out that the framework supported integer texture formats after all 😅
@simondev758 · 1 year ago
I feel like this is one of those things where you could go years without ever coming near an issue.
@jamesmnguyen · 3 months ago
A cool way to see the approximation nature of floating point is to do a Mandelbrot Zoom with 32-bit floats, eventually you'll see the image become pixelated and your "continuous" zoom stutters and ultimately stops.
@colinjohnson5515 · 1 year ago
I work in FinTech and we always use a library for these reasons. C# has the Decimal type, but for JS and Go we use open source libs; IIRC Shopify's was the base we built off of. And remember, fractional numbers in JSON are doubles, so most (all?) Decimal libraries serialize to/from strings.
@SegNode · 1 year ago
Love the video. I'm currently studying for a software degree and they sadly don't teach anything this low level, so this is a big help. I was so glad when you just went into an example at the start too; I hate when YouTubers try to teach a concept and go all the way back to the stone age just to cover the origin 😂
@simondev758 · 1 year ago
Heh yeah, I hate including a bunch of unrelated info to pad the video out.
@spacelem · 1 year ago
Have you encountered unums / posits? They're an attempt to redo floating point in a way that reduces these problems quite a bit. Obviously they'd need hardware support to be fully performant, but it's possible to implement them for accuracy testing purposes (e.g. Julia has an implementation), and they do extremely well.
@simondev758 · 1 year ago
Only read about them, haven't had a chance to try them out though
@crysicle · 10 months ago
If anyone is having problems with floating point precision errors, consider switching to fixed-point. 32-bit fixed-point might not give enough precision for most problem spaces, but 64-bit fixed-point would and is an easier data structure to deal with as a lot of the precision errors become predictable.
@user-os6ip5xr2o · 3 months ago
An exceptional video about floating point precision. A great teacher right there. He gives a lesson like no problem
@fr3ddyfr3sh · 2 years ago
Great video from a dev for devs. Now I see why it's beneficial to have an increasingly dense supply of numbers as you get closer to zero; I was really wondering about that. Also: I always think of my English teacher, who urged me not to curse. Then I watch one of your videos and smile: "3052... and some crap, give or take" 😂 Fun fact: C# has the very handy decimal type, which is a floating point number with base 10 instead of 2. So you can actually do things like "0.1m + 0.2m == 0.3m" (m is the literal suffix for the decimal type). It's a real life saver for LOB applications, though not for games or other high performance scenarios, of course.
@simondev758 · 2 years ago
That's interesting, the language specification has some overlap with fixed point. My english teacher never encouraged my swearing either :(
@GegoXaren · 1 year ago
C23 has Decimal numbers too.
@moth.monster · 1 year ago
I love this channel for all my uninitialized variable needs. But you can really get anything here.
@z-beeblebrox · 1 year ago
It was super cool of Bob from Bob's Burgers to take some time out of his day to teach us all this
@flameofthephoenix8395 · 3 months ago
15:03 The best way of doing it is to manually find the exact representation of the float as an array of 32 bits then handle the comparisons yourself, or better yet, just use fixed points!
@randyscorner9434 · 3 months ago
Great explanation of floating point numbers! I designed the FP execution unit on the 387 and 486 processors, and this brought back a lot of memories. Handling denormals and unnormals was a pain, but we got it done. Same with NaNs. Unfortunately, the guys who did the Pentium design after this failed to get the division lookup table entries right, and it led to an interesting story... The next interesting topic might be a discussion of rounding using guard, round, and sticky bits for numerical correctness.
@simondev758 · 3 months ago
Woah, you've been around! I was just a kid playing Sierra games back then, would love to hear more about your experiences if you have a blog or something.
@randyscorner9434 · 3 months ago
@@simondev758 I haven't written my memoirs yet, but I've had many discussions with other folks about the earlier days of CPU design. After the FP design I was the Design Manager for the P6 (Pentium Pro) and then GM/VP for Pentium II, Pentium III, Pentium 4 and the first Celeron. It was fun until it wasn't, and then I left, started a company, and later worked at SpaceX for a while. I'm not sure how to write a blog about so much of this, since so many other people are intertwined in the history.
@the_furf_of_july4652 · 3 months ago
That was my favorite history section ever, thank you
@ampisbadatthis · 3 months ago
This is very helpful. In a piece of code I wrote recently, I kept running into an issue where calculating percentages turned 3/10 into 31% via the ceiling function, and I couldn't figure out why. I will try to reimplement it with this in mind and update how it goes in the edits later today
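The exact expression from the comment above isn't shown, but the same class of bug is easy to reproduce: a product that lands infinitesimally above 30 gets ceiled up to 31.

```python
import math

x = (0.1 + 0.2) * 100          # 30.000000000000004, not 30.0
print(math.ceil(x))            # 31: ceil turns a ~4e-15 error into a whole unit
print(math.ceil(round(x, 9)))  # 30: round away the noise before taking ceil
```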
@pavelperina7629 · 3 months ago
I'm aware of (most of) this and it's always surprising. A fun one was something like: if (a >= 0) { b = std::min(1.0/a, 1e6); // from now on, assume b is in the range between zero and a million }. The program had some weird behaviour. After some debugging, it turned out that a could contain negative zero, so b could be negative infinity.
@hg-ir8tb · 3 months ago
The problem with floating point representation (IEEE 754) is that we're basically trying to force a base 10 number into a base 2 representation. As such, compromises have to be made in order to reduce both computational and memory complexity. Another way of looking at it is through scientific notation: you can describe pretty much any rational number in scientific notation, but the number of significant figures generally increases both complexities on a linear scale. You can bound the complexity by limiting the significant figures, but this leads to a loss of information. Once we had excess memory and computational resources, things like Java's and SQL's decimal types appeared for more accurate, but more memory- and compute-expensive, representations.
@ChrstphreCampbell · 3 months ago
I've run into this many times! & combined with my logic dyslexia, it would drive me crazy!
@nunyabusiness3710 · 2 years ago
I happened upon your videos 2-3 weeks ago. You're f'n crushing it dude. Just sub'd.
@fabianfeilcke7220 · 1 year ago
When comparing whether two floats or doubles are equal, I always use a percentage-wise tolerance that is suitable for the application, like A is within 0.999*A
@enfieldli9296 · 2 years ago
Your explanation and the way you talk through it, like at 0:32, are really useful and entertaining 😁 Thank you for the hard work!
@Oktokolo · 3 months ago
Best explanation of the floating point binary format.
@smanzoli · 1 year ago
IEEE 754 octuple-precision binary floating-point format: binary256. In its 2008 revision, the IEEE 754 standard specifies a binary256 format among the interchange formats (it is not a basic format), as having: sign bit: 1 bit; exponent width: 19 bits; significand precision: 237 bits (236 explicitly stored). The format is written with an implicit lead bit with value 1 unless the exponent is all zeros. Thus only 236 bits of the significand appear in the memory format, but the total precision is 237 bits (approximately 71 decimal digits: log10(2^237) ≈ 71.344).
@simondev758 · 1 year ago
That's a big float
@CesarGrossmann · 1 year ago
I remember using something like if (abs(a - b) < error_value), with error_value = 0.0001, instead of if (a == b) to circumvent this problem with floating point comparison. It was some numerical computing (I think I was playing with a numerical method for finding the roots of an equation, or something), and the "a == b" branch was never being triggered...
@weirdsciencetv4999 · 1 year ago
I wrote a machine controller once, and the positions for the steppers were calculated using floating point numbers. When I tested the stepper driver routines, the shaft position would be updated by some small value that gave a certain RPM. At first everything sounded normal and smooth, but after about 5 minutes the steppers sounded horrific and choppy. I eventually figured out the compiler for the microcontroller did not support double precision by default, and did not generate a warning during compilation either; it just silently interpreted doubles as regular floats. After enabling the right flags and recompiling, it finally worked. The error was simply the problem floating point numbers have representing certain spans of numbers.
@MostafaZeinali · 1 year ago
From my experience, this is how we compare two floats/doubles. You need two tolerances: relative and absolute. abs_tol is the value you accept "as zero" in "this context" of comparison; rel_tol is the max amount of "relative difference" two numbers can have to judge them as equal. And the formula is: abs(a-b) < rel_tol * abs(a) + abs_tol. As you can see, there's an "a" multiplied on the right side, and what that does is "scale" your rel_tol to the vicinity of the numbers you're comparing. So, if you are comparing really close to zero (a is small), rel_tol * abs(a) becomes smaller and the significant term on the RHS is abs_tol; near zero, you are effectively using your abs_tol. If you are comparing two large numbers, rel_tol * abs(a) becomes large and is now the most significant term of the RHS, controlling the comparison result. This is a variation on the simpler version, abs(a-b)/abs(a) < rel_tol: you take the abs(a) to the right side, but add the abs_tol. From my experience, for double precision we set abs_tol to something like 1e-16~20, and rel_tol to something like 1e-8~10. This has mostly worked for me in the past, but I've had cases where even this does not work!!! Right now I'm reading the randomascii article and it is fascinating. I'd love to know your thoughts on this. Thanks everyone.
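Python's standard library bakes in essentially this pattern: math.isclose takes the max() of the two terms where the formula above sums them, but both behave the same in typical cases.

```python
import math

a, b = 1e6 + 0.0001, 1e6

# Stdlib form: abs(a-b) <= max(rel_tol * max(|a|, |b|), abs_tol)
print(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12))  # True

# The formula from the comment above, with rel_tol scaled by |a|:
print(abs(a - b) < 1e-9 * abs(a) + 1e-12)               # True
```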
@absalomdraconis · 1 year ago
Try looking for functions to extract the parts of a float, as well as functions to reunite them. You get the exponent of whichever value you're treating as dominant, then pack together with epsilon (at least, I THINK it was epsilon, it's been a while since I did this), and that gets you the smallest possible step size for the context you're interested in... more or less. You may want to consider the scale above and below as well... It may also be that extracting the exponent gets you everything you care about, but I've never tried that, so I can't speak to the sanity of attempting it.
@LA-MJ · 1 year ago
Bookmark
@dontbealoneru · 2 years ago
Good video! I appreciate your effort.
@Bowa10000 · 4 months ago
A handy trick I figured out that better handles floating point equality is to XOR the integer representations of the two floats' data. Comparing the resulting int gives a rather effective way of telling if two numbers are effectively identical. For example: xor(0.1+0.2, 0.3) == 7 (0b111), so anything 7 or below can easily be considered floating point math error (we could say ≤15 (0b1111) to be safe). It's at least more accurate than a direct comparison, with the single caveat I've found being 0 vs -0.
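The underlying move, reinterpreting the bits as an integer, looks like this in Python. Subtracting the representations (the "ulp distance") is the more common variant of the trick, and it shares the 0 vs -0 caveat:

```python
import struct

def bits(x: float) -> int:
    # Reinterpret the 64 bits of a double as an unsigned integer.
    return struct.unpack("<Q", struct.pack("<d", x))[0]

a, b = 0.1 + 0.2, 0.3
print(bin(bits(a) ^ bits(b)))  # 0b111: only the low bits disagree
print(bits(a) - bits(b))       # 1: the two doubles are a single ulp apart
```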
@alonamaloh · 1 year ago
Around 4:01: When you are working in binary, you probably shouldn't call the point a "decimal point" or the places "decimal places". If you do, it's just very confusing. Just call them "point" and "places".
@FluffyFoxUwU · 3 months ago
my rule, which i made myself, is to NEVER compare floats, whether with != or ==, under any circumstance. and i learned the hard way (yes, the hours of debugging)
@ChronicWhale · 2 years ago
Thanks, this is a great explanation
@ethanlewis1453 · 1 year ago
It's quite amazing they designed floating point to allow such a surprising failure as 0.1 + 0.2 == 0.3. I was wondering why the C++ Qt SDK bothered including a "real" data type, and this might explain it.
@VoidloniXaarii · 1 year ago
Thank you so much for making this amazing vid
@CR3271 · 1 year ago
This really screwed me over when I was trying to do some angle calculations on a coordinate plane. I knew the line AB was parallel to the line CD in my test case, but the angle comparison in my code kept failing. It was infuriating. When I figured out what was going on, it was even more infuriating.
@TommyLikeTom · 1 year ago
"Konrad Zuse, loyal subscriber to SimonDev's youtube channel" XD
@tristanridley1601 · 1 year ago
And this is why I jump through moderate hoops to treat my numbers as integers. "So for this I'm going to count my universe in millimeters." "Why?" "Because it's more precision than I think I'll need, and it's not float." I have seen (and ranted about) using float for *currency*. Please, dear god just use integer pennies...
@VoidloniXaarii · 1 year ago
Fascinating, thank you
@LucasDenhof · 2 years ago
Very informative video!
@arulmuruganK94 · 2 years ago
Loved the wiki history. Now we know time travel is possible.
@wiregunner · 2 years ago
This is such an underrated channel
@Ni7ram · 4 months ago
excellent content. senior developer here, never thought much about it
@thegermantomoeser · 1 year ago
Damn, what a nice channel! Subbed!
@Skeffles · 2 years ago
Fantastic video! I'm constantly forgetting what I know about floating point numbers so I'm definitely going to be coming back to remind myself in the future.
@rasowa2958 · 10 months ago
If you can use integers instead of floats without overcomplicating the program, do it. For example, currency is better handled with integers: just store 995 cents instead of 9.95f dollars, and convert to dollars only when interacting with the user. That's the best way to avoid all these issues. Also, one issue not mentioned in the video is that errors like the 0.01f + 0.02f one accumulate; if you do thousands or millions of operations on a floating point variable, the error may become quite substantial. Again, use integers instead if it's feasible. I know there are libraries that help deal with fractions, and they're not a bad alternative. Just keep in mind that integers are a native type, and calculations on integers are much, much faster than on any non-native type.
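A quick illustration of the drift versus integer cents (the exact float result can vary slightly with how the sum is performed):

```python
total_f = 0.0
total_c = 0
for _ in range(1_000_000):
    total_f += 0.01  # a penny as a binary float
    total_c += 1     # a penny as an integer cent

print(total_f)        # ~10000.000000171838: visibly off after a million adds
print(total_c / 100)  # 10000.0: exact, converted only for display
```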
@hypergraphic · 2 years ago
Great video. I love the whatevers :)
@cavesalamander6308 · 3 months ago
There is also a difference between 32-bit and 64-bit compilers and CPU/FPU (x86/x64 Intel, I mean). 32-bit CPUs use an 80-bit intermediate representation of data in the FPU during expression evaluation, and the programmer also has access to the 80-bit long double type. On 64-bit systems with vector instructions, compilers prefer those even for calculations with single numbers, so even an explicitly declared variable of type long double may be implicitly converted to double. As a result, the same program compiled in 32-bit and 64-bit modes can produce different results!
@lulairenoroub3869 · 1 year ago
Love the H Jon Benjamin impression
@amigalemming · 4 months ago
There is a proposal for "posit numbers". They do not have the problems with subnormal numbers, they have a different distribution that concentrates around 1 and -1, and they have only one zero and one unsigned infinity.
@tolkienfan1972 · 1 year ago
Floats compare exactly as well as is to be expected. The problem isn't with the representation. It's with developers making assumptions that aren't true.
@williamdavidwallace3904 · 3 months ago
If one is doing modeling, then floating point can introduce chaos at each step of the model code. We used to validate models by increasing numerical precision until the results became comparable, i.e. within tolerance: one would use single, then double, then quad precision...
@Astrophysikus · 1 year ago
This also implies that in a sum of more than two numbers, the order of the summation might change the result slightly. As a consequence, a perfectly "deterministic" program can have completely different outcomes every time you run it, as soon as you have some section of optimized/parallelized code where you do not have full control over the exact order in which some low-level stuff is computed. I was shocked when I first experienced this first hand as a young student working on physics simulations.
@Kalumbatsch · 1 year ago
"where you do not have full control over the exact order in which some low-level stuff is computed" Doesn't sound very deterministic to me.
@Astrophysikus · 1 year ago
@@Kalumbatsch That is why I have written "deterministic" in quotation marks, LOL. I had naively assumed that some fancy optimized function (also involving some multi-processor stuff) would perform just like an ideal mathematical function, giving you the exact same output for the same input every single time. In floating point reality, not so much.
@simondev758 · 1 year ago
Yep, floating point isn't associative heh
@j7ndominica051 · 24 days ago
The coordinates in GTA: San Andreas were floating point numbers, like most things in the internal script. The world spanned a few thousand units in either direction, but someone made a mod with a boring road on the water out to the edge, which was at 20,000. When approaching it, every part of the car began visibly shifting. I had the great idea to separate the bumper, the license plate, the lights, etc., so that they could be later selected and copied. In Age of Empires, the money was a floating point number off by a significant amount, and it was not possible to find it with a simple cheating tool. Floating point matches our perception of the world, where small differences become less important as we have more of the stuff.
@SteinGauslaaStrindhaug · 1 year ago
I've watched so many videos on floating point numbers, this probably won't tell me much new; but it's such an interesting topic I'll watch this anyway. I'm not sure if I find the design beautiful or ugly,... or both. But it's definitely interesting
@simondev758 · 1 year ago
If you know the format well, it probably won't tell you anything new, but yeah totally agree it's a bit of a love hate relationship heh
@Tomorrow_For_Sure · 3 months ago
That's why I never use equality comparisons with floating point numbers. I had a LOT of problems with that when I first started programming. 😓
@PauxloE · 3 months ago
A (64-bit) double-size floating point number can exactly represent all 32-bit integers (and quite a few more, up to 2^53), and the operations match. So if you only need 32 bits, JavaScript's Number works for integers as well.
@thevalarauka101 · 3 months ago
lol the Wikipedia article at 0:32 "a time traveller known only as SimonDev"
@Kualinar · 1 year ago
Yes, the problem of the granularity of float numbers. It always comes around and bytes you in the back when you least expect it.