I love how there's a team of literal wizards out there somewhere speaking ancient runic languages who have the power to delete the internet if they get a rune wrong.
these "wizards" were parsing HTML with regex lol. Don't be too impressed by them, and don't be fooled by all the fancy terminology. That's an absolute rookie mistake and a big fat no-no lol
Pro engineering move: don't actually fix the issue, implement an undocumented hacky workaround that itself seems like a bug so that when someone else fixes the workaround they get hit with the initial issue
To be fair though, I think it is unlikely they knew about the leak, given the rarity and impactfulness. I would chalk the empty buffer masking the issue up to coincidence.
@@kevinfaang I think it's more likely the author of the code knew it was a possibility, so they put it there just in case. Otherwise there's really no reason to have an empty buffer. The general rule when you see weird stuff like this is, "it's there for a reason".
Considering that Google and all the other caching services were involved, it would've come to light eventually hurting their trust even more, so they didn't have much of a choice
one thing my father always told me about iterators: "NEVER check index == end. ALWAYS check index >= end. You should never assume that your iterator is consistent and never skips over any values."
That advice taken at face value is bogus since many kinds of iterators only support equality-comparison, e.g. when iterating a linked list. But it's good advice if your iterator does support a total ordering that's consistent with iteration.
@@thisismygascan4730 == can actually be faster in some cases, since it makes it easier for the compiler to reason about the loop count, but yeah it _usually_ makes little to no difference
@@MatthijsvanDuin from my experience, nearly all instruction sets I've seen so far (AVR, MIPS, x86, ARM) take the same number of clock cycles for integral comparison (usually 1 or 2).
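A minimal C sketch of the point this thread is making (toy numbers, hypothetical function names): with an `==` guard, a scanner that ever advances by more than one step can sail straight past the end, while a `>=`-style guard overruns by at most one step.

```c
#include <stddef.h>

/* Fragile guard: exits only when i lands exactly on `end`. If a bug
 * makes the scanner advance by 2, i steps over `end` and keeps going;
 * `hard_cap` exists only so this demo terminates. Returns the final
 * index, which may be far past the end. */
static size_t scan_eq(size_t end, size_t step, size_t hard_cap) {
    size_t i = 0;
    while (i != end && hard_cap-- > 0)
        i += step;
    return i;
}

/* Defensive guard: overruns by at most step-1 even when step is wrong. */
static size_t scan_ge(size_t end, size_t step) {
    size_t i = 0;
    while (i < end)       /* i.e. break as soon as i >= end */
        i += step;
    return i;
}
```

With `step = 1` both stop at the end as expected; with a buggy `step = 2` against an odd-sized buffer of 5, `scan_ge` stops at index 6 while `scan_eq` is only halted by the demo's cap, at index 200. As noted above, on most ISAs the `<`/`>=` comparison costs the same as `==`, so the defensive form is usually free.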
That is why we invent safety things. Sure, you can operate this machine without such and such guard, but just know that for every smart one who can, there is a dumb/drunk/exhausted/stressed someone else who will get chewed up if this machine can operate with someone within eyesight.
@@fulconandroadcone9488 Sometimes, the smart one is also the exhausted and stressed one (also, smart people can be drunk, though hopefully only during hobby projects). I mean, I don't know about you, but whenever I'm too sleep-deprived, the quality of my work generally declines.
@@nickstegman8494 my guess is that if you disregard basic safety, which includes disabling safety mechanisms during normal operation, you might not be as smart as you think you are
@@fulconandroadcone9488 you've never worked while severely exhausted. Not a programmer, but in any field exhaustion leads to dumb basic mistakes. Example, I was exhausted the other night after a 16hr shift, made some cereal for quick dinner, and put the milk in the pantry and cereal in the fridge.
The more you learn of the Internet and its overall supporting structure, the more you'll realize how fragile it really is and how terrifyingly easy it is to make it crumble.
So it was the time honored, decades old combination of:
- working with naked pointers
- not checking if you reached the end
- foolishly trusting that input from a network source _isn't_ garbage

Calling it "Cloudbleed" is fitting, as Heartbleed had all of those as well :)
Never trust input from the end user - period. Garbage in - garbage out. Not too long ago I fixed non-ASCII chars bombing a piece of middleware, all because the commercial devs for the input system never thought users would cut and paste content from the Internet into a comment field. Nor did they properly set the XML declaration for the output file to specify the character set!
@@williamdrum9899 This was off by a _lot_ more than one. And also you were off _with a pointer_ - which is why you should not work with naked pointers unless you really need to go low level.
The title was intriguing and video was so interesting that I didn't even notice that video has 54 views from such small channel with 157 subs. Good job :D
I didn't realize that either until I read this comment :D Really good content.
This also shows a different common issue with loops, where the exit condition is an equality check. Sure, you might expect the incremented value to eventually reach the target exactly, but the safer thing to do in this case is to check for greater-or-equal. I would likely write it as less-than, but it depends a bit on what the surrounding code looks like.
Was gonna point out the same thing. Never assume an incrementer value will always eventually _exactly equal_ a target value, any number of things (race conditions, cosmic ray bitflips, floating-point fuckery, ancient mummy-curses, etc) could cause it to somehow "miss" the intended exit value.
@@magnusculley6817 Well, once you've ruled out race conditions (because you're using a language that enforces thread safety, or your code isn't multithreaded to begin with), cosmic rays (because your server uses ECC memory), and floating-point fuckery (because you're using integer vars), at that point it's time to look into supernatural root causes. ...Honestly, I'm surprised there _aren't_ more COEs containing the phrase "Return the slaaaab..."
Checking for equality would be the normal case. If the index is beyond the upper limit, that should trigger some form of assertion so the root bug is fixed, not masked.
@@cassinihuygens1288 Sure, but preventing all hell from breaking loose is more important; you can always add logging after breaking out of the loop in case shit hits the fan. I'm not saying you shouldn't deal with the issue, I'm saying you should ensure the code runs as expected even when failing. A small addition: it would be better to check whether the value exceeded the expected value outside of the loop regardless, for performance reasons, since a loop like this runs so many times that even a simple if-statement inside it adds up. It's also not necessary to check until after the loop has exited, since the value couldn't have been above the limit before that. So regardless, this is how you'd do it.
I didn't really think an HTML/XML parser was _that_ hard to implement, but I never even considered that it's basically just a giant state machine, where different characters can change the entire state in many different ways, and managing that is nightmare inducing
A layer of abstraction is missing in the description. A finite state machine is indeed used in many modern approaches, but they are usually auto-generated from a more human-friendly parser description. So the finite state machine remains under the hood, invisible for the end-user (programmer). Do you know regexp? In classic implementations finite state machines are generated from regexp, and then used to do the parsing.
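To make that concrete, here is the kind of hand-written finite state machine a parser generator or regex engine would emit under the hood: a toy C sketch recognizing the regex `<[a-z]+>` (the state names and function are my own illustration, not from any real parser):

```c
#include <stdbool.h>

/* Hand-written DFA equivalent to the regex "<[a-z]+>" - the sort of
 * state machine tools like Ragel, or classic regexp engines, generate
 * automatically from a human-friendly description. */
typedef enum { S_START, S_OPEN, S_NAME, S_DONE, S_FAIL } state_t;

static bool matches_tag(const char *s) {
    state_t st = S_START;
    for (; *s && st != S_FAIL && st != S_DONE; s++) {
        char c = *s;
        switch (st) {
        case S_START: st = (c == '<') ? S_OPEN : S_FAIL; break;
        case S_OPEN:  st = (c >= 'a' && c <= 'z') ? S_NAME : S_FAIL; break;
        case S_NAME:
            if (c >= 'a' && c <= 'z') st = S_NAME;            /* stay in name */
            else st = (c == '>') ? S_DONE : S_FAIL;           /* close or die */
            break;
        default: break;                                       /* DONE/FAIL: loop exits */
        }
    }
    return st == S_DONE && *s == '\0';  /* accept only a full, exact match */
}
```

Even for a pattern this trivial, every character can move the machine to a different state, which is why a full HTML tokenizer's state table gets so hairy.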
You can't parse [X]HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML. As I have answered in HTML-and-regex questions here so many times before, the use of regex will not allow you to consume HTML. Regular expressions are a tool that is insufficiently sophisticated to understand the constructs employed by HTML. HTML is not a regular language and hence cannot be parsed by regular expressions. Regex queries are not equipped to break down HTML into its meaningful parts.
An empty buffer as a terminator of a buffer filled with null-terminated strings is a common enough pattern to have a name: double-null-terminated buffers. But it's also weird enough that even when it's used, it's often not consistently produced or consumed.
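A minimal C sketch of consuming such a double-null-terminated buffer (the helper name is hypothetical): each string ends with `'\0'`, and an empty string marks the end of the whole list, much like the empty trailing chunk discussed in the video.

```c
#include <string.h>

/* Walks a double-null-terminated buffer like "a\0bb\0ccc\0\0" and
 * returns the number of strings it contains. An empty string (i.e. a
 * '\0' immediately where a string should start) terminates the list. */
static size_t count_multi_sz(const char *p) {
    size_t n = 0;
    while (*p != '\0') {        /* empty string == end of the list */
        n++;
        p += strlen(p) + 1;     /* skip the string and its terminator */
    }
    return n;
}
```

Note how the consumer silently depends on that final empty string being present: drop it, and this loop happily reads past the end of the list, which is exactly the kind of inconsistency between producer and consumer the comment above warns about.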
This channel is so underrated omg I can't even believe that a small channel can produce so much quality content, keep going bro, you're going a long way ❤
Based on the voice I think it’s the same guy behind Fireship Edit: After listening to Fireship again I can hear a difference in the voice, and Fireship also shares a channel called Jeff Delaney which I would think is his personal channel.
As a QA software dev, my job is to write the program that runs our automated tests as well as hunt down bugs in the code the regular QA guys find. Kudos for making an entertaining video out of a bug hunt. I have had some looooong hunts. Requiring a very specific type of bad input is really hard to figure out, I often end up stepping through the debugger trying to think of if any of the possible branches would fubar me.
I have a fun bug report for you. Devs on the project come up to me saying that Cassandra DB rejects a basic SELECT query on a certain user's data, and they can't see the entire stack trace because logs are masked on production. I try the query and see that the JSON parser returns "index out of bounds" while constructing the array of integer IDs from the result! By removing numbers from the end of it and looking up the source code of the parser used by the database, it turned out that the hacky ICPC-style code was failing on a combination of the first 3 numbers pushing the index into the thousands.
I became a much better software engineer the day I joined a small company where it was standard practice to stay at work until the bug that was blocking progress was fixed. I wasn't there very long before I became very adept at identifying possible error conditions. As you suggest, a good debugger is an essential tool because it shows you all the things you didn't think about.
@@chupasaurus Exhaustive scenario walk-through during code inspection could have weeded it out before it got into the final product, but it depends on the caliber of the code inspectors.
The QA community should self-publish a monthly magazine filled only with bug hunts from devs out in the trenches like this video. Like veterans telling war stories at a bar, imagine all the little things you could pick up along the way over the years.
The evil part comes, when running the debugger changes how the program is executed, for example when the bug comes from a race condition and running the debugger naturally makes the code run slower.
If you’re ever waiting for something to increment until it equals a value, it doesn’t hurt to have an “else > value” block that throws an error to let you know something went wrong
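That pattern, sketched in C (the function name and error handling are illustrative, not from any real codebase):

```c
#include <stdio.h>

/* Steps `i` toward `target` and expects to land on it exactly.
 * The `>` branch catches the "somehow skipped past it" case and
 * reports it, instead of silently looping forever. */
static int step_until(long i, long target, long step) {
    while (i != target) {
        if (i > target) {
            fprintf(stderr, "overshot: %ld > %ld\n", i, target);
            return -1;   /* surface the error, don't mask it */
        }
        i += step;
    }
    return 0;
}
```

With `step = 1` this returns 0 as expected; with a buggy `step = 3` toward a target of 10 it overshoots and returns -1 instead of spinning forever.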
Got this in my feed with only 20 views. Usually I skip those, but I gave it a chance. Great quality video; I have no doubt you'll be big one day.
This is where documentation becomes important. If the developer who implemented the empty buffer had explained why they needed it, maybe this wouldn't have happened or at the very least they could have figured out a way to circumvent the problem when they were first rewriting the HTML parser.
But there's always the chance the buffer was perhaps entirely accidental (smells like an off-by-one error to me, instantiating one too many buffers for our purposes) rather than actually covering up this other off-by-one error in the other parser
@@ferociousfeind8538 naw, I don't think it was an accident. They had deliberate functionality in there where the system would chunk HTML docs and the final doc in the buffer was always empty. That empty buffer chunk was like a carriage return in a way. The closing tag of the HTML indicated the end of the doc, but if the doc doesn't have a closing tag, the lack of characters in the following buffer would work as a flag that basically tells the parser that this is the end of the doc. It's obvious that this was done deliberately, but the reason is a bit vague as to why exactly they did it this way. Hence why documentation would have been important. When they migrated from the old parser, they obviously didn't take into account the edge case where the HTML is broken, and they removed the empty chunk without adding logic to handle the use case.
this is why i write my thought process for most things, especially if it's "clever". forget someone else. *I* need to remember what I did and why 2 months later when I look at it again lol
This is a really great video. It explains the situation really well and easy to understand. I also massively appreciate how you put footnotes in the description.
I've been on YouTube since it was Google Video and I've never seen a video description like yours, good stuff, this video presentation goes tough as nails, I salute
I always feel uncomfortable seeing/doing something like p == pe to check if the end has been reached, interesting seeing those fears validated. I always do p >= pe, and add assert(p
This video and channel deserves so much more recognition. I'm almost halfway through the video and it's so well put together. Good wishes and I cannot wait to see you get the views you deserve for the effort put in. ♥♥♥♥
I have never been more proud to know that as a c++ dev I will always be needed because legacy moon runes written 30 years ago will inevitably fail when some obscure pointer spills over into undefined memory.
New favorite channel; love the high BPM background music, visuals, concise yet detailed explanation, overall format/structure, the Michael Bay explosions every 5 seconds, etc.
I had a stupid bug last month that heavily degraded performance; it also happened Friday night. Some years back I had added a new caching solution (one I wrote), and after 1.5 years of flawless performance, I added more usage. This tipped the scales, and the caching storage was exhausted. I then remembered I had forgotten to add a flush of storage in this case, so it got full and all new requests failed. I quickly added a flush, only a few hours later, but this was only a bandaid; it would fill, flush, fill. So after some quick sleep with a fever from the flu I had, I realized the cause (the cache was keyed on something unique, so every call created a cache entry with no hits, at 1+ million transactions per second), so I deleted the caching at that point and it flowed again, phew. The cause? A design change. The caching point WAS a good non-unique place for 6 months in development, but during bug testing someone altered it, so it became unique. And I had already done the tests for performance at scale, so it just wasn't noticed 😬 Luckily it was hardly noticed by anyone, but it could have been truly terrible. I work in a bank, and the finance engines were grinding to a halt. A process that runs some critical financing was still running after 16+ hours! After my hotfix, we terminated it and restarted, and it took 2 minutes (as it should) 😳 If I wasn't sick with a fever, I could likely have reacted faster, but thinking in that condition was as slow as a caching bottleneck 😂
Kevin! Just stumbled on this video and was watching until 5:00 before noticing that your channel is tiny. How are you this good at this humble size! Big ups to you my man. Thanks for the content
This video was super good and well-made! I don't know how to describe it, but it just felt good to watch. The visuals were just so satisfying, and I especially liked 6:51. I also really appreciate the sources, assumptions, and corrections in the description! Many big YouTubers don't cite anything and go by the philosophy of "well you shouldn't trust me entirely anyways so it's not my fault if I misinform you." Subscribed and liked, great video!
I just found your channel, must say I am impressed and like the content. I have a CS and Mathematics degree, and most channels don't give a deep enough dive or are just too cornflakes with water dull to pay attention. Thanks for the content
I've learned to add a >= instead of == even if you always expect the pointer to never get past the target, cuz you never know, right? This could've prevented this from happening as well
Yes, only use == when you truly want your code to run only when the variable is exactly that value; if the code can accept higher values, there's no reason not to use >=
10:20 Usually, if some code looks dumb, or inefficient, it’s because some software engineer before you working to a deadline had to get something out fast, not clean- and there’s probably a good reason for that “empty buffer” 😂
Such informative video! It's easy to follow and it taught me to value backwards compatibility more. I hope you get more views! Also kinda surprised to see Mr. Affable there
"why was an empty buffer added? No reason" I can almost guarantee that was a developer who thought "someone will inevitably do something dumb. Let's make sure there's nothing beyond the last buffer"... and the developer who removed it thought "this system must be perfect - no overhead. It's not like anyone's going to try to read beyond the last buffer".
Which is why, when doing range checks, you should not test for equality but for greater-or-equal/less-or-equal; that way an off-by-one error can only catch one extra char. Yes, it will cost performance, but if that is unacceptable, you need a much better understanding of the code. The extra empty buffer seems like a big red flag in my opinion: if it was explicitly added and not just the result of an off-by-one error, it probably served a purpose and should be thoroughly documented, or refactored/re-engineered into something less obfuscated. This is why I always encourage curiosity and ask all new devs to question any code they do not understand; if a more experienced dev cannot explain it in an understandable way, it's probably wrong, or at least bad code that should be rewritten :)
The only problem I see with this is: how the f did they not have 100% test coverage on something this important? If they had proper test cases, they would have instantly caught the issue with the new parser implementation. It's insane that these important digital companies don't follow the most basic coding practices; it's just mind-boggling to me.
No company can reach 100% test coverage. Seems you don't know how any important software is patched together in the real world. 😁 It's all a patch work. No software can be defect free.
I used to have a website, full of scripts that were poorly coded by me. It was full of unfinished tags i.e. those that weren't closed properly. It could have caused the whole internet to collapse if my website was visited by a lot of people
This is very common for how memory bugs occur in software. One programmer makes an assumption as to how the memory works, and writes their code accordingly, then some other programmer changes some other piece of code that makes it so those assumptions no longer hold, and voila, you have a bug.
There is about a 100% chance the original dev had this bug occur, didn't figure out the preincrement issue and simply added an empty buffer as the last data, "fixing" the bug just in time for the weekend
This one was a double-whammy: Code Optimization kills the safety hack the old engineer put in place (never documented; probably SOP to him). And a rookie mistake in implementing boundary check conditions. Back in my good old days of Delphi, the compiler had an option called Range Checking, which would guarantee this kind of bug would never see the light of day....however, it hindered performance and most devs never used it outside of debug builds.
Back at my college we were told to use >= or <= for buffer checks for exactly this reason. This precise condition (++p == pe) is based on the assumption that p will always be incremented by 1 and never by 2. But there is actually a possibility it will be incremented by 2 or even more. Simply using (++p >= pe) would fix the issue and prevent it from happening. It reminds me of old buffer overflow exploits somehow. The cause may be bugs in this or other code, solar or other radiation flipping a bit from 0 to 1, malfunctioning memory or registers, etc.
- Honey, I'm home! I'm sorry for being so late, me and the boys were saving the Ethernet. - Ah, I see. So you prefer your friends from work more than me?
Not to be that guy, but this is probably one of the most interesting examples why memory-safe RAII is so important, and in Rust this couldn't have happened
Well, it probably wouldn't have happened if cloudflare didn't go into the c code to manually optimise it. People can still shoot themselves in the foot while using Rust.
I only understood probably 10% of this, but from the very little that I did understand, it appears that the way HTML is parsed is part of the reason why it is still so important to learn lower level languages, like C. Also: VERY important: this is why we always use the proper comparison operators during an iteration loop! 😂
Early increment has its applications. Pre increment is not the cause of the bug. It's just that the code somewhere incremented again which caused the check to skip over the breaking condition. ++p >= pe would have been a safety net for unexpected behavior like this.
YouTube's algorithm must have a bug in it because I've never written a single line of code in my life yet here I am, left to conclude that anyone who has is not quite human.🤣😋
@@bytefu Yeah. imho functional programming languages are the best kind of language to code a parser in. I've coded a json parser in scheme before and that was a breeze. All I had to do was to slice up the string and pass it down to offspring functions. Having no shared buffer and not having to think about pointers and memory life cycle puts a load off the mind.