
Parsing JSON Really Quickly: Lessons Learned 

InfoQ
230K subscribers
74K views

Published: 26 Aug 2024

Comments: 109
@vytah (4 years ago)
43:37 "cause you're assuming that the person running your program is not switching the CPU under you". The audience might be laughing, but this is actually sometimes the case, even in consumer hardware. The non-US Samsung Galaxy S9 has a heterogeneous CPU, with some cores supporting the atomic increment instruction LDADDAL and others not, and with the Linux kernel modified by Samsung to report that all cores support that instruction. Your program would crash after being rescheduled to another core.
@GeorgeTsiros (2 years ago)
What about the non-US S7 and S10? Got any more info, maybe?
@Kitejrpaladin (1 year ago)
Wow, you brought up a good point. His assumption is more or less valid in the sense that someone isn't going to hot-swap a CPU on you like a hard drive in a RAID array. However, that changes with current trends in CPU architecture, where cores within one package can be architecturally different. Examples are big.LITTLE in Arm CPUs, or Intel's P-cores and E-cores in 12th-gen x86-64 CPUs. If Intel, for example, had built that CPU with the microarchitectures of the P-cores and E-cores unchanged, the P-cores would have supported AVX-512 and the E-cores wouldn't have. To avoid that issue, the P-cores and E-cores support the same instruction set features, according to Intel.
@rikamayhem (3 months ago)
@@Kitejrpaladin Same instruction set is not enough, other CPU characteristics are already getting switched under us. For example, Mono had a bug in which the JIT failed to flush half the instruction caches on little-cores because they first queried the size on a big-core. And IIRC, some applications/runtimes have shown false sharing because they first queried the CPUID on an e-core, which reports a model that formerly required half the stride. At this point, they might as well drop the instruction set requirement.
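For anyone who wants to see what that failure mode looks like from code, here is a minimal sketch (my own illustration, GCC/Clang on x86-64) of the usual runtime AVX-512 check via CPUID leaf 7; the catch this thread describes is that the answer is only valid for the core that ran it:

    #include <cpuid.h>  // __get_cpuid_count (GCC/Clang helper)

    // True if the *current* core reports AVX-512 Foundation:
    // CPUID.(EAX=7, ECX=0):EBX bit 16.
    bool has_avx512f() {
        unsigned eax, ebx, ecx, edx;
        if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) return false;
        return (ebx >> 16) & 1;
    }

On a heterogeneous CPU, caching this result and then being rescheduled onto a core with different capabilities gives exactly the kind of crash described above.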
@toddymikey (4 years ago)
Good lecture for programmers interested in thinking pragmatically rather than just abstractly.
@MrMcCoyD4 (4 years ago)
I love this! I never knew about carry-less multiplication. I also didn’t realize that disk speeds and CPU throughput were so similar nowadays
@PatrickKellyLoneCoder (4 years ago)
You, sir, confirmed a theory I've had for the past year but was unable to work out: you absolutely can parse (at least partially) using SIMD. Which opens up the door to parsing with a GPU. Absolutely fantastic!
@SimonBuchanNz (4 years ago)
The real trick is "shotgun parsing", if you can break the input up into chunks and partially parse each chunk and recombine the results into a final parse: it looks to me like the quote escape and category bits could be separately processed and combined, for example. It would be interesting to see if that speeds things up when it's already so optimised: for example x264 is a CPU SIMD library to encode H.264 video that beat the contemporary GPU encoding libraries, many of them written by the GPU vendors!
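As a scalar sketch of that chunking idea (my own illustration, not simdjson's actual code): each 64-byte block can independently produce one bit per byte marking, say, the quote characters, and those per-block masks are cheap to combine afterwards; carrying escape and in-string state across block boundaries is the part that needs care.

    #include <cstdint>

    // One bit per input byte: bit i is set when block[i] is a double quote.
    // A SIMD version replaces the loop with one compare plus a movemask.
    uint64_t quote_bits(const uint8_t block[64]) {
        uint64_t mask = 0;
        for (int i = 0; i < 64; i++)
            mask |= (uint64_t)(block[i] == '"') << i;
        return mask;
    }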
@trinopoty (4 years ago)
The real problem is that not everything can be parallelized. Certain algorithms are just impossible to parallelize beyond a certain point, if at all. For instance, algorithms like SHA or AES look like they'd benefit from parallelizing at first glance, but then you realize that they just can't be, because each byte depends on the previous byte. And let's not forget that maintainable code is better than efficient code most of the time. The mathematical shenanigans pulled by this library may be fast, but they'd need a 3-day class to explain to anyone looking to maintain the code.
@PatrickKellyLoneCoder (4 years ago)
@@trinopoty Yes, absolutely, but there are parsing algorithms which, to extremely oversimplify, grab chunks of text and then determine what they are, rather than reading byte by byte, char by char, etc. SHA and AES are cryptography, not text parsing. ;)
@trinopoty (4 years ago)
@@PatrickKellyLoneCoder Yes, those are not text parsing, just random examples that came to mind. You can do text parsing in chunks if you can break it down properly. But, like I said, at what cost?
@SimonBuchanNz (4 years ago)
@@trinopoty If I'm not mistaken, both AES and SHA are block-oriented precisely so that they can use SIMD to process multiple bytes at once.
@plusmanikantanr (4 years ago)
Has anyone managed to figure out the XOR multiplication instruction he mentions at 35:54, and which instruction sets and CPUs have it? The entire lecture is too darn good... very relevant in this day and age. All the kids think NodeJS or serverless is the new hot topic; most people forget that we are still limited by the hardware and wiring that exists.
@patrickproctor3462 (4 years ago)
Looks to me like a shift left (zeros shifted in on the right, rather than taking the bit from the left), SHL in x86, combined with a simple XOR. As for using AVX/AVX2 to do it on multiple characters at once:

template <int N>
__m256i _mm256_shift_left(__m256i a) {
    // mask = [0, low lane of a]: supplies the bytes that cross
    // the 128-bit lane boundary during the shift.
    __m256i mask = _mm256_permute2x128_si256(a, a, _MM_SHUFFLE(0, 0, 3, 0));
    return _mm256_alignr_epi8(a, mask, 16 - N);
}
@MrMcCoyD4 (4 years ago)
Did some googling and I'm pretty sure he says "carry-less multiplication", which has a lot of results and involves XOR. The math isn't super clear to me though.
@DanielLemirePhD (4 years ago)
It is the carry-less multiplication, also called vector multiplication or polynomial multiplication. You can use it via intrinsics like vmull_p64 and _mm_clmulepi64_si128. It is basically "like" a multiplication, but where the addition is an XOR. So 3*a would be a XOR (a << 1).
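To make that concrete, here is a sketch (assuming the _mm_clmulepi64_si128 intrinsic named above; needs PCLMUL support, e.g. compile with -mpclmul) of the trick the talk uses it for: carry-less multiplying the quote bitmask by an all-ones word computes a running XOR of the bits, which sets a 1 for every byte position between an opening and a closing quote.

    #include <wmmintrin.h>  // _mm_clmulepi64_si128
    #include <emmintrin.h>
    #include <cstdint>

    // quote_mask has one bit per input byte marking '"' characters.
    // The carry-less product with all-ones is the prefix XOR of those
    // bits, i.e. a 1 for every byte inside a string literal.
    uint64_t inside_string(uint64_t quote_mask) {
        __m128i q    = _mm_set_epi64x(0, (long long)quote_mask);
        __m128i ones = _mm_set1_epi8('\xFF');
        return (uint64_t)_mm_cvtsi128_si64(_mm_clmulepi64_si128(q, ones, 0));
    }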
@patrickproctor3462 (4 years ago)
@@DanielLemirePhD Oh, that is some wacky stuff. Thank you for the information.
@drygordspellweaver8761 (1 year ago)
@DanielLemirePhD Hey Daniel, amazing work. Can you point out any good resources for all the bit-shifting magic that people have come up with over the years? Or even general guidelines / glossaries of the techniques?
@yohan9577 (4 years ago)
Impressive bit-field wizardry 😮 Thank you. This talk is giving me inspiration to rewrite a few things without branches, and to learn more about "bitwise" intrinsic operators!
@patrickproctor3462 (4 years ago)
Depending on the context (and wizardry of the hardware) branching might still be faster. Make sure you benchmark before & after :)
@yohan9577 (4 years ago)
Patrick Proctor Of course! Maybe I should have written « without branches when they are not really necessary ». I used to think that branches were always « harmless », but now I understand potential mispredictions.
@patrickproctor3462 (4 years ago)
@@yohan9577 It depends on the program. If it's "small" and doesn't have many, you will probably never miss.
@PatrickKellyLoneCoder (4 years ago)
@@yohan9577 There's a lot of information about this in the gaming space. I don't remember what it's called but there's a presentation from Sony about optimizing games, and they found quite a few cases where outright calculating large things and not needing them was better than a branch misprediction. It's weird. But yes, always always always benchmark and profile. Branching can speed up your code when correctly predicted. Inlining can actually hurt performance in certain cases. And so on.
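A toy illustration of that trade-off (my example, not from the Sony talk): the branchless version always pays a few extra instructions, while the branchy one is nearly free when predicted and costly when not, so only a benchmark on your real data can pick the winner.

    // Branchy: cheap when well predicted, expensive on a misprediction.
    int max_branchy(int a, int b) { return a > b ? a : b; }

    // Branchless sketch (assumes a - b does not overflow):
    // mask is all ones exactly when a < b, selecting b via arithmetic.
    int max_branchless(int a, int b) {
        int diff = a - b;
        int mask = diff >> 31;  // arithmetic shift: 0 or -1
        return a - (diff & mask);
    }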
@garryiglesias4074 (4 years ago)
@Patrick Kelly - It's not weird, it's a matter of caches, value dependencies, and many very concrete things. I love theory, but some "experts" fresh out of their PhD should just keep a low profile instead of reducing everything to some abstract or "standard" stuff, over-abusing "don't reinvent the wheel" to the point that we'd still have stone wheels if there weren't some "pragmatic" guy who actually looked at how things work and what they imply. Yes, benchmark, but if you're satisfied with your benchmark, believing it's the fastest thing in the world, you might be wrong... So benchmark, and also ask the old fellas who did impressive things with less than ONE MEGAbyte of RAM, less than 66 MEGAHz of clock, and 32 bits (the latter two if you were lucky, or not so old). Don't reject them as "obsolete old C programmers".
@fredg8328 (4 years ago)
Good old bitwise tricks. Reminds me of how I coded 30 years ago.
@ShadowTheAge (4 years ago)
Aren't you supposed to use classes like LeftSquareBracketMatcherStrategyFactory?
@patrickproctor3462 (4 years ago)
and LeftSquareBracketMatcherStrategyStringReturnerFactory? Long live Kevlin Henney xD
@markthomas9641 (2 years ago)
Great example, thank you. DeltaJSON is really useful if you are working with JSON; it does compare, merge, and graft.
@whamer100 (8 months ago)
GTA5 uses the inverse of this for its JSON parsing: it takes 6 minutes to parse a 10-megabyte JSON file for some ungodly reason.
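For context, the widely reported cause (from the 2021 "How I cut GTA Online loading times by 70%" writeup) was not the format itself but an accidentally quadratic scan: sscanf was called once per token, and that platform's sscanf ran strlen over the entire remaining buffer on every call. A minimal sketch of the anti-pattern:

    #include <cstdio>

    // Each sscanf call re-measures the whole remaining string, so
    // parsing n tokens touches O(n^2) bytes in total.
    void parse_tokens(const char* json) {
        char token[128];
        int consumed = 0;
        while (std::sscanf(json, "%127s%n", token, &consumed) == 1)
            json += consumed;
    }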
@andrzejsupermocny2386 (2 years ago)
If anyone else is confused by slides 40 and 41: there is an error, and the order should be brackets -> 1, comma -> 2, colon -> 4, most whitespace -> 8, the remaining whitespace -> 16, other -> 0. Then the math checks out. Had me scratching my head for a long time...
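A scalar sketch of the corrected table (the exact split between the two whitespace classes is my assumption; simdjson computes the same classes a whole vector at a time using shuffle-based nibble lookups):

    #include <cstdint>

    uint8_t classify(uint8_t c) {
        switch (c) {
            case '{': case '}': case '[': case ']': return 1;  // brackets
            case ',':  return 2;
            case ':':  return 4;
            case '\t': case '\n': case '\r': return 8;  // "most whitespace" (assumed split)
            case ' ':  return 16;                        // remaining whitespace (assumed split)
            default:   return 0;                         // other
        }
    }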
@allanwind295 (4 years ago)
Interesting lecture with a good mix of high-level strategy and low-level (pseudo) implementation details.
(4 years ago)
Very nice talk.
@willw2596 (6 months ago)
This is an amazing presentation. Thank you!
@eformance (4 years ago)
If you have learned to program in assembly language, all of these optimizations are second nature. Everyone should *learn* an assembly language and spend time hand-optimizing their code. Learning assembly will make you a better programmer and influence your coding strategies in time-critical areas. That being said, optimizations often cause readability to suffer, so be wise about how you use these coding strategies.
@KarlForner (1 year ago)
Very impressive, and nice speech. Thanks.
@dielfonelletab8711 (4 years ago)
Now I wish I had a sticker!
@yjhwang4246 (4 years ago)
It's a really excellent approach, but I have no idea how to measure the cycles... Is there any tool?
@breakprismatshell6270 (3 years ago)
perf
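For example (binary and input file names made up), perf's stat mode prints the same counters the talk quotes, with no code changes needed:

    $ perf stat -e cycles,instructions,branch-misses ./parse twitter.json

instructions divided by cycles gives the instructions-per-cycle figure, and cycles divided by input bytes gives cycles per byte.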
@trinopoty (4 years ago)
I think if you need that level of performance, just ditch JSON in favor of something binary-based like Protobuf.
@impero101 (4 years ago)
I was actually about to write the same thing. If you're in a situation where your critical path is parsing JSON sent over HTTP requests, you're much better off just using a different protocol. Although for parsing large JSON files, the performance increase of this library can certainly be a big deal. I reckon the speaker also knows this, which is why they benchmarked it using large JSON files rather than many small snippets.
@yadancenick2671 (4 years ago)
"Our backend developers spend half of their time serializing and deserializing JSON." From what I've worked on, JSON serializing and deserializing happens only at the entrance and exit of a request's processing. During the request's processing, the "JSON" will be something like a map/dict or an Object. And in a typical backend server, the time saved on JSON serialization compared to DB calls will be almost negligible. But the magical branchless parsing is so amazing and beautiful.
@blackwhattack (3 years ago)
If you want fast float serialization, you should probably dump the in-memory representation, no?
@Axman6 (2 months ago)
JSON uses a textual representation of numbers (and the standard allows for numbers that are unrepresentable by double and float, though most parsers don't parse all valid JSON; Haskell's most common parser is a notable exception).
@JaapvanderVelde (4 years ago)
5:15 I'm sure Douglas Crockford would at least write valid JSON, i.e. not miss a closing curly brace...
@DanielLemirePhD (4 years ago)
The typo was reported to me. I suspect it is a copy-paste error (the last character got cut off). This being said, it does remind us that not all JSON is well-formed.
@llothar68 (1 year ago)
Now I wonder what magic a binary format could do with SIMD bit tricks. I know Protocol Buffers and some others like BJSON are not doing this.
@smooth_lighting (4 years ago)
👌🏼
@platin2148 (4 years ago)
You might be faster by not using the JVM, which needs to do emulations of the real platform (e.g. use POSIX write/read instead of Java's sh…), and by moving to a language that was actually designed to be easier to analyze and optimize.
@ccreutzig (4 years ago)
I believe that is part of what he is doing. None of his optimised code is going to work in Java.
@brunoais (4 years ago)
@@ccreutzig I think it is fine to run it with Java. Just use native methods. That's how Java works fine on many different platforms. Every time very high performance is needed, native methods can be a good solution. This one here is easily one of them.
@brunoais (4 years ago)
I think it is fine having it run in the JVM, but as a native method.
@ccreutzig (4 years ago)
@@brunoais Sure, if you have a reason to use Java, go ahead. Many of us don't, and that is fine, too. That wasn't my point; all I was saying is that the video doesn't show Java.
@brunoais (4 years ago)
@@ccreutzig Oh, I misunderstood your comment. I've coded in Java, C, and many others... which is fine too :)
@subschallenge-nh4xp (4 years ago)
I have started learning JSON this year; can somebody tell me what's going on in the video?
@subschallenge-nh4xp (4 years ago)
So, so sorry. It was spoken or written.
@rumfordc (4 years ago)
@@pictureus What Systems Planet said was not "elitist" in the slightest; he gave him the exact answer in the simplest terms possible.
@Voy2378 (4 years ago)
Zen 2 also has AVX-512
@DanielLemirePhD (4 years ago)
It does not.
@Voy2378 (4 years ago)
@@DanielLemirePhD Yes, my bad, thanks.
@KaiHenningsen (4 years ago)
Nice content, but somewhat painful to listen to.
@cerebralm (4 years ago)
Yeah, I love it, but I can't stop myself from hearing him say "jism" every time he says "json".
@Ryan-hp6xs (4 years ago)
Or... stop using JSON and start using Protocol Buffers. Better in virtually every way.
@SimonBuchanNz (4 years ago)
I've used protocol buffers. They are not better in virtually every way: they are better in several ways and worse in others, just like Avro is better in some ways and worse in others than both of those, and so on for most other formats. There's no simple "best format", and it's foolish to simplify like that when the available options are so rich.
@rudolphriedel541 (4 years ago)
Getting rid of JSON might be a good idea then.
@rudolphriedel541 (3 years ago)
@Peter Mortensen Something that is optimised for the machine to work with; anything but ASCII, for starters. Probably something that was invented a couple of decades ago.
@ChrisHinton42 (4 years ago)
And the moral of the story is: don't use JSON
@sanderd17 (4 years ago)
Why not? JSON has many perfectly valid use cases for passing data between two parties. It's well standardised and has libraries in all kinds of environments. If you invent something on your own, you'll certainly introduce bugs every time, and the performance will be worse than JSON too. It will only be better if you can invest a massive amount of time in it. It is an argument against microservices, though: in a monolithic program you can easily communicate over memory structures, but microservices often use JSON communication, which causes slowdowns.
@ChrisHinton42 (4 years ago)
@@sanderd17 Umm, what? You're right about JSON being very standard with a lot of users, but it is not hard to write your own format. Maybe it's harder in JavaScript or other scripting languages; I haven't done that. But I have done it in C++, and it's super easy. Binary formats are even easier to write than text formats, and serialization/deserialization is really easy to test. This presentation has some cool stuff, but it's all totally unnecessary work if you just make a format that's something like a version number, a checksum, and a struct dumped to disk. If it's an internally used format, you don't need to do any parsing, and the fastest hash functions can validate the data faster than your memory, much less your disk. If it's an externally used format, it needs a tad more thought given to flexibility, ease of validation, and versioning. But it's just not a hard problem. So as a game developer, JSON just seems like a mediocre data exchange format that involves a bunch of pointless CPU work, and a bunch of pointless optimization effort if you need it. But if performance never becomes an issue, then you're right that using JSON in any managed language takes basically zero development effort, and with so many people using it, it starts to make sense to use it externally from a pure business perspective. Sigh.
@sanderd17 (4 years ago)
@@ChrisHinton42 Ah, there you say it: you're a game developer. Games typically have very little communication with third-party applications. If all code is controlled in-house, it's indeed easy to make a good binary format. And in games, CPU time is a very limited resource, so you are allowed to spend a lot of time on optimizations. I work on business applications: communication between all different parties that require all different data. If the other party supports JSON, I know it's going to be a lot more stable and easier to develop. If they require their own text or binary format, there are always bugs in the protocol. Performance is also a nice-to-have, but not an absolute must. It's nice if a user has a snappy interface, but there's more benefit for the company in automating additional tasks.
@Niosus (4 years ago)
@@ChrisHinton42 The original use case for JSON, and probably still the main one, is for browser-server communication on the internet. In those cases you are most of the time limited by network bandwidth and latency. It's very useful there, because you can parse responses from random servers without having to know the format of the message exactly beforehand. It's perfect for quick and relatively small communications over a network, and being human-readable makes it very easy to debug if something goes wrong. If you're going to pass larger amounts of data, especially if the structure is known in advance, something like Protobuf is already going to be many times faster while probably even being easier to implement (as you get all the type checking for free). It's not as useful for web APIs, but for regular applications it's just great. It still isn't the fastest, but odds are you will not be bottlenecked by disk IO... Different formats for different use cases. A Ferrari is faster than a truck, but sometimes you need to move a container. A truck is your best friend in that case.
@ChrisHinton42 (4 years ago)
@@Niosus Parsing JSON and passing it along without doing anything with what's inside sounds like a waste of your limiting resource: bandwidth and latency. Not to mention there's nothing about JSON that makes that better than any other format. And if you're just validating it, isn't it the case that a valid JSON message doesn't necessarily make a valid API message? I still think that the only thing JSON has going for it is that a lot of people use it, and therefore there are many libraries. Text formats are good for debugging, but it's super easy to write a binary-to-text viewer for debug builds and viewing logs. JSON is general purpose, and general-purpose tools are by definition not tuned to a specific purpose; expediency or expense is the only reason you would use one. And I guess my point is that making the specialized version is actually pretty easy. And if backends are really spending half their time serializing and deserializing JSON, then it would be way cheaper to use a custom solution (of course, depending on traffic, any time spent with JSON could be really expensive in absolute terms). But maybe I am being naive. Why is Protobuf not useful for web APIs? I don't know much about it, but it looks approximately like what I would propose as a middle ground between general purpose and specialized.