@@jskr456 There are little people in the network card shouting through the network cable, and in the receiving computer there are little people listening. Don't let Big Network deceive you
The write call looks like it's writing to the screen. Under the hood it writes to standard output. Under the hood, that's a write to a file descriptor. Under the hood, the file descriptor describes a terminal "device".
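That layering can be poked at directly. A minimal sketch (not from the video) using Python's `os` module, where `os.write` is a thin wrapper over the write(2) syscall:

```python
import os

# print("...") goes through Python's buffered stdout; os.write issues the
# write(2) syscall directly on file descriptor 1, which is stdout by
# convention, and which the kernel resolves to a terminal/pipe "device".
n = os.write(1, b"hello from write(2)\n")

# A file descriptor is just a small integer indexing the process's
# open-file table; duplicating it gives another handle to the same device.
fd = os.dup(1)
os.write(fd, b"same device, different fd\n")
os.close(fd)
```

Running this under `strace` shows the `write(1, ...)` calls with no buffering layer in between.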
The year was 2024. The Internet discovered system calls. Next season: containers are just processes; blinking an LED with TypeScript; the .eh_frame section.
At the end you described io_uring, but it can't be used everywhere, because sometimes you need the result of one syscall to decide whether, or which, syscall to perform next. Taken to the extreme, everything ends up running in kernel space. Enter eBPF.
Specifically, each line is its own write syscall: first "File contents:", then the buffer, then the empty string after the newline. This really isn't an equal comparison, because in Python he used an iterator to iterate over the lines...
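One way to merge those per-line writes into a single syscall is writev(2), which submits several buffers at once. A hedged sketch via Python's `os.writev` (POSIX only; the pipe is just a stand-in target so the result can be read back):

```python
import os

# Three buffers that would otherwise cost three write(2) calls:
parts = [b"File contents:\n", b"line one\nline two\n", b"\n"]

r, w = os.pipe()
# writev(2): one syscall, all three buffers, in order.
written = os.writev(w, parts)
os.close(w)

data = os.read(r, 1024)   # small payload, one read drains the pipe
os.close(r)
```

The kernel sees a single entry into kernel space instead of three, which is exactly the batching the thread is debating.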
To your point re: batching calls, look at sendfile. It takes two file descriptors and moves data from one to the other. When crossing between user and kernel space, memory buffers get copied (userspace cannot access kernel space), and that gets very expensive when sending large amounts of data.
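A minimal sketch of that (not from the video), using `os.sendfile`, which wraps the sendfile(2) syscall. The bytes move between descriptors inside the kernel, so they are never copied into a userspace buffer. Note the file-to-file mode shown here is Linux-specific; on macOS sendfile only targets sockets.

```python
import os
import tempfile

# Source file with some payload.
src = tempfile.NamedTemporaryFile(delete=False)
src.write(b"payload " * 1000)
src.flush()

dst = tempfile.NamedTemporaryFile(delete=False)

in_fd = os.open(src.name, os.O_RDONLY)
out_fd = os.open(dst.name, os.O_WRONLY)

total = os.path.getsize(src.name)
sent = 0
# sendfile(2) may transfer fewer bytes than asked, so loop until done.
while sent < total:
    sent += os.sendfile(out_fd, in_fd, sent, total - sent)

os.close(in_fd)
os.close(out_fd)
```

Compare this with a read/write loop: the same transfer would cost two syscalls *and* two buffer copies per chunk.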
Thanks for thinking out loud, and for letting everyone know that dumb questions are the best ones; they really fill in the loopholes in your understanding. It takes a lot of courage to do that. To answer the question you posited at the end of the video: I don't think that would be possible. Let's assume you are reading a 10 MB file and printing its contents to the screen, and you create a 1 MB read buffer. To read the whole file, you would need to make 10 read syscalls (keeping things simple here for understanding). Now if we employ your idea of making all the calls at once, that would defeat the whole purpose of having a memory buffer for read: if we issue all the read calls at once, where would the OS write the data it has read? The buffer is only 1 MB, and the point of having a buffer is to constrain memory usage.
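The loop being described can be sketched like this (scaled down; not from the video — `BUF` stands in for the commenter's 1 MB buffer, and the file is 10 buffers' worth of data). Each read(2) must complete and be consumed before the next one can reuse the same buffer:

```python
import os
import tempfile

BUF = 1024                       # stand-in for the 1 MB buffer
f = tempfile.NamedTemporaryFile(delete=False)
f.write(os.urandom(10 * BUF))    # stand-in for the 10 MB file
f.flush()

fd = os.open(f.name, os.O_RDONLY)
calls = 0
total = 0
while True:
    chunk = os.read(fd, BUF)     # one read(2) syscall per iteration
    if not chunk:
        break
    calls += 1
    total += len(chunk)          # buffer must be consumed before reuse
os.close(fd)
```

Batching all ten reads up front would need ten buffers' worth of memory, which is the commenter's point.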
Sorry, but this video has way too many problems ;)
(1) The number of syscalls impacts performance: a context switch between two threads of the same process, or a switch to kernel space and back, can take up to 5 microseconds. So if you are reading 1 byte at a time (silly example, eh? sadly not that silly :( ) and doing absolutely nothing else, you can read at most 200,000 bytes per second (probably less, but let's pretend it's a perfect world where you don't pay for the memory copies and various checks). Even if you pin the thread and get down to about 2 µs, that would still mean just 1,000,000 bytes per second... Newer kernel interfaces like io_uring implement VERY complex mechanisms to avoid context switching back and forth into the kernel!
(2) The Node.js version in use is fairly old, from the beginning of 2022; with 20.15 (which is the LTS, not the latest, so there might be even more performance improvements) it's down to below 700 syscalls.
(3) It's not clear which version of Python was used.
(4) This is minor, but the C code does a single printf while the Node.js and Python versions do one print per line; minor because the file contains just 2 lines.
NOTE: I am not a fan of Node.js at all, lol, I just like facts ;)
EDIT: It would also have made sense to compare code that was actually doing something, because perhaps a lot of these syscalls are made during the initialization phase and the count might be much lower while the code runs. That would have been an important and useful comparison.
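The arithmetic in point (1) comes straight from syscall counts. A small sketch (my own, not from the video) that counts how many read(2) calls the same file costs at different buffer sizes, using `os.read` as a 1:1 wrapper over the syscall:

```python
import os
import tempfile

SIZE = 64 * 1024
f = tempfile.NamedTemporaryFile(delete=False)
f.write(b"x" * SIZE)
f.flush()

def count_reads(bufsize):
    """Read the whole file in bufsize-byte read(2) calls; return the call count."""
    fd = os.open(f.name, os.O_RDONLY)
    calls = 0
    while os.read(fd, bufsize):
        calls += 1
    os.close(fd)
    return calls

tiny = count_reads(1)          # one syscall per byte: 65536 calls
big = count_reads(64 * 1024)   # one syscall for the whole file
```

At ~2-5 µs of switch overhead per call, 65,536 calls versus 1 is exactly where the "200,000 bytes per second" ceiling comes from.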
@@ghassenlabidi5171 To be honest I am not sure of the specifics; I am more highlighting the approach of this specific test/video. I understand that the original idea was to highlight all the extra work these platforms do for little reason, but it's also very much true that they are built with backend and/or long-running applications in mind, and are therefore not very optimized when it comes to startup.
To exclude initialization, you could start and end with a specific marker syscall and use awk to select just the relevant ones in between, then some shell magic to summarize.
Speaking of ChatGPT: a lot of the time when I'm out jogging, in between listening to your YouTube videos and other videos, I have conversations with ChatGPT voice mode about these sorts of backend and front-end integration issues and other technology ideas in the same or similar domains. It would be interesting for you to record a voice conversation with ChatGPT or Pi AI about these musings and see how it goes, then follow up on it to find out whether it's saying something useful or not. You would get immediate feedback from what might be an expert source if you prompt it correctly, and it would also let us see whether it can give useful information the way you do when we can't have a direct conversation with you.
It would be interesting to do control runs executing empty source files to adjust for Python/JS initialization, and deduce the syscall count of the read path specifically.
I think Node.js, Python, and Bun need an interpreter executable to run the file, so they might make more syscalls, whereas the C program is compiled into an executable and runs directly, so it makes fewer. I might be wrong though; just my assumption.
Even working with memory needs syscalls: malloc, under the hood, uses brk or mmap. You should have mentioned that at the beginning, but thanks for the video anyway. Following you from Algeria 🇩🇿
With synchronous syscalls, I don't think it's possible to "batch" them; the kernel wouldn't have the arguments. If you objdump the C code, you'll see that before each `syscall` instruction the code places values in specific registers. With io_uring, this batching is straightforward: you can submit 3 SQEs and wait for all 3 to complete. In the meantime your program yields to the kernel, and the kernel only resumes that thread when everything is ready. Then you loop through the CQEs and collect the results.
Shouldn't we also take into account the system calls that happen anyway, without reading files? I feel a lot of those syscalls are common to all Node processes. Also, some of those reads are due to the script itself, since Node and Python have to read the script passed to them. Comparing them to a compiled language like C feels a little unfair; a fairer comparison would have been Go vs. C.
Hussein is basically saying we need to avoid that kernel-user switching again and again, and instead do something like an UPSERT, or an asynchronous call like you'd make in JS. Hussein, I thought about it a bit and have a question: wouldn't an asynchronous call take more time? You'd surely have fewer syscalls, but would execution actually be faster? And wouldn't it be a problem if the 2nd syscall depends on the 1st one, which went, came back, brought some metadata (read/write), and the 2nd call uses it along with some user input? I'm probably mad, or I should do some breathing exercises.
You could use command buffers and queues, the way Vulkan does. And IDK anything about how any of this works, but if Vulkan does it, I'm inclined to think it's fast.
The real (and unsurprising) takeaway for me was how ugly JavaScript is… Can't even read from a file without wrapping everything in a lambda, just so that you can await something. JS really gives me nightmares.
It's interesting stuff, but I think this was a very unfair comparison because the programs did completely different things, and I really don't get why you are using an ancient Node version.
I think you're not giving Node, Bun, and Python a fair chance. Sure, the C program is fast, but if you think carefully, we should also trace the step where the .out executable is built in C. We execute that file, but to execute it we first have to make it, and that will surely take more syscalls. Just an observation to point out.
The JavaScript code could have been much better, especially with such an ancient version of Node. I think it misrepresented how bad JS is. JS is bad, but not that bad (copium).
Watch out mistaking a library for a framework, let alone runtimes, in the JS community… You're either gonna start a third world war or end up in a mysterious US accident. Really dangerous thing to get wrong…
I think this "benchmark" is quite frankly flawed. Not because it doesn't show anything, but because it shows things in the wrong light. What you have showcased here is not the objective execution speed of these runtimes or the number of system calls they make, but the behaviour they exhibit when starting up: setting up the environment, reading configuration, writing startup logs, and all the things they do right at launch, which they then almost completely stop doing. The program you wrote in C? It's compiled; it has no need to set things up, as that's been done in advance by the compiler. So many people structure their benchmarks in a way that is objectively flawed, simply because interpreted languages require startup overhead from the runtime that isn't there with precompiled programs. If we were to play a fair game, we would need to benchmark the compiler doing its own thing plus the compiled program, against a script running inside a runtime.
@@TheOnlyJura Sure, a guy comes with an actual argument, to which some random on the internet says "lol no" and leaves. Either provide an argument or go away.