This was so good, I would just like to say. And my great-great-great-grandson, William, is very primed and ready to tell you about u8 and its downfall.
Absolutely great video. You managed to cover almost everything and keep it simple. There's only one huge problem: I'm 257 years old, and that invalidates everything.
This guy is one of the better discoveries I've made on YouTube in quite some time. This is helping me pass my Operating Systems exam. Btw, the books referenced a couple of times in your videos are the same ones I'm currently learning from. Great minds think alike. 👍
Actually future cyborgs and androids have a good chance too. Not sure if they'll still code in Rust then, but what I am sure about is that software can be updated. So any robot in the future is probably able to post a feature request. 😂
@@CoreDumpped Well no, while I don't entirely agree with the commenter, his point about padding is right: a u8 could be padded when used in a structure, but when incremented above 255 it'll still wrap around. Padding just exists to align structures and help the CPU cache.
Tbh I still try to use a uint16_t (the C++ equivalent of u16, if I'm reading the syntax right) when possible, because I tend to bunch data together in ways where the data savings are noticeable (and I often have two 16-bit values next to each other, which won't get padded). Admittedly, using a uint8_t/u8 is kinda stupid for age, since a troll could very easily set their age to 300 or something, but with a 16-bit integer you could validate to 3 base-10 digits and call it a day. The only case where I'll use a large number of 8-bit values is a string of ASCII characters; otherwise you do create potential overflow issues and the data savings typically aren't worth it (whereas you typically won't overflow a 16-bit value with normal-ish arrays).
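The padding and packing behavior these comments describe can be sketched in C. The struct names are illustrative, and the exact sizes depend on the ABI; the values in the comments assume a typical 64-bit platform where uint32_t is 4-byte aligned:

```c
#include <stdint.h>
#include <stddef.h>

/* Two 16-bit fields pack together with no padding between them,
   so the pair really does save space vs. two larger integers. */
struct pair16 {
    uint16_t x;
    uint16_t y;
};  /* typically 4 bytes total */

/* A lone u8 next to a u32 forces 3 bytes of padding so the u32
   stays aligned: the "savings" of the small field disappear. */
struct padded {
    uint8_t  age;
    uint32_t id;
};  /* typically 8 bytes, not 5 */
```

On a typical 64-bit target, `sizeof(struct pair16)` is 4 and `sizeof(struct padded)` is 8, which is the point both commenters are circling: the small type only saves memory when its neighbors let it.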
@@CoreDumpped I don't know what that commenter was smoking, but there is an actual reason to avoid using small ints. I don't know if this is an issue with Rust, but in C small ints can be promoted, which can cause errors that are very difficult to debug.
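The integer-promotion pitfall mentioned here is a real C rule: operands narrower than int are promoted to int before arithmetic, so intermediate results don't wrap the way the small type suggests. A minimal sketch (function names are illustrative):

```c
#include <stdint.h>

/* The uint8_t operands are promoted to int before the addition,
   so the result here is 300, with no 8-bit wraparound. */
int promoted_sum(uint8_t a, uint8_t b) {
    return a + b;
}

/* Only when the result is stored back into a uint8_t does it get
   truncated mod 256: 200 + 100 becomes 44. */
uint8_t wrapped_sum(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

/* A classic trap: ~x promotes x to int first, so for x == 0xFF,
   ~x is -256, not 0x00, and this comparison is false. */
int complement_is_zero(uint8_t x) {
    return ~x == 0x00;
}
```

The last function is the kind of "very difficult to debug" case the comment alludes to: code that reads as 8-bit bit-twiddling is actually evaluated at int width.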
According to them, we're already at fault for using any fixed size memory medium, because in the future, our software won't cover the case where a person/entity will leave for eons, way above our limited understanding of how long a thing can exist in the Universe.
That "as an AI model" jab with the voice was hilarious. This was a great refresher since I could hardly remember anything from my own operating systems unit.
Excellent video. By far the most educated explanation of stack vs heap without digging unnecessarily deep and losing the point. I love the part where you reiterate what "heap is slow" actually means: "heap allocation is slow". Would love to see a follow-up video on how to access the heap fast by leveraging cache-line alignment, packed data structures, etc. Day to day I get worried about accessing heap memory in a hot path, even when it's a vector pre-allocated once to be reused for each loop. I can't reason, for every loop, about whether to pre-allocate a dynamically sized vector on the heap (and clear it on each iteration) or use a large enough fixed-size array so it's always on the stack (although that's potentially a waste of memory, because you can only guess the worst-case size of the array at compile time). Looking forward to the next video. Greatly appreciated.
Heap and stack management are two of the most fundamental topics in programming languages, and understanding them can spare a lot of headaches so you can secure and debug your program easily. Thank you for your videos. Keep it up.
I cannot agree more. I took the last 2 months just learning the stack and heap, and I got so much better at debugging programs with dynamic memory systems. I also got very good with linked lists, trees, and recursion. I learned so much in such a small timeframe that I can safely say this is really fundamental to growing as a low- to mid-level programmer.
@@OneMilian if you want to understand linked lists, recursion and trees conceptually you cannot do so by understanding the low level stack or heap lol.
@@samuraijosh1595 Of course you can, look: if you understand the heap, you can already picture what happens with a linked list when memory for a new node gets allocated on the heap, like in this video. If you couldn't, then the youtuber Core Dumped would not implement data structures in this video about the heap. Recursion is one way to move between the nodes. If you know the heap, you can definitely see and reason about what he does.
This video is EXCELLENT! Like you said at the start, there are some simplifications, but it's totally accurate and very comprehensive. I'm one of those "more experienced people" you mentioned at the start (at the end of an extended 6-year undergrad in compsci), and I'm impressed by how much info has been crammed in here. I'm about to start a master's degree with a focus on pedagogy (the study of teaching), so you can imagine I spend a lot of time explaining these topics to my juniors. I think you've found an excellent level to simplify things to for someone who may be new to this sort of thing. Good examples always help, you transition to them well, and the "story" of the video flows well. Great work on this video. I'm going to go and watch the first video on the stack, too. Even if I've thought about this stuff a ton over the years, this was a great refresher on a few of the concepts I've gotten rusty on (pun not intended) or even completely forgotten about. Looking forward to the next video!! 😄
6 years of undergrad lol wtf? 3 years of normal undergrad is more than enough. Most people could’ve gotten a masters + honours year in the time you spent dicking around
@@kotfare1698 People have lives that can get in the way, such as having kids, having to work to survive, thus making it take longer to get a degree. Most people often quit as well. I'm happy mattshnoop kept going.
Great video! Good explanations at a nice intermediate level! I can see a couple of things to add for a bit more depth. First, the performance problem with allocating memory on the heap is a good example of where it's more about the variance (or unpredictability) than the typical case. For most reasonable heap implementations and kernel page allocators, allocations will be really fast, but a small number of times, it's going to be really slow, and that can have significant impact on "quality of service". Second, in that vein too, I would say the overhead of requesting memory from the OS / kernel isn't really the context switch. That's pretty negligible compared to page allocation. What's really bad is the kernel lock. The point is that the kernel has to deal with multiple processes requesting memory at the same time, and so, it has to have a global lock to synchronize those allocations between processes. Clever things can be done to avoid hitting that lock too much, but it still has to be there and you will inevitably hit it. This is one of the few places where multiple processes on a system directly interfere with one another (beyond just generally sharing limited CPU/Mem resources). And kernel locks can be a big deal, I've seen wait times in the order of a few seconds when hitting a kernel lock, which can be a major disruption in the middle of an operation you expect will only take a few milliseconds.
@@CoreDumpped You will be providing a huge service for us. This video taught me more about these several different topics than months of watching videos or reading articles about the same things.
I don't know about the reality of kernel locks related to memory allocation; I never knowingly experienced one in 30+ years. But I also don't really see the necessity. The kernel knows and controls which processes are running in parallel at any moment in time, at most one per CPU core. All that would be needed to avoid contention is to assign a pool of pre-allocated memory pages to each CPU that only this CPU (and whatever process is running on it right now) can use. The assignment can be stateless (e.g. using a hash code of page addresses). I would be surprised if something like this is not already implemented in Linux kernels in one way or another. Also, once a process requests memory, it's no longer running (because of the system call). It's of course wrong to think of "the kernel" as if it were a single-threaded thing. Multiple parallel alloc system calls running on different CPUs can well be contending. But such concurrency issues are the bread and butter of kernel land. I don't think that anything people like us can do in user space will be more efficient than battle-tested algorithms running in the kernel, written by people spending every working day on such topics for years.
Also: how can a kernel lock possibly result in wait times on the order of seconds? This sounds like the kernel was waiting for a process to finish playing Air on the G String. Or less offensively, doesn't that smell of things like deadlocks?
Excellent explanation of complex concepts! I wish you could cover booting, BIOS and UEFI in detail, along with MBR and GPT partition internals, plus a deep dive on namespaces, cgroups and file systems.
After I wrote a compiler, I realized that a lot of the power comes from the fact that you can use relative references from the knowledge you have about the stack at compile time, which makes it so powerful. There is no lookup, it just knows the address of the variable you are trying to access. This was after optimizations, at the layer of code generation.
The comment put on blast in this video makes me think of something John Carmack said in his appearance on Lex Fridman's podcast; I'm definitely paraphrasing, but Carmack said something about how it can be useful to include limitations in your software that will notify you when things have changed greater than you ever thought they would. He was speaking in reference to limit-removing ports of the Doom engine, but I think it's pertinent here too. It's okay to write software that addresses your current needs and your current use case using the technology currently available, and sometimes that will force you back to the drawing board to some extent in the event that something related to your needs, use case, or technology changes, but that's not necessarily a bad thing even if it causes more work, because if something has changed that much, it's quite likely that other things have also changed such that other parts of your software need to be revised or at least reevaluated. So basically, what I'm saying is that the event of maximum human lifespan more than doubling from its current state would probably be part of much broader and more sweeping changes to the world that would call for a more holistic revision of your software. But because there does not seem to be any realistic probability in any remotely foreseeable future of a human being living to age 256, it is perfectly reasonable and sensible to represent human age with a u8.
On modern architectures the difference in memory taken up by a u8 vs. a u64 is often zero, and as such the program could have just used the u64. This is because it is faster to access at the native memory width than to address memory as finely as 8 bits. Still, the reasons behind that comment were silly, as the longest I've ever heard of a program being used without semi-regular maintenance is 10 years.
There's one more thing that can make stack faster. Instructions may use immediate addressing relative to the stack pointer. Accessing data on the heap requires loading its address to a register first (not on all architectures), which not only is an extra instruction but additionally makes that register unavailable for anything else.
One more thing: there are architectures for some embedded devices, where the RAM for stack is just faster to access than the RAM for heap. Basically, those devices have two different types of RAM. For such architectures, accessing array[3] will be faster, if the array is stored on the stack, than on the heap.
...but once you have that address in that register, if you've designed the program well, you'll have a nice amount of cache-friendly data to iterate through, avoiding many cache misses.
...and if you've designed the data really well, you can also avoid many of the headaches with multi-threading, including performance hits which evict data in registers and L1 out to the slower shared L2 or L3 caches (i.e. false sharing).
The explanation at the very end is much better than the entire video, except for GC which does not protect against leaks anywhere near as much as people like to believe; there is such a thing as reference leaks.
Man, I've watched all your videos and I have to say it: impressive. Very well explained (even with the simplification, which is 100% necessary). Keep it up, nice work :)
Tysm for making these. I've been learning Rust lately, and as part of that I've been trying to learn how to best use the low level features it provides. I've gotten the hang of references and am getting to grips with lifetimes, and learning about things like the heap and stack are helping a lot with learning what Rust is doing and what it's protecting me against when it's particularly strict in a region
You know, in the past two months I spent weeks reading and searching on these low-level concepts (explained in more or less detail) after losing hope on YouTube. Thankfully there is someone today 😇
Please keep these videos coming! I'm coming from JavaScript with very limited insight into the amazing world of low-level programming. I'm currently learning C and your videos got me hooked!
An amazing explanation for someone who has experience programming but not in a language or context where I ever have had to think about or learn this stuff. Thanks!
I've been getting a lot into memory management and I gotta say this is the best video I've found. You answered all my questions and explained everything perfectly! Also, the AI voice didn't bother me much after a few minutes. Thanks for sharing this, and please continue to make more!
I can't believe that I can finally understand what the heap and the stack are! I'm looking forward to more videos, especially the ones about GC and threads. (I don't usually comment, but this channel is just something else. I'm also just doing my part to help the algorithm lol)
As a Python, Lua and Bash wannabe scripter, I'm happy to have discovered your channel. I showed your other videos to a friend struggling to shift from C++ to Python and he was amazed; we both understood how and why pointers generally work in C++, what a stack overflow is, and how high-level vs low-level languages fundamentally differ, both in code and under the hood. I'm also inspired, though I have to first finish some high-level language battles. I'm even heavily conflicted between C++ and Rust. It's going to be a challenge, but ain't that the beauty of language (moving different parts within certain parameters to attain a certain output... like the way we arrange nouns, verbs and adjectives in a foreign language to get an output/meaning). Thanks... Arigato and Valar Morghulis.
Such a unique way to explain the very core of everything. No matter how advanced AI becomes, this will remain the core. Kudos for such effort. Can I at least expect a data structures series from this unique visual perspective? Thank you.
Oh wow, I found this channel as a refresher on things I mostly know, for interesting stuff I never delved into (how DRAM functions was interesting), and as something to send to people who I think need to know this (usually junior devs). Who would have guessed I'd find Prime in the comments by like the 3rd video of the channel. Anyhow, very good job with this stuff.
Are you really an AI? Your content is far better than the content of mere humans. BTW, what app do you use for the fantastic visuals and soothing voice? Though those would be nothing without well-thought-out, thorough and structured content. Bravo!
Having only worked with garbage-collected languages in my career so far: it's great having these concepts and their implications actually explained. But it leaves me scratching my chin, wondering if I can use any of this knowledge to make my code run faster and/or more efficiently. Just from a gut feeling I'd say this is impossible for PHP, Bash and Python, but maybe there's a chance with Go.
Great. You covered the heap in a lovely and intuitive way. The only thing I would add would be alternative allocation techniques, such as arena allocators, with their upsides and downsides.
Videos like these are really helpful as they help visualise the key concepts we learn in computer science. Take data structures for example, I kind of knew why we use them but seeing an animation for it really helps clear things up. I hope we get a series about the different data structures and their needs too.
You're one of the best out there at explaining weird programming concepts that are obscure for some of us; your videos are of very high quality and very well explained. I am really looking forward to that arraylist video though, that tickled my curiosity.
Amazing video! Just found the channel, but you explained everything in a very digestible format that really helped me understand concepts I was having trouble with.
I have never seen this so well-explained. Excellent job! This series of videos will be my go-to resources for teaching others these low-level concepts.
This video is also very good for explaining why things like Java have Xms and Xmx, and why it is best to set Xms equal to Xmx (pre-allocating all the RAM you want to allow the program to use ahead of time), as the syscall for allocating memory takes time and is expensive.
One of the best channels at explaining low-level concepts. I have just started with C recently in the hope of learning more about these things, and later on moving to Rust, and I must say you are really good (even as an LLM haha).
15:13 Another solution is to just allocate 3 pages at once and set the middle page as no-permissions for anything (thus ensuring a segfault can occur instead of corrupting internal data), then extrapolate the address of the linked list via pointer - pagesize * 2. Done well, you can take any address in the data heap and identify what data it belongs to. However, allocators should also be designed to allow a choice between "first, best & worst" fits. They should likewise be designed to treat buffers and normal data differently, e.g. allocmem( WORST_FIT | DEDICATED_HEAP, NULL, size ); By having dedicated heaps it's possible to prevent buffer overflows from corrupting other data by way of a segfault.
As a senior engineer, this is an important lesson for juniors to learn. The heap can be fast; it's all about how you design your data structures. For example, a linked list can be implemented with a contiguous vector as storage (you must manage the free slots a bit), and it can be really fast.
Yooooo, this video and explanation are amazing, I have been having a vague concept of thes topics for a while now, but this really helped me grasp it much better. So thanks a ton for all the effort in the animation and script!
I love the level of Reddit “well actually” from that comment, the part about ages going over 256 years being a Y2K bug waiting to happen is gold because you didn’t build your software to last. Glorious.
The funny part about it is that it not only requires life extensions to get to that point, but it also requires living humans being at least 256y old, which won't happen for another 150y even if we had these advancements tomorrow 😂
@@Leonhart_93 funny that implies human consciousness can survive for that long, I remember my old grandma just tired of living and begging for the release, she lived to 96.
@@monad_tcp Hard to say why that is; it's very likely that an aging body makes you tired of life. Especially since the hormones would be very low. If she had lived those 96 years in a younger body, would it be the same? I'm guessing not.
@@monad_tcp Good thing people are working on healthspan and not just lifespan now. No way I'd want to live to 100 if I were incapable of doing anything interesting.
7:57 If I'm not wrong, while it's true the pointer itself does not know its associated chunk size, the size is stored somewhere. For example, in C++ you might request N bytes, but the implementation will allocate 8 + N bytes, where that 8-byte prefix stores the size (and you get a pointer to the beginning of the N-byte chunk).
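The hidden-size-header layout this comment describes can be sketched as a toy allocator in C. Real allocators use more elaborate chunk metadata, and the exact layout is an implementation detail of each malloc; the function names here are illustrative only:

```c
#include <stdint.h>
#include <stdlib.h>

/* Reserve 8 extra bytes, record the requested size there, and hand
   the caller a pointer just past that hidden header. */
void *my_alloc(size_t n) {
    uint64_t *block = malloc(sizeof(uint64_t) + n);
    if (block == NULL)
        return NULL;
    block[0] = (uint64_t)n;  /* the "size stored somewhere" */
    return block + 1;        /* caller never sees the header */
}

/* Recover the size by peeking one slot before the user pointer,
   conceptually what free() does to know how much to release. */
size_t my_alloc_size(void *p) {
    return (size_t)((uint64_t *)p)[-1];
}

/* Step back to the real block start before handing it to free(). */
void my_free(void *p) {
    free((uint64_t *)p - 1);
}
```

One caveat worth a comment in real code: the 8-byte header can reduce the alignment of the returned pointer relative to plain malloc, which is one reason real allocators size their headers to the platform's maximum alignment.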
Thanks for this video, it was quality! Shared it with fellow boot.dev'ers who really haven't heard of the Stack/Heap. Also for me it was really nice to see the inner details of a sys call, because to my own demise, I knew there was more overhead from it, but never looked up what actually was happening. 10/10 vid!
for people like me your channel is immensely helpful for example now I understand why we need data structures like linked lists. I hope you keep on producing awesome content. This is like Nand-to-Tetris but more visual. Thanks.
What a trip! Thanks. I learned to really program in C using Mac OS System 7. A pointer to heap memory was called a handle, and you did your own memory management... It seemed complicated at the time, but it seems easy today, with a hundred million layers between your program and the hardware.
I’ve studied programming on and off in many forms for a long time. This is the first time I have seen an example of how a linked list interacts with memory fragmentation. The others all just explain the big O’s and that’s it. They don’t ever encourage thinking and reasoning about runtime. Good job! Thanks for this. Liked and subscribed! (I wonder if the algo triggers on exclamation marks, let’s try that !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!)
Thanks for this trip down memory lane. About 20 years ago, a big difference between C# and C++ became a 3-hour daily commute for the latter. The linked-list option is easy to forget, and useful in higher-level languages too.
Stuff like this is why programming is magic/wizardry to me. I understand the logical concepts, but wielding it is another thing entirely. I also see why (in part) programming is so difficult and error prone.
ooooohhhh, okay NOW I understand when something goes on the stack or the heap and why. I'd watched and read other resources explaining this, but never quite got it until your explanation here. Thank you! =)
I showered praise on you for the Stack video... and you've outdone yourself with this one! You could easily be a required supplemental instruction to CS50, and it would actually make the course clearer and more intuitive :) Can't wait to see these other teased videos, such as vectors and other data structures!!!
This makes stack-oriented languages, like Forth, interesting and admirable. If you keep a stack-based approach to memory management, and force your programming logic into this model, you can avoid the issues of heap management and get better performance. Also, by avoiding concurrent access to the heap, you can keep the heap more predictable? If your architecture is based on a tree of stack machine processes with their own stacks(potentially multiple stacks per process to make things easier), perhaps this can be a good approach?
Great video, but a quick correction here. 3:43 System calls do not involve a full context switch, only a partial one. That means the current process doesn't change to another OS process. It's still true that the CPU state is stored and replaced: the process, granted a higher privilege, starts executing kernel code instead of user code.
Hey, thank you so much. Honestly, a ton of what discouraged me from programming earlier in my life was a lack of very basic answers. Everything was always dismissed as "abstraction" or "lolmagic" without ever just admitting "I really don't know". A prime example is that deceptively simple printf() function. When I asked that same question of "how are the arguments being manipulated when the call is made?", I kid you not, the majority of the answers tried to give me a description in C++. Why does searching for correct answers in programming have to be a quest every time? Most answers are wrong, and the ones that are right are almost deliberately overcomplex and unnecessarily heavy on jargon.
15:56, You can combine both, just make the links offsets from the base of the array. Every allocation would then have a minimum size of at least one link in the array.
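The suggestion above, a linked list whose links are offsets into one contiguous array rather than raw pointers, can be sketched in C. The names and the fixed pool size are illustrative; -1 plays the role of a null link:

```c
#define POOL_SIZE 16

/* A node's "next" is an index into the backing array, not a pointer,
   so the whole list lives in one contiguous allocation. */
struct node {
    int value;
    int next;   /* index of the next node, or -1 for end of list */
};

struct pool {
    struct node nodes[POOL_SIZE];
    int used;   /* bump allocation: next free slot in the array */
    int head;   /* index of the first list node, or -1 if empty */
};

void pool_init(struct pool *p) {
    p->used = 0;
    p->head = -1;
}

/* Push a value at the front; returns 0 on success, -1 if the pool
   is full. (A fuller version would also keep a free list for reuse.) */
int pool_push(struct pool *p, int value) {
    if (p->used == POOL_SIZE)
        return -1;
    int i = p->used++;
    p->nodes[i].value = value;
    p->nodes[i].next = p->head;
    p->head = i;
    return 0;
}

/* Walk the list by following indices, exactly like chasing pointers
   but with cache-friendly, contiguous storage underneath. */
int pool_sum(const struct pool *p) {
    int sum = 0;
    for (int i = p->head; i != -1; i = p->nodes[i].next)
        sum += p->nodes[i].value;
    return sum;
}
```

As the comment notes, each allocation now costs at least one array slot, but the links survive reallocation of the backing array (indices stay valid where pointers would dangle), and traversal stays within one contiguous block.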
A heap can be preallocated and have a fixed size and the call stack can grow dynamically. OS level memory allocations may or may not happen with either one. Also the amortized cost can be made to be constant, at the expense of using slightly more RAM. I wouldn't worry about the syscalls. To really know the impact of heap allocation and fragmentation on performance, you'll have to actually measure it. You can find studies about that online. One that I recall reading years ago showed that first fit and best fit had basically similar performance in real world scenarios.
Alone having it spelled out that the stack is allocated at program launch and that's why it's fast and small is so valuable. Not even to speak of the rest. Great videos, thank you!
An array only needs to be contiguous in chunks of 4K (or whatever your OS page size is), as the OS uses virtual addresses. So a realloc won't typically move any data in physical memory, only rename chunks/OS pages (and possibly assign more of them). realloc being as cheap as it is means there is typically not much reason to ever use linked lists.
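The realloc behavior described here is easy to demonstrate. A small sketch (the helper name is illustrative; whether a particular call moves the data or just remaps pages is up to the allocator and OS, as the comment says):

```c
#include <stdlib.h>
#include <string.h>

/* Grow a buffer to new_len bytes, zeroing the newly added tail.
   realloc preserves the old contents; for large blocks it can often
   satisfy the request by remapping virtual pages instead of copying. */
char *grow(char *buf, size_t old_len, size_t new_len) {
    char *p = realloc(buf, new_len);  /* may return the same address */
    if (p != NULL && new_len > old_len)
        memset(p + old_len, 0, new_len - old_len);
    return p;
}
```

Whatever was in the first `old_len` bytes is still there after the call, which is why growable arrays built on realloc can often beat linked lists: the common case is cheap, and the data stays contiguous for the cache.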
How does CPU caching work with this? Does it cache memory which is physically close to requested memory, or memory which is close in process address space values?
The main problem with unknown data sizes on the stack is that you wouldn't be able to compute the offset to access a given variable at compile time. An array of constant but unknown size, for example, cannot be put on the stack either! Suppose you push x: u64 to the stack and then y: u64. The compiler knows that to get y it needs to take the stack base address and add 8. This is a single assembly instruction. If the size of x is unknown, however, this offset can't be determined at compile time. This would mean that additional runtime clock cycles would be needed to figure out where a given variable is located on the stack.
Really great video. Some notes from my side. The heap can actually be much slower than the stack just to access, too. This is mostly the case in garbage-collected languages, because the GC runs in addition to the actual allocator and continuously tracks every object that you reference in code. In C#, for example, there is a measurable performance improvement from using more so-called value types in code, which differ from reference types in that they do not come with allocations. They are a direct value as a whole, though that implies copy behavior when assigning them to new variables. What you mentioned at 7:52 is not so debatable, but I have seen slices in Rust referred to as "fat pointers", which do store the size together with the actual address.