
How Memory Usage Slows Down Your Programs 

Jacob Sorber
165K subscribers
20K views

Published: 28 Sep 2024

Comments: 69
@diconicabastion5790 3 years ago
It can make a hell of a difference. I created a program a while back that created roughly 40,000 rooms connected together in a maze. Initially it could take hours. By the time I was done it was under 1 second. That was on a single thread.
@internetsfinest8839 3 years ago
It's amazing how fast you can make a program. I was writing a program that would seek out a certain line, but it was extremely slow. The solution was to put the entire file into an array, and the program went from a couple of minutes to a single second.
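For anyone curious what that kind of change looks like, here is a minimal sketch of reading a whole file into one buffer up front; the file name and the omitted search loop are assumptions, not the commenter's actual program:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        FILE *f = fopen("data.txt", "rb");     /* hypothetical input file */
        if (!f) return 1;

        fseek(f, 0, SEEK_END);                 /* find the file size */
        long size = ftell(f);
        rewind(f);

        char *buf = malloc((size_t)size + 1);  /* one contiguous buffer */
        if (!buf) { fclose(f); return 1; }

        fread(buf, 1, (size_t)size, f);        /* one bulk read instead of many small ones */
        buf[size] = '\0';
        fclose(f);

        /* ... scan buf in memory instead of re-reading the file ... */

        free(buf);
        return 0;
    }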
@JacobSorber 3 years ago
Yes indeed. The more you understand how the underlying layers work (and don't work), the more often you can see opportunities to improve things.
@hidden8990 3 years ago
I like this one a lot. It moves in the direction of data-oriented design. I don't know what your experience is with that, but I think an introduction to that vs object-oriented, and what use cases would make one more efficient than the other, could be a good one-off video.
@rogo7330 3 years ago
I'm thinking about how to make DOP and OOP help each other. DOP is more about how to actually place data in memory, but it lacks readability for complex things. OOP, on the other hand, can be done in C with structs and pointers to real objects, but I'm curious how much it will cost if I use OOP-ish structs at runtime.
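For context, a tiny sketch of what "OOP-ish structs in C" can look like: a struct carrying a function pointer, where the indirect call is the runtime cost being asked about. The Shape and rect_area names are invented for this example:

    #include <stdio.h>

    struct Shape {
        double w, h;
        double (*area)(const struct Shape *);  /* indirect call, like a vtable entry */
    };

    static double rect_area(const struct Shape *s) { return s->w * s->h; }

    int main(void) {
        struct Shape r = { 3.0, 4.0, rect_area };
        printf("%f\n", r.area(&r));            /* the pointer indirection is the runtime cost */
        return 0;
    }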
@badwolf8112 3 years ago
I feel like OOP doesn't necessarily stop you from optimizing. You can use OOP in many ways, and C++'s creator talks a lot about performance and reliability for systems, so I doubt using OOP makes things worse (many of C++'s abstractions have no overhead).
@sanderbos4243 1 year ago
@@badwolf8112 Using OOP doesn't stop you from optimizing, but things like virtual function calls can both bring performance down and make it much harder to debug. Think of a line, where OOP is on the left and Data Oriented Design is on the right. The left is generalized to be very extensible, but can be slower and harder to debug. The right is more specialized to a specific task using knowledge of what you will *actually* use it for in practice, and can be faster and easier to debug. This principle of flexibility vs excellence also holds for a lot of other things, like AI: you either have an AI that is meh at a ton of different tasks, or a state-of-the-art chess AI that excels at the one thing it was made for. If you try to design a real-life factory that is able to create *any* type of car, then your factory will be incredibly complicated and slow. The key point is that the more you decide to constrain what your program is capable of doing, the more performant and simpler you will be able to write it. OOP applies fewer constraints, which comes at several costs. This is an overgeneralization, because pretty much every programming technique has its place, but look up "object oriented programming vs data oriented design" for countless articles and videos if you want to know more.
@RobBCactive 10 months ago
OOP is for long-term adaptability and encapsulation; when you're solving real problems, correctness is key. Requirements change with time, and you may need to add features and data types later that were not even considered when the project began, without updating hundreds of files. A data-oriented design works well for things like games: the data tables might even be extracted and processed by build/edit tools that use objects to produce the level data the runtime uses for high-efficiency, max-fps work, with specificity rather than flexibility.
@reptilicusrex4748 3 years ago
This video was very informative and really gets the point across using a well-chosen simple example. Thanks for the effort.
@sparten1527 2 years ago
I'm taking an advanced computer architecture course which deals with optimizing memory access times in the cache. This is really interesting how it ties in with the knowledge I've learned in class!
@debanjanbarman7212 3 years ago
Sir, please make a series on Linux kernel development.
@GAMarine137 1 month ago
Appreciate these detailed videos
@cernejr 3 years ago
Helpful reminder. And good to see actual numbers to get a feel for how much the cache actually matters.
@foadsf 2 years ago
I don't do C for work. I just watch your videos as meditation. you are amazing! 🖖
@dmitripogosian5084 1 year ago
Back in the '80s, anybody who had 2 hours of C instruction would never write the second loop. It was just common knowledge that C stores 2D arrays in row-major order, while Fortran is column-major, so in C you iterate on the second index in the inner loop, and in Fortran on the first.
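For readers who haven't watched the video yet, here is a minimal sketch of the two loop orders being discussed, assuming a square int matrix; ROWS, COLS, and grid are placeholder names, not the video's exact code:

    #include <stdio.h>

    #define ROWS 4000
    #define COLS 4000

    static int grid[ROWS][COLS];

    int main(void) {
        long sum = 0;

        /* Fast in C: the inner loop walks the second (column) index,
           so consecutive accesses are adjacent in row-major memory. */
        for (int i = 0; i < ROWS; i++)
            for (int j = 0; j < COLS; j++)
                sum += grid[i][j];

        /* Slow in C: the inner loop walks the first (row) index,
           so each access jumps COLS * sizeof(int) bytes ahead. */
        for (int j = 0; j < COLS; j++)
            for (int i = 0; i < ROWS; i++)
                sum += grid[i][j];

        printf("%ld\n", sum);
        return 0;
    }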
@farikunaziz6504 3 years ago
Can you give an overview of the Rust language?
@iltrovatoremanrico 3 years ago
Very informative, thank you!
@badwolf8112 3 years ago
How do we know that the faster access pattern would be faster on all computers? Can you make a video about profiling and, in particular, how we know that an improvement profiled on one computer carries over to all computers?
@JacobSorber 3 years ago
Good topic idea. Thanks! In C, arrays are laid out in a specific way. So, this example is going to see pretty consistent performance across nearly all machines, since the same elements will be close to each other each time. But, yeah, there definitely are situations where performance will be more machine specific. I'll see what I can do.
@lawrencedoliveiro9104 2 years ago
I once took some MATLAB code written by a client, made a series of tweaks to it, and showed it running a lot faster. I did all my testing on a Linux box. Unfortunately, the client needed to deploy it on a Windows installation. And on that machine, all the versions ran at pretty much the same speed. ∗Sigh∗ ...
@dwaynestgeorge2558 1 year ago
Thanks
@billowen3285 3 years ago
I love your channel Jacob!
@raghavsrivastava2910 3 years ago
Yes this professor is amazing
@JacobSorber 3 years ago
Thanks, Bill.
@tauqeerakhtar2601 2 years ago
This is all about spatial locality. It is implementation-dependent; the opposite could happen if the memory layout were the opposite.
@matteolacki4533 3 years ago
Just out of curiosity, how fast was it without allocating anything?
@jwbowen 2 years ago
Now do the same thing in Fortran and see how the timings shake out
@sman3424 3 years ago
Does spatial locality still hold with malloced memory since virtual memory could only be contiguous in the address space, but not physically?
@8292-d6n 3 years ago
Yes it does. Your cache line is usually 64 bytes long (it can differ, especially on embedded systems), meaning every time you access somewhere in memory it loads 64 B into the cache. If your malloc'ed data is larger than that, you will have a delay each time you enter a new section that wasn't loaded into the cache. Also note that whatever you load into the cache may be evicted as new data comes in, but that behavior depends on your cache sizes and replacement strategy.
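As a rough illustration of the cache-line idea in this reply, here is a sketch that touches every int versus one int per 64-byte line; the 64-byte figure comes from the comment above, and the array size is arbitrary:

    #include <stdio.h>
    #include <stdlib.h>

    #define N (16 * 1024 * 1024)

    int main(void) {
        int *data = calloc(N, sizeof *data);
        if (!data) return 1;
        long sum = 0;

        /* Sequential: 16 consecutive ints share one 64-byte line,
           so only about 1 access in 16 has to fetch a new line. */
        for (int i = 0; i < N; i++)
            sum += data[i];

        /* Strided by 16 ints (64 bytes): every access lands on a new line. */
        for (int i = 0; i < N; i += 16)
            sum += data[i];

        printf("%ld\n", sum);
        free(data);
        return 0;
    }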
@benjaminshinar9509 3 years ago
Could you show the hit rate of the two loops with cachegrind?
@JacobSorber 3 years ago
Probably. I'll take a look.
@CR3271 1 year ago
8:45 I know this wasn't the main point, but I just have to say std::vector -- at least the MS VS implementation -- is a scam. I regularly get 4x speed out of my home-spun dynamic array template. Maybe you could give the boys at MS a lesson in memory access 😂
@LoesserOf2Evils 3 years ago
Would using dynamic allocation address this issue? I fear -- I mean, 'think' -- not because of stopping to allocate a new cell in twoarray every so often. I fear -- I mean, 'think' -- dynamic allocation would increase the execution time partly because of the access time.
@JacobSorber 3 years ago
Allocating the space would take extra time, but once you have the memory allocated, I don't know that it would be any slower. Maybe a little if having the accesses further apart affects cache use, but I would expect them to be nearly the same (except for the extra cost of the malloc/free calls). I guess I'll have to try it out.
@LoesserOf2Evils 3 years ago
Thank you, Dr. @@JacobSorber. In summary, execution would slow because of the allocation time for each cell but not because of memory access time, right?
@ВладДок-д8щ 2 years ago
It would be slower due to the additional indirection. And the allocator could give you any address, so locality goes out the window. That's a serious problem in node-based data structures like trees and such.
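To make the indirection and locality point concrete, here is a sketch of two ways to allocate the same 2D data dynamically; the names rows, cols, flat, and rowptrs are invented for this example:

    #include <stdlib.h>

    int main(void) {
        size_t rows = 1000, cols = 1000;

        /* One contiguous block: good spatial locality, accessed as flat[i * cols + j]. */
        int *flat = malloc(rows * cols * sizeof *flat);

        /* One allocation per row: the allocator can place each row anywhere,
           so walking the whole table may chase scattered pointers. */
        int **rowptrs = malloc(rows * sizeof *rowptrs);
        if (!flat || !rowptrs) return 1;
        for (size_t i = 0; i < rows; i++)
            rowptrs[i] = malloc(cols * sizeof **rowptrs);
        /* access: rowptrs[i][j], which costs one extra pointer dereference */

        for (size_t i = 0; i < rows; i++)
            free(rowptrs[i]);
        free(rowptrs);
        free(flat);
        return 0;
    }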
@LinucNerd 3 years ago
'Mike Acton' gave a presentation here on YouTube about Data-Oriented Design (ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-rX0ItVEVjHc.html), and he touched a bit on the mentality that "the compiler can fix it"... Although the compiler can do a lot, there are a lot of things it can't do. The compiler is not some magic software that knows everything; although built by intelligent people, it has limitations. There also seems to be a growing trend nowadays against OOP and C++, and I'm not entirely sure why... I don't know C++, it's too hard for me, but I wonder why it's become so trendy to hate on it.
@irwainnornossa4605 2 years ago
Opening braces belong on a new line!
@leokiller123able 3 years ago
So all global and static variables are stored in the cache?
@JacobSorber 3 years ago
Caching happens at a lower level -- at the load/store level. It doesn't know anything about whether the variable is global, local, static... whatever. If your program accesses memory, it will be cached.
@leokiller123able 3 years ago
@@JacobSorber So when you store memory on the heap using malloc, the memory also gets cached? I don't quite understand when memory is stored in the cache and when it isn't.
@JacobSorber 3 years ago
@@leokiller123able It does when you use it.
@leokiller123able 2 years ago
@@JacobSorber Okay, so if all memory gets cached when you use it, why do people say that accessing memory on the heap is slower than on the stack? If it goes in the cache anyway, shouldn't the access speed be the same?
@JacobSorber 2 years ago
@@leokiller123able This comes down to two things. First, there's the cost of allocating and freeing memory. That adds overhead. Second, the heap can get a bit fragmented, with bits of data scattered all over the place. The call stack is a simple linear data structure, and all of the local data for a function call will be colocated in its stack frame. So, you often get better locality (hence better cache performance).
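To make the fragmentation point concrete, here is a minimal sketch contrasting a contiguous local array with individually malloc'ed list nodes; the sizes and the node struct are invented for this example:

    #include <stdio.h>
    #include <stdlib.h>

    struct node { int value; struct node *next; };

    int main(void) {
        /* Local array: 1024 ints sit side by side in one stack frame. */
        int local[1024] = {0};
        long sum = 0;
        for (int i = 0; i < 1024; i++)
            sum += local[i];                   /* sequential, cache-friendly */

        /* Heap list: each node comes from a separate malloc call, so
           consecutive nodes may be far apart and each hop can miss the cache. */
        struct node *head = NULL;
        for (int i = 0; i < 1024; i++) {
            struct node *n = malloc(sizeof *n);
            if (!n) return 1;
            n->value = i;
            n->next = head;
            head = n;
        }
        for (struct node *p = head; p; p = p->next)
            sum += p->value;                   /* pointer chasing */

        printf("%ld\n", sum);
        while (head) { struct node *n = head->next; free(head); head = n; }
        return 0;
    }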
@embeddedbastler6406 3 years ago
Since we're talking about cache locality, wouldn't a video about data-oriented programming be very interesting?
@JacobSorber 3 years ago
It would.
@HimanshuSharma-sd5gk 3 years ago
New video idea: arbitrary-precision arithmetic in C.
@HydeFromT70s 2 years ago
I appreciate the effort but this video shows HOW to use the arrow operator and not WHEN to use it...
@matthewkott8863 3 years ago
Very interesting. Intuitive, once (if) you think about it. By the way, Jacob, have you done a video on Big O already? That's a topic that really baffles me, as I simply have a hard time recognising the patterns that determine the most significant terms.
@JacobSorber 3 years ago
I haven't. I'll add it to the list and see what I can do.
@emreozdemir9358 3 years ago
So someone finally pointed out the importance of the cache with an excellent example. Thank you so much, sir.
@JacobSorber 3 years ago
You are welcome.
@realdragon 1 year ago
It makes sense, but I would never have thought of that.
@happyTonakai 2 years ago
Good video, number 1000 thumb up!
@adityavikramsinha408 2 years ago
likeeee uuu smmm
@TheVertical92 3 years ago
This always segfaults on my machine if my ROWS/COLS are too high. Seems like the max stack allocation makes trouble 🤷‍♂ Edit: Oh, I see. When I declare the array globally it works; if I declare it within main() it segfaults.
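A sketch of why the commenter's edit helps: a large local array lives in the stack frame, which is typically limited to a few megabytes by default, while a global/static array or a malloc'ed block is not. The 4000x4000 size and names below are made up for illustration:

    #include <stdlib.h>

    #define ROWS 4000
    #define COLS 4000

    static int big_global[ROWS][COLS];          /* fine: lives outside the call stack */

    int main(void) {
        /* int big_local[ROWS][COLS];              roughly 64 MB on the stack: likely segfaults */

        int *big_heap = malloc((size_t)ROWS * COLS * sizeof *big_heap);  /* also fine */
        if (!big_heap) return 1;

        big_global[0][0] = 1;
        big_heap[0] = 1;

        free(big_heap);
        return 0;
    }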
@47Mortuus 2 years ago
That's it? Matrix traversal? There's so much more to this... Let's just hint at the "Von Neumann Bottleneck", which suggests that ultimately, any program is limited by memory bandwidth when it comes to performance.
@umpoucosobreconhecimentos 3 years ago
Thank you for this very useful explanation about how memory locality can affect program performance. Amazing explanation
@pathayes2224 2 years ago
As always, an excellent presentation. Caching is key to performance. I once developed with a processor that had this option switched off inadvertently. As a result it crawled along, despite its fast clock speed. Also, file I/O affects speed, because it is slow. This is a big area.
@skilz8098 3 years ago
It's not just memory; any kind of system resource, such as file access, threads, etc., can affect your application's performance...
@ohwow2074 2 years ago
His second for loop probably increased the cache miss rate to 100% lol. That's why it was slow.
@jenidu9642 2 years ago
Why don't you time it with clock()? That way you can make sure the initialization is measured
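For reference, here is a minimal sketch of wrapping just the traversal in clock() calls, so setup code can be kept outside (or inside) the measured region as desired; the array size and names are placeholders, not the video's code:

    #include <stdio.h>
    #include <time.h>

    #define N 4000
    static int grid[N][N];

    int main(void) {
        clock_t start = clock();                /* start the timer after any setup */
        long sum = 0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                sum += grid[i][j];
        clock_t end = clock();

        printf("sum=%ld, took %f s\n", sum,
               (double)(end - start) / CLOCKS_PER_SEC);
        return 0;
    }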
@nunorodrigues5628 2 years ago
OOOOOOH, I get it! And yes, please do a video on C++ and vectors regarding this issue; I would love to learn about it.
@nngnnadas 3 years ago
cache invalidation
@raghavsrivastava2910 3 years ago
Great video.
@gardenhouseNemurin 3 years ago
Nice one Jacob