Advanced Topics: Software Memory Barriers

In this video we look at memory re-ordering and software memory barriers!
For code samples: github.com/coff...
For live content: / coffeebeforearch

Published: 5 Oct 2024

Comments: 21

@hanspeterbestandig2054 2 months ago

Very well explained! 👏👏👏 Thank you Sir! 🙏

@thouys9069 3 months ago

good stuff man

@AMITKUMAR-ys4oe 1 year ago

nice explanation

@mikeoxlong5100 1 year ago

Speaking only about software optimization, why don't you just declare shared_value as volatile?

@blipman17 3 years ago

I have to say, I was kinda confused by the x86 sfence, lfence and mfence instructions. I know the empty inline assembly doesn't give any guarantee in a multithreaded or multiprocessor environment, but I fear that is not clear enough from this video.

@CoffeeBeforeArch 3 years ago

My goal was to avoid talking about hardware memory reordering as much as possible because it's a separate issue (and warrants a dedicated discussion). If your instructions have already been scheduled in the wrong order by the compiler, it won't work even on a sequentially consistent machine, so you've lost the war before you've even executed your application. I'll be doing a video on hardware memory reordering and hw barriers soon! Cheers, --Nick
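A minimal sketch of the compiler-only barrier described above, assuming hypothetical variables `payload` and `ready` (they are not taken from the video's code). It uses the standard `std::atomic_signal_fence`, which GCC and Clang treat in practice much like an empty `asm volatile("" ::: "memory")` statement: it constrains the compiler's scheduling and typically emits no CPU instruction, so on its own it makes no promise about what another core observes.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared variables, named for illustration only.
int payload = 0;
int ready = 0;

// Writes the payload, then raises the flag.
// Without the fence the compiler may reorder the two plain stores.
void publish(int value) {
    payload = value;
    // Compiler-only barrier: the compiler may not move memory accesses
    // across this point. It does NOT order anything in hardware.
    std::atomic_signal_fence(std::memory_order_seq_cst);
    ready = 1;
}
```

Called from a single thread, `publish(42)` leaves `payload == 42` and `ready == 1`; the sketch only pins the order of the stores in the generated code, which is the software half of the problem.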

@blipman17 3 years ago

@@CoffeeBeforeArch Ahh so you've actually foreseen this problem already. Of course you're miles ahead of me. I should've expected that by now. I just always seem to have to talk about what certain memory primitives do and, more importantly, what they do not do when talking to other programmers.

@CoffeeBeforeArch 3 years ago

Memory models are a niche topic that unfortunately many people don't have a good foundation in. It's also a place where intuition can fail you. Fortunately on x86, it's a relatively strict memory model where the reordering of stores is not even possible, so the example is safe. On a platform like ARM, with a much weaker consistency model, you would need a barrier (e.g. the ARM Linux kernel spin lock, which uses the smp_mb() macro that expands to the dmb instruction).
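A portable way to ask for the hardware barrier mentioned above is `std::atomic_thread_fence`. The following is a sketch only, with hypothetical names (`message`, `flag`, `run_message_passing`); on x86 the release fence typically compiles to no instruction, while on ARM it typically becomes a `dmb`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int message = 0;                 // plain (non-atomic) payload, hypothetical
std::atomic<bool> flag{false};   // publication flag, hypothetical

void producer() {
    message = 42;
    // Release fence: earlier writes may not be reordered past the flag store.
    std::atomic_thread_fence(std::memory_order_release);
    flag.store(true, std::memory_order_relaxed);
}

int consumer() {
    while (!flag.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    // Acquire fence: later reads may not be reordered before the flag load.
    std::atomic_thread_fence(std::memory_order_acquire);
    return message;
}

// Runs one producer and one consumer and returns what the consumer read.
int run_message_passing() {
    int seen = 0;
    std::thread c([&] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    return seen;
}
```

Because the two fences pair up, the consumer is guaranteed to read 42 once it sees the flag, on weakly ordered hardware as well as on x86.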

@nathanmartinez2630 6 months ago

LOVE IT. THANK YOU.

@arunu2002 3 years ago

Can you do a session on reading binary files as buffers using the fread function? How do you predetermine the buffer structure size in binary? Thanks much

@CoffeeBeforeArch 3 years ago

Thanks for the suggestion! I'll see if it fits in with any of the other topics I have planned. Cheers, --Nick

@ravenxrz6523 2 years ago

nice video, but why should we use volatile here?

@chenyu8553 2 years ago

Good question, I also want to know.

@DoobooDomo 2 years ago

@@chenyu8553 technically, this is an incorrect use of volatile; I believe it should be an atomic with relaxed memory ordering. The thing you want to guarantee is that now_serving becomes visible to other cores. An atomic or volatile guarantees that reads and writes always go through memory and don't use a local register copy. The difference is that CPUs have visibility on other CPUs' memory operations (since memory is shared), whereas registers are private to a logical CPU (hyperthreads get their own copy of the register state but share the compute units with other hyperthreads). The reason why volatile is not technically correct is a bit subtle and is the reason why C++11 introduced atomics in the first place. Probably the explanation would make for good content @CoffeeBeforeArch!
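A sketch of the suggestion above: `now_serving` as a `std::atomic<int>` accessed with relaxed ordering instead of `volatile`. Only the name `now_serving` comes from the discussion; the helper functions are hypothetical. Relaxed ordering gives atomicity and eventual visibility but no ordering of surrounding data, so a real lock would likely need acquire/release on top of this.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> now_serving{0};

// Spins until now_serving equals `ticket`, re-reading the atomic each time.
int wait_for_turn(int ticket) {
    int seen;
    while ((seen = now_serving.load(std::memory_order_relaxed)) != ticket) {
        std::this_thread::yield();
    }
    return seen;
}

// Hands the turn to the next ticket holder.
void serve_next() {
    now_serving.fetch_add(1, std::memory_order_relaxed);
}

// Hypothetical driver: one waiter holding ticket 1, released by the caller.
int run_handoff() {
    now_serving.store(0, std::memory_order_relaxed);  // reset for repeat runs
    int seen = -1;
    std::thread waiter([&] { seen = wait_for_turn(1); });
    serve_next();
    waiter.join();
    return seen;
}
```

Unlike a `volatile int`, concurrent access to the atomic is not a data race, so the behaviour is defined by the C++ memory model.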

@killacrad 8 months ago

@@chenyu8553 volatile guarantees intra-thread ordering and inter-thread visibility but without ordering and, thus, not inter-thread synchronization. The order in which other threads see volatile memory accesses is not guaranteed to align with the order in which they are made by a given thread. This is why atomics and memory_order semantics are required to ensure desired behavior actually occurs in program execution. As stated by @DoobooDomo below.
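To illustrate the `memory_order` semantics mentioned above, here is a minimal sketch with hypothetical names (`first_value`, `second_value`, `published`). The release store and acquire load on the same atomic give the inter-thread ordering that `volatile` does not provide.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical plain data written before publication.
int first_value = 0;
int second_value = 0;
std::atomic<bool> published{false};

void writer() {
    first_value = 1;
    second_value = 2;
    // Release: both writes above become visible to any thread that
    // observes `true` through an acquire load of `published`.
    published.store(true, std::memory_order_release);
}

int reader() {
    // Acquire: once this load returns true, the writes made before the
    // release store are guaranteed to be visible here.
    while (!published.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return first_value + second_value;
}

// Runs one writer and one reader and returns the sum the reader saw.
int run_release_acquire() {
    int sum = 0;
    std::thread r([&] { sum = reader(); });
    std::thread w(writer);
    w.join();
    r.join();
    return sum;
}
```

With `volatile` flags instead, the reader could in principle observe the flag before the data on a weakly ordered machine; with the release/acquire pair the sum is always 3.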

@MyChannelZs 1 year ago

Didn't know Hearthstone was an advanced topic haha

@davidflaherty8509 3 years ago

Hey Nick could you check your email please? I would reallllllly appreciate your help with something. Thank you!!

@Ergzay 3 years ago

There's zero point in creating a software memory barrier without a hardware memory barrier. This talk is highly misleading. You're using undefined behavior of compilers to do this. If you're going to cover memory barriers, you should talk about the standard C++ barriers. Moreover, you're using "volatile" on now_serving incorrectly. Volatile is only for use with hardware IO, and has basically no use for anything outside embedded applications.

@267praveen 2 years ago

Not sure what your point is. The guy has started from the single concept of a software memory barrier and has another video about hardware memory barriers. It is common sense to move step by step for a novice audience. Just because you know this stuff doesn't mean it should all be told at once. Now stay quiet and check all the videos.

@DoobooDomo 2 years ago

@@267praveen Erg's tone is a little harsh, but he's correct. Even teaching to novices, it is probably a good idea to avoid teaching things that are actually wrong. This use of volatile is incorrect (it should be an atomic with relaxed memory ordering), but it is a common error because that's what people did before C++ had an explicit memory model with atomics etc. (I think this use of volatile is also correct in Java). I agree this stuff is subtle, and made-up (but important) concepts like the "abstract C++ model" make things more difficult for all learners. To end on a positive note: I really enjoyed Coffee's treatment on different spinning policies in his spinlock playlist.

@turdwarbler 2 years ago

You should take this video down. Using volatile in this way, shared between threads, is WRONG. It may or may not work, but basically it's undefined behaviour.