how does a Mutex even work? (atoms in the computer??)

Low Level Learning

61K views

Published: 7 Sep 2024

Comments: 116

@edgeeffect 2 years ago

I looked at the title and thought "Oh! Everyone knows what a mutex is!" But I'm glad I watched because I'd never thought that acquiring a mutex either needs a mutex or needs to be atomic. ;) Great stuff as ever!

@tf_d 2 years ago

Reminds me of the question, "who delivers the mailman's mail?"

@Stopinvadingmyhardware 1 year ago

@tf_d his wife

@Vnifit 2 years ago

God I love your channel, your videos are the best I've seen on embedded/assembly stuff. Really fascinating topics presented beautifully!

@savithaiyer8017 2 years ago

Damn, I’d love a video on a DIY RTOS. Love your videos man!!!!

@esra_erimez 2 years ago

+1 because like wasn't enough

@KenJackson_US 1 year ago

If your processor has a Read-Modify-Write instruction, you can implement it in one instruction, simple. If your processor lacks it, the simplest way is to disable interrupts for the read, modify and write instructions and then reenable interrupts. Though you have to be careful to _not_ reenable interrupts if they were initially disabled.
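A minimal sketch of the interrupt-masking approach described above, under loud assumptions: `irq_enabled`, `irq_save`, and `irq_restore` are hypothetical stand-ins that simulate the CPU's interrupt-enable bit so the code runs on a host. Real firmware would read and write a status register instead, and the technique only protects against preemption on a single core.

```c
#include <stdbool.h>

/* Simulated global interrupt-enable bit (assumption: real code would
 * access a CPU status register here). */
static bool irq_enabled = true;

/* Disable interrupts and report whether they were enabled before. */
static bool irq_save(void) {
    bool was_enabled = irq_enabled;
    irq_enabled = false;
    return was_enabled;
}

/* Re-enable interrupts only if they were enabled when irq_save() ran. */
static void irq_restore(bool was_enabled) {
    if (was_enabled) {
        irq_enabled = true;
    }
}

static int lock_word = 0; /* 0 = free, 1 = taken */

/* Read, modify, and write lock_word with interrupts masked.
 * Returns true if the caller took the lock. */
static bool try_lock_irq(void) {
    bool prev = irq_save();
    bool acquired = (lock_word == 0);
    if (acquired) {
        lock_word = 1;
    }
    irq_restore(prev); /* leaves interrupts off if they started off */
    return acquired;
}
```

Saving and restoring the previous state, rather than unconditionally enabling, is what keeps the helper safe to call from code that already runs with interrupts disabled.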

@filker0 2 years ago

This is a good video, though I would have used different instructions for the mutex. Most ISAs have instructions that are intended for implementing synchronization primitives like mutexes and semaphores. The ARM Cortex-M provides LDREX and STREX. The first sets a reservation for the source memory address. If any other entity writes the location, the reservation is lost. STREX conditionally writes to the location, and will fail if the reservation has been lost. The entity that writes the memory and breaks the lock could be another thread on the same or other processor cores or DMA from an external master. Usually, a mutex will not be set up in memory that is the target for DMA, but having multiple cores, even on microcontrollers, is becoming more common.
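The reserve-then-conditionally-store pattern can be approximated in portable C11 with a weak compare-and-exchange, which is allowed to fail spuriously in the same way a lost reservation would. This is a sketch, not the video's code: `mutex_word`, `mutex_try_lock`, and `mutex_unlock` are made-up names, and whether a compiler actually emits LDREX/STREX for it depends on the target and toolchain.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int mutex_word = 0; /* 0 = unlocked, 1 = locked */

/* Try once to take the lock; returns false if someone else holds it. */
static bool mutex_try_lock(void) {
    int expected = 0;
    while (!atomic_compare_exchange_weak_explicit(
               &mutex_word, &expected, 1,
               memory_order_acquire, memory_order_relaxed)) {
        if (expected != 0) {
            return false; /* the lock really is held */
        }
        /* Spurious failure: the value was still 0, so retry. */
    }
    return true;
}

static void mutex_unlock(void) {
    atomic_store_explicit(&mutex_word, 0, memory_order_release);
}
```

The inner check on `expected` separates "the store failed but the lock is free" from "the lock is taken", which is the same distinction a hand-written LDREX/STREX loop has to make.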

@MindlessMegaLawl 2 years ago

Mad respect as an argentinian for showing footage of the Super TC2000 motor racing series, one of (if not) the most technologically advanced series in Latin America. Thank you ✨

@jonshouse1 2 years ago

"Mutex in C" should be "Mutex with POSIX threads". Plenty of systems use mutex without pthreads. ESP32 Base code and FreeRTOS for example. Not sure why anyone would make yet another RTOS. FreeRTOS is good for small stuff, NuttX for larger stuff, Linux for big stuff, Any of the three +Gate array for obscure/awkward stuff !

@luismorales9878 2 years ago

Nice, I always wondered how a mutex could make multithreading safe. Atomic operations are the key. Thanks man, and liked!

@gaborm4767 2 years ago

What are the differences between mutex and semaphore and what are the pros and cons?

@abidanbrito 2 years ago

From my understanding both are synchronization primitives and both are used to solve critical section problems. Having said that, a mutex is a locking mechanism, whereas a semaphore is a signaling mechanism. The former means that a single task can acquire the mutex, effectively owning it. Only the thread / process associated with that task can enter the critical section. The mutex needs to be released by the owner thread, and it can only do so upon exiting the critical section. At a high level, it can be described as a mutual exclusion object. A mutex object is either locked or unlocked. The latter means that if a thread is waiting on a semaphore it can still be signaled by another thread. In short, a semaphore can be used by several threads (based on priority). That is not the case with a mutex. There are two types of semaphores: binary and counting. At a high level, a semaphore can be described by means of an integer variable. Note that a binary semaphore is not equivalent to a mutex. As for pros / cons, I recommend looking those up online (there are some lists on blogs), or trying both out for yourself on a microcontroller.

@gregoryfenn1462 2 years ago

Mutex is a Boolean lock/unlock state. Semaphore is a finite resource counter (e.g. only three processes can access some data at once). Note that a mutex is a semaphore: it's a semaphore where the counter is set to one so that only one thing can access the resource at a time.

@gaborm4767 2 years ago

@gregoryfenn1462 yes, so I ask about pros and cons because if you can do the same and more with a semaphore as with mutexes, what is the reason for the existence of mutexes?

@random_bit 2 years ago

@gaborm4767 the spinlock is the main difference: semaphores use interrupt handlers to communicate, whereas mutexes use a spinlock/compare-and-swap mechanism

@gregoryfenn1462 2 years ago

@gaborm4767 random_bit answered too, but the short answer in general is that since semaphores can do more than mutexes, they won't be as efficient or simple to use as a mutex. A tool that can do lots of things will be less good at a particular task compared to tools that are more specialised. So if you just need to lock or unlock a resource, a mutex is better than a semaphore.
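To make the "counter versus owned lock" distinction in this thread concrete, here is a small sketch of a counting semaphore built on C11 atomics. All names (`sem_sketch_t`, `sem_sketch_try_acquire`, and so on) are hypothetical, and the non-blocking form is a simplification: a real semaphore would normally block or sleep when no permits are left.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_int permits; /* how many holders may still enter */
} sem_sketch_t;

static void sem_sketch_init(sem_sketch_t *s, int permits) {
    atomic_init(&s->permits, permits);
}

/* Take one permit if any are left; never blocks. */
static bool sem_sketch_try_acquire(sem_sketch_t *s) {
    int cur = atomic_load(&s->permits);
    while (cur > 0) {
        /* On failure, cur is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&s->permits, &cur, cur - 1)) {
            return true;
        }
    }
    return false;
}

/* Return one permit. Any thread may call this, not only the acquirer. */
static void sem_sketch_release(sem_sketch_t *s) {
    atomic_fetch_add(&s->permits, 1);
}
```

Initialised with one permit it behaves like a lock, but nothing records who acquired it, so any thread can release it. That missing notion of ownership is the difference @abidanbrito points at.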

@petevenuti7355 2 years ago

What about in distributed systems? Are there other techniques that work with highly asymmetric systems? Possibly other complete semantics and methodologies that don't require coordination in distributed multi-processor systems. Please, what can you tell me?

@victoraraujo1255 2 years ago

I'm a simple man. I see a video of this guy, i watch and leave a like

@crustykrist7144 2 years ago

great video bro. Very clear and to the point. TY!

@LowLevelLearning 2 years ago

Thank you :)

@Nellak2011 2 years ago

Atomic lock sounds like Atomic Clock. Imagine a processor running on an Atomic Clock using Atomic Locks for Mutexes, while the Program running is running Atomic Simulations that were written in the Atom editor.

@nimcompoo 2 years ago

Everything was fast until the point u opened the atom editor

@KangJangkrik 2 years ago

@nimcompoo yess thanks to atom's electron js

@petevenuti7355 2 years ago

I know that this is meant for humor, but timing between threads is critical and running on an atomic clock would be awesome

@n00blamer 2 years ago

02:29 lock doesn't make an instruction atomic; the instructions the prefix applies to are already atomic read-modify-write operations. SO: "The LOCK prefix ensures that the CPU has exclusive ownership of the appropriate cache line for the duration of the operation, and provides certain additional ordering guarantees. This may be achieved by asserting a bus lock, but the CPU will avoid this where possible." The effect of lock can be quite a dramatic performance drop, as "ownership of the appropriate cache line" means that on a multi-core architecture the most recent writes into the line have to be propagated to every other core accessing the line (when the cores don't share the caches; the level of sharing varies between architectures and cache hierarchy levels, but L1 is typically exclusive).

@AlessioSangalli 1 year ago

Nah. Cache coherency is a small part of memory consistency. What the lock guarantees is that load/store queues are flushed and executed the right way and that the observers in the specified domain see the same thing. But the first part of your message is correct, as in a single core system hardware locks are not necessary.

@n00blamer 1 year ago

@AlessioSangalli TL;DR - the instructions are already atomic and lock does nothing along those lines. Short version is that it gives exclusive access to the memory and what you describe is an effect of that fact. It just happens to be a detail that it is more efficient to have the exclusivity per cache line rather than all of the available memory so that more shit can happen at the same time. Thanks for clarifications btw.

@AlessioSangalli 1 year ago

@n00blamer yeah but what I wanted to clarify is that there are mechanisms that act even before the caches that play a role here. Thanks!

@n00blamer 1 year ago

@AlessioSangalli Indeed, I recognized that and thank you! I guess when you write too much past the comment in mind it gets tangled up ... :P anyways, cool stuff

@_-deep_a_b.862 4 months ago

i learned something Thanks!

@jacob_90s 2 years ago

This just really makes me hate all the assholes on programming forums I had to deal with all the more, back when I first started learning how to program in 2006. Just asking how anything in the standard library works would immediately get you attacked: "Why are you even asking that; just use the ones in the library" "What, you think you can do it better" Even if you managed to get through to them that you just wanted to learn how they work and weren't trying to implement it yourself, you would suddenly be met with this wall of silence. Always dozens of people willing to criticize, but never anyone willing to actually answer the damn question. Have learned a lot since then, both about the underlying mechanics of many programming languages and hardware, as well as that the most vocal people in a group often have little else to contribute than simply being the loudest, but it really made me reluctant to want to deal with other programmers unless it was face to face

@dominiquefortin5345 1 year ago

For my part, I never understood why CPU makers created a “CAS” opcode (compare and set) for mutexes. Compare operations require the ALU, so they are slow compared to an “XCHG” opcode (on x86, or TAS on 68k), which is much faster because it is only 1 read and 1 write to memory. From there, you build a spinlock mutex to make an OS mutex, which can be used to make a full semaphore.

@marcossidoruk8033 1 year ago

CAS is for other things, not spinlocks. You are indeed right: one atomic exchange and then an atomic load in a loop is better for a spinlock implementation.
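The "exchange once, then load in a loop" idea is often called a test-and-test-and-set spinlock. A minimal C11 sketch follows; the names `spin_word`, `spin_lock`, and `spin_unlock` are illustrative and not taken from the video.

```c
#include <stdatomic.h>

static atomic_int spin_word = 0; /* 0 = free, 1 = held */

static void spin_lock(void) {
    /* One atomic exchange tries to grab the lock... */
    while (atomic_exchange_explicit(&spin_word, 1,
                                    memory_order_acquire) != 0) {
        /* ...and while it is held, wait with read-only loads so the
         * waiting core does not keep writing to the shared location. */
        while (atomic_load_explicit(&spin_word,
                                    memory_order_relaxed) != 0) {
            /* spin */
        }
    }
}

static void spin_unlock(void) {
    atomic_store_explicit(&spin_word, 0, memory_order_release);
}
```

Waiting on plain loads means the lock word can stay in a shared cache state until the holder releases it; only then does a waiter retry the more expensive exchange.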

@whtiequillBj 2 years ago

I was recently reading about quiescent consistency vs linearizability. The document I was reading was: Quiescent Consistency: Defining and Verifying Relaxed Linearizability. tl;dr you wait for all code to end execution (a point of quiescence) so there are no thread collisions. Linearizability is what you explain with the lock, whereas quiescence would be programming the start and end of threads such that they wouldn't collide. The documentation calls quiescence soft and linearized hard, because calls separated by points of quiescence cannot collide, yet a lock is used essentially to fake quiescence. Since you are writing an OS, what is it that contributed to using the linearized paradigm vs using times of quiescence to separate thread timing?

@Bvic3 1 year ago

Great video. Now, how does lock work in silicon?

@gabrielbarrantes6946 8 months ago

It's probably quite simple. Let's say there are two cores: you check the instruction on the other core, and if it is not a mutex then you execute normally, lock the memory block and continue. If by chance both cores are receiving the mutex in their pipelines at the same time, then just choose one and lock that one; the other will wait in the loop until the lock is released... It's pretty much communication between the cores as you would have in human time scales. A core would say, "I am locking the mutex at this address in the next cycle, is any of you going to lock at that address next cycle?" If no, then no collision; if yes, just "hard-print" which core will lock and which one will wait... It's pretty much what is expected, but at a cycle/instruction level... "Hard print" because processors are printed, not coded lol

@PurpleComrad 2 years ago

Low Level Gang 😎

@lucidmoses 2 years ago

May want to add a sleep step in there so you're not wasting the CPU. Other threads could be wanting to run.

@conorstewart2214 2 years ago

It should be a yield step, not a sleep step. A sleep step would still waste the CPU; a yield step would allow the RTOS to execute other tasks whilst it is waiting. It is just an example though, so what he showed here probably won't be what ends up in the final RTOS.

@SerBallister 2 years ago

@conorstewart2214 I do like RTOS. Need to be careful with thread priorities. Yield really only works on threads of same priority.

@marcossidoruk8033 1 year ago

The context switching may be expensive and the other threads may not be able to do anything meaningful if you are locking for short time periods. What LLL described is a spinlock and it is actually the way to go when you are only locking for short amounts of time and do not want to pay the context switching cost.
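One way to combine the two positions in this thread is to spin briefly and only then yield. The sketch below assumes a POSIX host, where `sched_yield()` from `<sched.h>` is available; an RTOS would call its own yield primitive instead. `SPIN_LIMIT` is an arbitrary illustrative value, and `hybrid_lock` / `hybrid_unlock` are made-up names.

```c
#include <sched.h>
#include <stdatomic.h>

#define SPIN_LIMIT 100u /* arbitrary: failed attempts before yielding */

static atomic_flag hybrid_flag = ATOMIC_FLAG_INIT;

/* Acquire the lock; returns how many times the caller had to yield. */
static unsigned hybrid_lock(void) {
    unsigned spins = 0;
    unsigned yields = 0;
    while (atomic_flag_test_and_set_explicit(&hybrid_flag,
                                             memory_order_acquire)) {
        if (++spins >= SPIN_LIMIT) {
            sched_yield(); /* let another runnable thread make progress */
            spins = 0;
            ++yields;
        }
    }
    return yields;
}

static void hybrid_unlock(void) {
    atomic_flag_clear_explicit(&hybrid_flag, memory_order_release);
}
```

For short critical sections the lock is usually obtained within the spin budget and no context switch is paid; the yield only kicks in when the holder is slow.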

@kvelez 1 year ago

Excellent.

@Meleeman011 4 months ago

this is why global state in frontend apps exists i get it now

@joaquinipar6133 2 years ago

0:52 unexpected tc2000 appearance

@marksabelita 5 months ago

so freakin helpful! 👏👏

@AlessioSangalli 1 year ago

OK, I'm a little confused. The STM32 blue pill should have one Cortex-M3 core. As it is not a multiprocessor, are hardware atomic operations (as in with the correct memory barriers) even useful?

@AlessioSangalli 1 year ago

The example at 2:20 uses Intel assembly, that is more likely to have multiple cores

@flaguser4196 2 years ago

been using lock flags in gui apps even before having an idea what a mutex is. i think just calling it "mutex" without understanding how it works behind the scenes makes it unnecessarily mysterious.

@esra_erimez 2 years ago

DIY RTOS video, please.

@rallokkcaz 2 years ago

LOW LEVEL GANGGGGG!

@ross9263 1 year ago

Hey can you do a video on concurrent programming: threads, fibers, etc. maybe even job systems

@alexandrohdez3982 1 year ago

👏👏👏 great video..

@109Rage 2 years ago

I knew the basic idea of a mutex already, but it didn't occur to me that a mutex itself could run into a race condition as well, until partway into your explanation of what they are lol. Still not a hundred percent on how this works (dunno any assembly) but now I know it's not as straightforward as I naively assumed.

@gabrielbarrantes6946 8 months ago

So, the software mutex is actually done at a processor level 😂😂, makes sense if you think about it, because is a recursive problem if you try to solve it at a software level, so do it at processor (physical) level, is not quite clear from the video tho...

@herrxerex8484 2 years ago

You are a godsend

@kevy1yt 2 years ago

Isn’t there a C directive to define a variable, in this case your mutex var, as atomic so you would not need to drop into assembly?

@kevy1yt 2 years ago

@250CC I never understood what volatile really means or does. I know it has something to do with interrupt routines. Thoughts?

@KaneYork 2 years ago

@250CC do NOT attempt to use volatile to implement atomic access. Your code is wrong and will fail as soon as you stop using x86 processors (free acqrel on every basic memory operation) Atomics also inhibit optimization, but you are told EXACTLY how optimization is inhibited instead of "I dunno it does something I guess" Relaxed atomics are free on ARM, just use those
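For the question of doing this without dropping into assembly: C11 added the `_Atomic` qualifier and `<stdatomic.h>`, which let you state the atomicity and ordering you need directly. A small sketch with a relaxed counter follows; `event_count`, `record_event`, and `events_seen` are hypothetical names.

```c
#include <stdatomic.h>

/* _Atomic makes every access to this object atomic; volatile would not. */
static _Atomic unsigned event_count = 0;

/* Atomic read-modify-write. Relaxed ordering is enough for a plain
 * counter that does not publish other data. */
static void record_event(void) {
    atomic_fetch_add_explicit(&event_count, 1u, memory_order_relaxed);
}

static unsigned events_seen(void) {
    return atomic_load_explicit(&event_count, memory_order_relaxed);
}
```

A lock is different from a counter: acquiring it should use `memory_order_acquire` and releasing it `memory_order_release`, so that writes made inside the critical section are visible to the next holder.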

@KaneYork 2 years ago

I have seen exactly two valid use cases for C volatile: (1) POSIX signal handlers (2) Platform-specific mmio code. The C standard alone gives no useful behavior for volatile and you must combine it with additional guarantees for it to do anything at all

@DFPercush 2 years ago

@kevy1yt If you do any development on game consoles you'll be using volatile to access I/O registers. It doesn't mean atomic though.

@kevy1yt 2 years ago

@DFPercush what is its purpose with I/O in consoles?

@tempname-dr2bm 10 months ago

This is not a mutex, this is a spinlock ;)

@lagging_barish3736 1 year ago

Atomic Operations are operations on the CPU that cannot be interrupted and are therefore immune to a race condition. What if there are 2 threads running on 2 separate cores of the cpu. Both threads can compare and swap at the exact same time. What happens then?

@marcossidoruk8033 1 year ago

One thread will have to wait on the other for the memory access. That's the whole point of atomic operations, and I think he mentioned it.

@caedenw 2 years ago

It always bothered me that the plural of mutex is mutexes because it stands for mutual exclusion of which a plural amount would be mutual exclusions. But I admit that if mutex wasn’t a portmanteau then es would be following the general rule.

@kayakMike1000 2 years ago

Hum... I am working on the same thing, though I am targeting a few different chips... Sorta built-in portability. So... Lemme get your opinion. The OS needs to set up each process stack (and initial context frame), each process heap, schedule each process, and provide mechanism for IPC?

@humanrayla4785 5 days ago

Still confused but progress

@Just_An_Idea_For_Consideration 2 years ago

I may be wrong, but I believe we called this a Semaphore back in the early 1980s.

@DFPercush 2 years ago

I didn't learn the word "mutex" in my computer engineering classes (mid 2000's) either, just semaphores. A mutex is basically a semaphore initialized to 1. But being able to implement that in user space with a spin lock gives certain advantages, I guess, like avoiding a context switch / system call. Sometimes you need to get the scheduler involved so you don't have worker threads eating up cpu cycles when there's nothing to do. But just ensuring exclusive access tends to burn cycles when things are already spun up anyway.

@mytech6779 10 months ago

Ah I suspected locks need some amount of hardware support.

@Stopinvadingmyhardware 1 year ago

Wait, you wrote a RTOS, and didn’t make the main process the scheduler, which keeps track of these things?

@AvenDonn 1 year ago

Great, now I'm going to have to find a video how uninterruptable instructions are possible in multi-core CPU architecture. And you didn't even show how mutexes are released. Or how re-entrancy is handled (each thread has an ID)
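On release and re-entrancy, which the video does not cover: a common textbook approach records the owner's ID and a nesting depth. The sketch below is one hypothetical way to do it with C11 atomics; `rec_mutex_t` and its functions are invented for illustration, thread IDs are assumed to be nonzero integers supplied by the caller, and the non-blocking form keeps it short.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER 0 /* assumption: real thread IDs are never 0 */

typedef struct {
    atomic_int owner; /* ID of the holding thread, or NO_OWNER */
    int depth;        /* nesting count; touched only by the owner */
} rec_mutex_t;

static void rec_init(rec_mutex_t *m) {
    atomic_init(&m->owner, NO_OWNER);
    m->depth = 0;
}

/* Take the lock, or nest if the caller already owns it. */
static bool rec_try_lock(rec_mutex_t *m, int self) {
    if (atomic_load_explicit(&m->owner, memory_order_relaxed) == self) {
        m->depth++;
        return true;
    }
    int expected = NO_OWNER;
    if (atomic_compare_exchange_strong_explicit(
            &m->owner, &expected, self,
            memory_order_acquire, memory_order_relaxed)) {
        m->depth = 1;
        return true;
    }
    return false;
}

/* Undo one level of locking; only the owner may unlock. */
static bool rec_unlock(rec_mutex_t *m, int self) {
    if (atomic_load_explicit(&m->owner, memory_order_relaxed) != self) {
        return false;
    }
    if (--m->depth == 0) {
        atomic_store_explicit(&m->owner, NO_OWNER, memory_order_release);
    }
    return true;
}
```

Releasing is just an atomic store of "no owner" with release ordering; the ownership check is what distinguishes this from the bare flag shown in a plain spinlock.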

@LewisCowles 2 years ago

Do semaphore next ;) then yielding (generators)

@petevenuti7355 2 years ago

got the source up anywhere?

@mjan9347 2 years ago

Can you do a video on TSX (HLE/RTM)?

@int16_t 2 years ago

Think of two people who need to use a bathroom. The people are the threads and they need to shit. If they shit, they lock the bathroom, and when they're done, they unlock it.

@ovsepavakian6109 1 year ago

Does it mean all the mutex implementations are actually powered by cpu support of lock commands?

@shadamethyst1258 2 years ago

I heard that you also have to temporarily disable DMA before you perform the atomic operation; is that true on your chip?

@admink8662 1 year ago

Great

@noahwilliams8996 2 years ago

But how does the assembly instruction work?

@williamdrum9899 2 years ago

I'm not too familiar with x86 but I believe what happens is that if another processor attempts to access it, it will halt until the lock is released and then continue. So if I run the instruction lock inc byte ptr [1234h] then any other processor that tries to do the same will idle until the first processor finishes. Remember this is only really a thing for instructions that need to read, edit, and writeback. You actually can't LOCK any arbitrary instruction from what I remember. Only the ones where it matters

@kap1840 1 year ago

Will this implementation work in a multi processor environment where cpu caches are used?

@AlessioSangalli 1 year ago

The atomic hardware operations will take care to trigger the appropriate memory barriers, otherwise they would be useless. In a non-multiprocessor environment atomics are not even necessary for mutexes

@satellite964 2 years ago

Yeah but how is it implemented in hardware?

@eyemotif 2 years ago

// rawr

@Jim_Frost 2 years ago

Is your OS code on Github? It would be great to follow along for learning.

@abdullaalmosalami 1 year ago

I'm sorry but I don't think there was enough explanation on how this "global variable" actually "locks" some other piece of memory for one task. Let's say I have two tasks, A and B, and they both at some point are attempting to write to a global variable myVar. If within the code of each task you literally have myVar = something, then what? Where does this "global variable mutex" come in to play?
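A short sketch may help with this question. The mutex does not lock `myVar` in any hardware sense; it only works because every task agrees to take the same lock before touching `myVar`. The code below assumes that convention; `my_var_lock`, `lock`, `unlock`, `task_a_step`, and `task_b_step` are illustrative names rather than anything from the video.

```c
#include <stdatomic.h>

static atomic_flag my_var_lock = ATOMIC_FLAG_INIT; /* the "global mutex" */
static int myVar = 0;                              /* the shared data */

static void lock(void) {
    /* Spin until the previous value of the flag was "clear". */
    while (atomic_flag_test_and_set_explicit(&my_var_lock,
                                             memory_order_acquire)) {
        /* busy-wait */
    }
}

static void unlock(void) {
    atomic_flag_clear_explicit(&my_var_lock, memory_order_release);
}

/* Task A: the read-modify-write of myVar happens only while holding the lock. */
static void task_a_step(void) {
    lock();
    myVar = myVar + 1;
    unlock();
}

/* Task B: uses the same lock, so it can never interleave with task A's update. */
static void task_b_step(void) {
    lock();
    myVar = myVar * 2;
    unlock();
}
```

If either task wrote `myVar = something` without calling `lock()` first, nothing would stop it. The protection comes from the shared discipline, and the atomic instruction only guarantees that two tasks cannot both believe they hold the lock.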

@nothappyz 2 years ago

Wait, wouldn't that be busy waiting? I'm taking a uni class on calculators, please educate me

@KangJangkrik 2 years ago

So Linux' mutex can be improved with that way?

@cvspvr 1 year ago

just a friendly reminder: templeos is still the greatest operating system of all time

@sayanhanra3029 1 year ago

Can anyone give me resources for learning os? as I am a beginner.....

@NoorquackerInd 2 years ago

Does this mean that there are times when there are non-blocking mutex lock functions, like where it'll just fail (silently or not) if it's already in use?

@DFPercush 2 years ago

It's certainly possible. For example in Windows API there is a function, WaitForSingleObject(handle, timeout), which can return with an error code after a specified delay (which could be 0). Or if you're rolling your own assembly, you can just set a flag indicating whether the lock succeeded and not loop. I'm not sure if the POSIX standard says anything about that. If you're stuck with a blocking API, you could also use a native mutex to protect your own custom mutex variable, to get around the problem in the "bad mutex" example. That's not perfect, as the process scheduler could interrupt you at any time with the lock active, but it at least minimizes the time in the critical section, assuming that the real data you're trying to protect takes significantly longer to process.

@efronlicht1043 2 years ago

As another example, Go 1.18 added TryLock() to its various mutexes for exactly this case.
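On the POSIX question raised in this thread: POSIX does define `pthread_mutex_trylock`, which returns `EBUSY` instead of blocking when the mutex is already locked. The "set a flag and don't loop" variant for a hand-rolled lock can be sketched with C11 atomics as below; `try_flag`, `try_lock`, `try_unlock`, and `update_if_free` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag try_flag = ATOMIC_FLAG_INIT;

/* One attempt, no loop: true means the caller now holds the lock. */
static bool try_lock(void) {
    return !atomic_flag_test_and_set_explicit(&try_flag,
                                              memory_order_acquire);
}

static void try_unlock(void) {
    atomic_flag_clear_explicit(&try_flag, memory_order_release);
}

/* Example caller: do the update only if the lock is free right now,
 * and report failure instead of waiting. */
static bool update_if_free(int *value) {
    if (!try_lock()) {
        return false; /* busy: the caller decides what to do next */
    }
    *value += 1;
    try_unlock();
    return true;
}
```

Returning the result rather than failing silently lets the caller choose between retrying later, doing other work, or giving up.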

@yokba 2 years ago

Could you explain the difference between mutex and spinlock?

@kennichdendenn 2 years ago

The spinlock is used in conjunction with a mutex variable. It is one way of waiting until the mutex var is no longer in the locked state. You can think of the spinlock as a while loop that checks if the mutex var is currently in the unlocked state. If not, it checks again and again and again until the mutex finally IS unlocked. That's kinda wasteful because it uses CPU cycles for the constant checking, but it is very easy to implement.

@DFPercush 2 years ago

A spin lock is one possible implementation of a mutex.

@SerBallister 2 years ago

A mutex usually puts the waiting thread to sleep, so the only CPU used is when the mutex is released and the waiting thread is told to wake up. A spinlock will have the CPU in a loop constantly testing a variable.

@marcossidoruk8033 1 year ago

A spinlock makes the thread spin in a loop, while a mutex puts the thread to sleep, usually using a special OS system call that tells the OS the thread is waiting on a mutex so that the OS can optimize for that in a way it wouldn't be able to otherwise. Generally, spinlocks are faster than mutexes because mutexes pay a rather expensive syscall cost and spinlocks don't. However, spinlocks do not release the threads, so if the number of threads is greater than the number of cores in the system the OS scheduler will probably go nuts switching between the waiting threads, and that is bad. Also, spinlocks are generally a bad idea if you are waiting for relatively long periods of time. So generally, use a mutex if you are waiting for long periods of time or you have a lot of threads, and use a spinlock if you are locking and unlocking really quickly.

@Ryan-xq3kl 2 years ago

Are these only relevant to hardware level operations ?

@bob-ny6kn 2 years ago

Back in the '80s we called it a semaphore. How is this "new" again?

@dtikvxcdgjbv7975 1 year ago

Advanced.

@TheExileFox 2 years ago

But what if you want something that isn't a mutex but also isn't a regular data type? Something with a "write protect lock" that can be enabled or disabled on-the-fly (but still allows multiple threads to read from it regardless of its state)? What do you suggest then?
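What this comment describes is close to a reader-writer lock, which POSIX provides as `pthread_rwlock_t`. A reader-writer lock differs in one respect from the request: readers are allowed in only while no writer holds the lock. The sketch below is a hypothetical non-blocking version built on one C11 atomic integer; all names are invented for illustration, and it makes no attempt at fairness between readers and writers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* rw_state: -1 = one writer, 0 = free, n > 0 = n active readers */
static atomic_int rw_state = 0;

/* Join the readers unless a writer holds the lock. */
static bool rw_try_read_lock(void) {
    int cur = atomic_load(&rw_state);
    while (cur >= 0) {
        /* On failure, cur is reloaded with the current state. */
        if (atomic_compare_exchange_weak(&rw_state, &cur, cur + 1)) {
            return true;
        }
    }
    return false;
}

static void rw_read_unlock(void) {
    atomic_fetch_sub(&rw_state, 1);
}

/* A writer needs the lock completely free: no readers, no writer. */
static bool rw_try_write_lock(void) {
    int expected = 0;
    return atomic_compare_exchange_strong(&rw_state, &expected, -1);
}

static void rw_write_unlock(void) {
    atomic_store(&rw_state, 0);
}
```

Any number of readers can hold the lock together, while a writer excludes everyone. Readers that must proceed even during a write would need a different scheme, such as versioned or double-buffered data.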