
Data Consistency and Tradeoffs in Distributed Systems 

Gaurav Sen
572K subscribers
178K views

This is a detailed video on consistency in distributed systems.
00:00 What is consistency?
00:36 The simplest case
01:32 Single node problems
03:35 Splitting the data
04:23 Problems with disjoint data
07:10 Data Copies
12:01 The two generals problem
13:56 Leader Assignment
15:24 Consistency Tradeoffs
15:56 Two phase commit
25:00 Eventual Consistency
Eventual Consistency: interviewready.io/learn/syste...
You can follow me on:
Instagram: / interviewready_
LinkedIn: / interview-ready
Github: github.com/coding-parrot/
Twitter: / gkcs_
#SystemDesign #DistributedSystems #Consistency

Published: 1 Jun 2024

Comments: 152
@shubhamsingh6929 2 years ago
For new readers, I think it should be explicitly pointed out that Consistency in ACID is very different from Consistency in CAP. ACID consistency means that the values in DB should be valid i.e. if we are making two transactions both should complete before committing to disk. CAP consistency means read requests to different nodes of a distributed system return the exact same data. A system can exist without CAP consistency but a system should never exist without ACID consistency. ACID consistency is must-have for any RDBMS to be reliable/useful while CAP consistency can be a trade-off as per requirement. To be honest C in ACID doesn't add any proper value. If a transaction is Atomic, Isolated, and Durable, it automatically becomes consistent.
@firoufirou3161 2 years ago
" C in ACID doesn't add any proper value", I don't think so, we don't have the concept of keeping the db in a valid state in Atomic, Isolated, and Durable. None of these mean that we should have the constraints respected.
@madacsg 1 year ago
"automatically becomes consistent" - if "C" already told you how to utilize "AID" properly...
@ildar5184 1 year ago
" if we are making two transactions both should complete before committing to disk." What does that mean? How I understood consistency in ACID is that constraints like foreign keys, data types etc. must always be fulfilled, i.e. you can't commit a transaction if you're inserting a foreign key that doesn't exist in a referenced table.
@zod1ack_csgo_clips425 1 year ago
@@ildar5184 That's referential integrity but in one way it ensures consistency
@rizwan050587 9 months ago
Atomicity is what you are talking about.
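A minimal sketch of the ACID-consistency point raised in this thread (a commit that would violate a foreign key must be rejected), assuming Python's built-in sqlite3 module. The table names and the try_insert_post helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, "
    "user_id INTEGER NOT NULL REFERENCES users(id))"
)
conn.execute("INSERT INTO users (id) VALUES (1)")
conn.commit()


def try_insert_post(post_id, user_id):
    """Return True if the insert commits, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO posts (id, user_id) VALUES (?, ?)",
                (post_id, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = try_insert_post(1, 1)         # user 1 exists, so this commits
rejected = try_insert_post(2, 42)  # user 42 does not exist, so this is refused
```

The second insert never becomes visible: the database moves from one valid state to another or does not move at all. That guarantee is about constraints on a single database, which is a different thing from replicas agreeing with each other.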
@MsSierra9 2 years ago
Your system design videos helped me get a job at Amazon! Thank you so much for all the great work you are doing! :)
@gkcs 2 years ago
Congratulations 🍻😁
@Jag144 2 years ago
Congrats Say!!!!
@zhizhou6443 2 years ago
@@gkcs Nice!
@zhizhou6443 2 years ago
And thanks!
@samarthtandale9121 1 year ago
The way you teach concepts from scratch/ bottom-up way is the most favourite thing in this channel !
@gkcs 1 year ago
Thank you 😁
@nameunknown007 2 years ago
One thing I learnt from this video today is you can’t get super computers in super market.
@jedaademola 6 months ago
😂😂😂
@yt-hq6cu 2 years ago
You said Facebook is down, and it was down a week later. What a coincidence!
@srikanthmaganty524 2 years ago
I like ur energy and the selfless appetite to share with community...
@eduardolacerda1557 2 years ago
Thanks a lot for sharing this content! this is very helpful. Although I was aware of a lot of concepts there, your samples help me to figure them out a lot better. :)
@cengizandak4241 2 years ago
Great explanation. However, 2 phase commit is not usually recommended in the current industry, because a failure of the coordinator or of a microservice can mix things up by keeping resources locked, timing out, etc. Therefore there are other options like 3 phase commit or SAGA, which are widely used in fintech as well as in big companies.
@amit_dwivedi 2 years ago
The Tshirt Gaurav! 😍 NITS ❣️
@akhilguptagunturu3048 1 month ago
Great explanation. Thanks for creating this video :)
@sophianachiba663 2 years ago
Very clear as always... What does the prepare statement phase offer that the commit doesn't in the 2 phase commit, since they can both fail and both may need to be rolled back?
@gokuls9929 2 years ago
That 'Yeah! I made a joke!' look at 8:32 is priceless!
@bimakumkum277 3 months ago
Thank you for this great explanation
@hamsalekhavenkatesh3440 4 months ago
Nice. It's also worth knowing more than 2PC: we can also use TC/C (Try-Confirm/Cancel), which does not explicitly lock the resources; rather, it works by issuing compensating transactions in case of failures.
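A rough sketch of the compensating-transaction idea mentioned above, assuming each step comes paired with an undo action. The run_with_compensation helper, the balances dictionary and the step functions are hypothetical names for illustration, not part of any framework.

```python
def run_with_compensation(steps):
    """Run (action, compensate) pairs in order.

    If an action raises, the compensations of the steps that already
    succeeded are run in reverse order and False is returned.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


balances = {"alice": 100, "bob": 0}


def debit_alice():
    balances["alice"] -= 30


def refund_alice():
    balances["alice"] += 30


def credit_bob_fails():
    # Stand-in for a downstream service that is unavailable.
    raise RuntimeError("bob's service is unavailable")


ok = run_with_compensation([
    (debit_alice, refund_alice),
    (credit_bob_fails, lambda: None),
])
```

No resource stays locked while the steps run; the cost is that other readers can briefly see the debit before the refund undoes it.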
@aniketsingh9956 1 year ago
That 'C++' expression was priceless 😁
@gkcs 1 year ago
Hahaha!
@thisalma 2 years ago
Very well explained ❤️
@robotempire 2 years ago
Thanks bro. I have been doing front-end for the past 5 years. I'm just putting myself back on the market now, and this video is a great refresher on the fundamentals of consistency in distributed systems.
@sankalparora9374 1 year ago
Amazing explanation! Thanks a lot! I feel more and more confident in system design daily... Thanks to you, sir!
@gkcs 1 year ago
Thanks Sankalp!
@benparker8000 1 year ago
Amazing video. So easy to understand complicated topics. I have very limited experience with this and the pace and the content was perfect.
@benparker8000 1 year ago
Did the eventual consistency video ever come out?
@aravindkumaresan2747 2 years ago
Hi Gaurav, thanks for making such great content on consistency in distributed systems. But in the case of 2 phase commit, if one of the follower machines goes down, then the leader, even though it got commit responses from the other followers, can't commit either and may need to roll back the transaction. So until that single follower comes back, the system doesn't accept any write operations, so it is still a single point of failure, right?
@Thrillseeker419 2 years ago
Here are my concerns: how do you track the prepare statements sent? How will server B differentiate two duplicate prepare statements? Also, as part of the prepare statement execution can we not update a status column of a row to indicate it is in process of being updated? Also how does the leader keep track of all of the data that needs to be sent out? What if there is just too much data to be sent out for one leader to handle?
@pallavgurha 2 years ago
Hi Gaurav, just wondering whether a high-water-mark index would help here, so the client can be assured that the commit is acknowledged. Everything nowadays is async in nature anyway, and once all the nodes in the cluster are up to speed we can start showing the change to the client.
@harshkatkade4206 2 years ago
In the 2 phase protocol, consider a scenario. If server C receives a begin transaction order from A and then fails to receive a commit order, what if we block read requests to C for some amount of time, while the "begin" transaction order is not yet committed, so that the leader can keep trying to commit the change in C? Latency is increased, but I guess consistency can be maintained. Is this correct? Please let me know.
@stephensmithwick7502 2 years ago
I loved the pause on the C++ joke >_< as if to say yes I’m going there
@RAJU9622 2 years ago
Excellent Gaurav
@hasanulislam3112 2 years ago
Could you recommend any good system design concepts apart from your interview-focused course?
@sumitkumarmallick6040 2 years ago
Can you please share the link to the video where I can find the idempotency concept?
@kalpakHere 2 years ago
One of my favourite topics in this domain. Up next: Consensus ??
@zaleel 2 years ago
Have a question. If all the commits/prepares fail after multiple retries, should we roll back on the master as well? And in 2PC, instead of prepare and commit, why can't we just first send the update in phase 1 (i.e. commit) and then in phase 2 compare A (the master) to the slaves? If it matches, it's done. If it fails, we mark the state as disputed or eventually roll back.
@raj_kundalia 1 year ago
Thanks for the video
@Varunshrivastava007 2 years ago
How did you solve the latency in a consistent system. Still servers have to send request from UK to EU for db updates (to remain consistent).
@shivprakashy 2 years ago
Don't know if you have a video on the books/blogs you follow. 👏👌#goodcontent for #systemdesign
@jeelanyelidandla2477 1 year ago
Gaurav, do the UK, US, and other countries' governments agree to store their data across other regions? what do you think?
@tylerbenton4495 2 years ago
Would be great if the people who disliked the video commented on why they disliked it. Did you disagree with something he said? If so, write it in the comments to educate us on why you disagree.
@praveensg 2 years ago
Insecurity
@kayodeadechinan5928 2 years ago
A very wise comment! Thanks
@stIncMale 2 years ago
'C' in CAP is what 'I' in ACID. 'C' in ACID has no analogy in CAP and means that there are constraints and the DBMS makes sure they are not violated. More specifically, 'C' in CAP is linearizability, while 'I' in ACID is serializability (and not even strict serializability, which would be more similar to linearizability). While both are what is called "consistency model" or "correctness condition" (a similar concept in programming languages and in processor instruction set architectures is called "memory model"), they are not the same.
@gkcs 2 years ago
That's a good point. The "Consistency" term lines up a lot more with Isolation in ACID. I think Martin Kleppmann has spoken about this a few times.
@DejaimeNeto 2 years ago
"Maybe C was updated here; maybe C++", shots fired
@mukunthsenthilkumar8822 2 years ago
This video is pure gold! Great stuff, Gaurav!
@Bibhaw 9 months ago
What is the use of Kafka if all the communication is happening directly from leader to follower?
@stalera 2 years ago
Great work. Thanks for the knowledge. However, in the explanation there's 1 server with write which makes it again a single point failure and if we increase the write servers then we're back to square one. Can you address this?
@TheNayanava 2 years ago
In one of my previous organizations we solved this by enabling multi-master, or multiple primaries. We had three primary nodes, but all of the write requests would only come to one primary node. There was automatic failover in this scenario: a Linux Virtual Server (LVS) took care of routing the traffic. The LVS was updated with the public IP of whichever node was the primary at the time, and every call that landed on the LVS would resolve to that IP. But as you see, every write would be successfully ack-ed only when all three servers had the latest commit.
@cengizandak4241 2 years ago
2 phase is not usually recommended. There are better options like 3 phase commit or SAGA. I would recommend you check them out.
@TheIndianGam3r 2 years ago
Not every type of DB allows more than one write server. For example, PostgreSQL doesn't allow a multi-master setup. This issue is resolved by running a high-availability cluster management tool along with your database, for example Galera Cluster for MySQL and Patroni for PostgreSQL. Here, in case the master fails, one of the read replicas is chosen as the new master, and a proxy on top handles the change in route. A multi-master setup can have many problems with duplicate entries if not handled well.
@TheNayanava 2 years ago
@@TheIndianGam3r Yeah, we were using a Galera cluster on top of MariaDB, but although it was multi-master, writes were accepted only on a single master; the others were exact copies of the master accepting the writes. This helped with automatic failover instead of manual failovers. And true, enabling writes on multiple masters actually increased the chatter, which is why we disabled it. This was more like multiple replicas of the same master.
@mukeshstorge7384 2 years ago
In this cluster each slave knows the master's health by checking its heartbeat. Once the master goes down, leader election takes place and one of the slaves becomes the master. Consensus algorithms such as Raft are involved here for leader election.
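A toy sketch of the heartbeat-based failover described in this thread. It assumes a single observer with a shared clock and picks the replacement deterministically; this is a placeholder for illustration and not a real election, since production systems rely on a consensus protocol such as Raft to avoid two nodes both believing they are the leader. The Cluster class and the 3-second timeout are hypothetical.

```python
class Cluster:
    def __init__(self, nodes, leader, timeout_s=3.0):
        self.nodes = list(nodes)
        self.leader = leader
        self.timeout_s = timeout_s
        self.last_heartbeat = {}  # node name -> time of last heartbeat

    def heartbeat(self, node, now):
        """Record that `node` was seen alive at time `now` (seconds)."""
        self.last_heartbeat[node] = now

    def is_alive(self, node, now):
        last = self.last_heartbeat.get(node)
        return last is not None and now - last <= self.timeout_s

    def check(self, now):
        """Keep the leader if its heartbeat is fresh, else promote a live follower."""
        if self.is_alive(self.leader, now):
            return self.leader
        candidates = [
            n for n in self.nodes
            if n != self.leader and self.is_alive(n, now)
        ]
        if candidates:
            # Placeholder rule: smallest name wins. Not a real election.
            self.leader = min(candidates)
        return self.leader


cluster = Cluster(["a", "b", "c"], leader="a")
for node in ("a", "b", "c"):
    cluster.heartbeat(node, now=0.0)
leader_before = cluster.check(now=1.0)

# "a" goes silent; "b" and "c" keep reporting in.
cluster.heartbeat("b", now=5.0)
cluster.heartbeat("c", now=5.0)
leader_after = cluster.check(now=5.0)
```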
@saiprasad84 2 years ago
Isn't the leader a SPOF here? Just curious.. may be if the leader fails, one of the followers become the leader? Wonder how that might happen and if there are databases that make it happen automatically.
@nodirbeksharipov458 1 year ago
You're a genius, man.
@sahilzainuddin7134 2 years ago
Just finished watching Social Network, and right in the next video click, I see an example of the same.
@marcpaguilar 1 year ago
The very quick smile you gave the camera when you said "c++" 🤣🤣🤣🤣
@suchismitagoswami5609 2 years ago
Great explanation. I have one question. Ideally, the followers or replicas should reside in different regions or data centers to ensure availability. So this introduces latency in every write operation, since the data is transferred synchronously from the leader to the followers if we follow 2 phase locking. How do we handle this?
@sidderverse 2 years ago
I believe you will have to go for eventual consistency.
@generationgap416 1 year ago
That is why there are tradeoffs between consistency, availability and performance. They are mutually exclusive. Eventual consistency is the middle ground.
@ThePujjwal 2 years ago
Isn't 2PC used for maintaining atomicity in case of a distributed transaction ? Got me a bit confused about we leveraging it to achieve consistency.
@generationgap416 1 year ago
Atomicity is all or nothing. Consistency means the data copies should be the same on different nodes in a distributed environment.
@praveensg 2 years ago
Wow, nice and bright. New camera?:)
@imperfecto7734 2 years ago
08:30 he realises how c++ language got its name. its intuitive :D
@akashtadwai9471 2 years ago
Hi Gaurav, in a two-phase commit, are the transactions on all the servers rolled back if one of the servers fails to acknowledge?
@pste22 2 years ago
yes it should! 2 phase commit is all about establishing consistency between all the related services!
@generationgap416 1 year ago
Good question. There should be a thing called Almost Consistent. Use reference counting. There should also be a thing called Regional Consistency. Data sharding should necessitate this.
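The rollback question in this thread can be sketched in a few lines. This is a simplified in-memory model under stated assumptions: no network, no timeouts, no coordinator crash, and a commit that cannot fail once prepare has succeeded. The Participant class and two_phase_commit function are hypothetical names, not a library API.

```python
class Participant:
    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # False simulates a node that cannot acknowledge
        self.committed = {}             # data visible to readers
        self.staged = {}                # txn_id -> (key, value) awaiting the decision

    def prepare(self, txn_id, key, value):
        """Phase 1: stage the change and vote yes, or vote no."""
        if not self.can_prepare:
            return False
        self.staged[txn_id] = (key, value)
        return True

    def commit(self, txn_id):
        """Phase 2: make the staged change visible."""
        key, value = self.staged.pop(txn_id)
        self.committed[key] = value

    def rollback(self, txn_id):
        """Phase 2 (abort): discard the staged change if there is one."""
        self.staged.pop(txn_id, None)


def two_phase_commit(participants, txn_id, key, value):
    prepared = []
    for p in participants:
        if p.prepare(txn_id, key, value):
            prepared.append(p)
        else:
            # A single "no" aborts the transaction on every node that said "yes".
            for q in prepared:
                q.rollback(txn_id)
            return False
    for p in participants:
        p.commit(txn_id)
    return True


a, b = Participant("a"), Participant("b")
ok = two_phase_commit([a, b], "t1", "name", "Alice")

c = Participant("c", can_prepare=False)
failed = two_phase_commit([a, b, c], "t2", "name", "Bob")
```

After the second call, a and b still hold "Alice" and have nothing staged: the change is rolled back everywhere it was prepared, so no copy diverges.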
@kajalpareek8291 2 years ago
Hi Gaurav ,Thanks for such informative content. Looking forward for a series on design pattern
@praveenkurapati7300 2 years ago
While rollback sent by the master, does it send to all slaves ?
@Amritanjali 2 years ago
again learned a lot
@gkcs 2 years ago
Thanks Amritanjali! Good morning 😁
@Amritanjali 2 years ago
@@gkcs good afternoon ☺☺
@manveersingh5822 2 years ago
That C++ smile and expression at 8:30 .. haha
@lakeman4101 2 years ago
When I started listening to your videos... I had no idea, other than that you know your stuff. Right now I can keep up quite alright... I now appreciate the level of your skill... and the chronological argument you give and use to explain. I love this man
@darkstudio3170 2 years ago
He is wearing NITS Hacks T-shirt. I still remember when he came to NIT Silchar as a judge for NIT Hacks and we had a photo with him.
@gkcs 2 years ago
I'm proud to wear it 🤠
@DebajyotiDev 2 years ago
Is the C in the ACID properties of SQL the same as in the CAP theorem? What you said is for CAP. But if you check the Oracle docs, consistency in ACID means data should not violate any DB constraints (FK, UK, column types, etc.). Correct me if I am wrong.
@generationgap416 1 year ago
You are right. C in ACID relates more to atomicity and idempotency, while the C in CAP means that data copies in a distributed environment must be the same. The hash value of each copy must be equal, if there are no collisions in the hash function.
@ParthPatel-jn6io 1 year ago
At 1:24 there's a mis-communication and writing on board for Get profile A
@theSDE2 2 years ago
Liked the new edit. Kudos to your techie video editor 🤓
@gkcs 2 years ago
Yes, it's amazing 😁
@sagararora2830 2 years ago
Now I get to know When we went to banks and they are like servers are not responding, but actually in the backend they are keeping themselves consistent. :D
@BarHemo32 2 years ago
Why not use Kafka or RabbitMQ, store all of the updates in them, and have each database subscribe and make the changes according to the queue? The data will be eventually consistent. Is this a good idea?
@AnPham-uz3td 2 years ago
But Kafka / RabbitMQ will need a way to send updates to those follower DBs, and to guarantee consistency between those follower DBs it will be just like the 2 phase commit.
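A small model of the log-and-subscribers idea discussed in this thread, assuming an ordered, append-only log that each replica reads at its own pace. It is plain Python, not the Kafka or RabbitMQ client API; UpdateLog, Replica and catch_up are hypothetical names.

```python
class UpdateLog:
    """Ordered, append-only list of (key, value) updates."""

    def __init__(self):
        self.entries = []

    def append(self, key, value):
        self.entries.append((key, value))


class Replica:
    def __init__(self):
        self.data = {}
        self.offset = 0  # index of the next log entry this replica will apply

    def catch_up(self, log, max_entries=None):
        """Apply pending entries in log order, optionally only a few at a time."""
        end = len(log.entries)
        if max_entries is not None:
            end = min(end, self.offset + max_entries)
        for key, value in log.entries[self.offset:end]:
            self.data[key] = value
        self.offset = end


log = UpdateLog()
log.append("name", "A")
log.append("name", "B")
log.append("city", "X")

fast, slow = Replica(), Replica()
fast.catch_up(log)
slow.catch_up(log, max_entries=1)      # the slow replica lags behind
stale = slow.data != fast.data         # reads can disagree for a while

slow.catch_up(log)
converged = slow.data == fast.data     # same log, same order, same final state
```

The replicas agree once they have applied the same prefix of the log, which is eventual consistency. Making every replica agree before a write is acknowledged needs coordination on top, as the reply above points out.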
@n4naveeen 2 years ago
Hi Gaurav, can you pls make a video on lifecycle of code? i e, what happens after the coding is done?
@gkcs 2 years ago
I have spoken about deploying code and versioning client libraries at InterviewReady. You can check out the first design of "Design an email service" here: get.interviewready.io
@aj-loves-tech 1 year ago
8:32 that smile hits me every time 😂😂😂
@gkcs 1 year ago
C++ 😛
@Lima3578user 2 years ago
YouTube recommended your video at 6am today... ever since I'm hooked... prepping for a technical program manager role... hope you will make some program-centric videos too
@surenderthakran6622 2 years ago
Loved the video. Just to nitpick a little, Kafka belongs more to the pub/sub family rather than the message queue family.
@generationgap416 1 year ago
It does both of those things well. Don't nitpick.
@vinays2493 2 years ago
What does GKCS stand for
@sagararora2830 2 years ago
Hi Gaurav Sen, when the video on Eventual consistency will come. Waiting for it
@gkcs 1 year ago
Added it here: get.interviewready.io/learn/system-design-course/consistency-in-distributed-systems/Eventual-Consistency
@Tarsemsingh-rd8lx 2 years ago
Thanks Gaurav, I have question, which tool support 2 phase commits? Kafka can you please suggest open source tool
@sureshbabu8794 2 years ago
What about partitioning? You didn't explain or I might have missed..
@Raja-kl4op 2 years ago
Hi Gaurav, Great video! But I have a doubt. Why does communication over a long distance lead to latency? Don't servers use electromagnetic waves(speed of 3 x 10^8 m/s) to communicate with each other. So, the distance of a few 1000KM's will only cost few milliseconds. Is that enough to cause latency? Or did I miss something?
@gkcs 2 years ago
There are more routers to hop on.
@generationgap416 1 year ago
The last mile is the bottleneck.
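A back-of-the-envelope calculation for the distance question in this thread. It assumes signals in optical fibre travel at roughly two thirds of the vacuum speed of light and counts only propagation delay; router hops, queueing and the last mile add to these figures. The 5000 km distance and the function names are illustrative.

```python
SPEED_OF_LIGHT_VACUUM_M_S = 3.0e8
FIBER_FRACTION = 2 / 3  # assumption: light in fibre moves at about 2/3 of c


def one_way_delay_ms(distance_km, fraction_of_c=FIBER_FRACTION):
    """Propagation delay in milliseconds for one trip over distance_km."""
    speed_m_s = SPEED_OF_LIGHT_VACUUM_M_S * fraction_of_c
    return distance_km * 1000 / speed_m_s * 1000


def two_phase_commit_floor_ms(distance_km):
    """Lower bound for one 2PC write: prepare + ack, then commit + ack.

    That is two round trips, i.e. four one-way trips.
    """
    return 4 * one_way_delay_ms(distance_km)


one_way = one_way_delay_ms(5000)               # about 25 ms
write_floor = two_phase_commit_floor_ms(5000)  # about 100 ms
```

A single hop of a few milliseconds looks harmless, but a synchronous protocol pays it several times per write, and every write waits for the slowest replica.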
@mrrishiraj88 2 years ago
🙏🙏🙏
@basitahmad8067 2 years ago
Next Hotstar 🙏... High in demand
@gkcs 2 years ago
Check out the free course preview at InterviewReady 😎
@shakeelahmed8015 2 years ago
So we can roll back messages from Kafka 🤔
@sorandom9452 2 years ago
In case of single leader, there are high chance of single point failure.
@TheNayanava 2 years ago
In one of my previous organizations we solved this by enabling multi-master, or multiple primaries. We had three primary nodes, but all of the write requests would only come to one primary node. There was automatic failover in this scenario: a Linux Virtual Server (LVS) took care of routing the traffic. The LVS was updated with the public IP of whichever node was the primary at the time, and every call that landed on the LVS would resolve to that IP. But as you see, every write would be successfully ack-ed only when all three servers had the latest commit.
@eatcodegame4952 2 years ago
Say, for example, you have 3 nodes in a cluster. For any write you can get it acked by more than one node, so there is more than one node besides the master that has the updated copy, and one of them is going to take the place of the master in case the master goes down.
@generationgap416 1 year ago
Use a replica set of masters and use Kubernetes to orchestrate the environment.
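A minimal sketch of the "acked by more than one node" idea from this thread, assuming a write counts as successful once a strict majority of replicas has stored it. Replicas are plain dictionaries and None stands for an unreachable node; write_with_quorum is a hypothetical helper, not a database API.

```python
def write_with_quorum(replicas, key, value):
    """Store the write on every reachable replica.

    Returns True if a strict majority acknowledged it. A failed write is
    not undone on the replicas that did store it; a real system would have
    to repair or roll back those copies.
    """
    needed = len(replicas) // 2 + 1
    acks = 0
    for replica in replicas:
        if replica is not None:  # None simulates a node that is down
            replica[key] = value
            acks += 1
    return acks >= needed


two_of_three = [{}, {}, None]
majority_ok = write_with_quorum(two_of_three, "k", "v")

one_of_three = [{}, None, None]
minority_ok = write_with_quorum(one_of_three, "k", "v")
```

With 3 nodes the write survives the loss of any single node, because at least two copies exist before the client is told it succeeded.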
@arulantony2137 2 years ago
25:00 SAGA pattern ??
@sankalparora9374 1 year ago
That gaze 8:32.
@gkcs 1 year ago
😛
@rajvadheraju3568 2 years ago
It seems like you mixed up Consistency and Distributed Transactions when you talk about 2PC.
@adityagoel1738 2 years ago
Retrying every 5 seconds doesn't make sense if the system is down because it's not gonna be up and running soon and hitting it at every 5 seconds leads to wastage of resources.
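One common alternative to a fixed 5-second retry is exponential backoff, sketched here. The base delay, growth factor, cap and jitter values are illustrative assumptions, and backoff_delays is a hypothetical helper that only computes the wait times; it does not sleep or send anything.

```python
import random


def backoff_delays(base_s=5.0, factor=2.0, cap_s=300.0, attempts=6,
                   jitter=0.0, rng=random.random):
    """Return the wait before each retry, growing geometrically up to cap_s.

    With jitter > 0, each delay is stretched by a random fraction of up to
    `jitter`, so many clients do not retry at the same instant.
    """
    delays = []
    delay = base_s
    for _ in range(attempts):
        capped = min(delay, cap_s)
        delays.append(capped * (1 + jitter * rng()))
        delay *= factor
    return delays


schedule = backoff_delays()                 # 5, 10, 20, 40, 80, 160 seconds
long_schedule = backoff_delays(attempts=8)  # the last two waits hit the 300 s cap
```

A node that is down for minutes gets a handful of probes instead of one every 5 seconds, while a node that recovers quickly is still retried early.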
@samarthagrawal3415 2 years ago
Sir i am very much inspired from your english conversation, will u make a video on how to do coding problems means how to build logic for that please sir it's my humble request, please make video on that
@Amritanjali 2 years ago
he already made one video on this check-in his channel
@samarthagrawal3415 2 years ago
@@Amritanjali ok thanks...
@vaibhavmehta36 2 years ago
2PC is used in which Industry?
@gkcs 2 years ago
Textiles. And also some financial transaction systems.
@mohammadhemel6412 2 years ago
While I am watching this videos, Facebook, Instagram and WhatsApp are down
@zuowang9881 1 year ago
I don’t think he released the next eventual consistency video
@gkcs 1 year ago
interviewready.io/learn/system-design-course/consistency-in-distributed-systems/Eventual-Consistency
@swastikjainsj 2 years ago
How does the Internet work? GB MA how do Jio and Airtel control net speed? Any new info, apart from underground wires? 😐
@yashwanthd1998 1 year ago
We are trying to solve single point of failure and consistency with distributed databases but ending up with the same . What am i missing here
@generationgap416 1 year ago
If you are referring to the seemingly single point of failure of the master, note that the master would be a member of a set of masters. A service like Kubernetes would orchestrate the masters. If one goes down, Kubernetes would start a copy of the master. You would have an availability problem, but that is the point of the video: perfect consistency and perfect availability are not both possible.
@rajveersingh2056 2 years ago
Tsunami in Harvard☺️
@harshshah2791 2 years ago
What happens if transaction is to add 100 rs and leader retried 100 times, so 100 rs added 100 times?
@gkcs 2 years ago
The transaction id will make the server ignore 99 of the later requests.
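A minimal sketch of the transaction-id answer above, assuming the server remembers which ids it has already applied. The Account class and the in-memory set are hypothetical; a real server would persist the ids together with the balance.

```python
class Account:
    def __init__(self, balance=0):
        self.balance = balance
        self.applied = set()  # transaction ids that were already applied

    def deposit(self, txn_id, amount):
        """Apply the deposit once per txn_id; repeats are ignored.

        Returns True if the balance changed, False for a duplicate.
        """
        if txn_id in self.applied:
            return False
        self.balance += amount
        self.applied.add(txn_id)
        return True


account = Account()
# The leader retries the same "add 100 rs" transaction 100 times.
results = [account.deposit("txn-1", 100) for _ in range(100)]
```

Only the first call changes the balance, so it ends at 100 and not 10,000; the other 99 requests are recognised by their id and dropped.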
@harshshah2791 2 years ago
@@gkcs I would like to ask one more thing. I see that your SD course on IR includes many videos which are already on YouTube, so can you give an idea of what extra I will get if I buy the course? And also, what is the source of your SD knowledge? Any go-to books, sites, pages?
@gkcs 2 years ago
@@harshshah2791 All the lessons in the "additional free resources" have been picked from this YouTube playlist. The other three sections (Fundamentals, High level design and Low level design) are only available at InterviewReady. I use various books, blogs and my personal experience as a software engineer in the content. The course is designed to help you get better at software engineering (to get better at design discussions, coding, etc...).
@harshshah2791 2 years ago
@@gkcs I will surely buy it
@liatris69 2 years ago
Data sharing shyness is real.
@imperfecto7734 2 years ago
13:38 "I wont explain everything here, I'm lazy too "
@swastikjainsj 2 years ago
one more question When we delete anything it store in our computer only in hard disk just remove the link so that we cannot find that file And now we can store another data on this data it will overlap. but how can we transfer the deleted storage files to cloud that means our disc will can work for more years
@generationgap416 1 year ago
You may be able to find it using a hex editor.
@tze-ven 1 year ago
Do you know you don't need supercomputer to scale? You can scale with cloud computing. The best thing is that you don't need physical shopping. You just click and pay. 😜
@generationgap416 1 year ago
He was contrasting vertical scaling with horizontal scaling. Question: would vertical scaling be the same as horizontal if we rotate vertical scaling by 90°?
@manjuender6286 2 years ago
Super computer at super market Harvard vs Oxford 😂
@VijayInani 2 years ago
C >> C++ ... LOL
@jitinkumar835 2 years ago
1st