Are sharding and partitioning the same thing? As per my understanding, partitioning splits the data by a column and creates a separate file for each partition value, which improves read performance when we query the data by the partition column.
Hi Yogita, your explanations are awesome, but I think you could dive deeper once the basics are done. A real implementation would also be of great help, as most developers don't get to work at such a scale. Keep up the great work!
Your explanation is very good, but you speak very fast. Please also keep a positive tone while explaining so that the material is easier for the viewer to digest. Thanks.
I have one question: let's say I have sharded the data across multiple machines while designing Facebook. In that case, how is the "FEED / TIMELINE" of a particular user generated? One user can have many friends and can follow many pages as well. So do we need to go to each shard where a friend of the user resides to retrieve the post information? That would make the system a bit slow. How should we overcome this issue?
Hi, when a particular user posts something, a fan-out service pushes that post into the timelines of the user's friends. Those timelines are also maintained with some caching strategy, and the push happens in the background, so when a friend looks at their timeline, the post is already there. One more thing to keep in mind: data is eventually consistent on most social networking platforms, so it may take some time for the timeline to update.
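To make the fan-out idea above concrete, here is a minimal Python sketch of fan-out on write. The in-memory `followers` and `timelines` structures and the function names are hypothetical stand-ins for a follower store and a timeline cache; a real system would run `publish` in a background worker, which is where the eventual consistency comes from.

```python
from collections import defaultdict, deque

# Hypothetical in-memory stand-ins (assumptions for this sketch):
# followers: author id -> set of follower ids
# timelines: user id -> newest-first post ids, capped at 100 entries
followers = defaultdict(set)
timelines = defaultdict(lambda: deque(maxlen=100))


def follow(follower_id, author_id):
    """Record that follower_id follows author_id (a friend or a page)."""
    followers[author_id].add(follower_id)


def publish(author_id, post_id):
    """Fan-out on write: push the post id into every follower's cached timeline."""
    for follower_id in followers[author_id]:
        timelines[follower_id].appendleft(post_id)


def read_timeline(user_id, limit=10):
    """Reading touches only the user's own timeline, not the friends' shards."""
    return list(timelines[user_id])[:limit]


follow("bob", "alice")
follow("bob", "page1")
publish("alice", "p1")
publish("page1", "p2")
print(read_timeline("bob"))
```

The trade-off in this sketch is that each write costs one push per follower, while each timeline read becomes a single lookup.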
A bright, confident, precise, and concise lesson. You are a great teacher and kept my brain's neurons constantly firing. What a class! Keep up all this good work.
One basic concept that could be covered before intermediate concepts like sharding is indexing. Often, before scaling up, it's better to index smartly to reduce read latency. Overall, a good video.
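As a small illustration of the indexing point, the sketch below uses Python's built-in `sqlite3` module to compare the query plan for the same lookup before and after adding an index. The `users` table, its columns, and the index name `idx_users_email` are made up for the example, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def query_plan(sql, params=()):
    """Return SQLite's plan description for a query (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


lookup = "SELECT name FROM users WHERE email = ?"

# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(lookup, ("user42@example.com",))

# With an index on the filtered column, it can search the index instead.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(lookup, ("user42@example.com",))

print(plan_before)
print(plan_after)
```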
Yogita, just wanted to let you know that I have read about database sharding in other popular sources too, but your explanation with examples is the best I have found. Keep going!
Hi Yogita, does adding more shards later in the development phase also pose a challenge? What I mean is: if we add another shard, do I have to redistribute the data from the old shards again?
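How much data has to move when a shard is added depends on how keys are mapped to shards. The Python sketch below compares two common approaches under stated assumptions: plain modulo hashing and a consistent-hash ring with virtual nodes. The shard names, key names, and the `HashRing` class are hypothetical; this is not the scheme used in the video.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


def modulo_shard(key: str, n_shards: int) -> int:
    """Naive placement: shard index = hash(key) mod number of shards."""
    return stable_hash(key) % n_shards


class HashRing:
    """Consistent-hash ring; each shard owns many points (virtual nodes)."""

    def __init__(self, shards, vnodes=100):
        self._ring = sorted(
            (stable_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, key: str) -> str:
        # The first ring point clockwise from the key's hash owns the key.
        idx = bisect.bisect(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[idx][1]


keys = [f"user{i}" for i in range(10_000)]

# Going from 4 to 5 shards with modulo hashing.
moved_modulo = sum(modulo_shard(k, 4) != modulo_shard(k, 5) for k in keys)

# Going from 4 to 5 shards with a consistent-hash ring.
old_ring = HashRing(["s0", "s1", "s2", "s3"])
new_ring = HashRing(["s0", "s1", "s2", "s3", "s4"])
moved_ring = sum(old_ring.shard_for(k) != new_ring.shard_for(k) for k in keys)

print(f"modulo: {moved_modulo} keys moved, ring: {moved_ring} keys moved")
```

With modulo hashing, most keys change shards when the shard count changes (about four in five when going from 4 to 5 shards). With the ring, only the keys that now belong to the new shard move, so the existing shards each hand over a slice of their data rather than reshuffling everything.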
At the 45-second mark, you say that partitioning is sharding. That's completely wrong! Partitioning and sharding are two different concepts! The worst thing you can do with an RDBMS is sharding!
If one table is partitioned horizontally and stored across multiple machines or nodes, it's called sharding. Please read a bit more; otherwise, I am happy to be proven wrong if you can share a resource that backs up your claim. I will get to learn something new 🙃
Hello, I really liked your video and your explanation is very clear, but I don't agree that partitioning and sharding are the same. To my knowledge, partitioning is when we split a table within a database, and sharding is when we split data across different databases. So partitioning is done within the same database, where we just split the tables, while sharding involves multiple databases, with the data distributed among them.
Nice explanation of such a great and important topic in system design. Please keep uploading more videos; they are quite informative and very useful for cracking the system design round for FANG! Thanks a lot for your time and effort.
Thanks for the video, this is really awesome. Though after watching and learning about this, I am more confused by the various other terms. Please make a video to clarify those: Database Partition vs. Table Partition vs. Distributed Database vs. Replication. What I understand is that replication is a read copy within a distributed system. A distributed database has multiple databases on different servers, but in sharding we are doing the same thing, so what is the difference? And how is a database partition different from a table partition? With a table partition we do not have to worry about accessing it and no downtime is required, whereas with a database partition we would need some downtime when adding a new partition.
Very good tutorial! It simplified the concept like anything. The sharding key is crucial here. Choose sharding only if it's really needed; not every organization needs it. Choosing the wrong sharding key could bring a lot of complexity and serve wrong data to the customer.
DB shards and DB partitions are two different terms if we take distributed databases into the picture. In fact, sharding and partitioning have different meanings in a distributed system.
Before I say anything, I will admit I have no experience with sharding, but I do understand what you presented. Scenario: with 1 million users, would you say that the user info itself may be small, but related user data, like images or videos, can take up a lot of space? So it's not so much the number of users as the space that user-related data can take up.
I have a doubt. In the vertical partitioning example from 3:49, you have put each column on a different database server. Can we do that for vertical partitioning? I read that only horizontal partitioning is spread across database servers, whereas vertical partitioning is done within a server. Can we do it for vertical too?
Good overview. My comment is: sharding and partitioning are not the same, though both break up a large data set into smaller subsets. Sharding implies the data is spread across multiple databases, while partitioning is about grouping subsets of data within a single database instance.
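The "spread across multiple databases" part of that definition can be sketched in a few lines of Python. Here each in-memory SQLite connection stands in for a separate database server, which is an assumption made purely for the demo, and `shard_for`, `insert_user`, `get_user`, and `count_all_users` are hypothetical helper names.

```python
import sqlite3

# Each in-memory connection stands in for a separate database server
# (an assumption for this demo; real shards would live on different hosts).
SHARDS = [sqlite3.connect(":memory:") for _ in range(3)]
for conn in SHARDS:
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")


def shard_for(user_id: int) -> sqlite3.Connection:
    """Route by shard key: here simply user_id modulo the number of shards."""
    return SHARDS[user_id % len(SHARDS)]


def insert_user(user_id: int, name: str) -> None:
    shard_for(user_id).execute(
        "INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name)
    )


def get_user(user_id: int):
    """A lookup by shard key touches exactly one shard."""
    row = shard_for(user_id).execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None


def count_all_users() -> int:
    """A query that is not keyed by user_id has to visit every shard."""
    return sum(
        conn.execute("SELECT COUNT(*) FROM users").fetchone()[0] for conn in SHARDS
    )


for uid in range(1, 7):
    insert_user(uid, f"user{uid}")

print(get_user(4), count_all_users())
```

Every shard holds the same schema but only a slice of the rows. A single-database partitioned table would look the same to the application, except that the routing would happen inside one database instance.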
Thank you, Yogita, for making this channel. I went through your primer course videos, and they explain all the concepts in a very detailed manner. However, I am not yet confident about how to handle a system design interview, i.e., how to connect all the concepts together to form a system. Could you upload a video in an interview format, starting from a problem and showing the discussion that happens between interviewer and interviewee as the design proceeds?
So the strategy to scale in most use cases: Vertically scale -> Separate reads and writes -> Archive and keep data size under control -> Try breaking the application into smaller services -> Shard as the last resort
Querying across shards is not a disadvantage. I mean, the situation itself will not arise, because with shards (horizontal partitioning) we put the whole schema in a particular shard, not just a specific table.
Hi Yogita, thanks for such a vivid explanation. I just have one doubt. You gave an example of Tinder while talking about sharding, around 10:22. We can definitely shard on the basis of cities. I was just wondering what would happen to a particular person's data (in the DB) when they move from city X to city Y. Will that data be lost?
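One possible way to handle that case, sketched in Python under simplifying assumptions: the record is copied to the new city's shard before it is removed from the old one, so it is relocated rather than lost. The `city_shards` and `user_city` dictionaries and the `move_user` function are hypothetical, and a real system would also need to deal with failures between the two steps.

```python
# Hypothetical in-memory shards keyed by city, plus a small directory that
# remembers which city shard currently holds each user (assumptions for this sketch).
city_shards = {"X": {}, "Y": {}}
user_city = {}


def add_user(user_id, city, profile):
    city_shards[city][user_id] = {**profile, "city": city}
    user_city[user_id] = city


def move_user(user_id, new_city):
    """Relocate a user's record: write to the new shard first, then delete the old copy."""
    old_city = user_city[user_id]
    if old_city == new_city:
        return city_shards[old_city][user_id]
    record = {**city_shards[old_city][user_id], "city": new_city}
    city_shards[new_city][user_id] = record  # 1. copy to the destination shard
    user_city[user_id] = new_city            # 2. point lookups at the new shard
    del city_shards[old_city][user_id]       # 3. only then remove the old copy
    return record


add_user(7, "X", {"name": "Sam"})
move_user(7, "Y")
print(city_shards)
```

Writing to the destination before deleting from the source means that a crash midway leaves a duplicate to clean up instead of a missing record.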
Yogita, your videos are really very helpful for understanding the sharding concept. I have followed the whole system design series. Thanks a lot for making this series.