The best solution is not to use K hash functions, but to generate K replica ids for each server id. Designing K hash functions while maintaining random uniformity and consistency is hard. Generating K replica ids is easy: xxx gives K replicas xxx + '1', xxx + '2', ..., xxx + 'K'. You then take these replicas and generate K points on the ring with the same hash function, and this is what is actually used in practice. The Chord algorithm is just one example of this technique of adding K replicas for each server id.
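A minimal sketch of the replica-id trick in Python (the ring size `M`, replica count `K`, and use of MD5 are assumptions for illustration, not from the comment above):

```python
import hashlib

M = 2 ** 32  # ring size (assumed for illustration)
K = 3        # replicas per server (assumed)

def h(key: str) -> int:
    # One hash function, reused for every replica id.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % M

def ring_points(server_id: str, k: int = K) -> list:
    # K replica ids "xxx1", "xxx2", ..., "xxxK", each hashed with the
    # same function, yield K independent-looking points on the ring.
    return [h(server_id + str(i)) for i in range(1, k + 1)]
```

Because a good hash function scatters even near-identical inputs, the K points for one server end up spread around the ring without needing K separately designed hash functions.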
The example that you took mentions xxx+1, +2, +3, ..., +k. Correct me if I am wrong, but if you assign k consecutive numbers to the same server, wouldn't the load distribute non-uniformly (on adding or removing a server)? Could that be one reason to look for different hash functions?
@@dudejaa Just a thought: he probably doesn't mean +1, +2, ...; instead, if xxx is the id, M is the ring capacity and k is the number of servers, then the second position (after hash(xxx)) will be hash(xxx) + (M/k) or hash(xxx + M/k). And probably the third position will be hash(xxx) + 2*(M/k), and so on up to k positions.
@Abhishek Dudeja xxx, xxx+1, ... are ids for one server to hash and then reach the respective points on the ring, not the points on the ring themselves. The hashes generated from xxx and xxx+1 would be completely different and random, and hence would plot k uniformly random points. @CHARCHIT PATODI I don't think that's the case, because if you add multiple servers, each with k points placed as hash(xxx) + i*(M/k) up to k, you're not really randomizing: there would be no difference between adding 1 point or k points per server when it comes to choosing a server for a request. It would be as if you multiplied the ring length by k after choosing one point per server, which doesn't get us what we want.
@@gkcs I believe it went well. I have watched most of your system design videos; they were quite helpful. I am on the junior side (3 YOE), so I think they went easy on me in Sys Design. Also, I was able to complete all coding questions in time. Google is always a long shot though. 🤞🤞
I used to watch your "Competitive Programming" videos before getting into a company, and now, after learning things there, I am watching your "System Design" videos. It feels good to grow with this channel. Thank you so much 😊
Couldn't understand a thing within the first 4 minutes. I know it's a ring of virtual addresses, but suddenly some random things started appearing in the video and I lost interest. Like, when did the blue area become data, and why was the data after s4 served magically by s1 and not by s0? Sorry, I'm sure you are an expert and trying to help people out there, but I guess it could be broken down into clearer diagrams and reasoning.
Notes to self:
* The previous video gives the impression that there is a mapping from ranges of integers to server ids, and that consistent hashing is about mapping request ids to integers in those ranges, resulting in more consistent routing of requests to the same servers. -> I did realize that this would not work very well over time, as you would end up completely changing the ranges for higher-index servers with the addition of multiple servers.
* In this video, request ids map to an index in a ring with `M` indices. The "trick", then, is to map the server ids to indices in the ring using the same hash function that also hashes request ids. Now, to assign a server to a request, one simply looks clockwise for the nearest server.
* To make it less likely that load will be unbalanced due to (what I would call) unlucky hashing, another idea is used: simply have multiple hash functions for the servers, so as to map them to multiple locations in the index ring! (clever)
* @Andrei Marculescu points out that, rather than using multiple hash functions for server ids, it is easier to maintain multiple aliases for each server id ("...xxx gives K replicas xxx + '1', xxx + '2', ..., xxx + 'K'.") and thus map servers to multiple locations in the index ring.
@@codingfork6708 Correct. I think there are good heuristics for choosing M (and probably everyone uses the same standard values). Your hash function has to apply modulus M, otherwise you get an index out of range.
You start out with "SO THE PROBLEM IS NOT ACTUALLY LOAD BALANCING!" "ADDING AND REMOVING SERVERS THAT WE SAW?" Wait, when did we do this? Over a cup of coffee? Feedback: I noticed this with most of your videos; the video is not congruent with the title at all. Most of your videos have some pretext, as if they are all part of a sequence. If someone were to just look up what consistent hashing is, this video wouldn't make any sense.
Yes, the load balancing and consistent hashing videos were shot on the same day. There is some context missing when this video is seen individually. I think a card link at the start of video will help. Good use of sarcasm btw. "Over a cup of coffee?" made me laugh :P
With multiple hashes being applied, can there be cases of collisions, i.e. multiple servers ending up in the same bucket? If not, why not? If yes, how is it handled?
Gaurav, the addition/deletion of servers using K hash functions with a fixed ring size is a hard problem to solve while ensuring correctness. It can be simplified by generating multiple ids for the same server.
Gaurav, this video is wonderful. I have a small doubt. Let's assume that request R1 is served by server S1. Now we add a new server S2, and because of this, request R1 now goes to S2. How is this scenario handled? Is it that when a new server S2 is added, we have to move some portion of the data from the existing server (S1) to the new server S2 based on its position on the ring? If so, how can we do the redistribution in real time?
Great video!! I thought that we could add a load factor or load limit, like one server can have at most x requests. Once the load limit is reached, incoming requests would go to the next clockwise server. That way, no server would have too much load. But of course the virtual servers concept is good. Can you please add the code in the description? Thanks. :)
Sounds interesting. There are variations on consistent hashing which allow this. Code link: github.com/coding-parrot/SystemDesignCourse/blob/master/service-orchestrator/src/main/java/algorithms/ConsistentHashing.java 😁
Great video! I was wondering, though: with this architecture, don't you have to ensure that the hash functions never collide? What would happen if an incoming request mapped to two servers that fell on the same point?
Hi Gaurav, great series of videos. Thank you for sharing your experiences. I have one question on consistent hashing: which component of the distributed system is responsible for implementing this technique? 1) Is it the load balancer's job, because it is a load distribution technique? 2) Or is it the application's responsibility? Curious to hear your thoughts. Cheers!
'n' being the number of servers and 'm' the number of possible hash values, would spacing the servers out at intervals of m/n be a working solution? For example, with m as 256 and n as 4, the first server could be at 64, the second at 128, the third at 192 and the fourth at 256. I understood the possibility of skewed allocations and the need for replicating ids, though. Hooked to your amazing content! Kudos.
Hi Gaurav, won't removing/adding servers to the cluster affect the hash function's modulo (%)? Example: initially we have 4 servers, so hash(request id) % 4 -> s2. If we remove 1 server, hash(the same request id) % 3 -> s1. In this way, server 2 still has stale cache data, right?
What would happen if there is a collision when you calculate the virtual servers? I mean, if h1(S0) = h2(S1) = 1, then there are 2 servers at the same position, right?
If we add a new server to this consistent hashing ring, doesn't the caching problem remain? The requests which were going to s3 before adding the new server are now handled by s4, so s3's cache for those requests will be useless? Please explain.
Gaurav, I have a question: if the hash function h(x) maps values to the range of (0...M-1), why do you need h(Server Number)%M? %M is redundant here, isn't it?
Hi Gaurav, thanks for sharing such a nice concept. I have one doubt: what happens if one server, say S1, dies for 2 hours and then comes back? How are requests handled in this case?
Thanks for the video, but I think you can improve overall. Specifically, you didn't talk much about why it is called "consistent" and why that matters when we lose/add servers.
What if two different hashing algorithms for two different servers produce the same result? For example, for S1, H1 gives 19 and for S4, H2 also gives 19. Now both of them will be placed at the same location on the ring. What is the solution to this?
I have one question. Let's say there are two hash functions, so each server is going to have 2 positions. But consider a scenario where the hash of one server mod M and the second hash of another virtual server mod M come out the same. Let the hash values be as follows: H1(2) = 40, H2(5) = 20. As we can see, the hash values are fairly different. But H1(2) % 5 = 40 % 5 = 0 and H2(5) % 5 = 20 % 5 = 0. In this case, both hashes are different, but virtual server 5 points to the same location as the actual server 2. Isn't that a conflict?
I am thinking of a different method: we have a single hash function h(r), and if there are k servers, we compute all the values h(r)%1, h(r)%2, ..., h(r)%k and store them in an array. Now loop through the array in reverse, and the first time we get array[i] = i-1 (1-indexed), we send the request to the ith server and break out of the loop. Let's prove this by induction. It's trivial for 1 server. Say we have n servers and the load was equally distributed among them. Introduce a new server n+1. The new value added to the array for any given request is h(r) % (n+1). In only 1/(n+1) of evenly distributed cases will its output be n, so only then will the request be directed to the (n+1)th server. Since we assumed the previous load was evenly distributed, the load for the new server is also taken evenly from all previously existing servers.
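If I'm reading the scheme right, it can be sketched like this (the function name is mine, hypothetical); note that i = 1 always matches, since x % 1 == 0, so the loop always terminates:

```python
def assign_server(request_hash: int, k: int) -> int:
    # Walk from the newest server (k) down to the oldest (1); the first
    # 1-based index i where request_hash % i == i - 1 claims the request.
    for i in range(k, 0, -1):
        if request_hash % i == i - 1:
            return i
    raise AssertionError("unreachable: i == 1 always matches")
```

A nice property, matching the induction argument: going from k to k+1 servers, a request moves only if request_hash % (k+1) == k, i.e. with probability 1/(k+1); otherwise the reverse loop proceeds exactly as before and the assignment is unchanged.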
I didn't get the point: if the hash space is M, i.e. h is in [0, M), then h % M == h, so why take a remainder at all? And how can h(0) be 39 if M is 30? Second: random distribution only averages out evenly; there is a big difference between uniformity and evenness. When you have a number of servers that is small relative to M, you will in most cases inevitably get uneven segmentation. Load on AVERAGE will be 1/N; in practice you will see significant differences, and ANY GOOD(!!!) hash function will have this flaw.
Suppose there are three payment microservices, each with its own DB. Now if user1 comes with a request, his ID is unique and he will be assigned to one particular microservice every time, until a new microservice instance is introduced or an existing one is removed (according to consistent hashing). Now my question: say user1 makes a payment and his payment status is stored in the DB of that particular payment microservice. If, by chance, that DB or service fails after the transaction is done, how is this handled?
Hi Gaurav, thank you for such an amazing video. I was thinking of how we could balance the load in a better way, and I came up with dictionaries. If we use a dictionary along with a pointer holding the next server id that will handle the request, we might be able to balance the load in a more optimized way, and the consistent-hashing problems of skewness and uneven load in case of server failures might be handled. I might be wrong here; I would love to hear your thoughts on my solution.
Big fan of your videos, but I think consistent hashing doesn't use the simple modulo operation over the number of servers. It just uses the hash function to map both the server and the key object into the same hash space, unlike simple hashing. Correct me if I'm wrong.
What will be the M for the other hash functions? It can't be the same, otherwise we'll get the same value... so either M must change or the input value must change. Which one should we change?
I am not a computer science student, so the remainder calculation is not intuitive to me. Why do we use the remainder here? Sorry for the stupid question, but I need to know.
Adding more virtual servers to route those requests is fine, but in the end, aren't they virtual? Doesn't that mean they don't exist in real life? Even if we route requests to those virtual server points, what will the requests do once they get there, if there is no actual server to process them?
Does this system design playlist have just 36 videos in total? I have watched about 5 of them and wanted to confirm: if I do all 36 videos and understand the concepts well, will I be prepared enough for a FAANG interview?
They are great to start with. For strong interview prep (product companies) or upskilling, I have a course at InterviewReady. interviewready.io/course-page/system-design-course
Hi Gaurav, how can consistent hashing solve the hot-user server load problem? Let's say we are sharding by user id; even after consistent hashing, the hot user's server gets the same load.
Thanks a lot Gaurav, this was very clear! I was wondering what would happen if there were a clash between different (or the same) hash functions, i.e. h(x) = h1(y). Which server would the load get assigned to?
Question - there are K hash functions to map servers onto the ring, right? Then how do incoming requests get uniformly assigned to the K virtual servers? Are there K hash functions to hash request ids as well?
I think I was ambiguous here. No, the request falls on the ring and is picked up by the nearest clockwise server on the ring. This point is among the server points(N*K in total). Each server has K points on the ring. So the request is mapped to just one server.
Hash the request id and the server id using the same or different hash functions. Create a ring with M partitions (0 to M-1), called the search space. Mark the output of hash(request id) on the ring. Then compute h(server id) % M and mark it on the ring as well. We go clockwise and find the nearest server, which will serve the request.
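Those steps can be sketched as follows (the ring size `M` and the MD5 choice are assumptions for illustration, not from the video):

```python
import bisect
import hashlib

M = 2 ** 32  # size of the search space (assumed)

def h(key: str) -> int:
    # Same hash function for both request ids and server ids.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % M

class HashRing:
    def __init__(self, server_ids):
        # Mark each server's point on the ring, sorted for binary search.
        self.points = sorted((h(s), s) for s in server_ids)

    def lookup(self, request_id: str) -> str:
        # Clockwise walk: first server point at or after the request's
        # point, wrapping around past M - 1 back to the smallest point.
        i = bisect.bisect_left(self.points, (h(request_id),))
        return self.points[i % len(self.points)][1]
```

The key consistency property: adding a server only reroutes the requests that now land on it; every other request keeps its old assignment.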
What an outstanding video! There is no shortage of tutorials on how to code or write algorithms out there, but not enough on system design... This is truly outstanding. I've been writing software for 10 years and only rarely do I touch these concepts; heck, I work within them daily, yet I either forgot or never knew them. Thanks so much!!
So basically you divide the hash space into k roughly equal ranges, k being the number of servers. My question: hashing sometimes suffers from clustering, so why take a chance? Why not just divide the entire hash space explicitly into k equal ranges and assign each range to a server?
Hey Gaurav, I may be asking too much, but could you please remake this video in a better way? The magic is missing here; the previous videos were like watching a movie. By "movie" I mean easy to understand, with nothing forcing my brain to strain to understand something. Those were so easy. But this video is :(
Thanks, Gaurav. Nice work. I have a small doubt. You said we handle skewed requests by having virtual servers (multiple hash functions for servers), but how do we handle collisions? I mean, if servers S1 and S2 get the same output (say O1) from the hash function, both would then be serving the user's request.
Hi Gaurav, but isn't the problem we were facing with traditional hashing still there (correct me if I am wrong)? Suppose request id 1 maps to slot 4 on the ring and the nearest slot taken by any server is slot 8, held by server1, so all requests with id 1 are handled by server1, which has built up its cache accordingly. Now consider adding a new server (server4) that gets slot 7 after hashing. All requests with id 1 will now map to this newly added server, which I think is the same problem we faced with traditional hashing. Thanks.
@@gkcs Maybe I'm missing something basic, but even if the number of affected requests is reduced, after adding that new server all requests for id1 will go to it, and this was the problem we were facing with traditional hashing.
Because you want the requests to stick to a server based on some parameters. h(userId) -> serverId. You can now cache the user details on this particular server, since every time a request comes from this user, the same server will be hit.
Hello Gaurav, when you said "adding virtual servers", did you mean adding differently generated hashes of the available servers so that their positions on the hash ring are uniformly distributed, giving us a less skewed distribution of requests? If yes, doesn't that imply that if 1 physical server goes down, its multiple hashed positions will also leave the ring, giving more skewed results?
Yes, it will affect the load, but consider this: if a server has, let's say, 4 points uniformly distributed across the hash ring, then when it crashes those 4 points are removed, and because they are uniformly distributed, its roughly 25% share of the load is spread across the other servers rather than dumped on a single neighbour.
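A quick simulation of this point (the ring size, point count, and MD5 choice are my assumptions): with several points per server, a crashed server's requests are re-homed to multiple survivors rather than all landing on one clockwise neighbour.

```python
import bisect
import hashlib

M = 2 ** 32   # ring size (assumed)
K = 100       # points per server (assumed; more points = smoother spread)

def h(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % M

def build_ring(server_ids):
    # K points per server, made from replica ids like "s0#17".
    return sorted((h(f"{s}#{i}"), s) for s in server_ids for i in range(K))

def lookup(ring, request_id):
    # Clockwise walk to the nearest server point, wrapping around.
    i = bisect.bisect_left(ring, (h(request_id),))
    return ring[i % len(ring)][1]

servers = ["s0", "s1", "s2", "s3"]
full = build_ring(servers)
degraded = build_ring(["s1", "s2", "s3"])  # s0 has crashed

# Count where s0's former requests land after the crash.
rehomed = {}
for r in range(10_000):
    rid = f"req-{r}"
    if lookup(full, rid) == "s0":
        survivor = lookup(degraded, rid)
        rehomed[survivor] = rehomed.get(survivor, 0) + 1
```

In this sketch, the re-homed load should spread over several survivors, so each absorbs only a fraction of the crashed server's share.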
Regarding the second hash function: do you mean that when a request gets mapped to the failed node by the first hash function, we hash it again using the second?
Great video! Question: where do the requests sit in practice? Is there a node acting as a scheduler, dispatching request by request, or are the requests mapped immediately to a server and kept internally in memory? Or both, so that the requests can be rescheduled if the server goes down? (I suppose this would require the scheduler to periodically ping each server, or set a timeout.) What happens if the scheduler goes down? Second question: would it be possible to use work-stealing instead to reduce imbalance? Whenever a server is out of work, it would steal a request from the back of another random server's queue. Or could this skew the execution order of the requests too much?
Thanks! The load balancer is a service which needs to tell the other services where a request is to be routed. It can either be queried per request (which is very expensive), or a snapshot of the current assignments can be cached by all services. If the snapshot changes at the load balancer, it can notify all interested clients. The service is distributed and backed by a 'reliable' database, so a single failure won't take the system down. Second answer: It sounds complicated and I have never seen it implemented on a large scale system.
Hi Gaurav, first of all, thanks for such an informative session. My question: the problem we discussed in the load balancing video was that our cache effectively gets cleared when a new server is added. How does consistent hashing solve that?
Hey Gaurav, thanks for this great video. I have one question: can we achieve the same results using a stick-table (which keeps a user/IP to server mapping) in the load balancer, with some nondeterministic load balancing algorithm like round robin or least connections? If not, can you explain why?
The main objective here is to reduce "rebalancing": the total number of cache loads and evictions. This is useful for load balancing across a cache cluster. The round robin and least-connections algorithms are useful in different scenarios.
I am a 3rd sem student and I guess I should not be bothering about these things but your explanations are sooooo gooood that I always wanna watch them :D
My concern is: what if two hash functions return the same value? In that case, for a single request r1 the point on the ring would be, say, 5, but the nearest point, 6, would have more than one server that could serve the request.
That's a very good question, Anmol. However, the chances of that happening are really small, since you have N servers and a key space which is much larger. N would typically be at most 1000, and the key space can be as large as 10^18. Can it happen, though? Yes, absolutely. However, I don't think anyone has seen it happen in practice. Also, we can keep changing the server's ID until the conflict is resolved.
Why can't the servers push "ready!" states to the load balancer to request tasks, so that tasks are only sent to servers that have asked for them? In addition, if the servers or the balancer can estimate time to completion, they can tell the algorithm when each server is predicted to be ready for the next task/request. Wouldn't this create a predictive balancer that adjusts to its environment?
This works for distributing "tasks" among the servers, but here we are distributing data among the servers. If servers are allowed to request data, then when the data actually needs to be read, how can you deterministically pick the server that has it?
What if we have a server or service which acts like the dealer in a poker game and routes requests to the next available server, one after another? That way, every server has equal load. If any server goes down, just mark that node as dead until the next push message or hello from it. By the way, I am not a system design engineer; I just watched the poker scene from the Casino Royale movie and wondered about its implementation in this context. :-D
Thanks for making this video; it's very useful. You mentioned creating virtual servers by using K different hash functions. What are the chances of hash collisions across them within the M slots, i.e. many servers getting assigned to the same slot despite the hash algorithms being different?
@@gkcs H_k(S_i) % M can take M possible values. If our hash function is good, we can ensure each server gets a different value, but how can we guarantee that collisions across the hash functions don't happen? In the extreme (dense) case of large K and a large number of servers, collisions happen with high probability. You said that to ensure uniformity we need to increase density, which IMO causes more collisions.
After doing all this, suppose there is a case where some server has a long request queue while other servers have no load. Can't requests from the former be assigned to the next available server node?
Thanks a ton for explaining the concept clearly, Gaurav. I attempted to write code for this logic and have put an initial draft at github.com/saravan-prathi/Algo_Practise/blob/master/Algorithms/src/ConsistentHashing.java Could you please take a look and comment? I initially tried to use a circular linked list as the data structure for the ring, but I realized it wouldn't be a good idea since this functionality is search-intensive. So I used just an array, and the code got simple (in fact, too simple for a concept of this complexity). Hence I am dubious whether I am on the right track.