For the redundant load balancer section, you mentioned that it isn't a great solution because it introduces latency and is less flexible because of DNS caching. So I'm wondering: what would be a good solution, or what are some alternatives?
These problems are why we introduced a load balancer--failure of a load balancer should be much less likely than failure of an API node, so we can take advantage of that solution in most cases. Thanks for watching!
Yep, definitely a concern. A network partition between the two regions could cause a "split-brain" situation where the two regions end up with different states. Often we'd just have one region elected as a master to handle writes, with reads from other regions having eventual consistency. We have a cool video about this kind of stuff on interviewpen.com :)
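As a rough illustration of the single-writer idea in the reply above, here is a minimal sketch of how requests might be routed. The `RegionRouter` class and the region names are hypothetical and only show the routing rule: writes always go to the primary region, while reads are served locally and may be slightly stale.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RegionRouter:
    """Hypothetical router: one primary region takes writes, replicas serve reads."""

    primary: str
    replicas: Tuple[str, ...]

    def route(self, operation: str, client_region: str) -> str:
        # All writes go to the single primary, so two regions can never
        # accept conflicting writes during a network partition.
        if operation == "write":
            return self.primary
        # Reads stay in the client's own region when it hosts a copy of the
        # data; the result is eventually consistent with the primary.
        if client_region == self.primary or client_region in self.replicas:
            return client_region
        # Regions without a local copy fall back to the primary.
        return self.primary


router = RegionRouter(primary="us-east", replicas=("eu-west",))
print(router.route("write", "eu-west"))  # us-east
print(router.route("read", "eu-west"))   # eu-west
```

The trade-off is that writes from non-primary regions pay cross-region latency, and a real deployment would also need a way to elect a new primary if the current one fails.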
Latency (and thus eventual consistency) is also a factor even with replication inside a single region. The application needs to handle this properly as well.
Great video! It would be cool to deep dive on a multi-country/multi-region e-commerce solution. We run into several issues that only show up at larger scales, like implementing the search feature or ML-powered product ranking.
Great video! I have a few questions, though, if you don't mind: 1- How do the load balancers know the IPs of the API servers? Do the API servers ping the load balancer, are they always on the same local network, or is it something else? 2- Would the private DNS that routes from the API to the database just be a simple intermediary server hosted locally, like a local mini load balancer? Thank you for the valuable information!
1--Yes, the load balancer will health-check the API nodes by pinging or making HTTP requests to ensure liveness. 2--Essentially yes, although it's important to note that requests are not being routed through this server, it's just responsible for notifying the API about what is online. Thanks for watching!
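To make the health-check idea concrete, here is a minimal sketch of how a load balancer might decide which API nodes are live. The `/health` path, the node addresses, and the function names `http_probe` and `live_nodes` are assumptions for illustration; real load balancers expose this as configuration rather than code.

```python
import urllib.request
from typing import Callable, Iterable, List


def http_probe(url: str, timeout: float = 1.0) -> bool:
    """Return True if the node answers its health endpoint with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except Exception:
        # Timeouts, refused connections, and error statuses all count as "down".
        return False


def live_nodes(nodes: Iterable[str], probe: Callable[[str], bool] = http_probe) -> List[str]:
    """Keep only the nodes whose health probe succeeds, preserving order."""
    return [node for node in nodes if probe(node)]


# Example with a stubbed probe so the sketch runs without a network.
# The addresses below are made up.
status = {
    "http://10.0.0.1:8080/health": True,
    "http://10.0.0.2:8080/health": False,
}
print(live_nodes(status, probe=lambda url: status[url]))
```

In practice the check would run on a timer, and a node would usually need several consecutive failures before being removed, so that a single slow response does not take it out of rotation.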
How often do DB outages happen on cloud providers? And would another DB instance be running at the same time? I also didn't see any option for multiple DB instances, only DB replicas for read operations. As for a Payment API failure, the only options are retry logic or an error message asking the user to try again later. Maybe there are also public websites where people can see the live failure status of the system.
Depending on what services you use for your cloud infrastructure, some of this will be managed for you. However, it’s always important to understand fault tolerance and ensure the service you’re using meets your needs.
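For the retry logic raised in the question above, a minimal sketch with exponential backoff might look like the following. The function name `call_with_retry`, the attempt count, and the delays are illustrative assumptions, not something described in the video.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on any exception with exponentially growing delays."""
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:
            last_error = exc
            # Wait 0.5s, 1s, 2s, ... between attempts, but not after the last one.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    # All attempts failed: surface a "try again later" error to the caller.
    raise RuntimeError("service unavailable, please try again later") from last_error
```

For payments in particular, retries are only safe if the request is idempotent (for example, by sending the same idempotency key on every attempt), otherwise a retry after a timeout could charge the customer twice.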
Could another load balancer be on standby for when a load balancer goes down, and take over its external IP address? Or is this scenario assuming something happened where that’s not possible, like the data center going down?