Your videos are very educational and entertaining, and the visualisations are really good. Would you mind creating a Discord server where people who watch your content can get together and exchange ideas? I think that would benefit all of the viewers.
Great video!! Loved this level of detail along with the animations. This is a differentiating factor from many other videos that don't go into detail and only cover such topics at a very high level. You could link to explanations of some of the concepts mentioned, but continue keeping this level of detail, as that is what makes it great in the first place!
Subscribed!! Hey listen buddy Idk if you'll read this comment but god! I hope you will!! Make videos about optimization, optimization stories of companies and stuff....revolve around it!! And I bet you in no time you'll have a community of amazing developers!!
Hey, thank you so much for letting me know! I definitely will continue making this type of video. I think these videos provide value, even for myself, and I just want to share that with you guys as well. Definitely stay tuned for more to come!
8:30 You should have mentioned that all the downstream layers need to be rebuilt as well, since changing one layer affects every layer below it.
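Assuming the comment refers to build-layer caching (Docker-style), the cascade can be sketched like this: each layer's cache key depends on the previous layer's key, so editing one layer changes every key after it. The helper below is purely illustrative, not how any real build tool computes its keys.

```python
import hashlib

def layer_cache_keys(layers):
    """Compute a cache key per layer.

    Each key hashes the layer's own content *plus* the previous
    layer's key, so changing one layer invalidates every layer
    that comes after it, and they all must be rebuilt.
    """
    keys = []
    prev = ""
    for content in layers:
        prev = hashlib.sha256((prev + content).encode()).hexdigest()
        keys.append(prev)
    return keys

before = layer_cache_keys(["FROM base", "COPY deps", "COPY src"])
after = layer_cache_keys(["FROM base", "COPY deps v2", "COPY src"])
# The first layer's key survives the edit, but the edited layer and
# everything downstream of it get new keys.
```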
This is over-engineered and needlessly complex! A much better approach would have been to put a sort of gateway in front of the entire app, which redirects logins to the correct instance. Redirecting does not mean the entire traffic is routed through the gateway; rather, the client gets metadata from the gateway about which instance it needs to talk to, and from then on the client speaks to its instance directly.

New registrations are only allowed on the latest instance, and each instance has a max number of users. Much easier to scale: all you've got to do is spin up a new instance (app + db) when the latest instance reaches 90% capacity and declare it the latest in the gateway. This can be automated. The gateway needs to be load balanced. That's it.

Something that will become necessary at some point is how to deal with a large dataset per user. The strategy is to delete data older than X years (legally usually 7 years); you will still have that data in your backups anyway.

Stupid people like complexity. Because they are stupid and have the need to prove they are not, they cause mayhem.
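The gateway scheme this commenter describes can be sketched in a few lines. Everything here is hypothetical (the URLs, the capacity numbers, the in-memory user map); a real version would persist the routing table and return the metadata over HTTP.

```python
CAPACITY = 1000          # assumed max users per instance
SCALE_THRESHOLD = 0.9    # spin up a new instance at 90% capacity

class Gateway:
    """Toy routing gateway: it only hands back which instance a
    client should talk to; actual traffic never flows through it."""

    def __init__(self):
        self.instances = [{"url": "https://app-0.example.com", "users": set()}]

    def register(self, user_id):
        # New registrations only land on the latest instance.
        latest = self.instances[-1]
        if len(latest["users"]) >= CAPACITY * SCALE_THRESHOLD:
            # Declare a fresh instance (app + db) as the latest.
            latest = {"url": f"https://app-{len(self.instances)}.example.com",
                      "users": set()}
            self.instances.append(latest)
        latest["users"].add(user_id)
        return latest["url"]

    def lookup(self, user_id):
        # Login redirect: tell the client which instance to talk to.
        for inst in self.instances:
            if user_id in inst["users"]:
                return inst["url"]
        return None

gateway = Gateway()
for i in range(900):
    gateway.register(f"user-{i}")
# The 901st registration crosses the 90% threshold, so it lands
# on a newly declared latest instance.
next_url = gateway.register("user-900")
```

Whether this is actually simpler than sharding is debatable: it trades cross-shard flexibility for strict user-to-instance pinning, which breaks down once users share data across instances.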
Awesome in-depth video. As stated in another feedback comment, it might be a bit overwhelming for beginners or people without an expert level of tech understanding (who are the majority of the target audience on RU-vid). You could maybe incorporate short explanations of the concepts mentioned (shard, pgbouncer, etc.). People who are interested in a concept can always go to a more detailed in-depth video (you could also route them to your own related videos if available). More power to you and good luck! Subscribed
I disagree; it's nice to see a channel just tell an animated story like an engineering blog, without watering everything down into a tutorial like every other channel.
Just continually sharding their DB across more and more machines seems like a linear solution to their exponential user growth. Isn't there something they can change in their architecture to avoid needing 96 separate DB instances? That is sort of ridiculous.
My thought too. I suspect they could make the application much smarter by putting in-progress work into a non-SQL database to avoid frequent writes to Postgres. Also, one row for each text block seems over-normalized. End armchair analysis.
Their team is big (reportedly around 500 total employees), with probably around 200 working on different parts of the app. Most of them probably fall into a "this is not my job" or "I don't have enough power to say anything" type of situation, and they keep patching.
@@KenSnyder1 It seems like that would just shift the problem to another system. OK, your Postgres isn't getting hammered with writes, but now your Redis, MongoDB, etc. is, and it's still going to push all that data to Postgres anyway. You also have to pull committed data from Postgres and reconcile it with the uncommitted data in your intermediate store in order to get consistency for the user. Users also tend to notice read delays more than write delays, unless the write delay is substantial or fails catastrophically. Besides, this video is narrowly focused on how they fixed specifically a database problem; we don't know if they already had other performance solutions in place, such as caching unchanged blocks or whole documents to avoid database reads.
If only my teammates could understand. I'm trying to explain the same thing to them and it's like talking to a wall. I even had one person (btw, he has 7 years of experience) say "unit tests are more important than integration tests". Dude, if your graphics card, CPU, and RAM work separately, it doesn't mean they will work properly joined together. Then another person keeps creating mocks in integration tests, or keeps verifying that serviceA was called x times in a unit test. I've tried to explain to them multiple times that changing the implementation shouldn't break a unit test; otherwise we can't fix/improve the code and then verify that it still behaves the same. The answer was always "yeah, then in this case you should fix the code", and the entire team is like "yeah, I agree". Man, I'm already sending out my resume and can't wait to change jobs... I'm just tired of that shit... I feel like I'm working with interns, but these people have been almost 10 years in the business...
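The "verifying that serviceA was called x times" complaint is easy to illustrate. Below is a hypothetical function and two unit tests for it: one pinned to the implementation via mock call counts, one checking only observable behavior. The names (`total_price`, `FlatTax`) are made up for the example.

```python
import unittest
from unittest import mock

def total_price(items, tax_service):
    """Hypothetical code under test: sum item prices, apply tax."""
    subtotal = sum(item["price"] for item in items)
    return tax_service.apply(subtotal)

class BrittleTest(unittest.TestCase):
    def test_calls_tax_service_once(self):
        # Pins the implementation: refactoring total_price to batch
        # or cache tax lookups breaks this test even when the
        # returned total is identical.
        tax = mock.Mock()
        tax.apply.return_value = 110
        total_price([{"price": 100}], tax)
        tax.apply.assert_called_once_with(100)

class BehaviorTest(unittest.TestCase):
    def test_returns_taxed_total(self):
        # Checks only the observable result, so the implementation
        # can change freely as long as the behavior stays the same.
        class FlatTax:
            def apply(self, amount):
                return amount * 1.1
        self.assertAlmostEqual(total_price([{"price": 100}], FlatTax()), 110.0)
```

The first test fails under harmless refactors; the second survives any implementation that preserves the contract, which is exactly the property the commenter is arguing for.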
Incredible video. I can't help but feel like, at some point, humanity will stumble upon the right paradigm of data distribution. There's no way this is the ideal method of scaling, right? It seems to me like much of the effort companies put into scaling is simply undoing flawed assumptions made by the underlying database.
I mean, a lot of the issues described here are limitations of Postgres specifically. For example, the whole pgbouncer nonsense is not needed in something like MySQL; it only exists because the Postgres team took some idiotic philosophical stance and, instead of including its functionality in Postgres itself, left it to live on its own. Read scaling is also straightforward with something like Percona XtraDB or any other Galera-type cluster in MySQL land: want more performant reads? No problem, just join a new node into the cluster. Heck, write scaling there is also pretty trivial; you just set up as many of these clusters as you want and migrate the clients' data to the new cluster. It's just Postgres that is the problem here.
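For readers unfamiliar with why PgBouncer exists at all: each Postgres connection is backed by a server process, so a pooler funnels many client connections through a small, fixed set of server connections. A toy model of the transaction-pooling idea (not real PgBouncer code, and the connection objects are just placeholder strings):

```python
from collections import deque

class TinyPooler:
    """Toy model of transaction pooling: many clients share a small
    fixed set of server connections, borrowing one per transaction."""

    def __init__(self, server_conns):
        self.idle = deque(server_conns)

    def run_transaction(self, fn):
        if not self.idle:
            raise RuntimeError("all server connections busy; client must wait")
        conn = self.idle.popleft()   # borrow a server connection
        try:
            return fn(conn)          # run one transaction on it
        finally:
            self.idle.append(conn)   # hand it back for the next client

pool = TinyPooler(server_conns=["conn-0", "conn-1"])
# A hundred client "transactions" run, but the database only ever
# sees the same two connections being reused.
results = [pool.run_transaction(lambda conn: conn) for _ in range(100)]
```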