The Basics of Database Sharding and Partitioning in System Design

Exponent

70K views

Published: 27 Sep 2024

Comments: 49

@tryexponent 8 months ago

Make sure you're interview-ready with Exponent's system design interview prep course: bit.ly/3YTjsjH

@harshita9936 6 days ago

This video was great. Short. Crisp. To the point.

@chellamgmoorthy 1 year ago

I didn't know what database sharding was. This video gave me a good number of topics to research and learn. Thanks for the video!

@jay_wright_thats_right 9 months ago

Animations to visualize what she is saying would make this video perfect!

@oefzdegoeggl 10 months ago

A few things to add. I prefer partitioning on a key that is guaranteed not to distribute badly, so "first letter of name" is a bad idea. Better to use the record ID and group, say, 100k of them into a partition. Then, before storing partitions on different servers, there are a few more things to do first. One is to split modifying queries from read-only queries (which has to be done at the application level) so that a simple read-replica server (which is trivial to set up in Postgres) can be used. Next is a DB split at the logical level: for example, keep the user's core data on db1 and chat messages on db2. Leaving out foreign keys and using weak references instead, with a periodic cleanup job that resolves broken links, is a good idea; it also eliminates issues when restoring a backup that was cut at a bad moment.
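As a rough illustration of the two ideas in the comment above (fixed-size ID blocks and an application-level read/write split), here is a minimal Python sketch. The names `PARTITION_SIZE`, `partition_for`, and `route` are hypothetical, and the keyword check is deliberately naive: it would misroute statements such as `SELECT ... FOR UPDATE` or a CTE that writes.

```python
PARTITION_SIZE = 100_000  # records per partition; the figure suggested in the comment


def partition_for(record_id: int, partition_size: int = PARTITION_SIZE) -> int:
    """Map a numeric record ID to a partition index using fixed-size ID blocks."""
    if record_id < 0:
        raise ValueError("record_id must be non-negative")
    return record_id // partition_size


def route(sql: str) -> str:
    """Choose a connection role: plain SELECTs go to a replica, everything else to the primary."""
    stripped = sql.strip()
    first_word = stripped.split(None, 1)[0].upper() if stripped else ""
    return "replica" if first_word == "SELECT" else "primary"
```

With these assumptions, IDs 0 through 99,999 fall in partition 0 and ID 100,000 starts partition 1, so partitions fill evenly as long as IDs are assigned sequentially.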

@goofballbiscuits3647 6 months ago

Coming from a decade+ of data work with health records, I have to bump this comment. Name, location and birthdate combined still aren't unique. Messing up data with potential tromps like this is straight up lethal in some fields. Remember, friends: bad data is worse than no data.

@Deepz007 11 months ago

Great video on sharding, but partitioning wasn't mentioned or discussed.

@primalplasma 11 months ago

This was exactly the information that I needed. Thank you!

@AbhishekKumar-b1j1x 7 months ago

Some people are very beautiful with a helping hand, thank you ❤

@octavian0704 1 year ago

very well described, thanks for sharing.

@ayushgogna9732 7 months ago

You guys are amazing. I recently found your channel, I am learning a lot, and I am loving it.

@pieter5466 10 months ago

Good video but confusing use of the term 'partition', which is different than 'shard'.

@devkiosk 1 year ago

Awesome explanation.

@tryexponent 1 year ago

Thanks!

@altruistization 1 month ago

You did not mention eventual consistency as a drawback of sharding?

@SankalpCollege-f2o 7 months ago

Great video!

@samislam2746 4 days ago

you're strong

@vladyslavsosnov8412 1 year ago

Awesome, thanks

@caitlinmclaren2695 1 year ago

Monolithic Databases??

@junyulu4648 1 month ago

That's it for YouTube today.

@goofballbiscuits3647 6 months ago

Sorry, everyone... I parted *_and_* sharded 😢

@marcello4258 1 year ago

It sounds like you mixed up partitioning with sharding. And commodity hardware does not have ECC - don’t run a db on it.

@mick7827 1 year ago

Each partition is stored within the same database server, so it's easier, because sharding requires multiple database servers?

@deleater 2 months ago

"commodity hardware does not have ECC - don’t run a db on it"

SQLite is a file-based database. It doesn't have to reside in the non-paged part of RAM. High-energy cosmic radiation can corrupt only volatile memory cells, not storage. Also, modern commodity hardware has some level of ECC for CPU cache memory: single-bit ECC support for L2 cache and multi-bit ECC for L1 cache (at least my 10-year-old Intel i7 has).

A whole query operation will probably fit into the CPU cache unless the data size for the columns exceeds the L2 cache size. Good luck exceeding that: say the L2 cache is 256 KB, and even if only half of it is available for our query operation at this moment, it would take more than 100 columns each containing >1000 bytes to cross that cache boundary. Domains with queries that large are not a thing on commodity hardware anyway. Hospital billing, hotel management, restaurant billing? Nah.

Take a worst-case memory access time of, say, 100 nanoseconds to fetch the data from RAM into L2 cache. Radiation would have to corrupt those exact memory bits inside RAM within that 100 nanoseconds during the fetch cycle. Then it will take another 100 or so nanoseconds to write the data back to the disk (a worst-case disk access time of 50 ms (0.05 s) is assumed). It's extremely unlikely, almost next to impossible, for radiation to randomly flip those specific memory cells, out of billions of memory cells in RAM, that belong to the SQLite update/delete query function, which will complete its execution and save the data to disk within about 10 milliseconds at most (including all network overhead of system calls).

SQLite for desktop is your friend. However, if you intend to use a client-server database like MySQL, then your statement is indeed valid.

@AvinashRaj 1 year ago

Well thanks for reading the script.

@vivekkaushik9508 7 months ago

😂😂😂

@codermccoderson 6 months ago

A lot of these YT educators write down the material before speaking to the camera. What’s your point?

@daphenomenalz4100 3 months ago

Every single youtuber has to be prepared bruh, they can't just speak everything from mind and stutter when thinking :| It's not a reaction video

@sriranjitharaghuraman1646 8 months ago

Some visualization would have gone a long way

@tryexponent 8 months ago

Thanks for the feedback!

@lakshminarayanacharan837 4 months ago

You are looking so cute 🥰

@junyulu4648 1 month ago

her name pls

@MJ-cf9nl 1 month ago

It is: NoneOfYourBusiness

@sk-vs9nt 3 months ago

am in love with this lady what her id

@satvikkhare1844 1 year ago

Reading from a teleprompter is not teaching!! Sure, it gave me topics that I can look up myself.

@codermccoderson 6 months ago

A lot of youtube educators have their material scripted before speaking to the camera? What’s your point?

@MuhammadAsif-nx7om 1 year ago

Great and to the point explanation, No bluff Thanks

@tryexponent 1 year ago

Glad you liked it!

@rmuneeb1 5 months ago

Until her hands moved I thought she was an AI robot 😂

@josephkabemba3211 1 year ago

Crystal clear

@mandydawson6199 1 year ago

Who is she and how do we get more videos with her?

@robbybankston4238 8 months ago

I would think that another potential disadvantage would be licensing: if you are using commercial rather than open-source operating systems or databases, the licensing costs increase as the number of servers increases.

@cristinasanchez9029 10 months ago

Greatly explained, I subbed

@KeshavmurthyRamachandra 6 months ago

you got the definition of Sharding wrong. understood you never did sharding in your life.

@edmoregosha8937 5 months ago

The video script explains the basics of database sharding and partitioning in system design. It discusses how sharding can help manage large amounts of data by breaking it up into smaller partitions spread across multiple servers. The script also highlights the advantages and disadvantages of sharding in terms of scalability, performance, and operational complexity.

Key moments:

00:32 Traditional databases encounter limitations with increasing data size, necessitating sharding to enhance scalability and performance.
- Geo-based sharding partitions data based on user locations, reducing latency by routing users to the closest node.
- Range-based sharding divides data by key value ranges, simplifying partition computation but potentially leading to uneven splits.
- Hash-based sharding uses hashing algorithms to evenly distribute data across partitions, reducing hotspots but potentially separating related rows.
- Automatic sharding dynamically manages data partitioning for higher performance and scalability, but manual sharding at the application layer increases development complexity.

03:55 Sharding enables scaling, faster queries, and system availability, but poses challenges like complex management, hot spots, and high operational costs.
- Advantages of sharding include scalability, faster queries, and improved system availability during outages.
- Disadvantages of sharding involve complex data relationships, potential hot spots, and operational costs for maintaining high availability.

Generated by sider.ai
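To make the range-based versus hash-based contrast in the summary above concrete, here is a small Python sketch. It is not taken from the video; the function names, the split points in `RANGE_BOUNDS`, and the choice of SHA-256 are all assumptions made for illustration.

```python
import bisect
import hashlib

# Hypothetical split points: keys below "h" -> shard 0, "h" up to "p" -> shard 1, the rest -> shard 2.
RANGE_BOUNDS = ["h", "p"]


def range_shard(key: str, bounds=RANGE_BOUNDS) -> int:
    """Range-based sharding: find the key's interval among sorted split points."""
    return bisect.bisect_right(bounds, key.lower())


def hash_shard(key: str, num_shards: int) -> int:
    """Hash-based sharding: hash the key with a stable digest, then take it modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

The range version keeps neighbouring keys together but can split unevenly if many keys share a prefix. The hash version spreads keys more evenly but scatters related rows, and with a plain modulo, changing `num_shards` remaps most keys.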