I'm reading 'Designing Data-Intensive Applications' and needed a reminder of what exactly LSM trees are. This video is amazingly helpful! Everything I've learned so far is clearer now!
Same here. I wish there were a channel that summarized DDIA concepts chapter by chapter, so we could watch the videos after reading the text to further solidify our understanding.
You look very happy in this video, Gaurav! Nice to see this as I'm starting interviews again -- you helped me out last round, glad to have you onboard this time too :)
I think the explanation of compaction here is missing a major point. Compaction means throwing away duplicate keys in the log, since there can be multiple updates for a single key. With compaction we keep only the most recent one.
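A minimal sketch of the idea in the comment above (my own illustration, not from the video; the segment layout is an assumption): walk segments from oldest to newest so the most recent value for each key wins, then emit one sorted, deduplicated segment.

```python
def compact(segments):
    """Merge segments (ordered oldest to newest) of sorted (key, value)
    pairs, keeping only the most recent value for each key."""
    latest = {}
    for segment in segments:          # newer segments overwrite older entries
        for key, value in segment:
            latest[key] = value
    return sorted(latest.items())     # one sorted, deduplicated segment

old = [("a", 1), ("b", 2)]
new = [("a", 9), ("c", 3)]            # "a" was updated after the old segment
print(compact([old, new]))            # [('a', 9), ('b', 2), ('c', 3)]
```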
It would be good if you could make a video on "which storage engine to use and when", essentially telling us when to use traditional SQL vs NoSQL. I know there is no straightforward answer, but I'd just like to know your view.
Correct me if I'm wrong, but this is for analytics DBs. For transactional DBs, which need to change/delete values, a sorted array would slow down writes; transactional DBs still need to use B+ trees (at least the SQL variants). A database with heavy writes doesn't automatically mean an analytics DB. Imagine a scenario where you're storing background jobs' state in a DB and the number of jobs scales up to 1 billion for 1 billion users every 2 hours: you have 1 billion records ingested every hour that at the same time need to be updatable.
You did not consider duplicates caused by updating records. I found your video helpful while reading the Designing Data-Intensive Applications book; I'm exactly at this topic. Thanks for explaining it briefly yet covering the most important points.
An example you could give here for a bloom-filter-style check: consider the 0th and nth elements of the sorted array. If I need to search for a number and that number is within the range of the smallest and largest, there is a chance it could be there (it may not be there as well). However, if the number is less than a[0] (the smallest number in the sorted array) or greater than a[n] (the largest), there is no way it could be there. Hope that helps.
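A tiny sketch of that range check (my own example): like a bloom filter, a negative answer is certain, but a positive answer still needs a real search.

```python
def might_contain(sorted_chunk, x):
    # If x lies outside [min, max] of the chunk it is definitely absent;
    # otherwise it *might* be present and a real binary search is needed.
    return sorted_chunk[0] <= x <= sorted_chunk[-1]

chunk = [3, 7, 12, 25, 40]
print(might_contain(chunk, 2))    # False: below the smallest element
print(might_contain(chunk, 10))   # True, even though 10 is actually absent
```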
Why? I feel it's the same as a B+ tree overall, because if we look at a B+ tree abstractly, it's also maintaining sorted chunks of an array. But there we don't control the chunk size or when two sorted chunks get merged. In a B+ tree, insertion is always O(log n), while here it's constant most of the time (I guess). The read operation is a bit heavy here, but I guess the bloom filter can help reduce the time. Anyway, nice video!
@Gaurav Sen I have a doubt. Say a read query comes in for a data point which is NOT yet persisted in the database and is still waiting to be flushed. Wouldn't that be a 404 for the client?
When a notification arrives from Gaurav... Me: Hey, you don't know anything about this concept. Don't waste your time watching this. My brain: You stop, Gaurav makes things clear.
*Explaining bloom filter* - example word "cat", let's think of another word with c... "corona" 🤣 I had to check the date of the video, and indeed it's from after the COVID-19 outbreak. Keep up the good work! Amazing content!
Great channel, thanks a lot. I personally would love to see some real-world examples of this (open-source code etc.), maybe as posts/short vids. I know it's very time-consuming, but I bet it's very interesting to see and understand how real-world frameworks/projects use this fundamental knowledge under the hood.
We can also divide the searches into chunks of ranges. The first chunk ranges from, say, 1 to 30, the second chunk from 31 to 60, and so on. If we want to know whether the number 45 is present in the DB, we do a binary search on the second chunk.
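Sketching the chunked lookup from the comment above (my own illustration; the chunk boundaries are made up): pick the chunk whose range covers the key, then binary-search only inside it.

```python
import bisect

def search_chunks(chunks, x):
    """chunks: sorted lists covering disjoint, increasing key ranges."""
    for chunk in chunks:
        if chunk and chunk[0] <= x <= chunk[-1]:     # range test per chunk
            i = bisect.bisect_left(chunk, x)         # binary search inside
            return i < len(chunk) and chunk[i] == x
    return False

chunks = [[1, 12, 28, 30], [31, 45, 59]]
print(search_chunks(chunks, 45))  # True: found in the second chunk
print(search_chunks(chunks, 7))   # False: in the first chunk's range, absent
```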
I was going through the "Designing Data-Intensive Applications" book and the material read like plain text. I came here, watched the video, and now that same text forms images in my mind. Thanks man, you have created a perennial resource for system design topics.
Hi Gaurav, thanks for the video. With a B+ tree we have read and write complexity of O(log n), whereas you claim that with your approach we can make writes O(1), or at least less than O(log n). My concern is that this will vary a lot depending on how many sorted chunks you have, how many merges you do, the size of the sorted chunks, etc. On top of that, there is a lot of processing and complexity in implementing, maintaining, and monitoring these sorted chunks and bloom filters, plus corner cases. There are certainly cases where this approach beats B+ trees, but there are also cases where B+ trees will stand out. For example, at Facebook scale we have billions of records to process. How many chunks would you make, and how many would you merge and apply bloom filters to? You cannot go with 6 chunks as in your example; you might need 60k to 6M chunks, then a first-level search over those 6M chunks, second-level processing within a chunk, the bloom filter, and so on. And how would you handle record deletion? How many things would you need to change: the chunks, the bloom filter, maybe even more?
2:17 It'll actually be slower in C/C++ (etc.) if you do a dynamic allocation for every element at runtime, and you'll also lose out massively on cache performance. You should use a linked list of fixed-size arrays (chunks): O(1) insertion plus cache friendliness.
Hi Gaurav, thanks for the video. I have a doubt regarding constant time for the write operation. 1) Whenever a write request is triggered, it takes O(log m) (where m is the size of the memtable) to insert into the memtable, right? 2) A write query comprises two operations, right? i) the memtable, ii) the log file.
Hey Gaurav, when you are condensing multiple queries (to save on the number of I/O calls) and sending them together at once, how are we ensuring consistency (especially in SQL)? Say a user makes a write and almost immediately tries to retrieve the record again. Since the condensed queries might not have been executed by then, the read returns a "no record found" error. Maybe some kind of cache (a mini database table at the application-server level with the schema replicated) would help serve the scenario I mentioned? What are your thoughts on this?
Appending at the end of the linked list has time complexity O(N), since we need to traverse through the linked list before we can append. Correct me if I'm wrong.
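That is true if you only keep a head pointer; keeping a tail pointer as well makes the append O(1). A minimal sketch (my own, not from the video):

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class AppendLog:
    """Singly linked list with a tail pointer: O(1) append,
    no traversal needed to reach the end."""
    def __init__(self):
        self.head = self.tail = None

    def append(self, value):
        node = Node(value)
        if self.tail is None:          # empty list
            self.head = self.tail = node
        else:
            self.tail.next = node      # link after current tail
            self.tail = node           # advance the tail pointer

    def to_list(self):
        out, cur = [], self.head
        while cur:
            out.append(cur.value)
            cur = cur.next
        return out

log = AppendLog()
for v in [1, 2, 3]:
    log.append(v)
print(log.to_list())  # [1, 2, 3]
```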
If I send a write query which is batched on the server, and before the batch is processed I send a read for the same row, the data is still not written to the DB, but the user might expect it to be present. How does this work in such cases? The same goes for when the data is written to the log (or linked list) but not yet to the sorted array.
I am so happy to watch this playlist. Most of the time I read on platforms like Quora or Reddit that developers never use their DSA knowledge in their work, and that it's all just to pass interviews. This series proves that wrong: if someone is working on developing a database, or some service that others will use as a black box, then DSA knowledge is very useful. Please correct me if I am wrong, and add some more examples if I am right.
2:29 I don't get it. Why do you care about read speed on the linked list? Doesn't the database have to read it in order? Or is it used as some kind of cache?
Honestly, this was a bit confusing. It also seems to suggest that we can have very fast writes and very fast reads, which I don't think is possible from what I have been reading and learning, based on the RUM conjecture. But I'll go check it out. The bloom filter: isn't that an extra cost for reads, in terms of space? I was also wondering how the bloom filter is checked and what the time complexity for that is. There are a lot of words and interesting things here, but I couldn't make sense of the whole thing. It felt like simply shooting words into the air.
I'll go read the papers on LSM trees and more, implement them, and come back to see if I can make sense of this. I did hear all these terms in many other videos though: sorted string table, bloom filter, LSM tree, B tree, B+ tree, etc.
In a production system, if compaction is in progress, how can we make sure the linked lists being compacted are still available for read operations?
I just finished "Storage and Retrieval" from Designing Data-Intensive Applications and this video came up. It adds more insight and flavor to that knowledge.
Really nice video. A few questions: 1. What happens if the server goes down before we get a chance to flush the data from memory to the DB? 2. If we already know the ideal segment size for reading data efficiently, why not start with an in-memory buffer of that size instead of merging sorted segments of smaller sizes?
The problem with Gaurav Sen's videos is that if one is 17:22 minutes long, you need to watch it for exactly 17:22 minutes to understand the concept. The moment you skip forward 5 seconds anywhere, you need to rewind 20 seconds.
A large DB will typically have millions of entries, so won't the merging be very inefficient in that case? Merging two sorted arrays has a time complexity of O(n1 + n2). Even if this compaction happens asynchronously, it can still slow down the DB because it will happen frequently.
@@gkcs The number of entries will depend on the data size of each unit, but let's say the data is 100 GB. Sorting that much data frequently will not be efficient.
I don't understand something at 7:50. If we have N records in a sorted list and we want to add another sorted list of 6 elements, we don't have to sort it again; we just need to merge it. That's O(n), not O(n log n), right?
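For reference, the linear merge the comment above describes (a standard two-pointer merge, my own sketch): each element of either list is looked at once, so it runs in O(len(a) + len(b)) with no re-sort.

```python
def merge(a, b):
    """Merge two already-sorted lists in linear time."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:               # take the smaller head element
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:])                  # one of these is empty;
    out.extend(b[j:])                  # the other holds the leftovers
    return out

print(merge([1, 3, 5], [2, 4]))  # [1, 2, 3, 4, 5]
```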
I got confused over how to do binary search on a linked list, because I couldn't think of any way to randomly access the middle node in O(1). With the two-pointer technique we can find the middle node, but that requires O(n/2) time, which is roughly 2x faster but still not O(log N). I'd appreciate it if someone could help me here.
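For what it's worth, here is the slow/fast two-pointer middle-finding the comment mentions (my own sketch). It is indeed O(n), which is why binary search gives no asymptotic win on a linked list; that's also why in-memory structures like balanced trees or skip lists are used instead.

```python
class Node:
    def __init__(self, value, nxt=None):
        self.value, self.next = value, nxt

def middle(head):
    # Each step, `fast` covers two nodes while `slow` covers one,
    # so when fast hits the end, slow sits at the middle. O(n) overall.
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
    return slow

# Build the list 1 -> 2 -> 3 -> 4 -> 5
head = None
for v in [5, 4, 3, 2, 1]:
    head = Node(v, head)
print(middle(head).value)  # 3
```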
Hi Gaurav, pardon me if I understood something wrong, but: 1. Around 7:45, if we have 10,000 sorted records (n) in the SST and we are trying to insert 6 new records (m), why do we need to sort (10,000 + 6) records? Instead, can't we do a binary search on the 10,000 records to find the insertion position (log n) for each of the 6 records? Then the time complexity would be O(m * log n) = 6 * log(10,000) = 60. 2. Also, if we go with the approach of merging sorted arrays (two arrays of 6 elements each merging into one array of 12 elements), why do we need to traverse all the arrays to look up a record x? Since the arrays are already sorted, we can easily compare the starting and ending elements of each array with x. If the starting element is greater than x, or the ending element is less than x, we don't need to traverse that array and can check another array instead.
8:03 It's still O(n log n) IMO, because in the worst case you'll go over every single chunk (n) and binary search within each (log n), as you can't really separate them into sequentially sorted chunks (e.g., the first chunk holding sorted values from 1 to 50, the second from 51 to 100, and so on). That would require sorting every chunk in O(n log n) time, because we're getting random values in every query block.
I went through the bloom filter video, and looking at the false-positive probability, why are we not using a trie in this LSM tree scenario? At least the error rate would be zero, even if it is not optimal.
An LSM tree uses the bloom filter because it takes a fixed amount of memory, and we can adjust the filter size to our requirements. Tries are notorious for taking large amounts of memory. However, if the key range is small, one could be used.
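A minimal bloom filter sketch to illustrate the fixed-memory point (the sizes, hash count, and salted-SHA-256 hashing scheme are my own choices for illustration, not what any real LSM engine uses): the bit array stays the same size no matter how many keys are added, and only the false-positive rate degrades.

```python
import hashlib

class BloomFilter:
    """Fixed-size bit array + k hash functions. Memory never grows
    with the number of keys, unlike a trie."""
    def __init__(self, size=1024, k=3):
        self.size, self.k = size, k
        self.bits = [False] * size

    def _positions(self, key):
        # Derive k bit positions by salting the key before hashing.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for idx in self._positions(key):
            self.bits[idx] = True

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[idx] for idx in self._positions(key))

bf = BloomFilter()
bf.add("cat")
print(bf.might_contain("cat"))  # True: added keys are never missed
```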
I have a good understanding of how MySQL does all of this, and the only real advantage of this queuing technique is multi-row inserts: inserting multiple rows at once is faster than doing the same number of individual row inserts. When you insert data into the (InnoDB) storage engine, it stores the records directly in the primary-key B-tree, so the sorting is done on insert. If you want to optimize bulk inserts, simply create a queue of records and periodically dump that queue to the database. The types of workloads that benefit from queued inserts are primarily ETL data transfers.
We can store updates in memory and send them to the DB in bulk, that part is clear, but how do we make sure our data is divided into chunks with each chunk sorted? Isn't that internal to the DBMS? In short, say I have a Postgres DBMS: how do I implement this concept?
Wow, I never thought about database optimization from a write and read perspective; I always thought about query optimization. Thanks a lot for providing such useful information, and keep up the great work!
Hi Gaurav - could you please recommend good books for learning data structures and algorithms? This is the first time I'm hearing about LSM trees, and I want to know more about other structures.
I think that's how Cassandra makes writes faster: a memtable in memory flushed to SSTables for good write latencies, with bloom filters and row/key caches to reduce read latency!
But checking a bloom filter means we need to iterate over the query string to compute its hash values. As the string gets longer, that operation grows too, and could end up costing more than our binary search.
When is a sorted chunk created? Is it during insertion, or some time later? And what happens when a query arrives for data which is not yet in a sorted chunk?